Coverage for python/lsst/daf/butler/datastores/posixDatastore.py : 91%

# This file is part of daf_butler.
#
# Developed for the LSST Data Management System.
# This product includes software developed by the LSST Project
# (http://www.lsst.org).
# See the COPYRIGHT file at the top-level directory of this distribution
# for details of code ownership.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
"""Basic POSIX filesystem backed Datastore.
Attributes ---------- config : `DatastoreConfig` Configuration used to create Datastore. registry : `Registry` `Registry` to use when recording the writing of Datasets. root : `str` Root directory of this `Datastore`. locationFactory : `LocationFactory` Factory for creating locations relative to this root. formatterFactory : `FormatterFactory` Factory for creating instances of formatters. storageClassFactory : `StorageClassFactory` Factory for creating storage class instances from name. templates : `FileTemplates` File templates that can be used by this `Datastore`. name : `str` Label associated with this Datastore.
Parameters ---------- config : `DatastoreConfig` or `str` Configuration.
Raises ------ ValueError If root location does not exist and ``create`` is `False` in the configuration. """
"""Path to configuration defaults. Relative to $DAF_BUTLER_DIR/config or absolute path. Can be None if no defaults specified. """
"checksum", "size"])
def setConfigRoot(cls, root, config, full):
    """Set any filesystem-dependent config options for this Datastore to
    be appropriate for a new empty repository with the given root.

    Parameters
    ----------
    root : `str`
        Filesystem path to the root of the data repository.
    config : `Config`
        A `Config` to update.  Only the subset understood by this
        component will be updated.  Will not expand defaults.
    full : `Config`
        A complete config with all defaults expanded that can be
        converted to a `DatastoreConfig`.  Read-only and will not be
        modified by this method.  Repository-specific options that
        should not be obtained from defaults when Butler instances
        are constructed should be copied from ``full`` to ``config``.
    """
    toUpdate={"root": root}, toCopy=("cls", "records.table"))
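# Hypothetical usage sketch for setConfigRoot when creating a new
# repository.  The surrounding repo-creation code and the exact Config
# construction are assumptions, not shown in this coverage view:
#
#     full = DatastoreConfig(defaultsFile)   # all defaults expanded
#     config = Config()                      # subset to be updated
#     PosixDatastore.setConfigRoot("/data/repo", config, full)
#     # config now holds the repo-specific "root" plus the "cls" and
#     # "records.table" entries copied from full.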
raise ValueError("No root directory specified in configuration") raise ValueError("No valid root at: {0}".format(self.root))
# Now associate formatters with storage classes
# Read the file naming templates
# Name ourselves
# Storage of paths and formatters, keyed by dataset_id
    "size": int, "checksum": str, "dataset_id": int}
    value=self.RecordTuple, key="dataset_id", registry=registry)
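# For illustration, the record layout implied by the types above can be
# modelled as a namedtuple.  The field order here is an assumption; the
# real definition is only partially visible in this coverage view:
#
#     from collections import namedtuple
#     RecordTuple = namedtuple("PosixDatastoreRecord",
#                              ["formatter", "path", "storage_class",
#                               "checksum", "size"])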
"""Record internal storage information associated with this `DatasetRef`
Parameters ---------- ref : `DatasetRef` The Dataset that has been stored. info : `StoredFileInfo` Metadata associated with the stored Dataset. """ storage_class=info.storageClass.name, checksum=info.checksum, size=info.size)
"""Remove information about the file associated with this dataset.
Parameters
----------
ref : `DatasetRef`
    The Dataset that has been removed.
"""
"""Retrieve information associated with file stored in this `Datastore`.
Parameters ---------- ref : `DatasetRef` The Dataset that is to be queried.
Returns ------- info : `StoredFileInfo` Stored information about this file and its formatter.
Raises ------ KeyError Dataset with that id can not be found. """ # Convert name of StorageClass to instance checksum=record.checksum, size=record.size)
"""Check if the dataset exists in the datastore.
Parameters
----------
ref : `DatasetRef`
    Reference to the required dataset.

Returns
-------
exists : `bool`
    `True` if the entity exists in the `Datastore`.
"""
# Get the file information (this will fail if no file)
# Use the path to determine the location
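# Usage sketch for the existence check above (datastore and ref
# construction assumed, not shown):
#
#     if datastore.exists(ref):
#         inMemoryDataset = datastore.get(ref)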
"""Load an InMemoryDataset from the store.
Parameters
----------
ref : `DatasetRef`
    Reference to the required Dataset.
parameters : `dict`
    `StorageClass`-specific parameters that specify, for example,
    a slice of the Dataset to be loaded.

Returns
-------
inMemoryDataset : `object`
    Requested Dataset or slice thereof as an InMemoryDataset.

Raises
------
FileNotFoundError
    Requested dataset cannot be retrieved.
TypeError
    Return value from formatter has unexpected type.
ValueError
    Formatter failed to process the dataset.
"""
# Get file metadata and internal metadata
# Use the path to determine the location
# Too expensive to recalculate the checksum on fetch
# but we can check size and existence
    " expected location of {}".format(ref.id, location.path))
raise RuntimeError("Integrity failure in Datastore. Size of file {} ({}) does not"
                   " match recorded size of {}".format(location.path, size, storedFileInfo.size))
# We have a write storage class and a read storage class and they # can be different for concrete composites.
# Is this a component request?
    storageClass=writeStorageClass, parameters=parameters),
    component=component)
except Exception as e:
    raise ValueError("Failure from formatter for Dataset {}: {}".format(ref.id, e))
# Validate the returned data type matches the expected data type
raise TypeError("Got type {} from formatter but expected {}".format(type(result), pytype))
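# Usage sketch for get(), following the docstring above.  The datastore
# and ref are assumed to exist already, and the "slice" parameter name
# is purely illustrative (valid parameters depend on the StorageClass):
#
#     try:
#         whole = datastore.get(ref)
#         part = datastore.get(ref, parameters={"slice": subregion})
#     except FileNotFoundError:
#         ...  # nothing was ever put/ingested for this ref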
def put(self, inMemoryDataset, ref):
    """Write an InMemoryDataset with a given `DatasetRef` to the store.
Parameters
----------
inMemoryDataset : `object`
    The Dataset to store.
ref : `DatasetRef`
    Reference to the associated Dataset.

Raises
------
TypeError
    Supplied object and storage class are inconsistent.
DatasetTypeNotSupportedError
    The associated `DatasetType` is not handled by this datastore.
"""
# Sanity check
raise TypeError("Inconsistency between supplied object ({}) "
                "and storage class type ({})".format(type(inMemoryDataset), storageClass.pytype))
# Work out output file name
except KeyError as e:
    raise DatasetTypeNotSupportedError(f"Unable to find template for {datasetType}") from e
# Get the formatter based on the storage class
    f"{datasetType.storageClass.name} or "
    f"DatasetType {typeName}") from e
# Write the file
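# Usage sketch for put(), matching the Raises section above (datastore,
# object, and ref construction assumed, not shown):
#
#     try:
#         datastore.put(inMemoryDataset, ref)
#     except TypeError:
#         ...  # object inconsistent with the ref's StorageClass pytype
#     except DatasetTypeNotSupportedError:
#         ...  # no file template configured for this DatasetType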
"""Add an on-disk file with the given `DatasetRef` to the store, possibly transferring it.
The caller is responsible for ensuring that the given (or predicted) Formatter is consistent with how the file was written; `ingest` will in general silently ignore incorrect formatters (as it cannot efficiently verify their correctness), deferring errors until ``get`` is first called on the ingested dataset.
Parameters
----------
path : `str`
    File path.  Treated as relative to the repository root if not
    absolute.
ref : `DatasetRef`
    Reference to the associated Dataset.
formatter : `Formatter` (optional)
    Formatter that should be used to retrieve the Dataset.  If not
    provided, the formatter will be constructed according to Datastore
    configuration.
transfer : `str` (optional)
    If not `None`, must be one of 'move', 'copy', 'hardlink', or
    'symlink' indicating how to transfer the file.  The new filename
    and location will be determined via template substitution, as with
    ``put``.  If the file is outside the datastore root, it must be
    transferred somehow.
Raises
------
RuntimeError
    Raised if ``transfer is None`` and path is outside the repository
    root.
FileNotFoundError
    Raised if the file at ``path`` does not exist.
FileExistsError
    Raised if ``transfer is not None`` but a file already exists at
    the location computed from the template.
"""
"assumed to be relative to self.root unless they are absolute." .format(fullPath))
path = os.path.relpath(path, absRoot)
else:
else:
    raise NotImplementedError("Transfer type '{}' not supported.".format(transfer))
# Create Storage information in the registry
# Associate this dataset with the formatter for later read.
    size=size, checksum=checksum)
# TODO: this is only transactional if the DatabaseDict uses
# self.registry internally.  Probably need to add
# transactions to DatabaseDict to do better than that.
# Register all components with same information
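# Usage sketch for ingest() with the documented transfer modes
# (datastore and ref construction assumed; paths are illustrative):
#
#     datastore.ingest("rel/path/in/repo.fits", ref)                 # in place
#     datastore.ingest("/elsewhere/data.fits", ref, transfer="copy")
#     datastore.ingest("/elsewhere/data.fits", ref, transfer="move")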
"""URI to the Dataset.
Parameters
----------
ref : `DatasetRef`
    Reference to the required Dataset.
predict : `bool`
    If `True`, allow URIs to be returned of datasets that have not
    been written.

Returns
-------
uri : `str`
    URI string pointing to the Dataset within the datastore.  If the
    Dataset does not exist in the datastore, and if ``predict`` is
    `True`, the URI will be a prediction and will include a URI
    fragment "#predicted".  If the datastore does not have entities
    that relate well to the concept of a URI the returned URI string
    will be descriptive.  The returned URI is not guaranteed to be
    obtainable.

Raises
------
FileNotFoundError
    A URI has been requested for a dataset that does not exist and
    guessing is not allowed.
"""
# if this has never been written then we have to guess
else: # If this is a ref that we have written we can get the path. # Get file metadata and internal metadata
# Use the path to determine the location
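# Usage sketch for getUri() (datastore and ref construction assumed):
#
#     uri = datastore.getUri(ref)                  # raises if never written
#     guess = datastore.getUri(ref, predict=True)  # may carry "#predicted"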
"""Indicate to the Datastore that a Dataset can be removed.
.. warning::

    This method does not support transactions; removals are immediate,
    cannot be undone, and are not guaranteed to be atomic if deleting
    either the file or the internal database records fails.
Parameters
----------
ref : `DatasetRef`
    Reference to the required Dataset.

Raises
------
FileNotFoundError
    Attempt to remove a dataset that does not exist.
"""
# Get file metadata and internal metadata
raise FileNotFoundError("No such file: {0}".format(location.uri))
# Remove rows from registries
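# Usage sketch for remove(); per the warning above the deletion is
# immediate and cannot be undone (datastore and ref construction
# assumed):
#
#     try:
#         datastore.remove(ref)
#     except FileNotFoundError:
#         ...  # nothing stored for this ref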
"""Retrieve a Dataset from an input `Datastore`, and store the result in this `Datastore`.
Parameters ---------- inputDatastore : `Datastore` The external `Datastore` from which to retreive the Dataset. ref : `DatasetRef` Reference to the required Dataset in the input data store.
"""
"""Compute the checksum of the supplied file.
Parameters
----------
filename : `str`
    Name of file to calculate checksum from.
algorithm : `str`, optional
    Name of algorithm to use.  Must be one of the algorithms supported
    by :py:class:`hashlib`.
block_size : `int`
    Number of bytes to read from file at one time.
Returns
-------
hexdigest : `str`
    Hex digest of the file.
"""
raise NameError("The specified algorithm '{}' is not supported by hashlib".format(algorithm))
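# Illustration only: a minimal, runnable sketch of the hashlib pattern
# documented above.  The default algorithm and block size here are
# assumptions; the real defaults are elided by this coverage view.
import hashlib

def _exampleComputeChecksum(filename, algorithm="blake2b", block_size=8192):
    """Sketch of the chunked checksum loop; not part of the original module."""
    if algorithm not in hashlib.algorithms_available:
        raise NameError("The specified algorithm '{}' is not supported "
                        "by hashlib".format(algorithm))
    hasher = hashlib.new(algorithm)
    with open(filename, "rb") as f:
        # Read in fixed-size chunks so large files never load fully into memory
        for chunk in iter(lambda: f.read(block_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()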