Driver Task for ingesting raw data into Gen3 Butler repositories.

This Task is intended to be runnable from the command line, but it doesn't
meet the other requirements of CmdLineTask or PipelineTask, and wouldn't
gain much from being one. It also wouldn't really be appropriate as a
subtask of a CmdLineTask or PipelineTask; it is a Task essentially just to
leverage the logging and configurability functionality that being a Task
provides.

Each instance of `RawIngestTask` writes to the same Butler and maintains a
cache of Dimension entries that have already been added to or extracted
from its Registry. Each invocation of `RawIngestTask.run` ingests a list
of files (possibly semi-atomically; see `RawIngestConfig.onError`).

`RawIngestTask` should be subclassed to specialize ingest for the actual
structure of raw data files produced by a particular instrument.
Subclasses must provide either populated `MetadataReader` instances in the
`dataIdReader`, `visitReader`, and `exposureReader` class attributes, or
alternate implementations of the `extractDataId`, `extractVisit`, and
`extractExposure` methods that do not use those attributes (each
attribute-method pair may be handled differently). Subclasses may also
wish to override `getFormatter` and/or (rarely) `getDatasetType`. We do
not anticipate that overriding `run`, `ensureDimensions`, `ingestFile`, or
`processFile` will ever be necessary.

Parameters
----------
config : `RawIngestConfig`
    Configuration for whether/how to transfer files and how to handle
    conflicts and errors.
butler : `~lsst.daf.butler.Butler`
    Butler instance. Ingested Datasets will be created as part of
    ``butler.run`` and associated with its Collection.

Other keyword arguments are forwarded to the Task base class constructor.
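
As a rough illustration of the method-override route described above, the sketch below subclasses a stand-in base class and replaces `extractDataId`. `StubRawIngestTask`, `MyCamRawIngestTask`, the dimension key spellings, the file path, and the `SimpleNamespace` used in place of an `ObservationInfo` (including its attribute names) are all assumptions made so the example runs without the LSST stack; they are not taken from `ingest.py`.

```python
from types import SimpleNamespace


class StubRawIngestTask:
    """Hypothetical stand-in for RawIngestTask (not the real class)."""

    def extractDataId(self, file, headers, obsInfo):
        # The documented default relies on the `dataIdReader` class attribute;
        # this stub only marks the method as something a subclass must supply.
        raise NotImplementedError("provide dataIdReader or override extractDataId")


class MyCamRawIngestTask(StubRawIngestTask):
    """Hypothetical instrument-specific subclass."""

    def extractDataId(self, file, headers, obsInfo):
        # Attribute names on obsInfo are assumed for this sketch.
        return {
            "Instrument": obsInfo.instrument,
            "Exposure": obsInfo.exposure_id,
            "Detector": obsInfo.detector_num,
        }


# Stand-in for the observational metadata extracted from the headers.
obs = SimpleNamespace(instrument="MyCam", exposure_id=1001, detector_num=7)
data_id = MyCamRawIngestTask().extractDataId("/data/raw/frame.fits", [], obs)
print(data_id)
```

Because each attribute-method pair may be handled differently, a real subclass could override only this one method and still supply reader attributes for the visit and exposure extraction.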
Definition at line 83 of file ingest.py.
def lsst.obs.base.gen3.ingest.RawIngestTask.extractDataId(self, file, headers, obsInfo)
Return the Data ID dictionary that should be used to label a file.
Parameters
----------
file : `str` or path-like object
    Absolute path to the file being ingested (prior to any transfers).
headers : `list` of `~lsst.daf.base.PropertyList`
    All headers returned by `readHeaders()`.
obsInfo : `astro_metadata_translator.ObservationInfo`
    Observational metadata extracted from the headers.

Returns
-------
dataId : `DataId`
    A mapping whose key-value pairs uniquely identify raw datasets.
    Must have ``dataId.dimensions() <= self.dimensions``, with at least
    Instrument, Exposure, and Detector present.
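
The constraint on the returned data ID can be read as two set comparisons: the keys must include the three required dimensions and must not go beyond the task's own dimensions. The following is a minimal sketch of that check using plain dictionaries and sets; `check_data_id`, `REQUIRED_DIMENSIONS`, and the example dimension names are hypothetical and stand in for the real `DataId` and `self.dimensions` objects.

```python
# Assumed spellings of the three dimensions the documentation requires.
REQUIRED_DIMENSIONS = frozenset({"Instrument", "Exposure", "Detector"})


def check_data_id(data_id, allowed_dimensions):
    """Validate a dict-like data ID against required and allowed dimensions."""
    keys = set(data_id)
    missing = REQUIRED_DIMENSIONS - keys
    if missing:
        raise ValueError(f"data ID is missing required dimensions: {sorted(missing)}")
    unexpected = keys - set(allowed_dimensions)
    if unexpected:
        raise ValueError(f"data ID has unexpected dimensions: {sorted(unexpected)}")
    return data_id


# Hypothetical set of dimensions playing the role of self.dimensions.
allowed = {"Instrument", "Exposure", "Detector", "Visit"}
good = check_data_id({"Instrument": "MyCam", "Exposure": 1001, "Detector": 7}, allowed)
```

A data ID lacking, say, the Detector key would raise `ValueError` here, which mirrors the "at least ... present" wording of the return contract.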
Definition at line 317 of file ingest.py.
def lsst.obs.base.gen3.ingest.RawIngestTask.run(self, files)
Ingest files into a Butler data repository.
This creates any new Exposure or Visit Dimension entries needed to
identify the ingested files, creates new Dataset entries in the
Registry, and finally ingests the files themselves into the Datastore.
Any needed Instrument, Detector, and PhysicalFilter Dimension entries
must exist in the Registry before `run` is called.
Parameters
----------
files : iterable over `str` or path-like objects
    Paths to the files to be ingested. Will be made absolute
    if they are not already.
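
The path handling described for `files` can be approximated with the standard library. This is only a sketch of one way to make a mixed iterable of strings and path-like objects absolute; `absolutize` and the example file names are hypothetical, and the actual implementation in `run` may differ.

```python
import os
import pathlib


def absolutize(files):
    """Return absolute string paths for an iterable of str or path-like objects."""
    # os.fspath accepts both str and pathlib.Path; os.path.abspath leaves
    # already-absolute paths unchanged apart from normalization.
    return [os.path.abspath(os.fspath(f)) for f in files]


paths = absolutize(["raw_0001.fits", pathlib.Path("night1") / "raw_0002.fits"])
```

Relative entries are resolved against the current working directory, so callers that change directories between collecting paths and calling `run` may prefer to pass absolute paths from the start.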
Definition at line 148 of file ingest.py.