lsst.obs.base  16.0-18-ge18fa5b
lsst.obs.base.gen3.ingest.RawIngestTask Class Reference
Inherited by lsst.obs.base.gen3.ingest.VisitInfoRawIngestTask.

Public Member Functions

def getDatasetType (cls)
 
def __init__ (self, config=None, *, butler, **kwds)
 
def run (self, files)
 
def readHeaders (self, file)
 
def ensureDataUnits (self, file)
 
def ingestFile (self, file, headers, dataId, run=None)
 
def processFile (self, file)
 
def extractDataId (self, file, headers)
 
def extractVisitEntry (self, file, headers, dataId, associated)
 
def extractExposureEntry (self, file, headers, dataId, associated)
 
def getFormatter (self, file, headers, dataId)
 

Public Attributes

 butler
 
 datasetType
 
 units
 
 unitEntryCache
 
 stashRun
 

Static Public Attributes

 ConfigClass = RawIngestConfig
 

Detailed Description

Driver Task for ingesting raw data into Gen3 Butler repositories.

This Task is intended to be runnable from the command-line, but it doesn't
meet the other requirements of CmdLineTask or PipelineTask, and wouldn't
gain much from being one.  It also wouldn't really be appropriate as a
subtask of a CmdLineTask or PipelineTask; it's a Task essentially just to
leverage the logging and configurability that the Task base class provides.

Each instance of `RawIngestTask` always writes to a single Butler and maintains a
cache of DataUnit entries that have already been added to or extracted
from its Registry.  Each invocation of `RawIngestTask.run` ingests a list
of files (possibly semi-atomically; see `RawIngestConfig.onError`).

RawIngestTask should be subclassed to specialize ingest for the actual
structure of raw data files produced by a particular camera. Subclasses
must either provide populated `MetadataReader` instances in the
`dataIdReader`, `visitReader`, and `exposureReader` class attributes, or
alternate implementations of the `extractDataId`, `extractVisitEntry`, and
`extractExposureEntry` methods that do not use those attributes (each
attribute-method pair may be handled differently).  Subclasses may also
wish to override `getFormatter` and/or (rarely) `getDatasetType`.  We do
not anticipate overriding `run`, `ensureDataUnits`, `ingestFile`, or
`processFile` to ever be necessary.
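
The subclassing pattern described above might look roughly like the following
sketch.  It is not taken from the LSST code base: `RawIngestTaskStandIn`
replaces the real base class so the example runs without the LSST stack,
headers are plain `dict` objects standing in for
`~lsst.daf.base.PropertyList`, and the camera name and header keywords
(``CCDNUM``, ``EXPID``, ``FILTER``, ``VISIT``) are hypothetical.

```python
class RawIngestTaskStandIn:
    """Stand-in base class so this sketch runs without the LSST stack."""

    def getFormatter(self, file, headers, dataId):
        # Documented default: None selects the Butler's configured formatter.
        return None


class ExampleCamRawIngestTask(RawIngestTaskStandIn):
    """Hypothetical camera-specific subclass overriding extractDataId."""

    def extractDataId(self, file, headers):
        header = headers[0]  # plain dict standing in for a PropertyList
        dataId = {
            "camera": "ExampleCam",
            "sensor": header["CCDNUM"],
            "exposure": header["EXPID"],
        }
        # Optional keys are added only when the header provides them.
        if header.get("FILTER") is not None:
            dataId["physical_filter"] = header["FILTER"]
        if header.get("VISIT") is not None:
            dataId["visit"] = header["VISIT"]
        return dataId
```

A real subclass would derive from `RawIngestTask` itself and read values
through whatever accessors `PropertyList` offers in the installed version.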

Parameters
----------
config : `RawIngestConfig`
    Configuration for whether/how to transfer files and how to handle
    conflicts and errors.
butler : `~lsst.daf.butler.Butler`
    Butler instance.  Ingested Datasets will be created as part of
    ``butler.run`` and associated with its Collection.

Other keyword arguments are forwarded to the Task base class constructor.

Definition at line 81 of file ingest.py.

Constructor & Destructor Documentation

◆ __init__()

def lsst.obs.base.gen3.ingest.RawIngestTask.__init__ (   self,
  config = None,
  *,
  butler,
  **kwds 
)

Definition at line 129 of file ingest.py.

Member Function Documentation

◆ ensureDataUnits()

def lsst.obs.base.gen3.ingest.RawIngestTask.ensureDataUnits (   self,
  file 
)
Extract metadata from a raw file and add Exposure and Visit
DataUnit entries.

Any needed Camera, Sensor, and PhysicalFilter DataUnit entries must
exist in the Registry before `run` is called.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file to be ingested.

Returns
-------
headers : `list` of `~lsst.daf.base.PropertyList`
    Result of calling `readHeaders`.
dataId : `dict`
    Data ID dictionary, as returned by `extractDataId`.

Definition at line 197 of file ingest.py.

◆ extractDataId()

def lsst.obs.base.gen3.ingest.RawIngestTask.extractDataId (   self,
  file,
  headers 
)
Return the Data ID dictionary that should be used to label a file.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file being ingested (prior to any transfers).
headers : `list` of `~lsst.daf.base.PropertyList`
    All headers returned by `readHeaders()`.

Returns
-------
dataId : `dict`
    Must include "camera", "sensor", and "exposure" keys. If the
    Exposure is associated with a PhysicalFilter and/or Visit,
    "physical_filter" and "visit" keys should be provided as well
    (respectively).
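
When writing an `extractDataId` override, a small check like the one sketched
below can catch a data ID that lacks the required keys.  The helper name
`missingDataIdKeys` is hypothetical and is not part of `RawIngestTask`; only
the key names come from the description above.

```python
REQUIRED_KEYS = ("camera", "sensor", "exposure")


def missingDataIdKeys(dataId):
    """Return the required data ID keys that are absent from ``dataId``."""
    return [key for key in REQUIRED_KEYS if key not in dataId]
```

For example, ``missingDataIdKeys({"camera": "ExampleCam", "sensor": 3})``
reports that ``"exposure"`` is missing.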

Definition at line 343 of file ingest.py.

◆ extractExposureEntry()

def lsst.obs.base.gen3.ingest.RawIngestTask.extractExposureEntry (   self,
  file,
  headers,
  dataId,
  associated 
)
Create an Exposure DataUnit entry from raw file metadata.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file being ingested (prior to any transfers).
headers : `list` of `~lsst.daf.base.PropertyList`
    All headers returned by `readHeaders()`.
dataId : `dict`
    The data ID for this file.  Implementations are permitted to
    modify this dictionary (generally by stripping off "sensor" and
    adding new metadata key-value pairs) and return it.
associated : `dict`
    A dictionary containing other associated DataUnit entries.
    Guaranteed to have "Camera", "Sensor", "PhysicalFilter", and
    "Visit" keys, but the latter two may map to ``None`` if
    `extractDataId` did not contain keys for these or mapped them to
    ``None``.  May also contain additional keys added by
    `extractVisitEntry`.

Returns
-------
entry : `dict`
    Dictionary corresponding to an Exposure database table row.
    Must have all non-null columns in the Exposure table as keys.

Definition at line 396 of file ingest.py.

◆ extractVisitEntry()

def lsst.obs.base.gen3.ingest.RawIngestTask.extractVisitEntry (   self,
  file,
  headers,
  dataId,
  associated 
)
Create a Visit DataUnit entry from raw file metadata.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file being ingested (prior to any transfers).
headers : `list` of `~lsst.daf.base.PropertyList`
    All headers returned by `readHeaders()`.
dataId : `dict`
    The data ID for this file.  Implementations are permitted to
    modify this dictionary (generally by stripping off "sensor" and
    "exposure" and adding new metadata key-value pairs) and return it.
associated : `dict`
    A dictionary containing other associated DataUnit entries.
    Guaranteed to have "Camera", "Sensor",  and "PhysicalFilter" keys,
    but the last may map to ``None`` if `extractDataId` either did not
    contain a "physical_filter" key or mapped it to ``None``.
    Subclasses may add new keys to this dict to pass arbitrary data to
    `extractExposureEntry` (`extractVisitEntry` is always called
    first), but note that when a Visit is comprised of multiple
    Exposures, `extractVisitEntry` may not be called at all.

Returns
-------
entry : `dict`
    Dictionary corresponding to a Visit database table row.
    Must have all non-null columns in the Visit table as keys.
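
The interplay between `extractVisitEntry` and `extractExposureEntry` through
the ``associated`` dictionary could be handled along the lines of this sketch.
It uses free functions and plain `dict` headers so it runs standalone; the
``"exampleBoresight"`` key, the header keywords (``RA``, ``DEC``,
``EXPTIME``), and the entry column names are all hypothetical.  Because
`extractVisitEntry` may not be called for every Exposure, the Exposure
function falls back to the headers when the stashed value is absent.

```python
def extractVisitEntry(file, headers, dataId, associated):
    header = headers[0]
    # Stash a value for extractExposureEntry under a hypothetical key.
    associated["exampleBoresight"] = (header["RA"], header["DEC"])
    entry = dict(dataId)
    entry.pop("sensor", None)
    entry.pop("exposure", None)
    entry["exposure_time"] = header["EXPTIME"]  # hypothetical column
    return entry


def extractExposureEntry(file, headers, dataId, associated):
    header = headers[0]
    # extractVisitEntry may have been skipped, so do not rely on the stash.
    boresight = associated.get("exampleBoresight")
    if boresight is None:
        boresight = (header["RA"], header["DEC"])
    entry = dict(dataId)
    entry.pop("sensor", None)
    entry["exposure_time"] = header["EXPTIME"]  # hypothetical column
    entry["boresight"] = boresight  # hypothetical column
    return entry
```

This sketch copies ``dataId`` rather than modifying it in place; either
approach is allowed by the parameter description.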

Definition at line 364 of file ingest.py.

◆ getDatasetType()

def lsst.obs.base.gen3.ingest.RawIngestTask.getDatasetType (   cls)
Return the DatasetType of the Datasets ingested by this Task.

Definition at line 123 of file ingest.py.

◆ getFormatter()

def lsst.obs.base.gen3.ingest.RawIngestTask.getFormatter (   self,
  file,
  headers,
  dataId 
)
Return the Formatter that should be used to read this file after
ingestion.

The default implementation returns None, which uses the formatter
configured for this DatasetType/StorageClass in the Butler.

Definition at line 425 of file ingest.py.

◆ ingestFile()

def lsst.obs.base.gen3.ingest.RawIngestTask.ingestFile (   self,
  file,
  headers,
  dataId,
  run = None 
)
Ingest a single raw file into the repository.

All necessary DataUnit entries must already be present.

This method is not transactional; it must be wrapped in a
``with self.butler.transaction`` block to make per-file ingest
atomic.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file to be ingested.
headers : `list` of `~lsst.daf.base.PropertyList`
    Result of calling `readHeaders`.
dataId : `dict`
    Data ID dictionary, as returned by `extractDataId`.
run : `~lsst.daf.butler.Run`, optional
    Run to add the Dataset to; defaults to ``self.butler.run``.
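
The transaction wrapping mentioned above can be illustrated with the following
sketch.  `StubButler` is a hypothetical stand-in whose ``transaction`` method
is assumed to be usable as a context manager that undoes changes when the
block raises; it is not the real `~lsst.daf.butler.Butler`, and
`ingestAtomically` is likewise an illustrative helper.

```python
from contextlib import contextmanager


class StubButler:
    """Hypothetical stand-in for a Butler with rollback-on-error transactions."""

    def __init__(self):
        self.datasets = []

    @contextmanager
    def transaction(self):
        # Snapshot state and restore it if the wrapped block raises.
        snapshot = list(self.datasets)
        try:
            yield
        except BaseException:
            self.datasets = snapshot
            raise


def ingestAtomically(butler, ingestFile, file, headers, dataId):
    """Call ``ingestFile`` inside a transaction so a failure leaves no partial state."""
    with butler.transaction():
        ingestFile(file, headers, dataId)
```

With this wrapper, an ``ingestFile`` callable that raises partway through
leaves ``butler.datasets`` as it was before the call.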

Definition at line 266 of file ingest.py.

◆ processFile()

def lsst.obs.base.gen3.ingest.RawIngestTask.processFile (   self,
  file 
)
Ingest a single raw data file after extracting metadata.

This creates any new Exposure or Visit DataUnit entries needed to
identify the ingested file, creates a new Dataset entry in the
Registry and finally ingests the file itself into the Datastore.
Any needed Camera, Sensor, and PhysicalFilter DataUnit entries must
exist in the Registry before `run` is called.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file to be ingested.

Definition at line 307 of file ingest.py.

◆ readHeaders()

def lsst.obs.base.gen3.ingest.RawIngestTask.readHeaders (   self,
  file 
)
Read and return any relevant headers from the given file.

The default implementation simply reads the header of the first
non-empty HDU, so it always returns a single-element list.

Parameters
----------
file : `str` or path-like object
    Absolute path to the file to be ingested.

Returns
-------
headers : `list` of `~lsst.daf.base.PropertyList`
    Single-element list containing the header of the first
    non-empty HDU.
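
The selection rule described above can be sketched on plain data as follows.
This is not the actual implementation: headers are modelled as `dict`
objects, an HDU is assumed to be "empty" when its ``NAXIS`` value is 0 (or
absent), and returning an empty list when every HDU is empty is an arbitrary
choice made for this example.  The function name `selectHeaders` is
hypothetical.

```python
def selectHeaders(hduHeaders):
    """Return a single-element list with the header of the first non-empty HDU."""
    for header in hduHeaders:
        # Assumption: NAXIS > 0 means the HDU carries data.
        if header.get("NAXIS", 0) > 0:
            return [header]
    return []
```

A subclass that needs headers from several HDUs would override `readHeaders`
and return a longer list.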

Definition at line 178 of file ingest.py.

◆ run()

def lsst.obs.base.gen3.ingest.RawIngestTask.run (   self,
  files 
)
Ingest files into a Butler data repository.

This creates any new Exposure or Visit DataUnit entries needed to
identify the ingested files, creates new Dataset entries in the
Registry and finally ingests the files themselves into the Datastore.
Any needed Camera, Sensor, and PhysicalFilter DataUnit entries must
exist in the Registry before `run` is called.

Parameters
----------
files : iterable over `str` or path-like objects
    Paths to the files to be ingested.  Will be made absolute
    if they are not already.
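
The path handling described for ``files`` could be done as in this sketch,
which accepts both `str` and path-like objects.  The helper name
`absolutize` is hypothetical, and the real `run` method may normalize paths
differently.

```python
import os


def absolutize(files):
    """Return absolute string paths for an iterable of str or path-like objects."""
    return [os.path.abspath(os.fspath(f)) for f in files]
```

Relative paths are resolved against the current working directory, and paths
that are already absolute and normalized are returned unchanged.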

Definition at line 148 of file ingest.py.

Member Data Documentation

◆ butler

lsst.obs.base.gen3.ingest.RawIngestTask.butler

Definition at line 131 of file ingest.py.

◆ ConfigClass

lsst.obs.base.gen3.ingest.RawIngestTask.ConfigClass = RawIngestConfig
static

Definition at line 118 of file ingest.py.

◆ datasetType

lsst.obs.base.gen3.ingest.RawIngestTask.datasetType

Definition at line 132 of file ingest.py.

◆ stashRun

lsst.obs.base.gen3.ingest.RawIngestTask.stashRun

Definition at line 146 of file ingest.py.

◆ unitEntryCache

lsst.obs.base.gen3.ingest.RawIngestTask.unitEntryCache

Definition at line 142 of file ingest.py.

◆ units

lsst.obs.base.gen3.ingest.RawIngestTask.units

Definition at line 133 of file ingest.py.

