def getDatasetType(self)
def __init__(self, config: Optional[RawIngestConfig] = None, *, butler: Butler, **kwargs: Any)
RawFileData extractMetadata(self, filename: str)
List[RawExposureData] groupByExposure(self, files: Iterable[RawFileData])
RawExposureData expandDataIds(self, data: RawExposureData)
Iterator[RawExposureData] prep(self, files, *, pool: Optional[Pool] = None, processes: int = 1)
List[DatasetRef] ingestExposureDatasets(self, exposure: RawExposureData, *, run: Optional[str] = None)
def run(self, files, *, pool: Optional[Pool] = None, processes: int = 1, run: Optional[str] = None)
|
Driver Task for ingesting raw data into Gen3 Butler repositories.
Parameters
----------
config : `RawIngestConfig`
Configuration for the task.
butler : `~lsst.daf.butler.Butler`
Writeable butler instance, with ``butler.run`` set to the appropriate
`~lsst.daf.butler.CollectionType.RUN` collection for these raw
datasets.
**kwargs
Additional keyword arguments are forwarded to the `lsst.pipe.base.Task`
constructor.
Notes
-----
Each instance of `RawIngestTask` writes to the same Butler. Each
invocation of `RawIngestTask.run` ingests a list of files.
Definition at line 160 of file ingest.py.
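A minimal usage sketch follows. The repository path, instrument, and run
collection name are assumptions for illustration, not part of this API:

    from lsst.daf.butler import Butler
    from lsst.obs.base import RawIngestConfig, RawIngestTask

    # Hypothetical Gen3 repository and RUN collection; adjust for your
    # setup. The instrument, detector, and physical_filter dimension
    # records must already exist in the Registry.
    butler = Butler("/repo/example", writeable=True, run="DECam/raw/all")

    task = RawIngestTask(config=RawIngestConfig(), butler=butler)

    # Paths are made absolute if they are not already.
    refs = task.run(["raw/visit001.fits", "raw/visit002.fits"])
    print(f"Ingested {len(refs)} raw datasets")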
Iterator[RawExposureData] lsst.obs.base.ingest.RawIngestTask.prep(self, files, *, pool: Optional[Pool] = None, processes: int = 1)
Perform all ingest preprocessing steps that do not involve actually
modifying the database.
Parameters
----------
files : iterable over `str` or path-like objects
Paths to the files to be ingested. Will be made absolute
if they are not already.
pool : `multiprocessing.Pool`, optional
If not `None`, a process pool with which to parallelize some
operations.
processes : `int`, optional
The number of processes to use. Ignored if ``pool`` is not `None`.
Yields
------
exposure : `RawExposureData`
Data structures containing dimension records, filenames, and data
IDs to be ingested (one structure for each exposure).
bad_files : `list` of `str`
List of all the files from which metadata could not be extracted.
Definition at line 388 of file ingest.py.
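Because `prep` does not modify the database, it can serve as a dry-run
check before `run` is called. A sketch, assuming ``task`` is the
`RawIngestTask` instance from the example above; the ``dataId`` and
``files`` attribute names are assumed from the `RawExposureData`
description here and may differ in your version:

    files = ["raw/visit001.fits", "raw/visit002.fits"]

    # Extract metadata, group by exposure, and expand data IDs without
    # touching the Registry or Datastore.
    for exposure in task.prep(files, processes=4):
        # `dataId` and `files` are assumed field names on RawExposureData.
        print(f"exposure {exposure.dataId}: {len(exposure.files)} file(s)")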
def lsst.obs.base.ingest.RawIngestTask.run(self, files, *, pool: Optional[Pool] = None, processes: int = 1, run: Optional[str] = None)
Ingest files into a Butler data repository.
This creates any new exposure or visit Dimension entries needed to
identify the ingested files, creates new Dataset entries in the
Registry, and finally ingests the files themselves into the Datastore.
Any needed instrument, detector, and physical_filter Dimension entries
must exist in the Registry before `run` is called.
Parameters
----------
files : iterable over `str` or path-like objects
Paths to the files to be ingested. Will be made absolute
if they are not already.
pool : `multiprocessing.Pool`, optional
If not `None`, a process pool with which to parallelize some
operations.
processes : `int`, optional
The number of processes to use. Ignored if ``pool`` is not `None`.
run : `str`, optional
Name of a RUN-type collection to write to, overriding
the default derived from the instrument name.
Returns
-------
refs : `list` of `lsst.daf.butler.DatasetRef`
Dataset references for ingested raws.
Notes
-----
This method inserts all datasets for an exposure within a transaction,
guaranteeing that partial exposures are never ingested. The exposure
dimension record is inserted with `Registry.syncDimensionData` first
(in its own transaction), which inserts only if a record with the same
primary key does not already exist. This allows different files within
the same exposure to be ingested in different runs.
Definition at line 483 of file ingest.py.
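A sketch of a parallel ingest that overrides the output collection; the
pool size, file names, and collection name are assumptions:

    from multiprocessing import Pool

    # Reuses `task` from the class-level example above. When `pool` is
    # given, the `processes` argument is ignored.
    with Pool(processes=4) as pool:
        refs = task.run(
            ["raw/visit001.fits", "raw/visit002.fits"],
            pool=pool,
            run="DECam/raw/reprocessing",
        )

    # Each element is an lsst.daf.butler.DatasetRef for one ingested raw.
    for ref in refs:
        print(ref.dataId)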