lsst.daf.persistence  13.0-17-gd5d205a+2
Public Member Functions | Static Public Member Functions | Public Attributes | List of all members
lsst.daf.persistence.butler.Butler Class Reference
Inheritance diagram for lsst.daf.persistence.butler.Butler:

Public Member Functions

def __init__
 
def __repr__
 
def defineAlias
 
def getKeys
 
def queryMetadata
 
def datasetExists
 
def get
 
def put
 
def subset
 
def dataRef
 
def __reduce__
 

Static Public Member Functions

def getMapperClass
 

Public Attributes

 log
 
 datasetTypeAliasDict
 
 storage
 
 persistence
 

Detailed Description

Butler provides a generic mechanism for persisting and retrieving data using mappers.

A Butler manages a collection of datasets known as a repository. Each dataset has a type representing its
intended usage and a location. Note that the dataset type is not the same as the C++ or Python type of the
object containing the data. For example, an ExposureF object might be used to hold the data for a raw
image, a post-ISR image, a calibrated science image, or a difference image. These would all be different
dataset types.

A Butler can produce a collection of possible values for a key (or tuples of values for multiple keys) if
given a partial data identifier. It can check for the existence of a file containing a dataset given its
type and data identifier. The Butler can then retrieve the dataset. Similarly, it can persist an object to
an appropriate location when given its associated data identifier.

Note that the Butler has two more advanced features when retrieving a dataset. First, retrieval is
lazy: input does not occur until the dataset is actually accessed. This allows datasets to be retrieved
and placed on a clipboard prospectively at little cost, even if the algorithm of a stage ends up not
using them. Second, the Butler calls a standardization hook upon retrieval of the dataset. This
function, contained in the input mapper object, must perform any necessary manipulations to force the
retrieved object to conform to standards, including translating metadata.

Public methods:

__init__(self, root, mapper=None, **mapperArgs)

defineAlias(self, alias, datasetType)

getKeys(self, datasetType=None, level=None)

queryMetadata(self, datasetType, format=None, dataId={}, **rest)

datasetExists(self, datasetType, dataId={}, **rest)

get(self, datasetType, dataId=None, immediate=True, **rest)

put(self, obj, datasetType, dataId={}, **rest)

subset(self, datasetType, level=None, dataId={}, **rest)

dataRef(self, datasetType, level=None, dataId={}, **rest)

Initialization:

The preferred method of initialization is to pass a RepositoryArgs instance, or a list of
RepositoryArgs, to inputs and/or outputs.

For backward compatibility, this initialization signature can also take a posix root path and,
optionally, a mapper class instance or class type that will be instantiated using the mapperArgs input
argument. For this to work in a backward-compatible way, it creates a single repository that is used
as both an input and an output repository. This is NOT preferred, and will likely break any
provenance system we have in place.

Parameters
----------
root - string
    .. note:: Deprecated in 12_0
              `root` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    A filesystem path. Will only work with a PosixRepository.
mapper - string or instance
    .. note:: Deprecated in 12_0
              `mapper` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides a mapper to be used with Butler.
mapperArgs - dict
    .. note:: Deprecated in 12_0
              `mapperArgs` will be removed in TBD, it is replaced by `inputs` and `outputs` for
              multiple-repository support.
    Provides arguments to be passed to the mapper if the mapper input arg is a class type to be
    instantiated by Butler.
inputs - RepositoryArgs or string
    Can be a single item or a list. Provides arguments to load an existing repository (or repositories).
    A string is assumed to be a URI and is used as the cfgRoot (the URI of the location of the cfg
    file). (A local filesystem URI need not start with 'file://' and may therefore be a relative path.)
outputs - RepositoryArgs or string
    Can be a single item or a list. Provides arguments to load one or more existing repositories or
    create new ones. A string is assumed to be a URI and is used as the repository root.

Definition at line 225 of file butler.py.
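
A minimal construction sketch for the preferred form (the paths are hypothetical; the import is
deferred inside the function so the sketch can be read without the LSST stack set up):

```python
def make_butler(in_root="/path/to/inputRepo", out_root="/path/to/outputRepo"):
    """Build a Butler with explicit input and output repositories."""
    # Deferred import: requires the LSST stack at call time.
    from lsst.daf.persistence import Butler, RepositoryArgs

    # Preferred form: RepositoryArgs (or a list of them) for inputs,
    # and a URI/path string (or RepositoryArgs) for outputs.
    return Butler(inputs=RepositoryArgs(root=in_root), outputs=out_root)
```

The deprecated single-root form, `Butler(root='/path/to/repo')`, still works but uses one repository
for both input and output, as noted above.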

Constructor & Destructor Documentation

def lsst.daf.persistence.butler.Butler.__init__ (   self,
  root = None,
  mapper = None,
  inputs = None,
  outputs = None,
  mapperArgs 
)

Definition at line 304 of file butler.py.

Member Function Documentation

def lsst.daf.persistence.butler.Butler.__reduce__ (   self)

Definition at line 1094 of file butler.py.

def lsst.daf.persistence.butler.Butler.__repr__ (   self)

Definition at line 613 of file butler.py.

def lsst.daf.persistence.butler.Butler.dataRef (   self,
  datasetType,
  level = None,
  dataId = {},
  rest 
)
Returns a single ButlerDataRef.

Given a complete dataId specified in dataId and **rest, find the unique dataset at the given level
specified by a dataId key (e.g. visit or sensor or amp for a camera) and return a ButlerDataRef.

Parameters
----------
datasetType - str
    The type of dataset collection to reference
level - str
    The level of dataId at which to reference
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
dataRef - ButlerDataRef
    ButlerDataRef for dataset matching the data id

Definition at line 1044 of file butler.py.

def lsst.daf.persistence.butler.Butler.datasetExists (   self,
  datasetType,
  dataId = {},
  rest 
)
Determines if a dataset file exists.

Parameters
----------
datasetType - str
    The type of dataset to inquire about.
dataId - DataId, dict
    The data id of the dataset.
**rest
    Keyword arguments for the data id.

Returns
-------
exists - bool
    True if the dataset exists or is non-file-based.

Definition at line 796 of file butler.py.

def lsst.daf.persistence.butler.Butler.defineAlias (   self,
  alias,
  datasetType 
)
Register an alias that will be substituted in datasetTypes.

Parameters
----------
alias - str
    The alias keyword. It may start with @ or not. It may not contain @ except as the first character.
datasetType - str
    The string that will be substituted when @alias is passed into datasetType. It may not contain '@'.

Definition at line 683 of file butler.py.
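
The documented alias rules can be illustrated with a small pure-Python sketch (this mirrors the
contract stated above, not Butler's internal code; 'rawImage' is a hypothetical dataset type):

```python
def define_alias(aliases, alias, dataset_type):
    """Register an alias under the rules documented above."""
    # '@' is allowed only as the (optional) first character of the alias.
    if "@" in alias[1:]:
        raise RuntimeError("alias may contain '@' only as its first character")
    # The substituted datasetType may not itself contain '@'.
    if "@" in dataset_type:
        raise RuntimeError("datasetType may not contain '@'")
    key = alias if alias.startswith("@") else "@" + alias
    aliases[key] = dataset_type

def resolve(aliases, dataset_type):
    """Substitute any registered @alias occurring in a datasetType."""
    for key, value in aliases.items():
        dataset_type = dataset_type.replace(key, value)
    return dataset_type

aliases = {}
define_alias(aliases, "raw", "rawImage")  # '@' prefix is optional here
```

After registration, `resolve(aliases, "@raw")` yields `"rawImage"`.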

def lsst.daf.persistence.butler.Butler.get (   self,
  datasetType,
  dataId = None,
  immediate = True,
  rest 
)
Retrieves a dataset given an input collection data id.

Parameters
----------
datasetType - str
    The type of dataset to retrieve.
dataId - dict
    The data id.
immediate - bool
    If False use a proxy for delayed loading.
**rest
    keyword arguments for the data id.

Returns
-------
    An object retrieved from the dataset (or a proxy for one).

Definition at line 914 of file butler.py.
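
A hypothetical retrieval sketch ('calexp' and the dataId keys 'visit'/'ccd' are illustrative; the
real dataset types and keys come from the repository's mapper):

```python
def fetch_calexp(butler, visit, ccd):
    """Fetch a calibrated exposure lazily from an existing Butler."""
    # With immediate=False the return value is a proxy; the actual read
    # happens on first access to the returned object.
    return butler.get("calexp", dataId={"visit": visit, "ccd": ccd},
                      immediate=False)
```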

def lsst.daf.persistence.butler.Butler.getKeys (   self,
  datasetType = None,
  level = None,
  tag = None 
)
Get the valid data id keys at or above the given level of hierarchy for the dataset type or the
entire collection if None. The dict values are the basic Python types corresponding to the keys (int,
float, str).

Parameters
----------
datasetType - str
    The type of dataset to get keys for, entire collection if None.
level - str
    The hierarchy level to descend to. None if it should not be restricted. Use an empty string if the
    mapper should lookup the default level.
tag - any, or list of any
    Any object that can be tested for equality with the tag in a dataId passed into butler input
    functions. Applies only to input repositories: if a tag is specified in the dataId, a repository
    is read from only if the tag in the dataId matches a tag used for that repository.

Returns
-------
keys - dict
    The keys of the dict are the valid data id keys at or above the given level of hierarchy for the
    dataset type, or for the entire collection if None. The values are the basic Python types
    corresponding to the keys (int, float, str).

Definition at line 713 of file butler.py.
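
For example, a call such as `butler.getKeys('raw', level='ccd')` might return a mapping of this shape
(the dataset type, level, and key names here are hypothetical; the actual keys come from the
repository's mapper):

```python
# Illustrative return shape only: data-id key name -> basic Python type.
example_keys = {"visit": int, "ccd": int, "filter": str}
```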

def lsst.daf.persistence.butler.Butler.getMapperClass (   root)
static
Posix-only; gets the mapper class at the path specified by root (if a file _mapper can be found at
that location or in a parent location).

As we abstract the storage and support different types of storage locations this method will be
moved entirely into Butler Access, or made more dynamic, and the API will very likely change.

Definition at line 675 of file butler.py.

def lsst.daf.persistence.butler.Butler.put (   self,
  obj,
  datasetType,
  dataId = {},
  doBackup = False,
  rest 
)
Persists a dataset given an output collection data id.

Parameters
----------
obj -
    The object to persist.
datasetType - str
    The type of dataset to persist.
dataId - dict
    The data id.
doBackup - bool
    If True, rename existing instead of overwriting.
    WARNING: Setting doBackup=True is not safe for parallel processing, as it may be subject to race
    conditions.
**rest
    Keyword arguments for the data id.

Definition at line 965 of file butler.py.

def lsst.daf.persistence.butler.Butler.queryMetadata (   self,
  datasetType,
  format,
  dataId = {},
  rest 
)
Returns the valid values for one or more keys when given a partial
input collection data id.

Parameters
----------
datasetType - str
    The type of dataset to inquire about.
format - str, tuple
    Key or tuple of keys to be returned.
dataId - DataId, dict
    The partial data id.
**rest
    Keyword arguments for the partial data id.

Returns
-------
A list of valid values or tuples of valid values as specified by the
format.

Definition at line 749 of file butler.py.

def lsst.daf.persistence.butler.Butler.subset (   self,
  datasetType,
  level = None,
  dataId = {},
  rest 
)
Return complete dataIds for a dataset type that match a partial (or empty) dataId.

Given a partial (or empty) dataId specified in dataId and **rest, find all datasets that match the
dataId.  Optionally restrict the results to a given level specified by a dataId key (e.g. visit or
sensor or amp for a camera).  Return an iterable collection of complete dataIds as ButlerDataRefs.
Datasets with the resulting dataIds may not exist; that needs to be tested with datasetExists().

Parameters
----------
datasetType - str
    The type of dataset collection to subset
level - str
    The level of dataId at which to subset. Use an empty string if the mapper should look up the
    default level.
dataId - dict
    The data id.
**rest
    Keyword arguments for the data id.

Returns
-------
subset - ButlerSubset
    Collection of ButlerDataRefs for datasets matching the data id.

Examples
--------
To print the full dataIds for all r-band measurements in a source catalog
(note that the subset call is equivalent to: `butler.subset('src', dataId={'filter':'r'})`):

>>> subset = butler.subset('src', filter='r')
>>> for data_ref in subset: print(data_ref.dataId)

Definition at line 999 of file butler.py.

Member Data Documentation

lsst.daf.persistence.butler.Butler.datasetTypeAliasDict

Definition at line 321 of file butler.py.

lsst.daf.persistence.butler.Butler.log

Definition at line 306 of file butler.py.

lsst.daf.persistence.butler.Butler.persistence

Definition at line 351 of file butler.py.

lsst.daf.persistence.butler.Butler.storage

Definition at line 323 of file butler.py.


The documentation for this class was generated from the following file: butler.py