lsst.pipe.tasks gcf00bf066d+4f59a27f16
lsst.pipe.tasks.schemaUtils Namespace Reference

Functions

str column_dtype (felis.datamodel.DataType felis_type, nullable=False)
 
 readSdmSchemaFile (str schemaFile)
 
 checkSdmSchemaColumns (schema, colNames, tableName)
 
 checkDataFrameAgainstSdmSchema (schema, sourceTable, tableName)
 
 convertDataFrameToSdmSchema (schema, sourceTable, tableName, skipIndex=False)
 
 convertTableToSdmSchema (schema, sourceTable, tableName)
 
 dropEmptyColumns (schema, sourceTable, tableName)
 
 make_empty_catalog (schema, tableName)
 

Variables

dict _dtype_map
 

Detailed Description

Utilities for working with sdm_schemas.

Function Documentation

◆ checkDataFrameAgainstSdmSchema()

lsst.pipe.tasks.schemaUtils.checkDataFrameAgainstSdmSchema ( schema,
sourceTable,
tableName )
Check that a table conforms to the supplied schema.

This method uses the table definitions in ``sdm_schemas`` to load the
schema.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
sourceTable : `pandas.DataFrame`
    The input table to check.
tableName : `str`
    Name of the table in the schema to use.

Definition at line 135 of file schemaUtils.py.

◆ checkSdmSchemaColumns()

lsst.pipe.tasks.schemaUtils.checkSdmSchemaColumns ( schema,
colNames,
tableName )
Check whether the supplied column names exist in the schema.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
colNames : `list` of `str`
    Names of the columns to check for in the table.
tableName : `str`
    Name of the table in the schema to use.

Returns
-------
missing : `list` of `str`
    All column names that are not in the schema.

Definition at line 108 of file schemaUtils.py.
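
The column check described above can be illustrated with a minimal sketch. It does not import `lsst.pipe.tasks` or `felis`: `StandInColumn`, `StandInTable`, `find_missing_columns`, and the table and column names are all hypothetical stand-ins, and the only thing assumed about a table definition is that it exposes a list of columns that each have a `name`.

```python
from dataclasses import dataclass, field


@dataclass
class StandInColumn:
    """Hypothetical stand-in for a Felis column definition."""
    name: str


@dataclass
class StandInTable:
    """Hypothetical stand-in for a Felis table definition."""
    name: str
    columns: list = field(default_factory=list)


def find_missing_columns(schema, col_names, table_name):
    """Return the names in col_names that the named table does not define."""
    known = {column.name for column in schema[table_name].columns}
    # Preserve the caller's ordering so the result is easy to report.
    return [name for name in col_names if name not in known]


# Illustrative schema with one table and two columns (names are made up).
schema = {
    "DiaSource": StandInTable(
        "DiaSource", [StandInColumn("diaSourceId"), StandInColumn("ra")]
    )
}
missing = find_missing_columns(schema, ["ra", "dec"], "DiaSource")
print(missing)
```

An empty result means every requested column is defined for the table.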

◆ column_dtype()

str lsst.pipe.tasks.schemaUtils.column_dtype ( felis.datamodel.DataType felis_type,
nullable = False )
Return Pandas data type for a given Felis column type.

Parameters
----------
felis_type : `felis.datamodel.DataType`
    Felis type, one of the enums defined in the `felis.datamodel` module.

Returns
-------
column_dtype : `type` or `str`
    Type that can be used for columns in Pandas.

Raises
------
TypeError
    Raised if the type cannot be handled.

Definition at line 57 of file schemaUtils.py.

◆ convertDataFrameToSdmSchema()

lsst.pipe.tasks.schemaUtils.convertDataFrameToSdmSchema ( schema,
sourceTable,
tableName,
skipIndex = False )
Force a table to conform to the schema defined by the SDM schema.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
sourceTable : `pandas.DataFrame`
    The input table to convert.
tableName : `str`
    Name of the table in the schema to use.

Returns
-------
`pandas.DataFrame`
    A table with the correct schema and data copied from
    the input ``sourceTable``.

Definition at line 162 of file schemaUtils.py.

◆ convertTableToSdmSchema()

lsst.pipe.tasks.schemaUtils.convertTableToSdmSchema ( schema,
sourceTable,
tableName )
Force an Astropy table to conform to the schema defined by the SDM schema.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
sourceTable : `astropy.table.Table`
    The input table to convert.
tableName : `str`
    Name of the table in the schema to use.

Returns
-------
`astropy.table.Table`
    A table with the correct schema and data copied from
    the input ``sourceTable``.

Definition at line 218 of file schemaUtils.py.

◆ dropEmptyColumns()

lsst.pipe.tasks.schemaUtils.dropEmptyColumns ( schema,
sourceTable,
tableName )
Drop empty columns that are nullable.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
sourceTable : `pandas.DataFrame`
    The input table to remove missing data columns from.
tableName : `str`
    Name of the table in the schema to use.

Returns
-------
`pandas.DataFrame`
    The table with columns that are missing and nullable dropped.

Definition at line 256 of file schemaUtils.py.
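
The rule stated above (a column is dropped only when it is both empty and nullable) can be sketched without pandas. This is not the library's implementation: a plain dict of lists stands in for the `pandas.DataFrame`, a set of names stands in for the nullable flags in the schema, and `drop_empty_nullable` and the column names are hypothetical.

```python
def drop_empty_nullable(columns, nullable_names):
    """Return a copy of columns without nullable columns that hold only None.

    columns : dict mapping column name to a list of values
        (stand-in for a DataFrame).
    nullable_names : set of column names the schema marks as nullable.
    """
    kept = {}
    for name, values in columns.items():
        is_empty = all(value is None for value in values)
        if is_empty and name in nullable_names:
            continue  # empty and allowed to be absent: drop it
        kept[name] = list(values)
    return kept


table = {
    "diaSourceId": [1, 2],
    "ra": [None, None],        # empty but not nullable: kept
    "parallax": [None, None],  # empty and nullable: dropped
    "dec": [None, 3.5],        # nullable but has data: kept
}
result = drop_empty_nullable(table, nullable_names={"parallax", "dec"})
print(sorted(result))
```

Keeping empty columns that are not nullable means a later schema check can still see them, rather than having them disappear silently.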

◆ make_empty_catalog()

lsst.pipe.tasks.schemaUtils.make_empty_catalog ( schema,
tableName )
Make an empty catalog for a table with a given name.

Parameters
----------
schema : `dict` [`str`, `felis.datamodel.Schema`]
    Dictionary of Schemas from ``sdm_schemas`` containing the table definition to use.
tableName : `str`
    Name of the table in the schema to use.

Returns
-------
catalog : `pandas.DataFrame`
    An empty catalog.

Definition at line 282 of file schemaUtils.py.

◆ readSdmSchemaFile()

lsst.pipe.tasks.schemaUtils.readSdmSchemaFile ( str schemaFile)
Read a schema file in YAML format.

Parameters
----------
schemaFile : `str`
    Fully specified path to the file to be read.

Returns
-------
schemaTable : `dict` [`str`, `felis.datamodel.Table`]
    A dict of the table definitions found in the specified schema file.

Raises
------
ValueError
    If the schema file can't be parsed.

Definition at line 81 of file schemaUtils.py.

Variable Documentation

◆ _dtype_map

dict lsst.pipe.tasks.schemaUtils._dtype_map
protected
Initial value:
= {
    felis.datamodel.DataType.double: ("float64", "float64"),  # Cassandra utilities need np.nan not pd.NA
    felis.datamodel.DataType.float: ("float32", "float32"),  # Cassandra utilities need np.nan not pd.NA
    felis.datamodel.DataType.timestamp: ("datetime64[ms]", "datetime64[ms]"),
    felis.datamodel.DataType.long: ("Int64", "int64"),
    felis.datamodel.DataType.int: ("Int32", "int32"),
    felis.datamodel.DataType.short: ("Int16", "int16"),
    felis.datamodel.DataType.byte: ("Int8", "int8"),
    felis.datamodel.DataType.binary: ("object", "object"),
    felis.datamodel.DataType.char: ("object", "object"),
    felis.datamodel.DataType.text: ("object", "object"),
    felis.datamodel.DataType.string: ("object", "object"),
    felis.datamodel.DataType.unicode: ("object", "object"),
    felis.datamodel.DataType.boolean: ("boolean", "bool"),
}

Definition at line 40 of file schemaUtils.py.
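
Each entry in the map holds a pair of dtype names. The sketch below shows one way such a pair could be consulted by a lookup like `column_dtype`. It assumes, from entries such as `("Int64", "int64")`, that the first element is the dtype for nullable columns and the second is the dtype for non-nullable ones; this page does not state that ordering. `StandInType`, `DTYPE_MAP`, and `lookup_dtype` are hypothetical names, and the enum is a stand-in so the example runs without `felis` installed.

```python
from enum import Enum


class StandInType(Enum):
    """Hypothetical stand-in for a few felis.datamodel.DataType members."""
    double = "double"
    long = "long"
    boolean = "boolean"
    unmapped = "unmapped"  # deliberately absent from the map below


# Subset of the map above. Assumed layout: (nullable dtype, non-nullable dtype).
DTYPE_MAP = {
    StandInType.double: ("float64", "float64"),
    StandInType.long: ("Int64", "int64"),
    StandInType.boolean: ("boolean", "bool"),
}


def lookup_dtype(type_, nullable=False):
    """Return the dtype name for type_, honouring the nullable flag."""
    try:
        nullable_dtype, plain_dtype = DTYPE_MAP[type_]
    except KeyError:
        # Mirror the documented behaviour: unknown types raise TypeError.
        raise TypeError(f"Unexpected type: {type_}") from None
    return nullable_dtype if nullable else plain_dtype


print(lookup_dtype(StandInType.long))
print(lookup_dtype(StandInType.long, nullable=True))
```

Under this reading, the floating-point entries repeat the same dtype in both positions because, as the inline comments note, the Cassandra utilities need `np.nan` rather than `pd.NA` for missing values.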