lsst.pipe.tasks gb957171fc7+91f703d445
lsst.pipe.tasks.parquetTable.ParquetTable Class Reference
Inherited by lsst.pipe.tasks.parquetTable.MultilevelParquetTable.

Public Member Functions

 __init__ (self, filename=None, dataFrame=None)
 
 write (self, filename)
 
 pandasMd (self)
 
 columnIndex (self)
 
 columns (self)
 
 toDataFrame (self, columns=None)
 

Public Attributes

 filename
 
 columns
 

Protected Member Functions

 _getColumnIndex (self)
 
 _getColumns (self)
 
 _sanitizeColumns (self, columns)
 

Protected Attributes

 _pf
 
 _df
 
 _pandasMd
 
 _columns
 
 _columnIndex
 

Detailed Description

Thin wrapper around pyarrow's `ParquetFile` object.

Call the `toDataFrame` method to get a `pandas.DataFrame` object,
optionally passing specific columns.

The main purpose of this wrapper, rather than using
`pyarrow.ParquetFile` directly, is to make it easier to load
selected subsets of columns, especially from DataFrames with multi-level
column indices.

Instantiate with either a path to a Parquet file or a `pandas.DataFrame`.

Parameters
----------
filename : str, optional
    Path to Parquet file.
dataFrame : pandas.DataFrame, optional
    DataFrame to wrap, as an alternative to `filename`.

Definition at line 41 of file parquetTable.py.

Constructor & Destructor Documentation

◆ __init__()

lsst.pipe.tasks.parquetTable.ParquetTable.__init__ (self, filename=None, dataFrame=None)

Reimplemented in lsst.pipe.tasks.parquetTable.MultilevelParquetTable.

Definition at line 61 of file parquetTable.py.

Member Function Documentation

◆ _getColumnIndex()

lsst.pipe.tasks.parquetTable.ParquetTable._getColumnIndex (self)
protected

Reimplemented in lsst.pipe.tasks.parquetTable.MultilevelParquetTable.

Definition at line 105 of file parquetTable.py.

◆ _getColumns()

lsst.pipe.tasks.parquetTable.ParquetTable._getColumns (self)
protected

Reimplemented in lsst.pipe.tasks.parquetTable.MultilevelParquetTable.

Definition at line 124 of file parquetTable.py.

◆ _sanitizeColumns()

lsst.pipe.tasks.parquetTable.ParquetTable._sanitizeColumns (self, columns)
protected

Definition at line 130 of file parquetTable.py.

◆ columnIndex()

lsst.pipe.tasks.parquetTable.ParquetTable.columnIndex (self)

Columns as a pandas Index

Definition at line 98 of file parquetTable.py.

◆ columns()

lsst.pipe.tasks.parquetTable.ParquetTable.columns (self)

List of column names (or column index if df is set)

This may either be a list of column names, or a
pandas.Index object describing the column index, depending
on whether the ParquetTable object is wrapping a ParquetFile
or a DataFrame.

Definition at line 112 of file parquetTable.py.
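
Because `columns` may be either a list or a `pandas.Index`, calling code sometimes needs to handle both forms. The helper below is a hypothetical sketch (`column_names` is not part of this class) that normalizes either form to a plain list using only pandas; the column names are illustrative.

```python
import pandas as pd


def column_names(columns):
    """Return column labels as a plain list, given a list or a pandas.Index.

    Hypothetical helper for illustration; not part of ParquetTable.
    """
    if isinstance(columns, pd.Index):
        return columns.tolist()
    return list(columns)


# A plain list of names, as might be reported for a Parquet file.
from_file = ["coord_ra", "coord_dec"]

# A pandas.Index, as reported by a DataFrame.
from_frame = pd.DataFrame({"coord_ra": [10.0], "coord_dec": [-5.0]}).columns

names_a = column_names(from_file)
names_b = column_names(from_frame)
```

For a multi-level column index, `tolist()` yields tuples of level values rather than strings, so callers would still need to account for that case.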

◆ pandasMd()

lsst.pipe.tasks.parquetTable.ParquetTable.pandasMd (self)

Definition at line 90 of file parquetTable.py.

◆ toDataFrame()

lsst.pipe.tasks.parquetTable.ParquetTable.toDataFrame (self, columns=None)

Get table (or specified columns) as a pandas DataFrame

Parameters
----------
columns : list, optional
    Desired columns.  If `None`, then all columns will be
    returned.

Reimplemented in lsst.pipe.tasks.parquetTable.MultilevelParquetTable.

Definition at line 133 of file parquetTable.py.

◆ write()

lsst.pipe.tasks.parquetTable.ParquetTable.write (self, filename)

Write pandas dataframe to parquet

Parameters
----------
filename : str
    Path to which to write.

Definition at line 76 of file parquetTable.py.

Member Data Documentation

◆ _columnIndex

lsst.pipe.tasks.parquetTable.ParquetTable._columnIndex
protected

Definition at line 74 of file parquetTable.py.

◆ _columns

lsst.pipe.tasks.parquetTable.ParquetTable._columns
protected

Definition at line 73 of file parquetTable.py.

◆ _df

lsst.pipe.tasks.parquetTable.ParquetTable._df
protected

Definition at line 65 of file parquetTable.py.

◆ _pandasMd

lsst.pipe.tasks.parquetTable.ParquetTable._pandasMd
protected

Definition at line 66 of file parquetTable.py.

◆ _pf

lsst.pipe.tasks.parquetTable.ParquetTable._pf
protected

Definition at line 64 of file parquetTable.py.

◆ columns

lsst.pipe.tasks.parquetTable.ParquetTable.columns

Definition at line 109 of file parquetTable.py.

◆ filename

lsst.pipe.tasks.parquetTable.ParquetTable.filename

Definition at line 62 of file parquetTable.py.


The documentation for this class was generated from the following file: