Table API¶
API for working with the Table port type.
Import this module like this:
from sympathy.api import table
A Table with columns, where each column has a name and a data type. All columns in the Table must always be of the same length.
Accessing the data¶
There are multiple APIs in the Table for adding and reading columns. The simplest one is to use indexing with column names:
>>> import numpy as np
>>> from sympathy.api import table
>>> mytable = table.Table()
>>> mytable['foo'] = np.array([1, 2, 3])
>>> print(mytable['foo'])
[1 2 3]
It is also possible to convert between a Table and a pandas DataFrame (Table.to_dataframe(), Table.from_dataframe()), a numpy recarray (Table.to_recarray(), Table.from_recarray()), or a generator/list of rows (Table.to_rows(), Table.from_rows()).
The size of the table can be found with the methods Table.number_of_rows() and Table.number_of_columns(). The column names are available via the method Table.column_names().
Column and Table attributes¶
Both the Table itself and any of its columns can have attributes attached to them. Attributes can be scalar values (str, float, int, bool).
There is currently no support for storing datetimes or timedeltas in attributes. As a workaround, you can convert the datetimes or timedeltas to either strings or floats and store those instead.
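The workaround above can be sketched with the standard library alone. The attribute names `recorded_at` and `duration_s` are made up for illustration, and the ISO 8601 string and seconds-as-float encodings are one reasonable choice, not something the platform prescribes.

```python
from datetime import datetime, timedelta

# Values we would like to keep as attributes.
recorded_at = datetime(2024, 5, 17, 13, 45, 0)
duration = timedelta(minutes=2, seconds=30)

# Encode them as plain scalars (str and float), which attributes accept.
attributes = {
    'recorded_at': recorded_at.isoformat(),
    'duration_s': duration.total_seconds(),
}

# Decode them again after the attributes have been read back.
restored_at = datetime.fromisoformat(attributes['recorded_at'])
restored_duration = timedelta(seconds=attributes['duration_s'])
```

A dictionary such as `attributes` could then be passed to Table.set_table_attributes().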
Name restrictions¶
Column names, attribute names and table names can be almost any unicode
strings. For column names an empty string or a single period (.) are not
allowed. For attribute names only the empty string is not allowed. The names of
table attributes must also not be of the format __table_*__ since this is
reserved for storing attributes internal to the Sympathy platform.
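The rules above can be written down as small checks. The helper functions below are hypothetical and not part of the Sympathy API, and reading the `*` in `__table_*__` as "any characters" is an assumption.

```python
import re

# Assumed reading of the reserved format: '__table_', anything, then '__'.
_RESERVED_TABLE_ATTR = re.compile(r'__table_.*__', re.DOTALL)


def is_valid_column_name(name: str) -> bool:
    """Column names may not be empty or a single period."""
    return name not in ('', '.')


def is_valid_attribute_name(name: str) -> bool:
    """Attribute names may not be empty."""
    return name != ''


def is_valid_table_attribute_name(name: str) -> bool:
    """Table attribute names must also avoid the reserved format."""
    return (is_valid_attribute_name(name)
            and _RESERVED_TABLE_ATTR.fullmatch(name) is None)
```

Such checks could be run on user-supplied names before they are written to a table, so that invalid names are caught early.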
Class table.Table¶
- class sympathy.api.table.Table(filename: str | None = None, mode: str = 'r', **kwargs)¶
A Table with columns, where each column has a name and a data type. All columns in the Table must always be of the same length.
- __contains__(key: str) bool ¶
Return True if table contains a column named key.
Equivalent to has_column().
- __deepcopy__(memo: dict | None = None)¶
Return new TypeAlias object that does not share references with self.
Must be re-implemented by subclasses that introduce additional fields to ensure that the fields are copied to the returned object.
- __delitem__(key: str)¶
Delete column named key.
Added in version 4.0.0.
- __getitem__(index: str | _index_type_2d) Table | np.ndarray ¶
If index is a string, it is assumed to be a column name and that column will be returned as a numpy array:
>>> from sympathy.api import table
>>> mytable = table.Table.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> mytable['a']
array([1, 4, 7])
Otherwise return a new Table object with a subset of the table data. This method fully supports both one- and two-dimensional single indices and slices:
>>> from sympathy.api import table
>>> mytable = table.Table.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> print(mytable)
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
(3 x 3)
>>> print(mytable[1])
   a  b  c
0  4  5  6
(1 x 3)
>>> print(mytable[:,1])
   b
0  2
1  5
2  8
(3 x 1)
>>> print(mytable[1,1])
   b
0  5
(1 x 1)
>>> print(mytable[:2,:2])
   a  b
0  1  2
1  4  5
(2 x 2)
>>> print(mytable[::2,::2])
   a  c
0  1  3
1  7  9
(2 x 2)
>>> print(mytable[::-1,:])
   a  b  c
0  7  8  9
1  4  5  6
2  1  2  3
(3 x 3)
- __init__(filename: str | None = None, mode: str = 'r', **kwargs)¶
Fileobj is a file owned by self and should be closed by self. Data is a borrowed file and shall not be closed by self. Filename is used to construct a new fileobj. Mode and scheme are used together with filename to construct the filename. Import_links is only usable together with filename and enables links to the file source to be written.
Fileobj, data and filename are mutually exclusive.
- __setitem__(index: str | _index_type_2d, other_table: Table | np.ndarray)¶
Update the column or the values at index with the values from other_table.
If index is a string, it is assumed to be a column name. other_table should in this case be a numpy array:
>>> from sympathy.api import table
>>> import numpy as np
>>> mytable = table.Table.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> mytable['a'] = np.array([10, 40, 70])
>>> print(mytable)
    a  b  c
0  10  2  3
1  40  5  6
2  70  8  9
(3 x 3)
This method fully supports both one- and two-dimensional single indices and slices, but the dimensions of the slice must be the same as the dimensions of other_table:
>>> from sympathy.api import table
>>> mytable = table.Table.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> newtable = table.Table.from_rows(
...     ['b', 'c'],
...     [[50, 60], [80, 90]])
>>> mytable[1:,1:] = newtable
>>> print(mytable)
   a   b   c
0  1   2   3
1  4  50  60
2  7  80  90
(3 x 3)
- __str__() str ¶
Return str(self).
- attr(name: str) _attr_type | None ¶
Get the table’s attribute with name.
- property attrs: dict¶
Return dictionary of attributes for table.
Added in version 1.3.4.
- clear()¶
Clear the table. All columns and attributes will be removed.
- column_names() List[str] ¶
Return a list with the names of the table columns.
- column_type(column) dtype ¶
Return the dtype of column named column.
- completions(**kwargs) CompletionBuilder ¶
Return completions builder for this object.
- classmethod from_array(column_names: Sequence[str], array: ndarray) Table ¶
Return a new Table with data from numpy 2D array, array. column_names should be a list of strings which are used to name the resulting columns.
Added in version 4.0.0.
- static from_dataframe(dataframe: DataFrame) Table ¶
Return a new Table with data from pandas dataframe dataframe.
- static from_recarray(recarray: recarray) Table ¶
Return a new Table with data from numpy.recarray object recarray.
- classmethod from_rows(column_names: Sequence[str], rows: Iterable[Sequence], column_types: Sequence[_dtype_type] | None = None, missing_type: _dtype_type | None = None) Table ¶
Returns new Table with data from iterable rows and the specified column names. Columns with only None values or no rows are ignored unless the column type is specified.
- Parameters:
column_names – Used to name the resulting columns.
rows – Zero or more rows of cell values. Number of values should be the same for each row in the data.
column_types – Used to specify type for the resulting columns, must be of the same length as column_names.
missing_type – Used to specify type for columns with no rows or only None values. Can be used instead of column_types to avoid specifying all types.
The lengths of column_names and column_types (when provided) should match the number of cell values for each row.
- Return type:
Table
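The rule about ignored columns can be modelled in plain Python. This is only a sketch of the documented behaviour, not Sympathy's implementation, and `kept_columns` is a hypothetical helper that does not exist in the API.

```python
from typing import Iterable, List, Optional, Sequence


def kept_columns(column_names: Sequence[str],
                 rows: Iterable[Sequence],
                 column_types: Optional[Sequence] = None,
                 missing_type=None) -> List[str]:
    """Names of the columns expected to survive, per the rule above."""
    rows = list(rows)
    if column_types is not None or missing_type is not None:
        # A type is available for every column, so none is dropped.
        return list(column_names)
    kept = []
    for i, name in enumerate(column_names):
        # Keep a column only if at least one row holds a real value in it.
        if any(row[i] is not None for row in rows):
            kept.append(name)
    return kept


names = ['a', 'b', 'c']
rows = [[1, None, 'x'], [2, None, 'y']]
without_types = kept_columns(names, rows)
with_missing_type = kept_columns(names, rows, missing_type=float)
no_rows = kept_columns(names, [])
```

Under this model, column 'b' is dropped unless a type is supplied, and a call with no rows and no types yields no columns at all.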
- get_attributes() Tuple[dict, dict] ¶
Get all table attributes and all column attributes.
Returns a tuple where the first element contains all the table attributes and the second element contains all the column attributes.
- get_column_attributes(column_name: str) _attr_dict_type ¶
Return dictionary of attributes for column_name.
- get_column_to_array(column_name: str, index: _index_type | None = None, kind: str = 'numpy') np.ndarray ¶
Return named column as an array.
When only supplied a column name it is preferred to instead use __getitem__():
# These are equivalent:
data = table[column_name]
data = table.get_column_to_array(column_name)
The return type is numpy.ndarray when kind is ‘numpy’ (the default) and dask.array.Array when kind is ‘dask’.
Dask arrays can be used to reduce memory use in locked subflows by handling data more lazily.
- get_column_to_series(column_name: str) Series ¶
Return named column as pandas series.
- get_name() str ¶
Return table name or None if name is not set.
- get_table_attributes() dict ¶
Return dictionary of attributes for table.
- has_column(key: str) bool ¶
Return True if table contains a column named key.
Added in version 1.1.3.
- hjoin(other_table: Table, mask: bool = False, rename: bool = False)¶
Add the columns from other_table.
Equivalent to update().
- classmethod icon() str ¶
Return full path to svg icon.
- is_empty() bool ¶
Returns True if the table is empty.
- names(kind: str | None = None, fields: Sequence[str] | None = None, **kwargs) Any ¶
The names that can be automatically adjusted from a table.
kind should be one of ‘cols’ (all column names), ‘attrs’ (all table attribute names), or ‘name’ (the table name).
- number_of_columns() int ¶
Return the number of columns in the table.
- number_of_rows() int ¶
Return the number of rows in the table.
- set_attributes(attributes: Tuple[dict, dict])¶
Set table attributes and column attributes at the same time.
Input should be a tuple of dictionaries where the first element of the tuple contains the table attributes and the second element contains the column attributes.
- set_column_attributes(column_name: str, attributes: dict)¶
Set dictionary of scalar attributes for column_name.
Attribute values can be any numbers or strings.
- set_column_from_array(column_name: str, array: np.ndarray, attributes: _attr_dict_type | None = None)¶
Write numpy array to column named by column_name. If the column already exists it will be replaced.
Using __setitem__() instead is preferred:
# These are equivalent:
table[column_name] = data
table.set_column_from_array(column_name, data)
- set_column_from_series(series: Series, name: str | None = None)¶
Write pandas series to column named by name. If name is None, series.name is used as the name. The resulting name should not be None.
If the column already exists it will be replaced.
- set_name(name: str)¶
Set table name. Use None to unset the name.
- set_table_attributes(attributes: dict)¶
Set table attributes to those in dictionary attributes.
Attribute values can be any numbers or strings. Replaces any old table attributes.
Example:
>>> from sympathy.api import table
>>> mytable = table.Table()
>>> mytable.set_table_attributes(
...     {'Thou shall count to': 3,
...      'Ingredients': 'Spam'})
- source(other_table: Table, shallow: bool = False)¶
Update self with the data from other, discarding any previous state in self.
- Parameters:
other (type of self) – Object used as the source for (to update) self.
shallow (bool) –
When shallow is True a deepcopy of other will be avoided to improve performance. shallow=True must only be used in operations that do not modify other.
When shallow is False the result should be similar to performing the shallow=True with a deepcopy of other so that no modifications of either self or other, after the source operation, can affect the other object.
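The difference between the two modes can be illustrated by analogy with plain dictionaries and the standard copy module. This is not how Sympathy stores or shares data internally; the dictionaries below merely stand in for tables.

```python
import copy

other = {'a': [1, 2, 3]}

# Analogous to shallow=True: the column data is shared, not copied.
shallow_target = dict(other)

# Analogous to shallow=False: the data is deep-copied first.
deep_target = copy.deepcopy(other)

# Modifying `other` afterwards leaks into the shallow target only.
other['a'].append(4)
```

After the last line, `shallow_target['a']` has also grown, while `deep_target['a']` is unchanged. This is why shallow=True is only safe when other is not modified afterwards.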
- to_array() ndarray ¶
Return 2D numpy array with all the row data in the table.
Changed in version 5.2.0: If any column has masked values, the output will be a masked array.
Added in version 4.0.0.
- to_csv(filename: str, header: bool = True, encoding: str = 'UTF-8', delimiter: str = ';', quotechar: str = '"')¶
Save/Export table to filename.
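The default delimiter and quotechar mean that a file written with the defaults would need matching settings when read back. The sketch below reads such text with the standard csv module. The `exported` string is hand-written for illustration and is not actual to_csv() output, and the exact quoting rules used by to_csv() are an assumption.

```python
import csv
import io

# Text in the shape that the default arguments suggest: a header row,
# ';' between cells and '"' around cells that need quoting.
exported = 'name;comment\nfoo;"semi;colon"\nbar;plain\n'

reader = csv.reader(io.StringIO(exported), delimiter=';', quotechar='"')
header = next(reader)
records = [dict(zip(header, row)) for row in reader]
```

With a real file, `io.StringIO(exported)` would be replaced by `open(filename, newline='', encoding='UTF-8')`.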
- to_dataframe() DataFrame ¶
Return pandas DataFrame object with all columns in table.
- to_recarray() recarray ¶
Return numpy.recarray object with the table content or None if there are no columns.
- to_rows() Iterable[Sequence] ¶
Return a generator over the table’s rows.
Each row will be represented as a tuple of values.
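Conceptually, iterating over rows is a transposition of column-oriented data. The sketch below shows the idea in plain Python. The dictionary `columns` and the helper `iter_rows` are illustrative stand-ins, not part of the Sympathy API.

```python
# Column-oriented data, similar to how a table holds its columns.
columns = {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}


def iter_rows(cols):
    """Yield one tuple per row, lazily, like a generator over rows."""
    yield from zip(*cols.values())


rows = list(iter_rows(columns))

# Going back from rows to columns is the same transposition in reverse.
rebuilt = {name: list(values)
           for name, values in zip(columns, zip(*rows))}
```

Because the rows come from a generator, they can be consumed one at a time without building the full list first.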
- update(other_table: Table)¶
Updates the columns in the table with the columns from other_table, keeping the old ones.
If a column exists in both tables the one from other_table is used. Creates links where possible.
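The override rule resembles updating one dictionary with another. The sketch below uses plain dictionaries as stand-ins for tables. It says nothing about links or about the column order Sympathy produces.

```python
table_columns = {'a': [1, 2], 'b': [3, 4]}
other_columns = {'b': [30, 40], 'c': [5, 6]}

# Old columns are kept, shared names take the data from the other table.
table_columns.update(other_columns)
```

Here 'a' is kept as it was, 'b' is replaced by the data from `other_columns`, and 'c' is added.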
- update_column(column_name: str, other_table: Table, other_name: str | None = None)¶
Updates a column from a column in another table.
The column other_name from other_table will be copied into column_name. If column_name already exists it will be replaced.
When other_name is not used, then column_name will be used instead.
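The fallback from other_name to column_name can be sketched as follows. The function below is a plain-Python model working on dictionaries, not the Sympathy implementation.

```python
from typing import Dict, List, Optional


def update_column(columns: Dict[str, List],
                  column_name: str,
                  other_columns: Dict[str, List],
                  other_name: Optional[str] = None) -> None:
    """Copy one column from other_columns into columns."""
    # Fall back to column_name when no other_name is given.
    source_name = column_name if other_name is None else other_name
    # Copy the values so the two containers do not share a list.
    columns[column_name] = list(other_columns[source_name])


mine = {'a': [1, 2]}
theirs = {'a': [10, 20], 'x': [7, 8]}
update_column(mine, 'a', theirs)       # replaces 'a' with theirs['a']
update_column(mine, 'b', theirs, 'x')  # adds 'b' from theirs['x']
```

The first call replaces an existing column under the same name, and the second call copies a column under a new name.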
- version() str ¶
Return the version as a string. This is useful when loading existing files from disk.
Added in version 1.2.5.
- classmethod viewer() Type[TableViewer] ¶
Return viewer class, which must be a subclass of sympathy.api.typeutil.ViewerBase.
- vjoin(other_tables: Iterable[Table], input_index: str = '', output_index: str = '', fill: bool | None = True, minimum_increment: int = 1)¶
Add the rows from the other_tables at the end of this table.
- Parameters:
other_tables
input_index – Has no effect anymore
output_index – Column name for output index column generated
fill – When True, attempt to fill with NaN or a zero-like value. When False, discard columns not present in all other_tables. When None, mask output.
minimum_increment – Index increment added for empty tables.
- vsplit(output_list: list, input_index: ndarray | None, remove_fill: bool)¶
Split the current table to a list of tables by rows.
Class table.Column¶
- class sympathy.api.table.Column(name: str, parent_data: sytable)¶
The Column class provides a read-only interface to a column in a Table.
- attr(name) _attr_type ¶
Return the value of the column attribute name.
- property attrs: _attr_dict_type¶
A dictionary of all column attributes of this column.
- property data: ndarray¶
The data of the column as a numpy array. Equivalent to calling Table.get_column_to_array().
- property name: str¶
The name of the column.