Table API

API for working with the Table type.

Import this module like this:

from sympathy.api import table

A Table with columns, where each column has a name and a data type. All columns in the Table must always be of the same length.

Any node port with the Table type represents an object of this class.

Accessing the data

There are multiple APIs in the Table for adding and reading columns. The simplest one is to use indexing with column names:

>>> import numpy as np
>>> from sympathy.api import table
>>> mytable = table.Table()
>>> mytable['foo'] = np.array([1, 2, 3])
>>> print(mytable['foo'])
[1 2 3]

It is also possible to convert between a Table and a pandas DataFrame (to_dataframe(), from_dataframe()), a numpy recarray (to_recarray(), from_recarray()), or a generator/list of rows (to_rows(), from_rows()).

The size of the table can easily be found with the methods number_of_rows() and number_of_columns(). The column names are available via the method column_names().
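As a minimal sketch, the three methods above can be combined into a one-line summary. The helper name summarize is hypothetical and not part of sympathy; it only calls the methods named in this section on whatever table object it is given.

```python
def summarize(tbl):
    """Return a short text summary of a table-like object.

    summarize is a hypothetical helper; tbl is assumed to provide
    number_of_rows(), number_of_columns() and column_names().
    """
    rows = tbl.number_of_rows()
    cols = tbl.number_of_columns()
    names = ", ".join(tbl.column_names())
    return f"{rows} rows x {cols} columns ({names})"
```

Inside a node, the table read from an input port would be passed as tbl.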

Column and Table attributes

Both the Table itself and any of its columns can have attributes attached to them. Attribute values must be scalars (str, float, int, bool).

There is currently no support for storing datetimes or timedeltas in attributes. As a workaround, you can convert the datetimes or timedeltas to either strings or floats and store those instead.
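One possible way to apply this workaround, using only the Python standard library: datetimes are encoded as ISO 8601 strings and timedeltas as a float number of seconds. The helper names are hypothetical, and the choice of encoding is an assumption; any reversible string or float representation would work equally well.

```python
from datetime import datetime, timedelta, timezone


def datetime_to_attr(value: datetime) -> str:
    """Encode a datetime as an ISO 8601 string (a valid attribute value)."""
    return value.isoformat()


def attr_to_datetime(text: str) -> datetime:
    """Decode a string produced by datetime_to_attr."""
    return datetime.fromisoformat(text)


def timedelta_to_attr(value: timedelta) -> float:
    """Encode a timedelta as a float number of seconds."""
    return value.total_seconds()


def attr_to_timedelta(seconds: float) -> timedelta:
    """Decode a float produced by timedelta_to_attr."""
    return timedelta(seconds=seconds)
```

The encoded values are plain str and float scalars, so they can be stored with set_table_attributes() or set_column_attributes() and decoded again after reading them back.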

Name restrictions

Column names, attribute names and table names can be almost any unicode strings. For column names an empty string or a single period (.) are not allowed. For attribute names only the empty string is not allowed. The names of table attributes must also not be of the format __table_*__ since this is reserved for storing attributes internal to the Sympathy platform.
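The rules above can be sketched as three small checks. These helpers are hypothetical and not part of sympathy, and the sketch assumes that the * in __table_*__ is a wildcard matching any run of characters.

```python
import re

# Assumption: "__table_*__" is a glob-like pattern, so any name that starts
# with "__table_" and ends with "__" is treated as reserved.
_RESERVED_TABLE_ATTR = re.compile(r"__table_.*__", re.DOTALL)


def is_valid_column_name(name: str) -> bool:
    """Column names must not be empty or a single period."""
    return name not in ("", ".")


def is_valid_column_attr_name(name: str) -> bool:
    """Column attribute names must not be empty."""
    return name != ""


def is_valid_table_attr_name(name: str) -> bool:
    """Table attribute names must not be empty or match __table_*__."""
    return name != "" and _RESERVED_TABLE_ATTR.fullmatch(name) is None
```

Checking names before writing them makes it possible to report an invalid name clearly instead of relying on whatever error the platform raises.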

Class table.Table

class sympathy.api.table.Table(filename: Optional[str] = None, mode: str = 'r', **kwargs)

A Table with columns, where each column has a name and a data type. All columns in the Table must always be of the same length.

__contains__(key: str) bool

Return True if table contains a column named key.

Equivalent to has_column().

__deepcopy__(memo: Optional[dict] = None)

Return new TypeAlias object that does not share references with self.

Must be re-implemented by subclasses that introduce additional fields to ensure that the fields are copied to the returned object.

__delitem__(key: str)

Delete column named key.

New in version 4.0.0.

__getitem__(index: Union[str, _index_type_2d]) Union[Table, np.ndarray]
Return type:

Table

Return a new Table object with a subset of the table data.

This method fully supports both one- and two-dimensional single indices and slices.

Examples:

>>> from sympathy.api import table
>>> mytable = table.Table.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> mytable.to_dataframe()
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
>>> mytable[1].to_dataframe()
   a  b  c
0  4  5  6
>>> mytable[:,1].to_dataframe()
   b
0  2
1  5
2  8
>>> mytable[1,1].to_dataframe()
   b
0  5
>>> mytable[:2,:2].to_dataframe()
   a  b
0  1  2
1  4  5
>>> mytable[::2,::2].to_dataframe()
   a  c
0  1  3
1  7  9
>>> mytable[::-1,:].to_dataframe()
   a  b  c
0  7  8  9
1  4  5  6
2  1  2  3

If the key (index) is a string, it is assumed to be a column name and that column array will be returned.

__init__(filename: Optional[str] = None, mode: str = 'r', **kwargs)

Fileobj is a file owned by self and should be closed by self. Data is a borrowed file and shall not be closed by self. Filename is used to construct a new fileobj. Mode and scheme are used together with filename to construct the new fileobj. Import_links is only usable together with filename and enables links to the file source to be written.

Fileobj, data and filename are mutually exclusive.

__setitem__(index: Union[str, _index_type_2d], other_table: Union[Table, np.ndarray])

Update the values at index with the values from other_table.

This method fully supports both one- and two-dimensional single indices and slices, but the dimensions of the slice must be the same as the dimensions of other_table.

If the key (index) is a string, it is assumed to be a column name and the value (other_table) argument an array.

__str__() str

Return str(self).

attr(name: str) Optional[_attr_type]

Get the table’s attribute with name.

property attrs: dict

Return dictionary of attributes for table.

New in version 1.3.4.

clear()

Clear the table. All columns and attributes will be removed.

col(name: str) Column

Get a Column object for column with name.

New in version 1.3.4.

cols() Sequence[Column]

Get a list of all columns as Column objects.

New in version 1.3.4.

column_names() List[str]

Return a list with the names of the table columns.

column_type(column) dtype

Return the dtype of column named column.

completions(**kwargs) CompletionBuilder

Return completions builder for this object.

classmethod from_array(column_names: Sequence[str], array: ndarray) Table

Return a new Table with data from numpy 2D array, array. column_names should be a list of strings which are used to name the resulting columns.

New in version 4.0.0.

static from_dataframe(dataframe: DataFrame) Table

Return a new Table with data from pandas dataframe dataframe.

static from_matrix(column_names, matrix)

Warning

from_matrix was removed in version 5.0.0.

Please use sympathy.api.table.Table.from_array instead.

static from_recarray(recarray: recarray) Table

Return a new Table with data from numpy.recarray object recarray.

classmethod from_rows(column_names: Sequence[str], rows: Iterable[Sequence], column_types: Optional[Sequence[_dtype_type]] = None, missing_type: Optional[_dtype_type] = None) Table

Returns new Table with data from iterable rows and the specified column names. Columns with only None values or no rows are ignored unless the column type is specified.

Parameters:
  • column_names – Used to name the resulting columns.

  • rows – Zero or more rows of cell values. Number of values should be the same for each row in the data.

  • column_types – Used to specify type for the resulting columns, must be of the same length as column_names.

  • missing_type – Used to specify type for columns with no rows or only None values. Can be used instead of column_types to avoid specifying all types.

The lengths of column_names and column_types (when provided) should match the number of cell values for each row.

Return type:

Table

get_attributes() Tuple[dict, dict]

Get all table attributes and all column attributes.

Returns a tuple where the first element contains all the table attributes and the second element contains all the column attributes.

get_column_attributes(column_name: str) _attr_dict_type

Return dictionary of attributes for column_name.

get_column_to_array(column_name: str, index: Optional[_index_type] = None, kind: str = 'numpy') np.ndarray

Return named column as an array.

The return type is numpy.ndarray when kind is ‘numpy’ (the default) and dask.array.Array when kind is ‘dask’.

Dask arrays can be used to reduce memory use in locked subflows by handling data more lazily.

get_column_to_series(column_name: str) Series

Return named column as pandas series.

get_name() str

Return table name or None if name is not set.

get_table_attributes() dict

Return dictionary of attributes for table.

has_column(key: str) bool

Return True if table contains a column named key.

New in version 1.1.3.

hjoin(other_table: Table, mask: bool = False, rename: bool = False)

Add the columns from other_table.

Analogous to update().

classmethod icon() str

Return full path to svg icon.

is_empty() bool

Returns True if the table is empty.

names(kind: Optional[str] = None, fields: Optional[Sequence[str]] = None, **kwargs) Any

The names that can be automatically adjusted from a table.

kind should be one of ‘cols’ (all column names), ‘attrs’ (all table attribute names), or ‘name’ (the table name).

number_of_columns() int

Return the number of columns in the table.

number_of_rows() int

Return the number of rows in the table.

set_attributes(attributes: Tuple[dict, dict])

Set table attributes and column attributes at the same time.

Input should be a tuple of dictionaries where the first element of the tuple contains the table attributes and the second element contains the column attributes.
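Because get_attributes() returns the same tuple shape that set_attributes() accepts, the two can be chained to carry all attributes from one table to another. The sketch below uses a hypothetical helper name and assumes that the destination table already has the columns whose attributes are being set; that requirement is not stated in this reference.

```python
def copy_attributes(src, dst):
    """Copy table and column attributes from src to dst.

    copy_attributes is a hypothetical helper. It passes the
    (table attributes, column attributes) tuple returned by
    src.get_attributes() straight to dst.set_attributes().
    Assumption: dst already contains the corresponding columns.
    """
    dst.set_attributes(src.get_attributes())
```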

set_column_attributes(column_name: str, attributes: dict)

Set dictionary of scalar attributes for column_name.

Attribute values can be any numbers or strings.

set_column_from_array(column_name: str, array: np.ndarray, attributes: Optional[_attr_dict_type] = None)

Write numpy array to column named by column_name. If the column already exists it will be replaced.

set_column_from_series(series: Series, name: Optional[str] = None)

Write pandas series to the column named by name. If name is None, series.name is used as the column name. The resulting name should not be None.

If the column already exists it will be replaced.
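Since the resulting name must not be None, a caller that handles unnamed series may want to supply a fallback. A minimal sketch, with a hypothetical helper name and a hypothetical default_name parameter:

```python
def set_series_with_fallback(tbl, series, default_name):
    """Write series to tbl, using default_name if the series is unnamed.

    Hypothetical helper: it only relies on series.name and on
    tbl.set_column_from_series(series, name) as documented above.
    """
    name = series.name if series.name is not None else default_name
    tbl.set_column_from_series(series, name)
```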

set_name(name: str)

Set table name. Use None to unset the name.

set_table_attributes(attributes: dict)

Set table attributes to those in dictionary attributes.

Attribute values can be any numbers or strings. Replaces any old table attributes.
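Because the old attributes are replaced, adding a few attributes while keeping the existing ones requires reading them first. A minimal sketch with a hypothetical helper name, using only get_table_attributes() and set_table_attributes():

```python
def merge_table_attributes(tbl, new_attributes):
    """Add new_attributes to tbl without dropping its existing attributes.

    Hypothetical helper. Existing attributes are read first, then
    overlaid with new_attributes (new values win on a name clash),
    and the merged dictionary is written back in one call.
    """
    merged = dict(tbl.get_table_attributes())
    merged.update(new_attributes)
    tbl.set_table_attributes(merged)
```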

Example:

>>> from sympathy.api import table
>>> mytable = table.Table()
>>> mytable.set_table_attributes(
...     {'Thou shall count to': 3,
...      'Ingredients': 'Spam'})

source(other_table: Table, shallow: bool = False)

Update self with the data from other_table, discarding any previous state in self.

Parameters:
  • other_table (type of self) – Object used as the source for (to update) self.

  • shallow (bool) –

    When shallow is True, a deepcopy of other_table is avoided to improve performance. shallow=True must only be used in operations that do not modify other_table.

    When shallow is False, the result should be similar to performing the operation with shallow=True on a deepcopy of other_table, so that no modification of either self or other_table after the source operation can affect the other object.

to_array() ndarray

Return 2D numpy array with all the row data in the table.

Changed in version 5.2.0: If any column has masked values, the output will be a masked array.

New in version 4.0.0.

to_csv(filename: str, header: bool = True, encoding: str = 'UTF-8', delimiter: str = ';', quotechar: str = '"')

Save/Export table to filename.
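Note that the default delimiter in the signature above is a semicolon. To write a comma-separated file, the delimiter can be passed explicitly, as in this sketch (the helper name is hypothetical):

```python
def export_comma_separated(tbl, filename):
    """Export tbl to filename using commas instead of the default ';'.

    Hypothetical helper; all keyword arguments are taken from the
    to_csv signature documented above.
    """
    tbl.to_csv(
        filename,
        header=True,
        encoding='UTF-8',
        delimiter=',',
        quotechar='"',
    )
```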

to_dataframe() DataFrame

Return pandas DataFrame object with all columns in table.

to_matrix()

Warning

to_matrix was removed in version 5.0.0.

Please use sympathy.api.table.Table.to_array instead.

to_recarray() recarray

Return numpy.recarray object with the table content or None if there are no columns.

to_rows() Iterable[Sequence]

Return a generator over the table’s rows.

Each row will be represented as a tuple of values.
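The row tuples carry no column names, so a common pattern is to pair them with column_names(). The sketch below uses a hypothetical helper name and assumes that the values in each row appear in the same order as the names returned by column_names().

```python
def rows_as_dicts(tbl):
    """Return the rows of tbl as a list of {column name: value} dicts.

    Hypothetical helper. Assumption: each tuple yielded by to_rows()
    is ordered like the list returned by column_names().
    """
    names = tbl.column_names()
    return [dict(zip(names, row)) for row in tbl.to_rows()]
```

This materializes every row in memory; for large tables it may be preferable to iterate over to_rows() directly.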

update(other_table: Table)

Update this table with the columns from other_table, keeping the existing columns that are not present in other_table.

If a column exists in both tables the one from other_table is used. Creates links where possible.

update_column(column_name: str, other_table: Table, other_name: Optional[str] = None)

Updates a column from a column in another table.

The column other_name from other_table will be copied into column_name. If column_name already exists it will be replaced.

If other_name is not given, column_name is also used as the name of the source column in other_table.
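Note the argument order: the destination name comes first and the source name last. A minimal sketch that copies several columns under new names, using a hypothetical helper and a hypothetical renames mapping:

```python
def copy_renamed_columns(dst, src, renames):
    """Copy columns from src into dst under new names.

    Hypothetical helper. renames maps each source column name in src
    to the name the column should get in dst.
    """
    for other_name, column_name in renames.items():
        # update_column(column_name, other_table, other_name):
        # destination name first, source table second, source name last.
        dst.update_column(column_name, src, other_name)
```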

version() str

Return the version as a string. This is useful when loading existing files from disk.

New in version 1.2.5.

classmethod viewer() Type[TableViewer]

Return viewer class, which must be a subclass of sympathy.api.typeutil.ViewerBase.

vjoin(other_tables: Iterable[Table], input_index: str = '', output_index: str = '', fill: Optional[bool] = True, minimum_increment: int = 1)

Add the rows from the other_tables at the end of this table.

Parameters:
  • other_tables

  • input_index – Has no effect anymore

  • output_index – Column name for output index column generated

  • fill – When True, attempt to fill with NaN or a zero-like value. When False, discard columns not present in all other_tables. When None, mask output.

  • minimum_increment – Index increment added for empty tables.

vsplit(output_list: list, input_index: Optional[ndarray], remove_fill: bool)

Split the current table to a list of tables by rows.

Class table.Column

class sympathy.api.table.Column(name: str, parent_data: sytable)

The Column class provides a read-only interface to a column in a Table.

attr(name) _attr_type

Return the value of the column attribute name.

property attrs: _attr_dict_type

A dictionary of all column attributes of this column.

property data: ndarray

The data of the column as a numpy array. Equivalent to calling Table.get_column_to_array().

property name: str

The name of the column.