Table API

API for working with the Table type.

Import this module like this:

from sympathy.api import table

Class table.File

class sympathy.typeutils.table.File(fileobj=None, data=None, filename=None, mode=u'r', scheme=u'hdf5', source=None, managed=False, import_links=False)[source]

A Table with columns, where each column has a name and a data type. All columns in the Table must always be of the same length.

Any node port with the Table type will produce an object of this kind.

The data in the Table can be accessed in different ways depending on whether you plan on using numpy or pandas for processing the data. When using numpy you can access columns individually as numpy arrays via get_column_to_array() and set_column_from_array():

>>> import numpy as np
>>> from sympathy.api import table
>>> mytable = table.File()
>>> mytable.set_column_from_array('foo', np.array([1,2,3]))
>>> print(mytable.get_column_to_array('foo'))
[1 2 3]

Or you can access them as pandas Series using get_column_to_series() and set_column_from_series():

>>> import pandas
>>> from sympathy.api import table
>>> mytable = table.File()
>>> mytable.set_column_from_series(pandas.Series([1,2,3], name='foo'))
>>> print(mytable.get_column_to_series('foo'))
0    1
1    2
2    3
Name: foo, dtype: int64

When working with the entire table at once, you can choose between numpy recarrays (to_recarray() and from_recarray()), numpy matrices (to_matrix() and from_matrix()), or pandas DataFrames (to_dataframe() and from_dataframe()).

All these different ways of accessing the data can be mixed freely.

__contains__(key)[source]

Return True if table contains a column named key.

Equivalent to has_column().
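The equivalence can be sketched with a small stand-in. FakeTable below is hypothetical and not part of Sympathy; it only mimics the two documented methods so that the snippet runs without Sympathy installed. With a real table.File, the in expression is expected to behave the same way.

```python
class FakeTable:
    """Hypothetical stand-in that mimics only has_column() and __contains__()."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def has_column(self, key):
        return key in self._columns

    def __contains__(self, key):
        # Documented as equivalent to has_column().
        return self.has_column(key)


mytable = FakeTable({'foo': [1, 2, 3]})
has_foo = 'foo' in mytable      # same answer as mytable.has_column('foo')
has_bar = 'bar' in mytable
print(has_foo, has_bar)
```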

__deepcopy__(memo=None)[source]

Return new TypeAlias that does not share references with self. Must be re-implemented by subclasses that define their own storage fields.

__getitem__(index)[source]
Return type: table.File

Return a new table.File object with a subset of the table data.

This method fully supports both one- and two-dimensional single indices and slices.

Examples:

>>> from sympathy.api import table
>>> mytable = table.File.from_rows(
...     ['a', 'b', 'c'],
...     [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> mytable.to_dataframe()
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
>>> mytable[1].to_dataframe()
   a  b  c
0  4  5  6
>>> mytable[:,1].to_dataframe()
   b
0  2
1  5
2  8
>>> mytable[1,1].to_dataframe()
   b
0  5
>>> mytable[:2,:2].to_dataframe()
   a  b
0  1  2
1  4  5
>>> mytable[::2,::2].to_dataframe()
   a  c
0  1  3
1  7  9
>>> mytable[::-1,:].to_dataframe()
   a  b  c
0  7  8  9
1  4  5  6
2  1  2  3

If the key (index) is a string, it is assumed to be a column name and that column array will be returned.
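The indexing rules shown in the examples can be modelled in plain Python. The sketch below is only an illustration under stated assumptions, not Sympathy's implementation: as_slice and select are hypothetical helpers, and the table is represented as a list of column names plus a list of rows.

```python
def as_slice(i):
    """Turn a single integer index into a length-one slice; pass slices through."""
    if isinstance(i, slice):
        return i
    # slice(-1, 0) would be empty, so the last position needs an open stop.
    return slice(i, None) if i == -1 else slice(i, i + 1)


def select(names, rows, index):
    """Return (names, rows) selected by a one- or two-dimensional index."""
    if not isinstance(index, tuple):
        # A one-dimensional index selects rows and keeps every column.
        index = (index, slice(None))
    row_sel, col_sel = (as_slice(i) for i in index)
    return names[col_sel], [row[col_sel] for row in rows[row_sel]]


names = ['a', 'b', 'c']
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

single_row = select(names, rows, 1)                          # like mytable[1]
single_cell = select(names, rows, (1, 1))                    # like mytable[1,1]
strided = select(names, rows, (slice(None, None, 2),) * 2)   # like mytable[::2,::2]
print(single_row, single_cell, strided)
```

Note that a single index still yields a table-shaped result (one row or one column), which matches the examples where mytable[1,1] is a one-by-one table rather than a scalar.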

__setitem__(index, other_table)[source]

Update the values at index with the values from other_table.

This method fully supports both one- and two-dimensional single indices and slices, but the dimensions of the slice must be the same as the dimensions of other_table.

If the key (index) is a string, it is assumed to be a column name and the value (other_table) argument an array.

attr(name)[source]

Return the value of the table attribute name.

attrs

Return dictionary of attributes for table.

New in version 1.3.4.

clear()[source]

Clear the table. All columns and attributes will be removed.

col(name)[source]

Get a Column object for column with name.

New in version 1.3.4.

cols()[source]

Get a list of all columns as Column objects.

New in version 1.3.4.

column_names()[source]

Return a list with the names of the table columns.

column_type(column)[source]

Return the dtype of column named column.

columns(*args, **kwargs)[source]

Return a list with the names of the table columns.

Deprecated since version 1.0: Use column_names() instead.

static from_dataframe(dataframe)[source]

Return a new table.File with data from pandas dataframe dataframe.

static from_matrix(column_names, matrix)[source]

Return a new table.File with data from numpy matrix matrix. column_names should be a list of strings which are used to name the resulting columns.

static from_recarray(recarray)[source]

Return a new table.File with data from numpy.recarray object recarray.

static from_rows(column_names, rows)[source]

Return new table.File with data from iterable rows. column_names should be a list of strings which are used to name the resulting columns.
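Conceptually, building a table from rows means transposing row-oriented data into named columns. The following is a minimal sketch of that idea in plain Python; rows_to_columns is a hypothetical helper and the dict it returns is only a stand-in for a table, not what from_rows() actually produces.

```python
def rows_to_columns(column_names, rows):
    """Transpose an iterable of rows into a dict of column name -> values."""
    columns = {name: [] for name in column_names}
    for row in rows:
        row = list(row)
        if len(row) != len(column_names):
            raise ValueError('each row needs one value per column name')
        for name, value in zip(column_names, row):
            columns[name].append(value)
    return columns


columns = rows_to_columns(
    ['a', 'b', 'c'],
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(columns)
```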

get(*args, **kwargs)[source]

Return numpy rec array.

Deprecated since version 1.0: Use to_recarray() instead.

get_attributes()[source]

Get all table attributes and all column attributes.

Returns a tuple where the first element contains all the table attributes and the second element contains all the column attributes.

get_column(*args, **kwargs)[source]

Return numpy array.

Deprecated since version 1.0: Use get_column_to_array() instead.

get_column_attributes(column_name)[source]

Return dictionary of attributes for column_name.

get_column_to_array(column_name, index=None, kind=u'numpy')[source]

Return named column as an array.

Return type is numpy.ndarray when kind is ‘numpy’ (the default) and dask.array.Array when kind is ‘dask’.

Dask arrays can be used to reduce memory use in locked subflows by handling data more lazily.

get_column_to_series(column_name)[source]

Return named column as pandas series.

get_name()[source]

Return table name or None if name is not set.

get_table_attributes()[source]

Return dictionary of attributes for table.

has_column(key)[source]

Return True if table contains a column named key.

New in version 1.1.3.

hjoin(other_table, mask=False, rename=False)[source]

Add the columns from other_table.

Analogous to update().

classmethod icon()[source]

Return full path to svg icon.

is_empty()[source]

Returns True if the table is empty.

names(kind=None, **kwargs)[source]

The names that can be automatically adjusted from an adaf.

kind should be one of ‘cols’ (all column names), ‘attrs’ (all table attribute names), or ‘name’ (the table name).

number_of_columns()[source]

Return the number of columns in the table.

number_of_rows()[source]

Return the number of rows in the table.

set(*args, **kwargs)[source]

Write rec array.

Deprecated since version 1.0: Use from_recarray() instead.

set_attributes(attributes)[source]

Set table attributes and column attributes at the same time.

Input should be a tuple of dictionaries where the first element of the tuple contains the table attributes and the second element contains the column attributes.
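Because get_attributes() returns the same tuple shape that set_attributes() accepts, the two can be chained to copy all attributes between tables. The sketch below uses a hypothetical FakeTable stand-in so that it runs without Sympathy installed; copy_attributes is also hypothetical, and the layout of the column attributes (a dictionary keyed by column name) is an assumption.

```python
class FakeTable:
    """Hypothetical stand-in exposing only get_attributes()/set_attributes()."""

    def __init__(self, table_attrs=None, column_attrs=None):
        self._table_attrs = dict(table_attrs or {})
        self._column_attrs = dict(column_attrs or {})

    def get_attributes(self):
        # First element: table attributes. Second element: column attributes.
        return dict(self._table_attrs), dict(self._column_attrs)

    def set_attributes(self, attributes):
        table_attrs, column_attrs = attributes
        self._table_attrs = dict(table_attrs)
        self._column_attrs = dict(column_attrs)


def copy_attributes(source_table, target_table):
    """Copy table and column attributes in one call."""
    target_table.set_attributes(source_table.get_attributes())


# Assumed layout: column attributes keyed by column name.
src = FakeTable({'unit system': 'SI'}, {'speed': {'unit': 'm/s'}})
dst = FakeTable()
copy_attributes(src, dst)
copied = dst.get_attributes()
print(copied)
```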

set_column(*args, **kwargs)[source]

Set a column.

Deprecated since version 1.0: Use set_column_from_array() instead.

set_column_attributes(column_name, attributes)[source]

Set dictionary of scalar attributes for column_name.

Attribute values can be any numbers or strings.

set_column_from_array(column_name, array, attributes=None)[source]

Write numpy array to column named by column_name. If the column already exists it will be replaced.

set_column_from_series(series)[source]

Write pandas series to column named by series.name. If the column already exists it will be replaced.

set_name(name)[source]

Set table name. Use None to unset the name.

set_source_id(*args, **kwargs)[source]

Set source identifier. Not implemented.

set_table_attributes(attributes)[source]

Set table attributes to those in dictionary attributes.

Attribute values can be any numbers or strings. Replaces any old table attributes.

Example:

>>> from sympathy.api import table
>>> mytable = table.File()
>>> mytable.set_table_attributes(
...     {'Thou shall count to': 3,
...      'Ingredients': 'Spam'})

source(other_table, shallow=False)[source]

Update self with the data from other_table, discarding the old state. When shallow is False (the default), self is updated with a deep copy of other_table.

self and other_table must be of exactly the same type.
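The difference between the shallow and the default deep behaviour can be illustrated with plain dictionaries standing in for tables. This is a rough model under that assumption, not Sympathy's implementation; the source function below is a hypothetical free function that mirrors the documented semantics.

```python
import copy


def source(target, other, shallow=False):
    """Replace the contents of target with those of other."""
    if type(target) is not type(other):
        raise TypeError('target and other must be of the exact same type')
    data = other if shallow else copy.deepcopy(other)
    target.clear()          # the old state is not kept
    target.update(data)


other = {'a': [1, 2, 3]}

deep = {'old': [0]}
source(deep, other)                     # default: deep copy

shared = {'old': [0]}
source(shared, other, shallow=True)     # shares the column objects

other['a'].append(4)                    # mutate the source afterwards
print(deep, shared)
```

In this model, only the shallow target sees the later change to other, because it still refers to the same underlying list.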

to_dataframe()[source]

Return pandas DataFrame object with all columns in table.

to_matrix()[source]

Return numpy matrix with all the columns in the table.

to_recarray()[source]

Return numpy.recarray object with the table content or None if there are no columns.

to_rows()[source]

Return a generator over the table’s rows.

Each row will be represented as a tuple of values.
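Since to_rows() returns a generator, rows can be consumed lazily, for example to look at only the first few. The sketch below assumes a hypothetical FakeTable stand-in that implements only to_rows(), so it runs without Sympathy installed; head is a hypothetical helper that would work on any object with that method.

```python
from itertools import islice


class FakeTable:
    """Hypothetical stand-in exposing only to_rows()."""

    def __init__(self, columns):
        self._columns = columns

    def to_rows(self):
        # Yield one tuple per row, taking the i-th value of every column.
        for row in zip(*self._columns.values()):
            yield row


def head(tbl, n):
    """Return the first n rows without materialising the rest."""
    return list(islice(tbl.to_rows(), n))


mytable = FakeTable({'a': [1, 4, 7], 'b': [2, 5, 8]})
first_two = head(mytable, 2)
print(first_two)
```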

types(kind=None, **kwargs)[source]

Return types associated with names.

update(other_table)[source]

Update the table with the columns from other_table, keeping existing columns that other_table does not contain.

If a column exists in both tables the one from other_table is used. Creates links where possible.
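The merge rule resembles a dictionary update keyed by column name. A minimal sketch, assuming plain dictionaries as stand-ins for tables (the update function here is hypothetical, and the creation of links is not modelled):

```python
def update(table, other_table):
    """Merge other_table's columns into table, in place."""
    for name, values in other_table.items():
        table[name] = values      # replaces the column if it already exists


mytable = {'a': [1, 2], 'b': [3, 4]}
update(mytable, {'b': [30, 40], 'c': [5, 6]})
print(mytable)
```

Column 'a' is kept, 'b' is taken from the other table, and 'c' is added.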

update_column(column_name, other_table, other_name=None)[source]

Updates a column from a column in another table.

The column other_name from other_table will be copied into column_name. If column_name already exists it will be replaced.

If other_name is not given, column_name is used instead.
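The defaulting of other_name can be sketched as follows, again with plain dictionaries standing in for tables. The free function update_column is a hypothetical model of the documented behaviour, not the library's code.

```python
def update_column(table, column_name, other_table, other_name=None):
    """Copy one column from other_table into table, in place."""
    if other_name is None:
        # Fall back to the same name in the other table.
        other_name = column_name
    table[column_name] = list(other_table[other_name])


mytable = {'x': [0, 0]}
other = {'x': [1, 2], 'y': [3, 4]}

update_column(mytable, 'x', other)          # replaces 'x' with other's 'x'
update_column(mytable, 'z', other, 'y')     # copies other's 'y' into new 'z'
print(mytable)
```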

value(*args, **kwargs)[source]

Return numpy.recarray object with the table content.

Deprecated since version 1.0: Use to_recarray() instead.

version()[source]

Return the version as a string. This is useful when loading existing files from disk.

New in version 1.2.5.

classmethod viewer()[source]

Return viewer class, which must be a subclass of sympathy.api.typeutil.ViewerBase.

vjoin(other_tables, input_index=u'', output_index=u'', fill=True, minimum_increment=1)[source]

Add the rows from the other_tables at the end of this table.
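In its simplest form, a vertical join concatenates the columns of several tables row-wise. The sketch below models only that basic case with dictionaries as stand-in tables and requires identical column names; the input_index, output_index, fill, and minimum_increment parameters are not modelled, and the vjoin function here is hypothetical.

```python
def vjoin(table, other_tables):
    """Append the rows of each table in other_tables to table, in place."""
    other_tables = list(other_tables)
    # Validate everything first so a failure leaves table untouched.
    for other in other_tables:
        if set(other) != set(table):
            raise ValueError('this sketch only handles identical column names')
    for other in other_tables:
        for name in table:
            table[name].extend(other[name])


mytable = {'a': [1], 'b': [2]}
vjoin(mytable, [{'a': [3], 'b': [4]}, {'a': [5, 7], 'b': [6, 8]}])
print(mytable)
```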

vsplit(output_list, input_index, remove_fill)[source]

Split the current table to a list of tables by rows.

Class table.Column

class sympathy.typeutils.table.Column(name, parent_data)[source]

The Column class provides a read-only interface to a column in a Table.

attr(name)[source]

Return the value of the column attribute name.

attrs

A dictionary of all column attributes of this column.

data

The data of the column as a numpy array. Equivalent to calling File.get_column_to_array().

name

The name of the column.