Convert specific columns in Tables¶
- Input ports:
port1: [table]
Input Table
- Output ports:
port2: [table]
Tables with converted columns
- Configuration:
- Select columns (in_column_list)
- Select the columns to use
- Select type (in_type_list)
- Select the type to use
- Convert columns (out_column_list)
- Selected columns to convert
- Convert types (out_type_list)
- Selected types to use
This node converts the data types of selected columns in the incoming Table. In general, columns in the internal Table type can have any data type that exists for numpy arrays, except the numpy object type. For this node, the list of data types available for conversion is restricted.
- The following data types are available for conversion:
- binary
- bool
- datetime (UTC or naive)
- float
- integer
- text
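As a rough illustration of what such conversions mean for numpy-backed columns, the sketch below uses plain numpy `astype` calls. It is not the node's implementation; the column contents and variable names are made up for the example.

```python
import numpy as np

# Hypothetical text column, as it might arrive in a Table.
text_column = np.array(["1", "2", "3"])

# text -> integer: each string is parsed as a whole number.
int_column = text_column.astype(int)

# integer -> float: values are kept, only the storage type changes.
float_column = int_column.astype(float)

# integer -> bool: zero becomes False, everything else True.
bool_column = np.array([0, 1, 2]).astype(bool)

# float -> text: values are rendered back into strings.
back_to_text = float_column.astype(str)

print(int_column.dtype.kind, float_column.dtype.kind,
      bool_column.dtype.kind, back_to_text.dtype.kind)
```

The printed dtype kinds (`i`, `f`, `b`, `U`) correspond to the integer, float, bool and text entries in the list above.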
Converting strings to datetimes¶
Converting a str/unicode column to datetime might require some extra thought if the strings include time-zone information. The datetimes stored by Sympathy have no time zone information (due to limitations in the underlying data libraries), but Sympathy is able to use the time-zone information when creating the datetime columns. This can be done in two different ways, which we call “UTC” and “naive”.
datetime (UTC)¶
The option datetime (UTC) calculates the UTC time corresponding to each datetime in the input column. This is especially useful when your data contains datetimes from different time zones (a common cause is daylight saving time). Note, however, that the datetimes shown in the viewer, in exports, etc. will differ from those in the input.
For example, the string '2016-01-01T12:00:00+0100' will be stored as 2016-01-01T11:00:00, which is the corresponding UTC time.
There is currently no standard way of converting these UTC datetimes back to the localized datetime strings with time-zone information.
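The UTC behaviour can be approximated with Python's standard `datetime` module. This is only a sketch of the idea, not Sympathy's own code; the helper name `to_utc_naive` and the fixed input format are assumptions made for the example.

```python
from datetime import datetime, timezone


def to_utc_naive(text):
    """Parse a string with a UTC offset and return the UTC time without tzinfo."""
    # %z accepts offsets such as +0100.
    aware = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")
    # Shift to UTC, then drop the time-zone information.
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


winter = to_utc_naive("2016-01-01T12:00:00+0100")
summer = to_utc_naive("2016-07-01T13:00:00+0200")

print(winter.isoformat())  # 2016-01-01T11:00:00
print(summer.isoformat())  # 2016-07-01T11:00:00
```

Both inputs end up at 11:00 UTC even though their local clock times differ, which is why this option suits data that spans a daylight saving time change.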
datetime (naive)¶
The option datetime (naive) simply discards any time-zone information. This corresponds pretty well to how we “naively” think of time when looking at a clock on the wall.
For example, the string '2016-01-01T12:00:00+0100' will be stored as 2016-01-01T12:00:00.
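A comparable sketch for the naive option, again using only the standard `datetime` module. The helper name `to_naive` and the input format are assumptions for illustration and do not reflect how the node is implemented.

```python
from datetime import datetime


def to_naive(text):
    """Parse a string with a UTC offset and keep the wall-clock time as written."""
    aware = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")
    # Drop the offset without shifting the time.
    return aware.replace(tzinfo=None)


plus_one = to_naive("2016-01-01T12:00:00+0100")
plus_two = to_naive("2016-01-01T12:00:00+0200")

print(plus_one.isoformat())  # 2016-01-01T12:00:00
print(plus_one == plus_two)  # True
```

Because the offset is discarded, two inputs that denote different instants but show the same clock time become indistinguishable.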
Text vs. binary¶
Text data is a string of arbitrary characters from any writing system. Binary data on the other hand is a series of bytes as they would be stored in a computer. Text data can be converted to binary data and vice versa by choosing one of several different character encodings. The character encoding maps the characters onto series of bytes, but many encodings only support some subset of all the different writing systems.
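To make the role of the encoding concrete, the following sketch encodes the same character with two common encodings using Python's built-in `str.encode`. The sample strings are arbitrary choices for the example.

```python
# The same character maps to different bytes under different encodings.
e_acute = "\u00e9"  # the character "é"

utf8_bytes = e_acute.encode("utf-8")
latin1_bytes = e_acute.encode("latin-1")

print(utf8_bytes)    # b'\xc3\xa9'
print(latin1_bytes)  # b'\xe9'

# Latin-1 covers only part of the world's writing systems,
# so a Japanese character cannot be encoded with it.
try:
    "\u65e5".encode("latin-1")  # the character "日"
    latin1_supports_kanji = True
except UnicodeEncodeError:
    latin1_supports_kanji = False

print(latin1_supports_kanji)  # False
```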
This node currently only supports the ASCII encoding, which means that only the letters a-z (lower and upper case), digits, and a limited number of punctuation characters can be converted. Trying to convert a string that contains any other character will lead to an error.
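The ASCII restriction can be tried out in plain Python. The helper names below are hypothetical, and the exact error the node reports may differ from the Python exceptions shown here.

```python
def text_to_binary(text):
    """Encode text as ASCII bytes; raises UnicodeEncodeError for other characters."""
    return text.encode("ascii")


def binary_to_text(data):
    """Decode ASCII bytes to text; raises UnicodeDecodeError for bytes above 127."""
    return data.decode("ascii")


encoded = text_to_binary("Sensor_01: OK")
decoded = binary_to_text(encoded)
print(encoded)  # b'Sensor_01: OK'
print(decoded)  # Sensor_01: OK

# A single non-ASCII character is enough to make the conversion fail.
try:
    text_to_binary("caf\u00e9")  # "café"
    failed = False
except UnicodeEncodeError:
    failed = True

print(failed)  # True
```

Checking a column for non-ASCII characters before converting it, for instance with `str.isascii()` on Python 3.7 or later, is one way to avoid the error.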