Appendix

Encoding

Some Sympathy nodes allow you to choose an encoding, especially nodes that read or write files or communicate over a network. This section is a short introduction to encodings to help you choose.

A character encoding determines the translation between text characters and bytes, for example the bytes stored in a file. Each encoding uses a different translation scheme and can support different languages.

  • Encode: text characters -> bytes

  • Decode: bytes -> text characters

When decoding, choose the same encoding that was originally used to encode the data.
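As a minimal sketch in Python, the two directions can be illustrated as follows. The sample text is only an illustration and UTF-8 is an arbitrary choice of encoding.

```python
text = "Björnbärssnår"

# Encode: text characters -> bytes
data = text.encode("utf-8")

# Decode: bytes -> text characters, using the same encoding as for encode
restored = data.decode("utf-8")

print(data)
# 13 characters become 16 bytes: each of the three non-ASCII letters
# takes two bytes in UTF-8.
print(len(text), len(data))
print(restored == text)
```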

See https://en.wikipedia.org/wiki/Character_encoding for more information.

Notable Encodings

Here are some encodings that we offer as choices in Sympathy.

  • UTF-8, supports essentially all written languages. Widely used on the web and strongly recommended when you have the freedom to choose. Capable of encoding all valid Unicode characters.

  • UTF-16, supports essentially all written languages. There are variations depending on byte order (endianness): UTF-16-LE, UTF-16-BE, and UTF-16, which uses a byte order mark (BOM) to determine whether to use LE or BE. UTF-16, which is used internally by Microsoft Windows, is now generally superseded by UTF-8. Capable of encoding all valid Unicode characters. Use only when required!
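The difference between the UTF-16 variants can be made visible with a short Python sketch. It assumes only the standard library. The plain utf-16 codec writes a BOM followed by the byte order of the machine running the code, so the exact bytes of that variant depend on the platform.

```python
import codecs

text = "A"

little = text.encode("utf-16-le")   # no BOM, least significant byte first
big = text.encode("utf-16-be")      # no BOM, most significant byte first
with_bom = text.encode("utf-16")    # BOM first, then platform byte order

print(little)    # b'A\x00'
print(big)       # b'\x00A'
print(with_bom)

# The BOM tells the decoder which byte order follows.
has_bom = with_bom.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print(has_bom, with_bom.decode("utf-16"))
```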

Older encodings:

These are not recommended but could be needed when working with existing files and applications. Use only when required!

  • US-ASCII, supports American English.

  • ISO 8859-1 (Latin-1), supports Western European languages, superset of US-ASCII.

  • ISO 8859-15 (Latin-9), supports Western European languages, similar to ISO 8859-1 but replaces some less common symbols, introducing the euro sign.

  • Windows-1252, supports Western European languages, superset of ISO 8859-1 in terms of printable characters. Used in the legacy components of Microsoft Windows for English and many European languages.
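The euro sign is a convenient way to see how these older encodings differ. The following sketch uses Python's standard codecs for the encodings listed above and is only an illustration.

```python
euro = "€"

print(euro.encode("iso-8859-15"))   # b'\xa4'
print(euro.encode("windows-1252"))  # b'\x80'
print(euro.encode("utf-8"))         # b'\xe2\x82\xac'

# ISO 8859-1 has no euro sign, so encoding fails.
latin1_error = None
try:
    euro.encode("iso-8859-1")
except UnicodeEncodeError as exc:
    latin1_error = exc
    print("ISO 8859-1 cannot encode the euro sign:", exc.reason)

# The same byte decodes to different characters in the two ISO encodings.
print(b"\xa4".decode("iso-8859-1"))   # currency sign
print(b"\xa4".decode("iso-8859-15"))  # euro sign
```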

For other encodings (if you type the name by hand), use the Codec names from https://docs.python.org/3/library/codecs.html#standard-encodings.
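Before typing a name by hand, it can be checked against Python's codec registry. The sketch below uses codecs.lookup from the standard library. The helper name canonical_name is hypothetical and not part of Sympathy.

```python
import codecs


def canonical_name(name):
    """Return Python's canonical codec name, or None if the name is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Raised when Python has no codec registered under this name.
        return None


for name in ["utf8", "latin-1", "windows-1252", "no-such-encoding"]:
    print(name, "->", canonical_name(name))
```

An unknown encoding name raises LookupError, which is different from the UnicodeEncodeError and UnicodeDecodeError raised when a known encoding cannot translate the data.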

Windows code pages

Older applications and file formats, especially ones for Windows, sometimes use Windows code page encodings. These can be identified by a code page identifier. Python supports a subset of these, but uses different identifiers. Many are prefixed by cp, for example cp1252 (same as windows-1252 mentioned earlier). These are superseded by Unicode and UTF-8, etc., but can still be found in files today.
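A short sketch of how code pages behave in Python follows. The code page cp437 is not mentioned above and is used here only as a second, illustrative example of a cp-prefixed codec.

```python
import codecs

# "windows-1252" and "cp1252" are two names for the same Python codec.
same_codec = codecs.lookup("windows-1252").name == codecs.lookup("cp1252").name
print(same_codec)

# Different code pages assign different bytes to the same character.
print("é".encode("cp1252"))  # b'\xe9'
print("é".encode("cp437"))   # b'\x82'

# Decoding with the wrong code page gives an unrelated character.
wrong = b"\x82".decode("cp1252")
print(wrong)
```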

Mojibake

Garbled text resulting from decoding with an unintended character encoding, which makes characters appear as unrelated ones.

Swedish example:

Björnbärssnår (Blackberry thicket)

Encode   Decode       Result
UTF-8    UTF-8        Björnbärssnår
UTF-8    ISO-8859-1   BjÃ¶rnbÃ¤rssnÃ¥r
UTF-8    ISO-8859-2   BjĂśrnbĂ¤rssnĂĽr
UTF-8    UTF-16-LE    橂뛃湲썢犤獳썮犥
UTF-8    UTF-16-BE    䉪쎶牮拃ꑲ獳滃ꕲ

As seen, a mismatched encoding can result in anything from misrepresented special characters to a result that is completely off. The result can even be correct for some words. Both encode and decode can also fail if there is no possible translation, depending on the combination of characters (encode) or bytes (decode).
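The Swedish example can be reproduced with a few lines of Python. This is a sketch using only the standard library. The last step shows one possible repair, which works only when the wrong decode step lost no bytes.

```python
original = "Björnbärssnår"
encoded = original.encode("utf-8")

# Decode the same bytes with several encodings and compare the results.
for decoder in ["utf-8", "iso-8859-1", "iso-8859-2", "utf-16-le", "utf-16-be"]:
    print(decoder, "->", encoded.decode(decoder))

# Mojibake can sometimes be undone by reversing the wrong decode step,
# as long as every byte survived the round trip.
garbled = encoded.decode("iso-8859-1")
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == original)
```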

See https://en.wikipedia.org/wiki/Mojibake for more information.

Python

Encoding and decoding in Python are performed using two methods: str.encode and bytes.decode. Names for available encodings can be found in the documentation for the codecs module.

Encoding text that contains characters the chosen encoding cannot represent raises a UnicodeEncodeError.

>>> 'Björnbärssnår'.encode('ascii')
Traceback (most recent call last):
...
UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 2: ordinal not in range(128)

Decoding bytes that are not valid in the chosen encoding raises a UnicodeDecodeError.

>>> encoded = 'Björnbärssnår'.encode('iso-8859-1')
>>> encoded
b'Bj\xf6rnb\xe4rssn\xe5r'
>>> encoded.decode('iso-8859-1')
'Björnbärssnår'
>>> encoded.decode('utf-8')
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 2: invalid start byte

Often, the right way to deal with these exceptions is simply to choose the intended encoding. When the exact encoding is unknown or the data is somehow corrupt, Python offers the errors parameter for encode and decode, which can substitute or ignore unsupported symbols.

>>> encoded = 'Björnbärssnår'.encode('iso-8859-1')
>>> encoded
b'Bj\xf6rnb\xe4rssn\xe5r'
>>> encoded.decode('iso-8859-1')
'Björnbärssnår'
>>> encoded.decode('utf-8', errors='replace')
'Bj�rnb�rssn�r'

Here, errors='replace' substitutes � in place of undecodable bytes instead of raising a UnicodeDecodeError. For more options, see https://docs.python.org/3/howto/unicode.html.
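Some of the other standard error handlers are sketched below on the same example word. Which handler is appropriate depends on whether losing or escaping characters is acceptable for the data at hand.

```python
text = "Björnbärssnår"

# Encoding to ASCII with different error handlers.
print(text.encode("ascii", errors="ignore"))             # b'Bjrnbrssnr'
print(text.encode("ascii", errors="replace"))            # b'Bj?rnb?rssn?r'
print(text.encode("ascii", errors="backslashreplace"))   # b'Bj\\xf6rnb\\xe4rssn\\xe5r'
print(text.encode("ascii", errors="xmlcharrefreplace"))  # b'Bj&#246;rnb&#228;rssn&#229;r'

# Decoding ISO 8859-1 bytes as UTF-8 with different error handlers.
encoded = text.encode("iso-8859-1")
print(encoded.decode("utf-8", errors="ignore"))            # Bjrnbrssnr
print(encoded.decode("utf-8", errors="backslashreplace"))  # Bj\xf6rnb\xe4rssn\xe5r
```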

Input typed values as text

Some nodes allow you to input text that is used to produce a typed value, where the type could depend, for example, on the type of the columns used in the operation. The text needs to use a format that is understood by the functions for reading that type.

If the type is text, any input will do, but for other types see the following examples:

bool        True, False, true, false, 1, 0
integer     0, 1, 2, …
float       0, 0.0, 1, 1.1, …
text        Anything goes here!
datetime    1970-01-01T00:00:00.000000, 1970-01-01 00:00:00.000000, 1970-01-01 00:00:00.00, 1970-01-01
timedelta   1 days, 2 d, 44.333 seconds, 2 days 2 h 44 seconds
complex     1.1 + 2j
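To give a feel for what reading a typed value from text involves, here is a rough sketch in plain Python. It is not Sympathy's implementation: the names parse_bool, parse_complex, READERS and parse_typed_value are hypothetical, timedelta is left out, and datetime.fromisoformat may reject some of the datetime spellings above on older Python versions.

```python
from datetime import datetime


def parse_bool(text):
    """Parse the bool spellings listed above (case-insensitive here)."""
    # bool("False") would be True, so an explicit table is needed.
    values = {"true": True, "false": False, "1": True, "0": False}
    try:
        return values[text.strip().lower()]
    except KeyError:
        raise ValueError(f"not a bool: {text!r}") from None


def parse_complex(text):
    """complex() rejects inner spaces, so '1.1 + 2j' needs them removed."""
    return complex(text.replace(" ", ""))


# Hypothetical mapping from type name to reader function.
READERS = {
    "bool": parse_bool,
    "integer": int,
    "float": float,
    "text": str,
    "datetime": datetime.fromisoformat,
    "complex": parse_complex,
}


def parse_typed_value(type_name, text):
    """Look up the reader for type_name and apply it to text."""
    return READERS[type_name](text)


print(parse_typed_value("bool", "false"))
print(parse_typed_value("float", "1"))
print(parse_typed_value("datetime", "1970-01-01"))
print(parse_typed_value("complex", "1.1 + 2j"))
```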

All command line options

Top-level

python -m sympathy --help

usage: sympathy [-h]
                   {gui,cli,viewer,install,uninstall,tests,clear,launch}
                   ...

Sympathy for Data

optional arguments:
  -h, --help            show this help message and exit
  --version             show Sympathy for Data version and exit

Commands:
  {gui,cli,viewer,install,uninstall,tests,clear,launch}
                        Command
    gui                 run Sympathy in GUI mode
    cli                 run Sympathy in CLI mode
    viewer              run the viewer for sydata files.
    install             install Sympathy (start menu, file associations,
                        documentation)
    uninstall           uninstall Sympathy (start menu, file associations)
    tests               run the test suite
    clear               cleanup temporary files
    launch              internal use only

Gui and Cli

The options for the gui and cli commands are similar.

python -m sympathy gui --help

usage: __main__.py gui [-h] [--exit-after-exception {0,1}]
                       [-L LOGGER [LEVEL ...]]
                       [--num-worker-processes NUM_WORKER_PROCESSES]
                       [-I INIFILE] [--nocapture]
                       [filename]

positional arguments:
  filename              file containing workflow.

optional arguments:
  -h, --help            show this help message and exit
  --exit-after-exception {0,1}
                        exit after uncaught exception occurs in a signal handler
  -L LOGGER [LEVEL ...], --loglevel LOGGER [LEVEL ...]
                        A logger configuration with a logger name and a level
                        (e.g. -L app.stats warning). This argument can be
                        repeated.
  --num-worker-processes NUM_WORKER_PROCESSES
                        number of python worker processes (0) use system number
                        of CPUs
  -I INIFILE, --inifile INIFILE
                        settings ini-file to use instead of the default
  --environment-credentials PREFIX
                        read credential secrets from environment
                        variables starting with PREFIX that are encoded as
                        json lists, with json dictionary values e.g,
                        PREFIX["secret","foo"]={"secret":"bar"}.
  --nocapture           disable capturing of node output and send it directly to
                        stdout/stderr.

usage: launch.py gui [-h] [--exit-after-exception {0,1}] [-v]
                     [-L {0,1,2,3,4,5}] [-N {0,1,2,3,4,5}]
                     [--num-worker-processes NUM_WORKER_PROCESSES]
                     [-C CONFIGFILE [CONFIGFILE ...]] [-I INIFILE]
                     [--nocapture]
                     [filename]

positional arguments:
  filename              file containing workflow.

optional arguments:
  -h, --help            show this help message and exit
  --exit-after-exception {0,1}, --exit_after_exception {0,1}
                        exit after uncaught exception occurs in a signal
                        handler
  -L {0,1,2,3,4,5}, --loglevel {0,1,2,3,4,5}
                        (0) disable logging, (5) enable all logging
  -N {0,1,2,3,4,5}, --node-loglevel {0,1,2,3,4,5}, --node_loglevel {0,1,2,3,4,5}
                        (0) disable logging, (5) enable all logging
  --num-worker-processes NUM_WORKER_PROCESSES, --num_worker_processes NUM_WORKER_PROCESSES
                        number of python worker processes (0) use system
                        number of CPUs
  -C CONFIGFILE [CONFIGFILE ...], --configfile CONFIGFILE [CONFIGFILE ...]
                        workflow configuration file, used to change parameters
                        and an optional outfile for the modified workflow
  -I INIFILE, --inifile INIFILE
                        settings ini-file to use instead of the default
  --environment-credentials PREFIX
                        read credential secrets from environment
                        variables starting with PREFIX that are encoded as
                        json lists, with json dictionary values e.g,
                        PREFIX["secret","foo"]={"secret":"bar"}.
  --nocapture           disable capturing of node output and send it directly
                        to stdout/stderr.
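
When Sympathy is started from a script, the options above can be assembled programmatically. The following Python sketch makes several assumptions: that the cli command accepts the same options as shown for gui, that workflow.syx is a placeholder file name, and that credential variables are named exactly as in the help text example. The function names build_command and credential_variable are hypothetical.

```python
import json
import sys


def build_command(workflow, workers=None, inifile=None, nocapture=False):
    """Build an argument list for running a workflow with the cli command."""
    command = [sys.executable, "-m", "sympathy", "cli"]
    if workers is not None:
        command += ["--num-worker-processes", str(workers)]
    if inifile is not None:
        command += ["-I", inifile]
    if nocapture:
        command.append("--nocapture")
    command.append(workflow)
    return command


def credential_variable(prefix, key, secrets):
    """Return (name, value) for one credential environment variable.

    Assumed layout, read off the help text: the name is the prefix followed
    by a JSON list and the value is a JSON dictionary, both without spaces.
    """
    compact = {"separators": (",", ":")}
    return prefix + json.dumps(key, **compact), json.dumps(secrets, **compact)


command = build_command("workflow.syx", workers=2, nocapture=True)
name, value = credential_variable("PREFIX", ["secret", "foo"], {"secret": "bar"})
print(command[1:])
print(name, "=", value)

# To run it, pass the list to subprocess.run(command, check=True).
```

Building the command as a list instead of a single string avoids shell quoting problems with file names that contain spaces.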

Viewer

python -m sympathy viewer --help

usage: sympathy viewer [-h] [filename]

positional arguments:
  filename    sydata file

optional arguments:
  -h, --help  show this help message and exit

Install

python -m sympathy install --help

usage: sympathy install [-h] [--generate-all] [--compile] [--compile-all]
                        [--register] [--set-preference OPT-NAME OPT-VALUE]
                        [--all]

optional arguments:
  -h, --help            show this help message and exit
  --generate-all        generate parser files
  --compile             compile sympathy
  --compile-all         compile all site-package files
  --register            register desktop application and create shortcuts
  --set-preference OPT-NAME OPT-VALUE
                        set value of setting
  --all                 perform full installation, includes all options if
                        enabled or by default if no other options are provided

Uninstall

python -m sympathy uninstall --help

usage: sympathy uninstall [-h]

optional arguments:
  -h, --help  show this help message and exit

Clear

python -m sympathy clear --help

usage: sympathy clear [-h] [--caches] [--sessions]

optional arguments:
  -h, --help  show this help message and exit
  --caches    Clear caches for Sympathy.
  --sessions  Clear sessions for Sympathy.