The Datconv API reference

datconv program

The datconv script has the following call syntax:

datconv [=]conf_file [--key1:val [--key2:val ...]] [arg1 [arg2 ...]]
where:
conf_file - path to a file in YAML format in which the Reader, Filter and Writer components are configured.
            See the listing below for a more detailed description of this file.
            If there is '=' before conf_file, the default configuration file is not used.
            If conf_file is equal to 'def', only the default configuration file is used.
--key1:val - any number of arguments that add new settings or overwrite settings from conf_file.
            It works this way: let's say that in conf_file we have:
            Writer:
                Module: datconv.writers.dcxml
                CArg:
                    pretty:   true
            by passing the option --Writer:CArg:pretty:false we overwrite the 'pretty' option of Writer.
            Note that in a YAML file there must be a space after the : that ends a key, while on the command line there are no spaces.
arg1 - any number of arguments (that do not begin with --).
    These arguments replace the $1, $2, ... markers in conf_file according to their position on the command line:
    i.e. $1 is replaced by the first argument that does not begin with --, etc.
or
datconv --default[-raw]
which prints the path to and the contents of the default configuration file;
if the -raw suffix is used, the configuration file is printed as it is (without parsing).

or
datconv --version
which prints the version number to standard output and exits.

or
datconv --help
which prints short usage information.
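
The override and positional-argument rules described above can be illustrated with a small Python sketch. It is not datconv's own implementation: the helper names apply_override and substitute_positional are hypothetical, and the value handling (only 'true'/'false' converted to booleans, unmatched $N markers left untouched) is an assumption made for the example.

```python
import re


def apply_override(conf, option):
    """Apply one '--Key:SubKey:...:value' option to a nested dict in place."""
    if not option.startswith("--"):
        raise ValueError("override options must begin with '--'")
    *keys, raw = option[2:].split(":")
    if not keys:
        raise ValueError("expected at least one key before the value")
    # Assumption: 'true'/'false' become booleans, everything else stays a string.
    value = {"true": True, "false": False}.get(raw, raw)
    node = conf
    for key in keys[:-1]:
        # Create missing levels so that an option can also add new settings.
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return conf


def substitute_positional(text, args):
    """Replace $1, $2, ... in text with the positional argument at that index."""
    def repl(match):
        index = int(match.group(1)) - 1
        # Assumption: a marker without a matching argument is left as it is.
        return args[index] if 0 <= index < len(args) else match.group(0)
    return re.sub(r"\$(\d+)", repl, text)


conf = {"Writer": {"Module": "datconv.writers.dcxml", "CArg": {"pretty": True}}}
apply_override(conf, "--Writer:CArg:pretty:false")
print(conf["Writer"]["CArg"]["pretty"])

argv = ["--Writer:CArg:pretty:false", "in.xml", "out.xml"]
positional = [a for a in argv if not a.startswith("--")]
print(substitute_positional("inpath: $1", positional))
```

Because the last colon separates the key path from the value, this sketch cannot express values that themselves contain a colon; how datconv treats such values is not covered here.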

The datconv script returns to shell:
    0 on success
    1 on general error (exception)
    2 on invalid command parameters
    3 on user break (Ctrl-C)
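
As a rough illustration of how a calling script might use these return codes, the sketch below wraps the command in Python's subprocess module. The helper names describe_exit and run_datconv are hypothetical, and the sketch assumes that an executable named datconv is available on PATH.

```python
import subprocess

# Meanings of the return codes as listed above.
EXIT_MEANINGS = {
    0: "success",
    1: "general error (exception)",
    2: "invalid command parameters",
    3: "user break (Ctrl-C)",
}


def describe_exit(code):
    """Translate a datconv return code into a readable message."""
    return EXIT_MEANINGS.get(code, "unknown return code %d" % code)


def run_datconv(args):
    """Run datconv with the given argument list and return (code, message).

    Assumption: 'datconv' can be found on PATH.
    """
    completed = subprocess.run(["datconv"] + list(args))
    return completed.returncode, describe_exit(completed.returncode)
```

A call such as run_datconv(["conf.yaml", "in.xml"]) would then return the numeric code together with its meaning.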

Sample main YAML configuration file layout:

# Main configuration file for the datconv script.
# It must follow YAML syntax and have at least 2 obligatory top-level keys: Reader, Writer.
#
# Note that the specified modules must be loadable from the datconv script, i.e.
#   packages must be placed directly in the folder from which the datconv script is run (its current folder),
#   or datconv must be wrapped in a script that defines the PYTHONPATH environment variable pointing to the packages' root folder,
#   or the packages' root folder must be added to the Python configuration as a folder with packages.
#   To be loadable, every package must contain a file named __init__.py (it may be empty).
#
# The keys listed below are sample keys; for the full list of available options see the conf_template.yaml files
# contained in the readers, writers and filters folders.

# Obligatory key that specifies the Reader module
Reader: 
    # Obligatory key that specifies the Python module which implements the Reader.
    # The module must define class DCReader, which must follow the interface specification described in readers/_skeleton.py
    Module: datconv.readers.dcxml
    # Key (optional or not, depending on the configured reader) that specifies DCReader class constructor parameters
    CArg:
        # DCReader class constructor parameters follow here.
        # Concrete keys depend on the chosen Reader.
        # See the particular reader's documentation.
        # Note that keys whose default values are suitable may be omitted.
        # However, extra keys not specified in the reader documentation are not allowed.
        # If you want to preserve some keys for future use, comment them out.
        encoding:  utf-8
        # ...

    # Usually obligatory key that specifies parameters for the DCReader.Process method
    PArg:
        # DCReader.Process parameters follow here.
        # Concrete keys depend on the chosen Reader.
        # See the particular reader's documentation.
        inpath:  ../GET-Data/cdc_5019/AddnDrawNbrs_c5019_s38.xml
        # optional - if not defined, OutConnector is used
        #outpath: out/AddnDrawNbrs_c5019_s38.xml
        # ...

# Obligatory key that specifies the Writer module
Writer:
    # Obligatory key that specifies the Python module which implements the Writer.
    # The module must define class DCWriter, which must follow the interface specification described in writers/_skeleton.py
    Module: datconv.writers.dcxml
    # Key (optional or not, depending on the configured writer) that specifies DCWriter class constructor parameters
    CArg: 
        # DCWriter class constructor parameters follow here.
        # Concrete keys depend on the chosen Writer.
        # See the particular writer's documentation.
        encoding: utf-8

# Optional key that specifies the Filter module
# If it is missing or null, no filter is used
# default: null
Filter:
    # If Filter is defined, this key is obligatory and specifies the Python module which implements the Filter.
    # The module must define class DCFilter, which must follow the interface specification described in filters/_skeleton.py
    Module: datconv.filters.stat
    # Key (optional or not, depending on the configured filter) that specifies DCFilter class constructor parameters
    CArg: 
        # DCFilter class constructor parameters follow here.
        # Concrete keys depend on the chosen Filter.
        # See the particular filter's documentation.
        retval: 0
        # ...

# Optional key that specifies the OutConnector (output connector) module
# If it is missing or null, datconv.outconn.dcfile(Reader:PArg:outpath) is used
# default: null
OutConnector:
    Module: datconv.outconn.dcfile
    CArg:
        # relative or absolute path to output file; obligatory
        path: "out/AcctAgentAdjustment_c5019_s38.xml"

# Optional key that specifies the Logger class configuration
# If it is missing or null, the following configuration is used:
#   all log messages are sent to the standard error stream
#   messages of severity below WARNING are discarded
# If this key's value is a string, the Logger class is
#   inherited from the calling code by invoking logging.getLogger('XXX').getChild('datconv')
#   where XXX is the key value (name of the parent logger).
# If this key's value is a dictionary (i.e. the key contains subkeys),
#   it directly specifies the logger configuration or redirects to another file (as described below).
# default: null
Logger:
    # This key allows redirecting the logger configuration to another file.
    Conf: Logger.yaml
    
    # As an alternative, all keys contained in Logger.yaml file may be explicitly placed here (as subkeys of Logger key)
    
# Optional key that specifies the log level of the default console logger used when the Logger: key is not present.
# Possible values: CRITICAL, ERROR, WARNING, INFO, DEBUG
# default: INFO
DefLogLevel: WARNING
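
The rules stated in the comments above (Reader and Writer are obligatory, each needs a Module key, and Filter needs a Module key once it is defined) can be checked with a short Python sketch. The function check_conf is a hypothetical helper for illustration and is not part of datconv; datconv performs its own validation, which may differ.

```python
def check_conf(conf):
    """Return a list of problems found in a datconv-style configuration dict."""
    problems = []
    # Reader and Writer are obligatory, and each must name a Module.
    for key in ("Reader", "Writer"):
        section = conf.get(key)
        if not isinstance(section, dict):
            problems.append("missing obligatory key: %s" % key)
        elif not section.get("Module"):
            problems.append("%s: missing obligatory key Module" % key)
    # Filter is optional (missing or null), but needs Module once it is defined.
    flt = conf.get("Filter")
    if flt is not None and not (isinstance(flt, dict) and flt.get("Module")):
        problems.append("Filter: missing obligatory key Module")
    return problems


# The same layout as the YAML sample, written as a Python dict.
conf = {
    "Reader": {
        "Module": "datconv.readers.dcxml",
        "CArg": {"encoding": "utf-8"},
        "PArg": {"inpath": "../GET-Data/cdc_5019/AddnDrawNbrs_c5019_s38.xml"},
    },
    "Writer": {"Module": "datconv.writers.dcxml", "CArg": {"encoding": "utf-8"}},
    "Filter": {"Module": "datconv.filters.stat", "CArg": {"retval": 0}},
}
print(check_conf(conf))
print(check_conf({"Reader": {"Module": "datconv.readers.dcxml"}}))
```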
    
    


datconv package

This module provides the Datconv class, which encapsulates all datconv program features and can be created and called from another Python script in one of the following ways:

from datconv import Datconv
dc = Datconv()
conf = {...}  # dict with the same keys as the main YAML configuration file
dc.Run(conf)

Or:

from datconv import Datconv
dc = Datconv()
conf = {...}  # dict with the same keys as the main YAML configuration file
for rec in dc.Iterate(conf):
    print('Field1: %s' % str(rec['Field1']))

class datconv.Datconv

Bases: object

Instead of calling datconv from the command line, one can create an instance of this class inside another Python script and call its Run() method.

Run(conf)

Method that runs conversion process.

Parameters: conf – a dict() object with keys as specified by the datconv main YAML configuration file. In this case the datconv default configuration file is not used.
Returns: 2 in case of invalid configuration; 0 if run successfully; may raise an exception
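
A caller may want to handle the two documented return values and a possible exception in one place. The following sketch shows one way to do that; interpret_status and run_conversion are hypothetical helper names, and the import is done inside the function so that the module can be loaded even where datconv is not installed.

```python
def interpret_status(status):
    """Map the value returned by Run() to a short description."""
    if status == 0:
        return "ok"
    if status == 2:
        return "invalid configuration"
    return "unexpected return value: %r" % (status,)


def run_conversion(conf):
    """Run a conversion and report the outcome as a string."""
    # Imported lazily; requires the datconv package to be installed.
    from datconv import Datconv
    try:
        status = Datconv().Run(conf)
    except Exception as exc:  # Run() may raise, as noted in its description
        return "failed: %s" % exc
    return interpret_status(status)
```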

Iterate(conf)

Method that runs the conversion process in iteration mode, i.e. every output record is returned to the calling loop.

Parameters: conf – a dict() object with keys as specified by the datconv main YAML configuration file. In this case the datconv default configuration file is not used.

GetHeader()

Returns the Header associated with the data. This method is intended to be used with the iteration interface.

GetFooter()

Returns the Footer associated with the data. This method is intended to be used with the iteration interface.
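
The sketch below shows one way Iterate(), GetHeader() and GetFooter() might be combined. It relies on an assumption that is not stated in this reference: that both the header and the footer are available once iteration has finished. The helper convert_to_lists and the stand-in class FakeDatconv are hypothetical; the stand-in only mimics the method names so that the example runs without datconv, and the types of its return values are invented.

```python
def convert_to_lists(dc, conf):
    """Drain dc.Iterate(conf) and return (header, records, footer)."""
    records = list(dc.Iterate(conf))
    # Assumption: header and footer can both be read after iteration ends.
    return dc.GetHeader(), records, dc.GetFooter()


class FakeDatconv:
    """Hypothetical stand-in with the same method names as Datconv."""

    def Iterate(self, conf):
        yield {"Field1": "a"}
        yield {"Field1": "b"}

    def GetHeader(self):
        return ["header"]

    def GetFooter(self):
        return ["footer"]


header, records, footer = convert_to_lists(FakeDatconv(), {})
for rec in records:
    print("Field1: %s" % rec["Field1"])
```

With the real class, the first argument would be a Datconv() instance and conf a configuration dict as described for Iterate().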

Version(ext_module=None, ext_verobj='__version__')

If ext_module is None, the method returns the datconv version. Otherwise it loads the ext_module module and returns its ext_verobj object (__version__ by default).
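
The "load a module and return one of its objects" behaviour described for a non-None ext_module can be approximated with the standard importlib module. This is a stand-alone sketch of the described behaviour, not datconv's code; module_version is a hypothetical name.

```python
import importlib


def module_version(ext_module, ext_verobj="__version__"):
    """Import the module named ext_module and return its ext_verobj attribute."""
    module = importlib.import_module(ext_module)
    return getattr(module, ext_verobj)


# Any importable module and attribute name can be used; here the standard
# sys module and its 'version' attribute serve as an example.
print(module_version("sys", "version"))
```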