The Datconv API reference¶
datconv program¶
The datconv script has following call syntax:
datconv [=]conf_file [--key1:val [--key2:val ...]] [arg1 [arg2 ...]]
where:
conf_file - path to file in YAML format in which Reader, Filter and Writer compoments are configured.
See below listing for more detailed desctiption of this file.
If there is '=' before conf_file then default configuration file is not used.
If conf_file is equal to 'def' than only default configuration file is used.
--key1:val - any number of arguments that add new settings or overwrite settings from conf_file.
It works this way: let say that in conf_file we have:
Writer:
Module: datconv.writers.dcxml
CArg:
pretty: true
by invoking option --Writer:CArg:pretty:false we overwrite 'pretty' option of Writer.
Note that in YAML file we must have space after : at end of the key, while in command line there are no spaces.
arg1 - any number of arguments (that do not begin with --).
Those arguments will replace $1, $2, ... markers in conf_file according to their position in command line:
i.e. $1 will be replaced by first argument that do not begin with --, etc.
or
datconv --default[-raw]
which prints path to and contents of default configuration file;
if post-fix -raw is used configuration file is printed as it is (without parsing).
or
datconv --version
which prints version number to standard output and exit.
or
datconv --help
which prints short usage information
The datconv script returns to shell:
0 on sucess
1 on general error (exception)
2 on invalid command parameters
3 on user break (Ctrl-C)
Sample main YAML configuration file layout:
# Major configuration file for datconv script.
# It must follow YAML syntax and has at least 2 obligatory top level keys: Reader, Writer.
#
# Note that speficied modules must be loadable from datconv script, i.e.
# packages must be placed directly in folder from which datconv script is run (its current folder)
# or datconv must be wrapped in script that define PYTHONPATH environmental variable which point to packages root folder
# or packages root folder must be added to Python configuration as folder with packages
# every package to be loadable must have (even empty) file named __init__.py
#
# The keys listed below are samle keys, for full list of available options see conf_template.yaml files
# contained in readers, writers and filters folders.
# Obligatory key that specify Reader module
Reader:
# Obligatory key that specify Python module which implements Reader
# Module must define class DCReader, which must follow interface specification described in readers/_skeleton.py
Module: datconv.readers.dcxml
# Optional or not (depends on configured reader) key that specify DCReader class constructor parameters
CArg:
# Here follows DCReader class constructor parameters
# Concrete keys depands on choosen Reader.
# See particular readers' documentation.
# Note that if default values are good than they mey be omitted.
# However it is not allowed to have any extra keys not specified in reader documentation
# If you want to preserve some keys for future use - outcomemnt them
encoding: utf-8
# ...
# Usually obligatory key that specify parameters for DCReader.Process method
PArg:
# Here follows DCReader.Process parameters
# Concrete keys depands on choosen Reader.
# See particular readers documentation.
inpath: ../GET-Data/cdc_5019/AddnDrawNbrs_c5019_s38.xml
# optional - if not defined OutConnector is used
#outpath: out/AddnDrawNbrs_c5019_s38.xml
# ...
# Obligatory key that specify Writer module
Writer:
# Obligatory key that specify Python module which implements Writer
# Module must define class DCWriter, which must follow interface specification described in writers/_skeleton.py
Module: datconv.writers.dcxml
# Optional or not (depends on configured writer) key that specify DCWriter class constructor parameters
CArg:
# Here follows DCWriter class constructor parameters
# Concrete keys depands on choosen Writer.
# See particular writers' documentation.
encoding: utf-8
# Optional key that specify Filter module
# If it is missing or null no filter is used
# default: null
Filter:
# If Filter is defined, this key is obligatory and specify Python module which implements Filter
# Module must define class DCFilter, which must follow interface specification described in filters/_skeleton.py
Module: datconv.filters.stat
# Optional or not (depends on configured filter) key that specify DCFilter class constructor parameters
CArg:
# Here follows DCFilter class constructor parameters
# Concrete keys depands on choosen Filter.
# See particular filters' documentation.
retval: 0
# ...
# Optional key that specify Filter module
# If it is missing or null datconv.outconn.dcfile(Reader:PArg:outpath) is used
# default: null
OutConnector:
Module: datconv.outconn.dcfile
CArg:
# relative or absolute path to output file; obligatory
path: "out/AcctAgentAdjustment_c5019_s38.xml"
# Optional key that specify Logger class configuration
# If it is missing or null following configuration is used:
# all log messages are being sent to standard error stream
# messages of severity below WARNING are discarded
# If this key value is a string it means that Logger class is
# inherited from calling code by invoking logging.getLogger('XXX').getChild('datconv')
# where XXX is the key value (name of parent logger).
# If this key value is dictionary (i.e. key contains bubkeys)
# it directly specify loger configuration or redirects to other file (as described below).
# default: null
Logger:
# This key allows to redirect logger configuration to other file.
Conf: Logger.yaml
# As an alternative, all keys contained in Logger.yaml file may be explicitly placed here (as subkeys of Logger key)
# Optional key that specify log level of default console logger used when Logger: key is not present.
# Possible values: CRITICAL, ERROR, WARNING, INFO, DEBUG
# default: INFO
DefLogLevel: WARNING
See also:
datconv package¶
This module provides Datconv class which encapsulate all datconv program features and can be created and called from other Python script in one of following ways:
from datconv import Datconv
dc = Datconv()
conf = {...}
dc.Run(conf)
Or:
from datconv import Datconv
dc = Datconv()
conf = {...}
for rec in dc.Iterate(conf):
print ('Field1: %s' % str(rec['Field1']))
-
class
datconv.
Datconv
[source]¶ Bases:
object
Instead of calling datconv from command line, one can create instance of this class inside other Python script and call its Run() method.
-
Run
(conf)[source]¶ Method that runs conversion process.
Parameters: conf – is a dict()
object with keys as apecified by datconv main YAML configuration file. In this case datconv default configuration file is not used.Returns: 2 in case of invalid configuration; 0 if run sucessfully; may throw exception
-
Iterate
(conf)[source]¶ Method that runs conversion process in iteration mode - i.e. every output record is being returned to caling loop.
Parameters: conf – is a dict()
object with keys as apecified by datconv main YAML configuration file. In this case datconv default configuration file is not used.
-
GetHeader
()[source]¶ Returns Header assiciated with data. Method intendent to be used with iteration interface.
Returns Footer assiciated with data. Method intendent to be used with iteration interface.
-