When to use this Format
Etlworks can automatically parse and create practically any XML document. You can use additional parameters to hint the XML parser on how to deal with the XML nodes and attributes when reading and creating XML documents.
Use JSON Format when configuring a source-to-destination transformation that reads or writes XML documents.
Read how to work with nested JSON and XML files in Etlworks.
Create Format
To create a new XML Format, go to Connections
, select the Formats
tab, click Add Format
, type in xml
in the Search
field, and select XML
. Note that it is not the same as an XML Data Set.
Parameters
XML Document
: when this option is selected (the default), the parser automatically adds the root tagxml
to the document. Always select it if you want your XML documents to be created with an encoding different fromUTF-8
.
-
<?xml version="1.0" encoding="UTF-8"?>
Start Element Name
: the value of this parameter is used as an XML root element.
When creating XML document, the Start Element Name
is used as a root node. For example, when the Start Element Name
is set to Patient
:
-
<Patient> the rest of the xml document </Patient>
When reading XML documents, the Start Element Name
is used as a hint for the XML parser to automatically create a nested data set with the name equal to the value of this field.
XML Version
: the XML standard version. The default is1.0
.Namespaces
: all applicable XML namespaces.Namespaces
can only be set ifStart Element Name
is not empty. For example, when theStart Element Name
is set toPatient
andNamespaces
is set toxmlns="http://www.meditlive.com/datacontract xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
:
-
<Patient xmlns="http://www.meditlive.com/datacontract" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> the rest of the xml document </Patient>
Row Tag
: the tag used for repeating rows.
When reading XML like below:
-
<Patients> <Patient> <first_name>Joe</first_name> <last_name>Doe</last_name> <dob>01/01/2001</dob> </Patient> .... <Patients>
By default, the system creates nested data sets for each patient row. If you wish to flatten the data set you can set the Row Tag
to patient
. As a result, the flat data set with the following rows and columns will be created:
-
first_name,last_name,dob joe,doe,01/01/2001 .... jane,doe,01/02/2002
When creating an XML document, the value of the Row Tag
field is used literally to create new repeating XML nodes. For example, with the Row Tag
set to ROW
:
-
<Patients> <ROW> </ROW> <ROW> </ROW> <ROW> </ROW> <Patients>
Null Attribute
: if set, a tag with anull
value will be created using this attribute.
For example (with no Null Attribute):
-
<MiddleName />
Another example (with the Null Attribute set to i:nil="true"
):
-
<MiddleName i:nil="true"/>
Value Attribute
: if set, the values are set using specific attributes.
For example (with no Value Attribute):
-
<FirstName>Joe</FirstName>
Another example (with the Value Attribute set to value
):
-
<FirstName value="Joe"/>
Template
: a template in the XML Format. If this field is not empty, Etlworks will use it to populate column names and data types.Template
is an optional field. For example:<root><FirstName/><LastName/><Dob/></root>
.Column names compatible with SQL
: convert column names to SQL-compatible column names by removing all characters, except alphanumeric and spaces.Parse XML Attributes
: enabling this option will force the system to parse XML attributes:<tag attr="value">
.Parse XML Attributes in Root Node
: disable this flag if you don't want the system to parse attributes in the root XML node. This feature only works whenParse XML Attributes
is enabled. An example of the attributes in the root node:<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1">
.Parse CDATA section
: disable this flag if you know for sure that there are no CDATA sections in the XML to parse. This will improve parsing performance.Parse Comments
: if this option is enabled, the system will add a field with a suffix-comments
for each node that has comments.-
Ignore XML Attributes if Node has Value and Attributes
: -
Here is a node with the attribute and value:
<node attr="attr_value">value</node>
When this option is enabled the parser will read the value and will ignore the attribute.
-
Ignore XML Attributes if Node has Value and Attributes
: -
Here is a node with the attribute and value:
<node attr="attr_value">value</node>
When this option is enabled the parser will read the attribute and will ignore the value.
Transformation type
: the default isquery
, which is equal to applying an XSL Style Sheet, but you can also selectPreprocessor
. Read more about using a preprocessor to modify the content of the source document.XSL Style Sheet or Preprocessor
: an XSL stylesheet can transform an XML document into the desired Format. An XSL stylesheet is applied when reading and writing XML documents. Read more about using XSL stylesheets to transform XML.Encoding
: XML document encoding. The default isUTF-8
.All fields are strings
: if this option is enabled (it is disabled by default) the system will create all fields with a string data type.Date and Time Format
: a Format for timestamps (date+time).Date Format
: a Format for date (date only, no time).Time Format
: a Format for time (time only, no date).Parse Dates
: if this option is enabled, and date or time value is not recognized as one of the Formats defined above, Etlworks will try to parse it using one of the well-known date+time Formats.Trim Strings
: if this option is enabled, Etlworks will trim leading and trailing white spaces from the value.Treat 'null' as null
: if this option is enabled, Etlworks will treat string values equal tonull
as actual nulls (no value).-
Do not create parent node for nested arrays
By default Etlworks XML connector creates a parent node for nested XML arrays. The name of the node is the same as the name of the first node in the array.
<data>
<prescriber>Mr Smith</prescriber>
<patient>
<patient>John Doe</patient>
<patient>Jane Doe</patient>
</patient>
</data>If this option is enabled (it is disabled by default) the connector will not create the parent node.
<data>
<prescriber>Mr Smith</prescriber>
<patient>John Doe</patient>
<patient>Jane Doe</patient>
</data>
Comments
0 comments
Please sign in to leave a comment.