Overview
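This article describes the XML Format options that control how Etlworks parses XML: attributes, root node attributes, CDATA sections, comments, nodes that have both a value and attributes, and nested arrays.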
Process
Configure XML parser
You can configure the XML Format to parse, or to skip, specific parts of the XML document.
Parse XML attributes
By default, if the source XML contains attributes, such as <node attribute="value">, the attributes are not parsed automatically. To enable attribute parsing, create a new XML Format and select the flag Parse XML Attributes.
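To illustrate what this flag changes, here is a minimal Python sketch built on the standard library's xml.etree.ElementTree. It is not Etlworks code: the function node_to_fields and its parse_attributes parameter are hypothetical and only mimic the behavior described above.

```python
import xml.etree.ElementTree as ET

def node_to_fields(xml_text, parse_attributes=False):
    """Return the fields read from a single XML node (illustrative only)."""
    node = ET.fromstring(xml_text)
    fields = {}
    if node.text and node.text.strip():
        fields[node.tag] = node.text.strip()
    if parse_attributes:
        # Attributes become fields only when the hypothetical flag is set.
        fields.update(node.attrib)
    return fields

source = '<node attribute="value">text</node>'
print(node_to_fields(source))                         # {'node': 'text'}
print(node_to_fields(source, parse_attributes=True))  # {'node': 'text', 'attribute': 'value'}
```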
Not parsing XML attributes in the root node
If the flag Parse XML Attributes is enabled for the XML Format, Etlworks parses the attributes in all nodes, including the root node. Frequently, the attributes in the root node are XML namespace or schema declarations rather than data, such as the following:
<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
To disable parsing the root attributes, create a new XML Format and disable the flag Parse XML Attributes in Root Node.
This flag has no effect on parsing if Parse XML Attributes is disabled.
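The interaction between the two flags can be sketched in Python with the standard library's xml.dom.minidom, which exposes namespace declarations such as xmlns:xsi as ordinary attributes. This is an illustration under stated assumptions, not the Etlworks implementation: collect_attributes and both of its parameters are hypothetical names.

```python
from xml.dom import minidom

def collect_attributes(xml_text, parse_attributes=True, parse_root_attributes=True):
    """Collect (node name, attribute name, value) tuples from a document."""
    if not parse_attributes:
        # The root-node flag has no effect when attribute parsing is off.
        return []
    root = minidom.parseString(xml_text).documentElement
    found = []

    def visit(element):
        # Skip the root node's attributes when the hypothetical flag is off.
        if element is not root or parse_root_attributes:
            for name, value in element.attributes.items():
                found.append((element.tagName, name, value))
        for child in element.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                visit(child)

    visit(root)
    return found

source = (
    '<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<node attribute="value">text</node>'
    '</data>'
)
print(collect_attributes(source, parse_root_attributes=False))  # [('node', 'attribute', 'value')]
print(collect_attributes(source, parse_attributes=False))       # []
```

With both flags left on, the namespace declaration on the data node is reported alongside the attribute of the nested node.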
Not parsing CDATA section
By default, Etlworks will parse the CDATA sections in XML:
<name>
<![CDATA[this is a test]]>
</name>
Disable CDATA parsing in the XML Format only if you are certain that the source XML contains no CDATA sections. The XML will then be parsed slightly faster.
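For reference, a conforming XML parser treats the content of a CDATA section as ordinary character data of the enclosing node. The short Python sketch below reads the example above with the standard library's xml.etree.ElementTree; it shows general XML behavior and is not a description of Etlworks internals.

```python
import xml.etree.ElementTree as ET

source = """<name>
<![CDATA[this is a test]]>
</name>"""

name = ET.fromstring(source)
# The CDATA wrapper is gone; only its content remains as the node text.
print(name.text.strip())  # this is a test
```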
Parse comments
By default, Etlworks ignores comments when parsing the source XML documents.
<!-- this is comments -->
<name>
value
</name>
If you enable the flag Parse Comments, the system adds a field with the suffix -comments for each node with comments.
In the example above, two fields are created: name=value and name-comments=this is comments.
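The mapping from a comment to a -comments field can be sketched in Python with xml.dom.minidom. Several things here are assumptions made for illustration: the function name fields_with_comments, the parse_comments parameter, the wrapping data node, and the rule that a comment belongs to the node that follows it. Etlworks may associate comments with nodes differently.

```python
from xml.dom import minidom

def fields_with_comments(xml_text, parse_comments=False):
    """Read child nodes of the root into fields, optionally keeping comments."""
    root = minidom.parseString(xml_text).documentElement
    fields = {}
    pending = []  # comments seen since the previous node
    for child in root.childNodes:
        if child.nodeType == child.COMMENT_NODE:
            pending.append(child.data.strip())
        elif child.nodeType == child.ELEMENT_NODE:
            text = "".join(
                n.data for n in child.childNodes
                if n.nodeType in (n.TEXT_NODE, n.CDATA_SECTION_NODE)
            )
            fields[child.tagName] = text.strip()
            if parse_comments and pending:
                # Assumption: a comment is attached to the node that follows it.
                fields[child.tagName + "-comments"] = " ".join(pending)
            pending = []
    return fields

source = """<data>
<!-- this is comments -->
<name>
value
</name>
</data>"""

print(fields_with_comments(source))                       # {'name': 'value'}
print(fields_with_comments(source, parse_comments=True))  # {'name': 'value', 'name-comments': 'this is comments'}
```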
Ignore XML Attributes if Node has Value and Attributes
Here is a node with both an attribute and a value:
<node attr="attr_value">value</node>
When this parameter is enabled, the parser will read the value and will ignore the attribute.
Ignore Node Value if Node has Value and Attributes
Here is a node with both an attribute and a value:
<node attr="attr_value">value</node>
When this parameter is enabled, the parser will read the attribute and will ignore the value.
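The two options above are mirror images of each other, and both apply only to nodes that carry a value and attributes at the same time. A minimal Python sketch of that rule follows; read_node, ignore_attributes, and ignore_value are hypothetical names, and returning both the value and the attribute when neither option is set is an assumption of this sketch, not documented Etlworks behavior.

```python
import xml.etree.ElementTree as ET

def read_node(xml_text, ignore_attributes=False, ignore_value=False):
    """Read one node, applying the options only when it has both parts."""
    node = ET.fromstring(xml_text)
    has_value = bool(node.text and node.text.strip())
    has_attributes = bool(node.attrib)
    both = has_value and has_attributes
    fields = {}
    if has_value and not (both and ignore_value):
        fields[node.tag] = node.text.strip()
    if has_attributes and not (both and ignore_attributes):
        fields.update(node.attrib)
    return fields

source = '<node attr="attr_value">value</node>'
print(read_node(source, ignore_attributes=True))  # {'node': 'value'}
print(read_node(source, ignore_value=True))       # {'attr': 'attr_value'}

# A node with attributes only is unaffected by ignore_attributes.
print(read_node('<node attr="attr_value"/>', ignore_attributes=True))  # {'attr': 'attr_value'}
```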
Do not create parent node for nested arrays
By default, the Etlworks XML connector creates a parent node for nested XML arrays. The parent node takes the same name as the first node in the array.
<data>
<prescriber>Mr Smith</prescriber>
<patient>
<patient>John Doe</patient>
<patient>Jane Doe</patient>
</patient>
</data>
If this option is enabled (it is disabled by default), the connector will not create the parent node.
<data>
<prescriber>Mr Smith</prescriber>
<patient>John Doe</patient>
<patient>Jane Doe</patient>
</data>
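The difference between the two layouts can be reproduced with a short Python sketch that builds the document using xml.etree.ElementTree. The function build_data and its create_parent parameter are hypothetical and serve only to contrast the two outputs shown above; the output here is not indented.

```python
import xml.etree.ElementTree as ET

def build_data(prescriber, patients, create_parent=True):
    """Serialize a prescriber and an array of patients to XML."""
    data = ET.Element("data")
    ET.SubElement(data, "prescriber").text = prescriber
    # With the parent node, items are nested under a wrapper that shares their name.
    container = ET.SubElement(data, "patient") if create_parent else data
    for patient in patients:
        ET.SubElement(container, "patient").text = patient
    return ET.tostring(data, encoding="unicode")

patients = ["John Doe", "Jane Doe"]
print(build_data("Mr Smith", patients))                       # nested under a parent <patient> node
print(build_data("Mr Smith", patients, create_parent=False))  # items placed directly under <data>
```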