Overview
Etlworks supports two primary approaches for transforming source documents before they are loaded or sent to the destination:
-
Source Query: Applies a transformation to the entire source document using JavaScript, SQL, or XSLT, before any format-specific or internal processing occurs.
-
Format-Level Configuration: Uses settings defined in the source format (e.g., JSON path, JavaScript preprocessor, or XSLT) to extract or reshape the document during the parsing stage.
Source Query transformations
Source Query is the recommended and most flexible way to transform documents in Etlworks. Instead of configuring transformation logic in a format, you define it directly in the Source Query field of the flow.
Why Use Source Query
-
No need to create or modify formats: Transformations are defined inline within the flow, making them easier to implement, test, and reuse.
-
Executed before internal processing: The transformation is applied to the raw source document before it’s parsed into datasets or mapped to a destination.
-
Supports multiple transformation types: Etlworks automatically detects whether the Source Query is written in SQL, JavaScript, or XSLT.
-
Ideal for JSON-to-JSON flows: Especially useful for web service-to-web service integrations, where unnecessary parsing and serialization can be avoided entirely.
How Source Query Detection Works
Etlworks detects the transformation type based on the content of the Source Query:
-
SQL: If the query is valid SQL, it is executed against the document as a virtual table. This technique is for filtering, reshaping, and querying structured content.
-
JavaScript: If the query is not valid SQL, Etlworks treats it as JavaScript. The raw document is available in the message variable, and the transformed content must be assigned to value.
This is functionally identical to a JavaScript preprocessor in the format, but far more convenient, as it eliminates the need to edit format definitions for each use case.
-
XSLT: If the source format is XML and the query contains valid XSLT, it is automatically applied as a stylesheet transformation. This is equivalent to setting an XSLT in the format but is easier to manage and test directly in the flow.
Example: JavaScript Source Query
var obj = JSON.parse(message);
var result = obj.items.map(function(item) {
return {
id: item.id,
name: item.name,
metadata: JSON.stringify(item.details)
};
});
value = JSON.stringify(result);
What it does:
-
Parses the input JSON.
-
Extracts the items array.
-
Flattens and reshapes each item.
-
Converts nested details objects into strings.
-
Returns the final array as the transformed result.
Real-World Use Case: Web Service to Web Service Flow
Suppose you’re building a flow that pulls data from one REST API and pushes it to another. Both APIs use JSON, but their schemas differ.
Input JSON from source API:
{
"status": "ok",
"data": [
{ "itemNumber": "ABC", "qtyAvailable": 100 },
{ "itemNumber": "XYZ", "qtyAvailable": 250 }
]
}
Expected JSON by destination API:
{
"inventory": [
{ "sku": "ABC", "quantity": 100 },
{ "sku": "XYZ", "quantity": 250 }
]
}
Step 1: Create ETL flow where both source and destination are web service:
Step 2. Configure source to destination transfromation:
To prevent Etlworks from performing any additional parsing or transformations, set both the source and destination formats to CLOB. CLOB is a raw text format that preserves the exact transformed output.
Step 3. Click Configure button
and set Source query:
Source Query (JavaScript):
var obj = JSON.parse(message);
var payload = {
inventory: obj.data.map(function(item) {
return {
sku: item.itemNumber,
quantity: item.qtyAvailable
};
})
};
value = JSON.stringify(payload);
Step 4. Click Test transformation to test the output
Testing JavaScript Source Queries in Explorer
You can test your transformation logic without running a full flow using the Explorer:
Step 1. Open the Explorer and select the source document (file, API endpoint, etc.).
Step 2. Write your JavaScript transformation using the message variable and assign the result to value.
Step 3. From the dropdown next to the play icon, choose Execute Script.
Etlworks will execute the transformation and show the result immediately, perfect for debugging and iteration.
Format-level transformations
Why use format-level transformations
In many cases, the source document can be processed as-is. However, format-level transformations are useful when you need to:
-
Extract a specific section of a large or deeply nested document.
-
Flatten hierarchical structures into a tabular format.
-
Convert documents between formats (e.g., XML → JSON).
-
Remove unnecessary elements to optimize downstream processing.
How format-level transformations work
When setting up a flow, the source format defines how the incoming document is interpreted. This happens after the raw document is loaded but before it is parsed into datasets. The destination format then controls how the data is serialized before delivery.
Etlworks provides several options within the source format for shaping the document:
-
Preprocessor – JavaScript code that modifies the document before parsing.
-
Parameters – Format-specific settings, such as JSON path extraction or flattening options.
-
XSLT – For XML documents, an XSL Style Sheet can be applied to restructure or transform the XML.
Working with JSON Documents
Using Format Parameters to transform JSON
JSON Format provides two key Format parameters to transform JSON source documents before they are processed:
Path to the Node to Parse
• Extracts a specific nested structure from the JSON document.
• Skips all other nodes.
• Format: parent...parent.start
Serialize Nested Elements as Strings
• Converts nested objects/arrays into string representations instead of structured JSON.
Example: Using “Path to the Node to Parse”
When this parameter is set, the JSON parser starts reading from the specified node and ignores everything else.
Input JSON Document
{
"data": {
"boards": [
{"id": 1, "name": "Board A"},
{"id": 2, "name": "Board B"}
],
"next": "123"
}
}
Configuration
• Path to the Node to Parse: data.boards
Transformed Output
[
{"id": 1, "name": "Board A"},
{"id": 2, "name": "Board B"}
]
💡 Effect: The parser extracts the boards array and removes the rest of the JSON document.
Example: Using “Serialize Nested Elements as Strings”
When enabled, this setting converts nested objects/arrays into string representations instead of keeping them as structured JSON.
Input JSON Document
{
"id": 123,
"details": {
"category": "Tech",
"price": 100
}
}
Configuration
• Serialize Nested Elements as Strings: ✅ Enabled
Transformed Output
{
"id": 123,
"details": "{\"category\":\"Tech\",\"price\":100}"
}
💡 Effect: The details object is converted into a JSON-encoded string.
Transforming JSON Documents with JavaScript Preprocessor
A preprocessor is a JavaScript function that modifies the original JSON document stored in the message variable. The modified document is returned as a string.
Configuring the Preprocessor for JSON Format
To configure a Preprocessor for JSON format:
Step 1. Open source JSON format used in a transformation.
Step 2. Locate the Preprocessor field under the Inline transformation section.
Step 3. Enter the JavaScript code in the provided text box. The script should modify the message variable and return the transformed JSON as a string.
Example: Extracting and Flattening a Nested JSON Structure
Let’s assume an API returns the following deeply nested JSON:
{
"status": "success",
"meta": {
"count": 2,
"timestamp": "2025-01-30T12:00:00Z"
},
"data": [
{
"id": 1,
"name": "John Doe",
"profile": {
"age": 30,
"gender": "male"
},
"location": {
"city": "New York",
"country": "USA"
}
},
{
"id": 2,
"name": "Jane Doe",
"profile": {
"age": 28,
"gender": "female"
},
"location": {
"city": "Los Angeles",
"country": "USA"
}
}
]
}
💡 Goal:
• Extract only the data array.
• Flatten nested objects inside data (like contact and address).
• Convert all nested structures into JSON-encoded strings.
Preprocessor Code
var json = JSON.parse(message); // Parse the JSON input
var transformedData = json.data.map(function(item) {
var flattenedItem = {}; // New object to store the transformed structure
Object.keys(item).forEach(function(key) {
if (typeof item[key] === "object" && item[key] !== null) {
flattenedItem[key] = JSON.stringify(item[key]); // Convert nested object to string
} else {
flattenedItem[key] = item[key]; // Keep primitive values as they are
}
});
return flattenedItem;
});
value = JSON.stringify(transformedData); // Return the transformed JSON as a string
Transformed Output
[
{
"id": 1,
"name": "John Doe",
"profile": "{\"age\":30,\"gender\":\"male\"}",
"location": "{\"city\":\"New York\",\"country\":\"USA\"}"
},
{
"id": 2,
"name": "Jane Doe",
"profile": "{\"age\":28,\"gender\":\"female\"}",
"location": "{\"city\":\"Los Angeles\",\"country\":\"USA\"}"
}
]
💡 Effect: JSON is flattened, nested nodes converted to strings
How It Works
1. Extracts the data array from the JSON.
2. Iterates over each item in the data array.
3. Checks each field inside the item:
• If the field is a nested object, it is converted to a string.
• If the field is a primitive value (string, number, boolean, null), it is kept as-is.
4. Returns a flat JSON array where all nested structures are stringified.
Working with XML documents
Transforming XML Documents with JavaScript Preprocessor
In Etlworks, when using a JavaScript preprocessor to manipulate XML, you can process XML as a string (manual approach) or use Java’s built-in SAX parser for structured transformations.
Configuring the Preprocessor for XML Format
To configure a Preprocessor for XML format:
Step 1. Open source XML format used in a transformation.
Step 2. Locate the XSL Style Sheet or Preprocessor field under the Inline transformation section.
Step 3. Enter the JavaScript code in the provided text box. The script should modify the message variable and return the transformed XML as a string.
Step 4. Set Transformation type to preprocessor.
Below are examples demonstrating both approaches.
Working with XML as a String (Manual Approach)
Using string functions, you can modify XML documents before processing. This is useful for simple transformations like removing, replacing, or extracting specific elements.
Example: Remove a Specific XML Node
Input XML
<root>
<items>
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
</items>
<metadata>
<timestamp>2025-01-01T12:00:00</timestamp>
</metadata>
</root>
JavaScript Preprocessor Code
// Remove <metadata> node using regex
var transformedXml = message.replace(/<metadata>.*?<\/metadata>/s, '');
value = transformedXml; // Return transformed XML
Transformed Output
<root>
<items>
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
</items>
</root>
💡 Effect: The <metadata> node is removed.
Example: Rename a Node
JavaScript Preprocessor Code
// Replace <items> with <list> and </items> with </list>
var transformedXml = message
.replace(/<items>/g, '<list>')
.replace(/<\/items>/g, '</list>');
value = transformedXml; // Return transformed XML
Transformed Output
<root>
<list>
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
</list>
</root>
💡 Effect: <items> is renamed to <list>.
Example: Extract a Specific Node’s Content
// Extract content of the <items> node
var match = message.match(/<items>(.*?)<\/items>/s);
value = match ? match[1] : ''; // Extract content inside <items>
Transformed Output
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
💡 Effect: Extracts only the contents of <items>.
Working with Java’s SAX Parser in JavaScript
For more complex transformations, you can use Java’s SAX parser in the JavaScript preprocessor. SAX (Simple API for XML) is a fast, memory-efficient way to parse and modify XML.
Example: Removing a Specific XML Node Using Java SAX Parser
JavaScript Preprocessor Code
var SAXParserFactory = Java.type("javax.xml.parsers.SAXParserFactory");
var InputSource = Java.type("org.xml.sax.InputSource");
var StringReader = Java.type("java.io.StringReader");
var DefaultHandler = Java.type("org.xml.sax.helpers.DefaultHandler");
var StringWriter = Java.type("java.io.StringWriter");
var TransformerFactory = Java.type("javax.xml.transform.TransformerFactory");
var Transformer = TransformerFactory.newInstance().newTransformer();
var DOMImplementationLS = Java.type("org.w3c.dom.ls.DOMImplementationLS");
var DocumentBuilderFactory = Java.type("javax.xml.parsers.DocumentBuilderFactory");
// Create XML Parser
var factory = SAXParserFactory.newInstance();
var builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
var doc = builder.parse(new InputSource(new StringReader(message)));
// Remove <metadata> node
var metadataNode = doc.getElementsByTagName("metadata").item(0);
if (metadataNode !== null) {
metadataNode.getParentNode().removeChild(metadataNode);
}
// Convert modified XML document back to a string
var writer = new StringWriter();
Transformer.transform(new javax.xml.transform.dom.DOMSource(doc),
new javax.xml.transform.stream.StreamResult(writer));
value = writer.toString(); // Return modified XML
💡 Effect: The <metadata> node is removed before processing.
Renaming a Node (<items> → <list>)
This example renames the <items> node to <list> dynamically.
Preprocessor Code
var SAXParserFactory = Java.type("javax.xml.parsers.SAXParserFactory");
var InputSource = Java.type("org.xml.sax.InputSource");
var StringReader = Java.type("java.io.StringReader");
var DefaultHandler = Java.type("org.xml.sax.helpers.DefaultHandler");
var DocumentBuilderFactory = Java.type("javax.xml.parsers.DocumentBuilderFactory");
var TransformerFactory = Java.type("javax.xml.transform.TransformerFactory");
var Transformer = TransformerFactory.newInstance().newTransformer();
var StringWriter = Java.type("java.io.StringWriter");
// Create XML Parser
var factory = SAXParserFactory.newInstance();
var builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
var doc = builder.parse(new InputSource(new StringReader(message)));
// Rename <items> to <list>
var itemsNode = doc.getElementsByTagName("items").item(0);
if (itemsNode !== null) {
var newElement = doc.createElement("list");
// Move all child nodes from <items> to <list>
while (itemsNode.hasChildNodes()) {
newElement.appendChild(itemsNode.firstChild);
}
// Replace <items> with <list>
itemsNode.getParentNode().replaceChild(newElement, itemsNode);
}
// Convert modified XML document back to a string
var writer = new StringWriter();
Transformer.transform(new javax.xml.transform.dom.DOMSource(doc),
new javax.xml.transform.stream.StreamResult(writer));
value = writer.toString(); // Return modified XML
💡 Effect: <items> is renamed to <list> dynamically
Extracting Only a Specific XML Node’s Content
This example extracts only the <items> node from the XML.
var SAXParserFactory = Java.type("javax.xml.parsers.SAXParserFactory");
var InputSource = Java.type("org.xml.sax.InputSource");
var StringReader = Java.type("java.io.StringReader");
var DefaultHandler = Java.type("org.xml.sax.helpers.DefaultHandler");
var DocumentBuilderFactory = Java.type("javax.xml.parsers.DocumentBuilderFactory");
var TransformerFactory = Java.type("javax.xml.transform.TransformerFactory");
var Transformer = TransformerFactory.newInstance().newTransformer();
var StringWriter = Java.type("java.io.StringWriter");
// Create XML Parser
var factory = SAXParserFactory.newInstance();
var builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
var doc = builder.parse(new InputSource(new StringReader(message)));
// Extract <items> node
var itemsNode = doc.getElementsByTagName("items").item(0);
if (itemsNode !== null) {
doc.removeChild(doc.getDocumentElement()); // Remove root
doc.appendChild(itemsNode.cloneNode(true)); // Keep only <items>
}
// Convert modified XML document back to a string
var writer = new StringWriter();
Transformer.transform(new javax.xml.transform.dom.DOMSource(doc),
new javax.xml.transform.stream.StreamResult(writer));
value = writer.toString(); // Return modified XML
💡 Effect: Converts XML into JSON before processing.
Using XSL Transformations to transform XML documents
In Etlworks, you can apply XSL transformations (XSLT) to modify XML documents dynamically before they are processed. XSLT is a language for transforming XML documents into different structures or even into other formats like JSON, CSV, or HTML.
Key Benefits of Using XSLT in Etlworks
• Native XML Transformation: No need for additional scripting; works inline.
• Flexible Processing: Can restructure, extract, or convert XML as needed.
• Compatible with Existing XML Workflows: Can be used with external APIs and data integration tasks.
Configuring the XSL transormation for XML Format
To use XSLT in Etlworks, follow these steps:
Step 1. Set the Transformation Type to query.
Step 2. Enter a valid XSL Style Sheet in the XSL Style Sheet or Preprocessor field.
Step 3. The transformation will be applied at runtime, modifying the XML before further processing.
Example : Extracting Specific XML Nodes using XSL
Input XML
<root>
<items>
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
</items>
<metadata>
<timestamp>2025-01-01T12:00:00</timestamp>
</metadata>
</root>
XSLT for Extraction
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<filteredItems>
<xsl:copy-of select="//item"/>
</filteredItems>
</xsl:template>
</xsl:stylesheet>
Transformed Output
<filteredItems>
<item>
<id>1</id>
<name>Item A</name>
</item>
<item>
<id>2</id>
<name>Item B</name>
</item>
</filteredItems>
💡 Effect: The <metadata> section is removed, leaving only <item> nodes.
Example : Flattening XML Structure
This example flattens a nested XML structure by converting elements into attributes.
<order>
<customer>
<name>John Doe</name>
<email>john@example.com</email>
</customer>
<details>
<product>Gadget</product>
<price>99.99</price>
</details>
</order>
XSLT for Flattening
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<orderSummary>
<customer name="{order/customer/name}" email="{order/customer/email}"/>
<product name="{order/details/product}" price="{order/details/price}"/>
</orderSummary>
</xsl:template>
</xsl:stylesheet>
Transformed Output
<orderSummary>
<customer name="John Doe" email="john@example.com"/>
<product name="Gadget" price="99.99"/>
</orderSummary>
💡 Effect: The structure is flattened, making it easier to process in tabular formats.
Flatten Nested Documents using JavaScript
The source JSON or XML documents contain deeply nested structures that need to be flattened into a simple, tabular format before loading into a destination such as a database, spreadsheet, or flat file.
This can be achieved using custom JavaScript functions in the Source Query or Format Preprocessor.
Flatten JSON Documents
The following JavaScript function recursively traverses a nested JSON document and produces a completely flat structure.
Each nested field name is concatenated using a configurable separator (for example, _).
Function flattenJson
function flattenJson(message, normalize, allRecords, fieldSeparator, rootNode) {
normalize = (normalize === true);
allRecords = (allRecords === true);
fieldSeparator = fieldSeparator || '_';
rootNode = (typeof rootNode === 'string' && rootNode.trim() !== '') ? rootNode : null;
function isArray(val) { return Object.prototype.toString.call(val) === '[object Array]'; }
function isObject(val) { return val !== null && typeof val === 'object' && !isArray(val); }
function mergeKeys(records, allKeys) {
for (var i = 0; i < records.length; i++) {
var rec = records[i];
for (var k = 0; k < allKeys.length; k++) {
var key = allKeys[k];
if (!(key in rec)) rec[key] = null;
}
}
}
function collectAllKeys(records) {
var keySet = {};
for (var i = 0; i < records.length; i++) {
for (var key in records[i]) keySet[key] = true;
}
var allKeys = [];
for (var k in keySet) allKeys.push(k);
return allKeys;
}
function flatten(obj, prefix, base) {
var results = [];
var record = base || {};
for (var key in obj) {
var val = obj[key];
var fullKey = prefix ? prefix + fieldSeparator + key : key;
if (isArray(val)) {
if (val.length > 0 && (isObject(val[0]) || isArray(val[0]))) {
if (allRecords) {
for (var i = 0; i < val.length; i++) {
var clone = JSON.parse(JSON.stringify(record));
var nested = flatten(val[i], fullKey, clone);
for (var j = 0; j < nested.length; j++) results.push(nested[j]);
}
return results;
} else {
var nested = flatten(val[0], fullKey, record);
for (var j = 0; j < nested.length; j++) results.push(nested[j]);
return results;
}
} else {
record[fullKey] = val.length > 0 ? val[0] : null;
}
} else if (isObject(val)) {
var nestedObj = flatten(val, fullKey, record);
for (var i = 0; i < nestedObj.length; i++) results.push(nestedObj[i]);
return results;
} else {
record[fullKey] = val;
}
}
if (results.length === 0) results.push(record);
return results;
}
var parsed = JSON.parse(message);
if (rootNode && parsed.hasOwnProperty(rootNode)) parsed = parsed[rootNode];
if (!isArray(parsed)) parsed = [parsed];
var output = [];
for (var i = 0; i < parsed.length; i++) {
var flat = flatten(parsed[i], '', {});
output = output.concat(flat);
}
if (normalize) {
var allKeys = collectAllKeys(output);
mergeKeys(output, allKeys);
}
return JSON.stringify(output);
}
Parameters for the function flattenJson
-
message — JSON string to be flattened.
-
normalize — Ensures each record has the same set of fields (missing ones filled with null).
-
allRecords — If true, flattens all elements in nested arrays; otherwise, only the first.
-
fieldSeparator — Character used to separate nested field names (default _).
-
rootNode — Optional key to extract a specific object or array from the input JSON (for example, 'data').
How to Use function flattenJson
Step 1. Create a flow where the source is a JSON file or API response.
Step 2. Open Source Query (or Source JSON Format Preprocessor) and define the function above.
Step 3. Add the following line at the end of the query to execute the transformation:
value = flattenJson(message, true, true, '_', null);
If the JSON payload contains a root node (for example, data), call:
value = flattenJson(message, true, true, '_', 'data');
Flatten XML Documents
Similarly, XML documents can be flattened using a JavaScript function that recursively traverses element hierarchies and converts them into a flat XML table structure.
Function flattenXml
function flattenXml(xmlString, parseAttributes, ignorePrologAttributes, normalize, allRecords, fieldSeparator, rootNode) {
parseAttributes = (parseAttributes !== false);
ignorePrologAttributes = (ignorePrologAttributes !== false);
normalize = (normalize === true);
allRecords = (allRecords === true);
fieldSeparator = fieldSeparator || '_';
rootNode = rootNode || 'root';
var DocumentBuilderFactory = Java.type('javax.xml.parsers.DocumentBuilderFactory');
var InputSource = Java.type('org.xml.sax.InputSource');
var StringReader = Java.type('java.io.StringReader');
var Node = Java.type('org.w3c.dom.Node');
var TransformerFactory = Java.type('javax.xml.transform.TransformerFactory');
var DOMSource = Java.type('javax.xml.transform.dom.DOMSource');
var StreamResult = Java.type('javax.xml.transform.stream.StreamResult');
var StringWriter = Java.type('java.io.StringWriter');
function isElement(node) { return node.getNodeType() === Node.ELEMENT_NODE; }
function createDoc() { return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument(); }
function createElement(doc, name, value) {
var el = doc.createElement(name);
if (value != null) el.appendChild(doc.createTextNode(value.toString()));
return el;
}
function collectAllKeys(records) {
var keySet = {};
for (var i = 0; i < records.length; i++) for (var key in records[i]) keySet[key] = true;
var allKeys = [];
for (var k in keySet) allKeys.push(k);
return allKeys;
}
function mergeKeys(records, allKeys) {
for (var i = 0; i < records.length; i++) {
var rec = records[i];
for (var j = 0; j < allKeys.length; j++) {
var key = allKeys[j];
if (!(key in rec)) rec[key] = '';
}
}
}
function flattenElement(node, prefix, base, results) {
base = base || {};
results = results || [];
var record = JSON.parse(JSON.stringify(base));
var nodeName = prefix ? prefix + fieldSeparator + node.getNodeName() : node.getNodeName();
if (parseAttributes && node.hasAttributes()) {
var attrs = node.getAttributes();
for (var i = 0; i < attrs.getLength(); i++) {
var attr = attrs.item(i);
if (!ignorePrologAttributes || node.getParentNode() !== null)
record[nodeName + fieldSeparator + attr.getName()] = attr.getValue();
}
}
var childElements = [];
var hasText = false;
var textValue = null;
var children = node.getChildNodes();
for (var i = 0; i < children.getLength(); i++) {
var child = children.item(i);
if (child.getNodeType() === Node.TEXT_NODE || child.getNodeType() === Node.CDATA_SECTION_NODE) {
var text = child.getNodeValue().trim();
if (text.length > 0) { hasText = true; textValue = text; }
} else if (isElement(child)) {
childElements.push(child);
}
}
if (childElements.length === 0 && hasText) record[nodeName] = textValue;
var childGroups = {};
for (var i = 0; i < childElements.length; i++) {
var child = childElements[i];
var tag = child.getNodeName();
if (!childGroups[tag]) childGroups[tag] = [];
childGroups[tag].push(child);
}
var addedNested = false;
for (var tag in childGroups) {
var group = childGroups[tag];
if (group.length > 0) {
if (allRecords) {
for (var i = 0; i < group.length; i++) {
var copy = JSON.parse(JSON.stringify(record));
flattenElement(group[i], nodeName, copy, results);
}
addedNested = true;
} else {
flattenElement(group[0], nodeName, record, results);
addedNested = true;
}
}
}
if (!addedNested) results.push(record);
return results;
}
function recordsToXml(records) {
var doc = createDoc();
var root = doc.createElement(rootNode);
doc.appendChild(root);
for (var i = 0; i < records.length; i++) {
var row = doc.createElement('record');
var rec = records[i];
for (var key in rec) row.appendChild(createElement(doc, key, rec[key]));
root.appendChild(row);
}
var tf = TransformerFactory.newInstance().newTransformer();
var sw = new StringWriter();
tf.transform(new DOMSource(doc), new StreamResult(sw));
return sw.toString();
}
var factory = DocumentBuilderFactory.newInstance();
var builder = factory.newDocumentBuilder();
var inputSource = new InputSource(new StringReader(xmlString));
var doc = builder.parse(inputSource);
var root = doc.getDocumentElement();
var records = flattenElement(root, '', {}, []);
if (normalize) {
var allKeys = collectAllKeys(records);
mergeKeys(records, allKeys);
}
return recordsToXml(records);
}
Parameters for the function flattenXml
-
message — XML string to flatten.
-
parseAttributes — Include XML attributes as fields.
-
ignorePrologAttributes — Skip attributes in prolog or root-level metadata elements.
-
normalize — Ensures each record has the same set of fields by filling missing values.
-
allRecords — If true, creates one record per nested element; otherwise, only the first.
-
fieldSeparator — Separator between nested field names.
-
rootNode — Name of the root node in the flattened XML.
How to Use function flattenXml
Step 1. Create a flow where the source is a XML file or API response.
Step 2. Open Source Query (or Source XML Format Preprocessor) and define the function above.
Step 3. Add the following line at the end of the query to execute the transformation:
value = flattenXml(message, true, true, true, true, '_', 'root');
The function will output a flattened XML structure where each record corresponds to a single row of data, making it ready for loading into the destination.