Overview
The Etlworks includes a transformation that can flatten nested datasets by extracting dimensions.
This transformation is executed last after the whole nested data object is read into memory hence enabling this transformation automatically disables data streaming.
Process
Step 1. Create an ETL flow where the source is a file.
Step 2. Configure Extract Dimensions
transformation.
To configure the transformation, go toMAPPING
/Complex Transformations
/Extract Dimensions
.
For this transformation, the following parameters are available:
Extract Dimensions
: a comma-separated list of dimensions that are used to specify the path to the data element in the nested dataset to extract data from.Columns to Include Together with Dimensions to Extract
: a comma-separated list of the columns in the driving dataset to include together with columns from the extracted dimension.
Example
Let's assume the response from the web service is a nested JSON objectusers.json
like the one presented below:
[
{
"firstName":"Duke",
"lastName":"Java",
"test":[
{
"who":"duke",
"why":"because"
}
],
"age":18,
"streetAddress":"100 Internet Dr",
"city":"JavaTown",
"state":"JA",
"postalCode":"12345",
"phoneNumbers":[
{
"type":"Mobile",
"phoneNumber":"111-111-1111"
},
{
"type":"Home",
"phoneNumber":"222-222-2222"
}
]
}
]
To extract all the phone numbers together with the user data, we could configure the Extract dimensions transformation as the following :
- For
Extract Dimensions
: phoneNumber. - For
Columns to Include Together with Dimensions to Extract
:firstName,lastName,age
.
As a result, the system will create the following flat dataset:
firstName lastName age .... phoneNumber_type phoneNumber
Duke Java 18.... Mobile 111-111-1111
Duke Java 18.... Home 222-222-2222
Comments
0 comments
Please sign in to leave a comment.