Overview
Scripting transformations are code transformations executed at different steps of the ETL process, such as before extract, for each row, after extract, or after load.
The typical use cases are:
- Change a data set on the fly, including fields, values, and structure.
- Programmatically set global and flow variables.
- Modify source query.
You can use JavaScript (default) or Python in the scripting transformation.
Scripting transformations
To configure one of the following scripting transformations, go to:
- Transformation / MAPPING / Additional Transformations / Before Prepare Source Query: scripting transformation executed before the source query is merged with the flow variables.
- Transformation / MAPPING / Additional Transformations / Before Extract: scripting transformation executed before the extract.
- Transformation / MAPPING / Additional Transformations / For each row: scripting transformation executed for each extracted row. The alternative to the For each row scripting transformation is per-field Mapping.
- Transformation / MAPPING / Additional Transformations / After Extract: scripting transformation executed after the extract.
- Transformation / MAPPING / Additional Transformations / After Load: scripting transformation executed after the load.
Before Prepare Source Query
This transformation is executed right before the Source query is merged with the Flow variables, so it is a perfect place to set the values of the Flow variables. It can also be used to programmatically modify the Source query itself.
Change the Source query programmatically
source.setSql('select * from table where id > {VAR_NAME}');
Set the value of the Flow variable
scenario.getVariable('VAR_NAME').setValue(value);
Before Extract
This transformation is executed before the extract. It can be used to set global and Flow variables, for logging, or for any other purpose.
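As a sketch, a Before extract script might initialize a Flow variable using the `scenario` object described in the table of available objects below (the variable name `LAST_RUN_TIMESTAMP` is a hypothetical example and must already be defined in the flow):

```javascript
// Hypothetical example: set the LAST_RUN_TIMESTAMP flow variable
// to the current time before the extract starts.
var variable = scenario.getVariable('LAST_RUN_TIMESTAMP');
if (variable != null) {
    variable.setValue(new java.util.Date().toString());
}
```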
For Each Row
This transformation is executed for each row of the extracted data set. Use it to modify field values on the fly. The alternative to the For each row scripting transformation is per-field Mapping.
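For illustration, a For each row script can combine the `currentRow` object with the `getFieldValue` and `setValue` methods of the `dataSet` object (the field name `email` is an assumed example):

```javascript
// Hypothetical example: lower-case the "email" field of the current row.
var email = dataSet.getFieldValue(currentRow, 'email');
if (email != null) {
    dataSet.setValue(currentRow, 'email', email.toString().toLowerCase());
}
```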
After Extract
This transformation is executed after the extract is finished. It can be used to create a completely new data set using the existing data set as a source. For example, you can create a highly nested data set from the source, which will later be used to create a complex XML document. In order for this transformation to work, streaming must be disabled (Parameters / Stream Data / Disable).
After Load
This transformation is executed in the last step of the extract-transform-load, after the data has been successfully loaded into the destination. It can be used to set global and Flow variables, for logging, cleanup, or for any other purpose.
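As an illustrative sketch, an After load script could clear a Flow variable used during the run (the variable name `BATCH_ID` is hypothetical; note that, per the table of available objects below, the `dataSet` object is not available in this transformation):

```javascript
// Hypothetical example: reset the BATCH_ID flow variable after a successful load.
var variable = scenario.getVariable('BATCH_ID');
if (variable != null) {
    variable.setValue(null);
}
```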
You can use all five transformations together.
Available objects
The following objects can be referenced by name from JavaScript or Python code:
Object name | Class name / JavaDoc | Package | Available in
---|---|---|---
dataSet | com.toolsverse.etl.common.DataSet | com.toolsverse.etl.common | After extract, for each row
currentRow | com.toolsverse.etl.common.DataSetRecord | com.toolsverse.etl.common | For each row
etlConfig | com.toolsverse.etl.core.config.EtlConfig | com.toolsverse.etl.core.config | Before extract, for each row, after extract
scenario | com.toolsverse.etl.core.engine.Scenario | com.toolsverse.etl.core.engine | Before extract, for each row, after extract
row | current 0-based row number | | For each row
Most commonly used methods
The following methods of the DataSet object are typically used when modifying a data set.
Method | Purpose |
---|---|
getRecordCount() | returns the number of records in the data set |
getFieldCount() | returns the number of fields in the data set |
DataSetRecord getRecord(index) | returns a record by index |
DataSetData getData() | returns an array of records |
DataSetFields getFields() | returns an array of fields (columns) |
getFieldDef(name) | returns the field's definition by name |
getFieldDef(index) | returns the field's definition by index |
TypedKeyValue&lt;FieldDef, DataSet&gt; findField(name) | finds a field and, optionally, the nested data set it belongs to |
setData(DataSetData) | sets records |
setFields(DataSetFields) | sets fields (columns) |
getFieldValue(row, column) | returns the value of a field for the given row and column |
getFieldValue(DataSetRecord, column) | returns the value of a field for the given record and column |
getFieldValue(DataSetRecord, name) | returns the value of a field for the given record and field name |
addField(FieldDef) | adds a field (column) |
addField(name, value) | adds a field (column) and value |
setFieldValue(DataSetRecord, column, value) | sets the field value for the given record and given column number |
setFieldValue(row, column, value) | sets the field value for the given row and given column number |
setValue(DataSetRecord, name, value) | sets the field value for the given record, adds a new field if needed |
getActualData() | gets the data in its internal format. Currently used for HL7 messages only |
setActualData(Object) | sets the data in its internal format. Currently used for HL7 messages only |
Example
The example below demonstrates how to traverse the original data set and create a brand-new one, which is then used to create a destination file. The After extract transformation is used for this.
var stageDataSet = new com.toolsverse.etl.common.DataSet();
var rows = dataSet.getRecordCount();
for (var row = 0; row < rows; row++) {
var record = dataSet.getRecord(row);
stageDataSet.addField("Name.FirstName", dataSet.getFieldValue(record, "firstName"));
stageDataSet.addField("Name.LastName", dataSet.getFieldValue(record, "lastName"));
stageDataSet.addField("Age", dataSet.getFieldValue(record, "age"));
stageDataSet.addField("City", dataSet.getFieldValue(record, "city"));
stageDataSet.addField("State", dataSet.getFieldValue(record, "state"));
stageDataSet.addField("ZipCode", dataSet.getFieldValue(record, "postalCode"));
stageDataSet.addField("Addr1", dataSet.getFieldValue(record, "streetAddress"));
var phones = dataSet.getFieldValue(record, "phoneNumbers");
if (phones != null && phones.getRecordCount() > 0) {
var phonesCount = phones.getRecordCount();
for (var phoneRow = 0; phoneRow < phonesCount; phoneRow++) {
stageDataSet.addField("Phone.Type",
dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers_type", '.', phoneRow));
stageDataSet.addField("Phone.Num",
dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers", '.', phoneRow));
}
}
}
dataSet.setFields(stageDataSet.getFields());
dataSet.setData(stageDataSet.getData());