Overview
The CLOB Format stores the entire content of a document, such as a nested JSON or XML file, as-is in a single text column, without parsing or transformation.
When to use this Format
The CLOB Format is used to load the entire content of the file into a single column in a data set.
It can be extremely useful when you just want to make a few changes in the source document without applying a full-fledged source-to-destination transformation.
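To make the idea of "a single column in a data set" concrete, here is a minimal sketch of what the loaded data might look like. The function loadAsClob and the shape of the returned object are hypothetical and only illustrate the concept; they are not part of the product's API. The sketch also assumes the data set holds exactly one row.

```javascript
// Hypothetical model of a CLOB data set: one column, one row, raw text untouched.
// This is an illustration only, not the product's internal representation.
function loadAsClob(fileText, columnName) {
    // 'content' is the documented default column name.
    var name = columnName || 'content';
    return {
        columns: [name],
        rows: [[fileText]] // assumption: the whole file becomes a single row
    };
}

var dataSet = loadAsClob('{"order": {"id": 1, "items": [{"sku": "A"}]}}');
console.log(dataSet.columns);    // the single column name
console.log(dataSet.rows[0][0]); // the unparsed document text
```

Note that the nested JSON is never parsed: it travels through the data set as one opaque string.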
Process
To create a new CLOB Format, go to Connections, select the Formats tab, click Add Format, and type clob in the Search field.
The following parameters are available when configuring the Format:
- Column Name: the name of the single column that receives the content of the file. The default column name is content.
- Default Extension: the extension used when the file name doesn't have one. If not entered, txt is used.
- Preprocessor: JavaScript code used to change the entire document. Read more about the preprocessor.
- Output Template: if not empty, the {tokens} in the output template will be merged with global variables to produce the output.
- Encoding: the character encoding used when reading and writing a file. If no encoding is selected, no additional encoding is applied.
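The Output Template parameter relies on {token} substitution. The sketch below shows one way such a merge could work. The function mergeTemplate, the variable names, and the decision to leave unknown tokens unchanged are all assumptions made for illustration; the actual merge logic of the platform may differ.

```javascript
// Hypothetical helper: replaces each {token} with the matching variable value.
// Tokens without a matching variable are left unchanged (an assumption).
function mergeTemplate(template, variables) {
    return template.replace(/\{([^{}]+)\}/g, function (match, tokenName) {
        return Object.prototype.hasOwnProperty.call(variables, tokenName)
            ? String(variables[tokenName])
            : match;
    });
}

var output = mergeTemplate(
    '<order id="{orderId}" status="{status}"/>',
    { orderId: 42, status: 'shipped' }
);
console.log(output); // <order id="42" status="shipped"/>
```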
Use the CLOB format to transform source messages
- If you want to extract data from any source, transform it and load it into any destination, you will be using source-to-destination transformations.
- If you want to move or copy files unmodified from the source to the destination, you will be using file management Flows, such as copy, move, delete, etc.
- If you want to transform XML files using XSLT, you will most likely be using XSLT Flow.
- If you want to create complex nested XML or JSON documents, you will most likely use JavaScript or Python.
The most generic transformation is source-to-destination. In almost all cases, it hides the complexity of working with specific data Formats and allows ETL developers to use high-level instruments, such as the Mapping editor.
There are cases, however, when you want to make a few changes in the source text document and save it to the same or different location.
Example
Here's a scenario where you have to tweak the source text document and save it either in the same location or in a different one:
- Rename all nodes in the XML document whose names start with name so that they start with the_name, that is, replace <name with <the_name.
The example above can be easily accomplished using the technique explained below:
Step 1. Create the source and destination Connections. Each can be any file storage, cloud storage, or HTTP Listener.
Step 2. Create a new CLOB Format.
Step 3. When creating a Format, enter the transformation code in the Preprocessor field. For example:
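if (message != null) {
    // A string pattern replaces only the first match in JavaScript,
    // so use a global regular expression to replace every occurrence.
    message = String(message).replace(/<name/g, '<the_name');
}
// The preprocessor returns the modified document through the value variable.
value = message;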
Step 4. Create a source-to-destination transformation that uses the Connections created in step 1 and the Format created in step 2.
Step 5. Set the source and destination names. You can use a wildcard file name as the source and enable wildcard file processing.
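The preprocessor in step 3 targets the text <name, which appears only in opening tags; a closing tag such as </name_first> does not contain that text and would keep its old name. The sketch below is a possible variant that renames both. The function renameNameTags is a hypothetical wrapper introduced here so the logic can be tested outside the platform; inside the Preprocessor field you would apply the same expression to message and assign the result to value.

```javascript
// Hypothetical stand-alone version of the preprocessor logic.
function renameNameTags(message) {
    if (message == null) {
        return message;
    }
    // String(...) guards against a non-JavaScript string type.
    // The g flag replaces every occurrence, and the optional slash is
    // captured so that closing tags are renamed along with opening tags.
    return String(message).replace(/<(\/?)name/g, '<$1the_name');
}

console.log(renameNameTags('<name_first>Ann</name_first><name_last>Lee</name_last>'));
// <the_name_first>Ann</the_name_first><the_name_last>Lee</the_name_last>
```

A text-level replace like this does not parse the XML, so it would also change <name inside comments or CDATA sections. For documents where that matters, a full source-to-destination or XSLT transformation is the safer choice.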