When to use this transformation
Use this transformation to split a larger dataset into smaller partitions.
This transformation is only available when the destination is a file.
There are two options for partitioning: by the maximum number of records in a partition, or by the unique values of one or more fields.
Create a transformation
To configure a Partition By transformation, go to Transformation / MAPPING / Complex Transformations / Partition By.
Split by the maximum number of records in the partition
If you enter a numeric value in the Partition By field, the system treats it as the maximum number of records in a file. For example, if Partition By is set to 100 and the dataset has 1,000 records, then 10 files, each containing 100 records, will be created.
The file names are assigned using the following pattern: original filename + _ + index + original file extension.
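The splitting and naming rule above can be illustrated with a minimal Python sketch. This is not the product's implementation: the function name split_by_max_records is hypothetical, and the starting index of 1 is an assumption, since the text does not say where numbering begins.

```python
import os


def split_by_max_records(records, filename, max_records, first_index=1):
    """Return a dict mapping each output file name to its list of records.

    first_index=1 is an assumption; the source does not state the first index.
    """
    base, ext = os.path.splitext(filename)
    parts = {}
    for offset in range(0, len(records), max_records):
        index = first_index + offset // max_records
        # Pattern: original filename + _ + index + original file extension
        parts[f"{base}_{index}{ext}"] = records[offset:offset + max_records]
    return parts


# 1,000 records with a maximum of 100 per file
parts = split_by_max_records(list(range(1000)), "order.csv", 100)
print(len(parts))
```

With these inputs the sketch produces 10 partitions of 100 records each, matching the example above.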
Split by unique values of partition-by fields
If you enter an alphanumeric value in the Partition By field, the system treats it as a comma-separated list of fields to group the records by before splitting the file. For example, if Partition By is set to last_name,first_name and the dataset contains multiple records with the same last and first name, then one file is created for each unique combination of last and first name, containing all records that share it.
The file names are assigned using the following pattern: original filename + _ + value of the column to partition by + original file extension.
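Grouping by field values can be sketched in Python as follows. The function name split_by_fields is hypothetical, and the way values of several partition-by fields are combined in the file name (joined with an underscore here) is an assumption; the text only describes the single-value case.

```python
import os
from collections import defaultdict


def split_by_fields(records, filename, partition_by, joiner="_"):
    """Group records (dicts) by the comma-separated fields in partition_by.

    joiner="_" is an assumption about how multiple field values are combined.
    """
    base, ext = os.path.splitext(filename)
    fields = [name.strip() for name in partition_by.split(",")]
    groups = defaultdict(list)
    for record in records:
        key = joiner.join(str(record[name]) for name in fields)
        # Pattern: original filename + _ + partition value + original extension
        groups[f"{base}_{key}{ext}"].append(record)
    return dict(groups)


rows = [
    {"last_name": "Smith", "first_name": "Ann", "total": 10},
    {"last_name": "Smith", "first_name": "Ann", "total": 25},
    {"last_name": "Jones", "first_name": "Bob", "total": 7},
]
files = split_by_fields(rows, "order.csv", "last_name,first_name")
for name, group in files.items():
    print(name, len(group))
```

Records that share the same last and first name end up in the same file, and each unique combination gets its own file.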
Ignore the original filename
When the Partition By transformation splits a dataset into multiple files, the file names include the original destination filename, for example order_1234.csv, where 1234 is the value of the Partition By column. To omit the original file name, enable the property Transformation / MAPPING / Complex Transformations / Ignore Original File Name. In the example above, the file would then be named _1234.csv.
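The effect of this property on the file name can be sketched in Python. The function partition_file_name and its ignore_original parameter are hypothetical names used only for illustration; the behavior is inferred from the order_1234.csv and _1234.csv examples above.

```python
import os


def partition_file_name(filename, value, ignore_original=False):
    """Build a partition file name, optionally dropping the original base name."""
    base, ext = os.path.splitext(filename)
    prefix = "" if ignore_original else base
    # The underscore and the extension are kept in both cases
    return f"{prefix}_{value}{ext}"


default_name = partition_file_name("order.csv", 1234)
ignored_name = partition_file_name("order.csv", 1234, ignore_original=True)
print(default_name, ignored_name)
```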