Defining Transformation
How to define and run a transformation
Transformation parameters (transform_params) define the set of transformations (LLM or otherwise) that you want to run on your data.
transform_params
Overview
The transform_params object contains configuration settings for performing data transformations using our AI models. Below are the detailed parameters included in transform_params:
model
- Type: string
- Description: Specifies the LLM engine to use for the transformation. Options include "trellis-premium", "trellis-vertix", and "trellis-scale", each offering a different level of speed and accuracy. We recommend "trellis-premium".
mode
- Type: string
- Description: The method of processing the data. Use "document" by default; use "table" if you're parsing tables.
operations
- Type: list of operation objects
- Description: Each operation is a data field to extract from your data. An operation specifies the target column, the data type of that column, the type of transformation to apply, and a description of the task.
Each object within operations includes the following parameters:
a. column_name
- Type: string
- Description: Names the column in the dataset on which the operation will be executed, identifying the specific data point to transform or extract. Must be in snake case.
b. column_type
- Type: string
- Description: The data type of the target column. It must be a PostgreSQL data type, as listed in the PostgreSQL documentation (https://www.postgresql.org/docs/current/datatype.html). Valid types include, but are not limited to, text for string data, text[] for arrays of text, numeric for numerical data, and date for date values.
c. transform_type
- Type: string
- Description: The transformation or extraction method to apply to the data in the target column. "extraction" means the operation retrieves specific pieces of data from the column. Other types include "classification" and "generation".
d. task_description
- Type: string
- Description: A clear, human-readable explanation of what the operation should achieve. For example, for an operation that extracts URLs from text data, the description would state the purpose of extracting that information.
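The four operation parameters lend themselves to a quick local check before a request is sent. The following is a minimal sketch in Python, not part of the product: the helper name check_operation is hypothetical, the snake-case pattern is one reasonable reading of "snake case", and the set of transform types contains only the three named here, so the real list may be longer.

```python
import re

# Keys every operation is described as carrying.
REQUIRED_KEYS = ("column_name", "column_type", "transform_type", "task_description")

# Transform types named in this section; assumed, not necessarily exhaustive.
KNOWN_TRANSFORM_TYPES = {"extraction", "classification", "generation"}

# Lowercase words separated by single underscores, e.g. "invoice_total".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_operation(operation: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Every parameter is documented as a string, so reject missing or blank values.
    for key in REQUIRED_KEYS:
        value = operation.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} must be a non-empty string")

    # column_name must be in snake case.
    name = operation.get("column_name")
    if isinstance(name, str) and name.strip() and not SNAKE_CASE.match(name):
        problems.append("column_name must be in snake case")

    # Flag transform types that are not among the ones named in the text.
    ttype = operation.get("transform_type")
    if isinstance(ttype, str) and ttype.strip() and ttype not in KNOWN_TRANSFORM_TYPES:
        problems.append(f"unrecognized transform_type: {ttype}")

    return problems
```

The check deliberately leaves column_type unvalidated beyond being a non-empty string, since any PostgreSQL data type is allowed and the full set is too large to enumerate in a sketch.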
Here is an example of transform_params in JSON:
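As a hedged illustration, the sketch below assembles one possible transform_params object in Python and prints it as JSON. The key names and option values follow the parameter descriptions in this section; the column names and task descriptions are made up for illustration and do not come from a real dataset.

```python
import json

# Illustrative transform_params; column names and descriptions are assumptions.
transform_params = {
    "model": "trellis-premium",  # the recommended engine
    "mode": "document",          # use "table" when parsing tables
    "operations": [
        {
            "column_name": "invoice_date",
            "column_type": "date",
            "transform_type": "extraction",
            "task_description": "Extract the date the invoice was issued.",
        },
        {
            "column_name": "referenced_urls",
            "column_type": "text[]",
            "transform_type": "extraction",
            "task_description": "Extract every URL mentioned in the document.",
        },
        {
            "column_name": "document_category",
            "column_type": "text",
            "transform_type": "classification",
            "task_description": "Classify the document as an invoice, a receipt, or a contract.",
        },
    ],
}

# Serialize to the JSON form that would be sent with a request.
transform_params_json = json.dumps(transform_params, indent=2)
print(transform_params_json)
```

Each entry in operations becomes one output column, so adding a field to extract means appending one more object with the same four keys.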
When you're done defining the transformation, go to create transforms to kick off the transformation run.