How to define and run a Transformation
A transformation is defined by three parameters:
model - which Trellis LLM engine do you want the transformation to use?
mode - will you upload freeform document-type assets, or structured table-type assets?
operations - what data do you want to extract, and how?
The transform_params object is the single source of truth for all three of these parameters. Defining and redefining transform_params is equivalent to setting up and updating your transformation.
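As a rough illustration of how the three parameters fit together, here is a minimal sketch of a transform_params object written as a Python dictionary. The field names and values (trellis-vertix, document, parse, extraction, assets, numeric) are the ones this page describes; the column names Invoice and invoice_amount are hypothetical, and how the object is submitted to the API is not shown here.

```python
import json

# Column names ("Invoice", "invoice_amount") are hypothetical, chosen for illustration.
transform_params = {
    "model": "trellis-vertix",   # the engine this page recommends
    "mode": "document",          # freeform document-type assets
    "operations": [
        {
            # Parses the uploaded files so other columns can reference them.
            "column_name": "Invoice",
            "column_type": "assets",
            "transform_type": "parse",
            "task_description": "N/A",
        },
        {
            # Extracts one value, referencing the parsed column by name.
            "column_name": "invoice_amount",
            "column_type": "numeric",
            "transform_type": "extraction",
            "task_description": "Extract the invoice amount from {{Invoice}}",
        },
    ],
}

# Serialize to JSON, the form in which such an object would typically be sent.
print(json.dumps(transform_params, indent=2))
```

Updating the transformation would then amount to editing this one object and resubmitting it.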
transform_params object
The transform_params object contains configuration settings for performing data transformations using our AI models. Below are the detailed parameters included in transform_params:
model
string
Either trellis-vertix or trellis-scale, each indicating a different level of speed and accuracy. We recommend trellis-vertix.
mode
string
Use document for freeform document-type assets. You can also use table if you’re parsing tables.
operations
list of operation
The transform_params object requires a minimum of one operation with the following parameters:
column_type = 'assets'
transform_type = 'parse'
Each operation object contains the following information:
column_name
string
column_type
string
Use assets for file data, text for string data, text[] for arrays of text, numeric for numerical data, and date for date values.
transform_type
string
"parse" refers to the transformation of assets-type columns into parsed text data, ready for other columns to reference. "extraction" means the operation retrieves specific pieces of data from the column. Other types include classification and generation.
task_description
string
Describes the task to perform and may reference other columns (for example, Extract the invoice amount from {{Invoice}}). Use cases for references include culling data from parsed assets, classifying extracted data and more.
Except for operations with transform_type in ['parse', 'manual'], every operation's task description must reference at least one other operation. References use the format {{column_name}}.
Task descriptions for operations with transform_type in ['parse', 'manual'] should be populated with "N/A".
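The rules above can be checked locally before submitting anything. The following is a minimal sketch, not part of any Trellis SDK: check_operations is a hypothetical helper, and it assumes that references are written exactly as {{column_name}} with no surrounding whitespace and that a referenced name should match the column_name of another operation in the same list.

```python
import re

# Matches references written as {{column_name}} (assumed: no inner whitespace).
REFERENCE = re.compile(r"\{\{([^{}]+)\}\}")
NO_TASK_TYPES = {"parse", "manual"}


def check_operations(operations):
    """Hypothetical helper: return a list of problems, empty if none are found."""
    problems = []
    names = {op["column_name"] for op in operations}

    # At least one operation must parse an assets-type column.
    if not any(
        op["column_type"] == "assets" and op["transform_type"] == "parse"
        for op in operations
    ):
        problems.append(
            "need at least one operation with column_type='assets' "
            "and transform_type='parse'"
        )

    for op in operations:
        name = op["column_name"]
        task = op.get("task_description", "")
        if op["transform_type"] in NO_TASK_TYPES:
            # parse and manual operations carry no task description.
            if task != "N/A":
                problems.append(f"{name}: task_description should be 'N/A'")
            continue
        # Every other operation must reference at least one other operation.
        refs = set(REFERENCE.findall(task)) - {name}
        if not refs:
            problems.append(f"{name}: task_description must reference another operation")
        for ref in sorted(refs - names):
            problems.append(f"{name}: unknown reference {{{{{ref}}}}}")
    return problems


operations = [
    {"column_name": "Invoice", "column_type": "assets",
     "transform_type": "parse", "task_description": "N/A"},
    {"column_name": "invoice_amount", "column_type": "numeric",
     "transform_type": "extraction",
     "task_description": "Extract the invoice amount from {{Invoice}}"},
]
print(check_operations(operations))
```

Returning a list of messages rather than raising on the first failure makes it possible to report every problem in a transform_params object at once.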