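NebulaAIFlow processing components

Processing components process and transform data within a flow.

Use a processing component in a flow

In this flow, the Split text processing component splits incoming data into chunks so that they can be embedded into the vector store component.

The component provides control over chunk size, overlap, and separator. These parameters affect the context and granularity of the results retrieved from the vector store.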

Combine text

This component concatenates two text sources into a single text chunk using a specified delimiter.

Inputs

| Name | Display Name | Info |
|---|---|---|
| first_text | First Text | The first text input to concatenate. |
| second_text | Second Text | The second text input to concatenate. |
| delimiter | Delimiter | A string used to separate the two text inputs. Defaults to a space. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| message | Message | A Message object containing the combined text. |
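
The concatenation described in this section can be sketched in plain Python. This is a minimal illustration, not the component's source: `combine_text` is a hypothetical helper, and it returns a plain string rather than a Message object.

```python
def combine_text(first_text: str, second_text: str, delimiter: str = " ") -> str:
    """Join two strings with a delimiter (hypothetical helper for illustration)."""
    return delimiter.join([first_text, second_text])


# The default delimiter is a single space, as in the inputs table.
print(combine_text("Hello", "world"))
# A custom delimiter places each input on its own line.
print(combine_text("Hello", "world", delimiter="\n"))
```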

Data combiner

This component combines multiple data sources into a single unified Data object.

The component iterates through the input list of data objects, merging them into a single data object. If the input list is empty, it returns an empty data object. If there's only one input data object, it returns that object unchanged. The merging process uses the addition operator to combine data objects.
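
The three cases above (empty list, single item, several items) can be sketched as follows. The `Data` class here is a hypothetical stand-in, and its `+` behavior on overlapping keys (the right-hand value wins) is an assumption, since the source does not say how collisions are resolved.

```python
from dataclasses import dataclass, field
from functools import reduce
from operator import add
from typing import Any


@dataclass
class Data:
    """Hypothetical stand-in for a Data object: a thin wrapper around a dict."""
    data: dict[str, Any] = field(default_factory=dict)

    def __add__(self, other: "Data") -> "Data":
        # Assumption: on key collisions, the right-hand value wins.
        return Data({**self.data, **other.data})


def merge_data(items: list[Data]) -> Data:
    if not items:
        return Data()          # empty input -> empty Data object
    if len(items) == 1:
        return items[0]        # single input -> returned unchanged
    return reduce(add, items)  # otherwise fold the list with the + operator


merged = merge_data([Data({"title": "Report"}), Data({"author": "Ada"})])
print(merged.data)
```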

Inputs

| Name | Display Name | Info |
|---|---|---|
| data | Data | A list of data objects to be merged. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| merged_data | Merged Data | A single Data object containing the combined information from all input data objects. |

DataFrame operations

This component performs the following operations on a pandas DataFrame:

| Operation | Description | Required Inputs |
|---|---|---|
| Add Column | Adds a new column with a constant value | new_column_name, new_column_value |
| Drop Column | Removes a specified column | column_name |
| Filter | Filters rows based on column value | column_name, filter_value |
| Head | Returns first n rows | num_rows |
| Rename Column | Renames an existing column | column_name, new_column_name |
| Replace Value | Replaces values in a column | column_name, replace_value, replacement_value |
| Select Columns | Selects specific columns | columns_to_select |
| Sort | Sorts DataFrame by column | column_name, ascending |
| Tail | Returns last n rows | num_rows |

Inputs

| Name | Display Name | Info |
|---|---|---|
| df | DataFrame | The input DataFrame to operate on. |
| operation | Operation | Select the DataFrame operation to perform. Options: Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, Tail |
| column_name | Column Name | The column name to use for the operation. |
| filter_value | Filter Value | The value to filter rows by. |
| ascending | Sort Ascending | Whether to sort in ascending order. |
| new_column_name | New Column Name | The new column name when renaming or adding a column. |
| new_column_value | New Column Value | The value to populate the new column with. |
| columns_to_select | Columns to Select | List of column names to select. |
| num_rows | Number of Rows | Number of rows to return (for head/tail). Default: 5 |
| replace_value | Value to Replace | The value to replace in the column. |
| replacement_value | Replacement Value | The value to replace with. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| output | DataFrame | The resulting DataFrame after the operation. |
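
For readers who know pandas, each operation in this section corresponds roughly to a familiar call. The mapping below is a sketch of plausible equivalents, assuming equality-based filtering; it is not taken from the component's source, and the sample DataFrame is made up for illustration.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"name": ["Ada", "Bo", "Cy"], "score": [90, 75, 82]})

added = df.assign(passed=True)                            # Add Column (constant value)
dropped = df.drop(columns=["score"])                      # Drop Column
filtered = df[df["score"] == 90]                          # Filter (assumes an equality match)
head = df.head(2)                                         # Head
renamed = df.rename(columns={"score": "points"})          # Rename Column
replaced = df.assign(score=df["score"].replace(75, 80))   # Replace Value
selected = df[["name"]]                                   # Select Columns
sorted_df = df.sort_values(by="score", ascending=False)   # Sort
tail = df.tail(1)                                         # Tail
```

Each call returns a new DataFrame and leaves `df` unchanged, which matches the component's single-DataFrame output.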

Filter data

This component filters a Data object based on a list of keys.

Inputs

| Name | Display Name | Info |
|---|---|---|
| data | Data | Data object to filter. |
| filter_criteria | Filter Criteria | List of keys to filter by. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| filtered_data | Filtered Data | A new Data object containing only the key-value pairs that match the filter criteria. |
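
Key-based filtering of this kind amounts to a dictionary comprehension. The sketch below treats the Data object as a plain dict, which is a simplifying assumption; `filter_by_keys` is a hypothetical name.

```python
def filter_by_keys(data: dict, filter_criteria: list[str]) -> dict:
    """Return a new dict containing only the keys listed in filter_criteria."""
    return {key: value for key, value in data.items() if key in filter_criteria}


record = {"id": 7, "name": "Ada", "email": "ada@example.com"}
print(filter_by_keys(record, ["name", "email"]))
```

Keys listed in `filter_criteria` but absent from the input are simply ignored in this sketch.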

Filter values

The Filter values component filters a list of data items based on a specified key, filter value, and comparison operator.

Inputs

| Name | Display Name | Info |
|---|---|---|
| input_data | Input data | The list of data items to filter. |
| filter_key | Filter Key | The key to filter on, for example, 'route'. |
| filter_value | Filter Value | The value to filter by, for example, 'CMIP'. |
| operator | Comparison Operator | The operator to apply for comparing the values. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| filtered_data | Filtered data | The resulting list of filtered data items. |
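
The key, value, and operator inputs might combine as in the sketch below. The operator names in `OPERATORS` are assumptions chosen for illustration, because this page does not list the supported operators, and data items are modeled as plain dicts.

```python
from typing import Callable

# Hypothetical operator names; the component's actual options may differ.
OPERATORS: dict[str, Callable[[str, str], bool]] = {
    "equals": lambda item, target: item == target,
    "not equals": lambda item, target: item != target,
    "contains": lambda item, target: target in item,
    "starts with": lambda item, target: item.startswith(target),
    "ends with": lambda item, target: item.endswith(target),
}


def filter_values(input_data: list[dict], filter_key: str,
                  filter_value: str, operator: str = "equals") -> list[dict]:
    """Keep items whose value at filter_key satisfies the chosen comparison."""
    compare = OPERATORS[operator]
    return [
        item for item in input_data
        if filter_key in item and compare(str(item[filter_key]), filter_value)
    ]


items = [{"route": "CMIP"}, {"route": "CMIP6"}, {"route": "other"}, {"name": "x"}]
print(filter_values(items, "route", "CMIP", "starts with"))
```

Items that lack the filter key are dropped rather than raising an error, which is one reasonable design choice among several.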

Lambda filter

This component uses an LLM to generate a Lambda function for filtering or transforming structured data.

To use the Lambda filter component, you must connect it to a Language Model component, which the component uses to generate a function based on the natural language instructions in the Instructions field.

This example gets JSON data from the https://jsonplaceholder.typicode.com/users API endpoint. The Instructions field in the Lambda filter component specifies the task "extract emails". The connected LLM generates a filter function from these instructions, and the component returns a list of email addresses extracted from the JSON data.

Inputs

| Name | Display Name | Info |
|---|---|---|
| data | Data | The structured data to filter or transform using a Lambda function. |
| llm | Language Model | The connection port for a Model component. |
| filter_instruction | Instructions | Natural language instructions for how to filter or transform the data using a Lambda function, such as Filter the data to only include items where the 'status' is 'active'. |
| sample_size | Sample Size | For large datasets, the number of characters to sample from the dataset head and tail. |
| max_size | Max Size | The number of characters for the data to be considered "large", which triggers sampling by the sample_size value. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| filtered_data | Filtered Data | The filtered or transformed Data object. |
| dataframe | DataFrame | The filtered data as a DataFrame. |
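
The interplay of max_size and sample_size, and the kind of function an LLM might return for "extract emails", can be sketched without calling a model. Everything here is illustrative: `build_sample` is a hypothetical helper, the `...` marker between head and tail is an assumption, the records are made up, and the lambda is written by hand rather than generated.

```python
import json


def build_sample(data, max_size: int, sample_size: int) -> str:
    """Serialize data; if it is "large", keep only its head and tail characters."""
    text = json.dumps(data)
    if len(text) <= max_size:
        return text  # small enough to show to the LLM in full
    # Assumption: head and tail are joined with a visible truncation marker.
    return f"{text[:sample_size]}\n...\n{text[-sample_size:]}"


# Made-up records shaped like a users endpoint response.
users = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bo", "email": "bo@example.com"},
]

# A hand-written example of the kind of function the LLM might generate.
extract_emails = lambda records: [record["email"] for record in records]

print(build_sample(users, max_size=10_000, sample_size=50))
print(extract_emails(users))
```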

LLM router

This component routes requests to the most appropriate LLM based on OpenRouter model specifications.

Inputs

| Name | Display Name | Info |
|---|---|---|
| models | Language Models | List of LLMs to route between |
| input_value | Input | The input message to be routed |
| judge_llm | Judge LLM | LLM that will evaluate and select the most appropriate model |
| optimization | Optimization | Optimization preference (quality/speed/cost/balanced) |

Outputs

| Name | Display Name | Info |
|---|---|---|
| output | Output | The response from the selected model |
| selected_model | Selected Model | Name of the chosen model |

Message to data

This component converts Message objects to Data objects.

Inputs

| Name | Display Name | Info |
|---|---|---|
| message | Message | The Message object to convert to a Data object. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| data | Data | The converted Data object. |

Parser

This component formats DataFrame or Data objects into text using templates, with an option to convert inputs directly to strings using stringify.

To use this component, create variables for values in the template the same way you would in a Prompt component. For DataFrames, use column names, for example Name: {Name}. For Data objects, use {text}.

Inputs

| Name | Display Name | Info |
|---|---|---|
| stringify | Stringify | Enable to convert input to a string instead of using a template. |
| template | Template | Template for formatting using variables in curly brackets. For DataFrames, use column names (e.g. Name: {Name}). For Data objects, use {text}. |
| input_data | Data or DataFrame | The input to parse - accepts either a DataFrame or Data object. |
| sep | Separator | String used to separate rows/items. Default: newline. |
| clean_data | Clean Data | When stringify is enabled, cleans data by removing empty rows and lines. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| parsed_text | Parsed Text | The resulting formatted text as a Message object. |
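
Template formatting with curly-bracket variables behaves much like Python's `str.format`. The sketch below models DataFrame rows as a list of dicts keyed by column name and returns a plain string; `parse_rows` is a hypothetical helper, and the component's own implementation may differ.

```python
def parse_rows(rows: list[dict], template: str, sep: str = "\n") -> str:
    """Fill the template once per row, then join the results with sep."""
    return sep.join(template.format(**row) for row in rows)


# Made-up rows standing in for a DataFrame with Name and Role columns.
rows = [
    {"Name": "Ada", "Role": "Engineer"},
    {"Name": "Bo", "Role": "Analyst"},
]

print(parse_rows(rows, "Name: {Name}"))
print(parse_rows(rows, "{Name} ({Role})", sep="; "))
```

A template variable that does not match any column raises a `KeyError` in this sketch, so template names must match column names exactly, including case.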

Split text

This component splits text into chunks based on specified criteria.

Inputs

| Name | Display Name | Info |
|---|---|---|
| data_inputs | Input Documents | The data to split. The component accepts Data or DataFrame objects. |
| chunk_overlap | Chunk Overlap | The number of characters to overlap between chunks. Default: 200. |
| chunk_size | Chunk Size | The maximum number of characters in each chunk. Default: 1000. |
| separator | Separator | The character to split on. Default: newline. |
| text_key | Text Key | The key to use for the text column (advanced). Default: text. |

Outputs

| Name | Display Name | Info |
|---|---|---|
| chunks | Chunks | List of split text chunks as Data objects. |
| dataframe | DataFrame | List of split text chunks as DataFrame objects. |
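
To see how chunk_size, chunk_overlap, and separator interact, consider the sketch below. It implements one common separator-based strategy (split on the separator, pack pieces greedily, carry trailing pieces forward as overlap) and is not necessarily what the component does internally. `split_text` is a hypothetical function that works on plain strings.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200,
               separator: str = "\n") -> list[str]:
    """Greedy separator-based chunking with piece-level overlap (illustrative)."""
    pieces = [piece for piece in text.split(separator) if piece]
    chunks: list[str] = []
    window: list[str] = []  # pieces that make up the chunk being built

    def joined_len(parts: list[str]) -> int:
        return len(separator.join(parts))

    for piece in pieces:
        if window and joined_len(window + [piece]) > chunk_size:
            chunks.append(separator.join(window))
            # Drop leading pieces until the carried-over tail fits the overlap
            # budget and still leaves room for the incoming piece.
            while window and (joined_len(window) > chunk_overlap
                              or joined_len(window + [piece]) > chunk_size):
                window.pop(0)
        window.append(piece)
    if window:
        chunks.append(separator.join(window))
    return chunks


sample = "aaaa\nbbbb\ncccc\ndddd"
print(split_text(sample, chunk_size=9, chunk_overlap=4))
print(split_text(sample, chunk_size=9, chunk_overlap=0))
```

In this sketch a single piece longer than chunk_size is emitted whole rather than cut mid-piece. Larger overlaps repeat more context across neighboring chunks, which can help retrieval at the cost of storing more text.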

Update data

This component dynamically updates or appends data with specified fields.

Inputs

| Name | Display Name | Info |
|---|---|---|
| old_data | Data | The records to update |
| number_of_fields | Number of Fields | Number of fields to add (max 15) |
| text_key | Text Key | Key for text content |
| text_key_validator | Text Key Validator | Validates text key presence |

Outputs

| Name | Display Name | Info |
|---|---|---|
| data | Data | Updated Data objects. |