File Manipulation - Filter CSV

Overview

This Alli Workflows script removes every row of a .csv file that has a null (missing) value in any of the specified columns. The result is a cleaner dataset for downstream automation steps such as enrichment, transformation, or upload.


Variables


| Name | Reference | Type | Required | Default | Options | Description |
|---|---|---|---|---|---|---|
| Source Folder Name | FOLDER_NAME | Alphanumeric | | | | Path to the folder containing your CSV file. Defaults to the current directory (.). |
| Source Filename | FILE_NAME | Alphanumeric | | | | Name of the CSV file to clean. Must include the .csv extension. Note: Output will overwrite this file with the filtered version. |
| Column Name(s) | COLUMN_NAME | Alphanumeric | | | | Comma-separated list of column names to check for nulls (e.g. name,price). |

Example Values in Alli Workflows:

| Variable | Example Value |
|---|---|
| FOLDER_NAME | upstream/cleaning/input |
| FILE_NAME | products.csv |
| COLUMN_NAME | title,price,availability |


What the Script Does

  1. Reads Input File
    Uses FOLDER_NAME and FILE_NAME to locate the .csv file. Defaults to the working directory if FOLDER_NAME is not set.

  2. Validates Configuration

    • Ensures FILE_NAME is provided and ends in .csv.

    • Verifies that COLUMN_NAME is populated.

  3. Parses Columns & Drops Null Rows

    • Converts COLUMN_NAME into a list.

    • Filters out any rows that contain null values in any of the specified columns.

  4. Overwrites the Original File
    The filtered data is saved back to the same path for use in the next step.

  5. Logs Result Summary
    The script prints:

    Filtered X row(s) with nulls in [columns]. Saved cleaned CSV back to /path/to/file.csv
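
The five steps above can be sketched in Python as follows. This is a minimal illustration, not the script's actual source: the function name filter_csv is hypothetical, it uses only the standard csv module, and it assumes that "null" means an empty or whitespace-only cell (the real script may rely on a library with a broader notion of missing values).

```python
import csv
import os


def filter_csv(file_name, column_name, folder_name="."):
    """Drop rows with a null in any listed column, then overwrite the file.

    Hypothetical helper: the parameters mirror FILE_NAME, COLUMN_NAME and
    FOLDER_NAME. Returns the number of rows removed.
    """
    # Validate configuration.
    if not file_name or not file_name.lower().endswith(".csv"):
        raise ValueError("FILE_NAME must be provided and end in .csv")
    if not column_name or not column_name.strip():
        raise ValueError("COLUMN_NAME must be populated")

    # Convert the comma-separated string into a list of column names.
    columns = [c.strip() for c in column_name.split(",") if c.strip()]

    # Locate and read the input file (current directory if no folder is set).
    path = os.path.join(folder_name or ".", file_name)
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = list(reader)

    # Case-sensitive lookup; raises ValueError if a column is not in the header.
    indexes = [header.index(c) for c in columns]

    def is_null(row, i):
        # Assumption: a missing, empty, or whitespace-only cell counts as null.
        return i >= len(row) or not row[i].strip()

    kept = [r for r in rows if not any(is_null(r, i) for i in indexes)]

    # Overwrite the original file with the filtered rows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(kept)

    # Log a result summary in the same shape as the message above.
    dropped = len(rows) - len(kept)
    print(f"Filtered {dropped} row(s) with nulls in {columns}. "
          f"Saved cleaned CSV back to {path}")
    return dropped
```

With the example values from the table above, the call would be filter_csv("products.csv", "title,price,availability", "upstream/cleaning/input"). Because the file is overwritten in place, keep a copy of the input if the unfiltered rows are needed later.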


Requirements

  • A valid .csv file must be present, either uploaded in a prior step or passed from an upstream task.

  • Column names in COLUMN_NAME must match the CSV headers exactly (the match is case-sensitive).
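
Because the match is case-sensitive, a quick pre-check of COLUMN_NAME against the file's header can catch mismatches such as price versus Price before the task runs. The helper below is a hypothetical sketch and not part of the script; the name check_columns and its return shape are assumptions made for illustration.

```python
def check_columns(header, column_name):
    """Return requested columns that are absent from the header.

    Maps each missing name to a header that differs only by case,
    or to None when there is no such near match.
    """
    requested = [c.strip() for c in column_name.split(",") if c.strip()]
    by_lower = {h.lower(): h for h in header}
    problems = {}
    for name in requested:
        if name not in header:  # exact, case-sensitive comparison
            problems[name] = by_lower.get(name.lower())
    return problems


# "price" is missing but "Price" exists; "sku" has no match at all.
print(check_columns(["title", "Price", "availability"], "title,price,sku"))
```

An empty result means every requested column is present with the exact spelling used in the header.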


Typical Use Cases

  • Removing incomplete rows from product feeds before AI enrichment or scoring.

  • Ensuring clean survey or form response data before analytics.

  • Validating required fields prior to exporting to external systems.


Alli Workflows Tips

  • Use an Upload File step before this script to provide the .csv file if it’s not generated in a previous task.

  • Chain this task before model inference, data export, or QA automation steps.

  • Review logs for details on filtered rows and affected columns.