Overview
Compare two CSV files and generate output files showing which rows differ and which rows overlap. It is expected that the two files provided will:
- Provide column names in the first row of the CSV file
- Have the same number of columns
- Have the same column names
After comparing data in the two files, the following files may be generated:
- {File Name 1}_only.csv (contains only rows found in File Name 1)
- {File Name 2}_only.csv (contains only rows found in File Name 2)
- {File Name 1}_overlap.csv (contains rows found in both File Name 1 AND File Name 2)
If a file has no unique rows, its _only file will not be created. If there is no overlapping data, the _overlap file will not be created.
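The output rules above can be illustrated with a minimal sketch. It uses only Python's standard `csv` module and is not the Template's own code; the function name `compare_csv` and its parameters are hypothetical. It assumes the columns appear in the same order in both files and that rows match only when every value is identical as text.

```python
import csv
import io


def compare_csv(text_1, text_2, name_1="file_1", name_2="file_2"):
    """Split the rows of two CSV texts into unique and overlapping sets.

    Returns a dict mapping output file name -> CSV text. Outputs that
    would contain no rows are left out, mirroring the rule above.
    """
    reader_1 = csv.reader(io.StringIO(text_1))
    reader_2 = csv.reader(io.StringIO(text_2))
    header_1, header_2 = next(reader_1), next(reader_2)
    if header_1 != header_2:
        # Assumption: same column names in the same order.
        raise ValueError("Both files must have the same column names")

    rows_1 = [tuple(row) for row in reader_1]
    rows_2 = [tuple(row) for row in reader_2]
    set_1, set_2 = set(rows_1), set(rows_2)

    candidates = {
        f"{name_1}_only.csv": [r for r in rows_1 if r not in set_2],
        f"{name_2}_only.csv": [r for r in rows_2 if r not in set_1],
        f"{name_1}_overlap.csv": [r for r in rows_1 if r in set_2],
    }

    outputs = {}
    for file_name, rows in candidates.items():
        if not rows:
            continue  # nothing to report, so no file is produced
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header_1)
        writer.writerows(rows)
        outputs[file_name] = buffer.getvalue()
    return outputs


# Example: one shared row and one unique row on each side.
result = compare_csv(
    "id,name\n1,Ann\n2,Bob\n",
    "id,name\n2,Bob\n3,Cy\n",
    name_1="a",
    name_2="b",
)
```

Here `result` holds three entries (`a_only.csv`, `b_only.csv`, `a_overlap.csv`). Passing two identical inputs would yield only the `_overlap` entry.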
This Template is relatively memory intensive because it loads both file contents into memory using Pandas. For larger file sizes, we recommend running a comparison directly in a database.
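As a rough illustration of the database approach, the sketch below runs the same three comparisons in SQL using set operators. It uses an in-memory SQLite database purely for demonstration; the table names `file_1` and `file_2`, the columns, and the sample rows are all hypothetical, and a real workload would load the CSV data into whichever database is already in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables standing in for the two CSV files.
conn.executescript("""
    CREATE TABLE file_1 (id TEXT, name TEXT);
    CREATE TABLE file_2 (id TEXT, name TEXT);
    INSERT INTO file_1 VALUES ('1', 'Ann'), ('2', 'Bob');
    INSERT INTO file_2 VALUES ('2', 'Bob'), ('3', 'Cy');
""")

# EXCEPT keeps rows present only on the left side;
# INTERSECT keeps rows present on both sides.
queries = {
    "file_1_only": "SELECT * FROM file_1 EXCEPT SELECT * FROM file_2",
    "file_2_only": "SELECT * FROM file_2 EXCEPT SELECT * FROM file_1",
    "overlap": "SELECT * FROM file_1 INTERSECT SELECT * FROM file_2",
}
results = {label: conn.execute(sql).fetchall() for label, sql in queries.items()}
conn.close()
```

Note that `EXCEPT` and `INTERSECT` return distinct rows, so duplicate rows within a file are collapsed in the results.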
Variables
| Name | Reference | Type | Required | Default | Options | Description |
|---|---|---|---|---|---|---|
| File Name 1 | MANIPULATION_SOURCE_FILE_NAME_1 | Alphanumeric | ✅ | - | - | Name of the 1st target file on Platform. |
| Folder Name 1 | MANIPULATION_SOURCE_FOLDER_NAME_1 | Alphanumeric | ➖ | - | - | Name of the local folder on Platform where the 1st target file lives. If left blank, the home directory is used. |
| File Name 2 | MANIPULATION_SOURCE_FILE_NAME_2 | Alphanumeric | ✅ | - | - | Name of the 2nd target file on Platform. |
| Folder Name 2 | MANIPULATION_SOURCE_FOLDER_NAME_2 | Alphanumeric | ➖ | - | - | Name of the local folder on Platform where the 2nd target file lives. If left blank, the home directory is used. |
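
The sketch below shows one way the file and folder variables might combine into a path. It rests on assumptions not stated above: that the Reference values are available as a key/value mapping (for example, environment variables) and that a folder name is relative to the home directory. The function `resolve_source_path` and the `home_dir` parameter are hypothetical.

```python
import os


def resolve_source_path(env, index, home_dir):
    """Build the path for source file `index` (1 or 2) from the variables."""
    # File name is required, so a missing key raises KeyError.
    file_name = env[f"MANIPULATION_SOURCE_FILE_NAME_{index}"]
    # Folder name is optional; blank or missing means the home directory.
    folder_name = env.get(f"MANIPULATION_SOURCE_FOLDER_NAME_{index}", "")
    if not folder_name:
        return os.path.join(home_dir, file_name)
    # Assumption: the folder is relative to the home directory.
    return os.path.join(home_dir, folder_name, file_name)


# Example with hypothetical values; in practice `env` could be os.environ.
example_env = {
    "MANIPULATION_SOURCE_FILE_NAME_1": "customers.csv",
    "MANIPULATION_SOURCE_FOLDER_NAME_1": "exports",
    "MANIPULATION_SOURCE_FILE_NAME_2": "customers_old.csv",
}
path_1 = resolve_source_path(example_env, 1, home_dir=".")
path_2 = resolve_source_path(example_env, 2, home_dir=".")
```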