Alli Cloud Storage
Vendor/Partner | Alli |
---|---|
Version | N/A |
API Documentation | N/A |
Sunset Date | N/A |
Channel(s) | All |
Refresh Time (CST) | 4 am, 7 am, 10 am, 1 pm, 4 pm, 7 pm, 10 pm |
Default backfill | All Unprocessed Files |
Alli Data Library | |
This datasource attempts to load all files that match the configured regex or file name.
Use caution when pulling in several files at once, especially if they are zipped or contain 1M+ rows, because the entire batch of files is subject to the 4-hour timeout that applies to any datasource.
Before Getting Started
To get started with the Alli Cloud Storage datasource, ensure you have the following:
A file available in Alli Cloud Storage that you would like to ingest into Alli Data. The available files can be located in Alli Central.
Additional Configurations
Client File Location
If your file is in Alli Marketplace or Alli Products, select that location. Otherwise, select Custom (Manually Specify Directory).
File Name
Enter the name of the file.
Directory
If using the Custom (Manually Specify Directory) client file location option, enter the directory path.
Frequently Asked Questions
What is the difference between Amazon S3, Amazon S3 Advanced, and Alli Cloud Storage datasource types?
The Amazon S3 datasource type is a legacy datasource type used to gather data from Alli managed S3 buckets prior to client credentials. The Amazon S3 Advanced datasource type should be used when you want to bring data into Alli but the files are hosted in a non-Alli managed S3 instance. The Alli Cloud Storage datasource type should be used when you want to ingest data that is stored within the Alli managed S3 instance, for example files delivered through SFTP or written by other Alli applications.
I have a new file dropped into my S3 bucket daily, how can I have Alli only ingest the file for yesterday?
Within the Alli Cloud Storage datasource setup, you can use bracketed text to dynamically insert values. For example, if I have a file with a date appended, /daily_file_2024-01-04.csv, and every day I want Alli to pull in the new file with yesterday's date appended, I would enter my file name as /daily_file_{yesterdate}.csv. The default date format is YYYY-MM-DD, but a different format can be declared within the brackets (ex. {yesterdate:YYMMDD}). More information about formatting can be found here: http://momentjs.com/docs/#/displaying/format/ .
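To make the placeholder behavior concrete, here is a minimal Python sketch of how a file name such as /daily_file_{yesterdate}.csv could resolve. It is an illustration only, not Alli's implementation: the function name resolve_file_name is hypothetical, only the YYYY, YY, MM, and DD tokens are handled, and it assumes {yesterdate} means the calendar day before the run date.

```python
import re
from datetime import date, timedelta

# Small subset of Moment.js display tokens mapped to strftime directives.
# Assumption: only these four tokens are supported in this sketch.
MOMENT_TO_STRFTIME = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d"}
TOKEN_RE = re.compile(r"YYYY|YY|MM|DD")  # longest alternative listed first
PLACEHOLDER_RE = re.compile(r"\{yesterdate(?::([^}]+))?\}")


def resolve_file_name(template: str, today: date) -> str:
    """Replace {yesterdate} / {yesterdate:FORMAT} with yesterday's date."""
    yesterday = today - timedelta(days=1)

    def render(match: re.Match) -> str:
        # Fall back to the documented default format when none is declared.
        moment_format = match.group(1) or "YYYY-MM-DD"
        strftime_format = TOKEN_RE.sub(
            lambda m: MOMENT_TO_STRFTIME[m.group(0)], moment_format
        )
        return yesterday.strftime(strftime_format)

    return PLACEHOLDER_RE.sub(render, template)


run_date = date(2024, 1, 5)
default_name = resolve_file_name("/daily_file_{yesterdate}.csv", run_date)
compact_name = resolve_file_name("/daily_file_{yesterdate:YYMMDD}.csv", run_date)
print(default_name)  # /daily_file_2024-01-04.csv
print(compact_name)  # /daily_file_240104.csv
```

Running a check like this against a known date is a quick way to confirm that a declared format produces the exact file name that lands in storage.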
I have multiple files in a directory that I want Alli to ingest, how would I set that up?
This can be accomplished using regular expression matching. For example, if I have a directory, /directory1/, that contains multiple files, /file1.csv and /file2.csv, I can set the file name in the Alli datasource setup form to the regular expression /file.*.csv. This results in both files being ingested into Alli.
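Before saving a pattern, it can help to test it locally against the file names you expect. The Python sketch below uses the standard re module and assumes the pattern must match the whole file name; how Alli anchors the pattern is not stated here, so treat that as an assumption. The file list and the matching helper are hypothetical.

```python
import re

# Hypothetical directory listing used only for this illustration.
files = ["/file1.csv", "/file2.csv", "/other.csv", "/file3_csv"]


def matching(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Pattern from the example above: the unescaped "." matches any character.
loose = matching(r"/file.*.csv", files)
# Escaping the dot restricts matches to names that end in a literal ".csv".
strict = matching(r"/file.*\.csv", files)

print(loose)   # ['/file1.csv', '/file2.csv', '/file3_csv']
print(strict)  # ['/file1.csv', '/file2.csv']
```

Both patterns pick up /file1.csv and /file2.csv. The escaped version is only worth the extra character if the directory might contain look-alike names that should not be ingested.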
How would I use the Alli Cloud Storage datasource with S3?
S3 locations automatically pull in the bucket for a client, so all that is needed when using the Custom setting under Client File Location is the remainder of the file’s key from S3. Regex is accepted in this field. For example, a file located at the S3 URL s3://alliprod-bucket-12345/source=vendor/vendor=facebook/ will require a Client File Location of source=vendor/vendor=facebook.
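If you are starting from a full S3 URL, the value to enter is simply the URL with the scheme and bucket removed. The following Python sketch shows that split using the standard library; the helper name key_prefix_from_s3_url is hypothetical, and trimming the leading and trailing slashes follows the example above rather than any documented rule.

```python
from urllib.parse import urlparse


def key_prefix_from_s3_url(s3_url: str) -> str:
    """Drop the s3:// scheme and bucket, keeping the rest of the key."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {s3_url!r}")
    # parsed.netloc holds the bucket name; parsed.path holds the key.
    return parsed.path.strip("/")


prefix = key_prefix_from_s3_url(
    "s3://alliprod-bucket-12345/source=vendor/vendor=facebook/"
)
print(prefix)  # source=vendor/vendor=facebook
```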
Where does my file go after the datasource runs?
Files are moved out of their original location once they have been processed, so backfilling requires a specific backfill file to be added to the search location.