Amazon S3 - Advanced
Vendor/Partner | Amazon |
---|---|
Version | |
API Documentation | |
Sunset Date | none as of October 2024 |
Channel(s) | All |
Refresh Time (CST) | 4 am, 7 am, 10 am, 1 pm, 4 pm, 7 pm, 10 pm |
Default backfill | N/A |
Alli Data Library | |
Before Getting Started
To get started with the Amazon S3 Advanced datasource, ensure you have the following:
AWS Access Key - This can be found in the Security Credentials section of the AWS Console and needs to have the AmazonS3FullAccess permission level.
AWS Secret - This is generated when you create your AWS Access Key.
AWS Region - The region the data is stored in (ex. us-east-1). This can be found in the AWS Console by navigating to Services, then S3.
AWS S3 Bucket - The name of the AWS S3 Bucket the data lives in.
A file you want to ingest into Alli
Additional Configurations
Access Key ID
Enter the AWS Access Key.
Access Secret
Enter the AWS Secret.
Region
Enter the AWS Region.
Bucket Name
Enter the AWS S3 Bucket name.
Directory
Enter the directory path. If your file is in the root of the bucket, leave this blank.
File Name
Enter the name of the file.
Frequently Asked Questions
What is the difference between Amazon S3, Amazon S3 Advanced, and Alli Cloud Storage datasource types?
The Amazon S3 datasource type is a legacy datasource type used to gather data from Alli-managed S3 buckets prior to client credentials. The Amazon S3 Advanced datasource type should be used when you want to bring data into Alli but the files are hosted in an S3 instance that is not managed by Alli. The Alli Cloud Storage datasource types should be used when you want to ingest data that is stored within the Alli-managed S3 instance, for example from SFTP or other Alli applications.
I have a new file dropped into my S3 bucket daily, how can I have Alli only ingest the file for yesterday?
Within the S3 Advanced datasource setup, you can use bracketed text to dynamically insert values. For example, if a file arrives each day with its date appended, such as /daily_file_2024-01-04.csv, and you want Alli to pull in the file dated yesterday, set the file name to /daily_file_{yesterdate}.csv. The default date format is YYYY-MM-DD, but a different format can be declared within the brackets (ex. {yesterdate:YYMMDD}). More information about formatting can be found here: http://momentjs.com/docs/#/displaying/format/ .
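To make the placeholder behavior concrete, here is a minimal Python sketch of how a bracketed date could expand. It is an illustration only, not Alli's implementation: the function name expand_yesterdate is hypothetical, and it handles just four of the moment.js format tokens (YYYY, YY, MM, DD).

```python
import re
from datetime import date, timedelta

# Small subset of moment.js tokens mapped to strftime codes (illustrative only).
_TOKENS = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d"}
_TOKEN_RE = re.compile("YYYY|YY|MM|DD")  # longer alternative listed first


def expand_yesterdate(file_name, today):
    """Replace {yesterdate} or {yesterdate:FORMAT} with yesterday's date.

    Hypothetical helper: mimics the documented behavior, not Alli's code.
    """
    yesterday = today - timedelta(days=1)

    def render(match):
        # Fall back to the documented default format when none is declared.
        fmt = match.group(1) or "YYYY-MM-DD"
        strf = _TOKEN_RE.sub(lambda t: _TOKENS[t.group(0)], fmt)
        return yesterday.strftime(strf)

    return re.sub(r"\{yesterdate(?::([^}]*))?\}", render, file_name)


# With "today" fixed at 2024-01-05, yesterday is 2024-01-04.
print(expand_yesterdate("/daily_file_{yesterdate}.csv", date(2024, 1, 5)))
# prints /daily_file_2024-01-04.csv
print(expand_yesterdate("/daily_file_{yesterdate:YYMMDD}.csv", date(2024, 1, 5)))
# prints /daily_file_240104.csv
```

Passing the reference date in as an argument, rather than reading the clock inside the function, keeps the sketch easy to check against a known file name.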
I have multiple files in a directory that I want Alli to ingest, how would I set that up?
This can be accomplished using regular expression matching. For example, if a directory, /directory1/, contains multiple files, /file1.csv and /file2.csv, you can set the file name in the Alli datasource setup form to the regular expression /file.*.csv. This results in both files being ingested into Alli.
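The following Python sketch can be used to preview which file names a pattern would select before entering it in the setup form. It assumes the pattern must match the whole file name; how Alli applies the pattern internally is not stated on this page. The names matching_files, LOOSE, and STRICT are hypothetical. Note that an unescaped dot matches any character, so the pattern above is slightly broader than it looks; escaping the final dot is a stricter variant.

```python
import re

# Pattern from the example above, plus a stricter variant with the dot escaped.
LOOSE = re.compile(r"/file.*.csv")
STRICT = re.compile(r"/file.*\.csv")

# Hypothetical file names in /directory1/.
CANDIDATES = ["/file1.csv", "/file2.csv", "/other.csv", "/file1_csv"]


def matching_files(pattern, names):
    """Return the names that the pattern matches in full (assumed behavior)."""
    return [name for name in names if pattern.fullmatch(name)]


print(matching_files(LOOSE, CANDIDATES))
# prints ['/file1.csv', '/file2.csv', '/file1_csv']
print(matching_files(STRICT, CANDIDATES))
# prints ['/file1.csv', '/file2.csv']
```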
Why does Alli need full access to my S3 bucket?
As Alli processes a file, it renames the file to indicate that it has been processed so it’s not picked up again. As a result, Alli requires the AWS Access Key to have the s3:GetObject, s3:PutObject, and s3:DeleteObject permissions. The AmazonS3FullAccess permission level enables everything needed.
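For reference, the three actions named above can be written as an IAM policy document. The Python sketch below builds one for a placeholder bucket. It is only an illustration of what those permissions look like: this page recommends AmazonS3FullAccess, and it does not confirm that a narrower policy is sufficient for Alli (listing files in a directory, for example, may need further permissions). The function name scoped_policy and the bucket name example-bucket are hypothetical.

```python
import json


def scoped_policy(bucket):
    """Build an IAM policy document granting the three object-level actions.

    Assumption: not confirmed as sufficient for Alli; shown for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                # Object-level actions apply to keys inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


print(json.dumps(scoped_policy("example-bucket"), indent=2))
```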