Overview
Amazon S3, provided by Amazon Web Services, is a web-based file storage service enabling you to store and retrieve any volume of data at your convenience, from any web location.
| Vendor/Partner | Amazon |
|---|---|
| Version | |
| API Documentation | |
| Sunset Date | none as of October 2025 |
| Channel(s) | All |
| Refresh Time (CST) | 4 am, 7 am, 10 am, 1 pm, 4 pm, 7 pm, 10 pm |
| Default backfill | N/A |
| Alli Data Library | |
Before Getting Started
This source has limited support
This data source will remain in Alli because other areas depend on it, but it is not actively maintained. It is best used by technical experts who are able to troubleshoot and mitigate issues that arise.
Alli Cloud Storage is the suggested source to migrate functionality to.
To get started with the Amazon S3 Advanced datasource, ensure you have the following:
- AWS Access Key - This can be found in the Security Credentials section of the AWS Console and needs to have the AmazonS3FullAccess permission level.
- AWS Secret - This is generated when you create your AWS Access Key.
- AWS Region - The region the data is stored in. This can be found in the AWS Console by navigating to Services, then S3.
  - ex. us-east-1
- AWS S3 Bucket - The name of the AWS S3 Bucket the data lives in.
- A file you want to ingest into Alli
Additional Configurations
Access Key ID
Enter the AWS Access Key.
Access Secret
Enter the AWS Secret.
Role Arn
Enter the ARN of the IAM role to assume.
Role Session Name
Enter the name to give the assumed-role session.
External Id
Enter the external ID, if one is required to assume the role.
Region
Enter the AWS Region.
Bucket Name
Enter the AWS S3 Bucket name.
Directory
Enter the directory path. If your file is in the root of the bucket, leave this blank.
File Name
Enter the name of the file.
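As a rough illustration of how the Role Arn, Role Session Name, and External Id fields relate to one another, the sketch below passes them to the AWS STS AssumeRole call through a boto3-style client. It reflects typical AWS usage and is an assumption, not a description of how Alli is implemented; the function name `assume_role_credentials` and the shape of the returned dictionary are hypothetical.

```python
def assume_role_credentials(sts_client, role_arn, session_name, external_id=None):
    """Assume an IAM role and return temporary credentials.

    `sts_client` is expected to behave like boto3.client("sts"), created
    with the Access Key ID, Access Secret, and Region from the form.
    """
    kwargs = {"RoleArn": role_arn, "RoleSessionName": session_name}
    # ExternalId is only sent when the role's trust policy asks for one.
    if external_id:
        kwargs["ExternalId"] = external_id

    response = sts_client.assume_role(**kwargs)
    creds = response["Credentials"]

    # Keys are named so they can be passed straight to boto3.client("s3", **result).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The temporary credentials returned here would then be used for the S3 calls against the bucket, directory, and file name entered in the remaining fields.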
Frequently Asked Questions
What is the difference between Amazon S3, Amazon S3 Advanced, and Alli Cloud Storage datasource types?
The Amazon S3 datasource type is a legacy datasource type that was used to gather data from Alli-managed S3 buckets before client credentials were available. The Amazon S3 Advanced datasource type should be used when you want to bring data into Alli but the files are hosted in an S3 instance that is not managed by Alli. The Alli Cloud Storage datasource type should be used when you want to ingest data that is stored within the Alli-managed S3 instance, for example files delivered through SFTP or written by other Alli applications.
I have a new file dropped into my S3 bucket daily, how can I have Alli only ingest the file for yesterday?
Within the S3 Advanced datasource setup, you can use bracketed text to dynamically insert values. For example, if a file is dropped each day with a date appended, such as /daily_file_2024-01-04.csv, and you want each run to pull in the file carrying yesterday's date, set the file name to /daily_file_{yesterdate}.csv. The default date format is YYYY-MM-DD, but a different format can be declared within the brackets (ex. {yesterdate:YYMMDD}). More information about formatting can be found here: http://momentjs.com/docs/#/displaying/format/ .
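To make the substitution concrete, here is a minimal sketch of how a {yesterdate} placeholder could be expanded. It is not Alli's implementation: the function name `expand_yesterdate` is hypothetical, and only the YYYY, YY, MM, and DD format tokens are handled, whereas the Moment.js format syntax linked above supports many more.

```python
import re
from datetime import date, timedelta

# Assumed subset of Moment.js-style tokens mapped to strftime codes.
_TOKEN_MAP = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d"}
_TOKEN_RE = re.compile("YYYY|YY|MM|DD")  # longest alternative listed first
_PLACEHOLDER_RE = re.compile(r"\{yesterdate(?::([^}]*))?\}")


def expand_yesterdate(template: str, today: date) -> str:
    """Replace {yesterdate} or {yesterdate:FORMAT} with the day before `today`."""
    yesterday = today - timedelta(days=1)

    def replace(match: re.Match) -> str:
        # Fall back to the documented default format when none is given.
        fmt = match.group(1) or "YYYY-MM-DD"
        strftime_fmt = _TOKEN_RE.sub(lambda m: _TOKEN_MAP[m.group(0)], fmt)
        return yesterday.strftime(strftime_fmt)

    return _PLACEHOLDER_RE.sub(replace, template)


print(expand_yesterdate("/daily_file_{yesterdate}.csv", date(2024, 1, 5)))
# /daily_file_2024-01-04.csv
print(expand_yesterdate("/daily_file_{yesterdate:YYMMDD}.csv", date(2024, 1, 5)))
# /daily_file_240104.csv
```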
I have multiple files in a directory that I want Alli to ingest, how would I set that up?
This can be accomplished using regular expression matching. For example, if a directory, /directory1/, contains multiple files, /file1.csv and /file2.csv, you can set the file name in the Alli datasource setup form to the regular expression /file.*.csv. This results in both files being ingested into Alli.
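Before saving the form, it can help to check a pattern against a list of sample file names. The sketch below does that with Python's `re` module. It assumes the pattern must match the whole file name; how Alli applies the expression internally is not documented here, and the helper name `matching_files` is hypothetical.

```python
import re


def matching_files(pattern: str, file_names: list[str]) -> list[str]:
    """Return the file names that the pattern matches in full."""
    regex = re.compile(pattern)
    return [name for name in file_names if regex.fullmatch(name)]


files = ["/file1.csv", "/file2.csv", "/other.csv", "/file1.json"]
print(matching_files("/file.*.csv", files))
# ['/file1.csv', '/file2.csv']

# An unescaped "." matches any character, so the pattern above would also
# accept a name such as "/file_xcsv". Escaping the dot makes it stricter.
print(matching_files(r"/file.*\.csv", ["/file1.csv", "/file_xcsv"]))
# ['/file1.csv']
```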
Why does Alli need full access to my S3 bucket?
As Alli processes a file, it renames the file to indicate that it has been processed so that it is not picked up again. As a result, Alli requires the AWS Access Key to have the s3:GetObject, s3:PutObject, and s3:DeleteObject permissions. The AmazonS3FullAccess permission level enables everything needed.
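S3 has no native rename operation, so a rename is normally a copy to a new key followed by a delete of the original, which is why read, write, and delete permissions are all involved. The sketch below shows that pattern with a boto3-style client. It is an illustration only: the function name `mark_processed` and the `.processed` suffix are hypothetical, and the naming scheme Alli actually uses is not stated here.

```python
def mark_processed(s3_client, bucket: str, key: str, suffix: str = ".processed") -> str:
    """Rename an object by copying it to a new key and deleting the original.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    new_key = key + suffix

    # Reading the source needs s3:GetObject; writing the copy needs s3:PutObject.
    s3_client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": key},
    )

    # Removing the original needs s3:DeleteObject.
    s3_client.delete_object(Bucket=bucket, Key=key)
    return new_key
```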
What is needed in my IAM role to allow Alli Data to assume it?
Your IAM role needs a trust policy that allows the datawarehouse roles to assume it. Those datawarehouse roles are arn:aws:iam::196897816688:role/ecs-task@datawarehouse{prod | staging}, where the suffix is either prod or staging. Below is an example trust policy you may use. Also make sure your role has the "s3:PutObject", "s3:GetObject", "s3:DeleteObject", and "s3:ListBucket" permissions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::196897816688:role/ecs-task@datawarehouseprod"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
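If the role is also protected by an external ID, the trust policy usually carries an `sts:ExternalId` condition. The sketch below generates a policy shaped like the example above and optionally adds that condition. The helper name `build_trust_policy` is hypothetical, and whether your setup needs the condition is an assumption to confirm for your own account.

```python
import json


def build_trust_policy(principal_arns, external_id=None):
    """Build a trust policy that lets the given role ARNs assume this role."""
    arns = list(principal_arns)
    statement = {
        "Effect": "Allow",
        # A single ARN is written as a string, several as a list.
        "Principal": {"AWS": arns[0] if len(arns) == 1 else arns},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # Only callers that present this external ID may assume the role.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


policy = build_trust_policy(
    ["arn:aws:iam::196897816688:role/ecs-task@datawarehouseprod"],
    external_id="example-external-id",  # placeholder value
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the role's trust relationship in the IAM console. The external ID entered there must match the External Id field in the datasource form.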