Inside the Alli Data, there are hundreds of tables of raw data that come from the Datasources you make. That data will likely need to be aggregated into something more consumable to us humans. To do this, views (Reports) in the Alli Data UI) are created. These views comprise of a SQL query that determines how to take all the raw data and turn it into something more manageable.
Sometimes, these views can grow large and overwhelming, and they take a few minutes or more to execute. Reporting on these views that take a while to query can be a bottleneck.
So, an option is to cache the output of those views. If a view runs in three minutes, but you create a table from the output, then that concrete table no longer performs the SQL or calculations that created it. This makes querying the table extremely fast. But, the table no longer gets updated with new data.
Understanding the trade-off between data recency and query speed plays into the use of the Redshift Cache App. The point of this App type is to perform the caching described above so that you can query your data faster and put less load on the Alli Data.
In addition to the common configuration, there is some extra configuration required for these Apps.
Above we see a Query Type option. The Query Type determines where the data we are caching comes from in the Alli Data.
Relation - A Relation is a table or view in the Alli Data. The name of that relation goes in the Relation Name input. This is the relation to cache. In the example above, this App would be cache the contents of the
playground__forklift_testview. The name of the table that is created with the cached data is
playground__forklift_test, the value in Relation Name.
Custom SQL - Custom SQL is a SQL
SELECTstatement that determines what data to cache. In the example above, the output of that query is what is cached. The name of the table that is created with the cached data is
forklift__test_sum_of_spend, the value in Destination Relation.
In both options above, there is a table created with the cached data, and that table has a specific name. The cached tables from the Redshift Cache Apps are created in a different schema, the
tableau schema, inside Redshift. This means to query those cached tables, you need to add
tableau. to the front of the table name.
SELECT * FROM tableau.forklift__tests_sum_of_spend.
After the cache table has been created, Alli Marketplace attempts to grant permissions to the correct group(s) inside Redshift.
The relation name is consulted to see if there is a client name at the start of the relation name that is followed by two underscores. If there is, then that client (group in Redshift) is granted select access to the new table.
As an example, the relation name
forklift_example__datasource_name would result in the
forklift_example client (group) getting select access to the new table. These client names will usually come from the client name inside Alli Data and are needed to access the datasource tables in Redshift.
Redshift Cache Apps are scheduled by Alli Marketplace to run at each of the Schedules you provide in configuration. However, Alli Marketplace allows only one Redshift Cache App to run at a time. This is done to prevent overloading the Alli Data's Redshift resources by performing too many caches at once. Note that these App Executions are independent of all other App types.