7. Anomaly Detection
In these instructions, we’ll focus on how to use SQL to identify when there is a large change in CPJ for Planet Fitness locations.
The sql includes calculations for “small”, “medium” and “large” date ranges (in this case 7, 14, and 30) to compare against, but the can easily be adjusted to whatever length of time is most effective for analysis (ie a 3 day, 7 day and 10 day comparison)
Single out locations with huge changes in CPJ using Alli Custom Alerts in Marketplace. Instructions for how to create a custom alert can be found in the set-up instructions for the app in Marketplace and in the doc below.
Pre-requisites:
To leverage the SQL in this alert, your client must have access to a datasource that contains:
a date column
your campaign names (from which locations can be extracted in Planet Finesses case)
spend
and whatever metric used to calculate Cost Per Join (CPJ).
For PF: Using the facebook_insights_ad datasource, those columns are ‘date_start’, ‘campaign_name’, ‘spend’ and ‘action_offsite_conversion_fb_pixel_complete_registration’.
Instructions: Alli Data
Create your SQL
Login to Alli Data and select the client you want to create an query for
Navigate to Reports → Explorer and select New Data Report
Give your report a Name and update the Report Type to Custom Redshift
Copy and paste the SQL Example in from below and update anything within and including
{
}
to match your client settingsOnce your SQL is updated, click Save & then Publish to update the query
Click Preview to ensure the output is as expected
SQL Example
Context:
How to Adjust Metric Calculated: (currently calculates CPJ, cost per join/registration)
You can transfer this alert to any metric simply by adjusting metric in line 29
- CODE
sum(action_offsite_conversion_fb_pixel_complete_registration) AS daily_metric
For example, you can change the calculation to look at
- CODE
sum(action_offsite_conversion_fb_pixel_complete_registration) AS daily_metric
How to Adjust Location column
You can adjust what piece of the campaign name is pulled to create ‘location’ in line 28
- CODE
split_part(campaign_name,'|',3) as location
For example; you can select a difference piece of the same campaign name
- CODE
split_part(campaign_name,'|',4) as location
You can also change the delimiter you are splitting by
- CODE
split_part(campaign_name,'^',3) as location
Or if one single rule can’t extract the location from every campaign name you can use a case statement or something else
- CODE
CASE WHEN campaign_name ilike '%Parks%' then split_part(campaign_name,'|',3) WHEN campaign_name ilike '%hiring%' then split_part(campaign_name,'^',4) END AS location
How to Adjust the spend minimum (currently set to $500)
This is used throughout the alert as a spend minimum required for any time period to use that time in comparison. For example, in line 3 - 5, it is used to determine whether to compare the daily CPJ with the small, medium, or large time frame. In lines 8 -11, it is used to ensure that both time periods used each have at least $500 of spend
You can transfer this to any level by adjusting line 13
- CODE
500 as spend_minimum
How to Adjust Time Periods Used for Calculation: (currently uses 7, 14, and 30)
You can transfer this to any level by adjusting line 14 - 25
- CODE
avg(daily_metric) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_avg, avg(daily_metric) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_avg, avg(daily_metric) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_avg, sum(cost) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_spend, sum(cost) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_spend, sum(cost) OVER(PARTITION BY location ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_spend
In each calculation,
ORDER BY date ROWS BETWEEN {X} PRECEDING AND 1 PRECEDING)
determines the date range. To adjust the 7 day average to a 3 day average, change the 7 in line 15 & 21 (above line 2 & 8 respectively) to 3.
How to Adjust Metric Calculated: (currently calculates CPJ, cost per join/registration)
Simplified SQL - Output contains date, location, day_vs_average_percent_difference & avg_vs_average_percent_difference (w/ location breakout)
SELECT date, location,
Case
when past_small_spend > spend_minimum then (ABS(past_small_avg - daily_metric) / (NULLIF((daily_metric+ past_small_avg)/2,0))) * 100
when past_med_spend > spend_minimum then (ABS(past_med_avg - daily_metric) / (NULLIF((daily_metric+ past_med_avg)/2,0))) * 100
when past_large_spend > spend_minimum then (ABS(past_large_avg - daily_metric) / (NULLIF((daily_metric+ past_large_avg)/2,0))) * 100
end AS day_vs_average_percent_difference,
Case
when past_small_spend > spend_minimum and past_med_spend > spend_minimum then (ABS(past_med_avg - past_small_avg) / (NULLIF((past_small_avg+ past_med_avg)/2,0))) * 100
when past_med_spend > spend_minimum and past_large_spend > spend_minimum then (ABS(past_large_avg - past_med_avg) / (NULLIF((past_med_avg+ past_large_avg)/2,0))) * 100
else 0
end AS avg_vs_average_percent_difference
FROM
(SELECT date, location, daily_metric, 500 as spend_minimum,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_avg,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_spend
FROM
(SELECT date_start as date,
split_part(campaign_name,'|',3) as location,
sum(action_offsite_conversion_fb_pixel_complete_registration) AS daily_metric,
sum(spend) as cost
FROM planet_fitness.facebook_insights_ad
GROUP BY date, location)
ORDER BY DATE DESC)
where date = current_date -1
and avg_vs_average_percent_difference > 20
order by location, date
Full SQL - Output contains all columns (w/ location breakout)
SELECT *,
Case
when past_small_spend > spend_minimum then (ABS(past_small_avg - daily_metric) / (NULLIF((daily_metric+ past_small_avg)/2,0))) * 100
when past_med_spend > spend_minimum then (ABS(past_med_avg - daily_metric) / (NULLIF((daily_metric+ past_med_avg)/2,0))) * 100
when past_large_spend > spend_minimum then (ABS(past_large_avg - daily_metric) / (NULLIF((daily_metric+ past_large_avg)/2,0))) * 100
end AS day_vs_average_percent_difference,
Case
when past_small_spend > spend_minimum and past_med_spend > spend_minimum then (ABS(past_med_avg - past_small_avg) / (NULLIF((past_small_avg+ past_med_avg)/2,0))) * 100
when past_med_spend > spend_minimum and past_large_spend > spend_minimum then (ABS(past_large_avg - past_med_avg) / (NULLIF((past_med_avg+ past_large_avg)/2,0))) * 100
else 0
end AS avg_vs_average_percent_difference
FROM
(SELECT date, location, daily_metric, 500 as spend_minimum,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_avg,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_spend
FROM
(SELECT date_start as date,
split_part(campaign_name,'|',3) as location,
sum(action_offsite_conversion_fb_pixel_complete_registration) AS daily_metric,
sum(spend) as cost
FROM planet_fitness.facebook_insights_ad
GROUP BY date, location)
ORDER BY DATE DESC)
where date = current_date -1
and avg_vs_average_percent_difference > 20
order by location, date
SQL w/out location breakout
SELECT *,
Case
when past_small_spend > spend_minimum then (ABS(past_small_avg - daily_metric) / (NULLIF((daily_metric+ past_small_avg)/2,0))) * 100
when past_med_spend > spend_minimum then (ABS(past_med_avg - daily_metric) / (NULLIF((daily_metric+ past_med_avg)/2,0))) * 100
when past_large_spend > spend_minimum then (ABS(past_large_avg - daily_metric) / (NULLIF((daily_metric+ past_large_avg)/2,0))) * 100
end AS day_vs_average_percent_difference,
Case
when past_small_spend > spend_minimum and past_med_spend > spend_minimum then (ABS(past_med_avg - past_small_avg) / (NULLIF((past_small_avg+ past_med_avg)/2,0))) * 100
when past_med_spend > spend_minimum and past_large_spend > spend_minimum then (ABS(past_large_avg - past_med_avg) / (NULLIF((past_med_avg+ past_large_avg)/2,0))) * 100
else 0
end AS avg_vs_average_percent_difference
FROM
(SELECT date, location, daily_metric, 500 as spend_minimum,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_avg,
avg(daily_metric) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_avg,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING) AS past_small_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 14 PRECEDING AND 1 PRECEDING) AS past_med_spend,
sum(cost) OVER(PARTITION BY location
ORDER BY date ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) AS past_large_spend
FROM
(SELECT date_start as date,
campaign_name,
sum(action_offsite_conversion_fb_pixel_complete_registration) AS daily_metric,
sum(spend) as cost
FROM planet_fitness.facebook_insights_ad
GROUP BY date, location)
ORDER BY DATE DESC)
where date = current_date -1
and avg_vs_average_percent_difference > 20
order by location, date
Instructions: Setting Up Your Custom Alert
Navigate to the Marketplace within Alli and select ‘Discover’
Either using the Search feature, or scrolling through the available options, select
Alli Custom Alert
The Set-Up instructions drop down will also contain instructions and tips for setup that mirror this page and can be used as a reference. Once you are confident, click
Install App
Name your pallet and and place your email (or the Custom Alerts owner) in the Notification Emails section on the left. If there is an error with your app, this will be who Alli Marketplace emails, so don’t forget this step!
Use the
+
button underSchedules
to set the schedule of when your Custom Alert will runFill out your template variables. Explanations of each are provided beneath the name within the app, and also below
View Name
The full name of the Alli Data view or Bigquery view (EX: test.datasource) OR
Your full SQL query, copy and pasted, surrounded by parentheses EX: (select * from test.datasource)
Alert_when_results
Do you want to be alerted (via slack) when your query (the view you created) returns results? - this will most likely be true, if so, leave as ‘Yes’
If you would like to be alerted when your result doesn’t return results, select ‘No’.
File_name
The name you'd like the file to be downloaded as when you receive the slack or email message
Client
What client is this for? (Should typically match the client you choose in Marketplace, but it’s up to you)
Slack_channel
The slack channel to send the message to, without the # (ie test, not #test)
Alli Alerts (@Alli Insights for PMG and @Alli for Moroch) must be invited, and if you are sending to a private channel, you must also invite @pmgadmin.
Slack_users
the pmg emails of the users you’d like tagged EX) jamie.reinhard@pmg; will tag Jamie EX) jamie.reinhard@pmg.com,garrett@pmg.com; will tag both Jamie & Garrett
a user group you’d like tagged EX) group:alli-alert-support; will tag the @alli-alert-support user group. Ensure you prepend group: before your user group name
a combination of both EX) jamie.reinhard@pmg.com,group:alli-alert-support
To tag multiple, comma separate without spaces the emails or user group.
Message
What message do you want to see when this issue happens? Be descriptive: this will automatically include the tagged users above at the front, and the downloadable link at the end.
Origin
is this view (or SQL) in BigQuery or Redshift (aka Alli Data)
Communication Type
How would you like the alert delivered? Alerts can be delivered: slack only, email only, or slack & email
**At this time, if your alert is bundled and you select either email only or slack & email, the email alert will not be bundled, it will be sent immediately
Service
Classify what channel or business this is alert is for: social, search, display, seo, analytics, or other
Slack bundling
Bundles are sent at 11 am & 3 pm (Central)
Select “yes, include slack messages in bundle” for you alerts to be sent in the next bundle.
Select “no, do not include slack messages in bundle” if you want your alert to be sent immediately.
Moroch users cannot opt into message bundling at the moment, select
no, do not include slack messages in bundle
*Note: if your alert is set to run after the last bundle (after 3pm) it will be included in the next bundle at 11 am the next day
Priority
What is the priority of this alert? low, medium, high
Email Subject
If you’ve selected email only or email & slack as your communication type, this will be the subject of the email alerts
Email Recipients
Comma separated list of email recipients of email alerts (if selected as communication type)
Email Body
Body/message of the email alert (if selected as communication type)
Display Results in Slack
Display the results of your query in slack if the rows & columns are less than or equal to the max column + rows selected
Choose “Yes, display results in slack & provide download link” or “No, only provide download link”
Max Displayed Columns
If "yes" is selected for 'display_results_in_slack', your query will display if the # of columns is less than this value. If your query returns more columns than selected, it will not display
Max Displayed Rows
If "yes" is selected for 'display_results_in_slack', your query will display if the # of rows is less than this value. If your query returns more rows than selected, it will not display
To finish your Custom alert, select
Save
orSave & Run
if you want to run your alert manually