Client Scrape Request Sample

Example Client Onboarding Email

{Client},

We're hoping to onboard GitLab to a tool that regularly scrapes all URLs used in live media. The intention is to catch any URLs that are broken or redirecting to other pages, so that we can ensure the best advertising experience for the user and prevent wasteful ad spend. This will help the team catch and fix these issues as soon as possible. To achieve this, we would need to scrape your site regularly to identify pages with issues. We recognize that your site may have restrictions on how and when we should perform these scrapes. Are you okay with us moving forward with this tool, and if so, could you answer the questions below so that we don't cause any issues on the site?

Do you have any requirements on what days we can scrape the site?

Do you have any requirements on what times of the day we can scrape the site?

Do we need to apply a rate limit to our scraper for your site?

The bot will have the user agent "pmg.com-urlscraper" in case you need to whitelist or identify traffic from these scrapes.

Are there any other restrictions or requirements we should be aware of when scraping your site?

Thanks,

Pallet variables:

pmg.com-urlscraper
100ms
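
As a rough illustration of how the two values above might be used, here is a minimal Python sketch of a polite URL check. It is not the actual tool. It assumes that "100ms" is the pause between consecutive requests, and the function names (classify, check_url, check_all) are hypothetical. It uses only the Python standard library.

```python
import time
import urllib.error
import urllib.request

USER_AGENT = "pmg.com-urlscraper"  # user agent value listed above
DELAY_SECONDS = 0.1                # assumption: "100ms" is the pause between requests


def classify(requested_url, final_url, status):
    """Label one result as 'broken', 'redirect', or 'ok' (hypothetical helper)."""
    if status is None or status >= 400:
        return "broken"
    if final_url != requested_url:
        return "redirect"
    return "ok"


def check_url(url, timeout=10):
    """Fetch one URL with the scraper's user agent and classify the outcome."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        # urlopen follows redirects, so geturl() returns the final landing URL.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(url, response.geturl(), response.status)
    except urllib.error.HTTPError as error:
        # The server answered with an error status such as 404 or 500.
        return classify(url, url, error.code)
    except OSError:
        # DNS failures, refused connections, and timeouts (URLError is an OSError).
        return classify(url, url, None)


def check_all(urls, delay=DELAY_SECONDS, checker=check_url, sleep=time.sleep):
    """Check each URL in turn, pausing `delay` seconds between requests."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)
        results[url] = checker(url)
    return results
```

Calling check_all with a list of landing page URLs returns a dictionary that maps each URL to "ok", "redirect", or "broken". Any day or time restrictions a client gives in response to the email would have to be enforced by whatever schedules the run; this sketch does not cover that.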

Example S3 Datasource

https://data.alliplatform.com/datasource/5a89ddc0221faaba6da98226/S3_advanced