Bot Traffic in Google Analytics
I was messing around with Scroll Depth data in Google Analytics when I noticed that, literally, the numbers weren’t adding up. The Scroll Depth plugin works by triggering events when a user scrolls 25%, 50%, 75%, and 100% of the page. It also fires an immediate “Baseline” event when the page loads. In theory, the number of Baseline events should be equal to the number of pageviews.
But when I created an Advanced Segment for visits that included a Baseline event, the segment only returned 75% of total visits. Mysteriously, 25% of visits were not triggering the Baseline event—in fact they weren’t triggering any Scroll Depth events at all.
100% New Visits, Average Visit Duration 0, and 0.14 Pages/Visit?
Next step? Create more Advanced Segments. I created a Bots segment that included only visits where the Service Provider field matched this regex:
(microsoft corp)|(microsoft corporation)|(amazon.com inc)|(amazon technologies inc)
I then created a No Bots segment that excluded visits matching the same regex.
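To make the two segments concrete, here is a rough Python sketch of the same split applied to exported visit rows. The regex is copied from the segment as written (note the unescaped dot in "amazon.com" matches any character). The row format, the `service_provider` key, the `split_visits` helper, and the sample provider names are all made up for illustration. The case-insensitive partial match is an assumption about how the segment behaves, not something taken from Google Analytics documentation.

```python
import re

# Regex copied from the Bots segment. Partial, case-insensitive matching
# is an assumption made for this sketch.
BOT_PROVIDERS = re.compile(
    r"(microsoft corp)|(microsoft corporation)|(amazon.com inc)|(amazon technologies inc)",
    re.IGNORECASE,
)


def split_visits(visits):
    """Split visit rows into (bots, no_bots) by Service Provider.

    `visits` is a hypothetical list of dicts with a "service_provider" key.
    """
    bots, no_bots = [], []
    for visit in visits:
        if BOT_PROVIDERS.search(visit["service_provider"]):
            bots.append(visit)
        else:
            no_bots.append(visit)
    return bots, no_bots


# Hypothetical sample rows, not real data.
sample_visits = [
    {"service_provider": "microsoft corporation"},
    {"service_provider": "comcast cable communications inc."},
    {"service_provider": "amazon technologies inc."},
    {"service_provider": "verizon online llc"},
]

bots, no_bots = split_visits(sample_visits)
print(len(bots), "bot visits;", len(no_bots), "other visits")
```

Because the two lists come from one pass over the same rows, every visit lands in exactly one bucket, which mirrors how the Bots and No Bots segments are meant to partition traffic.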
For the site I was looking at, the bot traffic only began in early November. The Bots segment showed bot traffic accounting for a surprising 17% of visits.
If you have this sort of bot traffic, volume-based metrics like Visits and Unique Visitors are artificially inflated. Average-based metrics like Visit Duration and Pages/Visit are skewed downward, because the bot visits add to the visit count while contributing almost no time or pageviews.
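The arithmetic behind that skew is easy to check. The sketch below blends a human segment with a bot segment and recomputes the averages. All of the totals are invented for illustration (only the 17% bot share echoes the figure above), and `blended_metrics` is a hypothetical helper, not anything Google Analytics provides.

```python
def blended_metrics(human, bot):
    """Combine two segments and return site-wide totals and averages.

    Each segment is a dict with "visits", "pageviews", and
    "duration_seconds" totals (a hypothetical format for this sketch).
    """
    visits = human["visits"] + bot["visits"]
    pageviews = human["pageviews"] + bot["pageviews"]
    duration = human["duration_seconds"] + bot["duration_seconds"]
    return {
        "visits": visits,
        "pages_per_visit": pageviews / visits,
        "avg_visit_duration": duration / visits,
        "bot_share": bot["visits"] / visits,
    }


# Invented numbers: 830 human visits at 2 pages and 120 seconds each,
# plus 170 bot visits with almost no pageviews and zero duration.
human = {"visits": 830, "pageviews": 1660, "duration_seconds": 830 * 120}
bot = {"visits": 170, "pageviews": 24, "duration_seconds": 0}

report = blended_metrics(human, bot)
print(report)
```

With these made-up inputs, the human-only averages are 2 pages and 120 seconds per visit, while the blended report comes out lower on both, even though nothing about real visitor behavior changed.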
I’m sure we’ll see a lot more discussion about this soon. In the meantime, here are a few things you can do to see if bot traffic is having an effect on your Google Analytics data:
- Look for offending service providers like Microsoft and Amazon
- Look for suspicious numbers like zero Visit Duration, <1 Pages/Visit, and very high New Visit %
- Check out this post from LunaMetrics about filtering bots in Google Analytics
- Keep an eye on Intelligence Events for spikes in direct traffic and other anomalous events
- If you discover bot activity, use Advanced Segments or profile filters to isolate the data you care about
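As a rough illustration of the "suspicious numbers" check in the list above, the sketch below flags rows from a per-service-provider report export. The row format, the `looks_like_bot` helper, and the thresholds (under 1 Pages/Visit, at least 95% New Visits) are assumptions chosen for this example. The first sample row uses the numbers quoted earlier in the post; the others are invented.

```python
def looks_like_bot(row, max_pages_per_visit=1.0, min_new_visit_pct=95.0):
    """Return True when a report row shows all three warning signs.

    Thresholds are illustrative defaults, not recommended values.
    """
    return (
        row["avg_visit_duration"] == 0
        and row["pages_per_visit"] < max_pages_per_visit
        and row["new_visit_pct"] >= min_new_visit_pct
    )


# Hypothetical export: one row per Service Provider.
rows = [
    {"service_provider": "microsoft corp", "new_visit_pct": 100.0,
     "avg_visit_duration": 0, "pages_per_visit": 0.14},
    {"service_provider": "example isp a", "new_visit_pct": 62.0,
     "avg_visit_duration": 95, "pages_per_visit": 2.3},
    {"service_provider": "example isp b", "new_visit_pct": 98.0,
     "avg_visit_duration": 40, "pages_per_visit": 1.1},
]

suspects = [row["service_provider"] for row in rows if looks_like_bot(row)]
print(suspects)
```

Requiring all three signals at once keeps the check conservative: a provider with many new visitors but normal engagement is not flagged.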
Update: I wrote a follow-up post explaining how to identify bots based on behavior instead of properties like ISP or Browser.