Bot Traffic in Google Analytics

I was messing around with Scroll Depth data in Google Analytics when I noticed that, literally, the numbers weren’t adding up. The Scroll Depth plugin works by triggering events when a user scrolls 25%, 50%, 75%, and 100% of the page. It also fires an immediate “Baseline” event when the page loads. In theory, the number of Baseline events should be equal to the number of pageviews.

But when I created an Advanced Segment for visits that included a Baseline event, the segment only returned 75% of total visits. Mysteriously, 25% of visits were not triggering the Baseline event—in fact they weren’t triggering any Scroll Depth events at all.

So then I created an Advanced Segment for all visits that did not include a Scroll Depth event. My suspicion was that an older browser might be hitting a JavaScript error that prevented the Scroll Depth plugin from executing. I started poking around in the Technology section and stumbled on this:

100% New Visits, Average Visit Duration 0, and 0.14 Pages/Visit?

Bots.

Next step? Create more Advanced Segments. I created a Bots segment, which included only visits where the Service Provider field matched the regex (microsoft corp)|(microsoft corporation)|(amazon.com inc)|(amazon technologies inc). And then I created a No Bots segment that excluded visits matching the same regex.
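If you export visit data and want to reproduce that split outside the Google Analytics interface, here is a minimal Python sketch. The regex is the one from the segment above. Everything else is an assumption for illustration: the function names, the `service_provider` key, and the idea that each visit is a plain dict.

```python
import re

# The pattern from the Bots segment, unchanged. The values in the post are
# lowercase; ignoring case is just a precaution for exported data.
BOT_PROVIDERS = re.compile(
    r"(microsoft corp)|(microsoft corporation)|(amazon.com inc)|(amazon technologies inc)",
    re.IGNORECASE,
)


def is_bot_provider(service_provider: str) -> bool:
    """Return True if the service provider matches the Bots segment regex."""
    # search() matches anywhere in the string, like a "matches regex" condition.
    return BOT_PROVIDERS.search(service_provider) is not None


def split_visits(visits):
    """Split visit records into (bots, no_bots).

    Each record is assumed to be a dict with a 'service_provider' key
    (a hypothetical export format, not a Google Analytics API shape).
    """
    bots, no_bots = [], []
    for visit in visits:
        target = bots if is_bot_provider(visit["service_provider"]) else no_bots
        target.append(visit)
    return bots, no_bots
```

Two things worth knowing about the pattern itself: the unescaped dot in amazon.com matches any character, and "microsoft corp" already matches "microsoft corporation", so the second alternative is redundant. Neither changes the result here.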

For the site I was looking at, the bot traffic only began in early November. The Bots segment showed bot traffic accounting for a surprising 17% of visits.

If you have this sort of bot traffic, volume-based metrics like Visits and Unique Visitors are artificially inflated, and average-based metrics like Visit Duration and Pages/Visit are skewed downward.
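To get a feel for the size of that skew, here is a rough Python sketch of the arithmetic. The 17% bot share and 0.14 Pages/Visit come from the numbers above, though treating 0.14 as the average for all bot visits is an assumption. The 3.0 Pages/Visit for real visitors is purely hypothetical, and the function names are mine.

```python
def blended_average(bot_share, bot_value, human_value):
    """Weighted average of a per-visit metric across bot and human visits."""
    return bot_share * bot_value + (1 - bot_share) * human_value


def human_average(bot_share, bot_value, reported_value):
    """Back out the human-only average from the reported (blended) average."""
    return (reported_value - bot_share * bot_value) / (1 - bot_share)


BOT_SHARE = 0.17            # from the post: bots were 17% of visits
BOT_PAGES_PER_VISIT = 0.14  # from the post; assumed to hold for all bot visits

# Hypothetical: suppose real visitors view 3.0 pages per visit.
reported = blended_average(BOT_SHARE, BOT_PAGES_PER_VISIT, 3.0)
print(round(reported, 2))  # 2.51
```

So under those assumptions a site whose real visitors view 3.0 pages per visit would report about 2.51. Going the other way, `human_average` recovers the human-only figure from a reported one if you know the bot share and the bot average.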

Historically, bot activity in Google Analytics wasn’t something we really had to worry about. Bots weren’t capable of executing JavaScript. Those days appear to be over. At least until Google Analytics decides to filter this traffic before it reaches the reports.

I’m sure we’ll see a lot more discussion about this soon. In the meantime, here are a few things you can do to see if bot traffic is having an effect on your Google Analytics data:

  • Look for offending service providers like Microsoft and Amazon
  • Look for suspicious numbers like zero Visit Duration, <1 Pages/Visit, and very high New Visit %
  • Check out this post from LunaMetrics about filtering bots in Google Analytics
  • Keep an eye on Intelligence Events for spikes in direct traffic and other anomalous events
  • If you discover bot activity, use Advanced Segments or profile filters to isolate the data you care about
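The "suspicious numbers" check in the list above can be sketched in Python as a simple filter over report rows. This is only an illustration: the `SegmentRow` fields, the function names, and especially the 95% cutoff for "very high" New Visit % are assumptions, not anything Google Analytics defines.

```python
from dataclasses import dataclass


@dataclass
class SegmentRow:
    """One row of an exported report (hypothetical shape)."""
    label: str
    visits: int
    new_visit_pct: float         # 0 to 100
    avg_visit_duration_s: float  # seconds
    pages_per_visit: float


# Assumption: the post only says "very high", so this cutoff is arbitrary.
HIGH_NEW_VISIT_PCT = 95.0


def looks_like_bot(row, high_new_visit_pct=HIGH_NEW_VISIT_PCT):
    """Flag rows with zero duration, <1 Pages/Visit, and very high New Visit %."""
    return (
        row.avg_visit_duration_s == 0
        and row.pages_per_visit < 1
        and row.new_visit_pct >= high_new_visit_pct
    )


def suspicious_rows(rows):
    """Return the labels of rows that match all three signals."""
    return [row.label for row in rows if looks_like_bot(row)]
```

Requiring all three signals at once is a deliberately conservative choice. A row that trips only one of them, such as a high New Visit % from a fresh campaign, is not flagged.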

Update: I wrote a follow-up post explaining how to identify bots based on behavior instead of properties like ISP or Browser.
