Back Up Your Google Analytics Data
Google Analytics is used by millions of businesses. Many of these businesses rely exclusively on Google Analytics for web analytics reports, yet don't keep a backup copy of their GA data. This isn't a very smart strategy.
It's easy to keep a backup copy of your Google Analytics data. You simply configure your GA tracking code to make a copy of the tracking data sent to Google - the copy is stored in your web server log file. Once you have the data, you can process it with an on-premises web analytics software solution.
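To give a rough idea of what the backup copy looks like once it lands in your log file, here is a minimal Python sketch that parses one log line and pulls out the GA tracking parameters. It assumes the Apache/NGINX "combined" log format and a tracking request path of /__utm.gif; the function name parse_ga_hit and the sample line are made up for illustration, and your own log layout and request path may differ.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout (combined log format):
# ip ident user [time] "METHOD url PROTOCOL" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_ga_hit(line, pixel_path="/__utm.gif"):
    """Return the GA hit found in one log line, or None if there isn't one.

    pixel_path is an assumption: set it to whatever path your tracking
    code actually requests on your own domain.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # not a combined-format line
    url = urlsplit(match.group("url"))
    if url.path != pixel_path:
        return None  # an ordinary page or asset request, not a tracking hit
    # parse_qs returns a list per key; keep the first value of each parameter
    params = {key: values[0] for key, values in parse_qs(url.query).items()}
    return {
        "ip": match.group("ip"),
        "time": match.group("time"),
        "agent": match.group("agent"),
        "params": params,
    }

# Hypothetical sample line, for illustration only
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2014:13:55:36 -0700] '
    '"GET /__utm.gif?utmac=UA-12345-1&utmhn=www.example.com&utmp=%2Fpricing HTTP/1.1" '
    '200 35 "http://www.example.com/pricing" "Mozilla/5.0"'
)

hit = parse_ga_hit(SAMPLE_LINE)
print(hit["params"])
```

Every line that comes back as a hit is one tracking request you now own a copy of, independent of what Google keeps.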
The primary reason to keep a backup copy of anything is to be able to recover from unexpected issues. Google Analytics is no different - consider the following:
Reasons to Back Up GA Data
1) Business Continuity
Let's be honest: data loss isn't a major concern with Google Analytics because Google's redundant, geographically distributed datacenters will (probably) not simultaneously explode. But if you use Google Analytics, your company is still exposed to potential issues. For example:
- What happens if Google decides to delete your historical data? (see Data Retention Policy, below)
- What happens if your new CMO or agency decides to standardize on another analytics tool?
- What happens if your company can't use Google Analytics anymore? (see Privacy Regulations, below)
The easiest way to prepare for (and overcome) the above scenarios is to start keeping a backup copy of your GA data.
2) Data Retention Policy
Google Analytics only guarantees to retain your most recent 25 months of data. Google hasn't enforced this policy yet, but it can start doing so at any time.
Here's why this is a legitimate concern: there's a cost (e.g. hardware, network, power, development, legal) associated with every Google Analytics account. The more websites use Google Analytics, the more data is stored, and the higher the cost.
If Google decides to cut costs by enforcing the data retention policy, you will lose your historical data unless you have a backup copy.
3) Privacy Regulations
Internet Privacy is a hot topic among legislators. Currently, there are broad usage restrictions for Google Analytics in industries like healthcare, finance, education, and government/military. Internet Privacy regulations are expected to increase in the coming years.
There's a lot more that can be discussed about this topic, especially for businesses in the EU. But here's the salient point: if a new law restricts your company from using Google Analytics, you will be better prepared if you have a backup copy of your GA data.
4) Verify Data Collection
If you only use Google Analytics, you won't know if the data you see is actually correct. For example, if your visits are being double-counted or if 30% of your website traffic is collected by a different tracking ID / UA number, how would you notice the problem?
An effective way to verify your data is to compare it against reports from another product. Many companies use two web analytics products for this exact reason.
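The comparison itself can be very simple. The Python sketch below takes daily visit counts from two sources and flags the days where they disagree by more than a tolerance. The function name find_discrepancies, the 5% tolerance, and all of the numbers are hypothetical; pick a threshold that matches the normal gap between your own tools.

```python
def find_discrepancies(ga_counts, backup_counts, tolerance=0.05):
    """Return {day: relative_difference} for days that disagree too much.

    Both arguments map a day (any sortable key) to a visit count.
    The difference is measured relative to the larger of the two counts.
    """
    flagged = {}
    for day in sorted(set(ga_counts) | set(backup_counts)):
        ga = ga_counts.get(day, 0)
        backup = backup_counts.get(day, 0)
        baseline = max(ga, backup)
        if baseline == 0:
            continue  # no traffic in either source, nothing to compare
        difference = abs(ga - backup) / baseline
        if difference > tolerance:
            flagged[day] = round(difference, 3)
    return flagged

# Made-up numbers: day 2 looks double-counted in GA, day 3 is missing ~30%
ga_counts = {"day 1": 1000, "day 2": 2050, "day 3": 700}
backup_counts = {"day 1": 1020, "day 2": 1030, "day 3": 1000}

flagged = find_discrepancies(ga_counts, backup_counts)
print(flagged)
```

A flagged day doesn't tell you which product is wrong, only that it's worth a closer look.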
5) Expose Hidden Visitors
Hidden visitors are an unseen problem for most websites. They occur when visitors block the tracking gif with a browser plugin, which means you won't see all of your website traffic in your Google Analytics reports. The percentage of hidden visitors depends on the type of website you have. We've seen it as high as 25%, but it's usually around 10%.
When you keep a backup copy of GA tracking data, the tracking gif is sent directly to your website domain...and it typically isn't blocked. You'll see the hidden visitors if you process the data in another web analytics software program.
6) Fix Mistakes
C'mon, admit it - you (or someone you know) created an incorrect filter and didn't realize the mistake until a big chunk of new data was missing. It's ok - we've all done it. :-)
Google Analytics applies configuration settings before your data appears in the reports. If you make a mistake, you can't fix any data that was previously processed. But if you have a backup copy of your Google Analytics data, you can reprocess the data in another web analytics software program and eliminate the mistakes for reporting purposes.
7) Exclude Bogus Traffic
Sometimes your GA view is configured perfectly but unwanted data appears in your reports. For example:
In 2013, Microsoft's search crawler (bingbot) started running GA code and sending tracking data. In 2014, the semalt crawler did the same thing. GA reports showed higher traffic in both cases, and it wasn't possible to eliminate the crawler traffic from the previous weeks.
Referrer Spam & Event Spam
This is a relatively recent phenomenon - companies will send bogus traffic to your Google Analytics account that advertises their product or service website. The hope is that you'll notice the Source or Event, say "what website is this?" and visit it. But mostly, this cheap promotional stunt just clogs your reports with unnecessary bogusness.
Someone Else Uses Your UA number
We've seen this happen when a robot scrapes your content and shows it on another website without stripping out your GA code. We've also seen web design agencies that forget to swap out the UA number before a site launches.
If traffic from any of the above scenarios ends up in your Google Analytics data, you can't remove it from the reports. With a backup copy, you can filter it out when you reprocess the data.
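When you process the backup copy yourself, all three kinds of bogus traffic can be screened out after the fact. The Python sketch below is one way to do it, assuming each hit has already been parsed into a dictionary with agent, referrer, and hostname fields. Those field names, the function is_bogus, and the three lists are hypothetical and would need to be adapted to your own data.

```python
import re
from urllib.parse import urlsplit

# All three lists are illustrative assumptions, not a complete blocklist
BOT_AGENT = re.compile(r"bingbot|semalt|crawler|spider", re.IGNORECASE)
SPAM_REFERRER_DOMAINS = {"semalt.com"}
VALID_HOSTNAMES = {"www.example.com"}  # the site(s) your UA number belongs on

def is_bogus(hit):
    """Return True if a parsed hit looks like crawler, spam, or foreign-site traffic."""
    # 1. Crawlers that execute the tracking code
    if BOT_AGENT.search(hit.get("agent", "")):
        return True
    # 2. Referrer spam: match the referrer's host against the blocklist
    referrer_host = urlsplit(hit.get("referrer", "")).hostname or ""
    for domain in SPAM_REFERRER_DOMAINS:
        if referrer_host == domain or referrer_host.endswith("." + domain):
            return True
    # 3. Someone else is using your UA number on a different site
    if hit.get("hostname") not in VALID_HOSTNAMES:
        return True
    return False

hits = [
    {"agent": "Mozilla/5.0", "referrer": "http://www.example.com/", "hostname": "www.example.com"},
    {"agent": "Mozilla/5.0 (compatible; bingbot/2.0)", "referrer": "", "hostname": "www.example.com"},
    {"agent": "Mozilla/5.0", "referrer": "http://semalt.semalt.com/crawler.php", "hostname": "www.example.com"},
    {"agent": "Mozilla/5.0", "referrer": "", "hostname": "scraper.example.net"},
]

clean_hits = [hit for hit in hits if not is_bogus(hit)]
print(len(clean_hits), "of", len(hits), "hits kept")
```

Because the raw data is never modified, a rule that turns out to be too aggressive can be relaxed and the data reprocessed.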
8) See Details Not Shown in GA
Google Analytics reports don't provide many details. You can use segments and filters to isolate blocks of traffic, but you can't see individual visitor details. Plus, the Google Analytics ToS (Terms of Service) doesn't allow you to store PII (Personally Identifiable Information).
If you keep a backup copy of GA data and process it in a different web analytics product, your reports won't be governed by the same ToS. This means you can see visitor details like IP address, individual clickpaths, usernames, and PII.
Hopefully you now understand the value of keeping a backup copy of your GA data.
But once you have the raw data, there's another question to answer: how are you going to get the data into the hands of your business users in a format they can actually use? Specifically:
- Where to store it?
- How to sessionize it?
- How to visualize it?
- Build or Buy?
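To make the sessionizing question concrete, here is a simplified Python sketch that groups raw hits into sessions using a 30-minute inactivity timeout. It assumes each hit is a (visitor_id, timestamp, page) tuple; the function name sessionize and the sample data are hypothetical. Real sessionization usually has more rules than this (for example, campaign changes or day boundaries), so treat it as a starting point only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cutoff

def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor_id, timestamp, page) hits into per-visitor sessions.

    Returns a list of (visitor_id, [pages]) tuples. A new session starts
    whenever a visitor has been inactive for longer than `timeout`.
    """
    by_visitor = defaultdict(list)
    for visitor_id, timestamp, page in hits:
        by_visitor[visitor_id].append((timestamp, page))

    sessions = []
    for visitor_id, visitor_hits in by_visitor.items():
        visitor_hits.sort()  # chronological order within each visitor
        current_pages = []
        last_seen = None
        for timestamp, page in visitor_hits:
            if last_seen is not None and timestamp - last_seen > timeout:
                sessions.append((visitor_id, current_pages))
                current_pages = []
            current_pages.append(page)
            last_seen = timestamp
        if current_pages:
            sessions.append((visitor_id, current_pages))
    return sessions

# Made-up hits: visitor "a" returns after a 50-minute gap
hits = [
    ("a", datetime(2015, 3, 1, 9, 0), "/"),
    ("a", datetime(2015, 3, 1, 9, 10), "/pricing"),
    ("b", datetime(2015, 3, 1, 9, 5), "/"),
    ("a", datetime(2015, 3, 1, 10, 0), "/contact"),
]

sessions = sessionize(hits)
for visitor_id, pages in sessions:
    print(visitor_id, pages)
```

Even this small example shows why "Build or Buy?" is a real question: storage, sessionizing, and visualization each need this kind of logic, plus the edge cases.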
We recommend looking at Angelfish Software. Angelfish is an on-premises web analytics software package that can process Google Analytics tracking data, and is an ideal solution for this exact scenario.