What is OpenVPN? A Comprehensive Guide for Google Analytics Users
Introduction
Website analytics are crucial for understanding visitor behavior, measuring marketing ROI, and optimizing user experience. Google Analytics is by far the most widely used web analytics platform: according to BuiltWith, over 29 million websites currently use it, representing about 55% of all websites. However, the accuracy of Google Analytics data is increasingly affected by visitors who use virtual private networks (VPNs) to mask their IP address and location. One of the most popular VPN protocols is OpenVPN. In this in-depth guide, we'll explain what OpenVPN is, how it affects Google Analytics tracking, and the steps you can take to adjust your analytics configuration and reporting to account for VPN usage.
What is OpenVPN?
OpenVPN is an open-source VPN protocol that uses SSL/TLS for key exchange and encryption. Its primary purpose is to let users connect securely and privately to remote networks, masking their IP address and location in the process. When a user connects to an OpenVPN server, their traffic is routed through an encrypted "tunnel", so the sites they visit see an IP address provided by the VPN server instead of their real one. This makes it appear as if their traffic originates from the VPN server's location, which could be anywhere in the world.
Behind the scenes, OpenVPN combines OpenSSL for encryption, digital certificates for authentication, and multiplexing that carries its control channel and its data channel over a single UDP or TCP port. The OpenVPN client and server applications negotiate the secure connection and route the encrypted traffic. While the technical implementation is complex, the end result for users is a secure, private connection that masks their real location and browsing activity.
Why Using a VPN Impacts Web Analytics
VPNs like OpenVPN have become popular tools for protecting online privacy, encrypting sensitive information, bypassing geographic content restrictions, and anonymizing web browsing activity. However, these same properties can cause issues for web analytics platforms like Google Analytics:
- IP address masking – Google Analytics uses the visitor's IP address to estimate location and to apply IP-based filters, and it identifies returning visitors mainly through a first-party cookie. When a visitor's real IP address is replaced by a VPN server's IP, every IP-derived signal describes the VPN server instead of the visitor, and VPN users who also clear cookies or browse privately are counted as new visitors each time.
- Location spoofing – VPNs allow users to "tunnel" their connection through servers located anywhere in the world. This means a user physically located in New York could be routing their web traffic through a VPN server in London, causing Google Analytics to incorrectly record their location as the UK.
- Bypassing filters and blocks – Network administrators sometimes block Google Analytics tracking, either for privacy reasons or to keep "internal" users from skewing the metrics. Analysts likewise exclude internal traffic with IP-based filters. VPNs allow users to circumvent both easily, because their traffic no longer comes from the expected network.
The end result is that VPN usage can significantly degrade the accuracy of Google Analytics data by distorting location metrics, letting internal traffic slip past filters, and, in combination with other privacy tools, inflating visitor counts and making unique visitors harder to identify. As we'll see, VPN usage is becoming increasingly common, making it essential for Google Analytics users to be aware of these issues and take steps to compensate.
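To make the location problem concrete, here is a minimal sketch of how an IP-based country lookup behaves when the same visitor turns a VPN on. The lookup table, the `country_for` helper, and both addresses are made up for illustration (the ranges are reserved documentation networks); real geolocation databases are far larger and Google Analytics' own lookup is not public.

```python
import ipaddress

# Hypothetical, hand-made lookup table. The ranges are reserved documentation
# networks (RFC 5737), standing in for a real ISP and a real VPN provider.
GEO_TABLE = [
    (ipaddress.ip_network("198.51.100.0/24"), "US"),  # stands in for a New York ISP
    (ipaddress.ip_network("203.0.113.0/24"), "GB"),   # stands in for a London VPN server
]

def country_for(ip: str) -> str:
    """Return the country an analytics tool would infer from the IP alone."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEO_TABLE:
        if addr in network:
            return country
    return "(not set)"

real_ip = "198.51.100.27"  # address assigned by the visitor's ISP
vpn_ip = "203.0.113.5"     # address the website sees once the VPN is on

print(country_for(real_ip))  # US
print(country_for(vpn_ip))   # GB
```

The visitor never left New York, but any report built on the second lookup places them in the UK.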
The Growing Popularity of VPNs
So just how many people are using VPNs? The exact numbers are difficult to determine, as VPN providers don't publicly report detailed usage statistics. However, several studies have attempted to quantify the rapid growth of VPN adoption:
- A 2021 Security.org survey found that 29% of US internet users had used a VPN in the past month, up from just 11% in 2019.
- Globally, DataProt estimates there are over 1 billion VPN users, representing about 31% of all internet users.
- According to AtlasVPN's analysis, VPN usage grew 27.1% in 2020, likely driven by the shift to remote work during the COVID-19 pandemic.
Looking at VPN usage from a web analytics perspective, a 2021 study by DeviceAtlas estimated that 36% of website visits are conducted over a VPN or proxy service. This number varies significantly by country – for example, in Indonesia over 55% of traffic comes through VPNs, compared to just 22% in the US.
The key takeaway is that VPN usage is substantial and growing rapidly, likely already impacting a material percentage of most websites' Google Analytics data. With this context in mind, let's look at some techniques for detecting and adjusting for VPN traffic.
Identifying VPN Usage in Google Analytics
There are a few different methods for spotting visitors using VPNs like OpenVPN in your Google Analytics data:
- Monitor the "Hostname" and "Service Provider" dimensions, where your property still reports them – unfamiliar hostnames usually mean your pages were loaded through a proxy or mirror, and service providers that are hosting or cloud companies rather than consumer ISPs (e.g. Amazon for AWS-based VPNs) suggest VPN traffic. Sudden spikes in traffic from unusual hostnames or providers can be a red flag.
- Use the "Network" report to spot providers associated with popular VPN services, and add view filters for the IP address ranges those services use. You can find lists of IP addresses for major VPN providers on sites like WhatIsMyIPAddress.com.
- Set up custom alerts for large increases in traffic from specific geographies or languages not usually associated with your user base. For example, if you normally get very little traffic from Indonesia but see a big jump in Indonesian visitors, VPN usage is a likely culprit.
- Use the "User Explorer" report to identify individual users exhibiting unusual behavior that could indicate VPN usage, like erratic location hopping or abnormally high session counts. You can create a segment to group and monitor these suspicious users over time.
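Google Analytics reports do not show raw IP addresses, so checking visitors against known VPN ranges has to happen on data you hold yourself, such as server access logs. The sketch below assumes you have such a list of visitor IPs; the CIDR blocks and the `is_suspected_vpn` and `vpn_share` helpers are hypothetical names used only for illustration.

```python
import ipaddress

# Hypothetical CIDR blocks (reserved documentation ranges). Replace them with
# the ranges you have collected for the VPN providers you care about.
SUSPECTED_VPN_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("203.0.113.0/24", "2001:db8::/32")
]

def is_suspected_vpn(ip: str) -> bool:
    """True if the address falls inside any suspected VPN range."""
    addr = ipaddress.ip_address(ip)
    # Membership is simply False when the address and network IP versions differ.
    return any(addr in network for network in SUSPECTED_VPN_RANGES)

def vpn_share(ips: list[str]) -> float:
    """Fraction of the given visitor IPs that match a suspected VPN range."""
    if not ips:
        return 0.0
    return sum(is_suspected_vpn(ip) for ip in ips) / len(ips)

# Made-up sample of visitor IPs, e.g. pulled from an access log.
visitor_ips = ["203.0.113.9", "198.51.100.4", "2001:db8::1", "192.0.2.50"]
print(f"{vpn_share(visitor_ips):.0%} of sampled visits match a suspected VPN range")
```

A match only means the address belongs to a listed range; published lists go stale quickly, so treat the result as an estimate rather than a verdict on any single visitor.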
While you're unlikely to be able to identify every single VPN user, these techniques can help you gauge the relative scale of VPN usage and create segments to better understand how these visitors are impacting your overall Google Analytics metrics.
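The location-hopping check can also be run outside the Google Analytics interface once session data is exported. This is a rough sketch, assuming an export of (client ID, country) pairs with one row per session; the `location_hoppers` function, the threshold, and the sample rows are all illustrative assumptions rather than anything Google Analytics provides.

```python
from collections import defaultdict

def location_hoppers(sessions, max_countries=2):
    """Return client IDs seen in more than `max_countries` distinct countries.

    `sessions` is an iterable of (client_id, country) pairs, one per session.
    """
    countries_seen = defaultdict(set)
    for client_id, country in sessions:
        countries_seen[client_id].add(country)
    return sorted(
        client_id
        for client_id, seen in countries_seen.items()
        if len(seen) > max_countries
    )

# Made-up export rows: visitor "a" appears in three countries, "b" in one.
sample_sessions = [
    ("a", "US"), ("a", "GB"), ("a", "ID"),
    ("b", "US"), ("b", "US"),
]
print(location_hoppers(sample_sessions))  # ['a']
```

Frequent travelers will trip the same rule, so the threshold is a judgment call; pairing it with a time window (for example, three countries within one day) would make the signal stricter.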
Adjusting Google Analytics Reporting for VPNs
Armed with an understanding of how much VPN traffic you're getting and some of the common identifiers, you can start adjusting your Google Analytics configuration and reporting to compensate:
- Create a "VPN Traffic" segment using the hostname, service provider, IP range, and other signals discussed above. Use this segment to exclude suspected VPN traffic when calculating KPIs and analyzing behavior trends.
- Avoid using location dimensions like "Country" in your default reporting views, as these are most likely to be skewed by VPN usage. Instead, look at metrics like "Pages / Session" or "Bounce Rate" that are less affected.
- Focus more on tracking behavior and conversions for identified users (e.g. those who created an account or made a purchase), as you'll have more reliable data for these visitors than for anonymous traffic that could be going through VPNs.
- If you have a sense of the percentage of your traffic using VPNs, you can apply an estimated deflation factor to metrics like total users and sessions for a more realistic picture. Just be sure to call out any such adjustments when reporting the data.
- For critical metrics, consider investing in a server-side tracking setup, as it is less exposed to client-side blocking than JavaScript tracking and gives you the raw request data needed to run your own VPN checks. Alternatively, explore session recording tools that don't rely solely on IP-based tracking.
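The deflation-factor idea from the list above comes down to simple arithmetic. The sketch below is one possible way to apply it, not an established formula: it assumes you can estimate both the share of traffic arriving through VPNs and the fraction of that traffic that double-counts visitors. The `deflate_metric` function and both parameters are hypothetical.

```python
def deflate_metric(reported: int, vpn_share: float, duplicate_rate: float) -> dict:
    """Scale a reported count down by the share believed to be VPN duplicates.

    vpn_share      -- estimated fraction of traffic arriving through VPNs
    duplicate_rate -- estimated fraction of that VPN traffic that double-counts
                      a visitor already counted elsewhere (an assumption)
    """
    for name, value in (("vpn_share", vpn_share), ("duplicate_rate", duplicate_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    factor = 1.0 - vpn_share * duplicate_rate
    return {
        "reported": reported,
        "adjusted": round(reported * factor),
        # Keep the adjustment visible so it can be called out in reports.
        "note": f"adjusted by estimated deflation factor {factor:.3f}",
    }

# Example with made-up estimates: 20% VPN traffic, a quarter of it duplicated.
print(deflate_metric(10_000, vpn_share=0.20, duplicate_rate=0.25))
```

With these example estimates, 10,000 reported users become 9,500 adjusted users. Returning the reported value and a note alongside the adjusted one keeps the adjustment transparent whenever the number is shared.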
Looking Ahead: The Future of VPNs and Web Analytics
With the growing focus on online privacy and security, VPN adoption shows no signs of slowing down. If anything, usage of protocols like OpenVPN is likely to become even more mainstream in the years ahead. For web analytics practitioners, this means the "noise" created by VPNs is likely to get worse before it gets better.
In the short term, familiarizing yourself with the VPN detection and adjustment techniques discussed in this guide will help you get a more accurate picture from your Google Analytics data. Longer term, we may see web analytics solutions evolve to be less dependent on individual tracking mechanisms such as IP addresses, which VPNs mask, and cookies, which privacy-conscious users routinely clear or block.
Server-side tracking, identity stitching across devices, and the use of machine learning to identify anomalous traffic patterns are all promising areas of innovation. Ultimately, web analytics solutions will likely engage in an ongoing cat-and-mouse game with VPNs and other privacy tools, similar to what we've seen in the digital advertising space.
Conclusion
For Google Analytics users grappling with the impact of VPNs like OpenVPN, the key is to stay informed about the evolving privacy landscape and be proactive about adjusting your tracking configuration and reporting to compensate. By segmenting out suspicious VPN traffic, emphasizing behavior-based metrics over geography, and exploring server-side tracking alternatives, you can continue to get actionable insights even as VPN usage grows.
Hopefully this guide has given you a solid foundation for understanding what OpenVPN is, how it impacts Google Analytics, and what you can do to adapt your analytics practices for an increasingly privacy-focused world. By staying on top of these trends and being transparent about any adjustments to your data, you'll be well positioned to make smart, data-driven decisions for your business in 2022 and beyond.