Make Better Investments Than Competitors With Alternative Data

In today’s data-driven world, investment managers are increasingly turning to alternative data to gain an edge over their competitors. But what exactly is alternative data, and how can it lead to better investment decisions? As an expert in web scraping and data analytics with over 10 years of experience extracting insights from alternative data sources, I’ll provide a comprehensive guide on how to tap into this valuable resource.

Demystifying Alternative Data: A Primer

Alternative data refers to information culled from non-traditional sources outside of typical financial statements, earnings calls, and corporate disclosures. This includes data derived from sources like:

  • Social media activity
  • Satellite and geospatial imagery
  • Credit card transactions
  • Web traffic and searches
  • Mobile devices and IoT sensors
  • Government and public records

But it helps to think of alternative data more broadly as any data that provides meaningful insights into a company, industry, or economic trend, as long as it comes from an unconventional source. Here are a few examples of alternative datasets I’ve worked with firsthand:

  • Restaurant reservation data from platforms like OpenTable to forecast hospitality revenue
  • Shipping container analytics to predict fluctuations in global trade
  • Aggregated credit card data to track retail spending by geography and demographics
  • Satellite imagery to monitor parking lot traffic and analyze store performance
  • Job posting analytics to assess talent recruitment trends

Essentially, any novel data stream from non-financial sources that helps explain what’s happening in near real time can provide an investment edge. The rise of big data and analytics has opened up a treasure trove of these rapidly updating alternative data sources.

Conventional financial data describes the past, while alternative data offers a near-real-time view of the present that can hint at what comes next. This "predictive power" makes it a valuable complement to traditional data, enabling investors to make smarter decisions fueled by real-time insights.

The Staggering Growth of Alternative Data

Adoption of alternative data by the investment community has grown sharply over the past decade. Here’s a look at the hockey-stick growth:

  • In 2009, only about 10% of hedge funds used alternative data sources in their investment process.
  • By 2019, an estimated 85% of hedge funds were using some form of alternative data.
  • The alternative data industry is projected to surpass $17 billion in market size by 2022.


This surging demand stems from the game-changing insights and demonstrable alpha generation enabled by previously unexplored data streams. The "alternative data revolution" is well underway in the investment community.

Why Alternative Data Leads to Better Investing

Let’s explore the mechanics of how integrating alternative data into the investment process empowers more informed decisions and superior returns:

Gain Early Signals on Company Performance

By tapping into real-time alternative data on customer behavior and operations, investors can gain visibility into leading indicators that telegraph financial performance before quarterly results are reported.

For example, by aggregating anonymized credit card transaction data from millions of consumers, investors can piece together granular demand trends – detecting sales spikes or slowdowns weeks or months before they appear on an earnings release.

Web traffic and engagement metrics act as high-frequency indicators of interest in a company’s products. Drops in site visitors or e-commerce conversion rates can foreshadow upcoming revenue declines. Reviews and mentions on social media reveal shifts in brand perception.

Essentially, alternative data grants investors access to a live data feed of metrics that help predict later financial results, allowing more agility in trading around anticipated earnings shocks.
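
One way to check whether a data feed really leads financial results is to correlate it with the reported figures at several lags. The sketch below is a minimal illustration using only the Python standard library. The series `card_spend` and `revenue` are invented toy numbers (revenue is deliberately constructed to trail spend by two months), and the helper names are hypothetical rather than part of any vendor API.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(signal, target)
    return pearson(signal[:-lag], target[lag:])

# Hypothetical monthly card-spend index, invented for illustration.
card_spend = [100, 104, 99, 110, 120, 115, 108, 125, 130, 122]
# Toy "reported revenue" built to follow card spend with a two-month delay.
revenue = [50, 50] + [0.5 * x for x in card_spend[:-2]]

# Pick the lag (0 to 3 months) at which the signal lines up best with revenue.
best_lag = max(range(4), key=lambda k: lagged_correlation(card_spend, revenue, k))
print(best_lag)
```

On real data the relationship will be far noisier, so a lag chosen this way should be treated as a hypothesis to test on later periods rather than a finding.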

Assess Competitive Threats and Opportunities

From employee review sites to commercial satellite imagery, alternative data opens up new ways to benchmark a company’s operational performance against competitors. This empowers more informed assessments of competitive threats and opportunities.

For example, vacancy postings for engineering and technical roles at a self-driving car startup could signal aggressive R&D hiring. News of a rival securing large funding rounds might suggest an impending competitive threat. Sudden departures of top executives to competitors, visible in public announcements and LinkedIn updates, hint at future instability.

By expanding the range of observable metrics beyond just financial statements, alternative data provides a multidimensional view of competitive positioning impossible with conventional disclosures alone.

Verify or Refute Management Commentary

Earnings calls and presentations offer carefully crafted narratives by management about their business. Alternative data makes it possible to test those narratives against objective real-world signals.

For instance, when executives tout strong customer retention and satisfaction, the claim can be checked against churn rates computed from anonymized account data. Claims of supplier constraints might be contradicted by shipping container data showing no logistics slowdown.

Alternative data empowers looking beyond the "sales pitch" offered by management – bridging the disconnect between rhetoric and reality.

Monitor Shifts in Demand and Macro Trends

The versatility of alternative data allows tracking indicators that mirror broader economic conditions and end-market demand. For example:

  • Restaurant bookings foreshadow consumer discretionary spending
  • Class 8 truck orders trace industrial activity
  • Copper prices reflect global construction demand
  • Web searches for "unemployment" or "food stamps" indicate economic anxiety

Incorporating alternative data that aligns with macro drivers or final demand can help guide top-down sector allocation and position sizing. Identifying leading indicators enables better timing around economic turning points.

The Data Science of Alternative Data

While tremendous potential lies within alternative data, the challenge is extracting actionable insights from the noise. Harnessing alternative data requires embracing sophisticated data science techniques.

Here’s an overview of key steps in the alternative data analysis process:

Sourcing and Collection

  • Identify relevant alternative data sources aligned to research goals
  • Assess accessibility, coverage, quality and cost trade-offs of different datasets
  • Set up data pipelines and licenses to access targeted datasets

Cleaning and Normalization

  • Resolve missing values, duplicates, and outliers that create distortions
  • Normalize messy data formats into analysis-ready structured tables
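
As a rough illustration of the cleaning step, the sketch below drops duplicate and missing rows and then filters extreme outliers using the median absolute deviation (MAD). The field names (`store`, `date`, `visits`), the sample rows, and the threshold of 5 MADs are all assumptions made for the example, not a recommendation for any particular dataset.

```python
from statistics import median

def clean(records, key="visits", max_dev=5.0):
    """Drop duplicate and missing rows, then drop extreme outliers.

    A row is treated as an outlier when its value lies more than `max_dev`
    median absolute deviations from the median (an arbitrary threshold).
    """
    seen, rows = set(), []
    for r in records:
        ident = (r["store"], r["date"])
        if r.get(key) is None or ident in seen:
            continue  # skip missing values and repeated (store, date) rows
        seen.add(ident)
        rows.append(r)
    values = [r[key] for r in rows]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return rows  # no spread to measure outliers against
    return [r for r in rows if abs(r[key] - med) / mad <= max_dev]

# Hypothetical foot-traffic rows, invented for illustration.
raw = [
    {"store": "A", "date": "2024-01-01", "visits": 120},
    {"store": "A", "date": "2024-01-01", "visits": 120},   # duplicate row
    {"store": "A", "date": "2024-01-02", "visits": None},  # missing value
    {"store": "A", "date": "2024-01-03", "visits": 118},
    {"store": "A", "date": "2024-01-04", "visits": 125},
    {"store": "A", "date": "2024-01-05", "visits": 9000},  # sensor glitch
    {"store": "A", "date": "2024-01-06", "visits": 122},
]
cleaned = clean(raw)
print([r["visits"] for r in cleaned])
```

Whether an extreme value is a glitch or a genuine event (a holiday rush, say) is a judgment call, so in practice it can be safer to flag outliers for review than to delete them silently.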

Enrichment and Augmentation

  • Merge alternative data with traditional data sources
  • Append derived metrics, aggregates, and time lags to power analysis
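
The enrichment step can be sketched as a join plus a couple of derived columns. The example below is a minimal version, assuming quarterly rows keyed by hypothetical `ticker` and `quarter` fields. It attaches reported revenue to each alternative-data row and adds a one-quarter lag and a quarter-over-quarter growth rate. All figures are invented.

```python
def enrich(alt_rows, fundamentals, metric="visits"):
    """Join alt-data rows to reported fundamentals and add derived columns."""
    by_key = {(f["ticker"], f["quarter"]): f for f in fundamentals}
    previous = {}  # last seen metric value per ticker
    enriched = []
    for row in sorted(alt_rows, key=lambda r: (r["ticker"], r["quarter"])):
        lag1 = previous.get(row["ticker"])
        previous[row["ticker"]] = row[metric]
        fund = by_key.get((row["ticker"], row["quarter"]))
        if fund is None:
            continue  # inner join: keep only quarters present in both sources
        growth = row[metric] / lag1 - 1 if lag1 else None
        enriched.append({**row, "revenue": fund["revenue"],
                         f"{metric}_lag1": lag1, f"{metric}_growth": growth})
    return enriched

# Hypothetical inputs, invented for illustration (note the unsorted order).
alt = [
    {"ticker": "XYZ", "quarter": "2023Q2", "visits": 1100},
    {"ticker": "XYZ", "quarter": "2023Q1", "visits": 1000},
    {"ticker": "XYZ", "quarter": "2023Q3", "visits": 1210},
]
fundamentals = [
    {"ticker": "XYZ", "quarter": "2023Q1", "revenue": 50.0},
    {"ticker": "XYZ", "quarter": "2023Q2", "revenue": 54.0},
    {"ticker": "XYZ", "quarter": "2023Q3", "revenue": 60.0},
]
enriched = enrich(alt, fundamentals)
for r in enriched:
    print(r["quarter"], r["visits_lag1"], r["visits_growth"], r["revenue"])
```

The lag is computed from the alternative data alone, so a quarter missing from the fundamentals does not break the lag chain. This assumes the alternative data itself has no gaps.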

Exploration and Visualization

  • Mine data for trends, correlations, clustering and anomalies
  • Create interactive visualizations optimized for gaining insights

Statistical Modeling and Machine Learning

  • Build regression models, classifiers, and neural nets to uncover predictive alpha signals
  • Backtest models rigorously to measure out-of-sample predictive power
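
To make "out-of-sample" concrete, here is a small walk-forward backtest sketch. At each step it fits a one-variable least-squares line on the data seen so far, predicts the next point, and compares the error with a naive "same as last period" baseline. The `signal` and `reported` series are invented toy numbers, and a real backtest would also need to account for data release timing, transaction costs, and multiple-testing risk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def walk_forward_mae(xs, ys, min_train=4):
    """Expanding window: fit on points before t, predict point t, average |error|."""
    errors = []
    for t in range(min_train, len(xs)):
        slope, intercept = fit_line(xs[:t], ys[:t])
        errors.append(abs(slope * xs[t] + intercept - ys[t]))
    return sum(errors) / len(errors)

def naive_mae(ys, min_train=4):
    """Baseline: predict that each value equals the previous one."""
    errors = [abs(ys[t - 1] - ys[t]) for t in range(min_train, len(ys))]
    return sum(errors) / len(errors)

# Hypothetical alt-data signal and the reported metric it is meant to explain.
signal = [10, 12, 11, 15, 14, 18, 17, 20]
reported = [30.5, 33.8, 32.1, 40.3, 37.6, 46.2, 44.1, 49.7]

model_mae = walk_forward_mae(signal, reported)
baseline_mae = naive_mae(reported)
print(f"model MAE={model_mae:.2f}, naive MAE={baseline_mae:.2f}")
```

The key design choice is that the model never sees the point it is predicting. A signal is only interesting if it beats a simple baseline under that constraint.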

Business Integration and Productionization

  • Embed model predictions and data-driven signals into live trading systems
  • Build guardrails for data monitoring, refresh, and model drift

Investing in the data science capabilities this process requires is what allows the typically high cost of alternative data to pay off.

Real-World Examples of Alternative Data in Action

Here are some real-world case studies that demonstrate the power of alternative data in action:

Using Satellite Data to Predict Retail Bankruptcies

By measuring parking lot traffic at big box retail locations through satellite image analysis, RS Metrics was able to forecast plunging visitor numbers at Toys R Us and Sears stores months before their eventual bankruptcies. The foot traffic declines provided early warning signals of their demise.

Analyzing Shipping Data to Predict Economic Shifts

By purchasing access to a database of real-time container shipping data, hedge fund Numerai was able to closely track global trade volumes. They discovered the shipping data acted as a reliable leading indicator of GDP growth – moving several months before official macroeconomic statistics were released. This allowed more informed trading around economic turning points.

Estimating Tesla Sales from Social Media

JCB Insights built an algorithm to track images of new Tesla vehicle deliveries posted by owners on platforms like Twitter, Reddit, and Facebook. By tallying these social media posts, they estimated that quarterly Tesla sales increased 112% year-over-year, a figure that ultimately aligned closely with Tesla’s actual reported deliveries.

Using Web Data to Detect Accounting Irregularities

Academic researchers detected that Google search volume for terms like "[Company Name] Accounting Fraud" spiked notably before accounting scandals and financial restatements at firms like Enron and Sunbeam. This demonstrates how web data could provide early red flags around financial distortions and underlying irregularities.

These anecdotes reveal how creative sourcing and analysis of alternative data can transform opaque firms and markets into open books – granting those with access tremendous advantages.

Guide to Top Alternative Data Providers

Here I’ll share a sampling of reputable alternative data providers I’ve worked with firsthand over the years, categorized by their niche focus areas:

Web Data

  • Thinknum – Web scraped datasets on ecommerce demand, hiring trends, and consumer interest
  • Quiver Quant – Platform for aggregating SEC filing text analytics
  • Bombora – Analytics on B2B web behavior and buyer intent
  • SimilarWeb – Digital insights into website performance

Location Data

  • SafeGraph – Aggregated mobile location pings and foot traffic data
  • Placer – Real-time venue visitor analytics platform
  • Skyhook – Mobile location data and geospatial intelligence

Satellite Data

  • Orbital Insight – Satellite/aerial image analytics at industrial facilities
  • SpaceKnow – Image data for supply chain and shipping monitoring
  • Ursa – Satellite derived oil, mineral, and commodity analytics

Card Transaction Data

  • Affinity Solutions – Anonymized credit and debit card spend data
  • Second Measure – Insights into consumer spend changing at merchants
  • SPLICE – Massive dataset of matched credit card and mobile location data

Social and Web Data

  • Thinknum – Social metrics across platforms including TV, web, mobile
  • Sprout Social – Social media listening, sentiment tracking
  • Brandwatch – Text analytics applied to social media narratives

Job Listings Data

  • Thinknum – Analytics on hiring demand and compensation trends
  • Indeed – Dataset of job postings across thousands of websites
  • Burning Glass – Labor market analytics derived from job listings

This list just scratches the surface of the reputable providers in the alternative data space. The key is identifying vendors with specialized data feeds directly relevant to your research workflow needs and objectives.

Best Practices for Using Alternative Data

Based on my decade of experience as an alternative data analyst, here are my recommended best practices for successfully incorporating alternative data into the investment process:

Clarify Your Questions and Objectives

Rather than getting distracted by shiny new datasets, focus your alternative data search on sources that help answer specific questions about industries, sectors, or companies that are relevant to your core investment theses and strategies. Aligning alternative data to clear use cases is crucial.

Validate Any Signals with Fundamental Diligence

View predictive signals from alternative data as interesting hypotheses rather than absolute truths. Combining alternative data with traditional fundamental analysis puts those signals in proper context. Don’t outsource thinking to the data.

Construct Counterfactuals to Avoid Biases

Carefully consider what insights you would expect to find in a given alternative dataset under various scenarios. This guards against making false inferences simply because the data matches your preconceived biases. Avoid confirmation bias traps.

Watch for Low Signal-to-Noise Ratios

Certain alternative datasets may provide only intermittent, low-frequency signals amidst a lot of noise. Prioritize data with a high signal-to-noise ratio and consistent predictive power. Don’t get distracted by statistically weak correlations.

Build Rigorous Data Monitoring Guardrails

Monitor the ongoing performance of alternative data signals against reported results to quickly detect any deterioration in predictive power. This keeps models relevant and stops data drift from going unnoticed.
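
A guardrail like this can start very simply. The sketch below is a hypothetical `SignalMonitor` class (not from any library) that tracks how often a signal's directional call matched the reported result over a rolling window, and flags the signal once the hit rate falls below a chosen floor. The window size and the 55% floor are illustrative assumptions.

```python
from collections import deque

class SignalMonitor:
    """Track whether a signal's directional calls keep matching reported results."""

    def __init__(self, window=8, min_hit_rate=0.55):
        self.hits = deque(maxlen=window)  # rolling record of True/False outcomes
        self.min_hit_rate = min_hit_rate

    def record(self, predicted_change, reported_change):
        """Store whether predicted and reported changes had the same sign."""
        self.hits.append((predicted_change > 0) == (reported_change > 0))

    @property
    def hit_rate(self):
        return sum(self.hits) / len(self.hits) if self.hits else None

    def degraded(self):
        """Flag only once the window is full, to avoid alerts on tiny samples."""
        full = len(self.hits) == self.hits.maxlen
        return full and self.hit_rate < self.min_hit_rate

# Hypothetical usage: four correct calls, then three wrong ones.
monitor = SignalMonitor(window=4)
for predicted, reported in [(0.05, 0.03), (-0.02, -0.01), (0.04, 0.06), (0.01, 0.02)]:
    monitor.record(predicted, reported)
healthy_before = monitor.degraded()

for predicted, reported in [(0.03, -0.02), (0.02, -0.04), (-0.01, 0.05)]:
    monitor.record(predicted, reported)
print(monitor.hit_rate, monitor.degraded())
```

A flag from a monitor like this is a prompt to investigate, for example whether the vendor changed its panel or methodology, rather than an automatic reason to switch the signal off.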

Recognize the “Agnostic” Nature of Data

No dataset represents absolute truth; it simply reflects a series of measurements or observations. Maintain healthy skepticism rather than blindly trusting patterns found within alternative data.

While it takes work to properly harness its potential, applying these principles allows alternative data to deliver invaluable forward-looking insights unavailable through conventional information sources alone. The investment winners of the future will be those who can master alternative data analysis.