How to Highlight Duplicates in Google Sheets [Step-by-Step]

Duplicate data is the silent killer of spreadsheet accuracy. It sneaks in when you least expect it—an accidental copy/paste here, a redundant manual entry there. Before you know it, you‘re making decisions based on flawed data and your reports have more errors than a maths exam.

Just how common are duplicates? Studies have found that up to 20% of records in an average database are duplicates. And manual data entry has an average error rate of 1% – meaning one out of every 100 entries is likely to be incorrect or duplicated.

Scary stats, I know. But don‘t start questioning your sanity or your team‘s competence just yet. Duplicate data is an extremely common problem, and Google Sheets has some handy built-in tools to help you find and fix those pesky dupes.

In this guide, I‘ll walk you through exactly how to highlight duplicates in Google Sheets step-by-step. We‘ll cover:

  • Why duplicates are so problematic
  • How to manually highlight duplicates in a single column
  • Using formulas to find duplicates across multiple columns
  • Deciding what to do with duplicates once you find them
  • Tips for preventing duplicate data entry in the first place

By the end, you‘ll be equipped to keep your sheets squeaky clean and dupe-free. Let‘s dive in!

Why Duplicates Wreak Havoc on Your Data

First, let‘s talk about why duplicate data is such a menace. A few redundant rows here and there might not seem like a big deal, but over time, duplicates can seriously undermine your data integrity and lead to some major headaches:

Inaccurate reporting and analysis

Duplicates throw off key metrics like sums, averages, and counts. For example, say you‘re using Google Sheets to track customer orders. If a single order gets entered twice, your total revenue and average order value will be inflated.

Duplicate data across multiple sheets can also cause discrepancies that are hard to track down. If you‘re pulling data from different sources into summary dashboards, duplicates in the source data can lead to numbers not matching up.

Wasted time and resources

Imagine this: You export a list of 1,000 customers from your CRM to Google Sheets for a big email marketing campaign. But you don‘t realize that 200 of those customers appear twice in the list. You end up emailing 200 people twice and paying for an extra 200 sends with your email service provider. Not only is that a waste of time and money, it‘s also a poor customer experience.

Duplicates also waste your time on wild goose chases trying to reconcile numbers and track down discrepancies. The Pareto Principle states that 80% of consequences come from 20% of causes. I‘d be willing to bet that at least 20% of your spreadsheet troubleshooting time is spent dealing with duplicates!

Poor data quality and loss of trust

At the end of the day, duplicates make your data unreliable. If key stakeholders can‘t trust the data, they won‘t use it to drive decisions. Lack of faith in data is cited as one of the biggest barriers to creating a data-driven culture.

Duplicates are especially problematic when sharing data externally with clients, investors, or partners. Even small errors can make you look sloppy and damage your credibility.

Bottom line: Duplicates are bad news bears. The good news is that Google Sheets makes it easy to nip them in the bud. Next up, I‘ll show you how to highlight duplicates in a few quick clicks.

How to Manually Highlight Duplicates in Google Sheets

The simplest way to find duplicates is to visually scan through your data. But that‘s tedious and impractical for all but the smallest datasets. A better approach is to use Google Sheets‘ built-in conditional formatting to automatically highlight duplicates.

Here‘s how to manually highlight duplicates in a single column in Google Sheets:

  1. Open your Google Sheet and select the range of cells you want to check for duplicates. You can click and drag or use Ctrl/Cmd+click to select non-adjacent cells.

  2. In the top menu, go to Format > Conditional Formatting. The conditional format rules sidebar will appear on the right.

  3. Under the "Format cells if…" drop-down, choose "Custom formula is".

  4. In the value or formula field, enter the following formula:

    =COUNTIF(A:A,A1)>1 

    Be sure to replace "A:A" with the actual column you selected in Step 1. For example, if you want to check column C, you‘d use:

    =COUNTIF(C:C,C1)>1

Conditional formatting sidebar with custom duplicate formula

  1. The default format is a nice shade of green. But you can customize the formatting by clicking on the "Formatting style" box and choosing your own font style, color, and cell background.

  2. Click "Done" to apply the conditional formatting rule. Any cells containing values that appear more than once in the selected range will now be highlighted.

Spreadsheet with duplicate values highlighted

That‘s it! With just a few clicks, you can quickly visualize which values in a column have been duplicated.

Pro Tip: If you want to check the entire sheet for duplicates, you can select the whole data range instead of a single column in Step 1. For example, if your data is in columns A to D and rows 1 to 1000, you‘d select A1:D1000 and use this formula in Step 4:

=COUNTIF($A$1:$D$1000,A1)>1

The dollar signs ($) make it an absolute reference, so the formatting will apply correctly even as you copy the formula to other cells.

Using Formulas to Highlight Duplicates Across Multiple Columns

The conditional formatting method works great for a single column. But what if you need to check for duplicates across multiple columns? For example, let‘s say you have a list of customers and you want to flag any rows where the same email address appears more than once, even if the names are different.

To do that, we need to use a more advanced formula. Don‘t worry, it‘s not as complicated as it looks! Here‘s the step-by-step:

  1. Select the range you want to check for duplicates across multiple columns. In this example, we‘ll check columns A to C.

  2. Go to Format > Conditional Formatting to bring up the formatting rules sidebar.

  3. Choose "Custom formula is" from the drop-down.

  4. Paste in this formula:

=COUNTIFS($A:$A,$A1,$B:$B,$B1,$C:$C,$C1)>1

This formula uses the COUNTIFS function to check for duplicate values across columns A, B, and C. The $ makes the references absolute.

  1. Set your formatting style (I‘m partial to a bold red for duplicates) and click "Done".

Now any rows with the same values across columns A, B, and C will be highlighted, like this:

Duplicates highlighted across multiple columns

You can easily modify this formula to check for duplicates across any number of columns. Just add additional ranges and criteria to the COUNTIFS function, like this:

=COUNTIFS($A:$A,$A1,$B:$B,$B1,$C:$C,$C1,$D:$D,$D1)>1

Finding Duplicates with Filters or Pivot Tables

In addition to conditional formatting, you can also use filters or pivot tables to quickly spot duplicates.

To filter for duplicates:

  1. Select your data range and click the filter icon to add filters to each column.
  2. Click the filter arrow on the column you want to check for duplicates.
  3. Uncheck the "Select all" box to deselect all items.
  4. Check the box next to "(Blanks)" to select only blank cells.
  5. Look for any values that appear more than once in the list – those are your duplicates.

Using a filter to find duplicates

Another nifty trick is to create a pivot table to count occurrences of each unique value in a column. Here‘s how:

  1. Select your data and go to Data > Pivot table.
  2. In the pivot table editor, add the column you want to check for duplicates to the "Rows" section.
  3. In the "Values" section, choose "COUNTA" as the aggregation.
  4. Look for any values with a count greater than one in the resulting pivot table. Those are duplicates.

For example, here‘s a pivot table showing that "[email protected]" appears twice in the email column:

Pivot table showing duplicate counts

While these methods are handy for a quick duplicate check, I recommend using conditional formatting for highlighting and dealing with duplicates on an ongoing basis. It‘s faster and makes the duplicates easier to spot visually.

What to Do with Duplicates Once You Find Them

So you‘ve gone to the trouble of highlighting duplicates in your Google Sheet. Now what? You have a few options:

Remove the duplicates

The most common approach is to simply delete the duplicate rows (making sure to keep one instance of each record, of course). To quickly delete duplicates:

  1. Select the range of cells you want to deduplicate.
  2. Go to Data > Remove duplicates.
  3. Check the box next to "Data has header row" if your sheet includes headers.
  4. Click "Remove duplicates" and confirm the deletion.

Remove duplicates dialog box

Merge duplicates

In some cases, you may have duplicate rows that contain different information in certain columns. For example, say you have two rows for the same customer but one has a phone number and the other has an email address.

Rather than deleting one of the duplicates, you can consolidate the data into one master record. To do that:

  1. Highlight your duplicates using conditional formatting.
  2. Create a new sheet and paste in your header row.
  3. Find the first set of duplicates and copy the values from each row into a single row on the new sheet, combining information as needed.
  4. Repeat for each set of duplicates until you have one record per unique value on the new sheet.

It‘s a bit of manual work, but it allows you to keep all the relevant data without redundant entries.

Update your original data source

If you‘re bringing data into Google Sheets from another source (like a CRM or marketing platform), you‘ll want to go update that original database to remove the duplicates there. Otherwise, you‘ll just end up re-importing those duplicates again next time you refresh your data!

Exactly how you merge or purge duplicates in the source system will depend on the specific software. But in general, you can use the duplicates highlighted in your Google Sheet to identify which records need to be updated or deleted.

Flag duplicates for review

In situations where you‘re not sure if you should delete or merge duplicate records, you can flag them for further review. To do that:

  1. Highlight duplicates using conditional formatting.
  2. Create a new column called "Duplicate Review" or similar.
  3. Use an IF formula to automatically mark the duplicates, like this:
    =IF(COUNTIF($A$2:$A$1000,$A2)>1,"Need Review","")
  4. Go through the records flagged with "Need Review" and decide how to handle them on a case-by-case basis.

This allows you to quickly see which records have potential issues without accidentally deleting data you might need.

Best Practices for Preventing Duplicates

Of course, the best way to deal with duplicates is to prevent them from sneaking into your sheets in the first place. Here are some of my favorite tips for keeping data duplicate-free:

Standardize data entry

Inconsistent data entry is a major source of duplicates. For example, if sometimes you enter "John Smith" and other times "Smith, John", you‘re going to end up with redundant records.

Solve this by establishing standardized formats for common fields like names, addresses, and phone numbers. Use drop-down lists or data validation to ensure team members are entering data in a consistent format.

Use unique identifiers

Add unique identifier columns (like customer IDs, order numbers, or SKUs) to your sheets to make it easy to spot and eliminate duplicates. When you concatenate (combine) multiple columns together into an ID, it‘s easy to check for duplicates using the strategies we covered above.

Set up data validation

Google Sheets offers several data validation techniques to prevent invalid or duplicate data entry. For example, you can use data validation to:

  • Limit data entry to a specific list of options in a drop-down menu
  • Require a unique value in a specific column
  • Restrict numbers to a certain range
  • Validate emails, URLs, and zip codes
  • Show a warning message when invalid data is entered

Learn how to set up data validation rules in our full tutorial.

Deduplicate before importing or combining data

Before bringing new data into your Google Sheets, take a moment to check for duplicates in the source data. If you‘re importing from multiple sources, deduplicate each one separately before combining them.

You can use Excel‘s Remove Duplicates tool, a database query, or a third-party deduplication tool to clean your data before it ever hits your Google Sheet.

Schedule regular data audits

Even with all these preventative measures in place, duplicates happen. Make a plan to regularly audit your important datasets for duplicates using the tactics we‘ve covered.

Exactly how often depends on your data volume and velocity. But in general, aim to check for duplicates at least monthly (more frequently if you‘re dealing with large volumes of data entry or imports).

Spending a little time proactively identifying and dealing with duplicates now saves hours of data cleanup and troubleshooting later.

Deduplication Made Easy

I hope this guide has shown you that duplicate data doesn‘t have to derail your Google Sheets projects. With a few simple techniques and proactive processes, you can keep your data squeaky clean and maintain confidence in your reporting.

To recap, here are the key steps to highlight and handle duplicates in Google Sheets:

  1. Use conditional formatting to automatically highlight duplicates in a single column or across multiple columns
  2. Try filters or pivot tables for a quick visual check of duplicates
  3. Decide whether to remove, merge, or flag the duplicates you find
  4. Put preventative measures in place like data validation and standardized entry formats to stop duplicates at the source

With these strategies in your spreadsheet toolbelt, you‘re well equipped to be a data deduplication dynamo!

I‘d love to hear how you‘re using these techniques in your own work. Share your favorite duplicate-busting tip in the comments below!