Categories
Excel

Find and Remove Duplicate Data

People frequently have to identify duplicated data within a list or table. Doing this manually can be very time-consuming and error-prone. To make this job much easier, you can learn one of Excel’s standard features, conditional formatting.

  1. Highlight Duplicates
  2. Remove Duplicates

Highlight Duplicates using Conditional Formatting

You can use conditional formatting to highlight duplicate values in your data. This can help you visually inspect your data and decide whether to keep or delete the duplicates. To use conditional formatting, follow these steps:

  1. Select the range of cells that you want to check for duplicates.
  2. Click on the Home tab and click on Conditional Formatting in the Styles group.
  3. Click on Highlight Cells Rules
  4. Click on Duplicate Values in the submenu.
  5. In the Duplicate Values dialog box, choose a format to apply to the duplicate cells. You can also change the values option to Unique if you want to highlight only the unique values instead of the duplicates.
  6. Click OK and Excel will apply the formatting to your data.

For older versions (97-2003)

  1. Select Format » Conditional Formatting.
  2. The Conditional Formatting dialog box appears.
  3. Select Formula Is from the top-left pop-up menu.
  4. In the field to its right, enter the following code for a table of data with a range of A1:C5:
    =COUNTIF($A$1:$C$5,A1)>1
  5. Click the Format tab followed by the Patterns tab
  6. Select a color you want applied to visually identify duplicate data.
  7. Click OK to return to the Conditional Formatting dialog box
  8. Click OK again to apply the formatting.

Using the Remove Duplicates feature in the Data tab

The Remove Duplicates command essentially looks for distinct values in each column you selected and then removes all records necessary to end up with a unique list of values in each column.

  1. Select the range of cells that you want to check for duplicates.
  2. Click on the Data tab.
  3. Click on Remove Duplicates in the Data Tools group.
  4. In the Remove Duplicates dialog box, uncheck the columns that you don’t want to use as criteria for finding duplicates.
  5. Click OK and Excel will delete any duplicate rows.

Excel will delete any duplicate entries and display a message telling you how many values were removed and how many values remain as shown in the following figure:

Note: The Remove Duplicate feature will not delete duplicate entries between the columns.


Understanding Workbooks and Worksheets: