Find Duplicate Records In Table Without Unique Id
Last updated
Was this helpful?
Last updated
Was this helpful?
I recently came across a couple methods for listing out instances of duplicate records in a table where the table doesn't have an explicit unique identifier. Here is that explains how to do this when a unique identifier is present.
If the table doesn't have an explicit primary key or other uniquely identifying value, then we'll have to get some help from — namely the ctid
.
The ctid
is:
The physical location of the row version within its table.
Let's use the example of the mailing_list
table with potential duplicate email
values.
Here is the :
This uses a subquery to find the first occurrence of every unique email and then deletes the rest. The ctid
is the unique value that we can call the min
aggregate on.
A :
This uses delete using
to join the table against itself as a cartesian product to compare every entry to every other entry.