I am looking for records with duplicate keys, so I am running this query:
SELECT fileid, COUNT(*)
FROM file
GROUP BY fileid
HAVING COUNT(*) > 1;
The table has a non-unique index on fileid, so I am surprised that
Postgres is doing a sequential scan. The database is over 15 GB, and the
table contains a number of fairly large string columns, so I would
expect scanning the index to be much faster than scanning the table.
Any thoughts on that? Can a sequential scan really be faster than an
index scan here? Is there a reason the planner needs anything other
than the index to answer this query?
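
In case it is useful, this is the kind of comparison I intend to run (a sketch; the EXPLAIN options and enable_seqscan are standard Postgres features, and the query is the same one as above):

EXPLAIN (ANALYZE, BUFFERS)
SELECT fileid, COUNT(*)
FROM file
GROUP BY fileid
HAVING COUNT(*) > 1;

-- Discourage the sequential scan so the planner shows its
-- index-based alternative, then compare the two plans' costs:
SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT fileid, COUNT(*)
FROM file
GROUP BY fileid
HAVING COUNT(*) > 1;
RESET enable_seqscan;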