Re: Find duplicates in a column then print Information to a file

From: Martin Moore <martin(dot)moore(at)avbrief(dot)com>
To: Sherman Willden <operasopranos(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Find duplicates in a column then print Information to a file
Date: 2017-12-31 18:23:59
Message-ID: FCF00F10-2F77-48BA-A419-25DA669072C2@avbrief.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: Sherman Willden <operasopranos(at)gmail(dot)com>
Date: Sunday, 31 December 2017 at 18:19
To: <pgsql-general(at)postgresql(dot)org>
Subject: Find duplicates in a column then print Information to a file

Development Platform: Ubuntu 17.10 mainly command line work

Tools: perl 5.26 and postgresql 9.6

Goal: Display duplicate aria titles on screen and to a local file

Database name: arias

Table name: aria_precis

csv delimiter: the # symbol

arias=# \d aria_precis
Table "public.aria_precis"
Column | Type | Modifiers
-------------+------+-----------
id | text | not null
aria | text |
artist | text |
a_artist | text |
album_title | text |
Indexes:
"aria_precis_pkey" PRIMARY KEY, btree (id)

Problems:
1. Can't connect aria title to id
2. Can't write discovered information to file

I know about this link but I probably didn't understand what I read there.
https://www.postgresql.org/docs/current/static/sql-select.html#SQL-WITH

What I know displays the 46 duplicate aria titles of 413 entries but not the id or artist:
SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY aria HAVING COUNT(aria)>1);

When I perform the following I get (0 rows):
SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY id, aria HAVING COUNT(aria)>1);
aria
------
(0 rows)

After I get the above information how do I write that information to a file? The only thing I know writes the entire database to a file:
\COPY aria_precis TO '/home/sherman/aria_precis_2.csv' WITH DELIMITER '#,' CSV HEADER;

Thank you;

Sherman

========================================

On the cmdline just do:

psql arias <pwd> -c ‘SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis GROUP BY aria HAVING COUNT(aria)>1)‘   > outfile.txt

Martin.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Francisco Olarte 2017-12-31 18:53:43 Re: Find duplicates in a column then print Information to a file
Previous Message Sherman Willden 2017-12-31 18:19:15 Find duplicates in a column then print Information to a file