Data Loss from SQL SELECT (vs. COPY/pg_dump)

From: Kong Man <kong_mansatiansin(at)hotmail(dot)com>
To: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Data Loss from SQL SELECT (vs. COPY/pg_dump)
Date: 2013-04-05 19:56:16
Message-ID: DUB116-W8909170CBFD4431563CE808BDA0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I am troubled to find out that a SELECT statement produces fewer rows than the actual row count and have not been able to answer myself as to why. I hope someone could help shedding some light to this.

I attempted to generate a set of INSERT statements, using a the following SELECT statement, against my translations data to reuse elsewhere, but then realized the row count was 8 rows fewer than the source of 2,178. COPY and pg_dump don't seem to lose any data. So, I compare the results to identify the missing data as follows. I don't even see any strange encoding in those missing data.

What scenario could have caused my SELECT query to dump out the 8 blank rows, instead of the expected data?

Here is how I find the discrepancy:
===============================================================================
$ psql -c "CREATE TABLE new_translation AS
SELECT display_name, name, type, translation
FROM translations t JOIN lang l USING (langid)
WHERE display_name = 'SPANISH_CORP'
ORDER BY display_name, name"
SELECT 2178

$ psql -tAc "SELECT
'INSERT INTO new_translation VALUES ('
||quote_literal(display_name)||
', '||quote_literal(name)||
', '||quote_literal(type)||
', '||quote_literal(translation)||');'
FROM new_translation
ORDER BY display_name, name" >/tmp/new_translation-select.sql

$ pg_dump --data-only --inserts --table=new_translation clubpremier |
sed -n '/^INSERT/,/^$/p' >/tmp/new_translation-pg_dump.sql

$ grep ^INSERT /tmp/new_translation-pg_dump.sql | wc -l
2178

$ grep ^INSERT /tmp/new_translation-select.sql | wc -l
2170

$ diff /tmp/new_translation-select.sql /tmp/new_translation-pg_dump.sql
27c27
<
---
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'AGENCY_IN_USE_BY_COBRAND', NULL, 'La cuenta no puede ser eliminada porque está siendo utilizada actualmente por la co-marca #cobrand#');
506c506
<
---
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'CAR_DISTANCE_UNIT', NULL, 'MILLAS');
1115c1115
<
---
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'HOTEL_PROMO_TEXT', 'label', NULL);
1131,1134c1131,1134
<
<
<
<
---
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'INSURANCE_SEARCH_ADVERTISEMENT_SECTION_ONE', 'checkout', NULL);
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'INSURANCE_SEARCH_ADVERTISEMENT_SECTION_THREE', 'checkout', NULL);
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'INSURANCE_SEARCH_ADVERTISEMENT_SECTION_TWO', 'checkout', NULL);
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'INSURANCE_SEARCH_FOOTER', 'checkout', NULL);
1615c1615
<
---
> INSERT INTO new_translation VALUES ('SPANISH_CORP', 'PAGE_FORGOT_PASSWORD', 'page_titles', NULL);
2215a2216
>
===============================================================================

Thank you in advance for your help,
-Kong

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Kong Man 2013-04-05 22:51:38 Re: Data Loss from SQL SELECT (vs. COPY/pg_dump)
Previous Message Kaleeswaran Velu 2013-04-04 17:54:54 Re: Postgres trigger issue with update statement in it.