From: | Rich Shepard <rshepard(at)appl-ecosys(dot)com> |
---|---|
To: | "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Bulk inserts into two (related) tables |
Date: | 2019-05-22 14:38:29 |
Message-ID: | alpine.LNX.2.20.1905220725100.1484@salmo.appl-ecosys.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, 22 May 2019, Jeremy Finzel wrote:
> There's absolutely no need to use anything beyond SQL here, though you
> could if you want to.
Jeremy,
This is a new experience for me so I didn't think of a SQL solution.
> I really wonder how much we are just talking past each other simply because
> we don't know what your data looks like, so we can't show you how our
> examples apply to your use case. If you provided a sample scrubbed data
> file, this whole thread probably would have been much shorter :). Can you
> do that?
Not necessary; see below. Also, these data come from a regulator and
provided as an Excel spreadsheet. If they were extracted from a database
then that was very poorly designed because there's no consistency in how
fields/columns are formatted. This requires manual cleaning.
Each row in the source file (exported from the spreadsheet as .csv and
renamed to .txt for processing in emacs and awk) is a mixture of attributes
that belong in either or both of the organization and people tables in my
database. An awk script will extract the appropriate fields for each table.
> You told Francisco that the data file does not have a unique org name that
> could be used as a unique organization identifier. However you seem to
> have contradicted that by responding favorably to this solution:
The org_name is not the PK; the org_id is. This was assigned by postgres
when the original rows were inserted. Now, I can add the org_id in the
values to be inserted as I know the maximum org_id number in that table.
> INSERT INTO PEOPLE (id, name, org_id) VALUES (1, ‘Bob’, (SELECT org_id FROM
> org WHERE org_name=‘Main Office’))
Question: do I use this same syntax for each row to be inserted or can I
make it one long insert statement by separating the parenthesized values
with commas as I do when I update multiple rows in a table?
Thanks very much,
Rich
From | Date | Subject | |
---|---|---|---|
Next Message | Adrian Klaver | 2019-05-22 14:39:48 | Re: distinguish update from insert (on conflict) |
Previous Message | Pierre Couderc | 2019-05-22 14:30:12 | Re: how to write correctly this update ? |