From: | Eli Murray <ejmurra2(at)illinimedia(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Creating composite keys from csv |
Date: | 2015-03-09 15:12:41 |
Message-ID: | CAHReO_vxos=mOHRJweHai8TQAkJXytBKe9mUwKs2As9yJVw0ww@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Thank you all for your help. I'm following along with John McKown's
suggestion but when I run the update query I get "UPDATE 32956" but the
personid column in my rawdata table has null values for every record.
Here's the exact query I ran:
UPDATE rawdata SET personid = (SELECT personid FROM assignid WHERE
rawdata.personid = assignid.personid);
I think the issue might be that it's only selecting records where personid
is the same in both tables and right now there are only null values in
rawdata.personid. What query should I write to SET rawdata.personid =
assignid.personid WHERE rawdata.employeename && rawdata.totalsalary =
assignid.name && assignid.totalsalary?
On Mon, Mar 9, 2015 at 8:36 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 03/08/2015 08:49 PM, Eli Murray wrote:
>
>> Hi all,
>>
>> I'm a student journalist working on a project for our student paper
>> which lists salaries and positions for every staff member at the
>> university. We received the data from an FOI request but the university
>> is refusing to give us primary keys for the data.
>>
>> The issue we've run into is that if there are two staff members with the
>> same name (and there are) our current web app adds their salaries
>> together and considers them one person. Now, luckily, we can create a
>> composite key if we combine their name column with their salary column.
>> Unfortunately, the format of the data we have makes it more difficult
>> than that (of course!) because some employees can hold multiple paying
>> positions.
>>
>> Here's some example data:
>>
>> Name, Position, Salary,Total Salary, ...
>> Jane Doe, Dean, 100.000, 148.000, ...
>> John Locke, Custodian, 30.000, 30.000, ...
>> Jane Doe, Academic Adviser, 48.000, 148.000, ...
>> Jane Doe, Trainer, 46.000, 46.000, ...
>>
>> Basically, what we'd like to do is create a serial primary key but
>> instead of having it increment every row, it needs to check the name and
>> total salary columns and only increment if that person doesn't already
>> exist. If they do exist, it should just assign the previously created
>> number to the column.
>>
>
> Well the above is not going to work, because the id would not be unique
> across rows and therefore could not be a primary key. If I am following
> what you want is a staff id that identifies a particular staff member
> across rows and is derived from the (Name, Total Salary) combination, is
> that correct? If so you could use a serial column to generate a surrogate
> primary key for each row without worrying about the names and total salary.
> Then it becomes an issue of generating the staff id for unique staff
> members. For that I would see John McKowns answer.
>
> However, our team is small and between us we have
>
>> very little experience working with databases and we haven't found a way
>> to accomplish this goal yet. In fact, we may be trying to solve this in
>> the wrong way entirely.
>>
>> So, to put it succinctly, how would you approach this problem? What are
>> our options? Do we need to write a script to clean the data into
>> separate csv tables before we import it to postgres, or is this
>> something we can do in postgres? We'd really appreciate any help you all
>> may be able to offer.
>>
>> Best!
>> Eli Murray
>>
>>
>>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
--
Reporter at The Daily Illini
ejmurra2(at)illinimedia(dot)com
(815) 985-8760
From | Date | Subject | |
---|---|---|---|
Next Message | pinker | 2015-03-09 15:21:24 | Re: VACUUM FULL doesn't reduce table size |
Previous Message | pinker | 2015-03-09 15:05:36 | Re: VACUUM FULL doesn't reduce table size |