Re: Duplicate records returned

From: Richard Huxton <dev(at)archonet(dot)com>
To: daniel65456(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Duplicate records returned
Date: 2007-06-26 08:52:57
Message-ID: 4680D3E9.7020706@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

daniel65456(at)gmail(dot)com wrote:
> I'm getting duplicate rows returned. I don't know know and can't find
> out how to construct the SQL to return what I want. I have an old
> version of postgres which I cannot upgrade. I'm not even sure how to
> get the version. Does this help?
>
> $ grep -i version ..../pi/bin/setup
> wtversion='2.0'
> export wtversion

Nope - nothing to do with PostgreSQL I'm afraid.

Try issuing "SELECT version()" as an SQL statement.

>
> My SQL knowledge is quite limited so I've searched all over the net
> and read doco but I can't figure this out.
>
> I want to find the total number of hours worked on all projects with
> projects.parent="Projects", projects.pct<100, restrictions.hidden=5
> (not hidden)
>
> I keep getting duplicate records and I think it's it's becuase I'm
> joining a table on 2 other tables. I've tried INTERSECT but I keep
> getting parse errors and can't work out why that is. I've tried
> creating a temp table but still have to do the twin joins!! I've also
> tried creating a view but without success

> but there will be many projects to process and to select the
> applicable projects requires test on 2 other tables, projects and
> restrictions
>
> $the_sql = " SELECT projectname, username, sum(hours)";
> $the_sql .= " FROM timerecs";
> $the_sql .= " WHERE projectname = projects.projectname ";
> $the_sql .= " AND projectname = restrictions.projectname";
> $the_sql .= " AND projects.parent = 'Projects'";
> $the_sql .= " AND projects.pct < 100";
> $the_sql .= " AND restrictions.hidden = 5";
> $the_sql .= " AND projectname = 'Testing'"; # just for tsting
> $the_sql .= " AND username = 'long'"; # just for testing
> $the_sql .= " AND projectname = projects.projectname ";
> $the_sql .= " GROUP BY projectname, username";
> $the_sql .= " ORDER BY projectname, username";
> $the_sql .= " ;";

You might want to read up on "HERE documents" for multi-line blocks of text.

1. You've also not put all your tables into the FROM clause:
FROM timerecs, projects, restrictions
This *should* be generating a warning of some kind
2. You're not qualifying which column comes from which table, which
makes it harder to see what's happening. Try:
FROM timerecs t, projects p, restrictions r
WHERE t.projectname = p.projectname
AND ...
That's called table aliasing, where you give a short name to tables.
3. This query *can't* give duplicates for (projectname,username) pairs
unless you're activating a bug. The GROUP BY eliminates duplicates.

> produces
>
> 1 Testing|long|254
>
> How do I get the right list of projectname from timerecs by joining
> with the projects and restrictions tables?

You've not said what "right" means to you.

> I've tried SELECT DISTINCT projectname but make no difference. If I
> take the 'restrictions' join out it's fine. I've also tried prefacing
> all column names with table names without any change

Try the table aliasing, then post the query again (oh, trim the perl/php
if you could) along with some sample data, the results and what the
results should be. Otherwise there's not much anyone can say.

--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2007-06-26 08:54:00 Re: problem importing data with psql
Previous Message PFC 2007-06-26 08:17:38 Re: a JOIN on same table, but 'slided over'