From: | Alec Mitchell <apm13(at)columbia(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Strange query optimization in 7.3.2 |
Date: | 2003-04-16 01:56:07 |
Message-ID: | 200304151856.07783.apm13@columbia.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Monday 14 April 2003 10:39 pm, Tom Lane wrote:
> Indeed, there's a problem here with nulls --- there is a case in
> eqjoinsel() that didn't account for nulls, and should've. I've applied
> the attached patch to fix it. This reduces the estimate of the tr/t/m
> join size from 1119 to 332, which is still large compared to the true
> size of 52, but it's a lot better. Without the stops table, I can't
> really tell if this changes the complete plan or not --- could you apply
> the patch and find out?
>
> BTW, the remaining error seems to be because the tr.group_num = 1
> constraint skews the fraction of matching "trailer" values. I fear
> there's little chance of persuading the system to figure that out in the
> near future --- it can't be done without cross-column correlation
> statistics, which we do not keep.
I applied the patch to the 7.3.2 sources, but strangely I get a segfault when
I run initdb using the patched PostgreSQL. Specifically I get this error:
creating system views... ok
loading pg_description... /usr/local/pgsql/bin/initdb: line 801: 14633 Done
( cat <<EOF
CREATE TEMP TABLE tmp_pg_description ( objoid oid, classname
name,objsubid int4, description text) WITHOUT OIDS;
COPY tmp_pg_description FROM STDIN;
EOF
cat "$POSTGRES_DESCR"; cat <<EOF
\.
INSERT INTO pg_description SELECT t.objoid, c.oid, t.objsubid,
t.description FROM tmp_pg_description t, pg_class c WHERE c.relname =
t.classname;
EOF
)
14634 Segmentation fault | "$PGPATH"/postgres $PGSQL_OPT template1
>/dev/null
initdb failed.
Removing /var/lib/postgres/data.
This doesn't make much sense to me as the patch is pretty straightforward.
I'm running an up to date Debian Woody distribution, with a 2.4.21-pre6
kernel. Let me know if there is anything I can do to help figure this out.
Having spent so much time thinking about this query and its odd results I've
realized that I there is a query which does not involve the manifests table
but returns the same results and always provides more accurate statistics for
the planner (the stops table is not normalized and has some redundant data,
allowing a few different paths to the same results). It's marginally slower
but should scale better as the stops and manifests tables grow. As a result,
I'm not too worried about the particulars of this optimization anymore
(though looking at the plan and estimates for my new query, an estimate of
332 rows from m should probably result in the faster nested loop plan being
choosen).
Thanks for all your help,
Alec Mitchell
From | Date | Subject | |
---|---|---|---|
Next Message | d0p7heidumd4j02 | 2003-04-16 02:04:36 | Re: OT: mail server blocked |
Previous Message | Stephan Szabo | 2003-04-16 01:51:35 | Re: Performance Problem |