From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Mark kirkwood <markir(at)slingshot(dot)co(dot)nz> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Re : Solaris Performance - Profiling |
Date: | 2002-04-02 16:02:51 |
Message-ID: | 25091.1017763371@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Mark kirkwood <markir(at)slingshot(dot)co(dot)nz> writes:
> I am a little concerned that the cpu time recored by the profiler seems
> to be a bit on the short side at 4047.53 seconds (67 minutes), as
> opposed to 119.2 for the backend itself. Are these two numbers supposed
> to match closely ? (they did for shorter queries..)
Hmm. Where exactly did you get those numbers from? I see 4118.54 sec
as the total CPU accounted for in the profile.
Some versions of gprof hide the time spent in the profiler subroutine,
leading to discrepancies between the time accounted for in the profile
and the actual process CPU time. However, yours doesn't seem to be one
of them --- internal_mcount is right there.
> Anyway in the hope that this captured output is interesting enough, here
> it is...or rather here is some of it (biggish file), so I am leaving a
> complete copy under :
> http://homepages.slingshot.co.nz/~markir/tar/solaris/
> Let me know if further information is likely to be helpful. I have
> profiled the same query (on the same dataset) on Freebsd and have the
> output available at :
> http://homepages.slingshot.co.nz/~markir/tar/freebsd/
Hmm. Assuming that the profile data is trustworthy and the queries are
indeed the same (did you compare EXPLAIN output?), it seems that
Solaris' problem is a spectacularly bad qsort() implementation.
The FreeBSD profile shows:
1 qsort [29]
0.00 0.31 1/2 load_password_cache [44]
0.00 0.31 1/2 tuplesort_performsort [45]
[29] 9.8 0.01 0.62 2+1 qsort [29]
0.01 0.62 148033/148033 qsort_comparetup [30]
1 qsort [29]
where Solaris has
0.00 1387.23 1/2 load_password_cache [18]
0.00 1387.23 1/2 tuplesort_performsort [19]
[4] 68.5 0.00 2774.46 2 qsort [4]
19.73 2754.58 1/1 qst [5]
0.01 0.14 88012/1680561053 qsort_comparetup [6]
0.00 0.00 1/61473 .umul [154]
-----------------------------------------------
19.73 2754.58 1/1 qsort [4]
[5] 68.5 19.73 2754.58 1 qst [5]
167.74 2586.81 1680473041/1680561053 qsort_comparetup [6]
0.02 0.00 59630/59703 .udiv [114]
0.01 0.00 59630/61473 .umul [154]
and all the rest of the top profile entries are explained by the fact
that qsort_comparetup is called 1.68 billion times instead of 148K
times.
Can these really be the same dataset? Can Solaris' qsort really be that
outstandingly incompetent? How many rows are actually being sorted
here, anyway?
It might be entertaining to snarf a qsort off the net (from glibc,
perhaps) and link it into Postgres to see if you get better results.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | David Esposito | 2002-04-02 16:12:10 | Re: Core dump on PG 7.1.3 |
Previous Message | Doug McNaught | 2002-04-02 15:56:36 | Re: v7.2.1 Released: Critical Bug Fix |
From | Date | Subject | |
---|---|---|---|
Next Message | Reinoud van Leeuwen | 2002-04-02 16:09:26 | status of IPv6 Support for INET/CIDR types |
Previous Message | Doug McNaught | 2002-04-02 15:56:36 | Re: v7.2.1 Released: Critical Bug Fix |