| From: | Mike Mascari <mascarm(at)mascari(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Hannu Krosing <hannu(at)tm(dot)ee>, Matthias Urlichs <smurf(at)noris(dot)de>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Berkeley DB... | 
| Date: | 2000-05-26 18:48:22 | 
| Message-ID: | 392EC6F6.91F4537D@mascari.com | 
| Lists: | pgsql-hackers | 
Tom Lane wrote:
> 
> Hannu Krosing <hannu(at)tm(dot)ee> writes:
> >> As Vadim points out in his comparison
> >> of COPY vs. INSERT, something is *wrong* with the time it takes
> >> for PostgreSQL to parse, plan, rewrite, and optimize.
> 
> We might have part of the story in the recently noticed fact that
> each insert/update query begins by doing a seqscan of pg_index.
> 
> I have done profiles of INSERT in the past and not found any really
> spectacular bottlenecks (but I was looking at a test table with no
> indexes, so I failed to see the pg_index problem :-().  Last time
> I did it, I had these top profile entries for inserting 100,000 rows
> of 30 columns apiece:
> 
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  30.08    290.79   290.79                             _mcount
>   6.48    353.46    62.67 30702766     0.00     0.00  AllocSetAlloc
>   5.27    404.36    50.90   205660     0.25     0.25  write
>   3.06    433.97    29.61 30702765     0.00     0.00  MemoryContextAlloc
>   2.74    460.45    26.48   100001     0.26     0.74  yyparse
>   2.63    485.86    25.41 24300077     0.00     0.00  newNode
>   2.22    507.33    21.47  3900054     0.01     0.01  yylex
>   1.63    523.04    15.71 30500751     0.00     0.00  PortalHeapMemoryAlloc
>   1.31    535.68    12.64  5419526     0.00     0.00  hash_search
>   1.18    547.11    11.43  9900000     0.00     0.00  expression_tree_walker
>   1.01    556.90     9.79  3526752     0.00     0.00  SpinRelease
> 
> While the time spent in memory allocation is annoying, that's only about
> ten mallocs per parsed data expression, so it's unlikely that we will be
> able to improve on it very much.  (We could maybe avoid having *three*
> levels of subroutine call to do an alloc, though ;-).)  Unless you are
> smarter than the flex and bison guys you are not going to be able to
> improve on the lex/parse times either.  The planner isn't even showing
> up for a simple INSERT.  Not much left, unless you can figure out how
> to write and commit a tuple with less than two disk writes.
> 
> But, as I said, this was a case with no indexes to update.
> 
> I intend to do something about caching pg_index info ASAP in the 7.1
> cycle, and then we can see how much of a difference that makes...
> 
>                         regards, tom lane
It will be interesting to see the speed difference between the
100,000 INSERTs above and the same inserts PREPARE'd using
Karel Zak's PREPARE patch. Perhaps a generic query cache could be
used to skip the parse/plan/rewrite/optimize stages when the same
exact query is submitted to the database multiple times? I suppose
the cached plans could then be discarded whenever a DDL statement
or a VACUUM ANALYZE is executed? The old Berkeley Postgres docs
spoke about cached query plans *and* results (as well as 64-bit
OIDs, amongst other things). I'm looking forward to when the 7.1
branch occurs... :-)
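[Editor's note: the cache-and-invalidate idea above can be sketched as follows. This is a minimal illustrative model, not PostgreSQL code; the names `PlanCache`, `get_plan`, and `invalidate` are hypothetical. It assumes plans are reused only for byte-identical query text and that any DDL or VACUUM ANALYZE flushes the whole cache, since cached plans may then reference stale schema or statistics.]

```python
# Sketch of a generic query-plan cache keyed on exact query text.
# The expensive parse/rewrite/plan step runs only on a cache miss;
# invalidate() models flushing after DDL or VACUUM ANALYZE.
# All names are illustrative, not actual PostgreSQL internals.

class PlanCache:
    def __init__(self, planner):
        self.planner = planner   # the expensive parse/plan function
        self.cache = {}

    def get_plan(self, query_text):
        # Reuse a cached plan only for byte-identical query text.
        plan = self.cache.get(query_text)
        if plan is None:
            plan = self.planner(query_text)
            self.cache[query_text] = plan
        return plan

    def invalidate(self):
        # After DDL / VACUUM ANALYZE the cached plans may be stale
        # (schema or statistics changed), so drop them all.
        self.cache.clear()

planned = []  # records each time "real" planning happens

def fake_planner(query_text):
    planned.append(query_text)
    return ("PLAN", query_text)

pc = PlanCache(fake_planner)
pc.get_plan("INSERT INTO t VALUES (1)")
pc.get_plan("INSERT INTO t VALUES (1)")   # served from cache
assert len(planned) == 1
pc.invalidate()                           # e.g. after ALTER TABLE t ...
pc.get_plan("INSERT INTO t VALUES (1)")   # must be replanned
assert len(planned) == 2
```

Flushing the entire cache on any DDL is the crudest correct policy; a finer-grained design would track which relations each plan depends on and invalidate selectively.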
Mike Mascari