| From: | Mike Mascari <mascarm(at)mascari(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Hannu Krosing <hannu(at)tm(dot)ee>, Matthias Urlichs <smurf(at)noris(dot)de>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Berkeley DB... | 
| Date: | 2000-05-26 18:48:22 | 
| Message-ID: | 392EC6F6.91F4537D@mascari.com | 
| Lists: | pgsql-hackers | 
Tom Lane wrote:
> 
> Hannu Krosing <hannu(at)tm(dot)ee> writes:
> >> As Vadim points out in his comparison
> >> of COPY vs. INSERT, something is *wrong* with the time it takes
> >> for PostgreSQL to parse, plan, rewrite, and optimize.
> 
> We might have part of the story in the recently noticed fact that
> each insert/update query begins by doing a seqscan of pg_index.
> 
> I have done profiles of INSERT in the past and not found any really
> spectacular bottlenecks (but I was looking at a test table with no
> indexes, so I failed to see the pg_index problem :-().  Last time
> I did it, I had these top profile entries for inserting 100,000 rows
> of 30 columns apiece:
> 
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  30.08    290.79   290.79                             _mcount
>   6.48    353.46    62.67 30702766     0.00     0.00  AllocSetAlloc
>   5.27    404.36    50.90   205660     0.25     0.25  write
>   3.06    433.97    29.61 30702765     0.00     0.00  MemoryContextAlloc
>   2.74    460.45    26.48   100001     0.26     0.74  yyparse
>   2.63    485.86    25.41 24300077     0.00     0.00  newNode
>   2.22    507.33    21.47  3900054     0.01     0.01  yylex
>   1.63    523.04    15.71 30500751     0.00     0.00  PortalHeapMemoryAlloc
>   1.31    535.68    12.64  5419526     0.00     0.00  hash_search
>   1.18    547.11    11.43  9900000     0.00     0.00  expression_tree_walker
>   1.01    556.90     9.79  3526752     0.00     0.00  SpinRelease
> 
> While the time spent in memory allocation is annoying, that's only about
> ten mallocs per parsed data expression, so it's unlikely that we will be
> able to improve on it very much.  (We could maybe avoid having *three*
> levels of subroutine call to do an alloc, though ;-).)  Unless you are
> smarter than the flex and bison guys you are not going to be able to
> improve on the lex/parse times either.  The planner isn't even showing
> up for a simple INSERT.  Not much left, unless you can figure out how
> to write and commit a tuple with less than two disk writes.
> 
> But, as I said, this was a case with no indexes to update.
> 
> I intend to do something about caching pg_index info ASAP in the 7.1
> cycle, and then we can see how much of a difference that makes...
> 
>                         regards, tom lane
It will be interesting to see the speed difference between the
100,000 INSERTs above and the same inserts PREPARE'd using
Karel Zak's PREPARE patch. Perhaps a generic query cache could be
used to skip the parse/plan/rewrite/optimize stages when the same
exact query is submitted to the database multiple times? I suppose
the cached plans could then be discarded whenever a DDL statement
or a VACUUM ANALYZE is executed? The old Berkeley Postgres docs
spoke about cached query plans *and* results (as well as 64-bit
OIDs, amongst other things). I'm looking forward to when the 7.1
branch occurs... :-)
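[Editor's note: the cache-and-invalidate idea above can be sketched as follows. This is a minimal illustrative model, not PostgreSQL code; the names `PlanCache`, `get_plan`, and `invalidate` are hypothetical. It assumes plans are reused only for byte-identical query text and that any DDL or VACUUM ANALYZE flushes the whole cache, since cached plans may then reference stale schema or statistics.]

```python
# Sketch of a generic query-plan cache keyed on exact query text.
# The expensive parse/rewrite/plan step runs only on a cache miss;
# invalidate() models flushing after DDL or VACUUM ANALYZE.
# All names are illustrative, not actual PostgreSQL internals.

class PlanCache:
    def __init__(self, planner):
        self.planner = planner   # the expensive parse/plan function
        self.cache = {}

    def get_plan(self, query_text):
        # Reuse a cached plan only for byte-identical query text.
        plan = self.cache.get(query_text)
        if plan is None:
            plan = self.planner(query_text)
            self.cache[query_text] = plan
        return plan

    def invalidate(self):
        # After DDL / VACUUM ANALYZE the cached plans may be stale
        # (schema or statistics changed), so drop them all.
        self.cache.clear()

planned = []  # records each time "real" planning happens

def fake_planner(query_text):
    planned.append(query_text)
    return ("PLAN", query_text)

pc = PlanCache(fake_planner)
pc.get_plan("INSERT INTO t VALUES (1)")
pc.get_plan("INSERT INTO t VALUES (1)")   # served from cache
assert len(planned) == 1
pc.invalidate()                           # e.g. after ALTER TABLE t ...
pc.get_plan("INSERT INTO t VALUES (1)")   # must be replanned
assert len(planned) == 2
```

Flushing the entire cache on any DDL is the crudest correct policy; a finer-grained design would track which relations each plan depends on and invalidate selectively.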
Mike Mascari