Re: set-returning calls and overhead

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: set-returning calls and overhead
Date: 2012-10-30 21:31:20
Message-ID: CAKuK5J2BOGH1d4Bh-fhGZ_OvExJCss=L1EbXD4T6vLrORWrKRw@mail.gmail.com
Lists: pgsql-performance

On Thu, Jul 19, 2012 at 11:07 AM, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> wrote:
> Recently I found myself wondering what was taking a particular query so long.
> I immediately assumed it was a lack of I/O, because lack of I/O is a
> thorn in my side.
> Nope, the I/O was boring. CPU? Well, the process was using 100% of the
> CPU but the query itself was really very simple.
> I turned to ltrace (horribly imprecise, I know). ltrace told me this:
>
>
> % time     seconds  usecs/call     calls function
> ------ ----------- ----------- --------- --------------------
>  46.54    6.789433          69     97766 memcpy
>  28.16    4.108324        1100      3732 strlen
>  14.45    2.107567         564      3732 malloc
>   9.16    1.336108          28     46877 memset
>   0.74    0.107935          28      3732 strcpy
>   0.73    0.107221          28      3732 free
>   0.16    0.023687         187       126 write
>   0.02    0.003587          28       126 __errno_location
>   0.02    0.003075          59        52 read
>   0.01    0.001523          29        52 memcmp
> ------ ----------- ----------- --------- --------------------
> 100.00   14.588460                159927 total
>
>
> and this:
>
> strlen("SRF multi-call context")
> strcpy(0xe01d40, "SRF multi-call context")
> malloc(1024)
> memcpy(...)
> memset(...)
> ...
> memset(...)
> free(..)
>
> repeat.
>
> I was rather surprised to learn that (per-row):
> (1) memcpy of 64 bytes accounted for 46% of the time spent in library calls
> (2) the (other) costs of strlen, strcpy, malloc, and memset were so
> huge (in particular, strlen)
>
> What, if anything, can be done about this? It seems the overhead for
> setting up the memory context for the SRF is pretty high.
> I notice this overhead pretty much every time I use any of the array
> functions like unnest.
>
> Please help me to understand if I'm misinterpreting things here.
>
> [x86_64, Linux, PostgreSQL 9.1.4]

A followup.

Recently, I imported a bunch of data. The data itself was represented
in a way that made more sense - from a relational database perspective
- as multiple tables, so to split it out I made use of
string_to_array and unnest. The initial table creation and COPY ran in
about 30 seconds, but the creation of the new table (create table ...
as select .. unnest(string_to_array(....))) took over 5 minutes - 10
times as long. What is it about the array functions (actually, all
set-returning functions I've tried) that makes them so expensive? The
per-call overhead is enormous in some cases. This is PostgreSQL 9.1.5
on x86_64 (openSUSE 12.2), but the effect has been observed across
several platforms and major/minor releases of PostgreSQL.
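
For concreteness, here is a minimal sketch of the workload I'm
describing. Table and column names are invented, and I'm assuming a
comma delimiter; only the shape of the statements matters:

```sql
-- Hypothetical raw import: one delimited string per row.
-- The COPY into this table is the fast part (~30 seconds).
CREATE TABLE raw_import (id integer, items text);

-- The slow part: one set-returning call per input row.
-- string_to_array splits the string; unnest emits one row per element,
-- and it is this per-row SRF machinery that appears to dominate.
CREATE TABLE normalized AS
SELECT id, unnest(string_to_array(items, ',')) AS item
FROM raw_import;

-- A baseline with no SRFs, for comparison under \timing:
CREATE TABLE baseline AS
SELECT id, items FROM raw_import;
```

Under \timing in psql, the SRF version is roughly 10x slower than the
baseline for me, even though both read the same input.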

> --
> Jon

--
Jon
