Quick Links

Re: init_sequence spill to hash table

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: init_sequence spill to hash table
Date:	2013-11-14 23:33:30
Message-ID:	CAApHDvoAmJWtRQy03O33ijmw+tPwQLaQnDxrcRd8=OpSJEVVUg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 14.11.2013 14:38, David Rowley wrote:
>
>> I've just completed some more benchmarking of this. I didn't try dropping
>> the threshold down to 2 or 0 but I did tests at the cut over point and
>> really don't see much difference in performance between the list at 32 and
>> the hashtable at 33 sequences. The hash table version excels in the 16000
>> sequence test in comparison to the unpatched version.
>>
>> Times are in milliseconds of the time it took to call currval() 100000
>> times for 1 sequence.
>> Patched Unpatched increased by 1 in cache 1856.452 1844.11 -1% 32
>> in
>> cache 1841.84 1802.433 -2% 33 in cache 1861.558 not tested N/A 16000 in
>> cache 1963.711 10329.22 426%
>>
>
> If I understand those results correctly, the best case scenario with the
> current code takes about 1800 ms. There's practically no difference with N
> <= 32, where N is the number of sequences touched. The hash table method
> also takes about 1800 ms when N=33. The performance of the hash table is
> O(1), so presumably we can extrapolate from that that it's the same for any
> N.
>
> I think that means that we should just completely replace the list with
> the hash table. The difference with a small N is lost in noise, so there's
> no point in keeping the list as a fast path for small N. That'll make the
> patch somewhat simpler.
> - Heikki
>

I had thought that maybe the biggest type of workloads might only touch 1
or 2 sequences, though it may be small but I had thought there would be an
overhead in both cycles and memory usage in creating a hash table for these
light usages of sequence backends. It would certainly make the patch more
simple by removing this and it would also mean that I could remove the
sometimes unused ->next member from the SeqTableData struct which is just
now set to NULL when in hash table mode. If you think it's the way to go
then I can make the change, though maybe I'll hold off the refactor for now
as it looks like other ideas have come up around rel cache.

Regards

David Rowley

In response to

Re: init_sequence spill to hash table at 2013-11-14 14:03:41 from Heikki Linnakangas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tomas Vondra	2013-11-14 23:33:43	Re: strncpy is not a safe version of strcpy
Previous Message	Rod Taylor	2013-11-14 23:25:37	Re: GIN improvements part2: fast scan