Quick Links

Re: Sequential scans

From:	Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	Gregory Stark <stark(at)enterprisedb(dot)com>, Simon Riggs <simon(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Sequential scans
Date:	2007-05-02 21:28:26
Message-ID:	4639027A.7080809@enterprisedb.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Jeff Davis wrote:
> On Wed, 2007-05-02 at 20:02 +0100, Heikki Linnakangas wrote:
>> Gregory Stark wrote:
>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>>>
>>>> Let's use a normal hash table instead, and use a lock to protect it. If we only
>>>> update it every 10 pages or so, the overhead should be negligible. To further
>>>> reduce contention, we could modify ReadBuffer to let the caller know if the
>>>> read resulted in a physical read or not, and only update the entry when a page
>>>> is physically read in. That way all the synchronized scanners wouldn't be
>>>> updating the same value, just the one performing the I/O. And while we're at
>>>> it, let's use the full relfilenode instead of just the table oid in the hash.
>>> It's probably fine to just do that. But if we find it's a performance
>>> bottleneck we could probably still manage to avoid the lock except when
>>> actually inserting a new hash element. If you just store in the hash an index
>>> into an array stored in global memory then you could get away without a lock
>>> on the element in the array.
>>>
>>> It starts to get to be a fair amount of code when you think about how you
>>> would reuse elements of the array. That's why I suggest only looking at this
>>> if down the road we find that it's a bottleneck.
>> Another trick you could do is to use acquire the lock conditionally when
>> updating it. But I doubt it's a problem anyhow, if we put some sane
>> lower limit in there so that it's not used at all for small tables.
>>
>
> The more sophisticated the data structure the less able we are to avoid
> locking, correct? For instance, if we have an LRU list it might be
> tricky or impossible to avoid locking even on just reads.

Agreed. I'm not concerned about reads, though. You only need to read
from the structure once when you start a scan. It's the updates that
cause most of the traffic.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Re: Sequential scans at 2007-05-02 21:11:48 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2007-05-02 21:28:39	Re: Optimization in convert_string_datum?
Previous Message	Tom Lane	2007-05-02 21:13:08	Re: pg_index updates and SI invalidation