Re: Per-table random_page_cost for tables that we know are always cached

From: "Gurjeet Singh" <singh(dot)gurjeet(at)gmail(dot)com>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: PFC <lists(at)peufeu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Per-table random_page_cost for tables that we know are always cached
Date: 2008-04-22 18:58:07
Message-ID: 65937bea0804221158p8312949gaa0b7f8ceb758144@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 23, 2008 at 12:11 AM, Decibel! <decibel(at)decibel(dot)org> wrote:

> On Apr 22, 2008, at 5:13 AM, PFC wrote:
>
> > In order to have it use the fast plan I must set random_page_cost to 1,
> > which I absolutely don't want to do.
> > Setting effective_cache_size to huge values has no effect.
> > If I select a value of parent_id that has far fewer children, the index
> > will be used, but in this case I think the threshold is misplaced; it
> > should be slightly higher. Here we have about 5% of values selected. Hash
> > join becomes better at about 15% because the table is cached.
> > This is 8.3.
> >
> > Perhaps there would be a need for a per-object setting
> > (object=table,index,partition) to alter the aggressiveness/laziness of the
> > page flushing and how long the pages for this object are kept in
> > shared_buffers... this would be used to modify random_page_cost on a
> > per-table/index/partition basis.
> >
> > Example : let's imagine a "cache priority" setting.
> >
> > - "cache priority" set to the minimum means this table is mostly
> > write-only
> > - "cache priority" set to default would give current behaviour (which is
> > correct in most cases)
> > - "cache priority" set to a high value would tell Postgres "I know this
> > table/index/partition is small and often accessed rather randomly, so I want
> > you to keep it in shared_buffers, purge it if you must but otherwise keep it
> > in memory, flush something else instead which has lower cache_priority".
> >
> > The optimizer could then use a different (much lower) value of
> > random_page_cost for tables whose "cache priority" is set highest, since
> > it would know those tables are resident in memory.
> >
>
> "cache priority" to me sounds like we're trying to influence caching
> behavior, which isn't what's happening. I do agree that we need a better way
> to tell the planner what tables are in memory.
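
For illustration, the flip PFC describes can be reproduced with a
session-level override; the table and column names here are invented:

    SET random_page_cost = 4;  -- default: seq scan/hash join wins at ~5%
    EXPLAIN SELECT * FROM child WHERE parent_id = 42;

    SET random_page_cost = 1;  -- "random reads are cheap": index scan wins
    EXPLAIN SELECT * FROM child WHERE parent_id = 42;

The trouble, as PFC says, is that this override is per-session (or global),
not per-table, which is exactly the gap being discussed.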
>
> > An alternative would be for the background writer to keep some stats and
> > do this for us:
> >
> > - begin bgwriter scan
> > - set up a hashtable of [relid => page count]
> > - for each page scanned, increment "page count" for that relation
> > (uses very little CPU)
> > - end bgwriter scan
> > - for each relation, compare the number of pages we found in
> > shared_buffers with the number of pages in the relation and draw conclusions
> > about how well cached the relation is
> > - update random_page_cost accordingly for this relation
> >
> > This would not examine whatever is in the OS' cache, though.
> >
>
> Actually, there's no need for the bgwriter to do that; we can just look at
> the hit rate for the object. But we'd also need stats on how often we find
> a relation's pages in the OS cache, and no one has come up with a good
> method for gathering those.
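
Worth noting: much of what that bgwriter scan would collect can already be
read out via SQL. contrib/pg_buffercache exposes the contents of
shared_buffers, so counting a relation's cached pages against
pg_class.relpages is one query (it still can't see the OS cache, as PFC
points out, and relpages is only as fresh as the last VACUUM/ANALYZE):

    SELECT c.relname,
           count(*)   AS cached_pages,
           c.relpages AS total_pages,
           round(100.0 * count(*) / nullif(c.relpages, 0), 1) AS pct_cached
      FROM pg_buffercache b
      JOIN pg_class c ON b.relfilenode = c.relfilenode
     WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                                 WHERE datname = current_database()))
     GROUP BY c.relname, c.relpages
     ORDER BY cached_pages DESC;

And the per-object hit rate Decibel! mentions is already tracked:

    SELECT relname, heap_blks_hit, heap_blks_read,
           round(100.0 * heap_blks_hit
                 / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
      FROM pg_statio_user_tables
     ORDER BY heap_blks_hit + heap_blks_read DESC;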

Something related... a per-relation cost setting would also let users tune
based on the kind of storage an object resides on. Using tablespaces, users
can choose to place some objects on really expensive/really fast storage,
and other (not so hot) objects on slower/cheaper storage.
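
Purely as a sketch of how such a knob might look (this syntax doesn't
exist; the tablespace and table names are invented):

    -- fast but expensive storage gets its own tablespace
    CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';

    -- a per-tablespace cost override: every object placed here gets
    -- planner costs that match the hardware
    ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.5);

    -- a hot, randomly accessed table goes on the fast media
    CREATE TABLE hot_lookup (id int PRIMARY KEY, val text)
        TABLESPACE fast_ssd;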

--
gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device
