Re: Do all Postgres queries touch Shared_Buffers at some point?

From: Shiv Sharma <shiv(dot)sharma(dot)1835(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: Do all Postgres queries touch Shared_Buffers at some point?
Date: 2013-12-29 22:54:42
Message-ID: CA+LWa-LuSYyp9DmOC3j5sPBABFDnUgBo-GG0=V6+JPOR9MZbZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks. We are on Greenplum GP 4.2 (Postgres 8.2). As per GP suggestions,
we have 6 primary/6 mirror instances on each server. The server has 64 G
RAM, and shared_buffers is at ...125 MB :-). I suppose the idea is for the
OS buffer cache to do the legwork.

But still...performance is at least "not bad". If all HASH JOIN queries
touch shared_buffers in some way, I find it non-intuitive that we can have
concurrent hash queries involving big tables (100M+ joined with say 100K),
all apparently using the 125MB shared_buffers in some way, and yet giving
reasonable performance.

Basically what is the anatomy of a hash join involving large tables?
Disk->Shared_Buffers->Hash Join areas? something like that?

On Sun, Dec 29, 2013 at 9:18 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com>wrote:

> On Sun, Dec 29, 2013 at 9:05 PM, Shiv Sharma <shiv(dot)sharma(dot)1835(at)gmail(dot)com>
> wrote:
> > I am puzzled about the extent to which shared_bufferes is used for
> different
> > queries. Do _all_ queries "touch" shared buffers at some point of their
> > execution?
> >
> > Many of our warehouse queries are seq_scan followed by HASH. I know
> > work_mem is assigned for HASH joins: but does this mean that these
> queries
> > never touch shared buffers at *all* during their execution? Perhaps they
> > are read into shared_buffers and then passed into work_mem HASH areas???
> >
> > What about updates on big tables? What about inserts on big tables? What
> > about append-inserts?
> >
> > I think I could get these answers from Explain Analyze Buffers but I am
> on
> > 8.2 :-(
> >
> > Please tell me which queries use/touch shared_buffers in general terms,
> or
> > please point me to documentation.
> All your queries that interacts with relations.
>
> shared_buffers is used for data caching across all the backends of the
> server, to put it simply pages of the relation involved. Such data can
> be relation data, like data of a table you defined yourself, index
> data, or some system catalog data, containing definitions of the
> database objects. So simply everything that is a relation and contains
> physical data might be in shared buffers. Views for example do not
> enter in this category.
>
> You could for example use pg_buffercache to have a look at what
> contains the shared buffers:
> http://www.postgresql.org/docs/devel/static/pgbuffercache.html
>
> Regards,
> --
> Michael
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joe Van Dyk 2013-12-30 05:56:04 Re: Replication failed after stalling
Previous Message Sameer Kumar 2013-12-29 18:13:44 Re: PG replication across DataCenters