Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitrios Apostolou <jimis(at)gmx(dot)net>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches
Date: 2023-01-31 15:43:07
Message-ID: CAKFQuwaUy=P_pNGNgcoyFcNExrRSqwhVTUE-SfUrGbj08BzLRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Jan 31, 2023 at 8:28 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Dimitrios Apostolou <jimis(at)gmx(dot)net> writes:
> > The question is why this simple query is taking so long to complete.
>
> Do you get the same 10 rows when you repeat the command?
>
> On the basis of the limited info you provided, I'm going to guess that
>
> (1) there are huge subranges of the table containing no live rows,
> so that a seqscan might have to pass over many blocks before it finds
> some to return;
>

It feels like there is room for improvement here using table statistics and
the visibility map to significantly reduce the number of pages retrieved
that turn out to be all dead. A limit without an order is not a query to
optimize for of course...

> (2) once you do reach an area having live rows, the next SELECT picks
> up scanning in that same area due to the effects of
> "synchronize_seqscans", so you get immediate answers until you reach
> the next desert of dead tuples.
>

Interesting, I was assuming that the behavior on repeated calls was more
due to intelligent buffer pool usage. When doing a sequential scan (I
suppose the key word being sequential...) the system decides which pages to
fetch before checking to see if they are already in shared buffers (as
opposed to returning the ones in shared buffers first then fetching more if
needed)?

David J.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message jacktby@gmail.com 2023-01-31 16:12:44 Re: Re: How could I elog the tupleTableSlot to the fronted terminal?
Previous Message Christophe Pettus 2023-01-31 15:42:48 Re: SELECT * FROM huge_table LIMIT 10; Why does it take more than 10 min to complete, with cold caches