Re: Slow query, possibly not using index

From: Les <nagylzs(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Slow query, possibly not using index
Date: 2023-08-27 17:39:40
Message-ID: CAKXe9UDcDfQxCSJYXh9cNMUdfYhx=R7axicGbcm7s9peXOp-7g@mail.gmail.com
Lists: pgsql-performance

> > If I try to select a single unused block this way:
> > explain analyze select id from media.block b where nrefs =0 limit 1
> > then it runs for more than 10 minutes (I'm not sure how long, I cancelled
> > the query after 10 minutes).
>
> Are you sure it isn't blocked on a lock?
>

Yes, I'm sure. I created a standalone database instance from a ZFS
snapshot and ran the query on that instance; it was the only client.

> Another theory is that the index contains many thousands of references
> to now-dead rows, and the query is vainly searching for a live entry.
> Given that EXPLAIN thinks there are only about 2300 live entries,
> and yet you say the index is 400MB, this seems pretty plausible.
>

Nobody ever deleted anything from this table. Since it was created, this
has been a write-only table.

> Have you disabled autovacuum, or something like that? (REINDEX
> could help here, at least till the index gets bloated again.)
>
I did not disable autovacuum.
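
For what it's worth, this can be verified from the catalogs rather than from memory; a quick sketch (standard views, only the table name is specific to my schema):

```sql
-- Confirm autovacuum is enabled globally...
SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';

-- ...and not disabled per-table via storage options (reloptions is NULL
-- unless something like autovacuum_enabled=false was set on the table).
SELECT relname, reloptions FROM pg_class WHERE relname = 'block';
```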

>
> You might think that even so, it shouldn't take that long ... but
> indexes on UUID columns are a well known performance antipattern.
> The index entry order is likely to have precisely zip to do with
> the table's physical order, resulting in exceedingly-random access
> to the table, which'll be horribly expensive when the table is so
> much bigger than RAM. Can you replace the UUID column with a simple
> serial (identity) column?
>

I'm aware of the problems with random UUID values. I have been using this
function to generate ULIDs from the beginning:

CREATE OR REPLACE FUNCTION public.gen_ulid()
RETURNS uuid
LANGUAGE sql
AS $function$
SELECT (lpad(to_hex(floor(extract(epoch FROM clock_timestamp()) * 1000)::bigint), 12, '0')
        || encode(gen_random_bytes(10), 'hex'))::uuid;
$function$;
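
As a sanity check, this is a sketch of how I verify the ordering (it assumes the layout produced by the function above, where the first 6 bytes of the uuid hold milliseconds since the epoch):

```sql
-- Decode the 48-bit millisecond timestamp from the leading 12 hex digits
-- of the uuid and compare it against the id ordering.
SELECT id,
       to_timestamp(
         (('x' || substr(replace(id::text, '-', ''), 1, 12))::bit(48)::bigint)
         / 1000.0
       ) AS embedded_ts
FROM media.block
ORDER BY id
LIMIT 10;
```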

If I order rows by their id values, their creation times are strictly
ascending. I did not write this function; it was taken from this blog post:

https://blog.daveallie.com/ulid-primary-keys

They have a benchmark section where they show that these ULID values are
slower to generate (at least with this implementation) but much faster to
insert.

I might be able to replace these with int8 values, I need to check.
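
If that check works out, a minimal sketch of what the replacement could look like (the new column name is hypothetical, and backfilling plus swapping the primary key would be separate, heavier steps):

```sql
-- Hypothetical sketch: add a bigint identity column alongside the uuid key.
-- New rows get sequential values; existing rows would need a backfill pass.
ALTER TABLE media.block
    ADD COLUMN id_new bigint GENERATED BY DEFAULT AS IDENTITY;
```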

>
> > I believe it is not actually using the index, because reading a single
> > (random?) entry from an index should not run for >10 minutes.
>
> You should believe what EXPLAIN tells you about the plan shape.
> (Its rowcount estimates are only estimates, though.)
>

All 40M rows in this table are live. I'm certain of this, because rows
have never been deleted from this table.

I can try to VACUUM this table, but I'm limited on resources; I think it
would take days. Maybe I could dump the whole database and restore it on
another machine. Would that eliminate dead rows? (Is there a way to check
the number of dead rows?)
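
Regarding the last question, I believe the statistics views give an estimate without scanning the table:

```sql
-- Estimated live/dead tuple counts and last vacuum activity for media.block.
-- n_dead_tup is an estimate maintained by the stats collector.
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'media' AND relname = 'block';
```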

Regards,

Laszlo
