Re: Why Postgres doesn't use TID scan?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Vladimir Ryabtsev <greatvovan(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Why Postgres doesn't use TID scan?
Date: 2018-12-18 01:40:37
Message-ID: 31790.1545097237@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Vladimir Ryabtsev <greatvovan(at)gmail(dot)com> writes:
> I want to clean a large log table by chunks. I write such a query:
> delete from categorization.log
> where ctid in (
> select ctid from categorization.log
> where timestamp < now() - interval '2 month'
> limit 1000
> )

> Why does this query want to use Seq Scan and Sort on a 423M rows table?

There's no support for using ctid as a join key in this way; specifically,
nodeTidscan.c doesn't have support for being a parameterized inner scan,
nor does tidpath.c have code to generate such a plan. The header comments
for the latter say

* There is currently no special support for joins involving CTID; in
* particular nothing corresponding to best_inner_indexscan(). Since it's
* not very useful to store TIDs of one table in another table, there
* doesn't seem to be enough use-case to justify adding a lot of code
* for that.

Queries like yours are kinda sorta counterexamples to that, but pretty
much all the ones I've seen seem like crude hacks (and this one is not
an exception). Writing a bunch of code to support them feels like
solving the wrong problem. Admittedly, it's not clear to me what the
right problem to solve instead would be.

(It's possible that I'm overestimating the amount of new code that would
be needed to implement this, however. indxpath.c is pretty huge, but
that's mostly because there are so many cases to consider. There'd only
be one interesting case for an inner TID scan. Also, this comment is
ancient, predating the current approach with parameterized paths ---
in fact best_inner_indexscan doesn't exist as such anymore. So maybe
that old judgment that it'd take a lot of added code is wrong.)

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vladimir Ryabtsev 2018-12-18 01:54:19 Re: Why Postgres doesn't use TID scan?
Previous Message Vladimir Ryabtsev 2018-12-18 01:16:08 Why Postgres doesn't use TID scan?