Re: Why Postgres doesn't use TID scan?

From: Vladimir Ryabtsev <greatvovan(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Why Postgres doesn't use TID scan?
Date: 2018-12-18 01:54:19
Message-ID: CAMqTPq=4Qp80Qt11DdHjjzM4ksp7O0wTmDQexSUbo2sEoTk+hQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I can't believe it.
I see some recommendations in Internet to do like this (e.g.
https://stackoverflow.com/questions/5170546/how-do-i-delete-a-fixed-number-of-rows-with-sorting-in-postgresql
).
Did it really work in 2011? Are you saying they broke it? It's a shame...

Anyway I think the problem is pretty clear: I want to eventually clear the
table based on the predicate but I don't want to lock it for a long time.
The table does not have a primary key.
What should be a proper solution?

--
Vlad

пн, 17 дек. 2018 г. в 17:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Vladimir Ryabtsev <greatvovan(at)gmail(dot)com> writes:
> > I want to clean a large log table by chunks. I write such a query:
> > delete from categorization.log
> > where ctid in (
> > select ctid from categorization.log
> > where timestamp < now() - interval '2 month'
> > limit 1000
> > )
>
> > Why does this query want to use Seq Scan and Sort on a 423M rows table?
>
> There's no support for using ctid as a join key in this way; specifically,
> nodeTidscan.c doesn't have support for being a parameterized inner scan,
> nor does tidpath.c have code to generate such a plan. The header comments
> for the latter say
>
> * There is currently no special support for joins involving CTID; in
> * particular nothing corresponding to best_inner_indexscan(). Since it's
> * not very useful to store TIDs of one table in another table, there
> * doesn't seem to be enough use-case to justify adding a lot of code
> * for that.
>
> Queries like yours are kinda sorta counterexamples to that, but pretty
> much all the ones I've seen seem like crude hacks (and this one is not
> an exception). Writing a bunch of code to support them feels like
> solving the wrong problem. Admittedly, it's not clear to me what the
> right problem to solve instead would be.
>
> (It's possible that I'm overestimating the amount of new code that would
> be needed to implement this, however. indxpath.c is pretty huge, but
> that's mostly because there are so many cases to consider. There'd only
> be one interesting case for an inner TID scan. Also, this comment is
> ancient, predating the current approach with parameterized paths ---
> in fact best_inner_indexscan doesn't exist as such anymore. So maybe
> that old judgment that it'd take a lot of added code is wrong.)
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2018-12-18 02:32:51 Re: Why Postgres doesn't use TID scan?
Previous Message Tom Lane 2018-12-18 01:40:37 Re: Why Postgres doesn't use TID scan?