Re: Partitioning update-heavy queue with hash partitions vs partial indexes

From: Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
To: burcinyazici(at)gmail(dot)com
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Partitioning update-heavy queue with hash partitions vs partial indexes
Date: 2023-08-11 11:09:57
Message-ID: CANsFX07v2kZdfS7X41yHD6tMSx3YzWna765XzeZ8OuL8t-3wrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

@David Rowley
Thanks for explaining.

@Burcin
I'm not too fond of the new column because it introduces a new index that
needs to be maintained.
I could change the index on `view_time` to `(shard_id, view_time)`, but I'm
afraid that may increase the time to traverse the index (because all
shard_id will be the same in a query).
And maybe increase index size, but there is index compression for duplicate
values on last versions.

Thank you

On Fri, Aug 11, 2023 at 7:14 AM <burcinyazici(at)gmail(dot)com> wrote:

> hi,
> Consider adding id%10 as a new column?you will have one time burden but
> after creating index on it, update perf will satisfy.
>
> Burcin 📱
>
> > On 11 Aug 2023, at 07:49, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >
> > On Thu, 10 Aug 2023 at 20:36, Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
> wrote:
> >>> Do Not Use Partial Indexes as a Substitute for Partitioning
> >>> While a search in this larger index might have to descend through a
> couple more tree levels than a search in a smaller index, that's almost
> certainly going to be cheaper than the planner effort needed to select the
> appropriate one of the partial indexes. The core of the problem is that the
> system does not understand the relationship among the partial indexes, and
> will laboriously test each one to see if it's applicable to the current
> query.
> >>
> >> Would this be true in my case too?
> >
> > Yes. The process of determining which partial indexes are valid for
> > the given query must consider each index one at a time and validate
> > the index's WHERE clause against the query's WHERE clause to see if it
> > can be used. There is no shortcut that sees you have a series of
> > partial indexes with WHERE id % 10 = N; which just picks 1 index
> > without searching all of them.
> >
> >> Is it faster for the planner to select a correct partition(hash
> partitioning on `id` column) instead of a correct partial index like in my
> case? I don't think I'll need more than ~32 partitions/partial-indexes in
> an extreme scenario.
> >
> > I mean, test it and find out, but probably, yes, the partition pruning
> > code for hash partitioning is an O(1) operation and is very fast.
> > Once the given Constants have been hashed, finding the partition is
> > just a single divide operation away.
> >
> > David
> >
> >
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Les 2023-08-15 20:23:26 slow delete
Previous Message burcinyazici 2023-08-11 05:13:55 Re: Partitioning update-heavy queue with hash partitions vs partial indexes