Re: Problem with default partition pruning

From: yuzuko <yuzukohosoya(at)gmail(dot)com>
To: "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>
Cc: Yuzuko Hosoya <hosoya(dot)yuzuko(at)lab(dot)ntt(dot)co(dot)jp>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problem with default partition pruning
Date: 2019-03-05 08:35:59
Message-ID: CAKkQ509Wu28uM6iWxnjH+p3CZOe3fQ8uO8nUTNa=hPLAiawOBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Imai-san,

Thanks for sharing your tests!

On Thu, Feb 28, 2019 at 5:27 PM Imai, Yoshikazu
<imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com> wrote:
>
> Hosoya-san
>
> On Wed, Feb 27, 2019 at 6:51 AM, Yuzuko Hosoya wrote:
> > > From: Amit Langote [mailto:Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp]
> > > Sent: Wednesday, February 27, 2019 11:22 AM
> > >
> > > Hosoya-san,
> > >
> > > On 2019/02/22 17:14, Yuzuko Hosoya wrote:
> > > > Hi,
> > > >
> > > > I found the bug of default partition pruning when executing a range
> > query.
> > > >
> > > > -----
> > > > postgres=# create table test1(id int, val text) partition by range
> > > > (id); postgres=# create table test1_1 partition of test1 for values
> > > > from (0) to (100); postgres=# create table test1_2 partition of
> > > > test1 for values from (150) to (200); postgres=# create table
> > > > test1_def partition of test1 default;
> > > >
> > > > postgres=# explain select * from test1 where id > 0 and id < 30;
> > > > QUERY PLAN
> > > > ----------------------------------------------------------------
> > > > Append (cost=0.00..11.83 rows=59 width=11)
> > > > -> Seq Scan on test1_1 (cost=0.00..5.00 rows=58 width=11)
> > > > Filter: ((id > 0) AND (id < 30))
> > > > -> Seq Scan on test1_def (cost=0.00..6.53 rows=1 width=12)
> > > > Filter: ((id > 0) AND (id < 30))
> > > > (5 rows)
> > > >
> > > > There is no need to scan the default partition, but it's scanned.
> > > > -----
> > > >
> > > > In the current implement, whether the default partition is scanned
> > > > or not is determined according to each condition of given WHERE
> > > > clause at get_matching_range_bounds(). In this example,
> > > > scan_default is set true according to id > 0 because id >= 200
> > > > matches the default partition. Similarly, according to id < 30,
> > scan_default is set true.
> > > > Then, these results are combined according to AND/OR at
> > perform_pruning_combine_step().
> > > > In this case, final result's scan_default is set true.
> > > >
> > > > The modifications I made are as follows:
> > > > - get_matching_range_bounds() determines only offsets of range bounds
> > > > according to each condition
> > > > - These results are combined at perform_pruning_combine_step()
> > > > - Whether the default partition is scanned or not is determined at
> > > > get_matching_partitions()
> > > >
> > > > Attached the patch. Any feedback is greatly appreciated.
> > >
> > > Thank you for reporting. Can you please add this to March CF in Bugs
> > > category so as not to lose
> > track
> > > of this?
> > >
> > > I will try to send review comments soon.
> > >
> > Thank you for your reply. I added this to March CF.
>
> I tested with simple use case and I confirmed it works correctly like below.
>
> In case using between clause:
> postgres=# create table test1(id int, val text) partition by range (id);
> postgres=# create table test1_1 partition of test1 for values from (0) to (100);
> postgres=# create table test1_2 partition of test1 for values from (150) to (200);
> postgres=# create table test1_def partition of test1 default;
>
> [HEAD]
> postgres=# explain analyze select * from test1 where id between 0 and 50;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------
> Append (cost=0.00..58.16 rows=12 width=36) (actual time=0.008..0.008 rows=0 loops=1)
> -> Seq Scan on test1_1 (cost=0.00..29.05 rows=6 width=36) (actual time=0.005..0.005 rows=0 loops=1)
> Filter: ((id >= 0) AND (id <= 50))
> -> Seq Scan on test1_def (cost=0.00..29.05 rows=6 width=36) (actual time=0.002..0.002 rows=0 loops=1)
> Filter: ((id >= 0) AND (id <= 50))
>
>
> [patched]
> postgres=# explain analyze select * from test1 where id between 0 and 50;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------
> Append (cost=0.00..29.08 rows=6 width=36) (actual time=0.006..0.006 rows=0 loops=1)
> -> Seq Scan on test1_1 (cost=0.00..29.05 rows=6 width=36) (actual time=0.004..0.005 rows=0 loops=1)
> Filter: ((id >= 0) AND (id <= 50))
>
>
>
> I considered about another use case. If default partition contains rows whose id = 300 and then we add another partition which have constraints like id >= 300 and id < 400, I thought we won't scan the rows anymore. But I noticed we simply can't add such a partition.
>
> postgres=# insert into test1 values (300);
> INSERT 0 1
> postgres=# create table test1_3 partition of test1 for values from (300) to (400);
> ERROR: updated partition constraint for default partition "test1_def" would be violated by some row
>
>
> So I haven't come up with bad cases so far :)

I didn't test cases you mentioned.
Thanks to you, I could check correctness of the patch!

--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuro Yamada 2019-03-05 08:37:17 Re: [HACKERS] CLUSTER command progress monitor
Previous Message David Steele 2019-03-05 08:35:13 Re: Re: SQL:2011 PERIODS vs Postgres Ranges?