From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | amit(dot)kapila16(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Use unique index for longer pathkeys. |
Date: | 2014-07-14 10:32:40 |
Message-ID: | 20140714.193240.181822165.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you for reviewing, the revised patch will come later.
At Mon, 14 Jul 2014 11:01:52 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in <CAA4eK1+6b6Wjwf51oZMrL+mKFH8xUp9J-pEhQvoR8SE7sWyTWw(at)mail(dot)gmail(dot)com>
> On Fri, Jun 13, 2014 at 1:11 PM, Kyotaro HORIGUCHI <
> horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >
> > Hello, This is the continuation from the last CF.
> >
> > This patch intends to make PG to use index for longer pathkeys
> > than index columns when,
> >
> > - The index is a unique index.
> > - All index columns are NOT NULL.
> > - The index column list is a subset of query_pathkeys.
> >
> > ====
> >
> > This patch consists of mainly three parts.
(snip)
>
> I have spent some time on this patch and would like to share my
> findings as below with you.
>
> 1.
> It seems to me that the new flag (IndexOptInfo.full_ordered) introduced in
> this patch is not getting set properly. Please refer below test:
Yeah, probably you omitted the second condition above.
> create table t (a int, b int, c int, d text);
The following definition instead will make this work. NULLs
cannot be unique.
| create table t (a int not null, b int not null, c int, d text);
> create unique index i_t_pkey on t(a, b);
> insert into t (select a % 10, a / 10, a, 't' from generate_series(0,
> 100000) a);
> analyze;
> explain (costs off, analyze on) select distinct * from t;
...
> 2.
> typedef struct IndexOptInfo
> bool predOK; /* true if predicate matches query */
> bool unique; /* true if a unique index */
> bool immediate; /* is uniqueness enforced immediately? */
> + bool full_ordered; /* don't key columns duplicate? */
>
> It seems you have forgotten to update out function _outIndexOptInfo().
Mmm, it's surely what I didn't intended to make:( , but the
member actually works in collect_common_primary_pathkeys.
The correct (desirable) code should use (allnotnull && unique)
instead of (full_ordered). I'll fix this in the comming version
later but they are equivelant in functionality. It's not a
mistake of forgetting update out (so broken), but branching from
wrong point (so it works contrary to the expectation):)
> 3.
> + root->distinct_pathkeys =
> + shorten_pathkeys_following(root, root->distinct_pathkeys,pk_guide,
> + true);
> +
> + root->query_pathkeys =
> + shorten_pathkeys_following(root,root->query_pathkeys,pk_guide,
> + true);
>
> Add a space after each parameter in function call.
Thank you for pointing out.
> 4.
> +PathKeysComparison
> +compare_pathkeys_ignoring_strategy(List *keys1, List *keys2)
>
> a. This function return 4 different type of values (PATHKEYS_EQUAL,
> PATHKEYS_DIFFERENT, PATHKEYS_BETTER1, PATHKEYS_BETTER2),
> however the caller just checks for PATHKEYS_EQUAL, isn't it better to
> just
> return either PATHKEYS_EQUAL or PATHKEYS_DIFFERENT.
Hmm.. I agree that it is practically removable, but I think it is
necessary from the view of similarity to the function with the
similar name. Perhaps I want another name which don't suggest the
similarity to compare_pathekeys(), say, bool
pathkeys_are_equal_ignoring_strategy() to change the rerurn
values set.
> b. Another idea could be that instead of writting separate function,
> pass an additional parameter to compare_pathkeys() to distinguish
> for ignore strategy case.
It sounds reasonable. According to my faint memory, the reason
why the function is separated from the original one is, perhaps,
simplly a editorial reason.
> c. In any case, I think it is good to add comments on top of newly
> introduced function (compare_pathkeys_ignoring_strategy) or extend
> the comments of compare_pathkeys() if you want to add a new parameter
> to it.
Ouch, thank you for pointing out. I'll add comment if it remains
in the next patch.
> > Finally, the new patch has grown up to twice in size.. Although
> > it seems a bit large, I think that side-effects are clearly
> > avoided.
>
> I am bit worried about the extra cycles added by this patch as compare
> to your previous patch; example
> During trimming of paths, it will build index paths (build_index_pathkeys())
> which it will anyway do again during creation of index paths:
> create_index_paths()->get_index_paths()->build_index_paths()->build_index_pathkeys()
I felt the same thing. On stupid measure to reduce cycles would
be caching created pathkeys in IndexOptInfo. I'll try in the next
patch.
> I am not sure if this has direct impact, however I am suspecting trimming
> logic can add quite a few instructions even for cases when it is not
> beneficial.
> At this moment, I also don't have better idea, so lets proceed with review
> of this
> patch unless there is any other better way to achieve it.
I suppose there's some shortcut which can exclude the cases where
obviously there's no use of pathkeys trimming, for example, using
only pathkey lengths. I'll seek for it.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2014-07-14 10:46:43 | Re: Incorrect comment in postgres_fdw.c |
Previous Message | Shigeru Hanada | 2014-07-14 10:31:58 | Re: Incorrect comment in postgres_fdw.c |