| From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> | 
|---|---|
| To: | Andy Fan <zhihuifan1213(at)163(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com> | 
| Cc: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: using extended statistics to improve join estimates | 
| Date: | 2024-04-02 18:22:55 | 
| Message-ID: | 2a9604a5-520d-48ee-b3c8-5f3342b607d3@enterprisedb.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 4/2/24 10:23, Andy Fan wrote:
> 
>> On Wed, Mar 02, 2022 at 11:38:21AM -0600, Justin Pryzby wrote:
>>> Rebased over 269b532ae and muted compiler warnings.
> 
> Thank you Justin for the rebase!
> 
> Hello Tomas,
> 
> Thanks for the patch! Before I review the path at the code level, I want
> to explain my understanding about this patch first.
> 
If you want to work on this patch, that'd be cool. A review would be
great, but if you want to maybe take over and try moving it forward,
that'd be even better. I don't know when I'll have time to work on it
again, but I'd promise to help you with working on it.
> Before this patch, we already use MCV information for the eqjoinsel, it
> works as combine the MCV on the both sides to figure out the mcv_freq
> and then treat the rest equally, but this doesn't work for MCV in
> extended statistics, this patch fill this gap. Besides that, since
> extended statistics means more than 1 columns are involved, if 1+
> columns are Const based on RestrictInfo, we can use such information to
> filter the MCVs we are interesting, that's really cool. 
> 
Yes, I think that's an accurate description of what the patch does.
> I did some more testing, all of them are inner join so far, all of them
> works amazing and I am suprised this patch didn't draw enough
> attention. I will test more after I go though the code.
> 
I think it didn't go forward for a bunch of reasons:
1) I got distracted by something else requiring immediate attention, and
forgot about this patch.
2) I got stuck on some detail of the patch, unsure which of the possible
solutions to try first.
3) Uncertainty about how applicable the patch is in practice.
I suppose it was some combination of these reasons, not sure.
As for the "practicality" mentioned in (3), it's been a while since I
worked on the patch so I don't recall the details, but I think I've been
thinking mostly about "start join" queries, where a big "fact" table
joins to small dimensions. And in that case the fact table may have a
MCV, but the dimensions certainly don't have any (because the join
happens on a PK).
But maybe that's a wrong way to think about it - it was clearly useful
to consider the case with (per-attribute) MCVs on both sides as worth
special handling. So why not to do that for multi-column MCVs, right?
> At for the code level, I reviewed them in the top-down manner and almost
> 40% completed. Here are some findings just FYI. For efficiency purpose,
> I provide each feedback with a individual commit, after all I want to
> make sure my comment is practical and coding and testing is a good way
> to archive that. I tried to make each of them as small as possible so
> that you can reject or accept them convinently.
> 
> 0001 is your patch, I just rebase them against the current master. 0006
> is not much relevant with current patch, and I think it can be committed
> individually if you are OK with that.
> 
> Hope this kind of review is helpful.
> 
Cool! There's obviously no chance to get this into v18, and I have stuff
to do in this CF. But I'll take a look after that.
regards
-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Daniel Gustafsson | 2024-04-02 18:31:48 | Re: Fix resource leak (src/backend/libpq/be-secure-common.c) | 
| Previous Message | Ranier Vilela | 2024-04-02 18:13:19 | Fix resource leak (src/backend/libpq/be-secure-common.c) |