Question: quickly excluding set B from set A (a JOIN between two large tables)

From:	"zhang(dot)wensheng" <zhang(dot)wensheng(at)foxmail(dot)com>
To:	"pgsql-zh-general(at)postgresql(dot)org" <pgsql-zh-general(at)postgresql(dot)org>
Cc:	德哥 <digoal(at)126(dot)com>, 李海龙 <hailong(dot)li(at)qunar(dot)com>, 汪洋(平安科技数据库技术支持部经理室) <WANGYANG102(at)pingan(dot)com(dot)cn>, held911(at)163(dot)com
Subject:	Question: quickly excluding set B from set A (a JOIN between two large tables)
Date:	2017-03-24 09:41:37
Message-ID:	5cd216a3-79ff-c7f3-5b2c-c864beab451a@foxmail.com
Lists:	pgsql-zh-general

Hi everyone,

We have a query like the following running in production. Table A has about 73 million rows and table B has nearly 10 million:

SELECT A.p_key
FROM A
WHERE NOT EXISTS
    (SELECT 1
     FROM B
     WHERE B.f_key = A.p_key)
LIMIT 1000;

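Even after filtering, both result sets are large: A yields anywhere from a few hundred thousand to over a million rows, and B yields anywhere from 0 to a few hundred thousand. I now want to quickly exclude every row in B's result from A.

This is the query plan from one of my test environments: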
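To make the intended semantics concrete, here is a minimal sketch that runs the same NOT EXISTS shape against toy data. It uses Python's built-in sqlite3 module rather than PostgreSQL, and the rows are invented purely for illustration; it only shows what the query is meant to return, not how it performs.

```python
import sqlite3

# Toy stand-ins for tables A and B; the rows are made up for illustration.
a_keys = [1, 2, 3, 4, 5]
b_keys = [2, 4, 4]  # duplicates in B must not affect the result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (p_key INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE B (f_key INTEGER)")
conn.executemany("INSERT INTO A (p_key) VALUES (?)", [(k,) for k in a_keys])
conn.executemany("INSERT INTO B (f_key) VALUES (?)", [(k,) for k in b_keys])

# Same shape as the production query: keep rows of A with no match in B.
rows = conn.execute("""
    SELECT A.p_key
    FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.f_key = A.p_key)
    ORDER BY A.p_key
    LIMIT 1000
""").fetchall()
remaining = [row[0] for row in rows]

# The anti-join is a set difference: A minus B.
expected = sorted(set(a_keys) - set(b_keys))
print(remaining, expected)
```

The result matches the plain set difference, which is the "A without B" behaviour the query is after.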

Limit  (cost=2.04..18.76 rows=1 width=12) (actual time=0.987..512.000 rows=1000 loops=1)
  ->  Nested Loop Anti Join  (cost=2.04..18.76 rows=1 width=12) (actual time=0.987..511.811 rows=1000 loops=1)
        ->  Index Scan using dba_users_male_idx_1 on users u  (cost=1.48..13.74 rows=1 width=36) (actual time=0.959..396.402 rows=38109 loops=1)
              SubPlan 1
                ->  Function Scan on select_contact_user_ids q  (cost=0.25..0.76 rows=100 width=4) (actual time=0.177..0.178 rows=3 loops=1)
        ->  Index Scan using relationships_user_id_other_user_id_idx on relationships r  (cost=0.56..2.79 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=38109)
Planning time: 2.217 ms
Execution time: 512.458 ms
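As a rough reading of where the time goes, the figures in the plan can be broken down with a little arithmetic. The sketch below copies the numbers from the plan and relies on the fact that EXPLAIN ANALYZE reports "actual time" as a per-loop average, so the inner scan's time has to be multiplied by its loop count. The variable names are mine, and because the inner time is rounded to 0.003 ms, the inner total is only approximate.

```python
# Figures copied from the EXPLAIN ANALYZE output above.
outer_scan_ms = 396.402      # Index Scan on users u (loops=1), total
outer_rows = 38109           # rows the outer scan produced
inner_per_loop_ms = 0.003    # Index Scan on relationships r, per-loop average
inner_loops = 38109          # one inner probe per outer row
anti_join_ms = 511.811       # Nested Loop Anti Join, total
rows_returned = 1000         # rows that survived the anti-join (LIMIT 1000)

# Per-loop averages must be multiplied by loops to get a total.
inner_total_ms = inner_per_loop_ms * inner_loops

# Share of the join's time spent in the outer scan alone.
outer_share = outer_scan_ms / anti_join_ms

# How many outer rows were probed for each row finally returned.
outer_rows_per_result = outer_rows / rows_returned

# Average cost of producing one outer row.
outer_ms_per_row = outer_scan_ms / outer_rows

print(f"inner probes total: ~{inner_total_ms:.0f} ms")
print(f"outer scan share:   {outer_share:.0%}")
print(f"outer rows per returned row: {outer_rows_per_result:.1f}")
print(f"outer scan cost per row:     {outer_ms_per_row:.4f} ms")
```

On these numbers the outer index scan on users accounts for roughly three quarters of the runtime, and about 38 outer rows are probed for every row that is returned, so the inner index probes are not the dominant cost in this test run.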

Version 9.6 added support for bloom filters, so I created the following index, but its performance is far worse than a UNIQUE BTREE index:
CREATE INDEX ON B USING bloom(id,f_key)
    WHERE state != 'default' OR other_state = 'disliked';
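For background on the data structure itself, a Bloom filter answers membership questions probabilistically: "not present" is always correct, while "present" may be a false positive and has to be rechecked against the real data. The sketch below is a generic Python illustration of that property only. It is not how PostgreSQL's bloom index is implemented, the class and parameter names are hypothetical, and the filter is deliberately undersized so that false positives show up.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical keys standing in for B.f_key: the even numbers below 2000.
b_keys = list(range(0, 2000, 2))
bloom = BloomFilter()
for key in b_keys:
    bloom.add(key)

# Keys that were added are never reported absent.
false_negatives = [k for k in b_keys if not bloom.might_contain(k)]
# Odd keys were never added, yet some are reported as possibly present.
false_positives = [k for k in range(1, 2000, 2) if bloom.might_contain(k)]

print(len(false_negatives), len(false_positives))
```

Every "possibly present" answer therefore costs an extra check against the underlying rows, which a unique B-tree lookup on a single key does not need.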

The current runtime of more than 500 ms is basically unacceptable. Does anyone have a good way to solve this problem?

------------------
张文升

Responses

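Re: Reply: Question: quickly excluding set B from set A (a JOIN between two large tables) at 2017-03-24 10:25:02 from zhang.wensheng
Re: Reply: Question: quickly excluding set B from set A (a JOIN between two large tables) at 2017-03-24 10:39:13 from zhang.wensheng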
