BUG #8334: Merge Join drops records

From: euclid(at)yandex-team(dot)ru
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8334: Merge Join drops records
Date: 2013-07-25 16:16:22
Message-ID: E1V2ODG-00026n-Kk@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 8334
Logged by: Euclid Nikiforov
Email address: euclid(at)yandex-team(dot)ru
PostgreSQL version: 9.2.0
Operating system: Linux 3.8.0-11-na
Description:

Hi all

Recently investigating problem in our backend query result, I have found
strange behaviour.

The exact query was:
SELECT r.*, state FROM revision.object_revision r JOIN revision.commit ON id
IN (32397,32403,32404) AND commit_id=id WHERE ((commit_id=32397 AND
(object_id=279638691 OR object_id=279638692)) OR (commit_id=32403 AND
object_id IN (279638118,279638128,279638970)) OR (commit_id=32404 AND
object_id=279638966));

It expected to return 6 rows, but returned only 3 when enable_mergejoin is
'on'
EXPLAIN ANALYZE shows Merge Join at final stage
while with Nested Loop (set enable_mergejoin=off;) everything is fine.

Since data is very huge, here is small synthetic way to reproduce the
behaviuor
In actual environment we don't set any restrictions on planner.
If you manipulate enable_mergejoin or enable_sort, you will see how results
differ.
Seems that merge join fails because of unsorted result from inner loops

Regards
Euclid
---------EXAMPLE--------
SET enable_mergejoin=on;
SET enable_nestloop=off;
SET enable_hashjoin=off;
SET enable_seqscan=off;
SET enable_sort=off;

CREATE TABLE testcase.c
(
id bigint NOT NULL UNIQUE,
s bigint,
CONSTRAINT cpk PRIMARY KEY (id)
);

CREATE TABLE testcase.r
(
c_id bigint NOT NULL,
o_id bigint NOT NULL,
CONSTRAINT rpk PRIMARY KEY (c_id, o_id)
);

CREATE INDEX rpkc ON testcase.r USING btree (c_id);

DELETE FROM testcase.c;
INSERT INTO testcase.c values(1,0);
INSERT INTO testcase.c values(2,0);
INSERT INTO testcase.c values(3,0);

DELETE FROM testcase.r;
INSERT INTO testcase.r values(1,1);
INSERT INTO testcase.r values(2,2);
INSERT INTO testcase.r values(2,3);
INSERT INTO testcase.r values(3,4);
INSERT INTO testcase.r values(3,5);
INSERT INTO testcase.r values(3,6);

SELECT s, r.* FROM testcase.r r JOIN testcase.c ON id IN (1, 2, 3) AND
id=r.c_id
WHERE ((c_id=2 AND (o_id=2 OR o_id=3)) OR
(c_id=3 AND o_id in (4,5,6)) OR (c_id=1 AND o_id=1))

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2013-07-25 16:18:18 Re: BUG #8271: Configure warning: sys/ucred.h: present but cannot be compiled
Previous Message Andres Freund 2013-07-25 16:08:40 Re: BUG #8271: Configure warning: sys/ucred.h: present but cannot be compiled