Re: Urgent - SQL left join bug?

From: Kaijiang Chen <chenkaijiang(at)gmail(dot)com>
To: Pantelis Theodosiou <ypercube(at)gmail(dot)com>
Cc: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Urgent - SQL left join bug?
Date: 2017-06-21 08:28:06
Message-ID: CAAkGvS8jCN53tL4RT-xNX9ubZf=93NiosdUB7pQ5JgZunb6ywg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The correct SQL should be:

select pha.id,ph.herb_id,pha.name,ph.weight
from prescription_herbs as ph left join pharmacy_herbs as pha on
ph.herb_id=pha.herb_id *and pha.pharmacy_id=22*
where ph.prescription_id=116285 and ph.deleted_at is null and
pha.deleted_at is null order by ph.herb_id;

On Wed, Jun 21, 2017 at 4:26 PM, Kaijiang Chen <chenkaijiang(at)gmail(dot)com>
wrote:

> Hi, Pantelis and Heikki, thank you very much for such a quick response!
>
> I got it. I was so silly.....
>
> On Wed, Jun 21, 2017 at 4:23 PM, Pantelis Theodosiou <ypercube(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Wed, Jun 21, 2017 at 9:06 AM, Kaijiang Chen <chenkaijiang(at)gmail(dot)com>
>> wrote:
>>
>>> Hi, I'm running PostgreSQL 9.4.10 on CentOS 6.5. It looks like that I
>>> found a bug with left join. It is very URGENT since it is running in the
>>> production servers.
>>>
>>> ### Conditions: ###
>>>
>>> I have 2 tables:
>>>
>>> TABLE 1: (2171209 records)
>>> \d prescription_herbs
>>> Table
>>> "public.prescription_herbs"
>>> Column | Type |
>>> Modifiers
>>> ---------------------+--------------------------------+-----
>>> ------------------------------------------------------------
>>> id | integer | not null default
>>> nextval('prescription_herbs_id_seq'::regclass)
>>> prescription_id | integer | not null
>>> herb_id | integer | not null
>>> weight | integer | not null
>>> created_at | timestamp(0) without time zone | not null
>>> updated_at | timestamp(0) without time zone | not null
>>> deleted_at | timestamp(0) without time zone |
>>> price | numeric(10,5) |
>>> special_manufacture | character varying(255) |
>>> cost | numeric(10,5) |
>>> pharmacy_id | integer |
>>> Indexes:
>>> "prescription_herbs_pkey" PRIMARY KEY, btree (id)
>>> "prescription_herbs_hid" btree (herb_id)
>>> "prescription_herbs_prid" btree (prescription_id)
>>>
>>> TABLE 2: (4406 records)
>>>
>>> \d pharmacy_herbs
>>> Table "public.pharmacy_herbs"
>>> Column | Type |
>>> Modifiers
>>> -------------+--------------------------------+-------------
>>> ------------------------------------------------
>>> id | integer | not null default
>>> nextval('pharmacy_herbs_id_seq'::regclass)
>>> pharmacy_id | integer |
>>> herb_id | integer |
>>> cost | numeric(10,5) |
>>> price | numeric(10,5) |
>>> no | character varying(255) |
>>> deleted_at | timestamp(0) without time zone |
>>> created_at | timestamp(0) without time zone | not null
>>> updated_at | timestamp(0) without time zone | not null
>>> name | character varying(255) |
>>> Indexes:
>>> "pharmacy_herbs_pkey" PRIMARY KEY, btree (id)
>>> "pharmacy_herbs_herb_id" btree (herb_id)
>>>
>>> ### BUG: ###
>>>
>>> I ran a SQL:
>>> select pha.id,ph.herb_id,pha.name,ph.weight
>>> from prescription_herbs as ph *left join* pharmacy_herbs as pha on
>>> ph.herb_id=pha.herb_id
>>> where ph.prescription_id=116285 and ph.deleted_at is null and
>>> pha.deleted_at is null and pha.pharmacy_id=22 order by ph.herb_id;
>>>
>>> Expected:
>>>
>>> It should have 10 rows because the SQL "select * from prescription_herbs
>>> as ph where ph.prescription_id=116285 and ph.deleted_at is null" returned
>>> 10 rows and I'm using *LEFT JOIN *in the above SQL.
>>>
>>> Actual Result:
>>>
>>> It returned only 9 rows and the result is the same as JOIN (not LEFT
>>> JOIN).
>>>
>>> ### More info: ###
>>>
>>> I explain the SQL:
>>> explain select pha.id,ph.herb_id,pha.name,ph.weight
>>> from prescription_herbs as ph left join pharmacy_herbs as pha on
>>> ph.herb_id=pha.herb_id
>>> where ph.prescription_id=116285 and ph.deleted_at is null and
>>> pha.deleted_at is null and pha.pharmacy_id=22 order by ph.herb_id;
>>>
>>> I got:
>>>
>>> Sort (cost=131.73..131.76 rows=10 width=24)
>>> Sort Key: ph.herb_id
>>> -> * Hash Join* (cost=122.02..131.56 rows=10 width=24)
>>> Hash Cond: (ph.herb_id = pha.herb_id)
>>> -> Index Scan using prescription_herbs_prid on
>>> prescription_herbs ph (cost=0.43..9.68 rows=23 width=8)
>>> Index Cond: (prescription_id = 116285)
>>> Filter: (deleted_at IS NULL)
>>> -> Hash (cost=113.08..113.08 rows=681 width=20)
>>> -> Seq Scan on pharmacy_herbs pha (cost=0.00..113.08
>>> rows=681 width=20)
>>> Filter: ((deleted_at IS NULL) AND (pharmacy_id =
>>> 22))
>>>
>>> I think the above* "Hash Join" SHOULD BE "Hash Left Join"*, right?
>>>
>>> I tried to explain another SQL:
>>> explain select * from doctors d left join prescriptions p on d.id
>>> =p.doctor_id;
>>>
>>> I got:
>>> Hash Right Join (cost=2159.33..31453.58 rows=130330 width=2936)
>>> Hash Cond: (p.doctor_id = d.id)
>>> -> Seq Scan on prescriptions p (cost=0.00..9273.30 rows=130330
>>> width=495)
>>> -> Hash (cost=576.37..576.37 rows=5037 width=2441)
>>> -> Seq Scan on doctors d (cost=0.00..576.37 rows=5037
>>> width=2441)
>>>
>>> The "Hash Right Join" is the correct node.
>>>
>>> Any help is very appreciated! WAITING...
>>>
>>> Thanks,
>>> Kaijiang
>>>
>>>
>> This is not a bug. The conditions of the "right" table pha - basically
>> (pha.pharmacy_id=22) and secondary ((pha.deleted_at is null) )- that you
>> have put in the WHERE clause make the query act as if it was an INNER join.
>>
>> Move them to the ON and you'll get your 10 rows.
>>
>> select ...
>> from prescription_herbs as ph *left join* pharmacy_herbs as pha
>> on ph.herb_id=pha.herb_id and pha.deleted_at is null and
>> pha.pharmacy_id=22
>> where ph.prescription_id=116285 and ph.deleted_at is null
>> order by ...
>>
>> The (pha.deleted_at is null) part may give different results depending on
>> where it is placed (ON vs WHERE) but my guess - since you want all the 10
>> rows of the left table - is that it should be in ON, too
>>
>> Pantelis
>>
>
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kaijiang Chen 2017-06-21 08:30:01 Re: Urgent - SQL left join bug?
Previous Message Kaijiang Chen 2017-06-21 08:26:51 Re: Urgent - SQL left join bug?