From: | Aidar Uldanov <aidar(at)donorbox(dot)org> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18690: A count function returns wrong value when using FDW |
Date: | 2024-11-06 15:10:57 |
Message-ID: | CAOCMfz=GKA+90BC0LXyuGYPeTT5u1mF_aVxXHHBpG+zBfL_Y=g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
David, after some troubleshooting I found out that the issue with FWD and
bigint values, so when column type int then it works well but after
changing to a bigint type a count() function returns the wrong value.
Here is how to reproduce that
-- ===> remote db schema and data
create table orgs(id int, name varchar, provider_id int);
create table recurring_payments(
id int primary key,
org_id int
);
create table transactions(
id serial primary key,
amount int,
org_id int,
recurring_payment_id int REFERENCES recurring_payments(id)
);
-- populate remote db
insert into orgs(id, name, provider_id) values (1, 'one', 11), (2, 'two',
22);
insert into recurring_payments(id, org_id) values (1, 1), (2, 1), (3, 2);
insert into transactions(amount, org_id, recurring_payment_id) values (1,
1, 1), (2, 1, null), (3, 1, null), (4, 1, 1);
-- <=== remote db
-- ===> host db
-- connect back to base database and create server and fdw
CREATE EXTENSION postgres_fdw;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'test3');
CREATE USER MAPPING FOR postgres
SERVER foreign_server
OPTIONS (user 'postgres', password '12345');
CREATE FOREIGN TABLE orgs_f (
id int, name varchar, provider_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'orgs');
CREATE FOREIGN TABLE recurring_payments_f (
id int, org_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'recurring_payments');
CREATE FOREIGN TABLE transactions_f (
id serial,
amount bigint,
org_id int,
recurring_payment_id int
)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'transactions');
create view transactions_v as
select t.*, o.provider_id as real_org_id
from transactions_f t
join orgs_f o on o.id = t.org_id;
create view recurring_payments_v as
select rp.*, o.provider_id as real_org_id
from recurring_payments_f rp
join orgs_f o on o.id = rp.org_id;
-- This one returns (1 + 4) / 2 = 2.5 which is correct
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11;
-- wrapping this to count gives 2 which is not correct
select count(*) from (
select avg(t.amount)
from transactions_v t
join recurring_payments_v rp on rp.id = t.recurring_payment_id
where t.real_org_id = 11 and rp.real_org_id = 11
);
On Wed, Nov 6, 2024 at 3:47 PM Aidar Uldanov <aidar(at)donorbox(dot)org> wrote:
> Thanks David
>
> I am trying to reproduce it on a new db but I couldn't so far, though I
> was able to simplify the queries
>
> ```select count(*)
> FROM (
> SELECT AVG(t.usd_amount_cents / 100) AS average
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
>
> );
> ```
> count
> -------
> 22429
> (1 row)
>
> ```
> select count(*)
> FROM (
> SELECT AVG(t.usd_amount_cents / 100) AS average
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid'
>
> );
> ```
> count
> -------
> 1
> (1 row)
>
> Those queries are the same and both should return 1 because the inner
> query returns just one AVG value. The difference between those two queries
> is `t.org_id = 1 AND r.org_id = 1`
>
>
> Here are SQL explains
>
> explain verbose
> select count(*)
> FROM (
> SELECT AVG(t.usd_amount_cents / 100) AS average
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.org_id = 1 AND r.org_id = 1 and t.status = 'paid'
> );
>
>
> QUERY PLAN
>
>
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> ----------------------------------------------------------------------------------------------------------
> Aggregate (cost=213.22..213.22 rows=1 width=8)
> Output: count(*)
> -> Foreign Scan (cost=102.30..213.21 rows=1 width=32)
> Output: NULL::numeric
> Relations: Aggregate on ((((transactions t) INNER JOIN (orgs o))
> INNER JOIN (recurring_payments rp)) INNER JOIN (orgs o_1))
> Remote SQL: SELECT NULL::numeric FROM (((public.transactions r4
> INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r5.provider_id
> = 1)) AND ((r4.status = 'paid')))) INNER JOIN public.recurring_payments r7
> ON (((r4.recurrin
> g_payment_id = r7.id)))) INNER JOIN public.orgs r8 ON (((r7.org_id = r8.id))
> AND ((r8.provider_id = 1))))
> (7 rows)
>
>
> explain verbose
> select count(*)
> FROM (
> SELECT AVG(t.usd_amount_cents / 100) AS average
> FROM transactions t
> JOIN recurring_payments r ON r.id = t.recurring_payment_id
> WHERE t.status = 'paid'
>
> );
>
> QUERY PLAN
>
>
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> -------------------------------------------------
> Aggregate (cost=15478.69..15478.69 rows=1 width=8)
> Output: count(*)
> -> Aggregate (cost=15478.69..15478.69 rows=1 width=32)
> Output: NULL::numeric
> -> Foreign Scan (cost=396.66..15478.69 rows=820775 width=32)
> Relations: (((transactions t) INNER JOIN (orgs o)) INNER
> JOIN (recurring_payments rp)) INNER JOIN (orgs o_1)
> Remote SQL: SELECT NULL FROM (((public.transactions r4
> INNER JOIN public.orgs r5 ON (((r4.org_id = r5.id)) AND ((r4.status =
> 'paid')))) INNER JOIN public.recurring_payments r7 ON
> (((r4.recurring_payment_id = r7.id)))) INNER
> JOIN public.orgs r8 ON (((r7.org_id = r8.id))))
> (8 rows)
>
> Thanks and regards
> Aidar
>
> On Wed, Nov 6, 2024 at 12:03 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
>> On Wed, 6 Nov 2024 at 22:58, PG Bug reporting form
>> <noreply(at)postgresql(dot)org> wrote:
>> > ```
>> > WITH
>> >
>> > transactions AS ( SELECT * FROM transactions_fwd WHERE org_id =
>> 1
>> > ),
>> > recurring_payments AS ( SELECT * FROM recurring_payments_fwd
>> WHERE
>> > org_id = 1 )
>> > SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
>> > FROM transactions t
>> > JOIN recurring_payments r ON r.id = t.recurring_payment_id
>> > WHERE t.status = 'paid' AND t.deleted_at IS NULL AND
>> t.amount_refunded_cents
>> > = 0
>> > ```
>> >
>> > But when I wrap it to `select count(*) from (my_query)` i get 288 as a
>> > result, but I expect it to be 1.
>> >
>> > ```
>> > select count(*)
>> > FROM (
>> > WITH
>> >
>> > transactions AS ( SELECT * FROM transactions_fwd WHERE org_id =
>> 1
>> > ),
>> > recurring_payments AS ( SELECT * FROM recurring_payments_fwd
>> WHERE
>> > org_id = 1 )
>> > SELECT AVG(t.usd_amount_cents / 100) AS average_donation_usd
>> > FROM transactions t
>> > JOIN recurring_payments r ON r.id = t.recurring_payment_id
>> > WHERE t.status = 'paid' AND t.deleted_at IS NULL AND
>> t.amount_refunded_cents
>> > = 0
>> > );
>> > ```
>>
>> Can you show the EXPLAIN ANALYZE of both of these queries?
>>
>> Also, can you work on getting a self-contained reproducer that we can
>> run to recreate the issue.
>>
>> David
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-11-06 15:29:31 | Re: BUG #18690: A count function returns wrong value when using FDW |
Previous Message | Aidar Uldanov | 2024-11-06 14:47:53 | Re: BUG #18690: A count function returns wrong value when using FDW |