Re: Multi-column join + aggregate subquery resulting in infinite run time

From: Steve Estes <denzera(at)gmail(dot)com>
To: pgsql-novice(at)lists(dot)postgresql(dot)org
Subject: Re: Multi-column join + aggregate subquery resulting in infinite run time
Date: 2020-07-09 17:28:47
Message-ID: CAJjrZPA8vc5zKYGNLtS1JdaoFFBqjXs8ue=+22Rb4Qpehbh_Dg@mail.gmail.com
Lists: pgsql-novice

Just wanted to bump this in case anyone has any insights. If you live in a
place served by Grubhub or Doordash, I will literally buy you lunch if you
can help me figure out why these joins don't take linear time to bolt on,
rather than increasing geometrically. As I envision it, if you've got a
common base table serving as the parent in a LEFT JOIN, each additional
table you LEFT JOIN onto it should take roughly the same amount of time:
there's no interplay to calculate between that table and any of the others;
you're just matching on your keys with one scan through the indexes, and off
you go to the next one. Clearly I'm either thinking about it wrong, or doing
something wrong once my joins go beyond one.
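
In sketch form, the pattern I have in mind (the table and column names here
are placeholders for illustration, not my real schema):

```sql
-- hypothetical tables, to show the shape of the expectation:
-- each LEFT JOIN should add roughly constant cost per base row
SELECT b.id, x.val AS x_val, y.val AS y_val
FROM base b
LEFT JOIN extra_x x ON x.id = b.id   -- one indexed lookup per base row
LEFT JOIN extra_y y ON y.id = b.id;  -- independent of the join above
```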

-Steve

On Mon, Jul 6, 2020 at 10:59 AM Steve Estes <denzera(at)gmail(dot)com> wrote:

> Hey gang,
>
> I thought I knew what I was doing with queries, but this has more or less
> ruined my 4th of July holiday, so I think it's high time to ask for help. I
> imagine it's something trivial and obvious to you.
>
> I've got 3 tables in question here:
>
> *prod_account_details* => 538k records
> line_number PK,
> acct_id integer, -- nearly unique, but not quite; acct_id PLUS address is unique
> address varchar(100),
> entity_name varchar(100),
> (... a few other fields ...)
> btree index on acct_id, hash index on address, also btree on (acct_id, address)
>
> *prod_customer_profiles* => 532k records, *nearly* 1:1 with account details
> acct_id integer, -- corresponds to account_details.acct_id
> match_input_address varchar(100), -- corresponds to account_details.address
> (... lots of other fields ...)
> btree index on acct_id, hash index on match_input_address, also btree on (acct_id, match_input_address)
>
> *prod_dunning_letters* => 518k records, 1:M with account_details (covering ~181k acct_ids)
> line_number PK,
> acct_id integer, -- FK to account_details, not formally via constraint but semantically
> (... a few other fields ...)
> btree index on acct_id
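>
> In DDL form, roughly -- a sketch reconstructed from the descriptions
> above; line_number's integer type is an assumption, and the elided fields
> and the actual index names are omitted:

```sql
CREATE TABLE prod_account_details (
    line_number integer PRIMARY KEY,   -- type assumed
    acct_id     integer,               -- nearly unique; (acct_id, address) is unique
    address     varchar(100),
    entity_name varchar(100)
    -- ... a few other fields ...
);
CREATE INDEX ON prod_account_details USING btree (acct_id);
CREATE INDEX ON prod_account_details USING hash (address);
CREATE INDEX ON prod_account_details USING btree (acct_id, address);

CREATE TABLE prod_customer_profiles (
    acct_id             integer,       -- corresponds to account_details.acct_id
    match_input_address varchar(100)   -- corresponds to account_details.address
    -- ... lots of other fields ...
);
CREATE INDEX ON prod_customer_profiles USING btree (acct_id);
CREATE INDEX ON prod_customer_profiles USING hash (match_input_address);
CREATE INDEX ON prod_customer_profiles USING btree (acct_id, match_input_address);

CREATE TABLE prod_dunning_letters (
    line_number integer PRIMARY KEY,   -- type assumed
    acct_id     integer                -- FK to account_details, semantically only
    -- ... a few other fields ...
);
CREATE INDEX ON prod_dunning_letters USING btree (acct_id);
```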
>
>
> --- Queries ---
>
> So firstly, what works:
>
> (1)
>
> SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match
> FROM prod_account_details ad
> INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
>     AND ad.address = acx.match_input_address;
>
>
> ...returns the 532k matched records in 3.0 seconds.
>
> Now if we *INNER* join to the dunning_letters table via a grouped
> subquery:
>
> (2)
>
> SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match,
>     dl.num_records AS dl_count
> FROM prod_account_details ad
> INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
>     AND ad.address = acx.match_input_address
> INNER JOIN (SELECT count(*) AS num_records, acct_id
>     FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;
>
> ... it returns fine too, 161k records in 1.3 seconds.
>
> But now if I do something as simple as changing that second join (to the
> subquery) to a LEFT JOIN, it blows up. That is, I want all valid accounts,
> appended with the count of records in the dunning_letters table, if any
> exist:
>
> (3)
>
> SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match,
>     dl.num_records AS dl_count
> FROM prod_account_details ad
> INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
>     AND ad.address = acx.match_input_address
> LEFT JOIN (SELECT count(*) AS num_records, acct_id
>     FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;
>
> What happens is, this runs for 20+ seconds, and then starts returning ~100
> records every 10 seconds, which at the volume I'm talking about (>500k)
> would take forever to return. And next up I need to bolt on the counts for
> some other, bigger tables too, so we're quickly approaching
> heat-death-of-the-universe amounts of runtime here.
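>
> For what it's worth, the same result can also be phrased with a correlated
> LATERAL count instead of the pre-aggregated subquery. A sketch, which I
> haven't benchmarked (note it yields 0 rather than NULL for accounts with
> no dunning letters):

```sql
SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match,
       dl.num_records AS dl_count
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address
LEFT JOIN LATERAL (
    -- count only this account's letters; an aggregate with no GROUP BY
    -- always returns exactly one row, so every account gets a count
    SELECT count(*) AS num_records
    FROM prod_dunning_letters l
    WHERE l.acct_id = ad.acct_id
) dl ON true;
```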
>
> Now what's interesting is, if I relax the join on the first two tables,
> i.e. drop the address part of the join and only do the acct_id one,
>
> (4)
>
> SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match,
>     dl.num_records AS dl_count
> FROM prod_account_details ad
> INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
>     -- AND ad.address = acx.match_input_address
> LEFT JOIN (SELECT count(*) AS num_records, acct_id
>     FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;
>
> ...it starts returning immediately and spools 695k records in 4.7
> seconds. The problem is that we need the address part of the join in
> there, because there are some duplicate acct_ids on both sides so we get a
> bunch of cartesian-join junk.
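>
> A toy illustration of that fan-out, with made-up values: a duplicated
> acct_id on both sides multiplies rows once the address column is dropped
> from the join condition:

```sql
SELECT ad.acct_id, ad.address, acx.match_input_address
FROM (VALUES (42, 'addr A'), (42, 'addr B')) AS ad(acct_id, address)
JOIN (VALUES (42, 'addr A'), (42, 'addr B')) AS acx(acct_id, match_input_address)
  ON ad.acct_id = acx.acct_id;
-- 4 rows (2 x 2); adding "AND ad.address = acx.match_input_address"
-- to the join condition cuts it back to the 2 real matches
```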
>
>
> --- EXPLAIN query plans ---
>
> So here's the Explain for query (1), with just the two tables:
>
> Gather (cost=102995.20..123944.32 rows=1 width=41)
>   Workers Planned: 2
>   -> Parallel Hash Join (cost=101995.20..122944.22 rows=1 width=41)
>        Hash Cond: ((ad.acct_id = acx.acct_id) AND (ad.address = (acx.match_input_address)::text))
>        -> Parallel Seq Scan on prod_account_details ad (cost=0.00..14226.43 rows=224343 width=39)
>        -> Parallel Hash (cost=97096.88..97096.88 rows=224288 width=26)
>             -> Parallel Seq Scan on prod_customer_profiles acx (cost=0.00..97096.88 rows=224288 width=26)
>
>
> Now, when we add on the third table, its strategy for that join changes.
> Here's the Explain for (2), with the inner join that works:
>
> Gather (cost=22500.40..83636.43 rows=1 width=49)
>   Workers Planned: 2
>   -> Nested Loop (cost=21500.40..82636.33 rows=1 width=49)
>        -> Hash Join (cost=21500.40..40427.74 rows=56285 width=54)
>             Hash Cond: (ad.acct_id = prod_dunning_letters.acct_id)
>             -> Parallel Seq Scan on prod_account_details ad (cost=0.00..14226.43 rows=224343 width=39)
>             -> Hash (cost=19345.40..19345.40 rows=123920 width=15)
>                  -> GroupAggregate (cost=0.42..18106.20 rows=123920 width=15)
>                       Group Key: prod_dunning_letters.acct_id
>                       -> Index Only Scan using prod_dunning_letters_acct_id_idx on prod_dunning_letters (cost=0.42..14269.36 rows=519529 width=7)
>        -> Index Scan using prod_customer_profiles_match_input_address_idx on prod_customer_profiles acx (cost=0.00..0.74 rows=1 width=26)
>             Index Cond: ((match_input_address)::text = ad.address)
>             Filter: (ad.acct_id = acct_id)
>
>
> But then change it to a LEFT JOIN from query (3), and it changes to:
>
> Nested Loop Left Join (cost=102995.63..144838.72 rows=1 width=49)
>   Join Filter: (ad.acct_id = prod_dunning_letters.acct_id)
>   -> Gather (cost=102995.20..123944.32 rows=1 width=41)
>        Workers Planned: 2
>        -> Parallel Hash Join (cost=101995.20..122944.22 rows=1 width=41)
>             Hash Cond: ((ad.acct_id = acx.acct_id) AND (ad.address = (acx.match_input_address)::text))
>             -> Parallel Seq Scan on prod_account_details ad (cost=0.00..14226.43 rows=224343 width=39)
>             -> Parallel Hash (cost=97096.88..97096.88 rows=224288 width=26)
>                  -> Parallel Seq Scan on prod_customer_profiles acx (cost=0.00..97096.88 rows=224288 width=26)
>   -> GroupAggregate (cost=0.42..18106.20 rows=123920 width=15)
>        Group Key: prod_dunning_letters.acct_id
>        -> Index Only Scan using prod_dunning_letters_acct_id_idx on prod_dunning_letters (cost=0.42..14269.36 rows=519529 width=7)
>
>
> Just reading that, I can't readily tell why #2 finishes fast while #3 runs
> without end. If the difference were just the use of a left join, then why
> does #4, with only a single-column join between the first two tables,
> finish easily too?
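>
> If actual-versus-estimated row counts would help, I can run the fast
> queries with ANALYZE (the plain EXPLAINs above only show estimates; query
> (3) never finishes, so I can't ANALYZE that one), e.g.:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address;
```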
>
> This has confounded the rest of my experimentation, so I'd really
> appreciate some insight.
>
> -Steve
>
>
