Re: Inconsistent query performance based on relation hit frequency

From: Achilleas Mantzios - cloud <a(dot)mantzios(at)cloud(dot)gatewaynet(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Inconsistent query performance based on relation hit frequency
Date: 2024-06-27 13:27:54
Message-ID: b6b08fb2-294f-208c-8957-787657fdab02@cloud.gatewaynet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On 6/27/24 03:50, Laura Hausmann wrote:
> Heya, I hope the title is somewhat descriptive. I'm working on a
> decentralized social media platform and have encountered the following
> performance issue/quirk, and would like to ask for input, since I'm
> not sure I missed anything.
>
> I'm running PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc
> (GCC) 13.2.1 20230801, 64-bit, running on an Arch Linux box with 128GB
> of RAM & an 8c16t Ryzen 3700X CPU. Disk is a NVME RAID0.
>
> Postgres configuration: https://paste.depesz.com/s/iTv
>
> I'm using autovacuum defaults & am running a manual VACUUM ANALYZE on
> the entire database nightly.
>
> The relevant database parts consist of a table with posts (note), a
> table with users (user), and a table with follow relationships
> (following). The query in question takes the most recent n (e.g. 50)
> posts, filtered by the users follow relations.
>
> The note table on my main production instance grows by about 200k
> entries per week.
>
> Schema & tuple counts: https://paste.depesz.com/s/cfI
>
> Here's the shortest query I can reproduce the issue with:
> https://paste.depesz.com/s/RoC
> Specifically, it works well for users that follow a relatively large
> amount of users (https://explain.depesz.com/s/tJnB) and is very slow
> for users that follow a low amount of users / users that post
> infrequently (https://explain.depesz.com/s/Mtyr)
>
> From what I can tell, this is because this query causes postgres to
> scan the note table from the bottom (most recent posts first),
> discarding anything by users that are not followed.
>
> Curiously, rewriting the query like this
> (https://paste.depesz.com/s/8rN) causes the opposite problem, this
> query is fast for users with a low following count
> (https://explain.depesz.com/s/yHAz#query) and slow for users with a
> high following count (https://explain.depesz.com/s/1v6L,
> https://explain.depesz.com/s/yg3N)
>
> These numbers are even further apart (to the point of 10-30s query
> timeouts) in the most extreme outlier cases I've observed, and on
> lower-end hardware.
>
> I've sidestepped the issue by running either of these queries based on
> a heuristic that checks whether there are more than 250 matching posts
> in the past 7 days, recomputed once per day for every user, but it
> feels more like a hack than a proper solution.
>
> I'm able to make the planner make a sensible decision in both cases by
> setting enable_sort = off, but that tanks performance for the rest of
> my application, is even more of a hack, and doesn't seem to work in
> all cases.
>
> I've been able to reproduce this issue with mock data
> (https://paste.depesz.com/s/CnY) though it's not generating quite the
> same query plans and is behaving a bit differently.

Before deep dive into everybody's favorite topic you may simplify your
query :

select o.* from objects o where o."userId" = :userid UNION select o.*
from objects o where o."userId" IN

(SELECT r."followeeId" FROM relationships r WHERE r."followerId"= :userid)

postgres(at)[local]/laura=# explain (analyze, buffers) select o.* from
objects o where o."userId" = 1 UNION select o.* from objects o where
o."userId" IN (SELECT r."followeeId" FROM relati
onships r WHERE r."followerId"=1) ORDER BY id DESC ;
                                                                                        QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---
Sort  (cost=8622.04..8767.98 rows=58376 width=40) (actual
time=1.041..1.053 rows=314 loops=1)
  Sort Key: o.id DESC
  Sort Method: quicksort  Memory: 39kB
  Buffers: shared hit=1265
  ->  HashAggregate  (cost=3416.92..4000.68 rows=58376 width=40)
(actual time=0.900..1.006 rows=314 loops=1)
        Group Key: o.id, o."userId", o.data
        Batches: 1  Memory Usage: 1585kB
        Buffers: shared hit=1265
        ->  Append  (cost=0.42..2979.10 rows=58376 width=40) (actual
time=0.024..0.816 rows=314 loops=1)
              Buffers: shared hit=1265
              ->  Index Scan using "objects_userId_idx" on objects o
 (cost=0.42..3.10 rows=17 width=21) (actual time=0.003..0.003 rows=0
loops=1)
                    Index Cond: ("userId" = 1)
                    Buffers: shared hit=3
              ->  Nested Loop  (cost=0.70..2684.12 rows=58359 width=21)
(actual time=0.020..0.794 rows=314 loops=1)
                    Buffers: shared hit=1262
                    ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships r
 (cost=0.28..7.99 rows=315 width=4) (actual time=0.011..0.030 rows=315
loops=
1)
                          Index Cond: ("followerId" = 1)
                          Heap Fetches: 0
                          Buffers: shared hit=3
                    ->  Index Scan using "objects_userId_idx" on
objects o_1  (cost=0.42..6.65 rows=185 width=21) (actual
time=0.002..0.002 rows=1 loops=315)
                          Index Cond: ("userId" = r."followeeId")
                          Buffers: shared hit=1259
Planning:
  Buffers: shared hit=8
Planning Time: 0.190 ms
Execution Time: 1.184 ms
(26 rows)

Time: 1.612 ms
postgres(at)[local]/laura=# explain (analyze, buffers) select o.* from
objects o where o."userId" = 4 UNION select o.* from objects o where
o."userId" IN (SELECT r."followeeId" FROM relati
onships r WHERE r."followerId"=4) ORDER BY id DESC ;
                                                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sort  (cost=27.53..28.03 rows=202 width=40) (actual time=0.015..0.016
rows=0 loops=1)
  Sort Key: o.id DESC
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=5
  ->  HashAggregate  (cost=17.77..19.79 rows=202 width=40) (actual
time=0.013..0.013 rows=0 loops=1)
        Group Key: o.id, o."userId", o.data
        Batches: 1  Memory Usage: 40kB
        Buffers: shared hit=5
        ->  Append  (cost=0.42..16.26 rows=202 width=40) (actual
time=0.011..0.012 rows=0 loops=1)
              Buffers: shared hit=5
              ->  Index Scan using "objects_userId_idx" on objects o
 (cost=0.42..3.10 rows=17 width=21) (actual time=0.005..0.005 rows=0
loops=1)
                    Index Cond: ("userId" = 4)
                    Buffers: shared hit=3
              ->  Nested Loop  (cost=0.70..12.14 rows=185 width=21)
(actual time=0.005..0.005 rows=0 loops=1)
                    Buffers: shared hit=2
                    ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships r
 (cost=0.28..1.39 rows=1 width=4) (actual time=0.005..0.005 rows=0
loops=1)
                          Index Cond: ("followerId" = 4)
                          Heap Fetches: 0
                          Buffers: shared hit=2
                    ->  Index Scan using "objects_userId_idx" on
objects o_1  (cost=0.42..8.90 rows=185 width=21) (never executed)
                          Index Cond: ("userId" = r."followeeId")
Planning:
  Buffers: shared hit=8
Planning Time: 0.201 ms
Execution Time: 0.048 ms
(25 rows)

Time: 0.490 ms

>
> I'd appreciate any and all input on the situation. If I've left out
> any information that would be useful in figuring this out, please tell me.
>
> Thanks in advance,
> Laura Hausmann

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Laura Hausmann 2024-06-27 14:58:15 Re: Inconsistent query performance based on relation hit frequency
Previous Message Andrew Okhmat 2024-06-27 10:45:04 Re: Row level security