From: Barbu Paul - Gheorghe <barbu(dot)paul(dot)gheorghe(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Erratically behaving query needs optimization
Date: 2019-08-26 08:25:57
Message-ID: CAB14va-FAhf_wp7rMnsmOxY16vGb5MHjLTUaiu8uO-3=DfLf8Q@mail.gmail.com
Lists: pgsql-performance
On Sun, Aug 25, 2019 at 5:51 PM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> Yes, it certainly looks like it is due to cold caches. But you say it is slow at first, and then say it varies greatly during a run. Is being slow at first the only way it varies greatly, or is there large variation even beyond that?
There is great variation in run times (hundreds of milliseconds to
several seconds) even beyond server startup.
The query runs several times with a different device_id, object_id and
list of attribute_ids, and the run time varies from one execution to
the next.
> You can use pg_prewarm to overcome the cold cache issue when you first start up the server.
I could not find anything related to pg_prewarm at first, other than
some dead ends from 2013 from which I gathered it only works on Linux.
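For the archives: the module ships as a contrib extension since
PostgreSQL 9.4, and its default 'buffer' mode is not Linux-specific.
Assuming the extension is available, a minimal sketch of warming the
table after a restart:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- Read the whole table into shared buffers; returns the number of blocks loaded.
SELECT pg_prewarm('results');
-- Indexes can be warmed the same way (hypothetical index name):
SELECT pg_prewarm('results_pkey');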
Anyway, I have problems even beyond the start of the database; it's
just easier to reproduce them right after startup, otherwise I have to
leave the application running for a while (to invalidate part of the
cache, I think).
> If you query only on "results" with only the conditions that apply to "results", what is the expected number of rows, and what is the actual number of rows?
Explain for the query on results only: https://explain.depesz.com/s/Csau
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
FROM results
WHERE results.data <> '<NullData/>'
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5) -- possibly a longer list here
  AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list
Limit  (cost=166793.28..167335.52 rows=2 width=54) (actual time=134783.510..134816.941 rows=2 loops=1)
  Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
  ->  Unique  (cost=166793.28..168420.01 rows=6 width=54) (actual time=134783.507..134816.850 rows=2 loops=1)
        Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
        ->  Sort  (cost=166793.28..167606.64 rows=325346 width=54) (actual time=134783.505..134802.602 rows=205380 loops=1)
              Sort Key: attribute_id, "timestamp" DESC
              Sort Method: external merge  Disk: 26456kB
              Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
              ->  Gather  (cost=1000.00..125882.23 rows=325346 width=54) (actual time=32.325..133815.793 rows=410749 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=19086 read=46836
                    ->  Parallel Seq Scan on results  (cost=0.00..92347.63 rows=135561 width=54) (actual time=18.496..133871.888 rows=136916 loops=3)
                          Filter: ((data IS NOT NULL) AND (data <> '<NullData/>'::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                          Rows Removed by Filter: 920123
                          Buffers: shared hit=19086 read=46836
Planning Time: 5.071 ms
Execution Time: 134874.687 ms
As far as I can see, the estimates are close to the actual row counts
in the "Parallel Seq Scan on results" node.
The numbers are similar (of course) if I turn off query parallelism.
Or should I run VACUUM ANALYZE again? I'm fairly sure I ran it enough.
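For reference, re-running it for just this table is cheap enough to
rule out as a factor; a minimal sketch:

-- VERBOSE reports how many pages/rows were scanned for the statistics.
VACUUM (ANALYZE, VERBOSE) results;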
>> How can I improve it to be consistently fast (is it possible to get to
>> several milliseconds?)?
>
>
> Maybe. Depends on the answer to my previous question.
>
>>
>> What I don't really understand is why the nested loop has 3 loops
>> (three joined tables)?
>
>
> Each parallel execution counts as a loop. There are 2 parallel workers, plus the leader also participates, making three.
>
>>
>> And why does the first index scan indicate ~60k loops? And does it
>> really work? It doesn't seem to filter out any rows.
>
>
> The parallel hash join returns about 20,000 rows, but I think that is just for one of the three parallel executions, making about 60,000 in total. I don't know why one of the nodes reports the combined execution and the other just a single worker. Parallel queries are hard to understand. When I want to optimize a query that does parallel execution, I just turn off parallelism ("set max_parallel_workers_per_gather TO 0;") at first to make it simpler to understand.
>
> Apparently everything with device_id = 97 just happens to pass all the rest of your filters. If you need those filters to make sure you get the right answer in all cases, then you need them. A lifeboat isn't useless just because your ship didn't happen to sink today.
The part with the workers makes sense, thanks.
As for the condition, I thought there was something more contrived
going on in the planner, but I failed to see it was that simple: there
is nothing to remove, since every row matches the condition.
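For future reference, the toggle you mention is session-local, so the
comparison is easy to script; a minimal sketch:

-- Disable parallel plans for this session only, as suggested:
SET max_parallel_workers_per_gather = 0;
-- ... re-run EXPLAIN (ANALYZE, BUFFERS) on the query here;
--     each node then reports loops=1 ...
RESET max_parallel_workers_per_gather;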
>> Should I add an index only on (attribute_id, object_id)? And maybe
>> data_access_result?
>> Does it make sense to add it on a text column (results.data)?
>
>
> Which parts of the query you give are going to change from execution to execution?
>
> Assuming the parts for object_id and attribute_id are variable and the rest are static, I think the optimal index would be "create index on results (object_id, attribute_id) where data IS NOT NULL and data <> '<NullData/>' and data_access_result = 'SUCCESS'"
That's right: object_id and attribute_id are variable, and device_id
is variable as well.
If I create that index, it is not chosen by the planner when executing
the full "joined" query; it is chosen if I run only the
"results"-related conditions, without joining, which makes sense in
light of what you said below.
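Spelled out, the index I created from your suggestion looks like this
(the name is mine):

CREATE INDEX results_object_attribute_partial_idx
    ON results (object_id, attribute_id)
    WHERE data IS NOT NULL
      AND data <> '<NullData/>'
      AND data_access_result = 'SUCCESS';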
> Why does results.data have two different "spellings" for null data?
One is where the device couldn't be contacted at all (NULL); the other
is where the device was contacted but, for my (object_id, attribute_id)
combinations, actually returned null data ('<NullData/>').
> However, if the number of rows from "results" that meet all your criteria is high, the index won't make much of a difference. The planner has a fundamental choice to make: should it seek things with device_id = 97, and then check each of those to see if they satisfy your conditions on the "results" fields? Or should it seek things that satisfy the "results" field conditions, and then check each of those to see if they satisfy device_id = 97? It is currently doing the first of those. Whether it should be doing the second, and whether creating the index will cause it to switch to using the second, are two separate questions which can't be answered with the data given.
So maybe I should denormalize, place the device_id column into the
"results" table, and add it to the index you suggested above?
> Cheers,
>
> Jeff
Thank you for the detailed response.
--
Barbu Paul - Gheorghe