From: Robert Leach <rleach(at)princeton(dot)edu>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18177: certain queries under certain contexts take multiple orders of magnitude longer compared to v10
Date: 2023-11-02 20:04:19
Message-ID: 76BDCC62-0333-41A8-95E5-40B7A8C3D698@princeton.edu
Lists: pgsql-bugs
Thanks David,
I very much appreciate your time and assessment. I feel stuck, so all suggestions are welcome.
>> I cannot claim to understand the bug that is causing this issue, so the best
>> I can do is simply provide the explain output and try to keep from providing
>> confusing details, because this is outside the realm of my expertise:
>>
>>
>> (' -> Index Scan using
>> "DataRepo_peakdata_peak_group_id_4dd87f4a" on "DataRepo_peakdata"
>> (cost=0.25..8.26 rows=1 width=8) (actual time=0.017..7.149 rows=7896
>> loops=1)',)
>
> Nothing looks particularly bug like so far. It seems the
> pg_class.reltuples estimate for this relation is way out.
If this is operating as intended, I have a couple of questions:
1. In postgres 10, the query always runs in under 1 second, regardless of the context. In postgres 13, depending on the context, why does it almost always take over 200 seconds?
2. In postgres 13, the query runs in under 1 second if I run just the one test that (starting from an empty database) loads data and runs the query. BUT, if I run that one test after having run tests in another python/django test class (which loads data, runs various tests, then destroys all the data that that test class loaded), that query takes over 200 seconds to run. Why would that be?
I understand that the above involves variables unrelated to postgres (i.e. Django test classes), but the only thing I change is postgres (v10 versus v13): v10 is always fast, and v13 has this context-dependent slowness. I have not been able to find a problem in the python code after investigating this for a week and a half now.
We have been endeavoring to keep our code database-backend-independent. As far as I understand the suggestions made by the Django folks, running the vacuum command in key places is potentially a strategy to deal with this issue (though I haven't been able to get it to work consistently yet), but doing that would tie the codebase to a particular database backend. Are there alternate solutions that do not involve altering our Django codebase?
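For concreteness, here is a minimal sketch of the kind of thing I've been trying (the function name and its placement after our data loads are just illustrative; the vendor check is my attempt to keep other backends unaffected):

    # Minimal sketch (not our actual code): refresh planner statistics on the
    # hot table after a bulk load, but only when the backend is PostgreSQL.
    from django.db import connection

    def analyze_peakdata_if_postgres():
        if connection.vendor == "postgresql":
            with connection.cursor() as cursor:
                # ANALYZE refreshes statistics without rewriting the table
                # and, unlike VACUUM, is allowed inside a transaction block.
                cursor.execute('ANALYZE "DataRepo_peakdata"')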
> Has autovacuum gotten to this table recently?
I only just learned the other day about the "VACUUM FULL ANALYZE" command and have been trying to manually run it at different points with mixed outcomes WRT speed. I don't know what the difference between that and "autovacuum" is.
> select * from pg_stat_user_tables where relid = '"DataRepo_peakdata"'::regclass;
When I try to execute this command, I get:
psycopg2.errors.UndefinedTable: relation "DataRepo_peakdata" does not exist
Maybe I'm not running it in the correct place? Or am I running it too many times?
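If it helps, I can try running it from inside the test run itself, so it's guaranteed to hit the same database Django is using (Django creates a separate test_<name> database for test runs); something like:

    # Sketch: query the stats view over Django's own connection so it targets
    # the test database rather than whatever my psql session is pointed at.
    from django.db import connection

    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT n_live_tup, n_dead_tup, last_analyze, last_autoanalyze"
            "  FROM pg_stat_user_tables"
            " WHERE relid = '\"DataRepo_peakdata\"'::regclass"
        )
        print(cursor.fetchone())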
> The plan would likely come good if you analyzed that table. You
> should see if you can figure out why autovacuum hasn't analyzed it.
How do I determine that? Also, I have tried inserting `vacuum full analyze` commands at different points in the code, and it doesn't appear to consistently speed things up. In fact, before I started attempting vacuuming, there were occasional instances where it ran fast without me changing anything, i.e. the problem intermittently goes away. Though, like I said, I've been working on solving this for a week and a half, so I could be mistaken: I've tried so many things, and I have been switching back to v10 occasionally, so it could be that I was on v10, thinking I was on v13, when I saw it run fast.
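In case the placement matters, the calls I've been inserting look roughly like this (a sketch; the helper name is just for illustration):

    # Roughly what I've been inserting after data loads (a sketch specific to
    # our table). Note: VACUUM cannot run inside a transaction block, so this
    # would fail under Django's transaction-wrapped TestCase and needs to run
    # in autocommit mode (e.g. from a TransactionTestCase or a plain script).
    from django.db import connection

    def vacuum_full_analyze_peakdata():
        with connection.cursor() as cursor:
            cursor.execute('VACUUM FULL ANALYZE "DataRepo_peakdata"')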
> For the future, it might be best to post EXPLAIN output as an
> attachment. The extra formatting makes it difficult to read.
Sorry about that. I used the web form and there wasn't an attachment field.
Rob