From: | Ron <ronljohnsonjr(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2 |
Date: | 2019-01-02 15:44:57 |
Message-ID: | c0ac4d7a-62d2-eea1-5818-4f9a45cbe453@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Try using SERIALIZABLE transactions instead of the default READ COMMITTED.
On 1/2/19 9:28 AM, Mark wrote:
> Hi Ron,
>
> Yes, my process will commit the transaction (or roll it back) eventually.
> It's the window where
> one transaction has deleted all the rows (before committing) and an
> ANALYSE has ran.
>
> The deleted rows won't make it into the sample even though the transaction
> has NOT been commited.
>
> During this time I will get bad row estimates on rows for that ID. You can
> see this in the example below with two
> database connections (labelled 1 and 2).
>
> I would have expected the DELETE to have no effect until it was committed.
>
> connection 1=> create table test_table(school_id integer, note text);
> CREATE TABLE
> connection 1=> insert into test_table(school_id, note) SELECT g.id
> <http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
> INSERT 0 2
> connection 1=> insert into test_table(school_id, note) SELECT g.id
> <http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
> INSERT 0 2
> connection 1=> insert into test_table(school_id, note) SELECT g.id
> <http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
> INSERT 0 2
> connection 1=> select * from test_table ;
> school_id | note
> -----------+----------------------------------
> 1 | 0e08cf3990a04f0e943584517c564d31
> 2 | 96bf83ae5f8eb9342e8408b1ac25cb14
> 1 | f8fd943012edfe42a03a421df660bc33
> 2 | cd5d7ff0abca61f18857df9b21d234e0
> 1 | 60d731f430cb68c7285ddbcd9186eaa0
> 2 | 635e6c9cf305147ad8684213f0a9299c
> (6 rows)
>
> connection 1=> analyse verbose test_table ;
> INFO: analyzing "public.test_table"
> INFO: "test_table": scanned 1 of 1 pages, containing 6 live rows and 0
> dead rows; 6 rows in sample, 6 estimated total rows
> ANALYZE
> connection 1=> explain analyse select * from test_table where school_id = 2 ;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------
> Seq Scan on test_table (cost=0.00..1.07 rows=3 width=37) (actual
> time=0.011..0.015 rows=3 loops=1)
> Filter: (school_id = 2)
> Rows Removed by Filter: 3
> Planning time: 0.164 ms
> Execution time: 0.043 ms
> (5 rows)
>
> connection 2=> BEGIN ;
> BEGIN
> connection 2=> delete from test_table where school_id = 2 ;
> DELETE 3
> connection 2=> /* This connection is now idle */
>
>
> connection 1=> analyse verbose test_table ;
> INFO: analyzing "public.test_table"
> INFO: "test_table": scanned 1 of 1 pages, containing 6 live rows and 0
> dead rows; 3 rows in sample, 6 estimated total rows
> ANALYZE
> connection 1=> explain analyse select * from test_table where school_id = 2 ;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------
> Seq Scan on test_table (cost=0.00..1.07 rows=1 width=37) (actual
> time=0.009..0.014 rows=3 loops=1)
> Filter: (school_id = 2)
> Rows Removed by Filter: 3
> Planning time: 0.095 ms
> Execution time: 0.039 ms
> (5 rows)
>
>
> On Wed, 2 Jan 2019 at 14:04 Ron <ronljohnsonjr(at)gmail(dot)com
> <mailto:ronljohnsonjr(at)gmail(dot)com>> wrote:
>
> But Jeff said "left open indefinitely without either committing or
> rolling back". Your process is committing the transaction.
>
>
> On 1/2/19 6:15 AM, Mark wrote:
>> Hi Jeff,
>>
>> Thanks for your help. That is exactly what is happening.
>>
>> I have a long running job which deletes all of the common_student
>> table and then repopulates it. It takes long time to load all the
>> other data and commit the transaction. I didn't think the delete
>> inside the transaction would have any effect until it is commited or
>> rolled back.
>>
>> I will have to rewrite the application so it updates the existing
>> rows rather than deleting all and then inserting.
>>
>> Thanks again for helping me understand what's happening here.
>>
>> Proof:
>>
>> db=> explain analyze select * from common_student where school_id = 36;
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------
>> Bitmap Heap Scan on common_student (cost=88.30..3846.49 rows=1533
>> width=384) (actual time=4.852..7.065 rows=1388 loops=1)
>> Recheck Cond: (school_id = 36)
>> Heap Blocks: exact=67
>> -> Bitmap Index Scan on idx_common_student_sid (cost=0.00..87.91
>> rows=1533 width=0) (actual time=4.817..4.817 rows=1388 loops=1)
>> Index Cond: (school_id = 36)
>> Planning time: 0.097 ms
>> Execution time: 8.084 ms
>> (7 rows)
>>
>> db=> /* At this point I have started a long running transaction that
>> deletes all of common_student for school_id 36 */ ;
>>
>> db=> analyse verbose common_student(school_id);
>> INFO: analyzing "public.common_student"
>> INFO: "common_student": scanned 7322 of 7322 pages, containing 65431
>> live rows and 8060 dead rows; 56818 rows in sample, 65431 estimated
>> total rows
>> ANALYZE
>> db=> explain analyze select * from common_student where school_id = 36;
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------------
>> Index Scan using idx_common_student_sid on common_student
>> (cost=0.41..8.43 rows=1 width=384) (actual time=0.017..1.498
>> rows=1388 loops=1)
>> Index Cond: (school_id = 36)
>> Planning time: 0.098 ms
>> Execution time: 2.583 ms
>> (4 rows)
>>
>> db=> /* At this point I have killed the long running transaction that
>> deletes all of common_student for school_id 36 */ ;
>> db=> vacuum analyze common_student;
>> VACUUM
>> db=> explain analyze select * from common_student where school_id = 36;
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------
>> Bitmap Heap Scan on common_student (cost=79.17..3357.79 rows=1388
>> width=383) (actual time=0.088..1.302 rows=1388 loops=1)
>> Recheck Cond: (school_id = 36)
>> Heap Blocks: exact=67
>> -> Bitmap Index Scan on idx_common_student_sid (cost=0.00..78.83
>> rows=1388 width=0) (actual time=0.077..0.077 rows=1388 loops=1)
>> Index Cond: (school_id = 36)
>> Planning time: 0.327 ms
>> Execution time: 2.311 ms
>> (7 rows)
>>
>>
>> On Sun, 23 Dec 2018 at 02:57 Jeff Janes <jeff(dot)janes(at)gmail(dot)com
>> <mailto:jeff(dot)janes(at)gmail(dot)com>> wrote:
>>
>>
>> - Does the analyse output below mean that it only scanned
>> 51538 of 65463 rows in the table? Is school_id 36 just being
>> missed in the sample? (This happens when the analyse is
>> repeated )
>>
>>
>> Is there a transaction which had deleted all of school_id=36, and
>> then was just left open indefinitely without either committing or
>> rolling back?
>>
>> That would explain it, and I don't know of anything else that
>> could. The deleted but not committed tuples are still live, but
>> don't get sampled.
>>
>> Cheers,
>>
>> Jeff
>>
>
> --
> Angular momentum makes the world go 'round.
>
--
Angular momentum makes the world go 'round.
From | Date | Subject | |
---|---|---|---|
Next Message | Mark | 2019-01-02 16:04:03 | Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2 |
Previous Message | Mark | 2019-01-02 15:28:02 | Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2 |
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Verite | 2019-01-02 15:58:35 | backslash-dot quoting in COPY CSV |
Previous Message | Heikki Linnakangas | 2019-01-02 15:36:33 | Re: GiST VACUUM |