Re: READ COMMITTED vs. index-only scans

From: Melvin Davidson <melvin6925(at)gmail(dot)com>
To: Jacek Kołodziej <kolodziejj(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: READ COMMITTED vs. index-only scans
Date: 2018-01-17 14:30:05
Message-ID: CANu8FiznGYHJURot3+4Yo_chptQC4PpGb=thyOVbV_+56=ymzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Jan 17, 2018 at 8:08 AM, Jacek Kołodziej <kolodziejj(at)gmail(dot)com>
wrote:

> Hello everyone,
>
> this is my first post here and I'm starting with asking a question about
> data consistency between two consecutive SELECTs using PostgreSQL 9.6.
>
> I'm sorry if that's something that was already discussed - I couldn't find
> it either in archives, nor in _general internet_. If it is, I would
> appreciate pointing it out.
>
> I have an "append-only" events table - only INSERT and SELECT queries are
> issued to it. It has an integer (sequence) ID as a primary key.
>
> I'm issuing following two queries (within the same READ COMMITTED
> transaction) to that table:
> - first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1
> - I'm saving the result as a "max ID" for the second query
> - second query (B): SELECT id, ... FROM events WHERE id > "min ID" AND id
> <= "max ID" AND ...
> - I won't dig into what's "min ID" but you can assume it's at most 100
> less than "max ID"
>
> Concurrently, rows are being added to that table.
>
> Please note that there's enough data in the index for the first query (A)
> to perform an index-only scan. I'm not sure if that's relevant but "B"
> query does an index scan because of other conditions I haven't mentioned
> but still needs to fetch data from the table's heap.
>
> Here's what happening to me: the "A" query occasionally (in my case: on
> the order of tenths per day) returns an ID _higher_ than any ID present in
> second query's result (other conditions I haven't specified do _not_ filter
> any more rows than "id <= max ID") - as if some entries were visible for
> the first query, but not for the second one. This is an inconsistency that
> is very problematic for me.
>
> All I can think of is that it might be caused by the index-only-scan and
> READ COMMITTED transaction isolation level but from what I gather from
> documentation, it should not be possible due to the use of visibility
> map[0][1][2].
>
> And yet it happens, likely for some other reason but I can't think of any.
> I've tried switching transaction isolation level to REPEATABLE READ (on the
> basis that it's an faulty phenomenon occurring during to some bug) but that
> didn't help.
>
> Where am I wrong? What am I missing? What information may I provide to
> help with investigating this?
>
> [0] https://www.postgresql.org/docs/9.6/static/indexes-index-
> only-scans.html
> [1] https://www.postgresql.org/docs/9.6/static/storage-vm.html
> [2] https://wiki.postgresql.org/wiki/Index-only_scans
>
> --
> Kind regards,
> Jacek Kołodziej
> http://kolodziejj.info
>

*>- first query (A): SELECT id FROM events ORDER BY events.id
<http://events.id> DESC LIMIT 1 - I'm saving the result as a "max ID" for
the second query*

*Just a suggestion. The first query is not really needed.*
*You can simply do:*

*second query (B): *

*SELECT id, ... FROM events WHERE id > MIN(ID) AND id <= MAX(ID)
AND ...*

*See https://www.postgresql.org/docs/9.6/static/functions-aggregate.html
<https://www.postgresql.org/docs/9.6/static/functions-aggregate.html> *

*MAX and MIN functions*

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2018-01-17 15:17:06 Re: READ COMMITTED vs. index-only scans
Previous Message Jacek Kołodziej 2018-01-17 13:08:23 READ COMMITTED vs. index-only scans