READ COMMITTED vs. index-only scans

From: Jacek Kołodziej <kolodziejj(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: READ COMMITTED vs. index-only scans
Date: 2018-01-17 13:08:23
Message-ID: CAB=Xmr7uCWN7Jbvp5VxQ699EXBF3T_7ZCxz1N9xcsuNpoFgeOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello everyone,

this is my first post here and I'm starting with asking a question about
data consistency between two consecutive SELECTs using PostgreSQL 9.6.

I'm sorry if that's something that was already discussed - I couldn't find
it either in archives, nor in _general internet_. If it is, I would
appreciate pointing it out.

I have an "append-only" events table - only INSERT and SELECT queries are
issued to it. It has an integer (sequence) ID as a primary key.

I'm issuing following two queries (within the same READ COMMITTED
transaction) to that table:
- first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1 -
I'm saving the result as a "max ID" for the second query
- second query (B): SELECT id, ... FROM events WHERE id > "min ID" AND id
<= "max ID" AND ...
- I won't dig into what's "min ID" but you can assume it's at most 100
less than "max ID"

Concurrently, rows are being added to that table.

Please note that there's enough data in the index for the first query (A)
to perform an index-only scan. I'm not sure if that's relevant but "B"
query does an index scan because of other conditions I haven't mentioned
but still needs to fetch data from the table's heap.

Here's what happening to me: the "A" query occasionally (in my case: on the
order of tenths per day) returns an ID _higher_ than any ID present in
second query's result (other conditions I haven't specified do _not_ filter
any more rows than "id <= max ID") - as if some entries were visible for
the first query, but not for the second one. This is an inconsistency that
is very problematic for me.

All I can think of is that it might be caused by the index-only-scan and
READ COMMITTED transaction isolation level but from what I gather from
documentation, it should not be possible due to the use of visibility
map[0][1][2].

And yet it happens, likely for some other reason but I can't think of any.
I've tried switching transaction isolation level to REPEATABLE READ (on the
basis that it's an faulty phenomenon occurring during to some bug) but that
didn't help.

Where am I wrong? What am I missing? What information may I provide to help
with investigating this?

[0] https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html
[1] https://www.postgresql.org/docs/9.6/static/storage-vm.html
[2] https://wiki.postgresql.org/wiki/Index-only_scans

--
Kind regards,
Jacek Kołodziej
http://kolodziejj.info

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Melvin Davidson 2018-01-17 14:30:05 Re: READ COMMITTED vs. index-only scans
Previous Message Rakesh Kumar 2018-01-17 11:35:25 Re: Parallel Btree index scan