From: | "Yngve N(dot) Pettersen" <yngve(at)spec-work(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Cc: | "Andrew Sullivan" <ajs(at)crankycanuck(dot)ca> |
Subject: | Re: Possible multiprocess lock/unlock-loop problem in Postgresql 9.2 |
Date: | 2014-03-10 01:41:44 |
Message-ID: | op.xchgburr3dfyax@killashandra.invalid.invalid |
Lists: pgsql-general
Hi again,
I have now had time to do further research on this issue. I have been
able to produce a script (available on request) that reproduces the
problem, even on tables as small as 100 items and with a single thread,
and as a result I may have located the cause of the problem: a
conditional (partial) index.
As mentioned in the attachment to my previous email, the table that is
having the problem looks like this:
Table "public.probedata2_probequeue"
Column | Type |
Modifiers
----------------+----------------------+-------------------------------------
id | bigint | not null default
nextval('probedata2_probequeue_id_seq'::regclass)
part_of_run_id | integer | not null
server_id | integer | not null
state | character varying(1) |
Indexes:
"probedata2_probequeue_pkey" PRIMARY KEY, btree (id)
"run_server" UNIQUE CONSTRAINT, btree (part_of_run_id, server_id)
"probedata2_probequeue_finished" btree (id) WHERE state::text =
'F'::text
"probedata2_probequeue_run_idle" btree (part_of_run_id) WHERE
state::text = 'I'::text
"probedata2_probequeue_started" btree (part_of_run_id) WHERE
state::text = 'S'::text
Foreign-key constraints:
"probedata2_probequeue_part_of_run_id_fkey" FOREIGN KEY
(part_of_run_id) REFERENCES probedata2_proberun(id) DEFERRABLE INITIALLY
DEFERRED
"probedata2_probequeue_server_id_fkey" FOREIGN KEY (server_id)
REFERENCES probedata2_server(id) DEFERRABLE INITIALLY DEFERRED
In my test database I have been running tests without the equivalent of
the "probedata2_probequeue_run_idle" index and the other conditional
indexes.
Without the "run_idle" index the queries (fetch idle candidates, mark as
started, fetch the records) complete in less than a second (<0.9 seconds),
as expected.
*With* the index, the time to complete that operation increases by a
factor ranging from 10-30 for small sequences and a single thread, to
1000 for large sequences (23 million rows, 4-10 threads), taking up to 20
minutes to complete an update of 6000 rows for each thread, with the
process running at 100% CPU the whole time.
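For anyone who wants to repeat the comparison without the full script, the
test essentially boils down to timing the same fetch/update cycle with and
without the conditional index; roughly like this (the index definition is
the one from the \d output above, everything else is only an illustration):
-- in psql, enable client-side timing
\timing on
-- slow case: with the conditional index in place, run the
-- SELECT / UPDATE ... RETURNING cycle and note the duration
-- fast case: drop the index and repeat the same cycle
DROP INDEX probedata2_probequeue_run_idle;
-- recreate it afterwards (same definition as in the \d output)
CREATE INDEX probedata2_probequeue_run_idle
    ON probedata2_probequeue (part_of_run_id)
    WHERE state = 'I';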
The purpose of the index is to provide quick access to the idle items for
a given job, both the records themselves and their count. Normally there
is just a single active job in the table, and at the time the problem
occurs all entries for that job will be in the index.
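In other words, the index is there to make queries of roughly this shape
cheap (run id 334 is just the one from the log quoted below, used as an
example):
SELECT count(*)
  FROM probedata2_probequeue
 WHERE part_of_run_id = 334
   AND state = E'I';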
As I mentioned last time, the problematic command is the UPDATE command
UPDATE probedata2_probequeue SET "state"=E'S'
WHERE "state" = E'I' AND id IN ( .....)
RETURNING id
and I have confirmed that with the built-in Django debug query information
in my test script.
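One way to capture more detail on what the executor is doing in the slow
case would presumably be to run the UPDATE under EXPLAIN inside a
transaction and roll it back afterwards; a sketch of the idea (the ID list
is elided here, just as in the log):
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
    UPDATE probedata2_probequeue SET "state" = E'S'
    WHERE "state" = E'I' AND id IN ( .....)
    RETURNING id;
-- EXPLAIN ANALYZE actually executes the UPDATE, so roll it back
ROLLBACK;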
On Sat, 08 Feb 2014 15:57:10 +0100, Yngve N. Pettersen
<yngve(at)spec-work(dot)net> wrote:
> Hi again,
>
> Sorry about the delay, but an unexpected 3-week trip, combined with not
> being able to fix the system's router (whose configuration had become
> corrupted before I left), meant that I could not perform any testing
> until this week, after the router had been fixed and reinstalled.
>
> I just did a test using expanded logging of queries and duration, and
> actually saw this happen with just 4 (four) processes, not the 10 I have
> seen before.
>
> The relevant parts of the log, as well as an EXPLAIN and table info dump,
> are attached in a zipfile; the large lists of 6000 to-be-updated IDs in
> the UPDATE commands have been removed for clarity (the ID sequences did
> not overlap, each was within a range of 7500 IDs, and each was at least
> 80000 away from the other updates).
>
> Background: Immediately before this operation the queue had been set up
> with the
> command
>
> INSERT INTO probedata2_probequeue (part_of_run_id, server_id, state)
> SELECT '334' AS part_of_run_id, server_id, E'I' AS state FROM
> probedata2_preparedqueueitem
> WHERE part_of_queue_id = '2'
>
> followed by a COMMIT, a process taking ~15 minutes to copy a 23 million
> entry table into the queue. After this the total number of entries in the
> target table is 70 million, distributed across 3 runs.
>
> Some SELECT count and ANALYZE operations have probably also been
> performed after the initialization operation, before the processes were
> started.
>
> Shortly afterward (within a couple of minutes), four processes initiated
> the SELECT and UPDATE sequence I outlined earlier; each operation was
> started between 1 and 10 seconds after one of the others, with 20 seconds
> from the start of the first command to the start of the last.
>
> SELECT id FROM probedata2_probequeue
> WHERE state = E'I' AND part_of_run_id = 334
> LIMIT 6000 OFFSET 85103
>
> UPDATE probedata2_probequeue SET "state"=E'S'
> WHERE "state" = E'I' AND id IN ( .....)
> RETURNING id
>
> The SELECT operations took 21-122 ms to complete, while the UPDATEs took
> 1093000-1120000 ms (~18 minutes) to complete. During this time the
> processes were running at 100% CPU. With more processes involved earlier
> I recall seeing even longer execution times for the UPDATEs before I
> killed the processes (this is the first time I have seen this kind of
> situation resolve itself without killing the processes).
>
> For reference, once these 4 commands had completed (the system is set up
> to wait until it sees task-completed messages from the first processes
> that were started, before starting new ones), the next sequence of these
> commands took 122 ms and 107 ms, respectively, and the one after that
> took 50 ms and 108 ms.
>
> Any suggestions for where to investigate further?
>
> I am considering setting up a small example to see if I can reproduce,
> but
> have not had time to do so yet.
>
> On Sat, 04 Jan 2014 20:06:19 +0100, Yngve N. Pettersen
> <yngve(at)spec-work(dot)net> wrote:
>
>> On Sat, 04 Jan 2014 19:40:31 +0100, Andrew Sullivan
>> <ajs(at)crankycanuck(dot)ca> wrote:
>>
>>> On Sat, Jan 04, 2014 at 07:07:08PM +0100, Yngve N. Pettersen wrote:
>>>> I tried that before, but ran into some issues, IIRC a similar looping
>>>> problem to this one, where queries never ended. I split it up in an
>>>> attempt to solve that problem.
>>>
>>> Pulling the data out into the application and sending it back in won't
>>> improve things. Exactly the same number of rows need to be visited,
>>> but the way you have it now you have to marshall all the data and ship
>>> it to the application too. So it's automatically slower. Indeed,
>>> making it slower might have masked your problem.
>>
>> Could be
>>
>>>
>>>> In the select/update case there is no sorting in the query; there is
>>>> an offset/limit clause though, and the number of records retrieved is
>>>> currently restricted to <10000 per query (out of 20 million in the
>>>> active subset).
>>>>
>>>> SELECT id from queue where state = E'I' and job_id = <integer>
>>>> offset
>>>> <random 200..150000> limit <1-6000>
>>>
>>> This could be part of the problem. Are the different threads working on
>>> different job_ids, or is that the same job_id? If you don't SORT that
>>
>> Same job_id, at least in the current system.
>>
>>> query before the OFFSET, then the rows will come back in whatever
>>> order the system likes.
>>
>> I suspect that a sort operation on (currently) 20+ million rows for
>> every query for just 6000 (previous version was 1500 entries) would
>> cause quite a bit more slowness than breaking up the query in two
>> operations, or the risk of collisions would, because each process would
>> have to load all that information (even if it is cached).
>>
>>>> However, in the UPDATE case, the looping processes are all UPDATE
>>>> queries,
>>>> no SELECTs involved.
>>>
>>> But you said it's all in the same transaction scope. The lock is at
>>> the transaction scope.
>>
>> But the statements it locks/loops on are only UPDATE statements, also in
>> the processes that are waiting.
>>
>>> Anyway, what I'd do is try to cause the condition and post the
>>> pg_locks information. When I've done this in the past, usually the
>>> best thing to do is also to have query logs on for everything (don't
>>> forget to log the pid!) so you can see what the other transaction
>>> you're waiting on touched already. You can usually find the inversion
>>> that way. Once you see it, it's always obvious what you've done, in
>>> my experience (and completely mystifying before that, unfortunately).
>>
>> Will take a look in a few days, probably midweek.
>>
>>
>
>
--
Sincerely,
Yngve N. Pettersen
Using Opera's mail client: http://www.opera.com/mail/