Re: DeadLocks..., DeadLocks...

From: Tom Allison <tom(at)tacocat(dot)net>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: DeadLocks..., DeadLocks...
Date: 2007-06-15 10:13:36
Message-ID: 46726650.9040400@tacocat.net
Lists: pgsql-general

Gregory Stark wrote:
>
> The insert is deadlocking against the update/delete.
>
> The problem is that the insert has to lock the referenced records to be sure
> they aren't deleted. This prevents the update from updating them. But the
> update has already updated some other records which the insert hasn't referred
> to yet. When the insert tries to insert a record referring to those, it can't
> lock them because they're already locked by the update, and you have a deadlock.
>
> Do you really need the update at all? Do you use the last_seen field for
> anything other than diagnostics?
>
> You could try breaking the update up into separate transactions instead of a
> single batch statement. That would perform poorly but never deadlock.
>
> You could try to order them both, but I don't know if that's possible: UPDATE
> doesn't take an ORDER BY clause. I suppose you could execute the update
> statement as separate queries within a single transaction, in whatever order
> you want, which would avoid the performance cost of issuing hundreds of
> transactions while still letting you control the order.
>
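A minimal sketch of that last suggestion (separate single-row updates inside one
transaction, in a fixed order), assuming Perl with DBI/DBD::Pg, a numeric
token_idx, and placeholder connection details -- illustrative only, not the exact
code I'd run:

use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=spam', 'user', 'pass',
                       { RaiseError => 1, AutoCommit => 0 });
my $sth = $dbh->prepare(
    'update tokens set last_seen = now() where token_idx = ?');

# Updating in sorted token_idx order means two sessions that touch the same
# tokens lock the rows in the same order, so one simply waits instead of
# deadlocking against the other.
$sth->execute($_) for sort { $a <=> $b } @$tokens;

$dbh->commit;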

The last_seen column is a purge control: when last_seen < current_date - ?? I
remove the record.

I think there are two ways I could do this without killing performance. Please
let me know what you think...

I could modify the update to something more like:

update tokens
   set last_seen = now()
 where token_idx in (...)
   and last_seen < current_date;

or even push the cutoff back by several days. That way the update only touches
rows that haven't already been stamped recently, so it has far fewer rows to lock.

There's always the risk of losing a few records, but I'm probably not going to
notice. (Not bank transactions)

The other approach would be to use an external file to queue these updates and
run them from a crontab. Something like:
use Fcntl qw(:flock);                 # LOCK_EX / LOCK_UN constants

open(my $fh, '>>', '/var/spool/last_seen') or die "open: $!";
flock($fh, LOCK_EX);                  # serialize concurrent writers
seek($fh, 0, 2);                      # re-seek to end-of-file after getting the lock
print $fh join("\n", @$tokens), "\n"; # one token_idx per line
flock($fh, LOCK_UN);
close $fh;
and then run a daily job that reads all of these into a hash (to make the values
unique) and issues one SQL statement at the end of the day.
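A rough sketch of that daily job, again assuming Perl with DBI/DBD::Pg; the
database name and credentials are placeholders, and the table and column names
are the same tokens/token_idx/last_seen used above:

use strict;
use warnings;
use Fcntl qw(:flock);
use DBI;

# Take the spool file exclusively so the writers block while we drain it.
open(my $fh, '+<', '/var/spool/last_seen') or die "open: $!";
flock($fh, LOCK_EX);

my %seen;
while (my $tok = <$fh>) {
    chomp $tok;
    $seen{$tok} = 1 if length $tok;   # hash keys give unique token_idx values
}

if (%seen) {
    my $dbh = DBI->connect('dbi:Pg:dbname=spam', 'user', 'pass',
                           { RaiseError => 1, AutoCommit => 1 });
    my $in = join(',', ('?') x keys %seen);   # one placeholder per token
    $dbh->do("update tokens set last_seen = now()
               where token_idx in ($in)
                 and last_seen < current_date",
             undef, keys %seen);
    $dbh->disconnect;
}

truncate($fh, 0);                     # empty the spool before releasing the lock
flock($fh, LOCK_UN);
close $fh;

Holding the lock across the truncate means a writer that was waiting simply
appends to the freshly emptied file, rather than to lines already processed.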

Is there a limit to the number of values you can have in an IN(...) statement?
