Re: allow LIMIT in UPDATE and DELETE

From: Shelby Cain <alyandon(at)yahoo(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Csaba Nagy <nagy(at)ecircle-ag(dot)com>, SCassidy(at)overlandstorage(dot)com, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>, pgsql-general-owner(at)postgresql(dot)org
Subject: Re: allow LIMIT in UPDATE and DELETE
Date: 2006-05-22 22:49:01
Message-ID: 20060522224901.64033.qmail@web37205.mail.mud.yahoo.com
Lists: pgsql-general

----- Original Message ----
>From: Jim C. Nasby <jnasby(at)pervasive(dot)com>
>To: Shelby Cain <alyandon(at)yahoo(dot)com>
>Cc: Csaba Nagy <nagy(at)ecircle-ag(dot)com>; SCassidy(at)overlandstorage(dot)com;
>Postgres general mailing list <pgsql-general(at)postgresql(dot)org>;
>pgsql-general-owner(at)postgresql(dot)org
>Sent: Monday, May 22, 2006 5:04:51 PM
>Subject: Re: [GENERAL] allow LIMIT in UPDATE and DELETE
>
>Just try and do (what should stay) a small queue table in the same
>database as long-running reporting transactions. As long as a
>long-running report is going you might as well suspend all vacuuming on
>that queue table, because it won't do you any good; the report
>transaction means that vacuum can't remove anything.
>
>I've seen a case where a queue table should always fit into a single
>database page; 2 at most. But because some transactions will run for a
>minute or two, that table is normally about 40 pages, almost entirely
>all dead space. Of course the same problem affects all the indexes on
>that table as well.
>
>I can't imagine how bad this would be if the database actually had
>hour-long reports that had to run... and luckily the system is quiet at
>night when pg_dump runs.

My experience with job queues comes from clients that mostly use Oracle as the backend. However, even with Oracle, a queue table should store information about a job rather than keep records locked unnecessarily simply because they are being "worked on" by another hypothetical "job runner" process. By this I mean that the status of a job should be updated to a distinct state at any given moment in time (e.g. unprocessed, processing, processed). In the case I describe above, if you are using PostgreSQL you wouldn't have any open long-running transactions on that table, and vacuuming should work... or am I misunderstanding the issue?
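
To make that concrete, here is a minimal sketch of the pattern I have in mind (the "jobs" table, its columns, the connection string, and the use of psycopg2 are all hypothetical, purely for illustration): each worker claims a job in one short transaction and records the result in a second short transaction, so nothing stays open while the job itself runs and vacuum is never held off for long.

    import psycopg2

    # Hypothetical schema: jobs(id, status, payload), with status one of
    # 'unprocessed', 'processing', 'processed'.
    conn = psycopg2.connect("dbname=appdb")  # hypothetical DSN

    def claim_job():
        """Claim one unprocessed job; return its id, or None."""
        cur = conn.cursor()
        cur.execute("SELECT id FROM jobs WHERE status = 'unprocessed' "
                    "LIMIT 1 FOR UPDATE")
        row = cur.fetchone()
        if row is None:
            conn.rollback()  # nothing to do; release the snapshot at once
            return None
        cur.execute("UPDATE jobs SET status = 'processing' WHERE id = %s",
                    (row[0],))
        conn.commit()  # row lock held only for these two statements
        return row[0]

    def finish_job(job_id):
        """Record the outcome in a second short transaction."""
        cur = conn.cursor()
        cur.execute("UPDATE jobs SET status = 'processed' WHERE id = %s",
                    (job_id,))
        conn.commit()

Two concurrent workers can still briefly block each other on the same candidate row with a plain FOR UPDATE like this; if claim_job() comes back empty, just retry.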

>> Regardless, this type of queue problem can also be tackled by having your data
>> layer persisting the input from the web in memory (which maintains a low
>> perceived response time to the client) and posting to the table as fast as the
>> database allows.
>
>Uh, and just what happens when your web front-end crashes then??

Sorry, I probably should have said middle tier instead of data layer. Obviously, no single design choice will cover 100% of circumstances.

If your middle tier is located on the web server and you have a physical crash, then obviously you are going to lose the jobs that were being held in memory waiting to be posted to the database, and you'll have to resubmit them (if possible) on restart. Having sane restart requirements/procedures thought out beforehand also goes a long way toward making recovery from an unexpected failure easier.

If your middle tier lives on another machine, then you'd miss (at most) the one job that the web application was in the process of handing off to it. If a missed job submission is a highly critical problem for your application, you'd want a redundant middle tier to help avoid the problem, or perhaps you'd want to rethink posting asynchronously from your web UI and buffering the submissions.
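
As a rough sketch of the buffering idea (the queue-plus-poster structure, table name, and connection details here are mine, not from any particular product): the web tier hands the submission to an in-memory buffer and returns immediately, while a background thread drains that buffer into the database as fast as it will accept inserts. Whatever is still in memory at crash time is exactly what you'd have to resubmit.

    import queue
    import threading
    import psycopg2

    job_buffer = queue.Queue()

    def submit(payload):
        # Called from the web request handler: enqueue and return at once,
        # keeping the perceived response time low even when the database
        # is busy.
        job_buffer.put(payload)

    def poster():
        # Background thread: posts buffered jobs as fast as the database
        # allows.  Anything still in the buffer is lost if this process
        # crashes, hence the need for sane restart procedures.
        conn = psycopg2.connect("dbname=appdb")  # hypothetical DSN
        cur = conn.cursor()
        while True:
            payload = job_buffer.get()
            cur.execute("INSERT INTO jobs (status, payload) "
                        "VALUES ('unprocessed', %s)", (payload,))
            conn.commit()

    threading.Thread(target=poster, daemon=True).start()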

Regards,

Shelby Cain
