Re: Small PosgreSQL locking function request - with bounty

From: Ralf Schuchardt <rasc(at)gmx(dot)de>
To: David(dot)I(dot)Noel(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Small PosgreSQL locking function request - with bounty
Date: 2013-09-12 12:45:16
Message-ID: F69C9B15-48B3-4EAB-8905-2CC79C50E28C@gmx.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi David,

have you used the "for update" clause in your select statements? With this clause "select" locks the selected row(s) in a table for modifications and other "select for updates".
My understanding is, that "for update" does what you need. You can execute your select and update statements in a single transaction in your Java application and get the desired effect (if the "URL" column is the primary key).

Regards,
Ralf

Am 12.09.2013 um 13:40 schrieb David Noel <david(dot)i(dot)noel(at)gmail(dot)com>:

> I have a few database queries that I've been running from within a
> Java project. I have recently come to the understanding that I need to
> run them instead within the PostgreSQL server as stored functions. I
> have that understanding because I need to make use of locking
> functionality, and that seems only able to be done by means of
> PostgreSQL functions. Transactions don't seem to be able to provide
> this. I've never written functions for postgres, so I thought maybe
> someone here could help.
>
> To provide some context: the code is a part of a webcrawler. More
> specifically, it is a part of the queuing system that handles the
> management of URL's to be crawled. The system takes URL's from the
> queue with one query, and marks them as active with a second. It then
> sends the results on to the crawler. Once the crawler has crawled the
> URL, a third query removes the URL from the queue.
>
> The code is running concurrently in multiple threads on multiple
> servers, and in scaling it to multiple servers I've run into some
> problems. It seems that due to the way postgres is designed I am
> unable to lock tables, or utilize transactions in Java to acheive
> concurrency. So I need it instead to be run as a postgres
> function/stored procedure. It seems. Am I correct in this, or did I
> misread the PosgreSQL Transactions documentation?
>
> Assuming the only way to accomplish this is with a postgres function,
> would anyone care to implement this for a small reward (though
> relative to the amount of work required I'd say it's probably a decent
> to large reward)?
>
> The queries are as follows:
>
> String querySelect =
> "select \"URL\",\"PublishDate\",\"SiteName\",\"Classification\",\"Special\",\"NextCrawlDate\"
> from \"crawlq\" " +
> "where \"Special\" = ? " +
> "AND \"Active\" = 'true' " +
> "AND \"TimeoutDate\" <= now() " +
> "AND \"CrawlError\" = 'false' " +
> "OR " +
> "\"Special\" = ? " +
> "AND \"Active\" = 'false' " +
> "AND \"CrawlError\" = 'false' " +
> "order by \"NextCrawlDate\" asc limit 1";
>
> String queryUpdateActive =
> "update \"crawlq\" " +
> "set \"Active\" = 'true', " +
> "\"TimeoutDate\" = now() + interval '5 minutes' " +
> "where \"URL\" = ? " ;
>
> This is what I need the function to do:
>
> I need the PostgreSQL function to first lock the table "crawlq".
> I then need it to perform the "querySelect" query.
> I then need it to perform the "queryUpdateActive" query.
> I then need it to unlock the table.
> I then need it to return the values from the select query to the Java project.
>
> Deliverables: I need the postgres function and a simple java program
> that calls the function and returns a result set. It doesn't need to
> do anything with the data, just call the function and return the value
> to the program. This should be a very simple project that shouldn't
> take more than 15 minutes for anyone familiar with writing postgres
> functions. Would $50 via PayPal be enough to entice anyone to offer a
> solution? I could also offer the payment in LiteCoins if you'd rather
> do it that way. Of course if you're feeling benevolent I wouldn't
> object to anyone who felt like doing it for free.
>
> The Java function I'm using that fetches elements from the queue
> currently is as follows:
>
> public synchronized FetchType fetch(String cs){
> if(debug_level == 1)
> System.out.println(new java.util.Date(System.currentTimeMillis()) +
> " : DAO : fetching element from database");
>
> /**
> * prepare the select query
> * execute it -- pull the items from the queue database
> * load the query results into a return container
> * clean up
> * prepare the update query
> * execute it -- update the record as active
> * clean up
> * commit the transaction
> * return the query results
> */
>
> try {
> if(!dbq.isValid(10))
> connectQ();
> } catch (SQLException e1) {
> e1.printStackTrace();
> }
>
> PreparedStatement stmt = null;
> PreparedStatement stmt2 = null;
> ResultSet rset = null;
> FetchType ret = null;
>
> // TODO: use a stored function
> String querySelect =
> "select \"URL\",\"PublishDate\",\"SiteName\",\"Classification\",\"Special\",\"NextCrawlDate\"
> from \"crawlq\" " +
> "where \"Special\" = ? " +
> "AND \"Active\" = 'true' " +
> "AND \"TimeoutDate\" <= now() " +
> "AND \"CrawlError\" = 'false' " +
> "OR " +
> "\"Special\" = ? " +
> "AND \"Active\" = 'false' " +
> "AND \"CrawlError\" = 'false' " +
> "order by \"NextCrawlDate\" asc limit 1";
>
> String queryUpdateActive =
> "update \"crawlq\" " +
> "set \"Active\" = 'true', " +
> "\"TimeoutDate\" = now() + interval '5 minutes' " +
> "where \"URL\" = ? " ;
>
> try {
> stmt = dbq.prepareStatement(querySelect);
> stmt.setEscapeProcessing(true);
> stmt.setString(1, cs);
> stmt.setString(2, cs);
> rset = stmt.executeQuery();
>
> if(rset.next()){
> ret = new FetchType(
> rset.getString("URL"),
> rset.getString("SiteName"),
> rset.getString("Classification"),
> rset.getDate("PublishDate"),
> rset.getString("Special")
> );
> } else
> ret = null;
>
> rset.close();
> stmt.close();
>
> if (ret != null){
> stmt2 = dbq.prepareStatement(queryUpdateActive);
> stmt2.setEscapeProcessing(true);
> stmt2.setString(1, ret.getURL());
> stmt2.execute();
> stmt2.close();
> dbq.commit();
> }
>
> if(debug_level == 1)
> System.out.println(new java.util.Date(System.currentTimeMillis())
> + " : DAO : fetch complete " + ret.getURL());
>
> return ret;
> } catch (SQLException e) {
> try {
> e.printStackTrace();
> dbq.rollback();
> stmt.close();
> stmt2.close();
> e.printStackTrace();
> return null;
> } catch (SQLException e2) {
> e2.printStackTrace();
> return null;
> }
> }
> }
>
> Running on one machine I'm bypassing the transaction concurrency issue
> by synchronizing the method. But the Java concurrency constructs I'm
> using here don't scale to multiple machines.
>
> At any rate, have I provided enough information to get the solution
> I'm looking for? Have I provided enough financial incentive to get
> this implemented? If so, please respond with code here to the list so
> multiple people don't implement it and expect to be paid. I can only
> pay one person, though if another person fixes a bug in a proposed
> solution I'm open to splitting the bounty however seems fair.
>
> Thanks for reading, hope to hear back!
>
> -David Noel
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message againstdemons84 2013-09-12 13:11:17 Re: Trouble with replication
Previous Message Andreas 'ads' Scherbaum 2013-09-12 12:30:03 Re: Call for design: PostgreSQL mugs