Re: Why are stored procedures looked on so negatively?

From: Neil Tiffin <neilt(at)neiltiffin(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Cc: Some Developer <someukdeveloper(at)gmail(dot)com>
Subject: Re: Why are stored procedures looked on so negatively?
Date: 2013-07-25 23:59:29
Message-ID: 46E7303D-D75E-4BCF-B692-14C54E548817@neiltiffin.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Jul 23, 2013, at 7:29 PM, Some Developer <someukdeveloper(at)gmail(dot)com> wrote:

> I've done quite a bit of reading on stored procedures recently and the consensus seems to be that you shouldn't use them unless you really must.

Application architecture is a specific software engineering discipline. These types of generalizations come from coders who don't really understand application architecture and/or databases. There are specific reasons to put code in the database server, application middleware, or the application. To make this decision, much more must be known that what has been presented in this thread.

For example, if you want to maintain data integrity, then you really want to use very specific table definitions with foreign keys, defaults, and constraints. While this is not related to stored procedures, application coders try to shy away from these (like they do stored procedures) because it makes working with the database harder. It forces the data to be correct before it comes into the database. When foreign keys, defaults, and constraints are not enough to ensure data integrity then stored procedures should be used. The question is, how important is your data and how much time do you want to spend correcting it after it enters the database?

The next reason is performance. Stored procedures can in certain circumstances dramatically increase or decrease performance of both the client and the server, network traffic, and application response time. But which one is most important in your application? The consensus does not know. The best practice depends on the type of multitasking the application is performing, the type of client, the client coding environment, the locations of the data being processed, the locking requirements, the concurrency requirements, the capacity of the servers and clients, the network topology, the expected response time for the activity, etc. It is not at all uncommon to think that a stored procedure should be in the database server and to have performance testing show that it is better in the application and vice versa. Keep in mind that as the database becomes loaded, these performance issues may change and any decisions you make on a development database with only partial data may not prove out in the final application.

There may also be reasons to normalize/denormalize data in the database, but present a different view to the application. This should, if done correctly, make the application code simpler to maintain and understand. What is your support experience level? No experienced DBAs, this is probably a bad idea. Relatively inexperienced application coders, this is probably a really good idea.

Sophisticated applications may even have more than one database server. One update server and multiple read only servers is very common in the environments I work in. Since the update server is not burdened by providing all of the read only data, it has much more capacity to handle stored procedures. Some of our environments see 80 or 90% of the load as read only. This is the network topology part.

Another example, if the result of a procedure is one number, but requires 15 columns, from 200 rows the question is, is it faster to do it on the server and only put one resulting number back on the network, or should the system get all 15 columns times 200 rows worth of data and put that on the network for the client to analyze? The answer is, well it depends? Well, maybe not for this example, but hopefully you get the point. Now if part of the procedure requires data that comes from a GUI table or user entered data that only resides in the application, then the situation changes.

Wherever you put the code, you should have specific reasons for doing so and for high performance applications it is not appropriate to generalize that all the code should go exclusively into the database or the app.

Neil

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rob Sargent 2013-07-26 00:10:27 trouble with pam building 9.3beta2
Previous Message Adrian Klaver 2013-07-25 23:54:28 Re: how _not_ to log?