Every programming position I have held has been very database-centric, and as a result I have been exposed to a number of approaches to SQL code management. My current preference is to keep the SQL within stored procedures and treat it as I would a class API.
I have come up with a set of what I feel are best practices, and would be very interested in feedback from the community.
My most recent experience in this area has been with Microsoft SQL Server 2000, so I will focus on that in this node.
Document your stored procedures internally
I find it helpful to document the intent for stored procedures within their body, and to also document changes to the stored procedure with a date stamp and a ticket number. This makes it easier for future maintainers.
CREATE PROCEDURE dbo.SOMETHING_SAVE
@something_id int,
@name varchar(50)
AS
/*
Does blah blah
-- Changed on 2006-09-07 to resolve ticket #570
*/
-- Do stuff
GO
Treat your stored procedures as any other asset - keep them in a versioning system
Your schema and stored procedures should always be recorded in a versioning system. If you find yourself needing to use a version of your application from 2 years ago, you need to be able to recreate the database as it was at that time.
Additionally, when moving from a development to production environment you need to be able to easily identify the changes that have been made.
Privilege Separation
I prefer to only access the data through stored procedures (and possibly views). With some databases you can revoke all table permissions, and grant selective permissions to individual SPs/Views. This works because of 'Ownership Chaining'.
By doing this you can restrict the level of access each database user is granted to a much finer level. Different interfaces into the system (e.g. admin vs external user) can have access to specific procedures which adhere to business logic.
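On SQL Server 2000 that looks roughly like the following sketch (web_user is a made-up login name; the procedure and table names are borrowed from examples elsewhere in this node):

```sql
-- Hypothetical setup: the application login owns nothing and holds
-- no table permissions at all.
REVOKE ALL ON dbo.Users FROM web_user

-- Grant execute only on the procedures this interface needs.
GRANT EXECUTE ON dbo.PORTAL_SOMETHING_LIST TO web_user
GRANT EXECUTE ON dbo.AUTH_USER TO web_user

-- Ownership chaining: dbo owns both the procedures and the tables they
-- touch, so web_user can run the procedures without holding any direct
-- table permission.
```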
Another way I like to use this is to have password verification be a black box.
$sql = q{EXEC AUTH_USER @username = ?, @password = ?};
my ($user_id) = $dbh->selectrow_array($sql,{},$username,$password);
And finally, it offers additional protection against SQL injection. If an attacker can't touch the tables directly, they can do less damage.
Naming conventions
I like to reverse the normal naming scheme and use NOUN_VERB.
Examples: SOMETHING_BY_ID, SOMETHING_SAVE, SOMETHING_LIST.
I often add an extra level to that by prefixing the name with an abbreviation of the intended interface. e.g.:
- PORTAL_SOMETHING_LIST
- PORTAL_SOMETHING_BY_ID
- ADMIN_SOMETHING_LIST
- ADMIN_SOMETHING_BY_ID
- ADMIN_SOMETHING_SAVE
I find that this keeps related items grouped when ordered alphabetically.
Benchmark the stored procedures
Make use of whatever tools are available for your database. For MS SQL 2000 there is Query Analyzer, which is a wonderful tool if you enable the 'Show Execution Plan' option.
The profiler tool is also useful for identifying bottlenecks.
Solutions (MS SQL 2000-centric) to common objections to stored procedures
Dynamic ordering
SQL Server 2000 does not directly support dynamic ordering, but you can emulate it with a CASE statement.
An important thing to remember is that it will not intelligently switch the datatype being compared: if some of the fields are char and some are integer, it will try to convert them all to integer and cause an error.
To avoid this you need multiple CASE statements, with the WHEN clauses grouped by data type.
order by
CASE @order
WHEN 'name' THEN lastname
WHEN 'state' THEN state
END,
CASE @order
WHEN 'name' THEN firstname
END,
CASE @order
WHEN 'user_id' THEN user_id
END
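The same CASE trick can be extended to cover sort direction, at the cost of one CASE per direction. In this sketch @dir is a hypothetical extra parameter (not part of the example above); non-matching branches fall through to NULL and so do not affect the ordering:

```sql
order by
    CASE WHEN @order = 'name' AND @dir = 'asc'  THEN lastname END,
    CASE WHEN @order = 'name' AND @dir = 'desc' THEN lastname END DESC,
    CASE WHEN @order = 'name' AND @dir = 'asc'  THEN firstname END,
    CASE WHEN @order = 'name' AND @dir = 'desc' THEN firstname END DESC
```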
Dynamic Where clauses
Make use of boolean short-circuiting.
CREATE PROCEDURE [dbo].[dynamic_example]
@lastname varchar(50) = NULL,
@firstname varchar(50) = NULL
AS
SELECT * from Users
WHERE (@lastname IS NULL OR users.lastname LIKE @lastname)
AND (@firstname IS NULL OR users.firstname LIKE @firstname)
GO
EXEC dynamic_example @firstname = 'chris%'
Update - moved dynamic paging to a reply, instead of linking to my perlmonk.org account
Re: Moving SQL from perl to Stored Procedures
by tilly (Archbishop) on Sep 10, 2006 at 01:12 UTC
I disagree with calling this a best practice without qualifiers or more details.
There are a lot of arguments on both sides when it comes to moving logic in and out of the database. Either decision is very defensible. In a particular situation it may make sense to choose one over the other, but I do not believe that either choice is universally better.
Furthermore if you're going to move logic into the database, then there are additional things you need to do as well. I think that if you're saying that you should do the first, you should at least mention the second.
What are some of the trade-offs?
Against keeping logic in databases:
- Who is developing your application? Probably programmers. What are they best at? Probably their application language (given this site, likely to be Perl). Why make them program in another language?
- You present SQL injection attacks as a key problem to solve. That has many solutions. For instance you can use placeholders. Or you can use an object relational manager such as Class::DBI.
- Stored procedure languages tend to be quite bad. Business logic etc can often be far more clearly done outside of the database than inside the database.
- Having significant logic in stored procedures makes code releases significantly more complex.
- If multiple programmers are working on logic in stored procedures in the database, they are more likely to accidentally step on each other's toes.
- If there is any chance that you'll need to change databases, the more logic in the database, the worse it is. Very relevant to this point is that proprietary database vendors actively attempt to charge you as close to your cost of switching databases as possible. Those negotiations may go better for you if you really can switch databases easily. (See Information Rules for an explanation of why they do this.)
- The database tends to be the hardest part of your system to scale. Verifying logic takes work for your database, and may cause it to run into problems sooner. (Note "may", scalability can go either way.) Yes, I know that database vendors claim that it is more scalable to put logic in the database. It is also in their financial interest that you do so...
- As the anonymous monk pointed out above, when you leave some logic in the application and move other logic into the database, you are tightly coupling two pieces of code that are very far from each other. In general, this is an unwise thing to do.
For putting logic in databases:
- Security. While SQL injection attacks can be prevented in other ways, if someone manages to break into your webserver some other way, having the database locked down makes it more difficult for them to do further damage.
- When multiple applications access the same database (one of those "applications" may be ad hoc queries), you can't guarantee that they will all implement the same business logic unless it is in the database.
- Performance. It is often possible to code business logic to run faster in the database than it will outside. The biggest win being due to avoiding unnecessary round trips between machines. Some may wonder how this squares with my scalability point above. To understand it, remember that scalability is like a Mack truck - a sports car may get your groceries home faster, but the truck can move more groceries.
- Avoid race conditions. When logic is out of the database, you sometimes have to choose between accepting a race condition and holding a lock that can be a scalability hit. (See my comment before on scalability going either way...) The same locking done within the database is less of a scalability issue because you do away with the latency of round trips between the database and the client.
Now looking at those lists, what stands out? The first thing that stands out for me is that there is no single "killer" point that makes it obvious that one is always the right way to go and the other is not. The second is that the list of reasons against logic in the database is longer than the list of reasons to put it there. The third is that the reasons against putting logic in the database are mostly addressable, while the benefits of putting it in the database are mostly things you're stuck with.
Therefore my conclusion is that, depending on the project, either choice can be reasonable. But if you choose to put logic in the database, then you should address the problems that can arise. Here is a short list of things that I think you should do.
- Give each developer their own database. That way they won't get in each other's way as they are developing.
- Create automated procedures for synchronizing your database with what is in source control. This will help take the pain out of setting up and maintaining per developer databases, and those procedures can become part of your release process, addressing that issue.
- Make it very clear in your application design what the line is between what belongs in the database and what does not. Be very vigilant in maintaining that line. That way you'll limit the coupling issues.
- Make sure that your developers understand enough about how to work with stored procedures to do what they need to do. They don't need to be experts, but they should be able to understand what a procedure does.
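The second item - automated synchronization from source control - can be sketched very simply. This is my own illustration, not a drop-in tool: it applies every .sql file in a checkout to the target server with osql, SQL Server 2000's command-line client. The procs/ directory, server and database names are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the osql invocation: -E trusted connection, -b abort the batch
# on error, -S server, -d database, -i input script.
sub osql_command {
    my ($server, $database, $file) = @_;
    return ('osql', '-E', '-b', '-S', $server, '-d', $database, '-i', $file);
}

# Apply every script in the checkout, stopping at the first failure.
for my $file (sort glob 'procs/*.sql') {
    system(osql_command('DEVSERVER', 'MyAppDev', $file)) == 0
        or die "Failed applying $file\n";
}
```

Run from a per-developer checkout this rebuilds that developer's database; run from a release branch it becomes part of the release process.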
If you don't do at least some (and probably most, if not all) of those things, then I think it is a bad idea to move logic into the database. If you do all of those things and you actively need the benefits from moving your logic there, then you are probably making a wise choice.
The first thing that stands out for me is that there is no single "killer" point ...
The killer points for me in favour of stored procedures over embedded SQL are:
- Relational logic and database design is hard to master.
A good DBA (a rare breed) can save the company their salary many times over. Very few general programmers become really adept at DB design.
- Applications need only the results, not how to obtain them.
DB design transcends applications for all but the most trivial. A DBA can provide a cross-application overview and arbitrate between their disparate requirements. Application programmers are usually too caught up in their current application's requirements to step back and subjugate their perception of their application's needs in favour of another application's with such arbitration.
Isolating your DB design from your applications, and vice versa, is a sound application of the decoupling principle at so many levels. New applications come on line that require changes to the DB schema. Other applications should not have to change to accommodate this. The stored procedures are modified to account for the new schema and existing applications continue to run without change.
- Business code and data should live together and be application independent.
Tax rates, depreciation terms and a whole gamut of other typical business calculations, rules and logic live outside the application's remit, and are subject to change by government edicts and legislation.
When they change, they change across the board for all affected applications. Changes should be implemented and tested once, not in every application.
- You do not realise the true benefits of a RDBMS until all your business data exists in a single database.
It is only once it is possible to perform relational logic across the entire range of company data assets that they really begin to earn their keep.
Warehousing solutions where the data from individual databases is replicated wholesale into secondary conglomerations miss opportunities for ad-hoc queries that the conglomeration process does not support.
Similarly, warehousing that attempts to perform cross-DB searches via meta-data and meta-searching is rarely, if ever, capable of keeping up with changes and accommodating speculative ad-hoc queries. There is usually a fairly substantial delay between the notion of a cross-DB query being useful and it becoming available, and a not inconsiderable expense in making it available, which requires a costed business case before budget can be allocated. Those delays, costs and red tape suppress discovery.
No web programmer worth his salt would advocate mixing Perl and HTML. Why advocate mixing Perl and SQL?
The obligatory analogy: GPs diagnose, prescribe and manage year-on-year health care, but they defer specialist treatment to specialists - eye, brain, heart specialists, etc.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Excellent points tilly - thank you for the feedback.
I agree that stored procedures do not fit every situation, and that there are new issues that arise when you decide to embrace that solution - the biggest one for me is coupling. That said, there are many situations where they are the best solution available.
The intent for the original post was not to suggest that stored procedures are the best practice by the way - it was to discuss best practices when using stored procedures.
Re: Moving SQL from perl to Stored Procedures
by Jenda (Abbot) on Sep 09, 2006 at 11:03 UTC
Document your stored procedures
I tend to put the docs just above the CREATE PROCEDURE statement. I also do not include the dates and descriptions of changes, that's what the source control software is for. As I keep all stored procedures as separate files in Perforce and do use the changelist descriptions I do not feel the need to repeat the same info elsewhere. If I need the history I know where to look.
Identifying changes for promotion
While I do keep the procs in Perforce and do have separate Dev, QA and Prod branches, I usually do not use this when generating the update scripts. I run a service on the servers indexing (generating MD5 hashes of) the objects (pages, DLLs, executables, stored procs, views, schemas of tables, ...) and whenever preparing an update I simply compare the indexes and pack the differing files.
Naming conventions
I like to be able to read the names of the procs. And since the stored procs are commands they should IMHO have the form of a command sentence. FetchThis, InsertThat, UpdateSomething, DeleteWhatever, ExpireJobs, ExportSites, ImportSitePostingParameters, GetSiteName, ...
The most common verbs are Fetch = return a recordset with some data, Get = return a few pieces of info in OUTPUT params, Insert = insert a new row to a table and return the generated ID, Update = update a row (usually using the ID), Set = insert or update something that doesn't have its own generated ID (like some additional details for some object), Delete = delete something.
Dynamic Where clauses
As explained in Re^3: Moving SQL from perl to Stored Procedures - MSSQL tricks, dynamic paging it's most efficient, even if inconvenient, to branch out and write several SELECT statements.
Re: Moving SQL from perl to Stored Procedures - MSSQL tricks, dynamic paging
by imp (Priest) on Sep 09, 2006 at 04:04 UTC
Paging is difficult in SQL Server 2000 for the following reasons:
- You cannot use a variable with TOP
- You cannot specify an offset with TOP
Read more... Common solutions, and my preferred approach (5 kB)
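For reference, here is a sketch of one common SQL Server 2000 workaround - the table-variable technique, not necessarily the preferred approach from the collapsed write-up. Table and column names follow the other examples in this thread:

```sql
-- Page 3 of 10-row pages: rows 21 through 30.
DECLARE @page_start int, @page_end int
SELECT @page_start = 21, @page_end = 30

-- Buffer the ordered keys; the IDENTITY column numbers the rows.
DECLARE @paging_buffer table (
    sort_id int not null identity(1,1) PRIMARY KEY,
    user_id int not null
)

SET ROWCOUNT @page_end   -- stop inserting once we have enough rows
INSERT INTO @paging_buffer (user_id)
SELECT user_id FROM Users ORDER BY lastname, firstname
SET ROWCOUNT 0

-- Join back to pull the full rows for just the requested slice.
SELECT u.*
FROM @paging_buffer pb
JOIN Users u ON u.user_id = pb.user_id
WHERE pb.sort_id BETWEEN @page_start AND @page_end
ORDER BY pb.sort_id
```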
Use cursors!
Update:
BTW: These dynamic where clauses are very expensive because the optimizer cannot optimize them.
Better way: generate SQL statements with a fixed filter, and execute them from within a procedure. This is much more performant. There was an article about this in SQL Mag's Black Belt SQL column recently.
Maybe I should meditate someday about "SQL-cursors for the unwary" or something like that.
You mean build an SQL statement in an nvarchar variable and then execute it? Well well well ... this has two big problems. The first is that this way you force the server to generate a query plan for each execution, which may be a rather expensive operation. The second, even more important, is that it is prone to SQL injection attacks. And it's even more dangerous than SQL statements built in the pages/scripts: first because it's hidden in something that most believe to be safe, and second because the statement is executed with the permissions of the stored procedure owner (usually dbo!) instead of the permissions of the user accessing the database!
I can't say I never use this, but I do try to treat it as a last-resort solution. It's much better, even if not very convenient, to branch out and write several SELECT statements. That way the server can generate and REMEMBER the query plans for each of the cases, and there is nothing ad-hoc generated and therefore dangerous:
CREATE PROCEDURE [dbo].[dynamic_example]
@lastname varchar(50) = NULL,
@firstname varchar(50) = NULL
AS
BEGIN
IF (@lastname IS NULL)
IF (@firstname IS NULL)
SELECT * from Users
ELSE
SELECT * from Users
WHERE (users.firstname LIKE @firstname)
ELSE
IF (@firstname IS NULL)
SELECT * from Users
WHERE (users.lastname LIKE @lastname)
ELSE
SELECT * from Users
WHERE (users.lastname LIKE @lastname)
AND (users.firstname LIKE @firstname)
END
GO
I know it's inconvenient, especially as the number of conditions increases, but it's the safest and most efficient solution. It's a shame I can't use some macros in SQL :-(
declare @paging_buffer table(
sort_id int not null identity(1,1) PRIMARY KEY,
user_id int not null,
firstname varchar(50),
lastname varchar(50)
);
?
Re: Moving SQL from perl to Stored Procedures
by derby (Abbot) on Sep 09, 2006 at 10:32 UTC
and treat it as I would a class API
++ and given a decent naming convention (we use NOUN_I, NOUN_GET, NOUN_U, NOUN_D - I couldn't convince them to use _C and _R - DBAs are a strange breed), you can use your DB's stock utilities to automagically create an API class.
Re: Moving SQL from perl to Stored Procedures
by dws (Chancellor) on Sep 10, 2006 at 05:56 UTC
The idea of moving logic into stored procedure makes me squeamish. I've only had direct experience with a significant number of stored procedures once, but lots of indirect exposure through friends.
There were two common themes.
The first is that it always seemed to split development into two camps, the non-database developers and the database developers. And too often, friction between the camps built up and hindered progress.
The second one, lack of ability to unit test business logic that had migrated into stored procedures, may have been resolved by now. But when I was dealing with it, the difficulty of writing unit tests for stored procedures really sucked, and slowed down development. You could test functionally, by throwing data at the stored procedure and then probing the database to make sure the right things had happened, but such tests seem to require a lot of maintenance, and tend to miss problems of the type that a mock object would spot right away.
I would need to see a pretty compelling performance argument before I would consider putting any significant business logic in stored procedures. But that's me. YMMV.
Re: Moving SQL from perl to Stored Procedures
by gam3 (Curate) on Sep 10, 2006 at 13:38 UTC
It seems to me that there are several different arguments going on here and one that has nothing to do with Stored procedures is:
Should there be SQL mixed with perl
This question has nothing to do with the stored procedure question. If you are using code like
my $sql = q{EXEC dynamic_example @firstname = ?};
my $sth = $dbh->prepare($sql);
$sth->execute('chris%');
to get SQL out of your perl code, then I would suggest that you use code like
$dbp->execute('dynamic_example', firstname => 'chris');
where $dbp holds an object that can either use ad-hoc SQL or stored procedures.
The advantages of this approach are that the interface to the database is a single object, and all the procedures or SQL accessing the database are well known. The main disadvantage is that if the database needs to be accessed in a new way, the $dbp object must be extended. And if stored procedures are used, then the database has to be extended as well.
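A minimal sketch of the $dbp idea (my own illustration, not the poster's actual module): one object that maps a procedure name plus named parameters onto an EXEC statement with placeholders, so callers never hand-write SQL. The class and method names are assumptions.

```perl
package DBProxy;
use strict;
use warnings;

sub new {
    my ($class, $dbh) = @_;
    return bless { dbh => $dbh }, $class;
}

# Turn ('dynamic_example', firstname => 'chris%') into
# 'EXEC dynamic_example @firstname = ?' plus the bind values.
# Parameter names are sorted so the statement text is stable.
sub build_call {
    my ($self, $proc, %params) = @_;
    my @names = sort keys %params;
    my $sql   = "EXEC $proc";
    $sql .= ' ' . join(', ', map { "\@$_ = ?" } @names) if @names;
    return ($sql, map { $params{$_} } @names);
}

# Run the call through the wrapped DBI handle.
sub execute {
    my ($self, $proc, %params) = @_;
    my ($sql, @bind) = $self->build_call($proc, %params);
    return $self->{dbh}->selectall_arrayref($sql, {}, @bind);
}

package main;
# Build (but do not run) a statement; no database handle needed here.
my ($sql, @bind) = DBProxy->new(undef)->build_call(
    'dynamic_example', firstname => 'chris%');
print "$sql\n";   # EXEC dynamic_example @firstname = ?
```

Because the statement text is stable for a given call shape, the server still sees a parameterized query it can cache a plan for.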
-- gam3
A picture is worth a thousand words, but takes 200K.
Re: Moving SQL from perl to Stored Procedures
by Anonymous Monk on Sep 09, 2006 at 11:13 UTC
Stored procedures *shakes head*.
Any university student will know that dividing application logic between the application and the SQL server is messy at best. It's a direct violation of the principle of cohesion, which seeks to keep related logic together.
Indeed, the principle of coupling is also attacked using stored procedures.
I see industry racing to use these "cool ideas" but in reality so many of them are unsound. Too many cowboys in industry.
You're looking at it wrong -- we're maintaining cohesion in the database -- I'm keeping the data with its associated logic.
Should I leave the data to be misinterpreted by each application's programmers?
Should data security be moved from the database and to the application?
Should each application implement the same accessors or business logic? (remember -- they might not all be using the same programming language)
Any person with experience knows that every situation is different, and there are advantages and disadvantages to almost every method. There may be times when it makes sense to keep all of the business logic out of the database (eg, you have reason to believe that your company may be switching database vendors in the future, and you'll have to re-code all stored procedures), and there are times when keeping it all close to the database makes sense (eg, it's a large database that's used by multiple applications, and you need to keep applications from making calls that might adversely affect the overall performance of the database).