Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Perl or SQL ?

by Anonymous Monk
on Dec 20, 2005 at 12:09 UTC ( [id://518040]=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm fairly new to Perl and have an issue I'd like your feedback on... here goes
I need to create a view in a Sybase database to be able to extract data from via BCP
The view needs to be created dynamically every time the code is run and needs to finish up looking like this

create view aud_extr as select * from sybsecurity..sysaudits_01 where suid in (1,8,13,20) union select * from sybsecurity..sysaudits_02 where suid in (1,8,13,20) union . .
The number of unions in the above sql depends on how many audit tables are present, hence it could be 1 or upto 8 table joins . The sysaudits_nn number increasing for each union. I can find out how many sysaudits tables are present easily enough
The issue is should I be preparing this create view statement within the Perl code (I'm using DBD::Sybase and 12.5.3 Sybase) or within a stored proc in the Sybase server ?

If it's within the Perl code what's the best way to build the string before executing it ?

I'm sure that Perl is more than capable of doing this but is it more or less efficient as executing a stored proc to do this ?
Any suggestions/comments welcome

Replies are listed 'Best First'.
Re: Perl or SQL ?
by cbrandtbuffalo (Deacon) on Dec 20, 2005 at 13:25 UTC
    In response to your performance question, you're right to at least question where you should do your heavy lifting in this case. Databases are often faster than perl when it comes to queries and fetching data, but there are times when perl will be faster because of what you are doing. One example is the 'in' statement in Oracle--Oracle can have performance issues with that and sometimes perl is faster. Your example of a union is another case.

    If you have the time and resources, the best way to get a feel for the performance of each is to try it. Set up a best and worst case scenario for the number of joins that would be needed. Given the info you provided, set up a query with 1 join and one with 8 joins. Then run them in both scenarios. This is the best way to get data you can trust and feel confident with when you make your final decision.

    Want a guess? The database will be faster with 1 union, perl faster with 8. The break-even will be somewhere in the middle. If this is truly the case, you need to make a call based on how often users will request each end of the spectrum.

    I too am a little puzzled by the 'views on the fly' solution. I'm no Sybase expert, but it doesn't feel right to me. I think you're likely to lose a bunch of your performance in the creation of the views.

    Good luck.

      Thanks for the answer. There's one bit of information I missed off the supplied sql which is kinda important...
      The select includes a date range e.g.
      select * from sysaudits_01 where suid in (1,2,3,4,5) and eventtime > ' +$suppliedtime' union . .
      This therefore doesn't lend itself to using a stored proc solution since due to the variable timestamp input the sql is ever changing.
      Thanks to you for your answer and the others who have shared there wisdom. As always there are many ways to skin a cat !
Re: Perl or SQL ?
by derby (Abbot) on Dec 20, 2005 at 13:23 UTC

    This is one of those cases where it really depends on the skills and strengths of your development team.

    If you have a clear line between db devs and front-end devs, then I would say this is something that should be done in a stored proc (since it's very database specific and probably will not translate well to other dbs if you ever decide to switch).

    If you don't have that luxury or you're a one man shop, it would depend on where you're comfortable doing the work. As for efficiency, I really wouldn't worry about it (at first) - both languages (perl and transact sql) have enough power and magic to work efficiently (but then again, both give you more than enough rope to do it poorly).

    -derby
Re: Perl or SQL ?
by aquarium (Curate) on Dec 20, 2005 at 12:24 UTC
    This looks more like ad-hoc queries to me -- rather than views. Specifically, views are used for often repeated sql, perhaps with a test on a field or two. Unless these (dynamically generated) views get used a few times (to make it worthwhile constructing a view in the first place), then creating the views is more trouble (and processing expense at the RDBMS) than just running dynamic sql. If it's possible and under your control, perhaps re-think the partitioning of the sysaudits table. Rather than having sysaudits_01, sysaudits_02, etc., you could have a single sysaudits table with a column for "audit number". Then you have a simple query to achieve your required results; instead of multiple unions.
    the hardest line to type correctly is: stty erase ^H
      Unfortunatley I don't have control over the sysaudits partitioning. Also, the use of the view is to restrict the data set I need to extract via the BCP which won't work with ad-hoc queries.
Re: Perl or SQL ?
by dokkeldepper (Friar) on Dec 20, 2005 at 14:37 UTC

    Hi, using prepared statements to create views is quite a useless excercise (sorry, dont mean it as harsh as it sounds). Prepared statments internally create stored procedures that will be eventually parameterized. View creation is nothing more than writing text into sysviews (or whatever) and is not an expensive operation. Using stored procedures to create views sounds a bit like overkill.

    By the way, if you can connect to the database via perl, why do you extract the data by bcp? If you have the power of perl at hand why not using it for data extraction and immeadiate reporting (or so)? Performance is the only issue I could think of... However raw data as extracted via bcp usually requires some further munging to be useful.

      Actually, I hadn't realised just how powerful Perl is. I can forget about creating views/using bcp altogether !
      Thanks
Re: Perl or SQL ?
by Perl Mouse (Chaplain) on Dec 20, 2005 at 14:07 UTC
    The advantage of using a stored proc is that the stored proc will only be compiled once per restart of the database. If you send over the SQL it needs to be compiled each time. The advantage of sending it over is that the code is less complicated, as you can tailor the amount of unions (I presume you know how many unions you need before sending over the SQL code). The stored procedure will be more complicated as you don't know how many unions you're going to need when writing the stored proc.

    That's the trade-off you need to make, and only you have sufficient knowledge to make the right decision. If you're going to do this stuff millions of times, you're probably better off using a stored proc. If you do this once a day, I'd go for the simpler code. If it's somewhere in between, well, good luck making the decision! ;-)

    Perl --((8:>*
Re: Perl or SQL ?
by EvanCarroll (Chaplain) on Dec 21, 2005 at 03:49 UTC
    My vote SQL:
    With very few exceptions I feel that everything that can be done in SQL, should be. It is generally faster, easier to maintain, and more easily controlled. But, then again I use postgres, and while I rarely ever need to, I have pl/perl to fall back on. Some advantages of using the database over perl.
    • Static query plan
    • Single transmission of data
    • Dynamic updateing of view
    • DB Dependant data stays in DB that it depends on


    Evan Carroll
    www.EvanCarroll.com
Re: Perl or SQL ?
by mpeppler (Vicar) on Dec 21, 2005 at 07:34 UTC
    Here's my take on this. Given that the sysaudits tables can be very large I would use a perl script to generate the view, and then use bcp to extract the data.

    My reasoning is that bcp will pull data out from Sybase much faster than DBD::Sybase can (much less overhead), so assuming that the views cover a fairly large subset of the sysaudits tables you could potentially gain quite a lot of time.

    The view is only needed so that you can use bcp (bcp will not accept any WHERE qualifiers as you know, but others here might not), and generating it in the perl script on the fly and dropping it once the request is done seems perfectly reasonable. You would probably want to create the view in the tempdb database to avoid any unnecessary database polution.

    Michael

      As you know my original thought was to go down the create view/bcp data out route as you have suggested also. The problem is that I need to run a regex over the extracted data and I figure it's easier to do this via perl than inside sql given the complexities of what I need to do.
      Here is the test code I ended up with (some of which you may recognise ). This is taking the pure select route rather than the create view/bcp route. However I have a problem I getting the regex to work.

      This is a snapshot of the dataset 92 1 24 Dec 20 2005 2:14PM 1 14 2 sa tempdb select * from sysuser +s 92 1 24 Dec 20 2005 2:14PM 1 14 2 sa tempdb select * from master. +.sysprocesses 92 1 24 Dec 20 2005 2:14PM 1 14 2 sa tempdb select * from sysuser +s 92 1 24 Dec 20 2005 2:14PM 1 14 2 sa tempdb select * from master. +.sysprocesses 92 1 25 Dec 20 2005 2:14PM 1 14 2 sa tempdb use tempdb 92 1 25 Dec 20 2005 2:14PM 1 14 2 sa tempdb select * from sysuser +s 92 1 26 Dec 20 2005 2:14PM 1 14 2 sa tempdb use tempdb 92 1 26 Dec 20 2005 2:14PM 1 14 2 sa tempdb select @@version 92 1 26 Dec 20 2005 2:14PM 1 14 2 sa tempdb COMMIT TRAN 92 1 26 Dec 20 2005 2:14PM 1 14 2 sa tempdb create table #ttt (fo +o varchar(20), bar int) 92 1 26 Dec 20 2005 2:14PM 1 14 2 sa tempdb insert #ttt values('a + string', 1) 92 1 26 Dec 20 2005 2:14PM 1 14 2 0001fb520002 sa tempdb insert #t +tt values('another string', 2) 92 1 26 Dec 20 2005 2:14PM 1 14 2 0001fb520002 sa tempdb insert #t +tt values('foodiboo', 3) 92 1 26 Dec 20 2005 2:14PM 1 14 2 0001fb520002 sa tempdb ROLLBACK TRAN Here's the code my $lastrunt = "Dec 20 2005 9:00AM"; my @sqlcmd; $dbh->do("use sybsecurity"); $sth=$dbh->prepare("select * from sysobjects where name like 'sysaudit +s_%'"); $sth->execute; while ($data = $sth->fetch) { push @sqlcmd,"select extrainfo from @$data where suid in (select + a.suid from master..syslogins a, master..s ysloginroles b where b.srid=0 and a.suid=b.suid and a.suid not in (1,8 +))) and event=92 and eventtime > '$lastrunt'"; push @sqlcmd," union"; } pop @sqlcmd; $sth=$dbh->prepare("@sqlcmd"); $sth->execute; while ($data = $sth->fetch) { # print "@$data\n"; if($data =~ /create/) { print "@$data\n"; } }
      I was trying to only pick out the create statements but I don't get anything. If I uncomment the print then the above dataset is returned.

      I'm trying to get to grips with Perl but can't figure out what is wrong with my regex statement ?

        $sth->fetch returns an array reference, and $sth->fetchrow returns a list. Each item in the array or list is a column in the SELECT statement. So:

        $sth = $dbh->prepare("select a, b, c from foo"); $sth->execute; while($row = $sth->fetch) { # $row->[0] is column 'a' # $row->[1] is column 'b', # etc... }
        and
        while(($one, $two, $three) = $sth->fetchrow_array) { # $one, $two and $three are columns a, b and c }
        Michael
        I've just figured it out
        $sth=$dbh->prepare("@sqlcmd"); $sth->execute; while (($data) = $sth->fetchrow) { if($data =~ /create/) { print "$data\n"; } }
        I've simplified the regex I really need to use to demonstrate what I was attempting to do.
        I agree that for larger datasets the create view/bcp out is the way to go. Performance testing has revealed this to be the case.
        Thanks for all the responses, this is a top class site

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://518040]
Approved by holli
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (5)
As of 2024-04-23 15:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found