PerlMonks  

Re^4: Avoiding compound data in software and system design

by metaperl (Curate)
on Apr 28, 2010 at 14:23 UTC


in reply to Re^3: Avoiding compound data in software and system design
in thread Avoiding compound data in software and system design

No I'm not. You are making an artificial separation where none exists.
Everything a human does is 'artificial' - I think what you mean is superficial or arbitrary. And as this thread shows, even EF Codd was somewhat vague and arbitrary in specifying what constituted atomic data. So yes, you're right, the definitions are vague and somewhat subjective. But throwing some light and angst on the issue should make us more aware and intelligent in future API decisions.



The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development.

-- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"

Re^5: Avoiding compound data in software and system design
by BrowserUk (Patriarch) on Apr 28, 2010 at 17:06 UTC

    EF Codd eh? Circa 1981, I had to do a CS project, and having read an article (in Byte I think) on Codd's paper, I wrote up the proposal for my project as: "A simple exploration of the Relational Model". To be written in BASIC Plus 2. And yes, BASIC.

    I had one term to write it.

    It took 6 weeks for the college library to obtain a photocopy of the paper--it had to come from the British Library in London, the only people in the UK who had a copy. It was a photocopy of a photocopy of a bound paper, with all the distortions and fuzzy greyness that entails. It took me two whole weeks to read it--I understood very little of it. So there I was with half my time gone and nothing to show for it.

    Back to the point.

    And that is, all DBI needs to know is the first two fields of the DSN. The first must match 'dbi' (case-insensitively); the second must match a module "DBD::<2ndfield>" that is installed locally. What comes after that is none of its concern. It just gets passed through to the loaded DBD driver.
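The split described above can be sketched in a few lines of Perl. This is only an illustration: `split_dsn` is a hypothetical helper, not part of DBI, and it ignores refinements such as driver attributes in parentheses after the driver name.

```perl
use strict;
use warnings;

# Hypothetical helper (not DBI's own code): separate the two fields DBI
# itself cares about from the opaque remainder meant for the driver.
sub split_dsn {
    my ($dsn) = @_;

    # scheme must be 'dbi' in any case; driver is a word; rest is untouched
    my ($scheme, $driver, $rest) = $dsn =~ /\A(dbi):(\w+):(.*)\z/is
        or return;

    return ($scheme, $driver, $rest);
}

my ($scheme, $driver, $rest) = split_dsn('dbi:Oracle:host=db1;sid=ORCL');
print "driver=$driver rest=$rest\n";
```

Everything in `$rest` is handed on as-is; the sketch makes no attempt to interpret it.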

    And the forms of that opaque token are myriad. A quick survey turns up:

    $dbh = DBI->connect("dbi:Informix:$database", $user, $pass, %attr);
    $dbh = DBI->connect("DBI:Unify:dbname[;options]" [, user [, auth [, attr]]]);
    $dbh = DBI->connect("dbi:Oracle:host=$host;sid=$sid", $user, $passwd);
    $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","");
    $dbh = DBI->connect("DBI:drizzle:database=test;host=localhost", "joe", "joe's password", {'RaiseError' => 1});
    $dbh = DBI->connect('dbi:ODBC:DSN', 'user', 'password');
    $dbh = DBI->connect("dbi:Pg:dbname=$dbname", '', '', {AutoCommit => 0});
    $dbh = DBI->connect('DBI:RAM:','usr','pwd',{RaiseError=>1});
    $dbh = DBI->connect("DBI:Wire10:host=$host", $user, $password, {'RaiseError' => 1, 'AutoCommit' => 1});
    $dbh = DBI->connect("DBI:CSV:f_dir=/home/joe/csvdb");
    $dbh = DBI->connect("dbi:JDBC:hostname=$hostname;port=$port;url=$url", $user, $password);
    $dbh = DBI->connect("dbi:Sqlflex:$database", $user, $pass, %attr);
    $dbh = DBI->connect("dbi:DB2:db_name", $username, $password);
    $dbh = DBI->connect("DBI:mysql:database=test");
    $dbh = DBI->connect('DBI:DBMaker:' . $database, $user, $pass);
    $dbh = DBI->connect('dbi:PgPP:dbname=$dbname', '', '');
    $dbh = DBI->connect('dbi:PgLite:dbname=file');
    $dbh = DBI->connect("dbi:ADO:Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\test.mdb", $usr, $pwd, $att );
    $dbh = DBI->connect("DBI:Ingres:dbname[;options]", user [, password], \%attr);
    $dbh = DBI->connect('DBI:Solid:TCP/IP somewhere.com 1313', $user, $pass, 'Solid');
    $dbh = DBI->connect("dbi:Google:", $KEY);

    Look at the variations once you get beyond the first two fields. Yes you could keep these all separate in a hash, but to what end? You (as a DBI user) cannot do anything useful with them because there is insufficient consistency to make even validation judgements, much less anything else.

    Even where several DBDs require, for example, a "dbname", for some this will have SQL identifier limitations--though even they aren't consistent across all SQL-like DBs.

    For some it will be a filename (with local filesystem semantics--case dependence (or not); reserved characters (or not); length limitations (or not)).

    For some, it's a hostname and port.

    For some--see the ADO example--it's a whole bunch of stuff entirely unique to that DBD.

    For some the subfields have to be prefixed with their tagname; for others they are position-dependent.

    Why stick all these disparate bits into a hash and then have DBI concatenate the bits--risking getting it wrong because (for example) it adds tagnames where none are required, or the hash ordering screws up the position dependence; or ...?

    To achieve all that, you'd need more than just a hash. You'd need one flag per field to decide whether the key name should be prepended to the field's value. You'd need another value to ensure ordering. You'd need yet another flag to ensure that (for example) backslashes in pathnames got escaped for interpolation.

    And all of that complexity buys you what? The user can far more easily know what the requirements are for the DBD (or two; or three) he is going to use, than any programmer can try and unify into one generic interface structure that will stand the test of time.
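To make the extra bookkeeping described above concrete, here is a minimal sketch of what a generic serializer would need. The function name `build_driver_dsn`, the field triples and the separator argument are all assumptions made for illustration; nothing like this exists in DBI.

```perl
use strict;
use warnings;

# Hypothetical serializer: a plain hash is not enough, so each field is an
# ordered triple [ name, value, tagged? ] and the separator is per-driver.
sub build_driver_dsn {
    my ($sep, @fields) = @_;
    return join $sep, map {
        my ($name, $value, $tagged) = @$_;
        $tagged ? "$name=$value" : $value;   # prepend the tag only if asked
    } @fields;
}

# Tagged, semicolon-separated (Oracle-style)
print build_driver_dsn(';', [ host => 'db1', 1 ], [ sid => 'ORCL', 1 ]), "\n";

# Untagged, space-separated, position-dependent (Solid-style)
print build_driver_dsn(' ',
    [ proto => 'TCP/IP',        0 ],
    [ host  => 'somewhere.com', 0 ],
    [ port  => 1313,            0 ],
), "\n";
```

Even this toy version pushes the ordering, tagging and separator choices onto the caller, who must already know the driver's format to fill them in.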


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Why stick all these disparate bits into a hash and then have DBI concatenate the bits--risking getting it wrong because (for example) it adds tagnames where none are required, or the hash ordering screws up the position dependence; or ...?

      Why in the world would DBI stick all the bits back together? This information is not useful to anyone in this serialized form. It's entirely an artificial construct, dictated by DBI's (poor) decision to require multiple distinct pieces of information to be passed in a single string argument. The DBD has to parse it and break it back into pieces to do its work (e.g., extracting the hostname and port to make the system call to open a socket, extracting the database name to pass in the connection command, etc.)
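The parsing step mentioned above might look roughly like this for drivers that use the `name=value;name=value` convention. `parse_driver_dsn` is a hypothetical name; real DBDs each do this in their own way, and drivers with positional formats would need something different.

```perl
use strict;
use warnings;

# Hypothetical sketch of what a driver does with its part of the DSN:
# break "name=value;name=value" back into separate pieces.
sub parse_driver_dsn {
    my ($driver_dsn) = @_;
    my %attr;
    for my $pair (split /;/, $driver_dsn) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $value;          # skip untagged fragments
        $attr{$name} = $value;
    }
    return \%attr;
}

my $attr = parse_driver_dsn('host=db1;sid=ORCL');
print "$_ => $attr->{$_}\n" for sort keys %$attr;
```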

      To achieve all that, you'd need more than just a hash. You'd need one flag per field to decide whether the key name should be prepended to the field's value. You'd need another value to ensure ordering. You'd need yet another flag to ensure that (for example) backslashes in pathnames got escaped for interpolation.

      No, you wouldn't, because the specially-formatted DSN string never needs to be constructed at all, for any reason.

        No, you wouldn't, because the specially-formatted DSN string never needs to be constructed at all, for any reason.

        I see. So you're volunteering to go through and modify all the 600+ DBD::* modules; and all the modules that use them; and all the code written in the last 15 years that uses them; just so that you can provide introspection of things that nobody will ever want to introspect?

        Let's just pretend for a moment that we could re-write history, and DBI had specified that the first parameter to DBI was a hashref. And (say), the only required pair was dbi => Pg|MySQL|Whatever. And that each DBD was free to require whatever pairs it needed. What does that achieve?

        You would have a hash rather than a string. That would make it easier to wrap in your DSN object--though this parsing you speak of is hardly onerous. But then, as now, that wrapper is pointless.

        • A hash is far easier to use than an object.

          $hash{ $key }++ is infinitely preferable to  $dsn->keySet( $key, $dsn->keyGet( $key ) + 1 );

        • And apart from using clumsy OO syntax to get, set or iterate the contents of that hash, what else does that DSN class do? What else could it do?

          You can't hope to validate all the possibilities. And there are no useful methods beyond get/set/iterate you could apply to it.

          Even ignoring that you'd:

          • either have to standardize the fields in the dsn.

            Which is impractical as each DBD has its own unique set of requirements

          • or eval the setters/getters into existence based upon what the user put in the hash.

            Which besides any problems with eval, means that

            1. DBDs would need to name all their fields.

              Which many don't currently have, and neither need nor want to have;

            2. you would risk eval'ing the user's typos into existence with absolutely no way to validate them.
        • Ultimately, you'd have to provide a method that returned the attributes as a simple hash(ref) in order to pass it to DBI anyway.

          Unless you envisage passing your DSN object to DBI and have it pass it through to the DBDs (another breaking re-write).

          So then all the DBDs acquire yet another dependency for no good purpose beyond the creation of OO tagliatelle.

        So, even if we could ignore history, and turn back the clock to do things your way, there'd be no value in it. And a considerable downside of increased complexity and dependencies.
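The typo risk raised in the list above can be illustrated with a small sketch. `Hypothetical::DSN` is an invented class, and it installs closures into the symbol table rather than using string eval, but the effect is the same: whatever keys the user supplies become methods.

```perl
package Hypothetical::DSN;
use strict;
use warnings;

# Invented class: generate one accessor per key the caller passes in.
sub new {
    my ($class, %fields) = @_;
    my $self = bless {%fields}, $class;
    for my $name (keys %fields) {
        no strict 'refs';
        next if defined &{"${class}::$name"};   # accessor already exists
        *{"${class}::$name"} = sub {
            my $obj = shift;
            $obj->{$name} = shift if @_;        # setter when given a value
            return $obj->{$name};               # getter otherwise
        };
    }
    return $self;
}

package main;

# 'hsot' is a typo for 'host', but nothing can tell that it is wrong.
my $dsn = Hypothetical::DSN->new(hsot => 'localhost', port => 5432);
print "typo accepted as a method\n" if $dsn->can('hsot');
print "no 'host' method exists\n"   unless $dsn->can('host');
```

Without a fixed list of field names per driver, the class has nothing to check the misspelled key against.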

        OO is fine and dandy when used properly, but using it to enforce a one-size-fits-all syntax fetish is not a good use. As with all programming tools, the trick is to know when to use it--and when not to.

        You may think that I don't agree with you because I haven't tried the KoolAid yet, but you'd be wrong. I tried your flavour of KoolAid (along with most others) a long time ago; I just didn't like it. Or rather, I can tolerate all flavours, each is fine for certain occasions, but I see no good reason to limit myself (or others) to just one flavour. Of anything.


      And that is, all DBI needs to know is the first two fields of the DSN. The first must match 'dbi' (case-insensitively); the second must match a module "DBD::<2ndfield>" that is installed locally. What comes after that is none of its concern. It just gets passed through to the loaded DBD driver.

      A half-way alternative would be to specify the driver separated from the DBD-specific stuff. Then this:

      $dbh = DBI->connect("dbi:Informix:$database", $user, $pass, %attr);
      $dbh = DBI->connect("DBI:Unify:dbname[;options]" [, user [, auth [, attr]]]);
      $dbh = DBI->connect("dbi:Oracle:host=$host;sid=$sid", $user, $passwd);

      would become this:

      $dbh = DBI->connect(Informix => $database, $user, $pass, %attr);
      $dbh = DBI->connect(Unify => "dbname[;options]", $user, $pass, %attr);
      $dbh = DBI->connect(Oracle => "host=$host;sid=$sid", $user, $pass, %attr);
      ...

      Please note that this is not an API change request/suggestion ;).
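Without changing DBI at all, the driver/remainder separation above could be approximated in user code. The helper name `dsn_for` is an assumption for illustration only; it simply rebuilds the conventional string.

```perl
use strict;
use warnings;

# Hypothetical convenience wrapper: take the driver name and the
# driver-specific part separately, and rebuild the usual DSN string.
sub dsn_for {
    my ($driver, $driver_dsn) = @_;
    return "dbi:$driver:$driver_dsn";
}

print dsn_for(Oracle => 'host=db1;sid=ORCL'), "\n";

# With DBI loaded, the result would be passed on unchanged, e.g.:
#   my $dbh = DBI->connect(dsn_for(Oracle => "host=$host;sid=$sid"),
#                          $user, $pass, \%attr);
```

The driver-specific part stays an opaque string here, so nothing about its format has to be standardized.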

      --
       David Serrano
       (Please treat my english text just like Perl code, i.e. feel free to notify me of any syntax, grammar, style and/or spelling errors. Thank you!).
