PerlMonks  

OO design and persistence

by jimbus (Friar)
on Dec 22, 2005 at 15:02 UTC ( [id://518542]=perlquestion )

jimbus has asked for the wisdom of the Perl Monks concerning the following question:

I've started reading a book on object oriented design and I started thinking ahead a bit and have a question on object persistence.

The current discussion is about inheritance, and it's using the example of people objects having certain attributes and behaviors, with certain subgroups of people (say, teacher or student) having specialized attributes and behaviors on top of those they inherit as people. So even though I'm in chapter one and we are just talking in abstractions, my mind moves off toward how I might implement some of this in code... specifically, how would I handle persistence?

If I implement a person object, with all its attributes and methods, I would persist it as a table with columns for each of the attributes, and methods for accessing them and performing its actions. If I then create a student object that ISA person, do I create a new table that has all of the person and student attributes, or do I have two tables, one for inherited data and one with new data, and join on an object id? Or am I confusing layers here... should persistence be kept separate from behavior and attributes, or maintained on the lowest level of inheritance to avoid confusion?

As an extension: from a DB perspective, if a chunk of info is something that you would logically normalize into its own table, does that suggest that it should be its own object?

Thanks,

Jimbus

Never moon a werewolf!

Replies are listed 'Best First'.
Re: OO design and persistence
by tirwhan (Abbot) on Dec 22, 2005 at 15:28 UTC

    That's a bit of a rat's nest you're opening there, and it cannot possibly be answered in a PM node (just type "object-relational impedance mismatch" into Google and read some of the 25,200 articles that come up to get an idea). I'll stick to your last question and answer that with a resounding "probably no". The rules of database normalisation are very different from those of good OO design. You want to organise your objects much more according to the business logic of your application; the whole point of objects is to expose an interface which makes them easy to work with. Database normalisation doesn't particularly care about interfaces; it is about organising the data in the most efficient way possible.

    As an example, suppose you have a database of people containing a name and telephone number. Unless every person has exactly one telephone number (and vice versa) you'll want to keep telephone numbers in a separate table and match the two records with a common id. Your person object OTOH can contain all the telephone numbers associated with this person. Now, if the phone number is something that you'll want to manipulate often (for example if you're a telco and you need to associate phone numbers with phone exchanges, look up the number of minutes this number has been in use for the last month etc.) then it may make sense to create a separate phone number class, which contains the necessary methods and data. But if all you're ever going to do is set and retrieve the phone number, you don't want to do that; the overhead of writing and instantiating a class for a simple string lookup is wasted effort. (Unless you're programming in Java of course ;-).
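    A minimal sketch of that split, using plain Perl data structures in place of a real database. The class name `Person`, the `to_rows` method and the row layout are all made up for illustration; the point is only that one object can fan out into rows for two tables.

```perl
use strict;
use warnings;

# Hypothetical class: one object holds every phone number for a person.
package Person;

sub new {
    my ($class, %args) = @_;
    my $self = {
        id     => $args{id},
        name   => $args{name},
        phones => [ @{ $args{phones} || [] } ],
    };
    return bless $self, $class;
}

sub phones { return @{ $_[0]{phones} } }

# Flatten the object into normalised rows: one person row,
# plus one phone row per number, linked by person_id.
sub to_rows {
    my ($self) = @_;
    my $person_row = { id => $self->{id}, name => $self->{name} };
    my @phone_rows = map { +{ person_id => $self->{id}, number => $_ } }
                     $self->phones;
    return ($person_row, \@phone_rows);
}

package main;

my $p = Person->new(id => 1, name => 'Alice', phones => ['555-0100', '555-0101']);
my ($person_row, $phone_rows) = $p->to_rows;
printf "%s has %d phone rows\n", $person_row->{name}, scalar @$phone_rows;
```

    Here the phone number stays a plain string inside the person object, while the storage side still gets its separate, normalised phone rows.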

    Hope that helps.


    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
Re: OO design and persistence
by Joost (Canon) on Dec 22, 2005 at 15:40 UTC
    In practice, I've seen three strategies, two of which you mentioned.

    Use one table per inheritance tree with all possible attributes and a "type" column that specifies the class of the object.

    Pro: Fast database actions. Easy to "upgrade" the class of an object (i.e. promote a student to teacher, though if you need to do this regularly, you should probably not use inheritance anyway). Stores all attributes in exactly one place.

    Con: possibly wastes space, gets confusing if you have a largish number of subclasses. You should probably not use this if you want to represent "the ultimate base class" (Object/UNIVERSAL) in your database.
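    A rough sketch of this first strategy, with a hash standing in for a table row. The column names and the helper `to_single_table_row` are assumptions made up for the example, not part of any module.

```perl
use strict;
use warnings;

# Single table: one row shape for the whole Person tree.
# Column names here are hypothetical.
my @COLUMNS = qw(id type name student_grade teacher_subject);

# Map an object (a plain hash plus a class name) to one wide row;
# attributes the class lacks are left undef, i.e. NULL in SQL.
sub to_single_table_row {
    my ($class, $obj) = @_;
    my %row = map { $_ => undef } @COLUMNS;
    $row{type} = $class;
    $row{$_} = $obj->{$_} for grep { exists $obj->{$_} } @COLUMNS;
    return \%row;
}

my $row = to_single_table_row('Student', { id => 7, name => 'Bob', student_grade => 'B+' });
my @nulls = sort grep { !defined $row->{$_} } keys %$row;
print "type=$row->{type}, unused columns: @nulls\n";
```

    The unused columns are the "wasted space" mentioned above; every further subclass adds more of them to every row.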

    Use one table for each class with only the additional attributes and the object-id in the subclass' table.

    Pro: allows for larger inheritance-trees to be modeled in the database. Usually stores all attributes in exactly one place.

    Con: makes efficient selecting on multiple attributes an "interesting" problem. Retrieving all attributes of an object is fairly expensive if the inheritance tree is large.
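    The second strategy could be sketched like this, again with hashes instead of real tables. The `%OWN_ATTRS` and `%CHAIN` maps and both helper functions are invented for the illustration.

```perl
use strict;
use warnings;

# Which attributes each class adds (hypothetical schema).
my %OWN_ATTRS = (
    Person  => [qw(name)],
    Student => [qw(grade)],
);
# Inheritance chain, base class first.
my %CHAIN = (
    Person  => [qw(Person)],
    Student => [qw(Person Student)],
);

# Split one object into one row per class table, all sharing the object id.
sub to_class_table_rows {
    my ($class, $obj) = @_;
    my %rows;
    for my $table (@{ $CHAIN{$class} }) {
        $rows{$table} = { id => $obj->{id},
                          map { $_ => $obj->{$_} } @{ $OWN_ATTRS{$table} } };
    }
    return \%rows;
}

# Reading the object back means merging (joining) every table in the chain.
sub from_class_table_rows {
    my ($class, $rows) = @_;
    my %obj;
    %obj = (%obj, %{ $rows->{$_} }) for @{ $CHAIN{$class} };
    return \%obj;
}

my $rows = to_class_table_rows('Student', { id => 7, name => 'Bob', grade => 'B+' });
my $back = from_class_table_rows('Student', $rows);
print join(', ', map { "$_=$back->{$_}" } sort keys %$back), "\n";
```

    The loop in `from_class_table_rows` is where the cost shows up: one lookup per level of the inheritance tree.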

    Use one table for each class with all attributes for that class in the representing table.

    Pro: allows for larger inheritance-trees to be modeled in the database. Can do reasonably efficient selects on multiple attributes.

    Con: not very efficient if you want to retrieve multiple subclasses. Stores inherited attributes in multiple tables (especially nasty for updates/inserts if you have large fields).
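    And a sketch of the third strategy, where each class gets a complete table of its own. The `%TABLE` hash and its contents are made-up sample data.

```perl
use strict;
use warnings;

# One "table" per concrete class, each holding inherited and own columns.
my %TABLE = (
    person  => [ { id => 1, name => 'Ann' } ],
    student => [ { id => 2, name => 'Bob', grade => 'B+' } ],
    teacher => [ { id => 3, name => 'Cy',  subject => 'Maths' } ],
);

# Fetching one class touches a single table ...
my @students = @{ $TABLE{student} };

# ... but "all people" needs every table in the tree (a UNION in SQL).
my @everyone = map { @{ $TABLE{$_} } } sort keys %TABLE;

printf "%d student(s), %d people in total\n", scalar @students, scalar @everyone;
```

    Note that `name` is repeated in all three tables, which is the duplication of inherited attributes listed under the cons.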

    In summary: OO-relational mappings are not as clear-cut as they appear on first glance. Which system is best depends on your typical use. You always have to make trade-offs and in complicated systems you will want to write your own queries for certain actions that just don't map well.

    update: note that some databases *cough* postgres *cough* support something called table-inheritance, which might be an interesting alternative solution.

      update: note that some databases *cough* postgres *cough* support something called table-inheritance, which might be an interesting alternative solution.

      I've looked hopefully at this but been put off by the serious "Caveats" section in the docs on table inheritance. It would appear to essentially break primary and foreign keys!

Re: OO design and persistence
by perrin (Chancellor) on Dec 22, 2005 at 15:26 UTC
    The answers to all questions are on CPAN. Also, Martin Fowler has a nice discussion of this in one of his books and the brief descriptions are here. He calls them single table inheritance, class table inheritance and concrete table inheritance.

      Cool, sometimes getting the appropriate pattern names to research is the most frustrating to me... thanks for the pointers.

      Jimbus

      Never moon a werewolf!
Re: OO design and persistence
by tphyahoo (Vicar) on Dec 22, 2005 at 17:14 UTC
Re: OO design and persistence
by ptum (Priest) on Dec 22, 2005 at 15:11 UTC

    As an extension: from a DB perspective, if a chunk of info is something that you would logically normalize into its own table, does that suggest that it should be its own object?

    Not necessarily. I think that an object (like a person) will often be implemented across many tables, depending on roles and relationships that apply to that person. I generally try to keep the database implementation details quite separate from the object implementation.


    No good deed goes unpunished. -- (attributed to) Oscar Wilde
Re: OO design and persistence
by herveus (Prior) on Dec 22, 2005 at 17:26 UTC
    Howdy!

    I'm a database guy first, so that colors my perspective.

    Designing a set of classes has a lot in common with database design. In On Flyweights... (with sneaky segue to data modeling), I visited the general area. In essence, I'm advocating applying normalization techniques to object modeling. The more you can minimize data duplication, the better. If you find, subsequently, that you have genuine performance issues that you can pin on that donkey, then (and only then) should you "denormalize". The biggest hurdle is modeling inheritance. I'd say to start by keeping the attributes for each class in their own tables. Child classes would need to keep a reference to the parent instance that contains the parent class data, and a parent class would need some way to distinguish data for an object *of* that parent class from data for an object in a subclass. From there, you can play refactoring games to move fields around, etc.
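    As a loose illustration of the parent reference and the "which class is this row really" marker described above, here is a small sketch with arrays of hashes playing the part of tables. The `class` column, the `person_id` reference and the `load` function are assumptions made for the example.

```perl
use strict;
use warnings;

# Parent table: every object has a row here; 'class' records its real class.
my @person = (
    { id => 1, class => 'Person',  name => 'Ann' },
    { id => 2, class => 'Student', name => 'Bob' },
);
# Child table: only the extra attributes, plus a reference to the parent row.
my @student = (
    { person_id => 2, grade => 'B+' },
);

# Load one object: start from the parent row, then pull in child data
# only when the discriminator says there is some.
sub load {
    my ($id) = @_;
    my ($base) = grep { $_->{id} == $id } @person or return;
    my %obj = %$base;
    if ($obj{class} eq 'Student') {
        my ($extra) = grep { $_->{person_id} == $id } @student;
        $obj{grade} = $extra->{grade} if $extra;
    }
    return \%obj;
}

my @plain = map { $_->{name} } grep { $_->{class} eq 'Person' } @person;
print "plain persons: @plain; Bob's grade: ", load(2)->{grade}, "\n";
```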

    yours,
    Michael
Re: OO design and persistence
by derby (Abbot) on Dec 22, 2005 at 15:36 UTC

    While you can find lotsa stuff about OO persistence to RDBMSes and/or OODBMS, I've always found the whole topic to be too much spoonerism for me - the mixed metaphor is real ugly. I've always found it easier to do the two separately and then marry them with a lightweight persistence layer (err ... DBI/DBD) ... but then again, I'm just silly that way

    -derby
Re: OO design and persistence
by dimar (Curate) on Dec 23, 2005 at 00:50 UTC
    from a DB perspective, if a chunk of info is something that you would logically normalize ... should (it) be its own object?

    My favorite node on 'big picture' questions in OOP: The world is not object oriented.
    (see also OO concepts and relational databases mentioned elsewhere in this thread)

    Often the 'academic' approach to OOP and Data Architecture differs dramatically from the 'nuts and bolts' approach. This contrast in approaches is made more thorny by the fact that "Object Orientation" means different things to different people; even if they come from the same "school of thought"! From an academic perspective, this is a good thing, because it means more opportunities to publish articles, debate, and apply for grants.

    From a nuts and bolts perspective, most end-users and I.T. clients do not care how sophisticated your object hierarchy is, or how clever you were in implementing it. What matters to them is whether they get what they expect: a performant application that does not make them have to think too hard.

    With all the fire and noise generated behind the alluring mystique of OOP, very few people acknowledge that OO programming and Functional programming really have a lot in common, and consequently do not require such dramatic shifts in perspective when it comes to uniting the codebase with the persistence layer.

    Sometimes, the only tangible difference (to the programmer) is just a difference in syntax. Consider:

    ### style1
    $str = 'lrep';
    $str = uc($str);
    $str = reverse($str);
    print $str;

    ### style2
    $str = 'lrep';
    $str = reverse(uc($str));
    print $str;

    ### style3
    $str = 'lrep';
    $str.toUpperCase().reverse().toConsole();
    This is of course a simplification, but the point is: do not let yourself get too mystified by the terminology and buzzwords; if your programming methodology is driving your choices for persistence ... you *might* have a candidate for over-engineered design.
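    Style 3 is not valid Perl as written. For anyone who wants to run the comparison, a chained version might look like the sketch below; the wrapper class `Str` and its method names are invented here purely so that the chain has something to call.

```perl
use strict;
use warnings;

# Hypothetical wrapper class, only to make the chained style runnable in Perl.
package Str;
sub new        { my ($class, $s) = @_; return bless { s => $s }, $class }
sub upper      { my $self = shift; $self->{s} = uc $self->{s};             return $self }
sub reversed   { my $self = shift; $self->{s} = scalar reverse $self->{s}; return $self }
sub to_console { my $self = shift; print $self->{s}, "\n";                 return $self }
sub value      { return $_[0]{s} }

package main;

# Each method returns the object itself, which is what allows the chain.
my $str = Str->new('lrep')->upper->reversed->to_console;   # prints PERL
```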

    =oQDlNWYsBHI5JXZ2VGIulGIlJXYgQkUPxEIlhGdgY2bgMXZ5VGIlhGV
Re: OO design and persistence
by venk (Acolyte) on Dec 22, 2005 at 19:49 UTC
    As an extension: from a DB perspective, if a chunk of info is something that you would logically normalize into its own table, does that suggest that it should be its own object?
    That kind of data-centric design is surely useful, but I think one has to be careful not to put the cart before the horse. If you are building an object-oriented system, IMHO you should put more emphasis on modeling the problem you are trying to solve, and less on worrying about how the database is going to feel about your solution.
Re: OO design and persistence
by CountZero (Bishop) on Dec 22, 2005 at 22:49 UTC
    Much will depend on how you wish to access your objects. If you wish to "run" them straight from the persistence layer, then it is most likely that you will indeed need multiple tables.

    On the other hand, if the persistence layer is only there to persist the data in your objects, then you could probably do with a "flat" database schema which stores the serialized data, and all you have to implement (with a little help from CPAN) is the routine to load/save the object data. Several serialization schemes exist (to name but two: Storable and YAML), and which one is best suited for you will depend on how complicated your data is and/or whether you wish to store the data in a human readable format or not.
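    A small sketch of the serialization route using Storable's `freeze` and `thaw`. The `Student` object and the `%store` hash (standing in for a two-column id/data table) are assumptions made for the example.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# A hypothetical object; any blessed data structure would do.
my $student = bless { name => 'Bob', grade => 'B+', phones => ['555-0100'] }, 'Student';

# freeze() returns a binary string that could go into a single BLOB column,
# keyed by object id; thaw() rebuilds the object, blessing included.
my %store;                       # stands in for an (id, data) table
$store{42} = freeze($student);

my $copy = thaw($store{42});
print ref($copy), ": $copy->{name}, $copy->{grade}\n";
```

    The trade-off is that the database can no longer select on individual attributes, since it only sees an opaque string.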

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

table inheritance in PostgreSQL
by zby (Vicar) on Dec 23, 2005 at 15:50 UTC
    This might be still a bit exotic but in PostgreSQL there is a mechanism called inheritance exactly for cases like this. See 5.8. Inheritance in PostgreSQL manual.
