OO design and persistence

jimbus has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: OO design and persistence by tirwhan (Abbot) on Dec 22, 2005 at 15:28 UTC
That's a bit of a rat's nest you're opening there, and it cannot possibly be answered in a PM node (just type "object-relational impedance mismatch" into Google and read some of the 25,200 articles that come up to get an idea). I'll stick to your last question and answer that with a resounding "probably no".

The rules of database normalisation are very different from those of good OO design. You want to organise your objects according to the business logic of your application; the whole point of objects is to expose an interface that makes them easy to work with. Database normalisation doesn't particularly care about interfaces; it is about organising the data in the most efficient way possible.

As an example, suppose you have a database of people containing a name and telephone number. Unless every person has exactly one telephone number (and vice versa), you'll want to keep telephone numbers in a separate table and match the two records with a common id. Your person object, OTOH, can contain all the telephone numbers associated with this person.

Now, if the phone number is something that you'll want to manipulate often (for example, if you're a telco and you need to associate phone numbers with phone exchanges, look up the number of minutes a number has been in use over the last month, etc.), then it may make sense to create a separate phone number class which contains the necessary methods and data. But if all you're ever going to do is set and retrieve the phone number, you don't want to do that; the overhead of writing and instantiating a class for a simple string lookup is wasted effort. (Unless you're programming in Java, of course ;-)

Hope that helps.

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
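The person/phone example above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: the row layouts, the `Person` class and its accessors are hypothetical, and the arrays stand in for rows that a real query against a normalised schema would return. The phone numbers live in their own "table", but the object simply carries them as a list of plain strings.

```perl
use strict;
use warnings;

package Person;

# Hypothetical class: a person holds a name and a plain list of phone strings.
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name}, phones => [ @{ $args{phones} || [] } ] };
    return bless $self, $class;
}
sub name   { $_[0]{name} }
sub phones { @{ $_[0]{phones} } }

package main;

# Hypothetical rows from two normalised tables sharing a person id.
my @person_rows = (
    { id => 1, name => 'Alice' },
    { id => 2, name => 'Bob' },
);
my @phone_rows = (
    { person_id => 1, number => '555-0100' },
    { person_id => 1, number => '555-0101' },
    { person_id => 2, number => '555-0200' },
);

# Group the phone numbers by person id.
my %phones_for;
push @{ $phones_for{ $_->{person_id} } }, $_->{number} for @phone_rows;

# Build one object per person row, attaching that person's numbers.
my @people;
for my $row (@person_rows) {
    push @people, Person->new(
        name   => $row->{name},
        phones => $phones_for{ $row->{id} },
    );
}

printf "%s: %s\n", $_->name, join( ', ', $_->phones ) for @people;
```

If phone numbers later need behaviour of their own (the telco case), the strings in `phones` could be swapped for objects without touching the table layout.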
Re: OO design and persistence by Joost (Canon) on Dec 22, 2005 at 15:40 UTC
In practice, I've seen three strategies, two of which you mentioned.

1. Use one table per inheritance tree with all possible attributes and a "type" column that specifies the class of the object.
   Pro: Fast database actions. Easy to "upgrade" the class of an object (e.g. promote a student to teacher, though if you need to do this regularly, you should probably not use inheritance anyway). Stores all attributes in exactly one place.
   Con: Possibly wastes space; gets confusing if you have a largish number of subclasses. You should probably not use this if you want to represent "the ultimate base class" (Object/UNIVERSAL) in your database.
2. Use one table for each class with only the additional attributes and the object-id in the subclass' table.
   Pro: Allows for larger inheritance trees to be modeled in the database. Usually stores all attributes in exactly one place.
   Con: Makes efficient selecting on multiple attributes an "interesting" problem. Retrieving all attributes of an object is fairly expensive if the inheritance tree is large.
3. Use one table for each class with all attributes for that class in the representing table.
   Pro: Allows for larger inheritance trees to be modeled in the database. Can do reasonably efficient selects on multiple attributes.
   Con: Not very efficient if you want to retrieve multiple subclasses. Stores inherited attributes in multiple tables (especially nasty for updates/inserts if you have large fields).

In summary: OO-relational mappings are not as clear-cut as they appear at first glance. Which system is best depends on your typical use. You always have to make trade-offs, and in complicated systems you will want to write your own queries for certain actions that just don't map well.

update: note that some databases (cough postgres cough) support something called table-inheritance, which might be an interesting alternative solution.

"What should it profit a man, if he should win a flame war, yet lose his cool?"
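To make the three layouts concrete, here is a rough sketch using in-memory arrays of hashes as stand-ins for database tables. Everything in it is an assumption for illustration: the Person/Student/Teacher hierarchy, the column names, and the `find_row` and `load_*` helpers are all made up, and a real implementation would issue SQL (for instance through DBI) instead of grepping arrays. The point is only how many "tables" each strategy has to touch to rebuild one object.

```perl
use strict;
use warnings;

# Hypothetical hierarchy: Person (name) <- Student (school), Teacher (salary).

# 1. Single table: every attribute in one table, plus a type column.
my @single_person = (
    { id => 1, type => 'Student', name => 'Ann', school => 'MIT', salary => undef },
    { id => 2, type => 'Teacher', name => 'Bob', school => undef, salary => 50000 },
);

# 2. One table per class holding only that class's additional attributes.
my %class_tables = (
    person  => [ { id => 1, name => 'Ann' }, { id => 2, name => 'Bob' } ],
    student => [ { id => 1, school => 'MIT' } ],
    teacher => [ { id => 2, salary => 50000 } ],
);

# 3. One table per class repeating the inherited attributes.
my %concrete_tables = (
    student => [ { id => 1, name => 'Ann', school => 'MIT' } ],
    teacher => [ { id => 2, name => 'Bob', salary => 50000 } ],
);

# Stand-in for "SELECT * FROM table WHERE id = ?".
sub find_row {
    my ($rows, $id) = @_;
    my ($row) = grep { $_->{id} == $id } @$rows;
    return $row;
}

# Strategy 1: one lookup; drop the NULL columns that belong to other classes.
sub load_single {
    my ($id) = @_;
    my $row = find_row( \@single_person, $id ) or return;
    my %obj = map { $_ => $row->{$_} } grep { defined $row->{$_} } keys %$row;
    return \%obj;
}

# Strategy 2: one lookup per level of the hierarchy, merged (a join in SQL).
sub load_class_table {
    my ($id, $type) = @_;
    my $base  = find_row( $class_tables{person}, $id ) or return;
    my $extra = find_row( $class_tables{ lc $type }, $id ) || {};
    my %obj   = ( %$base, %$extra, type => $type );
    return \%obj;
}

# Strategy 3: one lookup, but the caller must know which table to ask.
sub load_concrete {
    my ($id, $type) = @_;
    my $row = find_row( $concrete_tables{ lc $type }, $id ) or return;
    my %obj = ( %$row, type => $type );
    return \%obj;
}

# All three should rebuild the same student.
for my $obj ( load_single(1), load_class_table( 1, 'Student' ), load_concrete( 1, 'Student' ) ) {
    print join( ', ', map { $_ . '=' . $obj->{$_} } sort keys %$obj ), "\n";
}
```

Note how strategy 2 needs two lookups even for this two-level tree, which is the retrieval cost mentioned above, while strategy 3 stores `name` in two places.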
Re^2: OO design and persistence by qq (Hermit) on Dec 22, 2005 at 22:56 UTC
"update: note that some databases cough postgres cough support something called table-inheritance, which might be an interesting alternative solution."

I've looked hopefully at this but been put off by the serious "Caveats" section in the docs on table inheritance. It would appear to essentially break primary and foreign keys!
Re: OO design and persistence by perrin (Chancellor) on Dec 22, 2005 at 15:26 UTC
The answers to all questions are on CPAN. Also, Martin Fowler has a nice discussion of this in one of his books and the brief descriptions are here. He calls them single table inheritance, class table inheritance and concrete table inheritance.
Re^2: OO design and persistence by jimbus (Friar) on Dec 22, 2005 at 15:31 UTC
Cool, sometimes getting the appropriate pattern names to research is the most frustrating part for me... thanks for the pointers.

Jimbus
Never moon a werewolf!
Re: OO design and persistence by tphyahoo (Vicar) on Dec 22, 2005 at 17:14 UTC
On this theme, you may be enlightened by dragonchild's excellent post at OO concepts and relational databases.
Re: OO design and persistence by ptum (Priest) on Dec 22, 2005 at 15:11 UTC
"As an extension: from a DB perspective, if a chunk of info is something that you would logically normalize into its own table, does that suggest that it should be its own object?"

Not necessarily. I think that an object (like a person) will often be implemented across many tables, depending on the roles and relationships that apply to that person. I generally try to keep the database implementation details quite separate from the object implementation.

No good deed goes unpunished. -- (attributed to) Oscar Wilde
Re: OO design and persistence by herveus (Prior) on Dec 22, 2005 at 17:26 UTC
Howdy!

I'm a database guy first, so that colors my perspective. Designing a set of classes has a lot in common with database design. In On Flyweights... (with sneaky segue to data modeling), I visited the general area. In essence, I'm advocating applying normalization techniques to object modeling. The more you can minimize data duplication, the better. If you find, subsequently, that you have genuine performance issues that you can pin on that donkey, then (and only then) should you "denormalize".

The biggest hurdle is modeling inheritance. I'd say to start by keeping the attributes for each class in their own tables. Child classes would need to keep a reference to the parent instance that contains the parent class data, and a parent class would need some way to distinguish data for an object of that parent class from data for an object in a subclass. From there, you can play refactoring games to move fields around, etc.

yours, Michael
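One possible reading of the parent/child arrangement described above, as a small sketch. The table contents, the `class` discriminator column and the `load_object` helper are all hypothetical, and the hashes are only stand-ins for real tables keyed by id: the parent row records which class the object really belongs to, and the child table is keyed by the id of the parent row it extends.

```perl
use strict;
use warnings;

# Hypothetical parent table. The "class" column distinguishes a plain Person
# from the parent part of a subclass instance.
my %person = (
    1 => { name => 'Ann', class => 'Person' },
    2 => { name => 'Bob', class => 'Teacher' },
);

# Hypothetical child table, keyed by the id of the parent row it extends.
my %teacher = (
    2 => { salary => 50000 },
);

# Which child table (if any) holds the extra attributes for a class.
my %child_table_for = ( Teacher => \%teacher );

# Load the parent row, then follow the discriminator to the child table.
sub load_object {
    my ($id) = @_;
    my $parent = $person{$id} or return;
    my %attrs  = %$parent;
    if ( my $child = $child_table_for{ $parent->{class} } ) {
        %attrs = ( %attrs, %{ $child->{$id} || {} } );
    }
    return \%attrs;
}

my $bob = load_object(2);
print "$bob->{name} is a $bob->{class}\n";
```

Moving a field between parent and child then only changes which hash (table) it lives in; `load_object` keeps returning the same merged view.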
Re: OO design and persistence by derby (Abbot) on Dec 22, 2005 at 15:36 UTC
While you can find lotsa stuff about OO persistence to RDBMSes and/or OODBMS, I've always found the whole topic to be too much spoonerism for me - the mixed metaphor is real ugly. I've always found it easier to do the two separately and then marry them with a lightweight persistence layer (err ... DBI/DBD) ... but then again, I'm just silly that way

-derby
Re: OO design and persistence by dimar (Curate) on Dec 23, 2005 at 00:50 UTC
"from a DB perspective, if a chunk of info is something that you would logically normalize ... should (it) be its own object?"

My favorite node on 'big picture' questions in OOP: The world is not object oriented. (see also OO concepts and relational databases mentioned elsewhere in this thread)

Often the 'academic' approach to OOP and Data Architecture differs dramatically from the 'nuts and bolts' approach. This contrast in approaches is made more thorny by the fact that "Object Orientation" means different things to different people, even if they come from the same "school of thought"!

From an academic perspective, this is a good thing, because it means more opportunities to publish articles, debate, and apply for grants. From a nuts and bolts perspective, most end-users and I.T. clients do not care how sophisticated your object hierarchy is, or how clever you were in implementing it. What matters to them is whether they get what they expect: a performant application that does not make them think too hard.

With all the fire and noise generated behind the alluring mystique of OOP, very few people acknowledge that OO programming and Functional programming really have a lot in common, and consequently do not require such dramatic shifts in perspective when it comes to uniting the codebase with the persistence layer. Sometimes, the only tangible difference (to the programmer) is just a difference in syntax. Consider:

```perl
use strict;
use warnings;

my $str;

### style1: one step per statement
$str = 'lrep';
$str = uc($str);
$str = reverse($str);    # scalar assignment, so the string itself is reversed
print $str;

### style2: nested function calls
$str = 'lrep';
$str = reverse(uc($str));
print $str;

### style3: method chaining (OO-style pseudocode, not valid Perl)
# $str = 'lrep';
# $str.toUpperCase().reverse().toConsole();
```

This is of course a simplification, but the point is: do not let yourself get too mystified by the terminology and buzzwords. If your programming methodology is driving your choices for persistence ... you might have a candidate for over-engineered design.

=oQDlNWYsBHI5JXZ2VGIulGIlJXYgQkUPxEIlhGdgY2bgMXZ5VGIlhGV
Re: OO design and persistence by venk (Acolyte) on Dec 22, 2005 at 19:49 UTC
"As an extension: from a DB perspective, if a chunk of info is something that you would logically normalize into its own table, does that suggest that it should be its own object?"

That kind of data-centric design is surely useful, but I think one has to be careful not to put the cart before the horse. If you are building an object-oriented system, IMHO you should put more emphasis on modeling the problem you are trying to solve, and less on worrying about how the database is going to feel about your solution.
Re: OO design and persistence by CountZero (Bishop) on Dec 22, 2005 at 22:49 UTC
Much will depend on how you wish to access your objects. If you wish to "run" them straight from the persistence layer, then you will most likely need multiple tables. On the other hand, if the persistence layer is only there to persist the data in your objects, then you could probably do with a "flat" database schema which stores the serialized data, and all you have to implement (with a little help from CPAN) is the routine to load/save the object data. Several serialization schemes exist (to name but two: Storable and YAML), and which one is best suited for you will depend on how complicated your data is and/or whether you wish to store the data in a human-readable format or not.

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
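The "flat" approach above might look roughly like this with Storable, which ships with Perl. Treat it as a sketch: the `Person` class, the `save_object`/`load_object` helpers and the `%object_store` hash are hypothetical, and the hash merely stands in for a two-column (id, blob) table. `freeze` and `thaw` are the actual Storable functions for serializing to and from an in-memory string.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

package Person;

# Hypothetical class used only to show that blessing survives the round trip.
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub name { $_[0]{name} }

package main;

# Stand-in for a flat table with two columns: id and serialized blob.
my %object_store;

# Serialize the whole object into one opaque value.
sub save_object {
    my ($id, $obj) = @_;
    $object_store{$id} = freeze($obj);
    return;
}

# Rebuild the object (including its class) from the stored blob.
sub load_object {
    my ($id) = @_;
    return unless exists $object_store{$id};
    return thaw( $object_store{$id} );
}

save_object( 42, Person->new( name => 'Ann', phones => [ '555-0100', '555-0101' ] ) );

my $copy = load_object(42);
print ref($copy), ': ', $copy->name, "\n";
```

The trade-off is the one described above: loading and saving are trivial, but the database cannot query inside the blob, and Storable's output is binary rather than human-readable (YAML would be the choice for the latter).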
table inheritance in PostgreSQL by zby (Vicar) on Dec 23, 2005 at 15:50 UTC
This might still be a bit exotic, but in PostgreSQL there is a mechanism called inheritance exactly for cases like this. See 5.8. Inheritance in the PostgreSQL manual.

