PerlMonks  

Re^2: Maybe database tables aren't such great "objects," after all ...

by DStaal (Chaplain)
on Mar 25, 2011 at 20:54 UTC ( [id://895596] )


in reply to Re: Maybe database tables aren't such great "objects," after all ...
in thread Maybe database tables aren't such great "objects," after all ...

While I might not go quite that far in many cases, I fully agree that 'table==object' is not abstraction; it's just convolution. In most databases I've worked with, a single object might well have data in multiple tables (with relations among them), or an object could be built from just part of a table. Thinking that an object is a table, and vice versa, will only limit your idea of what can be done and what solutions are possible.

Replies are listed 'Best First'.
Re^3: Maybe database tables aren't such great "objects," after all ...
by jordanh (Chaplain) on Mar 27, 2011 at 16:46 UTC
    I'm not experienced in using ORMs or the architectures being discussed here, but it seems to me that problems might arise if the database structure is too hidden from the people writing and maintaining an application.

    How do you maintain transactional integrity in a situation where Perl objects reference multiple tables and different Perl objects might have overlapping table references? It seems to me that sometimes you want to make sure that a number of objects being updated result in the database tables being updated atomically, and that not understanding the impact on the underlying tables might lead one to believe this is being done when it's not. Conversely, you might only want to operate on a single object and have that update the table without waiting on operations on associated objects.

    Forgive me if I'm speaking from ignorance of the tools.

      How do you maintain transactional integrity...

      By encapsulation, usually as a stored procedure or sometimes, when the database engine is too limited, by a subroutine in the database interface layer.
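      To make the "subroutine in the database interface layer" option concrete, here is a minimal sketch in Perl using DBI. The table and column names (orders, order_items) are invented for illustration; the point is that callers see one subroutine, not the tables it touches, and the transaction guarantees all-or-nothing behaviour.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical interface-layer subroutine: inserts into two tables
# atomically. Callers never see the underlying schema.
sub place_order {
    my ($dbh, $customer_id, @items) = @_;
    $dbh->begin_work;
    eval {
        $dbh->do('INSERT INTO orders (customer_id) VALUES (?)',
                 undef, $customer_id);
        my $order_id = $dbh->last_insert_id(undef, undef, 'orders', 'id');
        for my $item (@items) {
            $dbh->do('INSERT INTO order_items (order_id, sku, qty)
                      VALUES (?, ?, ?)',
                     undef, $order_id, $item->{sku}, $item->{qty});
        }
        $dbh->commit;
    };
    if (my $err = $@) {
        $dbh->rollback;    # neither table is touched on failure
        die $err;
    }
    return 1;
}
```

      A stored procedure would express the same idea inside the database engine itself, which is usually preferable when the engine supports it.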

      If one person is working on an application, that person needs to understand both the database and the application and how they relate to one another. He or she will be responsible for deciding whether code belongs in the application layers or the database layers (stored procedures/database interface libraries).

      In a larger team, it is possible that some people would work only with application objects, others would work only with databases, and there might be a dual-skilled programmer (database and OOP) working with both teams, handling the mapping between the application model and the database model.

      I realize it is attractive to think one can eliminate the need for dual knowledge and translation between models by collapsing the object architecture and the database into one. The skill and human resource requirements become much more complex if there really are two separate models.

      However, it rarely works over the long term because databases and applications have very different goals. A database's job is to maintain the integrity of persistent data for a business or research project. Databases are fundamentally conservative. Since a database is at the center of an application ecosystem, changes to data structures are very disruptive.

      By contrast, an application's job is to find ways to use that data. Applications are essentially innovative, finding ever new ways to help a business make use of its existing information resources to meet changing operational and market needs.

      One of the big takeaways of normalization/database theory is that within certain constraints we can take one data structure and morph it into another, more useful form. With the help of joins and projections we can create nearly any database view or object we need. In many cases we can use the same rules to automatically convert a view back into discrete tables and rows. We lose the benefit of that insight if we insist on an artificial one-to-one relationship between objects and tables.
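      As a small illustration of that "morphing", here is a sketch where two normalized tables are projected back into a single application-level object via a join. The schema and names are invented, and a throwaway in-memory SQLite database (DBD::SQLite) stands in for a real engine.

```perl
use strict;
use warnings;
use DBI;

# Throwaway in-memory database with two normalized tables.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1 });
$dbh->do('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('CREATE TABLE orders (id INTEGER PRIMARY KEY,
                               customer_id INTEGER, placed_at TEXT)');
$dbh->do(q{INSERT INTO customers VALUES (1, 'Acme Corp')});
$dbh->do(q{INSERT INTO orders VALUES (10, 1, '2011-03-27')});

# The join is the "morph": two tables become one object the
# application can use, with no one-to-one table/object mapping.
my $order = $dbh->selectrow_hashref(q{
    SELECT o.id, o.placed_at, c.name AS customer_name
      FROM orders o
      JOIN customers c ON c.id = o.customer_id
     WHERE o.id = ?
}, undef, 10);
```

      Going the other way (decomposing an updated view back into table rows) is what an ORM's mapping layer, or an updatable view, does for you.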

      How do you maintain transactional integrity in a situation where Perl objects reference multiple tables and different Perl objects might have overlapping table references? It seems to me that sometimes you want to make sure that a number of objects being updated result in the database tables being updated atomically, and that not understanding the impact on the underlying tables might lead one to believe this is being done when it's not.
      Then your encapsulation/abstraction is broke^Wsuboptimal.

      Your example seems to come from a notion that still has a very tight coupling between data layout and Perl data structures.

      Conversely, you might only want to operate on a single object and have that update the table without waiting on operations on associated objects.
      "The table"? What table? Again, loosen the idea there should be a mapping between tables and objects (although you probably mean rows and objects (or tables and classes)).

      In addition to ELISHEVA's excellent post: one of the basic tasks of a database is the ability to atomically perform actions across multiple tables and/or records when needed. If your database can't do that, you need to look for a different database. Most databases also let different processes/queries update multiple records in a table at the same time, without one update getting in the way of the other. (Subject to volume, structure, and resource limitations. There are a couple of low-end databases that can't do the latter, and they may still be useful in some situations as long as you are aware of that limitation.)

      So, yes, that would be an ignorance of the tools issue: The tools should be able to handle those situations, when used correctly.

        It still seems to me that sometimes you'd want to operate on a number of objects inside a single transaction and other times in different transactions.

        If you were to commit explicitly at a high level, that would expose the nature of the underlying database to the high-level code.

        I think ELISHEVA makes a good point. To design these things, you'd need someone with excellent DB and Object knowledge, but I'm wondering if you might also need that kind of knowledge to use the Objects effectively.

        Update: I'm probably talking nonsense here. I can't really think of where you wouldn't want to encapsulate the commits with the updates, although perhaps you could get some efficiencies by deferring them across object references. That could be built into the toolkit, too, if you were clever.
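        That "built into the toolkit" idea is essentially the unit-of-work pattern. Here is a toy sketch of it in Perl: objects queue their pending SQL, and a single flush() runs everything inside one transaction, so commits are deferred across object references but still atomic. All names here are invented, and real ORMs do this with far more sophistication.

```perl
use strict;
use warnings;
use DBI;

# Toy "unit of work": updates from several objects are queued, then
# flushed in one transaction. Hypothetical sketch, not a real ORM API.
package UnitOfWork;

sub new   { my ($class, $dbh) = @_; bless { dbh => $dbh, work => [] }, $class }
sub queue { my ($self, $sql, @bind) = @_; push @{ $self->{work} }, [ $sql, \@bind ] }

sub flush {
    my $self = shift;
    my $dbh  = $self->{dbh};
    $dbh->begin_work;
    eval {
        $dbh->do( $_->[0], undef, @{ $_->[1] } ) for @{ $self->{work} };
        $dbh->commit;
    };
    if ( my $err = $@ ) {
        $dbh->rollback;    # all-or-nothing across every queued update
        die $err;
    }
    $self->{work} = [];
    return 1;
}

package main;
# Usage: each object would call $uow->queue(...) as it changes,
# and the application calls $uow->flush once per logical transaction.
```

        The efficiency you mention falls out naturally: one round of locking and one commit instead of one per object.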