Dear Monks,

I would like some advice on loading a daily downloaded file of XML data into a database in a simple but robust manner. FWIW it is Mergent's Standardized Data Feed, called Mergent Global Company Data, which includes for each company its history, summaries of its quarterly financial statements, lists of officers, text sections, etc., and it's a bit complex. For example, the feed will change over time, and it can have different sections for the different classes of executives it lists. Text sections are delivered as <![CDATA[ ...html here... ]]> too. I can get either the full file or just the new bits delivered by FTP. (No, this is not RSS.)

I am planning on using this in a Catalyst app which normally would be using DBIx::Class, and will also have some manually entered data for fields not in the feed. Does anyone have experience with this kind of database updated by an XML data feed? I've skimmed some likely sounding modules on CPAN like DBIx::XML::DataLoader (in beta for 6 years now), DBIx::DBStag and its cookbook, etc.

The last guy who tried to write a schema for the feed quit the job, and I want to be a good lazy japh, so now I'm even thinking about some way to just load the XML once on startup and search that. Otherwise, I need to rebuild the database from the whole feed automatically.
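To make the "load once on startup and search that" idea concrete, here is a minimal sketch using only core modules. It assumes the feed has already been parsed into one hash per company keyed by a company id, and that the daily delta delivers whole replacement records; the record layout, the ids, and the cache file name are all made up for illustration and are not taken from the Mergent feed.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical records keyed by company id. In practice these would be
# built by parsing the full feed and the daily delta file.
my %full = (
    C1 => { name => 'Acme Corp',  officers => ['Jane Doe'] },
    C2 => { name => 'Widget Ltd', officers => [] },
);
my %delta = (
    C2 => { name => 'Widget Ltd', officers => ['John Roe'] },  # changed
    C3 => { name => 'New Co',     officers => [] },            # new
);

# Persist the parsed full feed once so startup does not re-parse XML.
my $cache = tempdir(CLEANUP => 1) . '/companies.sto';
store(\%full, $cache);

# At application startup: load the whole structure into memory.
my $companies = retrieve($cache);

# Daily: overwrite changed companies and add new ones, then re-save.
@{$companies}{ keys %delta } = values %delta;
store($companies, $cache);

# "Search" is then just a grep over the in-memory hash.
my @hits = sort grep { $companies->{$_}{name} =~ /widget/i } keys %$companies;
print "$_: $companies->{$_}{name}\n" for @hits;
```

This avoids a schema entirely, at the cost of holding everything in memory and having no place for the manually entered fields unless they are stored in the same structure or a separate small table.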

Use of the data will involve simple display of a company's data and also printing that to a report: something that can be edited and sent to PDF, like OpenOffice or maybe LyX.

Mainly I want to keep it simple and not require lots of maintenance, so rather than making and maintaining huge db schemas and creating DBIx relations (a company has many statements, a section has many officers, etc.) I wonder if a simpler answer is possible. Otherwise I could just code the existing data and then merge in new data daily that the model understands.

Thanks for your help.

Matt R.

UPDATED: Looks like maybe using XML::Parser to build my table is the answer. Also found DBIx::XML::DataLoader; has anyone used it, and is it being maintained? The docs are a little opaque to me, but not as scary, I think, as XML::RDB.
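A rough sketch of what the XML::Parser route might look like: stream the feed and flatten each company into generic (company id, element path, text) rows, so that new sections in the feed do not require schema changes. The element names (Companies, Company, Officers and so on) and the id attribute are invented for the example; the real Mergent tags are not shown here and will differ.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document, standing in for the real feed.
my $xml = <<'XML';
<Companies>
  <Company id="C1">
    <Name>Acme Corp</Name>
    <Officers>
      <Officer><Name>Jane Doe</Name><Title>CEO</Title></Officer>
    </Officers>
    <History><![CDATA[<p>Founded in 1901.</p>]]></History>
  </Company>
</Companies>
XML

my (@path, @rows, $company_id);
my $text = '';

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $elem, %attr) = @_;
            push @path, $elem;
            $company_id = $attr{id} if $elem eq 'Company';
            $text = '';    # text seen before a child element is dropped
        },
        Char => sub {
            my ($expat, $chars) = @_;
            $text .= $chars;    # CDATA content arrives here too
        },
        End => sub {
            my ($expat, $elem) = @_;
            # Keep only non-blank text collected since the last tag.
            if (defined $company_id && $text =~ /\S/) {
                push @rows, [ $company_id, join('/', @path), $text ];
            }
            $text = '';
            pop @path;
            undef $company_id if $elem eq 'Company';
        },
    },
);
$parser->parse($xml);

printf "%s\t%s\t%s\n", @$_ for @rows;
```

Each row could then be inserted into a single three-column table, which keeps maintenance low but pushes the work of interpreting paths into the display code. Repeated elements such as several officers would share the same path, so a position counter would be needed to tell them apart.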

In reply to Building a database from XML data feed by mattr
