PerlMonks  

Dear Monks,

I would like some advice on loading a daily downloaded file of XML data into a database in a simple but robust manner. FWIW, it is Mergent’s Standardized Data Feed, called Mergent Global Company Data. For each company it includes a history, summaries of quarterly financial statements, lists of officers, text sections, etc., and it’s a bit complex. For example, the feed will change over time, and it can have different sections for the different classes of executives it lists. Text sections are delivered as <![CDATA[ …html here… ]]> too. I can get either the full file or just the new bits, delivered by FTP. (No, this is not RSS.)

I am planning on using this in a Catalyst app, which would normally use DBIx::Class and which will also hold some manually entered data for fields not in the feed. Does anyone have experience with this kind of database, updated by an XML data feed? I’ve skimmed some likely-sounding modules on CPAN, like DBIx::XML::DataLoader (in beta for 6 years now), DBIx::DBStag and its cookbook, etc.

The last guy who tried to write a schema for the feed quit the job, and I want to be a good lazy JAPH, so now I’m even thinking about some way to just load the XML once on startup and search that. Otherwise, I need to rebuild the database from the whole feed automatically.
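A minimal sketch of the load-once-and-search idea, using XML::Parser handlers to build an in-memory hash. The element and attribute names (`feed`, `company`, `id`, `name`, `history`) are hypothetical stand-ins, since the real Mergent layout is not shown here, and the sketch only handles flat child elements, not nested sections such as officer lists.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample; the real feed's element names will differ.
my $xml = <<'XML';
<feed>
  <company id="C1">
    <name>Acme Corp</name>
    <history><![CDATA[<p>Founded 1950.</p>]]></history>
  </company>
  <company id="C2">
    <name>Globex Inc</name>
    <history><![CDATA[<p>Founded 1989.</p>]]></history>
  </company>
</feed>
XML

my (%companies, $current, $field);

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag, %attr) = @_;
            if ($tag eq 'company') {
                # Start a new record keyed by the (assumed) id attribute.
                $current = { id => $attr{id} };
            }
            elsif ($current) {
                # Any child of <company> becomes a field of the record.
                $field = $tag;
                $current->{$field} = '';
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            # CDATA content arrives here too; skip whitespace between elements.
            $current->{$field} .= $text if $current && defined $field;
        },
        End => sub {
            my ($expat, $tag) = @_;
            if ($tag eq 'company') {
                $companies{ $current->{id} } = $current;
                undef $current;
            }
            undef $field;
        },
    },
);
$parser->parse($xml);

# "Search" is then just a grep over the in-memory records.
my @hits = grep { $_->{name} =~ /acme/i } values %companies;
print "$_->{id}: $_->{name}\n" for @hits;
```

For a real daily file, `$parser->parsefile($path)` would replace the inline string. Whether the whole feed fits comfortably in memory is an open question this sketch does not answer.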

Use of the data will involve simple display of a company’s data and also printing it to a report, in something that can be edited and sent to PDF, like OpenOffice or maybe LyX.

Mainly I want to keep it simple and not require lots of maintenance, so rather than making and maintaining huge db schemas and creating DBIx::Class relations (a company has many statements, a section has many officers, etc.), I wonder if a simpler answer is possible. Otherwise, I could just code a model for the existing data and then merge in, daily, whatever new data the model understands.

Thanks for your help.

Matt R.

UPDATED: Looks like maybe using XML::Parser to build my table is the answer. I also found DBIx::XML::DataLoader. Has anyone used it, and is it being maintained? The docs are a little opaque to me, but not as scary, I think, as XML::RDB.
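A rough sketch of the XML::Parser route, feeding parsed fields into a table through DBI. Everything specific here is an assumption for illustration: the element names, the in-memory SQLite database (which needs DBD::SQLite), and the single generic `company_field` table, which is just one way to avoid a schema that mirrors the feed, not a recommendation from anyone who has loaded this feed.

```perl
use strict;
use warnings;
use DBI;
use XML::Parser;

# Hypothetical sample; the real feed's element names will differ.
my $xml = <<'XML';
<feed>
  <company id="C1"><name>Acme Corp</name></company>
  <company id="C2"><name>Globex Inc</name></company>
</feed>
XML

# In-memory SQLite keeps the sketch self-contained.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE company_field (company_id TEXT, field TEXT, value TEXT)');
my $insert = $dbh->prepare(
    'INSERT INTO company_field (company_id, field, value) VALUES (?, ?, ?)');

my ($id, $field, $text);
my $parser = XML::Parser->new(Handlers => {
    Start => sub {
        my ($expat, $tag, %attr) = @_;
        if ($tag eq 'company') { $id = $attr{id} }
        elsif (defined $id)    { ($field, $text) = ($tag, '') }
    },
    # Accumulate character data only while inside a field element.
    Char => sub { $text .= $_[1] if defined $field },
    End => sub {
        my ($expat, $tag) = @_;
        if (defined $field && $tag eq $field) {
            # One row per field, so new feed sections need no schema change.
            $insert->execute($id, $field, $text);
            undef $field;
        }
        undef $id if $tag eq 'company';
    },
});

# Load the whole feed in one transaction so a failed parse leaves no partial data.
$dbh->begin_work;
$parser->parse($xml);
$dbh->commit;

my ($name) = $dbh->selectrow_array(
    'SELECT value FROM company_field WHERE company_id = ? AND field = ?',
    undef, 'C2', 'name');
print "$name\n";
```

A full rebuild from the daily file would amount to emptying the table and re-running the parse inside the same transaction. Like the flat handlers above it, this does not yet cope with nested sections.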


In reply to Building a database from XML data feed by mattr
