Yes, I'm throwing out the old work. It requires building a new table for every company every year, is undocumented, ... I'll stop there.

The spec is understandable. The open question is whether to build a facility for manually merging specific partial feeds, or just to rebuild the whole thing daily from a full feed. I'd also want a rollback, and maybe a way to lock fields against being updated.
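To make the lock-and-rollback idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `apply_update`, the choice to treat a company as a flat dict of strings, and the replacement phone number and URL, which are made up. It only shows one way a partial update could skip locked fields while keeping the prior state around for rollback.

```python
import copy


def apply_update(record, update, locked=frozenset()):
    """Merge a partial update into a record, skipping locked fields.

    Returns (merged, previous). `previous` is an untouched copy of the
    input record, so a caller can roll back by simply restoring it.
    """
    previous = copy.deepcopy(record)
    merged = dict(record)
    for field, value in update.items():
        if field in locked:
            continue  # an admin has locked this field; keep the old value
        merged[field] = value
    return merged, previous


# Field names come from the <Identity> block of the sample feed;
# the update values are hypothetical.
company = {
    "OfficialName": "21st Century Holding Co.",
    "PhoneNumber": "954 581 9993",
    "WebSite": "",
}
update = {"PhoneNumber": "954 581 0000", "WebSite": "http://example.invalid/"}

merged, snapshot = apply_update(company, update, locked={"PhoneNumber"})
```

With a daily full rebuild instead of partial merges, the same function would still apply: the "update" is just the whole freshly parsed record, and the locks protect any manually curated fields.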

I've also been pondering a model built around loading the full XML feed straight into memory at server startup, which might make it more robust and configurable. And yes, they say the schema will change, but not how. I figure the most important parts won't change, but I'd like to make it configurable by the admin so I don't have to support it forever. Certainly an update will be issued when an executive of a company is hired or retires; new types of officers could also be added, etc.
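A rough sketch of the load-into-memory idea, in Python with the standard `xml.etree.ElementTree` module. The `FIELD_PATHS` table and the `load_feed` function are hypothetical names; the point is only that the mapping from display field to feed path lives in data an admin could edit, so a schema change means editing a path rather than code. The inline sample is a cut-down version of the feed shown further down.

```python
import xml.etree.ElementTree as ET

# Hypothetical admin-editable mapping: our field name -> path under <ENTITY>.
FIELD_PATHS = {
    "name": "COMPANY/Identity/OfficialName",
    "ticker": "COMPANY/Identity/CommonTicker",
    "state": "COMPANY/Identity/State",
}


def load_feed(xml_text, field_paths=FIELD_PATHS):
    """Parse the feed once and return {EntityReference: {field: string}}."""
    root = ET.fromstring(xml_text)
    companies = {}
    for entity in root.iter("ENTITY"):
        ref = entity.get("EntityReference")
        # findtext returns None for a missing path; store "" so that
        # everything stays plain string data.
        companies[ref] = {
            name: (entity.findtext(path) or "")
            for name, path in field_paths.items()
        }
    return companies


SAMPLE = """<Feed ExtractDate="08/08/2006" ExtractTime="11:30:41">
  <ENTITY EntityReference="0000127509" LegalName="21st Century Holding Co." Status="A">
    <COMPANY>
      <Identity>
        <OfficialName>21st Century Holding Co.</OfficialName>
        <CommonTicker>TCHC</CommonTicker>
        <State>FL</State>
      </Identity>
    </COMPANY>
  </ENTITY>
</Feed>"""

companies = load_feed(SAMPLE)
```

A path that no longer exists after a schema change yields an empty string rather than an error, which may or may not be the behaviour you want; logging the miss would be an easy addition.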

So yes, I can see a way to model the largest features of the XML structure in DBIx, but I'm intrigued by the possibility of not greatly minimizing that. Somewhere, though, I'll have to do some degree of linking feed data to manually entered data, or import both into the same database. It can all just be string data. Maybe JSON and YAML could be useful.
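One way the "same database, all strings" option could look, sketched in Python with the standard `sqlite3` module. The table name `company_field`, the `source` column, and the rule that a manual value wins over a feed value are all assumptions made for the example, not anything the spec requires.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store with one string-only table shared by feed and manual data."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS company_field (
               entity_ref TEXT NOT NULL,
               field      TEXT NOT NULL,
               source     TEXT NOT NULL,   -- 'feed' or 'manual'
               value      TEXT NOT NULL,
               PRIMARY KEY (entity_ref, field, source)
           )"""
    )
    return db


def put(db, entity_ref, field, value, source):
    """Insert or overwrite one value for a given source."""
    db.execute(
        "INSERT OR REPLACE INTO company_field VALUES (?, ?, ?, ?)",
        (entity_ref, field, source, value),
    )


def get(db, entity_ref, field):
    """Return the manual value if there is one, otherwise the feed value."""
    row = db.execute(
        "SELECT value FROM company_field "
        "WHERE entity_ref = ? AND field = ? "
        "ORDER BY CASE source WHEN 'manual' THEN 0 ELSE 1 END LIMIT 1",
        (entity_ref, field),
    ).fetchone()
    return row[0] if row else None


db = open_store()
put(db, "0000127509", "CommonTicker", "TCHC", source="feed")
put(db, "0000127509", "WebSite", "", source="feed")
# Hypothetical manually entered value layered over the empty feed value.
put(db, "0000127509", "WebSite", "http://example.invalid/", source="manual")
```

Because feed rows and manual rows are keyed separately, a daily rebuild can delete and reload every `source = 'feed'` row without touching anything entered by hand.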

The data looks like this. The feed probably covers thousands of companies; here's just one. I think storing 500 companies is closer to what we need for now, though.

<?xml version="1.0" encoding="ISO-8859-1"?>
<Feed ExtractDate="08/08/2006" ExtractTime="11:30:41">
  <ENTITY EntityReference="0000127509" LegalName="21st Century Holding Co." Status="A">
    <COMPANY>
      <Identity>
        <OfficialName>21st Century Holding Co.</OfficialName>
        <ShortName>21st Century Holding Co.</ShortName>
        <Status>Active</Status>
        <CountryCode>USA</CountryCode>
        <Region>South Atlantic</Region>
        <CompNumber>00096995</CompNumber>
        <CIK>0001069996</CIK>
        <MergentIndustryCode>8.2</MergentIndustryCode>
        <CommonTicker>TCHC</CommonTicker>
        <CommonExchange>NMS</CommonExchange>
        <CommonCusip>90136Q100</CommonCusip>
        <Street1>4161 N.W. 5th Street</Street1>
        <City>Plantation</City>
        <State>FL</State>
        <Country>USA</Country>
        <Zipcode>33317</Zipcode>
        <PhoneNumber>954 581 9993</PhoneNumber>
        <Email></Email>
        <WebSite></WebSite>
        <FYE>12/31/2005</FYE>
      </Identity>
      <BusinessActivities>
        <SIC Primary="6331" Secondary="6719"/>
        <NAIC Primary="524126" Secondary="551112"/>
        <TextSection Title="Business Summary" Date="06/01/2006">
<![CDATA[
<p>21st Century Holding is an insurance holding company, which, through its subsidiaries, controls the insurance underwriting, distribution and claims process. Co. underwrites personal automobile insurance and homeowners and mobile home property and casualty insurance in the State of Florida through its subsidiary, Federated National Insurance Company. Co. has underwriting authority for third-party insurance companies which it represents through a managing general agent. Co. also offers financing to its own and third-party insureds through its subsidiary, Federated Premium Finance, Inc., and pays advances through Fed First Corp.</p>
]]>
        </TextSection>
      </BusinessActivities>
      <Executives>
        <Section Title="Officers">
          <Executive FirstName="Edward" MiddleName="J." LastName="Lawson" Title="Chmn., Pres."/>
          <Executive FirstName="Richard" MiddleName="A." LastName="Widdicombe" Title="C.E.O."/>
          <Executive FirstName="Michele" MiddleName="V." LastName="Lawson" Title="V.P., Agency Oper., Treas."/>
          <Executive FirstName="James" MiddleName="G." LastName="Jennings" Suffix="III" Title="C.F.O."/>
          <Executive FirstName="Keith" MiddleName="M." LastName="Linder" Title="C.O.O."/>
          <Executive FirstName="James" MiddleName="A." LastName="Epstein" Title="Sec."/>
        </Section>
        <Section Title="Directors">
          <Executive FirstName="Edward" MiddleName="J." LastName="Lawson" Title="Chmn."/>
          <Executive FirstName="Carl" MiddleName="" LastName="Dorf"/>
          <Executive FirstName="Bruce" MiddleName="" LastName="Simberg"/>
          <Executive FirstName="Charles" MiddleName="B." LastName="Hart" Suffix="Jr."/>
          <Executive FirstName="Richard" MiddleName="W." LastName="Wilcox" Suffix="Jr."/>
          <Executive FirstName="Peter" MiddleName="" LastName="Prygelski"/>
        </Section>
      </Executives>
      <FinData_Generated>
        <Report>
          <ReportDate>03/31/2006</ReportDate>
          <ReportType>Q1</ReportType>
          <Auditor>U</Auditor>
          <Currency>USA</Currency>
          <Consolidated>True</Consolidated>
          <fi Mapcode="-402" Amount="23001737"/>
          <fi Mapcode="-384" Amount="0.83"/>
          <fi Mapcode="-379" Amount="53213270"/>
          <fi Mapcode="-365" Amount="8599042"/>
          <fi Mapcode="-364" Amount="40167125"/>
          <fi Mapcode="-356" Amount="227079885"/>
          <fi Mapcode="-344" Amount="93988871"/>
          <fi Mapcode="-337" Amount="28367811"/>
          <fi Mapcode="-333" Amount="6013312"/>
          <fi Mapcode="-310" Amount="25114709"/>
          <fi Mapcode="-249" Amount="36.8577792400461"/>
        </Report>
        ... 20 more reports here ...
          <ReportDate>03/31/2002</ReportDate>
          <ReportType>Q1</ReportType>
          <Auditor>U</Auditor>
          <Currency>USA</Currency>
          <Consolidated>True</Consolidated>
          <fi Mapcode="-402" Amount="6086503"/>
          <fi Mapcode="-384" Amount="0.22"/>
          <fi Mapcode="-379" Amount="14592615"/>
          <fi Mapcode="-365" Amount="6165671"/>
          <fi Mapcode="-364" Amount="5822488"/>
          <fi Mapcode="-356" Amount="59264371"/>
          <fi Mapcode="-344" Amount="17710206"/>
          <fi Mapcode="-337" Amount="549056"/>
          <fi Mapcode="-333" Amount="991370"/>
          <fi Mapcode="-310" Amount="9507000"/>
          <fi Mapcode="-249" Amount="16.2431471547281"/>
        </Report>
      </FinData_Generated>
      <Miscellaneous>
        <Employee Description="AppoximateFullTime" Count="135" AsOf="12/31/2005"/>
        <Shareholders Count="3000" AsOf="03/29/2006"/>
        <ShareHolderRelations Name="Becky Campillo" PhoneNumber="954-581-9993 x1257"/>
        <Incorporation Country="USA" State="FL" Month="3" Year="1991"/>
        <Provider ServiceType="Auditor" Name="McKean, Paul, Chrycy, Fletcher &amp; Co."/>
        <Provider ServiceType="Counsel" Name="Broad &amp; Cassel"/>
      </Miscellaneous>
      <StockSummary>
        <StockIssue Type="Common" Description="common">
          <StockOutstanding Amount="6048842.00" Units="SHR" Date="12/31/2004"/>
          <Par Amount="0.01" Units="USA"/>
          <Authorized Amount="37500000.00" Units="SHR" Unlimited="No"/>
          <Treasury Amount="696849.00" Units="SHR"/>
          <StockIdentity Ticker="TCHC" Exchange="Nasdaq National Market"/>
          <TextSection Title="Stock Splits" Date="06/01/2006">
<![CDATA[
<p><font color="black">$0.01 par shares split in the form of a 50% stock dividend on Sept. 7, 2004.</font></p>
]]>
          </TextSection>
          <TextSection Title="Ownership" Date="06/01/2006">
<![CDATA[
<p><font color="black">As of April 15, 2005, Edward J. Lawson and all directors and executive officers as a group held 25.1% and 33.1%, respectively of Co.'s outstanding common stock.</font></p>
]]>
          </TextSection>
          <TextSection Title="Voting Rights" Date="06/01/2006">
<![CDATA[
<p><font color="black">Entitled to one vote per share.</font></p>
]]>
          </TextSection>
          <TextSection Title="Dividends Paid" Date="06/01/2006">
<![CDATA[
<table border="1">
  <tr>
    <td> <p><font color="teal"><twocolumn>2001</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>0.08</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>2002</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>0.11</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>2003</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>0.32</twocolumn></font></p> </td>
  </tr>
</table><p/>
<p><font color="red"><footnote>&#65407;</footnote></font><font color="black">Adjusted for 3-for-2 split:</font></p>
<table border="1">
  <tr>
    <td> <p><font color="teal"><twocolumn>2004</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>0.32</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>[1]2005</twocolumn></font></p> </td>
    <td> <p><font color="teal"><twocolumn>0.32</twocolumn></font></p> </td>
    <td>&#65407;</td>
    <td>&#65407;</td>
  </tr>
</table><p/>
<p><font color="red"><footnote>[1]To Dec. 1</footnote></font></p>
]]>
          </TextSection>
          <TextSection Title="Options" Date="06/01/2006">
<![CDATA[
<p><font color="black">Dec. 31, 2004, authorized for issuance, 3,688,500 shares; options outstanding, 1,119,575 shares. </font></p>
]]>
          </TextSection>
          <TextSection Title="Transfer Agent &amp; Registrar" Date="06/01/2006">
<![CDATA[
<p><font color="black">Global Securities Transfer, Inc., Denver, CO</font></p>
]]>
          </TextSection>
          <TextSection Title="Price Range" Date="06/01/2006">
<![CDATA[
<table border="1">
  <tr>
    <td>&#65407;</td>
    <td> <p><font color="green"><pricerange>2004</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2003</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2002</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2001</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2000</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>1999</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>1998</pricerange></font></p> </td>
  </tr>
  <tr>
    <td> <p><font color="green"><pricerange>High</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>24.50</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>23.59</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>13.75</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>3.88</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>7 15/16</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>7 3/4</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>8 1/4</pricerange></font></p> </td>
  </tr>
  <tr>
    <td> <p><font color="green"><pricerange>Low</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>9.17</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>9</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>3</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>0.98</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2 7/16</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>2 7/8</pricerange></font></p> </td>
    <td> <p><font color="green"><pricerange>5 3/4</pricerange></font></p> </td>
  </tr>
</table><p/>
]]>
          </TextSection>
          <TextSection Title="Offered" Date="06/01/2006">
<![CDATA[
<p><font color="black">(1,250,000 shares) at $7.50 per share (proceeds to Co., $6.90 per share) on Nov. 10, 1998 through Gilford Securities Incorporated; and associates. Offering contained over-allotment options to cover 187,500 shares. Proceeds used for contribution to Federated National's capital to increase its underwriting capacity, repayment of a portion of the outstanding balance under Co.'s revolving line of credit agreement, financing of acquisitions and working capital and general corporate purposes.</font></p>
]]>
          </TextSection>
        </StockIssue>
      </StockSummary>
    </COMPANY>
  </ENTITY>
  ... more entities here ...
</Feed>
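Since the feed holds thousands of entities but only a subset needs storing, a streaming pass is one option. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse`; the function name `iter_entities`, the `wanted` set, and the second entity in the inline sample ("Example Co.") are hypothetical. It pulls the officers out of each wanted `<ENTITY>` and then discards the element's children so the whole file is never held in memory at once.

```python
import io
import xml.etree.ElementTree as ET


def iter_entities(source, wanted=None):
    """Yield (EntityReference, LegalName, officers) one entity at a time.

    `wanted` is an optional set of EntityReference strings to keep;
    None keeps everything. Officers are (first, last, title) tuples.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "ENTITY":
            continue
        ref = elem.get("EntityReference")
        if wanted is None or ref in wanted:
            officers = [
                (e.get("FirstName", ""), e.get("LastName", ""), e.get("Title", ""))
                for section in elem.iter("Section")
                if section.get("Title") == "Officers"
                for e in section.findall("Executive")
            ]
            yield ref, elem.get("LegalName"), officers
        elem.clear()  # drop this entity's children once we are done with it


# Cut-down sample; the second entity is invented to show the filter.
SAMPLE = b"""<Feed ExtractDate="08/08/2006" ExtractTime="11:30:41">
  <ENTITY EntityReference="0000127509" LegalName="21st Century Holding Co." Status="A">
    <COMPANY><Executives>
      <Section Title="Officers">
        <Executive FirstName="Edward" MiddleName="J." LastName="Lawson" Title="Chmn., Pres."/>
      </Section>
      <Section Title="Directors">
        <Executive FirstName="Carl" MiddleName="" LastName="Dorf"/>
      </Section>
    </Executives></COMPANY>
  </ENTITY>
  <ENTITY EntityReference="0000000001" LegalName="Example Co." Status="A">
    <COMPANY><Executives/></COMPANY>
  </ENTITY>
</Feed>"""

result = list(iter_entities(io.BytesIO(SAMPLE), wanted={"0000127509"}))
```

Matching on the `Title` attribute of `<Section>` rather than on position means a new officer category in the feed would simply be ignored until someone decides to handle it.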

In reply to Re^2: Building a database from XML data feed by mattr
in thread Building a database from XML data feed by mattr
