PerlMonks  

Re: Read Excel cell comments?

by misterwhipple (Monk)
on Apr 02, 2007 at 14:56 UTC ( [id://607840]=note )


in reply to Read Excel cell comments?

You could try opening the spreadsheet with Excel, and saving it as an XML spreadsheet. The XML should be much easier to parse.

I just did a test using Excel 2003, and the cell comments are preserved. Here is the relevant portion of the XML (I've changed the whitespace and removed font tags for clarity).

<Row>
  <Cell>
    <Data ss:Type="String">This is a cell</Data>
    <Comment ss:Author="Throckmorton P. Ruddygore">
      <ss:Data xmlns="http://www.w3.org/TR/REC-html40">
        <B>Throckmorton P. Ruddygore:</B>&#10; This is a comment
      </ss:Data>
    </Comment>
  </Cell>
</Row>
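For what it's worth, here is a minimal sketch of pulling the comments out of such a file, in Python with the standard xml.etree.ElementTree module. The surrounding Workbook/Worksheet/Table wrapper in SAMPLE and the function name cell_comments are my own assumptions for illustration; only the Row fragment comes from the post above. It assumes the usual XMLSS layout where both the default namespace and the ss: prefix map to urn:schemas-microsoft-com:office:spreadsheet.

```python
import xml.etree.ElementTree as ET

# Namespace used by Excel's XML Spreadsheet format (default and ss: prefix).
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Assumed wrapper around the Row fragment shown in the post.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row>
    <Cell>
     <Data ss:Type="String">This is a cell</Data>
     <Comment ss:Author="Throckmorton P. Ruddygore">
      <ss:Data xmlns="http://www.w3.org/TR/REC-html40">
       <B>Throckmorton P. Ruddygore:</B>&#10; This is a comment
      </ss:Data>
     </Comment>
    </Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
"""


def cell_comments(xml_text):
    """Return (cell value, comment author, comment text) for each commented cell."""
    root = ET.fromstring(xml_text)
    results = []
    for cell in root.iter(f"{{{SS}}}Cell"):
        comment = cell.find("ss:Comment", NS)
        if comment is None:
            continue
        # find() only looks at direct children, so this is the cell's own Data,
        # not the ss:Data nested inside the Comment.
        data = cell.find("ss:Data", NS)
        value = data.text if data is not None else None
        author = comment.get(f"{{{SS}}}Author")
        # Flatten the HTML-ish comment body and normalise whitespace.
        text = " ".join("".join(comment.itertext()).split())
        results.append((value, author, text))
    return results


if __name__ == "__main__":
    for value, author, text in cell_comments(SAMPLE):
        print(f"{value!r} commented by {author}: {text}")
```

Note that the flattened comment text still starts with the author name, because Excel repeats it in bold inside the comment body.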

If you have an inconveniently large number of spreadsheets to work with, perhaps you could use OLE to automate the conversion to XML.
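A rough sketch of what that automation could look like, here in Python via the pywin32 package (win32com) rather than Perl. It is untested against any particular Excel version and assumes Windows with an Excel installation that offers the XML Spreadsheet format. The function names xml_target and convert_all and the folder layout are hypothetical; 46 is the value of Excel's xlXMLSpreadsheet file-format constant.

```python
from pathlib import Path

# Value of Excel's XlFileFormat.xlXMLSpreadsheet constant.
XL_XML_SPREADSHEET = 46


def xml_target(xls_path):
    """Hypothetical helper: put the .xml file next to the .xls file."""
    return Path(xls_path).with_suffix(".xml")


def convert_all(folder):
    """Open every .xls in folder with Excel and save it as an XML Spreadsheet."""
    # Imported here because pywin32 only exists on Windows.
    import win32com.client

    excel = win32com.client.Dispatch("Excel.Application")
    excel.DisplayAlerts = False  # suppress "overwrite?" style prompts
    try:
        # Excel wants absolute paths, hence resolve().
        for xls in sorted(Path(folder).resolve().glob("*.xls")):
            workbook = excel.Workbooks.Open(str(xls))
            try:
                # SaveAs(Filename, FileFormat)
                workbook.SaveAs(str(xml_target(xls)), XL_XML_SPREADSHEET)
            finally:
                workbook.Close(False)  # SaveChanges=False
    finally:
        excel.Quit()
```

Calling convert_all on a directory of workbooks (the path is up to you) would leave one .xml file beside each .xls file.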

cat >~/.sig </dev/interesting

Replies are listed 'Best First'.
Re^2: Read Excel cell comments?
by pKai (Priest) on Apr 02, 2007 at 21:38 UTC
    Being bound to Excel 10 (aka 2002)*) and above might be bearable, depending on your environment and what you need it for. The more severe problem I found with this approach is the internal optimization Excel performs with empty cells.

    Excel drops <Cell> tags when there is no "significant" data or format information to save for them. Instead, the next cell in the output gets an ss:Index attribute to indicate its column position in the worksheet.

    Suppose you have a row with an empty cell between two filled ones, something like
    data     (empty)     more data

    Then your XMLSS output might look somewhat like:

    <Row>
      <Cell>
        <Data ss:Type="String">data</Data>
      </Cell>
      <Cell ss:Index="3">
        <Data ss:Type="String">more data</Data>
      </Cell>
      ...

    which, for my taste, makes it harder than necessary to navigate to the cells you are looking for.
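One way to cope with the skipped cells is to keep a running column counter per row and let ss:Index override it whenever it appears. A small Python sketch under that assumption follows; the function name cells_by_column and the namespace declarations added to the Row element are mine, and the sketch ignores other position-affecting attributes such as ss:MergeAcross or an ss:Index on the Row itself.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# The row from above, closed and given namespace declarations so it parses alone.
ROW = """<Row xmlns="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Cell><Data ss:Type="String">data</Data></Cell>
 <Cell ss:Index="3"><Data ss:Type="String">more data</Data></Cell>
</Row>
"""


def cells_by_column(row):
    """Map 1-based column numbers to cell values, honouring ss:Index jumps."""
    cells = {}
    column = 0
    for cell in row.findall(f"{{{SS}}}Cell"):
        index = cell.get(f"{{{SS}}}Index")
        # An explicit index says where this cell sits; otherwise it simply
        # follows the previous cell.
        column = int(index) if index is not None else column + 1
        data = cell.find(f"{{{SS}}}Data")
        cells[column] = data.text if data is not None else None
    return cells


if __name__ == "__main__":
    print(cells_by_column(ET.fromstring(ROW)))
```

With the columns recovered this way, looking up "the cell in column 3" becomes a plain dictionary access, and missing keys correspond to the cells Excel dropped.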

    Update: *) I now think XML-Spreadsheet (XMLSS) export was introduced with Excel 10 (see MSDN).

      The spreadsheets are coming from someone else... I only have Excel 9/2000 and it seems XML is not one of the output formats... not too interested in giving MS more money for something I don't want to use anyway... Thanks for the warning about the empty cell optimization... I'm sure it is helpful for large, sparse spreadsheets, and at least it looks way better than the contorted HTML Excel 9 produces... although most of the most bizarre stuff there is header boilerplate, and much of it can be ignored for my application... but the HTML version also doesn't export comments....
