PerlMonks  

Re: Testing for Chinese Characters

by graff (Chancellor)
on Jun 15, 2016 at 21:35 UTC (#1165797)


in reply to Testing for Chinese Characters

Here's a command-line script I posted a long time ago: xls2tsv. It's old enough that it still uses Spreadsheet::ParseExcel (i.e. it assumes the old "xls" format rather than "xlsx"), but apparently you are already using a module that handles your particular Excel spreadsheets, so the part that is relevant here is:
    use Spreadsheet::ParseExcel;
    use Encode qw(decode);

    my $xl = Spreadsheet::ParseExcel->new;  # or whatever module/version works
    my $wb = $xl->Parse( $filepath ) or die "$filepath: $!\n";
    for my $sheet ( @{$wb->{Worksheet}} ) {
        $sheet->{MaxRow} ||= $sheet->{MinRow};
        for my $row ( $sheet->{MinRow} .. $sheet->{MaxRow} ) {
            $sheet->{MaxCol} ||= $sheet->{MinCol};
            for my $col ( $sheet->{MinCol} .. $sheet->{MaxCol} ) {
                my $cell = $sheet->{Cells}[$row][$col];
                next unless defined $cell;  # skip empty cells
                my $val = $cell->{Val};
                # cells flagged 'ucs2' hold raw UTF-16BE bytes
                if ( defined $cell->{Code} and $cell->{Code} eq 'ucs2' ) {
                    $val = decode( "UTF-16BE", $val );
                    if ( $val =~ /\p{Han}/ ) {
                        # this cell contains Chinese characters
                    }
                    # NB: there may be non-ASCII Unicode characters that are not Chinese
                }
            }
        }
    }
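To try the decode-then-match step without a spreadsheet, here is a minimal sketch that fakes the raw bytes of a 'ucs2' cell by encoding sample strings to UTF-16BE. The helper name classify_text and the sample strings are made up for illustration and are not part of xls2tsv. Bear in mind that \p{Han} matches Han ideographs in general, so Japanese kanji will match as well.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from xls2tsv): classify an already-decoded string.
sub classify_text {
    my ($text) = @_;
    return 'han'       if $text =~ /\p{Han}/;       # at least one Han ideograph
    return 'non-ascii' if $text =~ /[^\x00-\x7F]/;  # other non-ASCII, e.g. accented Latin
    return 'ascii';
}

# Simulate the raw UTF-16BE bytes that a 'ucs2' cell would hold,
# then decode them the same way as in the loop above.
for my $sample ( "\x{4E2D}\x{6587} text", "caf\x{E9}", "plain" ) {
    my $raw  = encode( 'UTF-16BE', $sample );
    my $text = decode( 'UTF-16BE', $raw );
    print classify_text($text), "\n";
}
```

The middle sample shows the case the NB comment warns about: a string that is not plain ASCII but contains no Chinese characters either.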
