PerlMonks  

Substr cannot extract last special character

by juo (Curate)
on Aug 19, 2005 at 03:35 UTC ( [id://485013]=perlquestion )

juo has asked for the wisdom of the Perl Monks concerning the following question:

I noticed that substr fails to extract the last character of a string if that last character is a special symbol (for example the symbols for microns, ohm, ...) such as Ù. If I have a string like 24Ù and I want to get the unit of that string, I use:

my $ohm = '24Ù';
$unit = substr $ohm, -1;
print "$unit\n";
# This will return a ? in my CMD; be aware that you cannot see the Ohm character

Does anybody have any idea how to resolve this? I noticed a previous post stating that we could use the hex code for things like this (\x{00A1}), but does anybody have any idea how to get the hex code for a given special symbol? I found Unibook rather difficult to use for finding the right code; it is like searching a labyrinth for the right one.

Retitled by davido from 'Substract cannot extract last special character'.

Replies are listed 'Best First'.
Re: Substr cannot extract last special character
by Samy_rio (Vicar) on Aug 19, 2005 at 03:54 UTC

    Hi, if I understood your question correctly, here is my code:

    my $ohm = '24Ù';
    print "Before substract : $ohm";
    if ($ohm =~ /[^!-~\s]/g)    # This helps to find the non ascii character
    {
        $unit = $&;
        $ohm =~ s/$unit//e;
    }
    print "\nUnit : $unit\nAfter substract : $ohm";
    # This will return a ? in my CMD; be aware that you cannot see the Ohm character

    I hope this helps.

    Regards,
    Velusamy R.

      Sorry juo, I misunderstood the question. Here is my suggestion:

      my $ohm = '24Ù 34Ã 234Æ 32Ï 23Ð 1½ 23§';
      print "Before substract : $ohm\n";
      while ($ohm =~ /[^!-~\s]/g)    # This helps to find the non ascii character
      {
          $unit = $&;
          $ohm =~ s/$unit//e;
          $foo = '&#x' . sprintf("%04X", ord($unit)) . ";";
          print "Unit : $unit\tHex : $foo\n";
      }

      In CMD, this displays the following hexadecimal values:

      Hex : &#x00D9;
      Hex : &#x00C3;
      Hex : &#x00C6;
      Hex : &#x00CF;
      Hex : &#x00D0;
      Hex : &#x00BD;
      Hex : &#x00A7;

      When these hexadecimal values are viewed in IE as an HTML or XML file, I get exactly the symbols that are present in the code.

      Please try this.

      Regards,
      Velusamy R.
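
To make the "how do I get the hex code for a symbol" part concrete, here is a minimal sketch that reports the code point, the matching \x{...} escape, and the official Unicode name of every non-ASCII character in a string. The helper name describe_symbols and the sample string (an ohm sign and a micro sign, written as escapes so the example does not depend on the source file's encoding) are illustrative assumptions, not something from the posts above. It relies only on the core functions ord and sprintf and on charnames::viacode from the standard charnames module.

```perl
use strict;
use warnings;
use charnames ();    # gives access to charnames::viacode

# Hypothetical helper: describe each non-ASCII character in a string.
sub describe_symbols {
    my ($text) = @_;
    my @report;
    for my $char ($text =~ /([^\x00-\x7F])/g) {
        my $code = ord $char;                      # numeric code point
        my $name = charnames::viacode($code);      # official name, or undef
        $name = 'unknown' unless defined $name;
        push @report, sprintf 'U+%04X \\x{%04X} %s', $code, $code, $name;
    }
    return @report;
}

# Assumed sample input: "24" + OHM SIGN, then "5" + MICRO SIGN + "m".
print "$_\n" for describe_symbols("24\x{2126} 5\x{B5}m");
```

This only gives meaningful code points if the string already holds characters rather than raw bytes; for input read from a file or terminal, that usually means decoding it first.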

Re: Substr cannot extract last special character
by newroz (Monk) on Aug 19, 2005 at 08:09 UTC
    Hi, it shows Ù when an offset of -2 is used, because the character is stored as two bytes (as in UTF-8) and substr is counting bytes here.
    #!/usr/bin/perl
    my $ohm = qw(24Ù);
    $unit = substr $ohm, -2;
    print "$unit", "\n";
      In particular, if you embed utf8 chars in the program source, you have to let perl know with the utf8 pragma:
      my $x ='Ù'; print length($x), "\n"; # prints 2
      compared with
      use utf8; my $x ='Ù'; print length($x), "\n"; #' prints 1

      Dave.
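
The byte-versus-character difference described in these replies can be sketched without depending on how the source file is saved. The example below assumes the string arrives as UTF-8 bytes (written out explicitly as \xC3\x99, the UTF-8 encoding of Ù, U+00D9) and uses decode from the core Encode module to turn them into characters; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed input: "24" followed by the two UTF-8 bytes of U+00D9.
my $bytes = "24\xC3\x99";
my $chars = decode('UTF-8', $bytes);    # now a string of 3 characters

my $last_byte = substr $bytes, -1;      # only half of the symbol
my $unit      = substr $chars, -1;      # the whole symbol

printf "bytes: %d, characters: %d\n", length $bytes, length $chars;
printf "last byte of raw string: 0x%02X\n", ord $last_byte;
printf "last character after decoding: U+%04X\n", ord $unit;
```

On the undecoded string, substr with -1 returns a lone trailing byte, which is consistent with the ? seen in CMD; after decoding, -1 returns the complete character. To print the character itself, the output handle would also need an encoding layer that the terminal understands.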
