PerlMonks  

Re^7: Begginer's question: If loops one after the other. Is that code correct?

by predrag (Scribe)
on Jan 13, 2017 at 22:33 UTC ( [id://1179553]=note )


in reply to Re^6: Begginer's question: If loops one after the other. Is that code correct?
in thread Begginer's question: If loops one after the other. Is that code correct?

Hauke D, thanks so much for such a detailed answer, comments and suggestions.

I accept your suggestion regarding the order of operations in the foreach $char loop, but I haven't had time to rewrite it yet. I will try the way you've suggested, which I also see is the most logical. I also understand your suggestion to use modules for parsing HTML instead of doing it the way I did. I've already installed the XML::LibXML module and tried choroba's code, but I haven't gotten a good result yet. Never mind; I tried another, simpler example with that module, without any conversion, just to get some experience. It worked well and I've noticed it works for &nbsp;. The example I sent doesn't have that test, but I've tried it on another example.

You are absolutely right in your third paragraph (handling & characters) when you noticed that my block of code is too limited a solution. I knew that; I wrote it for &nbsp; just as a test, and was very happy to see that this way I could maybe even handle some more complicated and mixed HTML pages. That doesn't mean that I will use that approach; it was just a phase in practicing.

But I think that I should wait for the new design of my site and see what HTML code it will have; then it will be much easier to finish a complete converter. That is because I could have some CSS, or something else that has to be handled additionally. But anyway, even if I have some pages with something very special, it will not be such a big problem if I do those conversions "manually".

I've already successfully tried some examples with creating PDF files (an option for printing some articles), working with directories etc., and that way I am preparing myself for the final work on my website design. I often prefer to learn in phases, "in circles": the first time just to touch, then deeper etc., instead of going straight to the essence in one step.

So, I have to convert all Latin letters, not just the "different ones", but other characters (such as punctuation etc.) will stay untouched? Fortunately, it is not a programmers' site where I would have all possible characters in the text area. :)
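For the plain-text part of that conversion, a table-driven substitution may be enough. What follows is only a minimal sketch, not the converter discussed earlier in the thread: it assumes the standard Serbian Latin alphabet, and the name `lat2cyr` is made up for illustration. Anything that is not a key in the table (punctuation, digits, whitespace) passes through untouched.

```perl
use strict;
use warnings;
use utf8;                                   # this source contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';

# Assumed mapping: Serbian Latin letters, listed in Cyrillic alphabetical order.
my @latin    = qw(a b v g d đ e ž z i j k l lj m n nj o p r s t ć u f h c č dž š);
my @cyrillic = qw(а б в г д ђ е ж з и ј к л љ м н њ о п р с т ћ у ф х ц ч џ ш);

my %map;
@map{@latin} = @cyrillic;

# Add capitals; a capital digraph may be written "LJ" or "Lj".
for my $lat (@latin) {
    my $cyr = uc $map{$lat};
    $map{ uc $lat }      = $cyr;
    $map{ ucfirst $lat } = $cyr;
}

# Longest keys first, so "lj" wins over "l" followed by "j".
my $alternatives = join '|', sort { length($b) <=> length($a) } keys %map;
my $pattern      = qr/($alternatives)/;

# lat2cyr is a hypothetical helper name, not part of any module.
sub lat2cyr {
    my ($text) = @_;
    $text =~ s/$pattern/$map{$1}/g;
    return $text;
}

print lat2cyr("Dobar dan, svete! Ljubljana, džep."), "\n";
# prints: Добар дан, свете! Љубљана, џеп.
```

Two caveats: words such as "nadživeti" or "injekcija" contain letter pairs that are not digraphs, so a real converter would need an exception list, and the substitution should only ever be applied to text nodes, never to HTML tags or attributes.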

All your comments, general and otherwise, are really very useful to me, and I am learning from them better than by just passively reading somewhere.

Regarding my code

    binmode(STDOUT, ":utf8");
    use open ':encoding(utf8)';

I will try your suggestion too. I had to add that code (found on the web) because I was practicing printing some output to STDOUT, and without it the Cyrillic letters were not visible.
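For comparison, here is a minimal sketch of the single-pragma form with `:std`, the same pragma the reply further down uses with the `:utf8` layer. The `:encoding(UTF-8)` spelling is the stricter variant of that layer, and the sketch assumes a terminal that displays UTF-8.

```perl
use strict;
use warnings;

# ":std" applies the layer to STDIN, STDOUT and STDERR; the pragma also
# makes it the default for files opened in this lexical scope.
use open qw(:std :encoding(UTF-8));

my $word = "\x{0441}\x{0440}\x{043F}\x{0441}\x{043A}\x{0438}";   # "српски"

print "$word\n";                              # no "Wide character" warning
printf "%d characters\n", length $word;       # prints: 6 characters
```

With this in place, a separate `binmode(STDOUT, ...)` call should no longer be needed for the standard handles.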

Somehow, I love TIMTOWTDI, maybe because I love this principle in life in general. For this project, the most important thing for me is that the code works perfectly, so the output will be good too. I do not need very fast code or anything too fancy, but of course I understand that the code must be correct and clean.

One separate task for my site in Cyrillic will be IDN encoding. It is something completely new for me, and I've recently learned just a little about it from our national domain service. That is because my URL will be in Cyrillic too (national domain), and maybe other pages will have URLs in Cyrillic as well (a more complicated option).

I am maybe a bit slow in work and learning, but I don't always have free time, and as I wrote in a previous post, I try to build a good foundation for my future learning, not just to strive for fast solutions. Also, I have many other interests, but it is really amazing for me to see that, having found Perl, I don't even need to go any further in programming, and that Perl could be useful for me in some other fields too.

Re^8: Begginer's question: If loops one after the other. Is that code correct?
by haukex (Archbishop) on Jan 14, 2017 at 13:58 UTC

    Hi predrag,

    One separate task for my site in Cyrillic will be IDN encoding

    I just wanted to point out the power of CPAN. Perl and CPAN have been around for a long time, and two of the several areas where Perl excels are text processing and web development. I've already linked you to several HTML and XML processing modules, and a quick search on CPAN for "translit" is what gave me, among other things, Lingua::Translit, and a quick search for "IDN" shows me Net::IDN::Encode, again just one module among several.

    use warnings;
    use strict;
    use open qw/:std :utf8/;

    # Transliterate Cyrillic to Latin and back again
    use Lingua::Translit;
    my $tr = Lingua::Translit->new("ISO/R 9");
    my $txt = "\x{0441}\x{0440}\x{043F}\x{0441}\x{043A}\x{0438}";
    my $latin = $tr->translit($txt);
    my $cyrillic = $tr->translit_reverse($latin);
    die "text mismatch" unless $txt eq $cyrillic;
    print "$latin <-> $cyrillic\n";

    # Convert an internationalized domain name to its ASCII (ACE) form and back
    use Net::IDN::Encode qw/domain_to_ascii domain_to_unicode/;
    my $idn = "\x{0442}\x{0435}\x{0441}\x{0442}.\x{0441}\x{0440}\x{0431}";
    my $asc = domain_to_ascii($idn);
    my $dom = domain_to_unicode("xn--e1aybc.xn--90a3ac");
    die "domain mismatch" unless $idn eq $dom;
    print "$asc <-> $dom\n";

    Output:

    srpski <-> српски
    xn--e1aybc.xn--90a3ac <-> тест.срб
    

    Note: I did not verify that the "ISO/R 9" transliteration table is identical to the Serbian / Cyrillic transliteration table you're using, but at least Wikipedia says it's suitable.

    Regards,
    -- Hauke D

      Hauke D, I simply can't thank you enough. I can only hope that all this is useful to you too and maybe it is true, because as far as I know, in pedagogy they say the best way for learning is to teach someone (people) and the inverse: teaching is considered as the best way for learning.

      I had already installed cpanm, and with its help I successfully and easily installed the two modules you used in the code. They are really powerful, and of course, I keep being convinced of how huge and powerful CPAN is.

      I just tried your code and it works completely well. Excellent. I see the code does two things, and I have to look at it in more detail and run many tests. I am really excited and happy, but you should remember that I am still a beginner and need time to let the knowledge acquired here settle.

      It is fantastic that you introduced me to IDN encoding. As I wrote last time, I've recently learned something about IDN and got the ACE string for my Cyrillic domain from the web form on our national domain service's page, so I will try to check it with your script now. So happy!

      You mentioned "ISO/R 9" and, to my shame, at this moment I can't tell you if it is what I use. It seems I am a bit confused and tired today, so I can't answer now. Anyway, you have led me to the solution of very important things for my project.

      Websites with Cyrillic domain names are rare, and I've noticed on the web that these mostly have the rest of the URL in Latin, but it seems mine will be in Cyrillic, wow!

        Hi predrag,

        I can only hope that all this is useful to you too and maybe it is true ... teaching is considered as the best way for learning.

        Yes, part of the reason I wrote the code above is to satisfy my own curiosity :-)

        I had already installed cpanm, and with its help I successfully and easily installed the two modules you used in the code.

        I just wanted to point out that since CPAN is huge, there are also a lot of modules that might not be well-tested or have other problems. In this node I wrote about some ways to tell which modules are good and which might not be.

        Also, another thing is that it's usually recommended to avoid modifying the Perl that comes with your Linux system. The system Perl is usually managed by the system's package manager, and there might be programs that require those specific versions of Perl or its modules. Also, installing Perl modules into the system Perl with both the package manager and CPAN can sometimes lead to breaking the installation. If you've already installed modules into your system Perl, then as long as your system keeps running normally you don't need to worry, but I would recommend installing a separate version of Perl locally so that you can avoid touching the system Perl at all when installing modules. Another advantage of installing a separate version of Perl is that you can always have the latest version. The program perlbrew makes this very easy.
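To see which Perl a script is actually running under, and where modules would be installed by default, something like the following may help. It is only a minimal sketch using the core Config module; `perl_info` is a made-up helper name, and tools such as local::lib can redirect the install location.

```perl
use strict;
use warnings;
use Config;                      # core module describing this perl's build

# perl_info is a hypothetical helper name used only for this sketch.
sub perl_info {
    return (
        executable => $^X,                       # path of the running perl
        version    => $Config{version},          # version of this perl
        sitelib    => $Config{installsitelib},   # default site library directory
    );
}

my %info = perl_info();
printf "%-10s %s\n", $_, $info{$_} for sort keys %info;
```

If the executable is /usr/bin/perl, that is normally the system Perl; under perlbrew the path usually points somewhere below ~/perl5/perlbrew/ instead.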

        Regards,
        -- Hauke D

        I've just tried your code with my Cyrillic domain name and the ACE string I already had, and it works perfectly.
