
Re: HTML::Parser / Regex

by eyepopslikeamosquito (Archbishop)
on May 28, 2017 at 06:30 UTC


in reply to HTML::Parser / Regex

What you're attempting as a first program is too tough for a complete beginner IMHO ... So, like kcott, I suggest you read perlintro or some of these Learning Perl links.

Then write some simpler programs first, to gain some confidence. Feel free to ask more questions if you get stumped. Once you've done that (will probably take a week or two) return to your original problem.

That said, I can see you're very determined to try to solve your real world problem immediately! If so, try running this simple program:

    use strict;
    use warnings;

    my $ca = "california.html";
    open(my $f1, "<", $ca) or die "Can't open file '$ca': $!";
    while ( my $line = <$f1> ) {
        print "line: $line";
        if ( $line =~ m{Employee +([^<]+)</th><th>([^<]+)} ) {
            my $name = $1;
            my $two  = $2;
            print " name='$name' two='$two'\n";
        }
    }
    close($f1);

on your original test california.html file:

    </tr></table></body><body bgcolor="black"><h1>
    Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>
    </tr><tr><th>Employee B</th><th>-5.02</th>
    </tr><tr><th>Employee C</th><th>19</th>
    </tr></table></body><body bgcolor="black"><h1>
    Summary</h1><table border="1"><tr><th>Employee A</th><th></th>
    </tr><tr><th>Employee B</th><th></th>
    </tr><tr><th>Employee C</th><th></th>

which should produce the following output:

    line: </tr></table></body><body bgcolor="black"><h1>
    line: Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>
     name='A' two='-0.82'
    line: </tr><tr><th>Employee B</th><th>-5.02</th>
     name='B' two='-5.02'
    line: </tr><tr><th>Employee C</th><th>19</th>
     name='C' two='19'
    line: </tr></table></body><body bgcolor="black"><h1>
    line: Summary</h1><table border="1"><tr><th>Employee A</th><th></th>
    line: </tr><tr><th>Employee B</th><th></th>
    line: </tr><tr><th>Employee C</th><th></th>

Now, take the time to understand how the above program works by reading the introductory Perl links above. Feel free to ask any questions about it.
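
As an aid to reading the regex, here is a minimal sketch that spells out the same pattern with the /x modifier so each piece can carry a comment. The names $row_re, @lines and %value_for are made up for this illustration, and the input is a few in-memory lines copied from the sample file rather than the file itself.

```perl
use strict;
use warnings;

# Same pattern as in the program above, written with /x so that
# whitespace is ignored and each part can be commented.
my $row_re = qr{
    Employee [ ]+     # the literal word "Employee", then one or more spaces
    ( [^<]+ )         # capture 1: everything up to the next "<" (the name)
    </th><th>         # the literal tags between the two cells
    ( [^<]+ )         # capture 2: everything up to the next "<" (the value)
}x;

# Three lines taken from the sample data; the last has an empty cell.
my @lines = (
    'Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>',
    '</tr><tr><th>Employee B</th><th>-5.02</th>',
    '</tr><tr><th>Employee C</th><th></th>',
);

my %value_for;
for my $line (@lines) {
    # [^<]+ needs at least one character, so the empty cell does not match.
    next unless $line =~ $row_re;
    $value_for{$1} = $2;
}

print "$_ => $value_for{$_}\n" for sort keys %value_for;
```

Because [^<]+ requires at least one character, rows whose second cell is empty are skipped, which is why the second table in the sample output has no name/two lines.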

Please note that I am NOT endorsing the above program as a sound way to solve your real world problem. It is just a simple program, directly related to your real world problem, to help motivate you to learn some Perl basics. For a sound solution to your problem, I suspect HTML-Parser is the way to go.
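
For comparison, here is a rough sketch of what an HTML::Parser based approach might look like. It assumes the HTML::Parser module from CPAN is installed; extract_rows() is a hypothetical helper written for this illustration, and the input is a fragment of the sample data held in a string. Treat it as a starting point, not a finished solution.

```perl
use strict;
use warnings;
use HTML::Parser;

# A fragment of the sample california.html data, held in memory.
my $html = <<'END_HTML';
<table border="1"><tr><th>Employee A</th><th>-0.82</th>
</tr><tr><th>Employee B</th><th>-5.02</th>
</tr><tr><th>Employee C</th><th>19</th>
</tr></table>
END_HTML

# Hypothetical helper: returns one array ref of cell texts per <tr>.
sub extract_rows {
    my ($content) = @_;
    my (@rows, $row, $in_cell);

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Start tags: open a new row, or a new (initially empty) cell.
        start_h => [ sub {
            my ($tag) = @_;
            if ($tag eq 'tr') {
                $row = [];
            }
            elsif ($tag eq 'th' || $tag eq 'td') {
                $in_cell = 1;
                push @$row, '' if $row;
            }
        }, 'tagname' ],
        # Text: append to the current cell (text may arrive in chunks).
        text_h => [ sub {
            my ($text) = @_;
            $row->[-1] .= $text if $in_cell && $row && @$row;
        }, 'dtext' ],
        # End tags: close the cell, or store the finished row.
        end_h => [ sub {
            my ($tag) = @_;
            if ($tag eq 'th' || $tag eq 'td') {
                $in_cell = 0;
            }
            elsif ($tag eq 'tr' && $row) {
                push @rows, $row;
                undef $row;
            }
        }, 'tagname' ],
    );

    $parser->parse($content);
    $parser->eof;
    return @rows;
}

for my $cells ( extract_rows($html) ) {
    print join(' | ', @$cells), "\n";
}
```

Unlike the regex version, this does not depend on how the HTML happens to be split across lines.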

Replies are listed 'Best First'.
Re^2: HTML::Parser / Regex
by MissPerl (Sexton) on May 29, 2017 at 01:33 UTC
    Hi eyepopslikeamosquito,

    Thanks for your sample code!

    However, I came across the error "Can't use global $1 in "my"". This isn't the first time I have seen it. I tried googling for a way around it, but unfortunately nothing worked.

    As I am still reading the beginners' material, my understanding is that I need $_ or $1 to scan the current line?

    And I figured you are the best person I could ask for advice?!

      "... the error "Can't use global $1 in "my"" ..."

      Somewhere in your code, you have "... my $1 ...". Here's a couple of examples:

      $ perl -e 'my $1;'
      Can't use global $1 in "my" at -e line 1, near "my $1"
      Execution of -e aborted due to compilation errors.
      $ perl -e 'my $1 = 42;'
      Can't use global $1 in "my" at -e line 1, near "my $1 "
      Execution of -e aborted due to compilation errors.

      You can find the full description of the problem in "perldiag - Perl diagnostic messages". Until you're familiar with that document, it can be a bit difficult to find the information. In this instance, you'd need to search for "Can't use global" (not "Can't use global $1"). Doing so locates this:

      Can't use global %s in "%s"
      (F) You tried to declare a magical variable as a lexical variable. This is not allowed, because the magic can be tied to only one location (namely the global variable) and it would be incredibly confusing to have variables in your program that looked like magical variables but weren't.
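
The usual way around this is to leave $1 alone and copy its value into a lexical with a different name after a successful match. A minimal sketch follows; the string eval is there only so the compile-time error can be caught and printed instead of aborting the script, and $ok, $error, $line, $name and $two are names chosen for this illustration.

```perl
use strict;
use warnings;

# Wrong: "my $1" cannot even be compiled. q{...} does not interpolate,
# so the eval sees the literal text "my $1; 1".
my $ok    = eval q{ my $1; 1 };
my $error = $@;
print "compile failed: $error" unless $ok;

# Right: match first, then copy the captures into ordinary lexicals.
my $line = '<tr><th>Employee A</th><th>-0.82</th>';
my ($name, $two);
if ( $line =~ m{Employee +([^<]+)</th><th>([^<]+)} ) {
    ($name, $two) = ($1, $2);
    print "name='$name' two='$two'\n";
}
```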

      While you're learning, you may find it useful to use the diagnostics pragma. Put this line near the start of your code:

      use diagnostics;

      That will give you a full description, rather than the somewhat terse shortened form.

      Important: That pragma is intended as a developer tool. Do not leave it in production code.

      — Ken

      ... the error "Can't use global $1 in "my""

      What is the specific code that produces this error? If you don't show us the code, we can only make more or less wild guesses. This just wastes our time and yours. Please see How do I post a question effectively? and How (Not) To Ask A Question.

      Update: BTW: I ran the code eyepopslikeamosquito posted here under Perl 5.8.9 and I get the advertised output with no errors or warnings.


      Give a man a fish:  <%-{-{-{-<
