PerlMonks  

Re: Re: Re: reading lines

by BrowserUk (Patriarch)
on Dec 07, 2003 at 04:38 UTC ( [id://312855]=note )


in reply to Re: Re: reading lines
in thread reading lines

A couple of comments on what you have.

As you are reading line-by-line, you will only ever get one newline character, which will be at the end of the line. Therefore, the /gs options on s/\r|\n//gs; are redundant. The /g because there can be only one \n (\r is most unlikely to occur!). The /s because unless your regex contains one or more .'s, /s does nothing. In any case, what I assume you are trying to do is remove the trailing newline, in which case, see chomp, which is designed for doing exactly this.
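
To make that concrete, here is a minimal sketch (the sub name strip_eol and the sample strings are invented for this example, not taken from the original code). chomp removes a trailing "\n"; if a "\r" might precede it, for instance a DOS-format file read on another platform, a single anchored substitution without /g or /s covers both cases.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line ending, either LF or CRLF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # anchored at the end, so no /g or /s is needed
    return $line;
}

my $unix = "Host Name : LAPTOP\n";
chomp( my $chomped = $unix );    # chomp removes the trailing "\n" only
print length($unix) - length($chomped), "\n";    # prints 1

my $dos = "Host Name : LAPTOP\r\n";
print strip_eol($dos) eq 'Host Name : LAPTOP' ? "stripped\n" : "not stripped\n";
```

Unlike chop, neither chomp nor the substitution removes anything when the line has no line ending.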

With your 3 if statements, all 3 conditions are being tested for every line you read. As only one of them can ever match a given line, you would be better coding that as an if(...){}elsif(...){}elsif(...){}else{} cascade.

That said, the problem with processing multi-line records line-wise is that it forces you to retain state. I.e. you have to remember what the last thing you parsed was in order to know how to handle the next line.

#! perl -slw
use strict;

my( $hostname, @dnsservers, $nodetype );

my $last = '';    # remembers which kind of line was parsed last
while (<DATA>) {
    chomp;
    if ( /Host Name.*?:\s*(.*)$/ ) {
        $hostname = $1;
        $last = 'host';
    }
    elsif ( /DNS Servers.*?:\s*(.*)$/ ) {
        @dnsservers = $1;
        $last = 'dns';
    }
    elsif ( /Node Type.*?:\s*(.*)$/ ) {
        $nodetype = $1;
        $last = 'node';
    }
    elsif( !/:/ and $last eq 'dns' ) {
        print '?';    # debug marker: flags each continuation line
        push @dnsservers, $1 if /\s+(\S+)$/;
    }
    else{
        warn "Unknown linetype at $. : '$_'\n";
    }
}

# join needs its own parens, otherwise it swallows $/ and $nodetype too
print $hostname, $/, join( ' | ', @dnsservers ), $/, $nodetype;

__DATA__
        Host Name . . . . . . . . . : LAPTOP.no.cox.net
        DNS Servers . . . . . . . . : 205.152.132.211
                                      181.171.2.11
                                      10.10.10.1
        Node Type . . . . . . . . . : Broadcast

Whilst that does the job, I'm not fond of flag variables and try and avoid them where I can. In this case, perl can remember the state for you -- which is almost as bad in some ways, but it's there, so you might as well make use of it.

As the output from ipconfig is unlikely to be much more than a hundred or so bytes, there is no reason not to read the whole thing into a scalar, and then use the regex engine's "memory" (/gc options) of where it has got to in processing the string, to allow you to nibble away at it a piece at a time.

#! perl -slw
use strict;

my $ipConfig = do{ local $/; <DATA> };

my $hostName = $1
    if $ipConfig =~ m[Host Name . . . . . . . . . : (\S+)\s*\n]gc;

my @dnsIPs = $ipConfig =~ m[(?:\s*([0-9.]+)\n)]gc
    if $ipConfig =~ m[DNS Servers . . . . . . . . :]gc;

my $nodeType = $1
    if $ipConfig =~ m[Node Type . . . . . . . . . : (\S+)\s*\n];

print $hostName, $/, join( ' | ', @dnsIPs), $/, $nodeType;

__DATA__
        Host Name . . . . . . . . . : LAPTOP.no.cox.net
        DNS Servers . . . . . . . . : 205.152.132.211
                                      181.171.2.11
                                      10.10.10.1
        Node Type . . . . . . . . . : Broadcast
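
The position "memory" that the code above relies on can be seen in isolation. This is only an illustrative sketch: the sample string and the variable names are made up. A scalar-context match with /g records where it stopped in pos(), \G anchors the next match at that point, and /c keeps the position when a match fails.

```perl
use strict;
use warnings;

my $text = "alpha: 1\nbeta: 2\n";

# A successful scalar-context //g match leaves pos() just past the match.
$text =~ /alpha: /g or die "label not found";
print "after label: ", pos($text), "\n";    # prints 7

# \G continues from exactly that position.
if ( $text =~ /\G(\d+)/gc ) {
    print "value: $1\n";                    # prints 1
}

# A failed match with /c leaves pos() where it was ...
my $ok = $text =~ /\Gnomatch/gc;
my $after_gc = pos($text);
print "after failed /gc: $after_gc\n";      # prints 8

# ... whereas a failed match without /c resets it.
$ok = $text =~ /\Gnomatch/g;
my $after_g = pos($text);
print "after failed /g: ", ( defined $after_g ? $after_g : 'undef' ), "\n";
```

Note that the patterns in the post do not use \G, so each one searches forward from the remembered position rather than matching exactly at it.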

Note: That can just as easily be structured using 'normal' if statements:

#! perl -slw
use strict;

my $ipConfig = do{ local $/; <DATA> };

my( $hostName, @dnsIPs, $nodeType );

if( $ipConfig =~ m[Host Name . . . . . . . . . : (\S+)\s*\n]gc ) {
    $hostName = $1;
}

if( $ipConfig =~ m[DNS Servers . . . . . . . . :]gc ) {
    @dnsIPs = $ipConfig =~ m[(?:\s*([0-9.]+)\n)]gc;
}

if( $ipConfig =~ m[Node Type . . . . . . . . . : (\S+)\s*\n] ) {
    $nodeType = $1;
}

print $hostName, $/, join( ' | ', @dnsIPs), $/, $nodeType;

__DATA__
        Host Name . . . . . . . . . : LAPTOP.no.cox.net
        DNS Servers . . . . . . . . : 205.152.132.211
                                      181.171.2.11
                                      10.10.10.1
        Node Type . . . . . . . . . : Broadcast

Except now I have to pre-declare the results variables and, to my eyes, the purpose of the code has become obscured by the structure. In the previous version it was very obvious that the three blocks of code produced 3 sets of results and what they were called. In this version, I have to look much more closely to see what is going on...but that's me:)


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Replies are listed 'Best First'.
Re: Re: Re: Re: reading lines
by spill (Novice) on Dec 07, 2003 at 05:06 UTC
    My files had both \n and \r on each line and chop is kinda scary. I am reading line by line, because my file contains multiple ipconfig outputs. I was keeping state for each machine and cycling through. I tried the code you suggest, but when I modify it to read from STDIN, the regex doesn't see the DNS Servers line. I am not great with regex and am having trouble figuring out what is wrong.
      my file contains multiple ipconfig outputs.

      Ah, that changes things a bit. Is there a distinct and consistent string pattern that separates one ipconfig output from another, such as a blank line, or a line of dashes?

      If so, you can assign that string to "$/" (rather than leaving it with the default value of "\n", or setting it to undef, as suggested by BrowserUk). This way, you read one whole ipconfig output set in a single iteration of while (<>). For example, if the successive records are separated by a single blank line (and blank lines never occur within a single ipconfig output record), then you could do something like this (untested):

      my %hostdata;
      {
          local $/ = "\x0d\x0a\x0d\x0a";  # record separator is now a blank line
          while (<>) {   # $_ now contains one whole ipconfig output
              next unless ( /Host Name.*?:\s*(.*)/ );
              my $host = $1;
              @{$hostdata{$host}{dns}} = ( /\s(\d+\.\d+\.\d+\.\d+)\s/g );
              ( $hostdata{$host}{typ} ) = ( /Node Type.*?:\s*(.*)/ );
          }
      }
      # Now, keys %hostdata gives the list of host names from
      # all the ipconfig runs, and $hostdata{"hostname"} contains
      # the dns and node type info for each host.
      for ( sort keys %hostdata ) {
          print join "\n", $_, @{$hostdata{$_}{dns}}, $hostdata{$_}{typ}, "\n";
      }
      (update: removed a spurious line of code)

      If the separation between ipconfig runs is not distinct or consistent, then you'll want to fall back on BrowserUk's less-favored method that reads one line at a time and uses a status flag to keep track of what it's supposed to look for (and what it's supposed to do) as it works through each multi-line record.

        It didn't dawn on me that HOW I was reading the file would affect how I parsed it. Your example looks like what my end goal is. I want to output each host in CSV format, and putting them into hashes seems the most logical. Here is a FULL sample of my input file with two computers. I read this file from STDIN and output the results in CSV format. Another problem in this data is multiple adapters. My code up to now has choked on more than one adapter.
        ipconfig start
        Windows IP Configuration
           Host Name . . . . . . . . . . . . : axter-win2003
           Primary Dns Suffix . . . . . . . : Axter2.home
           Node Type . . . . . . . . . . . . : Hybrid
           IP Routing Enabled. . . . . . . . : No
           WINS Proxy Enabled. . . . . . . . : No
           DNS Suffix Search List. . . . . . : Axter2.home
                                               brnmll01.nj.comcast.net
        Ethernet adapter Local Area Connection:
           Connection-specific DNS Suffix . : brnmll01.nj.comcast.net
           Description . . . . . . . . . . . : SMC EtherPower II 10/100 Ethernet Adapter
           Physical Address. . . . . . . . . : 00-E0-29-0A-F5-E2
           DHCP Enabled. . . . . . . . . . . : Yes
           Autoconfiguration Enabled . . . . : Yes
           IP Address. . . . . . . . . . . . : 192.168.0.6
           Subnet Mask . . . . . . . . . . . : 255.255.255.0
           Default Gateway . . . . . . . . . : 192.168.0.1
           DHCP Server . . . . . . . . . . . : 192.168.0.1
           DNS Servers . . . . . . . . . . . : 127.0.0.1
                                               192.168.0.1
           Lease Obtained. . . . . . . . . . : Tuesday, September 16, 2003 6:18:13 PM
           Lease Expires . . . . . . . . . . : Friday, September 19, 2003 6:18:13 PM
        ipconfig end
        ipconfig start
        Windows 98 IP Configuration
                Host Name . . . . . . . . . : LAPTOP.no.cox.net
                DNS Servers . . . . . . . . : 205.152.132.235
                                              181.171.2.200
                                              10.10.10.1
                Node Type . . . . . . . . . : Broadcast
                NetBIOS Scope ID. . . . . . :
                IP Routing Enabled. . . . . : No
                WINS Proxy Enabled. . . . . : No
                NetBIOS Resolution Uses DNS : No
        0 Ethernet adapter :
                Description . . . . . . . . : PPP Adapter.
                Physical Address. . . . . . : 44-45-53-54-00-00
                DHCP Enabled. . . . . . . . : Yes
                IP Address. . . . . . . . . : 181.171.2.147
                Subnet Mask . . . . . . . . : 255.255.0.0
                Default Gateway . . . . . . : 181.171.2.147
                DHCP Server . . . . . . . . : 255.255.255.255
                Primary WINS Server . . . . :
                Secondary WINS Server . . . :
                Lease Obtained. . . . . . . : 01 01 80 12:00:00 AM
                Lease Expires . . . . . . . : 01 01 80 12:00:00 AM
        1 Ethernet adapter :
                Description . . . . . . . . : PPP Adapter.
                Physical Address. . . . . . : 44-45-53-54-00-01
                DHCP Enabled. . . . . . . . : Yes
                IP Address. . . . . . . . . : 0.0.0.0
                Subnet Mask . . . . . . . . : 0.0.0.0
                Default Gateway . . . . . . :
                DHCP Server . . . . . . . . : 255.255.255.255
                Primary WINS Server . . . . :
                Secondary WINS Server . . . :
                Lease Obtained. . . . . . . :
                Lease Expires . . . . . . . :
        2 Ethernet adapter :
                Description . . . . . . . . : D-Link AIRPLUS Wireless LAN Adapter
                Physical Address. . . . . . : 00-80-C8-B5-76-1F
                DHCP Enabled. . . . . . . . : Yes
                IP Address. . . . . . . . . : 10.10.10.104
                Subnet Mask . . . . . . . . : 255.255.255.0
                Default Gateway . . . . . . : 10.10.10.1
                DHCP Server . . . . . . . . : 10.10.10.1
                Primary WINS Server . . . . :
                Secondary WINS Server . . . :
                Lease Obtained. . . . . . . : 06 19 03 2:51:53 PM
                Lease Expires . . . . . . . : 06 26 03 2:51:53 PM
        ipconfig end
        It is clear that my data is more complicated than my first example. I thought it would be easier to understand what I needed to do, but it only introduced more confusion. ;) Thanks to everyone that has responded, if anything I have learned some other techniques for manipulating my data.
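
For data shaped like the sample above, one possible approach is to treat everything between an "ipconfig start" line and the next "ipconfig end" line as one record, then pull the fields out of each record. The following is a rough sketch under several assumptions, not a tested solution for the real file: the sub names parse_ipconfig and to_csv_line are invented here, the sample is an abbreviated copy of the posted data, DNS servers from all adapters of a host are pooled into one list, and the CSV step assumes no field contains a comma or a quote.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text into "ipconfig start" ... "ipconfig end"
# records and extract host name, node type and DNS servers from each one.
sub parse_ipconfig {
    my ($text) = @_;
    my @hosts;
    while ( $text =~ /^[ \t]*ipconfig start[ \t]*\r?\n(.*?)^[ \t]*ipconfig end\b/msg ) {
        my $record = $1;
        my ($host) = $record =~ /Host Name[ .]*:[ \t]*(\S+)/;
        next unless defined $host;
        my ($type) = $record =~ /Node Type[ .]*:[ \t]*(\S+)/;

        # A "DNS Servers" line may be followed by continuation lines that
        # hold nothing but an address; collect them for every adapter.
        my @dns;
        while ( $record =~ /DNS Servers[ .]*:((?:[ \t]*[0-9.]+[ \t]*\r?\n)+)/g ) {
            my $block = $1;
            push @dns, $block =~ /([0-9.]+)/g;
        }
        push @hosts, { host => $host, type => $type, dns => \@dns };
    }
    return @hosts;
}

# Hypothetical helper: naive CSV line; assumes fields need no quoting.
sub to_csv_line {
    my ($h) = @_;
    my $type = defined $h->{type} ? $h->{type} : '';
    return join ',', $h->{host}, $type, join( ' ', @{ $h->{dns} } );
}

# Abbreviated copy of the sample data posted above.
my $sample = <<'END';
ipconfig start
Windows IP Configuration
   Host Name . . . . . . . . . . . . : axter-win2003
   Node Type . . . . . . . . . . . . : Hybrid
Ethernet adapter Local Area Connection:
   IP Address. . . . . . . . . . . . : 192.168.0.6
   DNS Servers . . . . . . . . . . . : 127.0.0.1
                                       192.168.0.1
   Lease Obtained. . . . . . . . . . : Tuesday, September 16, 2003 6:18:13 PM
ipconfig end
ipconfig start
Windows 98 IP Configuration
        Host Name . . . . . . . . . : LAPTOP.no.cox.net
        DNS Servers . . . . . . . . : 205.152.132.235
                                      181.171.2.200
                                      10.10.10.1
        Node Type . . . . . . . . . : Broadcast
ipconfig end
END

print to_csv_line($_), "\n" for parse_ipconfig($sample);
```

To read from STDIN instead, the whole input could be slurped first, e.g. my $text = do { local $/; <STDIN> };, and passed to parse_ipconfig. Keeping per-adapter details separate would need a further split of each record on the "Ethernet adapter" lines.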
