Re: How to print data between tags from file sequentially?

#!/usr/bin/perl
#
# This is an answer, in comments and code, to the question:
# How to print data between tags from file sequentially?
# URL: https://perlmonks.org/index.pl?node_id=1217156
#
# LET'S EMBED THE FILE IN THE SCRIPT TO MAKE THIS EASY!
# YOUR FILE IS NOW ANYTHING AFTER __DATA__ AT THE END:
#
# open file
# open(FILE, "data.txt") or die("Unable to open file");
#
# ALWAYS START WITH THESE TWO LINES, FOR HELPFUL ERROR MESSAGES:

use strict;
use warnings;

# read file into an array

# PUT my BEFORE ALL VARIABLES TO PREVENT TYPOS LATER ON:
my @data = <DATA>;

# close file

# NOT NECESSARY ANYMORE:
# close(FILE);

# print file contents
foreach my $line (@data) {
# THIS PRINTS EACH LINE, NOT WHAT YOU WANT:
# print $line;

# MODIFY EACH LINE WITH A REGEX* FOR CUSTOM PRINT!
# (*Short for "Regular Expression")
#
# Regex allows you to search and replace!
# s is the command and // are the quotes:
# s / FIND THIS TEXT / REPLACE WITH THIS /

    $line # For each line
    =~    # Do something like
    s/    # Search for
    <     # a < character
    [^>]+ # and one or more characters: [  ]+
          # that are not a > character:  ^>
    >     # Followed by a > character
          # (So anything like <whatever>)
    //gx; # Replace it with nothing: //
          # g means replace all of them (global)
          # x allows these comments because it is
          # usually put on one line like this:
          #
          # $line =~ s/<[^>]+>//g; # COOL!

# NOW THE LINE IS CHANGED SO YOU CAN LOOK AT EACH LINE
# WITH =~ TO FIND OUT WHICH TEXT TO PRINT:

    # SIMPLE:

    if ($line =~ /ServerName/) {
        print "Computer: $line";
    }

    # SOMETHING MORE COMPLICATED:

    elsif ($line =~ /^  # IF LINE BEGINS WITH

            \d  # A NUMBER (DIGIT)

            /x  # (END REGEX, x JUST ALLOWS THESE COMMENTS)

            ){  # IF THE ELSIF DECISION IS YES BECAUSE WE 
                # SAW A NUMBER, THEN DO THIS:

        print "IP Address: $line";
    }

    # FIND WINDOWS OR LINUX OR MAC:
    # NOTE: i at the end of the next regex means
    # "case insensitive" so "a" and "A" are treated as equal.

    elsif ($line =~ /Windows|Linux|Mac/i) {
        print "OS: $line";
    }

    # PRINT UNKNOWN LINES TOO SO YOU CAN SEE HOW 
    # TO ADD MORE DECISIONS ABOVE AND MAKE IT 
    # PRINT WHAT YOU WANT:

    else {
        print "UNKNOWN: $line";
    }
}

# THE EMBEDDED FILE:

__DATA__
<Answer type="string">ServerName</Answer>
<Answer type="string">10.10.10.11</Answer>
<Answer type="string">Windows Server 2012</Answer>

_{STOP REINVENTING WHEELS, START BUILDING SPACE ROCKETS!—CPAN} 🐫
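
For reading a real file rather than __DATA__, a minimal sketch along these lines may help. It assumes a three-argument open with a lexical filehandle instead of the bareword FILE; strip_tags and read_stripped are hypothetical helper names, and the temporary file only stands in for data.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Remove anything that looks like <tag ...> from a line.
sub strip_tags {
    my ($line) = @_;
    $line =~ s/<[^>]+>//g;
    return $line;
}

# Open a file with a lexical filehandle and three-argument open,
# then return the tag-free text of every line.
sub read_stripped {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Unable to open $path: $!";
    my @lines = map { strip_tags($_) } <$fh>;
    close($fh);
    return @lines;
}

# Demo only: write a throwaway file (a stand-in for data.txt) and read it back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} qq{<Answer type="string">ServerName</Answer>\n};
print {$tmp_fh} qq{<Answer type="string">10.10.10.11</Answer>\n};
close($tmp_fh);

print for read_stripped($tmp_path);
```

Including $! in the die message reports why the open failed, which the original "Unable to open file" message does not.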

Re^2: How to print data between tags from file sequentially? by TonyNY (Beadle) on Jun 22, 2018 at 16:34 UTC
Hi usemodperl,

First of all I want to say thanks so much for this tutorial type solution!

After changing the following line from

    # PUT my BEFORE ALL VARIABLES TO PREVENT TYPOS LATER ON:
    my @data = <DATA>;

to:

    my @data = <FILE>;

here are my results after parsing the actual file:

    UNKNOWN:
    UNKNOWN:
    UNKNOWN:
    UNKNOWN:
    UNKNOWN:
    UNKNOWN: ServerName
    UNKNOWN: 10.10.10.11
    UNKNOWN: bfRootServer (0)
    OS: Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)
    UNKNOWN: Fri, 22 Jun 2018 10:26:53 -0500
    UNKNOWN: 9.2.1.48
    UNKNOWN: SUpportGroup1
    UNKNOWN:
    UNKNOWN:
    UNKNOWN:
    UNKNOWN: 34.402ms
    UNKNOWN: Plural
    UNKNOWN:
    UNKNOWN:
    UNKNOWN:

Contents of the source text/xml file, excluding the actual infrastructure names:

    <?xml version="1.0" encoding="UTF-8"?>
    <BESAPI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="BESAPI.xsd">
        <Query Resource="(names of it, ip addresses of it, root server of it, operating systems of it, last report time of it, agent versions of it, values of results from (BES Property &quot;_SupportGroup&quot;) of it) of bes computers whose ( name of it as lowercase starts with &quot;ServerName&quot;)">
            <Result>
                <Tuple>
                    <Answer type="string">ServerName</Answer>
                    <Answer type="string">10.10.10.1`</Answer>
                    <Answer type="string">bfRootServer (0)</Answer>
                    <Answer type="string">Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)</Answer>
                    <Answer type="time">Fri, 22 Jun 2018 10:26:53 -0500</Answer>
                    <Answer type="string">9.2.1.48</Answer>
                    <Answer type="string">SupportGroup1</Answer>
                </Tuple>
            </Result>
            <Evaluation>
                <Time>34.402ms</Time>
                <Plurality>Plural</Plurality>
            </Evaluation>
        </Query>
    </BESAPI>

So to summarize, my desired output after parsing the file is:

    Computer: ServerName
    IP Address: 10.10.10.11
    Root Server: bfRootServer
    OS: Windows Server 2012
    Last Report Time: Fri, 22 Jun 2018 10:26:53 -0500
    BES Agent Version: 9.2.1.48
    Support Group: SupportGroup1

Thanks again!
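
A likely reason the IP address line falls through to UNKNOWN is the indentation in the real file: once the tags are removed, each value still starts with spaces, so an anchored pattern such as /^\d/ never sees the digit. The sketch below trims each line before testing it. classify is a hypothetical helper name, and the two patterns are taken from the script above.

```perl
use strict;
use warnings;

# Label one tag-free line. Whitespace is trimmed from both ends first,
# so anchored patterns like /^\d/ test the real first character.
sub classify {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;
    return "IP Address: $line" if $line =~ /^\d/;
    return "OS: $line"         if $line =~ /Windows|Linux|Mac/i;
    return "UNKNOWN: $line";
}

# Indented input, as it looks after the tags have been stripped.
print classify("            10.10.10.11\n"), "\n";
print classify("            Linux Red Hat Enterprise Server 6.9\n"), "\n";
```

Note that 9.2.1.48 (the agent version) also starts with a digit, so this would label it as an IP address too. Matching on content alone cannot tell those two fields apart.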
Re^3: How to print data between tags from file sequentially? by usemodperl (Beadle) on Jun 22, 2018 at 21:06 UTC
#!/usr/bin/perl -l
#
# This is an answer, in comments and code, to the question:
# How to print data between tags from file sequentially?
# URL: https://perlmonks.org/index.pl?node_id=1217156
#

=pod

NOTE: THIS IS NOT MODERN PERL BEST PRACTICES! THIS IS THE
SWISS ARMY CHAINSAW GETTING IT DONE THE OLD FASHIONED WAY:

Perlmonks are technically correct about the best way to
do it with modules but since OP can't install modules then
perl's built in bag of tricks can save the day. This sort
of thing is very well known to be a bad solution to a worse
problem but sometimes you have to do what works instead of
what is best. This is why perl can work miracles and also
why people complain about unmaintainable code. To be good
code this entire script would have to be rewritten using
appropriate CPAN modules. Techniques and comments are
geared entirely towards comprehension by the OP, still
learning basics.

=cut

#
# LET'S EMBED THE FILE IN THE SCRIPT TO MAKE THIS EASY!
# YOUR FILE IS NOW ANYTHING AFTER __DATA__ AT THE END:
#
# open file
# open(FILE, "data.txt") or die("Unable to open file");
#
# OPENING FILES THE RIGHT WAY:
# use autodie; # SO YOU DON'T HAVE TO CHECK
#
# OPEN LIKE THIS TO READ FILE:
# open my $FILE, "<", "data.txt";
#
# THEN YOU CAN DO:
# my @data = <$FILE>;
#
# close $FILE;
#
# ALWAYS START WITH THESE TWO LINES, FOR HELPFUL ERROR MESSAGES:

use strict;
use warnings;

# THIS MAKES ERROR MESSAGES EVEN BETTER BUT
# SHOULD BE REMOVED WHEN DONE HACKING:
use diagnostics;

# THIS MODULE LETS YOU SEE DATA:
use Data::Dumper;

# read file into an array

# PUT my BEFORE ALL VARIABLES TO PREVENT TYPOS LATER ON:
chomp(my @data = <DATA>); # CHOMP REMOVES END OF LINES: \n

# LOOK AT DATA:
print 'Input data: ';
print Dumper @data;
print 'That was @data (which now contains DATA).';
print 'Let\'s remove empty lines.';
print 'Press return to continue...';
<STDIN>; # PAUSE

# GET RID OF EMPTY LINES:
# \S+ means one or more characters that are not space.
@data = grep /\S+/, @data;

# LOOK AT DATA:
print Dumper @data;
print 'Empty lines removed.';
print 'Let\'s remove extra space.';
print 'Press return to continue...';
<STDIN>;

# REMOVE LEADING SPACE FROM ALL LINES:
foreach my $line (@data) {
    $line =~ s/^\s+//;
}

# LOOK AT DATA:
print Dumper @data;
print 'Extra space removed.';
print 'Let\'s make array @data into string $data.';
print 'Press return to continue...';
<STDIN>;

# PUT THE ARRAY INTO A STRING:
my $data = join "\n", @data;

# LOOK AT DATA:
print Dumper $data;
print 'Made string $data from array @data.';
print 'Let\'s find Tuples in $data and put them in @dat2.';
print 'Press return to continue...';
<STDIN>;

# PUT ALL THE TUPLES INTO A NEW ARRAY:
my @dat2 = ($data =~ /<Tuple>(.*?)<\/Tuple>/sg);

# LOOK AT DATA:
print Dumper @dat2;
print 'Found Tuples in $data and put them in @dat2.';
print 'Let\'s split @dat2 back into lines.';
print 'Press return to continue...';
<STDIN>;

# SPLIT SELF BACK TO LINES
@dat2 = map { split /\n/ } @dat2;

# LOOK AT DATA:
print Dumper @dat2;
print 'Split @dat2. Let\'s remove empty lines.';
print 'Press return to continue...';
<STDIN>;

# GET RID OF EMPTY LINES:
@dat2 = grep /\S+/, @dat2;

# LOOK AT DATA:
print Dumper @dat2;
print 'Removed empty lines.';
print 'Let\'s remove the tags.';
print 'Press return to continue...';
<STDIN>;

foreach my $line (@dat2) {
    # REMOVE THE TAGS:
    $line =~ s/<[^>]+>//g; # COOL!

    # ALSO REMOVE THAT TRAILING ` FROM ANY LINE (IP):
    $line =~ s/\`$//;
}

# LOOK AT DATA:
print Dumper @dat2;
print 'Removed the tags.';
print 'Let\'s use our @labels and print formatted data!';
print 'Press return to continue...';
<STDIN>;

# SET UP A COUNTER TO KEEP TRACK OF LINES. SINCE WE KNOW
# THERE ARE 7 FOR EACH RECORD, PRINT A SEPARATOR EVERY
# 7 LINES. THIS IS BRITTLE: IF THE DATA CHANGES IT WILL
# BREAK BUT IF THE DATA FORMAT IS STATIC THIS WILL WORK
# TILL THE END OF TIME, OR TILL SOMEONE ELSE BREAKS IT
# BY "UPGRADING" THE CODE THIS CODE RELIES ON. BRITTLE!

my $count = 0;

# DEFINE LABELS FOR EACH LINE OF DATA:
my @labels = (
    "Computer",
    "IP Address",
    "Root Server",
    "OS",
    "Last Report Time",
    "BES Agent Version",
    "Support Group",
);

# GET THE NUMBER OF LABELS:
my $size = scalar @labels;

foreach my $line (@dat2) {
    chomp $line; # REMOVE LINE ENDING: \n
    print "$labels[$count]: $line"; # PRINT LABEL AND DATA (THIS BREAKS EASY)
    if ($count == ($size - 1)) { # BECAUSE...
        $count = -1;             # COMPUTERS START COUNTING AT 0
        print "";                # PRINT BLANK LINE TO SEPARATE RECORDS
    }
    $count++; # INCREMENT COUNT BY 1
}

# THE EMBEDDED FILE:

__DATA__
<?xml version="1.0" encoding="UTF-8"?>
<BESAPI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="BESAPI.xsd">
    <Query Resource="(names of it, ip addresses of it, root server of it, operating systems of it, last report time of it, agent versions of it, values of results from (BES Property "_IRS_ServerResponsibilityGroup") of it) of bes computers whose ( name of it as lowercase starts with "vtjaa42vl006052")">
        <Result>
            <Tuple>
                <Answer type="string">ServerName</Answer>
                <Answer type="string">10.10.10.1`</Answer>
                <Answer type="string">bfRootServer (0)</Answer>
                <Answer type="string">Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)</Answer>
                <Answer type="time">Fri, 22 Jun 2018 10:26:53 -0500</Answer>
                <Answer type="string">9.2.1.48</Answer>
                <Answer type="string">SupportGroup1</Answer>
            </Tuple>
        </Result>
        <Evaluation>
            <Time>34.402ms</Time>
            <Plurality>Plural</Plurality>
        </Evaluation>
    </Query>
</BESAPI>
<?xml version="1.0" encoding="UTF-8"?>
<BESAPI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="BESAPI.xsd">
    <Query Resource="(names of it, ip addresses of it, root server of it, operating systems of it, last report time of it, agent versions of it, values of results from (BES Property "_IRS_ServerResponsibilityGroup") of it) of bes computers whose ( name of it as lowercase starts with "vtjaa42vl006052")">
        <Result>
            <Tuple>
                <Answer type="string">ServerName</Answer>
                <Answer type="string">10.10.10.1`</Answer>
                <Answer type="string">bfRootServer (0)</Answer>
                <Answer type="string">Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)</Answer>
                <Answer type="time">Fri, 22 Jun 2018 10:26:53 -0500</Answer>
                <Answer type="string">9.2.1.48</Answer>
                <Answer type="string">SupportGroup1</Answer>
            </Tuple>
        </Result>
        <Evaluation>
            <Time>34.402ms</Time>
            <Plurality>Plural</Plurality>
        </Evaluation>
    </Query>
</BESAPI>
<?xml version="1.0" encoding="UTF-8"?>
<BESAPI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="BESAPI.xsd">
    <Query Resource="(names of it, ip addresses of it, root server of it, operating systems of it, last report time of it, agent versions of it, values of results from (BES Property "_IRS_ServerResponsibilityGroup") of it) of bes computers whose ( name of it as lowercase starts with "vtjaa42vl006052")">
        <Result>
            <Tuple>
                <Answer type="string">ServerName</Answer>
                <Answer type="string">10.10.10.1`</Answer>
                <Answer type="string">bfRootServer (0)</Answer>
                <Answer type="string">Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)</Answer>
                <Answer type="time">Fri, 22 Jun 2018 10:26:53 -0500</Answer>
                <Answer type="string">9.2.1.48</Answer>
                <Answer type="string">SupportGroup1</Answer>
            </Tuple>
        </Result>
        <Result>
            <Tuple>
                <Answer type="string">ServerName</Answer>
                <Answer type="string">10.10.10.1`</Answer>
                <Answer type="string">bfRootServer (0)</Answer>
                <Answer type="string">Linux Red Hat Enterprise Server 6.9 (2.6.32-696.23.1.el6.x86_64)</Answer>
                <Answer type="time">Fri, 22 Jun 2018 10:26:53 -0500</Answer>
                <Answer type="string">9.2.1.48</Answer>
                <Answer type="string">SupportGroup1</Answer>
            </Tuple>
        </Result>
        <Evaluation>
            <Time>34.402ms</Time>
            <Plurality>Plural</Plurality>
        </Evaluation>
    </Query>
</BESAPI>

_{STOP REINVENTING WHEELS, START BUILDING SPACE ROCKETS!—CPAN} 🐪
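
Once the step-by-step version makes sense, the same idea can be condensed into one sub without the pauses. This is only a sketch under the same assumptions as the script above: the data is regular enough for a regex (this is still not a real XML parser) and the Answer elements always come in the same order. format_tuples is a hypothetical name, and the short XML sample is cut down from the thread's data.

```perl
use strict;
use warnings;

# Labels in the order the <Answer> elements appear inside each <Tuple>.
my @labels = (
    "Computer", "IP Address", "Root Server", "OS",
    "Last Report Time", "BES Agent Version", "Support Group",
);

# Hypothetical helper: returns one "Label: value" string per <Answer>,
# followed by an empty string after each <Tuple> as a record separator.
sub format_tuples {
    my ($xml) = @_;
    my @out;
    for my $tuple ($xml =~ /<Tuple>(.*?)<\/Tuple>/sg) {
        # Capture the text between each <Answer ...> and </Answer>.
        my @values = $tuple =~ /<Answer[^>]*>(.*?)<\/Answer>/sg;
        for my $i (0 .. $#values) {
            # Fall back to a generic label if a record has extra fields.
            my $label = $i < @labels ? $labels[$i] : "Field " . ($i + 1);
            push @out, "$label: $values[$i]";
        }
        push @out, "";
    }
    return @out;
}

my $xml = <<'XML';
<Result>
  <Tuple>
    <Answer type="string">ServerName</Answer>
    <Answer type="string">10.10.10.11</Answer>
  </Tuple>
</Result>
XML

print "$_\n" for format_tuples($xml);
```

Pairing labels with values per Tuple, rather than counting lines across the whole file, means a record with a missing or extra Answer does not shift the labels of every record after it.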

