PerlMonks  

Simple perl one-liner for transforming text files

by m5m5m (Initiate)
on Mar 05, 2003 at 20:40 UTC ( [id://240696]=perlquestion )

m5m5m has asked for the wisdom of the Perl Monks concerning the following question:

Hello all! I am learning Perl. I have a file in a format like the following:

    name: url1.domain1.com
    date: 2004/2/1
    unwanted info: blah blah blah

    name: url2.domain5.org
    date: 2004/3/12
    unwanted info: blah blah blah
    .
    .
    .

What I would like is a one-liner which results in:

    1. url1.domain1.com 2004/2/1
    2. url2.domain5.org 2004/3/12
    ...and so on...

The closest I have got is the following, but obviously I am missing something:

    perl -n -e 'print if s/^name: |^date: // ' file.txt

I have tried others using the ternary operator. I saw some tricks using a hash table for other but similar purposes, but it seemed complicated. For some reason "tr" does not work either, although for the sake of learning I would rather just use Perl. Any help? Thanks in advance.

Replies are listed 'Best First'.
Re: Simple perl one-liner for transforming text files
by BrowserUk (Patriarch) on Mar 05, 2003 at 20:58 UTC

    perl -n000 -e" s[name: ([^\n]+)\ndate: ([^\n]+)\n.*][$1 $2]; chomp; print qq[$_\n]; " <file >newfile

    Use 's instead of "s on linux etc.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
      Way too verbose! ;-)
      perl -n000 -pe "$_=qq(@{{/(.*?): (.*)/g}}{name,date}\n)"

      jdporter
      The 6th Rule of Perl Club is -- There is no Rule #6.

        Sorry, I wasn't wearing my golf shoes:^)

        jdporter++. I physically applauded that once I stared at it long enough to work out how the hell it worked.


      Very cool. After a bit of reading, I think I understand what you are doing... Does not seem to work for me (prints out every line) yet, maybe I mistyped something... Let me go through it again... Thanks !

        A quick example of it in operation.

        C:\test>type junk
        name: url1.domain1.com
        date: 2004/2/1
        unwanted info: blah blah blah

        name: url2.domain5.org
        date: 2004/3/12
        unwanted info: blah blah blah

        name: url1.domain1.com
        date: 2004/2/1
        unwanted info: blah blah blah

        name: url2.domain5.org
        date: 2004/3/12
        unwanted info: blah blah blah

        name: url1.domain1.com
        date: 2004/2/1
        unwanted info: blah blah blah

        name: url2.domain5.org
        date: 2004/3/12
        unwanted info: blah blah blah

        name: url1.domain1.com
        date: 2004/2/1
        unwanted info: blah blah blah

        name: url2.domain5.org
        date: 2004/3/12
        unwanted info: blah blah blah

        C:\test>perl -n000 -e" s[name: ([^\n]+)\ndate: ([^\n]+)\n.*][$1 $2]; chomp; print qq[$_\n]; " <junk
        url1.domain1.com 2004/2/1
        url2.domain5.org 2004/3/12
        url1.domain1.com 2004/2/1
        url2.domain5.org 2004/3/12
        url1.domain1.com 2004/2/1
        url2.domain5.org 2004/3/12
        url1.domain1.com 2004/2/1
        url2.domain5.org 2004/3/12

        C:\test>

Re: Simple perl one-liner for transforming text files
by pfaut (Priest) on Mar 05, 2003 at 20:56 UTC

    Not quite one line, but...

    Your 'records' are separated by a blank line, so set the input record separator ($/) to a pair of newlines. Then split each record on newlines and take the part after the colon for the fields you want.

    #!/usr/bin/perl -w
    use strict;

    $/ = "\n\n";
    while (<DATA>) {
        my @wanted = map { (split(/:\s+/))[1] } (split /\n/)[0..1];
        print "@wanted\n";
    }
    __DATA__
    name: url1.domain1.com
    date: 2004/2/1
    unwanted info: blah blah blah

    name: url2.domain5.org
    date: 2004/3/12
    unwanted info: blah blah blah
    --- print map { my ($m)=1<<hex($_)&11?' ':''; $m.=substr('AHJPacehklnorstu',hex($_),1) } split //,'2fde0abe76c36c914586c';
Re: Simple perl one-liner for transforming text files (JAWTDI)
by tye (Sage) on Mar 05, 2003 at 21:37 UTC
    perl -ne 'print ++$n,". $1 " if /^name:\s*(.*)/; print $1,$/ if /^date:\s*(.*)/'
    or, for Win32:
    perl -ne "print ++$n,qq(. $1 ) if /^name:\s*(.*)/; print $1,$/ if /^date:\s*(.*)/"
                    - tye
Re: Simple perl one-liner for transforming text files
by Enlil (Parson) on Mar 05, 2003 at 21:05 UTC
    Here's my attempt:
    perl -n0e 'while (/name: (.*)\ndate: (.*)/g){$cnt++;print "$cnt. $1 $2\n"}' <FILE>

    -enlil

Re: Simple perl one-liner for transforming text files
by hiseldl (Priest) on Mar 06, 2003 at 03:23 UTC

    Here's another version. When running one-liners it is extremely useful to learn the command-line switches. Here's a short breakdown:

    • use -aF: to split on the ':' and store it in @F
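    • use -l to automatically chomp the input record separator ($/) from each line
    • the 0 that follows -l in the bundle below is read as -l's octal argument, not as a separate -0: -l0 sets the output record separator ($\) to the null character, so print adds no newline of its own, while $/ stays "\n" and input is still read line by line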
    • use -n to loop over every line of input
    • use -e to run the script following it
    • the first expression prints the counter if the line contains name
    • the second expression prints the right side of the ':' if the line contains name or date
    • the last expression prints a newline (or record separator) if the line contains date

     perl -al0nF: -e 'print /name/&&++$x,/name|date/&&$F[1],/date/&&$/' data.txt

    the data file contains:

    name: url1.domain1.com
    date: 2004/2/1
    unwanted info: blah blah blah

    name: url2.domain5.org
    date: 2004/3/2
    unwanted info: blah blah blah

    name: url3.domain1.com
    date: 2004/2/3
    unwanted info: blah blah blah

    name: url4.domain5.org
    date: 2004/3/4
    unwanted info: blah blah blah

    name: url5.domain1.com
    date: 2004/2/5
    unwanted info: blah blah blah

    name: url6.domain5.org
    date: 2004/3/6
    unwanted info: blah blah blah

    and here are the results of running the script:

    1 url1.domain1.com 2004/2/1
    2 url2.domain5.org 2004/3/2
    3 url3.domain1.com 2004/2/3
    4 url4.domain5.org 2004/3/4
    5 url5.domain1.com 2004/2/5
    6 url6.domain5.org 2004/3/6

    Change all the ' into " to run on win32

    HTH.

    --
    hiseldl
    What time is it? It's Camel Time!

      Do sick minds think alike? It would seem we came up with very similar answers!

      Cheers,

      -- Dave :-)


      $q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print
Re: Simple perl one-liner for transforming text files
by zengargoyle (Deacon) on Mar 06, 2003 at 06:30 UTC
    perl -000ne 'print"$.. @{[(split)[1,3]]}\n"' data.txt
Re: Simple perl one-liner for transforming text files
by OM_Zen (Scribe) on Mar 06, 2003 at 04:48 UTC
    Hi, the regular expression predefined variables (here $', the text after the match) can do the job:

    while (<Fnm>) {
        chomp;
        if ($_ =~ /^name:\s+/) {
            $cnt++;
            print "$cnt\. $'";
        }
        elsif ($_ =~ /^date:\s+/) {
            print " $'\n";
        }
    }
    __DATA__
    name: url1.domain1.com
    date: 2004/2/1
    unwanted info: blah blah blah

    name: url2.domain5.org
    date: 2004/3/12
    unwanted info: blah blah blah
    __END__
    1. url1.domain1.com 2004/2/1
    2. url2.domain5.org 2004/3/12


Re: Simple perl one-liner for transforming text files
by DaveH (Monk) on Mar 06, 2003 at 01:08 UTC

    Hi.

    Should this move to Obfuscated Code? :-)

    perl -F: -al0ne '$i||=1;/^n/&&print$i;!/^u/&&print$F[1];/^$/&&$i++&&print$/'

    I don't think it meets the "simple" requirement, but it is mostly on one line... ;-)

    Here is what B::Deparse makes of it though, for reference.

    LINE: while (defined($_ = <ARGV>)) {
        chomp $_;
        @F = split(/:/, $_, 0);
        $i ||= 1;
        print $i if /^n/;
        print $F[1] if not /^u/;
        print $/ if /^$/ and $i++;
    }

    I'm not sure if the preceding number is a requirement or not. Without the number prefix, I can get down to:

    perl '-F: ' -al0ne '!/^u/&&print qq($F[1] );/^$/&&print$/'

    (To run on Win32, change all the ' into " - it should still work.)

    Hope that helps.

    Cheers,

    -- Dave :-)


