PerlMonks  

File Iteration and looking for specific matching data

by bshah (Novice)
on Mar 24, 2014 at 20:27 UTC (id://1079590)

bshah has asked for the wisdom of the Perl Monks concerning the following question:

Hi Gurus,

I'm looking to iterate over a file and, if a specific word is found, store the lines following it that match a specific pattern. The ldap.txt file is pretty large: several gigabytes.

user.txt
test1

game

ldap.txt
dn: uid=test1,ou=people,dc=admin,dc=local

blah

blah

maillocaladdress: test1@example.com

maillocaladdress: test.team@example.com

maillocaladdress: test11@example.com

some date

some more data

data


dn: uid=game,ou=people,dc=admin,dc=local

blah

blah

maillocaladdress: game@example.com

maillocaladdress: game.test@example.com

maillocaladdress: game-test@example.com

some date

some more data

data

and so on..


Open user.txt and iterate through each user and check each line on ldap.txt in dn: line. If it matches, then store the value of all the lines matching maillocaladdress in a variable. I assume a hash key/value pair, but here the values are more than one.


e.g.

test1 matches dn: uid=test1,ou=people,dc=admin,dc=local

Store the following values for each user.

test1@example.com

test.team@example.com

test11@example.com

Thank you for all your help & support.


Replies are listed 'Best First'.
Re: File Iteration and looking for specific matching data
by kcott (Archbishop) on Mar 25, 2014 at 06:22 UTC

    G'day bshah,

    "Open user.txt and iterate through each user and check each line on ldap.txt in dn: line."

    That would mean you'd be reading your multi-gigabyte ldap.txt multiple times (i.e. once for every user in user.txt). Instead, read user.txt once and create a regex with a capture group containing an alternation of all the users. If that's unfamiliar to you, see "Perl regular expressions tutorial".
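One way to sketch the "read user.txt once, build one regex" idea is shown below. The helper name build_user_re is hypothetical and not part of the thread; the use of quotemeta and the filtering of empty entries are extra assumptions, added in case a user name contains regex metacharacters or user.txt contains blank lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build one compiled regex
# from a list of user names.
sub build_user_re {
    # Drop empty entries, e.g. blank lines read from user.txt.
    my @names = grep { length($_) } @_;

    # Escape regex metacharacters so "a.b" only matches a literal "a.b".
    my $alternation = join '|', map { quotemeta($_) } @names;

    # The trailing comma stops "test1" from matching "uid=test11,".
    return qr/uid=($alternation),/;
}

my $user_re = build_user_re('test1', '', 'game', 'a.b');

for my $dn ('dn: uid=test1,ou=people,dc=admin,dc=local',
            'dn: uid=test11,ou=people,dc=admin,dc=local') {
    if ($dn =~ $user_re) {
        print "matched user '$1' in: $dn\n";
    }
    else {
        print "no listed user in: $dn\n";
    }
}
```

Compiling the pattern once with qr// means the multi-gigabyte file is scanned a single time, however many users are listed.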

    "If it matches, then store the value of all the lines matching maillocaladdress in a variable. I assume a hash key/value pair, but here the values are more than one."

    What you want here is a "hash of arrays".
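As a minimal sketch of a hash of arrays, using addresses from the sample data: each key is a user name and each value is a reference to an array of addresses. The names %mail_for and add_address are illustrative only and do not appear in the thread.

```perl
use strict;
use warnings;

# Illustrative names: %mail_for and add_address are not from the thread.
my %mail_for;

sub add_address {
    my ($user, $address) = @_;
    # Autovivification creates the array reference on the first push.
    push @{ $mail_for{$user} }, $address;
    return;
}

add_address('test1', 'test1@example.com');
add_address('test1', 'test.team@example.com');
add_address('game',  'game@example.com');

for my $user (sort keys %mail_for) {
    my $count = @{ $mail_for{$user} };    # array in scalar context gives its size
    print "$user has $count address(es): @{ $mail_for{$user} }\n";
}
```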

    Putting all that together (based on your sample data):

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    my ($user_file, $ldap_file) = qw{pm_1079590_user.txt pm_1079590_ldap.txt};

    open my $user_fh, '<', $user_file;
    my @users;
    while (<$user_fh>) {
        chomp;
        push @users, $_;
    }
    close $user_fh;

    my $user_re = 'uid=(' . join('|', @users) . '),';
    my %user_ldap_data;

    open my $ldap_fh, '<', $ldap_file;
    my $user = '';
    while (<$ldap_fh>) {
        if (/^dn:/) {
            $user = /$user_re/ ? $1 : '';
            next;
        }

        next unless $user;

        push @{$user_ldap_data{$user}}, $1 if /^maillocaladdress:\s+(\S+)/;
    }
    close $ldap_fh;

    use Data::Dump;
    dd \%user_ldap_data;

    Output:

    {
      game  => [
                 "game\@example.com",
                 "game.test\@example.com",
                 "game-test\@example.com",
               ],
      test1 => [
                 "test1\@example.com",
                 "test.team\@example.com",
                 "test11\@example.com",
               ],
    }

    -- Ken


      Thank you Ken

      Do you mind explaining the following piece of code please ?
      'uid=(' . join('|', @users) . '),';
      $user = /$user_re/ ? $1 : ''

      Kind Regards
        "Do you mind explaining the following piece of code please ?"

        Well, you've listed two pieces of code which appear in different parts of the script I posted:

        "'uid=(' . join('|', @users) . '),';"

        I've already provided a link explaining what's going on here: "... create a regex with a capture group containing an alternation of all the users. If that's unfamiliar to you, see 'Perl regular expressions tutorial'." Was there something in that documentation that you didn't understand?

        See "Perl functions A-Z" for information about join (or any of the other core functions I've used).

        You could always run the code I provided with a print statement to see how that expression is evaluated.
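For instance, a small sketch of that print-it-and-see approach, with the two user names from the sample user.txt hard-coded (an assumption, so the snippet runs without any files):

```perl
use strict;
use warnings;

# The two sample names from user.txt, hard-coded for illustration.
my @users = qw{test1 game};

# Same expression as in the script: the names are joined with '|'
# and wrapped in a capture group between 'uid=' and ','.
my $user_re = 'uid=(' . join('|', @users) . '),';
print "$user_re\n";    # prints: uid=(test1|game),

# Try the pattern against one of the sample dn: lines.
my $dn = 'dn: uid=game,ou=people,dc=admin,dc=local';
if ($dn =~ /$user_re/) {
    print "captured: $1\n";    # prints: captured: game
}
```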

        "$user = /$user_re/ ? $1 : ''"

        That's the ternary (aka conditional) operator: see "perlop: Conditional Operator".
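As a rough illustration, the same logic can be wrapped in a small function. The name user_from_dn is hypothetical, and the pattern is hard-coded here rather than built from user.txt.

```perl
use strict;
use warnings;

# Pattern hard-coded for illustration; the script builds it from user.txt.
my $user_re = 'uid=(test1|game),';

# Hypothetical helper: return the captured user name, or '' if none matched.
sub user_from_dn {
    my ($line) = @_;

    # CONDITION ? VALUE_IF_TRUE : VALUE_IF_FALSE
    # Equivalent to:
    #   if ($line =~ /$user_re/) { return $1 } else { return '' }
    return $line =~ /$user_re/ ? $1 : '';
}

for my $line ('dn: uid=test1,ou=people,dc=admin,dc=local',
              'dn: uid=nobody,ou=people,dc=admin,dc=local') {
    my $user = user_from_dn($line);
    print length($user) ? "user: $user\n" : "no listed user\n";
}
```

$1 is only read in the branch taken after a successful match, so it never carries a stale value from an earlier match.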

        -- Ken

Re: File Iteration and looking for specific matching data
by Preceptor (Deacon) on Mar 24, 2014 at 21:39 UTC

    That looks like an extract from an ldap database. Have you considered instead using Net::LDAP to query the directory directly rather than parsing a dump 'by hand'?

    LDAP queries support searching on loose text matches. (E.g. search key on 'maillocaladdress=test1*')

    Failing that though, this looks like a fairly straightforward 'while' loop on a filehandle. If you're really insistent on hand-parsing an LDAP directory, you'll probably find the 'range' operator useful to read up on.
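A minimal sketch of the range (flip-flop) operator for this kind of file follows. It assumes each record starts with a dn: line and ends at a blank line; that layout is an assumption about the dump, not something stated in the thread. The sample data is read from an in-memory string so the snippet runs without any files.

```perl
use strict;
use warnings;

# Assumed layout: a record starts at "dn:" and ends at the next blank line.
my $ldif = <<'END';
dn: uid=test1,ou=people,dc=admin,dc=local
blah
maillocaladdress: test1@example.com
maillocaladdress: test.team@example.com

dn: uid=game,ou=people,dc=admin,dc=local
maillocaladdress: game@example.com

END

# Open the string as an in-memory filehandle.
open my $fh, '<', \$ldif or die "Cannot open in-memory file: $!";

my @addresses;
while (my $line = <$fh>) {
    # In scalar context ".." is true from the line matching the left
    # pattern through the next line matching the right pattern.
    if ($line =~ /^dn: uid=test1,/ .. $line =~ /^\s*$/) {
        push @addresses, $1 if $line =~ /^maillocaladdress:\s+(\S+)/;
    }
}
close $fh;

print "$_\n" for @addresses;
```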

      Hi,

      Thanks for your reply. Yes, this is a database, but it's a backup file: the data has been removed from the current database, so we need to use the file instead of Net::LDAP.

Re: File Iteration and looking for specific matching data
by Kenosis (Priest) on Mar 24, 2014 at 20:43 UTC

    Please use <code> tags to enclose your data to improve its readability.

Re: File Iteration and looking for specific matching data
by choroba (Cardinal) on Mar 25, 2014 at 16:13 UTC
    Crossposted at StackOverflow. It is considered polite to inform others about crossposting, so that people who do not follow both sites do not waste their time solving a problem already solved at the other end of the Internet.
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
