http://qs321.pair.com?node_id=237724

blm has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a Perl script to work with RPM files. Part of it involves matching an RPM filename and determining the package name and version. Currently I have the following test case:

#!/usr/bin/perl -w
$filename = "urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm";
print "$filename\n";
if ($filename =~ m/([a-zA-Z0-9]+)(\-\w+)*-([0-9.]+)-([0-9.]+mdk)\.(\w+)\.rpm/) {
    print $1 . "\n";
    print $2 . "\n";
    print $3 . "\n";
    print $4 . "\n";
}

I want this to print:

urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm
urpmi-parallel-ssh
4.0-20.1mdk
noarch
rpm

Instead it currently prints:

urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm
urpmi
-ssh
4.0
20.1mdk

Unfortunately I am struggling to get any further. If anyone can help I would appreciate it. Also, if there are better approaches to this problem, please let me know.

I can't get the Perl RPM module to install, and the RPM file is not local.

Replies are listed 'Best First'.
Re: Regex to break up rpm filenames
by xmath (Hermit) on Feb 22, 2003 at 13:45 UTC
    $filename = "urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm";
    print "$filename\n";
    if ($filename =~ /^(.+)-([^-]+)-([^-]+)\.(\w+)\.rpm$/) {
        print "package: $1\n",
              "version: $2\n",
              "release: $3\n",
              "arch: $4\n";
    }

    •Update: made the pattern simpler and more universal
    •Update: and even simpler, based on a document describing what an rpm name can look like
    •Update: added an explanation of what the components are
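
As a rough sketch of how the pattern above splits a name, it can be wrapped in a small sub and tried on a few filenames. The sub name `parse_rpm_filename` is made up for illustration and is not part of any module; the sketch assumes the conventional name-version-release.arch.rpm layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_rpm_filename is a hypothetical helper, not part of any module.
# In list context it returns (name, version, release, arch), or an empty
# list when the filename does not follow name-version-release.arch.rpm.
sub parse_rpm_filename {
    my ($filename) = @_;
    # The greedy (.+) takes as much as it can, so the two [^-]+ groups
    # are left with the last two hyphen-separated fields.
    return $filename =~ /^(.+)-([^-]+)-([^-]+)\.(\w+)\.rpm$/;
}

for my $file (qw(
    urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm
    imlib2-1.0.6-fr2.i386.rpm
    not-an-rpm.txt
)) {
    if (my ($name, $version, $release, $arch) = parse_rpm_filename($file)) {
        print "$file: package=$name version=$version release=$release arch=$arch\n";
    }
    else {
        print "$file: does not look like an RPM filename\n";
    }
}
```

Because the name group is greedy and the version and release groups may not contain hyphens, any extra hyphens end up in the package name, which is why "urpmi-parallel-ssh" stays in one piece.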

Re: Regex to break up rpm filenames
by jasonk (Parson) on Feb 22, 2003 at 14:25 UTC

    Keep in mind that the version information in the filename is only there to help the human reader; the version information that is used when you actually install the file comes from within the RPM itself. This means an RPM can have any filename and still be valid, although the regexps already offered here will work in most cases.

    Because of this limitation, the best way to get the information you are looking for is to download the package and extract the information from inside it. If you can't get the Perl RPM package to install, but you do have the rpm command, you can get this information like this:

    my ($name, $version, $release, $arch) =
        split(' ', `rpm -qp --queryformat '%{NAME} %{VERSION} %{RELEASE} %{ARCH}\n' $filename`);

    If you can't download the package, the other regexps offered here should work in most cases. Just keep this in mind if you find one gives you the wrong information for some package.
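
The two approaches can be combined. The following is only a sketch: `rpm_info` is a hypothetical helper name, it assumes a Unix-like system where the rpm command may or may not be on the PATH, and it uses the filename pattern from elsewhere in this thread as its fallback.

```perl
use strict;
use warnings;

# rpm_info is a hypothetical helper. It asks the rpm command for the header
# fields when the package file exists locally, and otherwise guesses from
# the filename. Returns (name, version, release, arch) or an empty list.
sub rpm_info {
    my ($filename) = @_;

    if (-f $filename) {
        # List-form pipe open: no shell is involved, so odd characters
        # in $filename cannot be misread as shell syntax.
        my $pid = open(my $rpm, '-|', 'rpm', '-qp', '--queryformat',
                       '%{NAME} %{VERSION} %{RELEASE} %{ARCH}\n', $filename);
        if ($pid) {
            my $line = <$rpm>;
            close $rpm;
            if (defined $line) {
                my @fields = split ' ', $line;
                return @fields if @fields == 4;
            }
        }
    }

    # Fallback: trust the filename (last path component only).
    my ($base) = $filename =~ m{([^/]+)$};
    return unless defined $base;
    return $base =~ /^(.+)-([^-]+)-([^-]+)\.(\w+)\.rpm$/;
}

my @info = rpm_info('imlib2-1.0.6-fr2.i386.rpm');
print "@info\n" if @info;
```

The header query is tried first because it is authoritative; the filename is consulted only when the file is not available locally or rpm cannot be run.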

Re: Regex to break up rpm filenames
by OM_Zen (Scribe) on Feb 22, 2003 at 16:46 UTC
    Hi,

    my $filenm = "urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm";
    if ($filenm =~ m/((\w+)-)+/) {
        ($arr, undef, $var70, undef, $var97) = split(/((?<=[a-z])\.(?=[a-z]))/, $');
        local $str57 = $&;
        $str57 =~ s/-$//;
        print "[$str57]\n";
        print "[$arr]\n";
        print "[$var70]\n";
        print "[$var97]\n";
    }
    __END__
    [urpmi-parallel-ssh]
    [4.0-20.1mdk]
    [noarch]
    [rpm]


    This solution uses extended regular-expression patterns (lookbehind and lookahead assertions) to split on the dots. I think this is the kind of regex-based answer you were looking for.

Re: Regex to break up rpm filenames
by Aristotle (Chancellor) on Feb 22, 2003 at 21:14 UTC

    I would definitely try a bit harder to get RPM to install. Have you tried finding an RPM package of it?

    There's also RPM::Perlonly as well as a host of other RPM related distributions on CPAN.

    Makeshifts last the longest.

      Thanks for replying.

      Unfortunately, to use these modules I would need to download the RPMs to interrogate their headers. The goal of my project was to avoid downloading them unless I needed them.

      I realise that because of this I am relying on the filenames, which may or may not be accurate.
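
One possible way to live with that uncertainty is to sort the remote filenames into those that fit the usual naming convention and those that do not, so that only the odd ones need a closer look. This is a sketch under that assumption; `triage_rpm_filenames` is a made-up name and the sample filenames are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of remote filenames into those whose
# names parse cleanly and those that do not fit the convention.
# Returns a hash ref (filename => parsed fields) and an array ref.
sub triage_rpm_filenames {
    my @filenames = @_;
    my (%parsed, @unparsed);
    for my $file (@filenames) {
        if (my @fields = $file =~ /^(.+)-([^-]+)-([^-]+)\.(\w+)\.rpm$/) {
            my %info;
            @info{qw(name version release arch)} = @fields;
            $parsed{$file} = \%info;
        }
        else {
            push @unparsed, $file;
        }
    }
    return (\%parsed, \@unparsed);
}

my ($parsed, $unparsed) = triage_rpm_filenames(
    'urpmi-parallel-ssh-4.0-20.1mdk.noarch.rpm',
    'latest.rpm',
);
for my $file (sort keys %$parsed) {
    my $i = $parsed->{$file};
    print "$file => $i->{name} $i->{version}-$i->{release} ($i->{arch})\n";
}
print "needs a closer look: $_\n" for @$unparsed;
```

A filename that parses cleanly can still disagree with the package header, so this only narrows down which files deserve suspicion.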

Re: Regex to break up rpm filenames
by Anonymous Monk on Feb 22, 2003 at 13:44 UTC

    if ( my @bits = $filename =~ m[(^[^\d]+)-([\d\.-]+mdk)\.([^\.]+)\.(rpm$)] ) {
        print $_, $/ for @bits;
    }
      No, the package name can contain digits, e.g. "imlib2-1.0.6-fr2.i386.rpm".
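
The objection is easy to check. The sketch below runs the pattern from this post and the hyphen-based pattern from xmath's reply against the counter-example; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $file = 'imlib2-1.0.6-fr2.i386.rpm';

# Pattern from the post above: the name may not contain digits and the
# release must end in "mdk".
my @strict = $file =~ m[(^[^\d]+)-([\d\.-]+mdk)\.([^\.]+)\.(rpm$)];

# Pattern that relies only on the position of hyphens and dots.
my @loose = $file =~ /^(.+)-([^-]+)-([^-]+)\.(\w+)\.rpm$/;

print "digit-free name pattern: ", (@strict ? "@strict" : "no match"), "\n";
print "hyphen-based pattern:    ", (@loose  ? "@loose"  : "no match"), "\n";
```

The first pattern fails here for two reasons: "imlib2" contains a digit, and the release "fr2" does not end in "mdk".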