Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^9: Update XML Values using two primary keys

by AnomalousMonk (Archbishop)
on Jan 12, 2021 at 20:03 UTC ( [id://11126816]=note: print w/replies, xml ) Need Help??


in reply to Re^8: Update XML Values using two primary keys
in thread Update XML Values using two primary keys

When I am splitting the key to lat long, it gets split with the number, not the decimal part.

I'm not the best one to help with XML stuff, but I can perhaps offer help on this. I think the best approach is not to split away what you do not want, but to extract what you do want, real numbers in this case. The Regexp::Common module is useful in this, for it defines a real-number pattern.

Win8 Strawberry 5.30.3.1 (64) Tue 01/12/2021 14:15:24 C:\@Work\Perl\monks >perl -Mstrict -Mwarnings # pm#11126809 use Regexp::Common qw(number); my $rx_lat_lon = qr{ (?<! \d) $RE{num}{real} (?! \d) }xms; for my $lat_lon ( '123,456', '12.345, -0', ' -2. , 0.987', ' -.0 ,0.987 ', '0, -0', '.0 ,-.0', '0. , -0.', ' 0.0 , -0.0 ', '1, -1', '.1 ,-.1', '1. , -1.', ' 1.2 , -1.2 ', '123 456', '123', 'foo', '123,foo', 'foo,123', '1.234.5', ) { my $got_lat_lon = my ($lat, $lon) = $lat_lon =~ m{ \A \s* ($rx_lat_lon) \s* , \s* ($rx_lat_lon) \s* \z + }xms; # $lat_lon =~ m{ $rx_lat_lon }xmsg; # no data validation if ($got_lat_lon) { printf "%18s -> lat %-10s lon %-10s \n", "'$lat_lon'", n_or_undef($lat), n_or_undef($lon); } else { print "'$lat_lon' FAILED to extract lat/lon \n"; } } sub n_or_undef { return defined $_[0] ? "'$_[0]'" : 'undef'; } ^Z '123,456' -> lat '123' lon '456' '12.345, -0' -> lat '12.345' lon '-0' ' -2. , 0.987' -> lat '-2.' lon '0.987' ' -.0 ,0.987 ' -> lat '-.0' lon '0.987' '0, -0' -> lat '0' lon '-0' '.0 ,-.0' -> lat '.0' lon '-.0' '0. , -0.' -> lat '0.' lon '-0.' ' 0.0 , -0.0 ' -> lat '0.0' lon '-0.0' '1, -1' -> lat '1' lon '-1' '.1 ,-.1' -> lat '.1' lon '-.1' '1. , -1.' -> lat '1.' lon '-1.' ' 1.2 , -1.2 ' -> lat '1.2' lon '-1.2' '123 456' FAILED to extract lat/lon '123' FAILED to extract lat/lon 'foo' FAILED to extract lat/lon '123,foo' FAILED to extract lat/lon 'foo,123' FAILED to extract lat/lon '1.234.5' FAILED to extract lat/lon
(I haven't tested it, but I think this code will work under virtually any Perl version. (Update: Counter-examples are welcome!))

Some comments:

  • The $RE{num}{real} pattern is not bounded (by design!), so defining an explicit $rx_lat_lon bounded pattern with
        my $rx_lat_lon = qr{ (?<! \d) $RE{num}{real} (?! \d) }xms;
    is IMHO good practice and gives you a single point at which to change the definition of a lat/lon value should this become necessary.
  • Defining an explicit pattern for matching a lat/lon field with
        m{ \A \s* ($rx_lat_lon) \s* , \s* ($rx_lat_lon) \s* \z }xms
    is again good practice IMHO because it can provide for a high degree of data validation. The $got_lat_lon flag is true if valid data is extracted.
  • If you are certain there will always be exactly two lat/lon sub-fields present and you don't care what else is there, you can use the simpler
        m{ $rx_lat_lon }xmsg
    matching expression (note the /g modifier). The $got_lat_lon flag becomes almost meaningless in this case.


Give a man a fish:  <%-{-{-{-<

Replies are listed 'Best First'.
Re^10: Update XML Values using two primary keys
by pratikpooja (Novice) on Jan 12, 2021 at 21:40 UTC

    Thank you. Actually, my Hash Key is a combination of lat lon value like 53.58464144.8.560693391. I was trying to separate the lat lon value so that I can add new tags to my XML file with the lat and on value respectively.

    I did the split part of lat and lon by using the split and join module.
    my @lat_lon_key = (split /[.]/, $key); my $lat_key = join('.', @lat_lon_key[0,1]); my $lon_key = join('.', @lat_lon_key[2,3]);
    $VAR1 = { '53.58964145.8.560993391' => { 'name' => 'A99_984_KEI', 'id' => '-1643796' }, '53.58464146.8.560693392' => { 'id' => -1643795, 'name' => 'A88_984_KEI' } };

    In the above Hash, if the 1st Hash key is present in the XML then it gets updated but if the 2nd Hash key is not present or rejected then it should get added as new tag in XML.

    Now my problem is I am not getting the desired rejected HoH keys in my else part of the statement. What I have to do in else part to get the rejected HoH key and build a new node in XML with it?

      but if the 2nd Hash key is not present or rejected then it should get added as new tag in XML.

      You can only tell if it is not present after you have trawled through all of the XML nodes, so the logic is this:

      1. Loop over each XML node, if the HoH has the key then use it and then delete it from the HoH
      2. After looping over all the nodes, all the remaining HoH keys are extras to be added, so loop over those keys (sorted if you like) and add them to the XML
      #!/usr/bin/env perl use strict; use warnings; my @nodes = qw/foo quux baz/; my %HoH = ( foo => { day => 'Mon' }, bar => { day => 'Tue' }, baz => { day => 'Wed' } ); # Loop over nodes for my $node (@nodes) { if (exists $HoH{$node}) { print "Node $node found in HoH and deleted\n"; # Do more processing here delete $HoH{$node}; } else { print "Node $node not found in HoH\n"; } } # Loop over remaining unmatched keys for my $key (keys %HoH) { print "Key $key with payload $HoH{$key}->{day} not matched, so add +ed\n"; }

      🦛

        Thank you for your inputs. I am now able to add node id, lat,long values in the XML file but I cannot create a new Child tag inside the new node tag to store the tag name and value.

        Below is my input XML(osm file)
        <?xml version='1.0' encoding='UTF-8'?> <osm version='0.6' upload='false' generator='JOSM'> <bounds minlat='53.560255' minlon='8.5117942' maxlat='53.5974042' ma +xlon='8.6024315' origin='CGImap 0.6.1 (11651 thorn-03.openstreetmap.o +rg)' /> <node id='-1643794' lat='53.58464144' lon='8.560693391'> <tag k='name' v='ASC_A_KEI_001' /> </node> </osm>
        My Output XML File is
        <?xml version="1.0" encoding="UTF-8"?> <osm version="0.6" upload="false" generator="JOSM"> <bounds minlat="53.560255" minlon="8.5117942" maxlat="53.5974042" ma +xlon="8.6024315" origin="CGImap 0.6.1 (11651 thorn-03.openstreetmap.o +rg)"/> <node id="-1643795" lat="53.58464144" lon="8.560693391"> <tag k="name" v="ASC_A_KEI_008"/> </node> <node id="-1643796" lat="60.584" lon="10.560"/><node id="-1643795" lat +="59.584" lon="9.560"/> </osm>
        Here is my updated code
        my $dom = 'XML::LibXML'->load_xml(location => 'test.osm'); for my $node ($dom->findnodes('/osm/node')) { my $tag = $node->findnodes('tag[@k="name"]')->[0]; my $key = "$node->{lat}.$node->{lon}"; if (exists $HoH{$key}){ $tag->{v} = $HoH{$key}->{name}; $node->{id} = $HoH{$key}->{id}; delete $HoH{$key}; } } for my $key (keys %HoH){ for my $node ($dom->findnodes('/osm')){ my @lat_lon_key = (split /[.]/, $key); my $lat_key = join('.', @lat_lon_key[0,1]); my $lon_key = join('.', @lat_lon_key[2,3]); my $elm = $dom->createElement('node'); $elm->setAttribute('id', $HoH{$key}->{id}); $elm->setAttribute('lat', $lat_key); $elm->setAttribute('lon', $lon_key); $node->addChild($elm); } } $dom->toFile('new_test.osm');

        I have been trying to add child tags to my new node tags but I am getting error "XML::LibXML::Node::addChild() -- nNode is not a blessed SV reference".

        I have tried installing the library as well but still getting error. How can I add child tags as well as align my new tags as per input? You can see that the new tags are added side by side and there is no line break. Thank you.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11126816]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (6)
As of 2024-04-18 00:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found