PerlMonks  

Looking for suggestions, sorting data from XML or JSON

by johnfl68 (Scribe)
on Jul 06, 2013 at 05:26 UTC (id://1042854)

johnfl68 has asked for the wisdom of the Perl Monks concerning the following question:

Hello:

I need to rewrite some code, and probably almost a complete rewrite at that.

I am getting arrival and departure data for an airport, currently in XML, but I can also get it as JSON. The feed contains other information besides the flightStatus entries (see example):

https://www.dropbox.com/s/w50uliyoxug4r5m/depart.xml.txt

Part of the problem is that the data is currently being sorted by XML::Simple by the 3-letter airport code, and I really need things sorted by arrival/departure times.

I admit I am not the best when it comes to working with hashes and arrays, but willing to try and learn if pointed in the right direction.

As I am on a shared server, it is often a bit of work getting things added, so I am working with core modules, plus I have XML::Simple and JSON::XS as well. The only reason I bring this up, is often people here offer up all these other modules that are not part of the core. That is fine if you are on your own server and can easily add what you want. It's not quite that easy on a shared server unfortunately.

Everyone seems to be going the JSON route these days, so I will probably rewrite for JSON unless there is a reason to stay with XML.

If I somewhat understand correctly, the best option would be to load the hash data from the XML/JSON into an Array, which will make sorting easier? Is this the best route to go?

Does anyone know of a good place to start with examples of how to best go about loading things into an array, and then sorting, if that is the best way?

Again, thank you for your help and guidance as always.

John


Replies are listed 'Best First'.
Re: Looking for suggestions, sorting data from XML or JSON
by davido (Cardinal) on Jul 06, 2013 at 05:53 UTC

    Can you give a usable example of what the JSON data would look like?

    A module like JSON::XS is going to return a data-structure. Before we can be more specific than the POD from perldsc on how to get at the data in the structure, we would need to know how that structure is ...um... structured.
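    As a minimal sketch of what "returns a data structure" means in practice: the sample JSON below is hypothetical (a guess at a layout similar to the XML, not the real feed), and JSON::PP is used only because it is a core module since Perl 5.14; JSON::XS exports the same decode_json function. ref() shows what each level is, and ISO 8601 UTC timestamps can be sorted as plain strings.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; JSON::XS offers the same decode_json

# Hypothetical sample data, NOT the real feed: only the field names
# flightStatuses, flightId and departureDate/dateUtc are borrowed
# from the thread, the layout is an assumption.
my $json_text = <<'JSON';
{
  "flightStatuses": [
    { "flightId": 2, "departureDate": { "dateUtc": "2013-07-06T09:30:00.000Z" } },
    { "flightId": 1, "departureDate": { "dateUtc": "2013-07-06T07:15:00.000Z" } }
  ]
}
JSON

my $data = decode_json($json_text);

# ref() reports what kind of reference each level holds
my $top_type  = ref $data;                      # hash reference
my $list_type = ref $data->{flightStatuses};    # array reference
my @top_keys  = sort keys %$data;

print "top level: $top_type with keys @top_keys\n";
print "flightStatuses: $list_type with ",
    scalar @{ $data->{flightStatuses} }, " elements\n";

# Timestamps in this fixed format order correctly with a string compare
my @by_departure = sort {
    $a->{departureDate}{dateUtc} cmp $b->{departureDate}{dateUtc}
} @{ $data->{flightStatuses} };

print "$_->{flightId}: $_->{departureDate}{dateUtc}\n" for @by_departure;
```

    Once the real structure is known, only the hash keys inside the sort block should need to change.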


    Dave

Re: Looking for suggestions, sorting data from XML or JSON
by martell (Hermit) on Jul 06, 2013 at 12:36 UTC

    Hello

    JSON or XML? It depends... For me, XML has the preference in formal communication with external parties, because you can document it more verbosely and use an XSD to prove that messages conform to the agreements between the parties. JSON is denser and easier for a computer to transform to and from objects, and is used more as a transport mechanism between server and client (think JavaScript in browsers querying the back end).

    For the problem at hand, find below a quick and dirty solution using XML::Simple, Time::Piece (a core module), and some sorting and mapping along the lines of the Camel book. I can really recommend this book.

    Not fully tested, and I didn't look at performance! Also be careful with the datetime conversions; check that they are appropriate for your situation. Data::Dumper is there for testing purposes.

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;
    use Time::Piece;

    my $xml = XMLin(
        'depart.xml',
        ForceArray => ['flightStatus'],    # To force the flightstatuses into an arrayref
        GroupTags  => {                    # To eliminate unnecessary levels in the structure
            airlines       => 'airline',
            airports       => 'airport',
            equipments     => 'equipment',
            flightStatuses => 'flightStatus',
        },
    );

    # just to illustrate that you can easily extract the data
    my $airports       = $xml->{appendix}{airports};
    my $equipments     = $xml->{appendix}{equipments};
    my $airlines       = $xml->{appendix}{airlines};
    my @flightstatuses = @{ $xml->{flightStatuses} };    # make it a normal array from flightstatuses

    print "Airports\n";
    print "--------\n";
    print Dumper $airports;
    print "Airlines\n";
    print "--------\n";
    print Dumper $airlines;
    print "Equipments\n";
    print "----------\n";
    print Dumper $equipments;
    print "Flightstatuses\n";
    print "----------\n";
    print Dumper(\@flightstatuses);

    # examples of obtaining an element of a flightstatus that I use in the sorting
    # take the first element
    my $flightstatus = $flightstatuses[0];

    # obtain departure and arrival times
    my $departuretime = $flightstatus->{departureDate}{dateUtc};
    my $arrivaltime   = $flightstatus->{arrivalDate}{dateUtc};
    print $departuretime, "\n";
    print $arrivaltime,   "\n";

    # printed after conversion (remark: didn't take the '000Z' in my conversion)
    print Time::Piece->strptime($departuretime, '%Y-%m-%dT%H:%M:%S.000Z'), "\n";
    print Time::Piece->strptime($arrivaltime,   '%Y-%m-%dT%H:%M:%S.000Z'), "\n";

    # transform the times into epoch for the sort function (strictly, this is not
    # needed, because Time::Piece objects overload the comparison operators)
    my $departuretime_epoch = Time::Piece->strptime($departuretime, '%Y-%m-%dT%H:%M:%S.000Z')->epoch;
    my $arrivaltime_epoch   = Time::Piece->strptime($arrivaltime,   '%Y-%m-%dT%H:%M:%S.000Z')->epoch;
    print $departuretime_epoch, "\n";
    print $arrivaltime_epoch,   "\n";

    # the classic map/sort/map example from the Camel book (also known as the
    # Schwartzian transform), using the mappings shown above
    my @sorted_by_arrival =
        map  { $_->[1] }                 # map back to original array
        sort { $a->[0] <=> $b->[0] }     # numeric compare of the two epoch values
        map  { [ Time::Piece->strptime($_->{arrivalDate}{dateUtc}, '%Y-%m-%dT%H:%M:%S.000Z')->epoch, $_ ] }
                                         # make an intermediate array with the sort value in first place
        @flightstatuses;

    my @sorted_by_departure =
        map  { $_->[1] }                 # map back to original array
        sort { $a->[0] <=> $b->[0] }     # numeric compare of the two epoch values
        map  { [ Time::Piece->strptime($_->{departureDate}{dateUtc}, '%Y-%m-%dT%H:%M:%S.000Z')->epoch, $_ ] }
                                         # make an intermediate array with the sort value in first place
        @flightstatuses;

    # mapping to be able to print the results
    my @sorted_by_arrival_id   = map { $_->{'flightId'} } @sorted_by_arrival;
    my @sorted_by_departure_id = map { $_->{'flightId'} } @sorted_by_departure;

    print join " ", @sorted_by_arrival_id,   "\n";
    print join " ", @sorted_by_departure_id, "\n";

    Hope this helps.

    Martell

Re: Looking for suggestions, sorting data from XML or JSON
by hdb (Monsignor) on Jul 06, 2013 at 08:15 UTC

    What I have not understood is: do you want to create a sorted XML file, or do you want to read the XML file and sort the data? For the latter, have a look at this:

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    my $xml = XMLin('depart.xml');
    print Dumper $xml;

    print "First level of data\n";
    for my $level1 (keys %$xml) {
        print "$level1\n";
    }

    # assume you want flightStatuses
    print "Second level of data\n";
    for my $level2 (keys %{ $xml->{flightStatuses} }) {
        print "$level2\n";
    }

    # only has flightStatus which is an array ref containing hashes, hint '=> ['
    print "Unsorted departure times\n";
    for my $level3 (@{ $xml->{flightStatuses}->{flightStatus} }) {
        print $level3->{carrierFsCode};
        print $level3->{flightNumber}, ": ";
        print $level3->{operationalTimes}->{scheduledGateDeparture}->{dateUtc}, "\n";
    }

    # now that we have the data we can sort
    print "Sorted departure times\n";
    for my $level3 (
        sort {
            $a->{operationalTimes}->{scheduledGateDeparture}->{dateUtc}
                cmp
            $b->{operationalTimes}->{scheduledGateDeparture}->{dateUtc}
        } @{ $xml->{flightStatuses}->{flightStatus} }
    ) {
        print $level3->{carrierFsCode};
        print $level3->{flightNumber}, ": ";
        print $level3->{operationalTimes}->{scheduledGateDeparture}->{dateUtc}, "\n";
    }

    This is only an illustration of how to deal with complex structures. First use Data::Dumper to see what the structure looks like and whether you have hashes or arrays, then untangle it level by level, adding the sorting only at the end. HTH.
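    If Data::Dumper shows a hash keyed by something like the airport code where a list was expected, a rough sketch of one way to get a sortable list is below. The hash and its field names (flight, dateUtc) are hypothetical, made up for illustration. A hash has no useful order, so the records are copied into an array (keeping the key as a field) and that array is sorted. With XML::Simple, passing KeyAttr => [] to XMLin should stop lists from being folded into hashes in the first place.

```perl
use strict;
use warnings;

# Hypothetical structure: records folded into a hash keyed by airport code
my %flights_by_airport = (
    ORD => { flight => 'AA100', dateUtc => '2013-07-06T12:00:00.000Z' },
    ATL => { flight => 'DL200', dateUtc => '2013-07-06T08:45:00.000Z' },
    LAX => { flight => 'UA300', dateUtc => '2013-07-06T10:20:00.000Z' },
);

# Copy each record into a new hashref and keep the old key as a field.
# The leading + tells Perl the inner braces build a hash reference.
my @records = map {
    +{ %{ $flights_by_airport{$_} }, airport => $_ }
} keys %flights_by_airport;

# Sort the array by time; the fixed-width UTC strings compare correctly with cmp
my @sorted = sort { $a->{dateUtc} cmp $b->{dateUtc} } @records;

print "$_->{dateUtc} $_->{airport} $_->{flight}\n" for @sorted;
```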

Re: Looking for suggestions, sorting data from XML or JSON
by Anonymous Monk on Jul 06, 2013 at 07:21 UTC
Re: Looking for suggestions, sorting data from XML or JSON
by johnfl68 (Scribe) on Jul 07, 2013 at 04:44 UTC

    Thank you all. I will have to look over all the replies and suggestions to determine the best way to proceed.

    Davido:

    Here is a JSON example:

    https://www.dropbox.com/s/5vmxxvqi9n9l1tl/depart.json.txt


    Anonymous Monk:

    I try to use "pastebin" links minimally, but in some cases it is the easiest way to share information like this, when it is a large amount of text to view.


    Everyone else:

    Thank you all for your Wisdom!


    John

Re: Looking for suggestions, sorting data from XML or JSON
by sundialsvc4 (Abbot) on Jul 06, 2013 at 19:08 UTC

    “XML vs. JSON” is entirely up to you ... generally, I prefer to make as few “disruptive changes” as possible.   Furthermore, in both cases, you wind up at the same place:   with (I presume ...) a Perl array of Perl hashrefs.   The question before the court has to do with sorting, not with how the structure-to-be-sorted came to be.

    As you will see from perldoc -f sort, the sort function takes as one of its parameters a code block (or a subroutine name or reference) which does the actual comparison. That block returns a value less than, equal to, or greater than zero, and Perl provides two operators, <=> and cmp, to do that for numeric and string values, respectively. When sorting by more than one field, the || operator is used, as the perldoc specifically shows. Therefore, this is how you sort your data ... no matter how intricate its structure is, and no matter how the Perl data structure that you are sorting came to be.
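    A small sketch of such a comparison block, sorting an array of hashrefs on several fields: the records and field names (carrier, number, dateUtc) are hypothetical and only stand in for whatever the real structure holds. cmp compares the strings, <=> compares the numbers, and || moves on to the next field only when the previous comparison returned 0.

```perl
use strict;
use warnings;

# Hypothetical records, for illustration only
my @flights = (
    { carrier => 'UA', number => 300, dateUtc => '2013-07-06T10:20:00.000Z' },
    { carrier => 'AA', number => 100, dateUtc => '2013-07-06T08:45:00.000Z' },
    { carrier => 'DL', number => 20,  dateUtc => '2013-07-06T10:20:00.000Z' },
    { carrier => 'DL', number => 200, dateUtc => '2013-07-06T10:20:00.000Z' },
);

# Primary key: time (string compare), then carrier (string), then
# flight number (numeric, so that 20 sorts before 200)
my @sorted = sort {
       $a->{dateUtc} cmp $b->{dateUtc}
    || $a->{carrier} cmp $b->{carrier}
    || $a->{number}  <=> $b->{number}
} @flights;

print "$_->{dateUtc} $_->{carrier}$_->{number}\n" for @sorted;
```

    Swapping $a and $b in any one line reverses the order for that field only.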

    In the case of XML data, sorting can also be requested using an XPath expression, but not, IIRC, with XML::Simple.
