johnfl68 has asked for the wisdom of the Perl Monks concerning the following question:
Hello:
I need to rewrite some code, and probably almost a complete rewrite at that.
I am getting Arrival and Departure data for an airport, currently in XML, but I can also do JSON. This contains other information besides the flightStatus entries (see example):
https://www.dropbox.com/s/w50uliyoxug4r5m/depart.xml.txt
Part of the problem is that the data currently comes out of XML::Simple sorted by the 3 letter Airport code, and I really need things sorted by Arrival/Departure times.
I admit I am not the best when it comes to working with hashes and arrays, but willing to try and learn if pointed in the right direction.
As I am on a shared server, it is often a bit of work getting things added, so I am working with core modules, plus I have XML::Simple and JSON::XS as well. The only reason I bring this up, is often people here offer up all these other modules that are not part of the core. That is fine if you are on your own server and can easily add what you want. It's not quite that easy on a shared server unfortunately.
Everyone seems to be going the JSON route these days, so I will probably rewrite in JSON unless there is a reason to stay with XML.
If I somewhat understand correctly, the best option would be to load the hash data from the XML/JSON into an Array, which will make sorting easier? Is this the best route to go?
Does anyone know of a good place to start with examples of how to best go about loading things into an array, and then sorting, if that is the best way?
Again, thank you for your help and guidance as always.
John
Re: Looking for suggestions, sorting data from XML or JSON
by davido (Cardinal) on Jul 06, 2013 at 05:53 UTC
Can you give a usable example of what the JSON data would look like?
A module like JSON::XS is going to return a data-structure. Before we can be more specific than the POD from perldsc on how to get at the data in the structure, we would need to know how that structure is ...um... structured.
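To illustrate what "returns a data-structure" means in practice, here is a minimal sketch. The JSON sample is a hypothetical, trimmed-down stand-in built from field names mentioned in this thread, not the real feed, and JSON::PP (core since Perl 5.14) is used only so the example runs anywhere; JSON::XS exports the same decode_json function.

```perl
use strict;
use warnings;
use JSON::PP;        # core module; JSON::XS offers the same decode_json
use Data::Dumper;

# Hypothetical sample; the real feed is assumed to contain many more fields.
my $json_text = <<'JSON';
{
  "flightStatuses": [
    { "flightId": 2, "carrierFsCode": "AA", "flightNumber": "100",
      "departureDate": { "dateUtc": "2013-07-06T14:30:00.000Z" } },
    { "flightId": 1, "carrierFsCode": "DL", "flightNumber": "200",
      "departureDate": { "dateUtc": "2013-07-06T09:15:00.000Z" } }
  ]
}
JSON

# A JSON object becomes a hashref, a JSON array becomes an arrayref.
my $data = decode_json($json_text);

# The list of flights is already an array, so it can be sorted directly.
my @flights = @{ $data->{flightStatuses} };

print ref($data), "\n";                 # kind of reference at the top level
print scalar(@flights), " flights\n";   # number of records found
print Dumper($flights[0]);              # inspect one record before going further
```

Dumping a single record first shows which keys hold the times you want to sort on.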
Re: Looking for suggestions, sorting data from XML or JSON
by martell (Hermit) on Jul 06, 2013 at 12:36 UTC
Hello
JSON or XML? It depends. For formal communication with external parties I prefer XML, because you can document it more verbosely and use an XSD to verify that messages conform to the agreements between the parties. JSON is denser and easier for a computer to transform to and from objects, and is more often used as a transport between server and client (think JavaScript in browsers querying the back end).
For the problem at hand, find below a quick and dirty solution using XML::Simple, Time::Piece (a core module), and some sorting and mapping along the lines of the Camel book. I can really recommend this book.
Not tested in full, and I didn't look at performance! Also be careful with the datetime conversions; check that they are appropriate for your situation. Data::Dumper is there for testing purposes.
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
use Time::Piece;
my $xml = XMLin( 'depart.xml',
    ForceArray => ['flightStatus'],   # force the flightStatuses into an arrayref
    GroupTags  => {                   # eliminate unnecessary levels in the structure
        airlines       => 'airline',
        airports       => 'airport',
        equipments     => 'equipment',
        flightStatuses => 'flightStatus'});
# just to illustrate that you can easily extract the data
my $airports = $xml->{appendix}{airports};
my $equipments = $xml->{appendix}{equipments};
my $airlines = $xml->{appendix}{airlines};
my @flightstatuses = @{$xml->{flightStatuses}}; # make it a normal array from flightstatuses
print "Airports\n";
print "--------\n";
print Dumper $airports;
print "Airlines\n";
print "--------\n";
print Dumper $airlines;
print "Equipments\n";
print "----------\n";
print Dumper $equipments;
print "Flightstatuses\n";
print "----------\n";
print Dumper (\@flightstatuses);
# examples of obtaining an element of a flightstatus that I use in the sorting
# take the first element
my $flightstatus = $flightstatuses[0];
# obtain departure and arrival times
my $departuretime = $flightstatus->{departureDate}{dateUtc};
my $arrivaltime = $flightstatus->{arrivalDate}{dateUtc};
print $departuretime, "\n";
print $arrivaltime, "\n";
# printed after conversion (remark: the '.000Z' suffix is matched as literal text, not converted)
print Time::Piece->strptime($departuretime, '%Y-%m-%dT%H:%M:%S.000Z'), "\n";
print Time::Piece->strptime($arrivaltime, '%Y-%m-%dT%H:%M:%S.000Z'), "\n";
# transform the times into epoch seconds for the sort function (strictly, this is
# not needed, because Time::Piece objects overload the comparison operators)
my $departuretime_epoch = Time::Piece->strptime($departuretime, '%Y-%m-%dT%H:%M:%S.000Z')->epoch;
my $arrivaltime_epoch = Time::Piece->strptime($arrivaltime, '%Y-%m-%dT%H:%M:%S.000Z')->epoch;
print $departuretime_epoch , "\n";
print $arrivaltime_epoch, "\n";
# the classic sort-with-mapping idiom (a Schwartzian transform), using the conversions shown above
my @sorted_by_arrival =
    map  { $_->[1] }                # map back to the original elements
    sort { $a->[0] <=> $b->[0] }    # numeric compare of the epoch values
    map  { [ Time::Piece->strptime($_->{arrivalDate}{dateUtc}, '%Y-%m-%dT%H:%M:%S.000Z')->epoch, $_ ] }    # intermediate array with the sort key in first place
    @flightstatuses;
my @sorted_by_departure =
    map  { $_->[1] }                # map back to the original elements
    sort { $a->[0] <=> $b->[0] }    # numeric compare of the epoch values
    map  { [ Time::Piece->strptime($_->{departureDate}{dateUtc}, '%Y-%m-%dT%H:%M:%S.000Z')->epoch, $_ ] }  # intermediate array with the sort key in first place
    @flightstatuses;
# mapping to be able to print the results
my @sorted_by_arrival_id = map {$_->{'flightId'}} @sorted_by_arrival;
my @sorted_by_departure_id = map {$_->{'flightId'}} @sorted_by_departure;
# parenthesize join so the newline is not joined in as a list element
print join(" ", @sorted_by_arrival_id), "\n";
print join(" ", @sorted_by_departure_id), "\n";
Hope this helps
Martell
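For readers without the XML file at hand, the decorate-sort-undecorate pattern used above can be tried in isolation. This is only a sketch: the three records are hypothetical and carry just the fields the sort needs, and it assumes every timestamp ends in the literal '.000Z' as in the format string used in this post.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical records shaped like the flightStatus hashes in this thread.
my @flights = (
    { flightId => 'C', arrivalDate => { dateUtc => '2013-07-06T18:05:00.000Z' } },
    { flightId => 'A', arrivalDate => { dateUtc => '2013-07-05T23:50:00.000Z' } },
    { flightId => 'B', arrivalDate => { dateUtc => '2013-07-06T07:20:00.000Z' } },
);

# '.000Z' is matched as literal text; it is not parsed as milliseconds or a zone.
my $format = '%Y-%m-%dT%H:%M:%S.000Z';

my @sorted =
    map  { $_->[1] }                 # 3. undecorate: keep only the record
    sort { $a->[0] <=> $b->[0] }     # 2. numeric sort on the epoch key
    map  { [ Time::Piece->strptime($_->{arrivalDate}{dateUtc}, $format)->epoch, $_ ] }  # 1. decorate: [key, record]
    @flights;

my @ids = map { $_->{flightId} } @sorted;
print "@ids\n";
```

The point of the pattern is that strptime runs once per record rather than once per comparison, which matters more as the list grows.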
Re: Looking for suggestions, sorting data from XML or JSON
by hdb (Monsignor) on Jul 06, 2013 at 08:15 UTC
What I have not understood is: do you want to create a sorted XML file or do you want to read the xml file and sort the data? For the latter, have a look at this:
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
my $xml = XMLin( 'depart.xml');
print Dumper $xml;
print "First level of data\n";
for my $level1 (keys %$xml) {
print "$level1\n";
}
# assume you want flightStatuses
print "Second level of data\n";
for my $level2 (keys %{ $xml->{flightStatuses} } ) {
print "$level2\n";
}
# only has flightStatus which is an array ref containing hashes, hint '=> ['
print "Unsorted departure times\n";
for my $level3 ( @{ $xml->{flightStatuses}->{flightStatus} } ) {
print $level3->{carrierFsCode};
print $level3->{flightNumber},": ";
print $level3->{operationalTimes}->{scheduledGateDeparture}->{dateUtc},"\n";
}
# now that we have the data we can sort
print "Sorted departure times\n";
for my $level3 ( sort
    { $a->{operationalTimes}->{scheduledGateDeparture}->{dateUtc}
      cmp
      $b->{operationalTimes}->{scheduledGateDeparture}->{dateUtc} }
    @{ $xml->{flightStatuses}->{flightStatus} } ) {
print $level3->{carrierFsCode};
print $level3->{flightNumber},": ";
print $level3->{operationalTimes}->{scheduledGateDeparture}->{dateUtc},"\n";
}
}
This is only an illustration of how to deal with complex structures. First you use Data::Dumper to see what the structure looks like and whether you have hashes or arrays, then you untangle it level by level, adding sorting only at the end. HTH.
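The sort above compares the dateUtc strings with cmp and never parses them. That works because timestamps written as year-month-day-hour-minute-second, all in the same format and all in UTC, sort chronologically when sorted as plain strings. The sketch below shows this on hypothetical records; the departure_key helper and the handling of a record with no scheduled departure are assumptions for illustration, not something the sample data is known to need.

```perl
use strict;
use warnings;

# Hypothetical records; the last one has no scheduledGateDeparture at all.
my @flights = (
    { flightNumber => 300, operationalTimes => { scheduledGateDeparture => { dateUtc => '2013-07-06T18:05:00.000Z' } } },
    { flightNumber => 100, operationalTimes => { scheduledGateDeparture => { dateUtc => '2013-07-05T23:50:00.000Z' } } },
    { flightNumber => 200, operationalTimes => { scheduledGateDeparture => { dateUtc => '2013-07-06T07:20:00.000Z' } } },
    { flightNumber => 400, operationalTimes => {} },
);

# Hypothetical helper: return the sort key, or '' when the time is missing,
# so that sorting under "use warnings" does not warn about undef values.
sub departure_key {
    my ($flight) = @_;
    return $flight->{operationalTimes}{scheduledGateDeparture}{dateUtc} // '';
}

# Plain string comparison; records without a time sort first.
my @sorted = sort { departure_key($a) cmp departure_key($b) } @flights;

my @numbers = map { $_->{flightNumber} } @sorted;
print "@numbers\n";
```

If the feed ever mixes time zones or formats, string comparison is no longer safe and the times need converting to a common form first.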
Re: Looking for suggestions, sorting data from XML or JSON
by Anonymous Monk on Jul 06, 2013 at 07:21 UTC
"I admit I am not the best when it comes to working with hashes and arrays, but willing to try and learn if pointed in the right direction."
One interactive example of sorting an array by any field: Re: Need help to guide how to create client sort table ( mojo.template.array.autoindex.sortable.pl )
"dropbox"
No need for a pastebin at perlmonks; see Writeup Formatting Tips
"That is fine if you are on your own server and can easily add what you want. It's not quite that easy on a shared server unfortunately."
Oh. Yeah, people always bring that up, and it just ain't true :) see Yes, even you can use CPAN, Top 11 (GOOD) reasons not to use someone else's Modules, Top Seven (Bad) Reasons Not To Use Modules, Surreptitiously adding modules to GoDaddy basic Linux account?, http://www.cavapackager.com/..., Installing modules without root and shell
Use XML::Twig :) it comes with many examples/tutorials
or use XML::Rules, which like XML::Twig only needs XML::Parser which you probably might already have
or use Mojo::DOM , it doesn't even need XML::Parser
see Re^2: XML:: Twig - can you check for text following the element being handled?, Re^3: parse xml and store data in array of hashesh, Re: How to return two and more values by parsing XML with XML::Rules?, Re^3: change setHandlers XML::Parser , Re: The best module for handling xml for examples, walkthroughs, follow my links and the links they link, like these Re: How to grab a portion of file with regex (don't)(parsing html/xml with xpath/twig/dom, because html::parser is low level), Re: How to grab a portion of file with regex (parsing html/xml with xpath/twig/dom, because xml::parser is low level), Re^4: How to grab a portion of file with regex (parsing html/xml with xpath/twig/dom, because ::parser is low level)
"Does anyone know of a good place to start with examples of how to best go about loading things into an array, and then sorting, if that is the best way?"
perlfaq -> perlfaq4#How do I sort a hash (optionally by value instead of key)?, How do I sort an array by (anything)?
DBI with DBM::Deep or DBD::CSV
"Everyone seems to be going the JSON route these days, so probably rewrite in JSON unless there is a reason to go with XML I guess."
Um, if it's not broke don't fix it :)
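As a rough illustration of the perlfaq4 entries mentioned above, applied to the original question (data keyed by airport code, wanted in time order): the hash below is hypothetical, and the flat dateUtc field is a simplification of the real nested structure.

```perl
use strict;
use warnings;

# Hypothetical structure: one record per 3-letter airport code.
my %by_airport = (
    ORD => { airport => 'ORD', dateUtc => '2013-07-06T12:00:00.000Z' },
    ATL => { airport => 'ATL', dateUtc => '2013-07-06T15:30:00.000Z' },
    LAX => { airport => 'LAX', dateUtc => '2013-07-06T06:45:00.000Z' },
);

# Option 1: keep the hash and sort its keys by a value inside each record.
my @codes_by_time =
    sort { $by_airport{$a}{dateUtc} cmp $by_airport{$b}{dateUtc} }
    keys %by_airport;

# Option 2: "load the hash into an array" by taking its values,
# then sort that array of hashrefs.
my @records = sort { $a->{dateUtc} cmp $b->{dateUtc} } values %by_airport;

print "@codes_by_time\n";
print join(" ", map { $_->{airport} } @records), "\n";
```

Both options give the same order; the second is usually more convenient when the key itself is not needed afterwards.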
Re: Looking for suggestions, sorting data from XML or JSON
by johnfl68 (Scribe) on Jul 07, 2013 at 04:44 UTC
Thank you all. I will have to look over all the replies and suggestions to determine the best way to proceed.
Davido:
Here is a JSON example:
https://www.dropbox.com/s/5vmxxvqi9n9l1tl/depart.json.txt
Anonymous Monk:
I try to use "pastebin" links minimally, but in some cases it is the easiest way to share information like this, when it is a large amount of text to view.
Everyone else:
Thank you all for your Wisdom!
John
Re: Looking for suggestions, sorting data from XML or JSON
by sundialsvc4 (Abbot) on Jul 06, 2013 at 19:08 UTC
“XML vs. JSON” is entirely up to you ... generally, I prefer to make as few “disruptive changes” as possible. Furthermore, in both cases, you wind up at the same place: with (I presume ...) a Perl array of Perl hashrefs. The question before the court has to do with sorting, not with how the structure-to-be-sorted came to be.
As you will see from perldoc -f sort, the sort function takes as one of its parameters a code-block (or a subroutine reference) which does the actual comparison. It returns a value less-than, equal-to, or greater-than zero, and Perl provides two operators, <=> and cmp, to do that, for numeric and string values, respectively. When sorting by more than one field, the || operator is used, as the perldoc specifically shows. Therefore, this is how you sort your data ... no matter how intricate its structure is, and no matter how the Perl data structure that you are sorting came to be.
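A small sketch of such a comparison block, chaining several fields with ||. The records are hypothetical and flattened for brevity (the real dateUtc sits deeper in the structure); cmp is used for the string fields and <=> for the numeric flight number.

```perl
use strict;
use warnings;

# Hypothetical, flattened records.
my @flights = (
    { carrierFsCode => 'DL', flightNumber => 200,  dateUtc => '2013-07-06T09:15:00.000Z' },
    { carrierFsCode => 'AA', flightNumber => 1200, dateUtc => '2013-07-06T09:15:00.000Z' },
    { carrierFsCode => 'AA', flightNumber => 75,   dateUtc => '2013-07-06T09:15:00.000Z' },
    { carrierFsCode => 'UA', flightNumber => 12,   dateUtc => '2013-07-06T08:00:00.000Z' },
);

# Each comparison returns 0 on a tie, so || falls through to the next field.
my @sorted = sort {
       $a->{dateUtc}       cmp $b->{dateUtc}          # first by time (string)
    || $a->{carrierFsCode} cmp $b->{carrierFsCode}    # then by carrier (string)
    || $a->{flightNumber}  <=> $b->{flightNumber}     # then by number (numeric)
} @flights;

my @labels = map { "$_->{carrierFsCode}$_->{flightNumber}" } @sorted;
print "@labels\n";
```

Using <=> for the flight number matters here: with cmp, 1200 would sort before 75 because the strings are compared character by character.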
In the case of XML data, sorting can also be requested using an XPath expression, but not, IIRC, with XML::Simple.