http://qs321.pair.com?node_id=247056


in reply to Zipcode Proximity script

First of all, what everyone else has said about trimming your data set first still applies: for a given longitude and latitude, you should be able to come up with boundaries that are definitely inside your search radius and boundaries that are definitely outside it.

That being said, you can speed up the comparison itself, even in the C code. I assume that your main loop does something like:
    while (read_from_file(&newzip, &newlat, &newlon)) {
        mydist = great_circle_distance(newlat, newlon, origlat, origlon);
        if (mydist < 25) {
            // do stuff to add it to file
        }
        if (mydist < 50) {
            ...
        }
        ...
    }
Instead, replace it with this:
    /* EARTH_RADIUS must be in the same units as the 25/50/75 radii */
    float dist1cosine = cos(25.0 / EARTH_RADIUS);
    float dist2cosine = cos(50.0 / EARTH_RADIUS);
    float dist3cosine = cos(75.0 / EARTH_RADIUS);

    while (read_from_file(&newzip, &newlat, &newlon)) {
        mydistcos = great_circle_distance_cosine(newlat, newlon, origlat, origlon);
        if (mydistcos >= dist1cosine) {
            // do stuff to add it to file
        }
        if (mydistcos >= dist2cosine) {
            ...
        }
        ...
    }
where you've got:
    /* All arguments are in radians. */
    static inline float great_circle_distance_cosine(float lat1, float long1,
                                                     float lat2, float long2)
    {
        float delta_long, temp1, temp2, delta_lat;

        delta_lat = lat2 - lat1;
        delta_long = long2 - long1;

        temp1 = sin(lat1) * sin(lat2);
        temp2 = cos(delta_lat);     /* cos(lat1) cos(lat2) + sin(lat1) sin(lat2) */
        temp2 -= temp1;             /* now just cos(lat1) cos(lat2) */
        temp2 *= cos(delta_long);

        /* result is cos(lat1) cos(lat2) cos(lon1 - lon2) + sin(lat1) sin(lat2) */
        return (temp1 + temp2);
    }
The point of this exercise is to avoid the atan2 and sqrt calls, which the original code uses to take the arccos of sqrt(temp). Over the range you're dealing with, arccos is strictly decreasing (hence the switch from "<" to ">=" in the comparison), which means you don't need to take arccos every time in order to compare a distance against your thresholds: comparing the cosines directly gives the same answer.

Actually, I think that you could translate this algorithm into perl and, even without applying any pre-distance filtering, get better results than your current program.

Edit by tye, remove PRE tags surrounding CODE tags