Perl newbie here trying to make maps. I've written a script to extract the US census block information from the TIGER Line Shapefiles (ESRI format) into a plain tab-delimited spreadsheet. I'm on macOS Sierra. Specifically, I used Jason Kohles' Geo::ShapeFile module to read the shapefile, and copy each Block ID and its associated polygon (coordinates) into a two-column, tab-delimited text file.
Here is the code:
#!/usr/bin/perl
use strict;
use warnings;
use Math::Round;
use Geo::ShapeFile;
use Geo::ShapeFile::Shape;
use Geo::ShapeFile::Point;
# Declare variables
my $blockid;
my $bcount;
my $totalblocks;
my $progress;
my $shapefile; # used for loading the TIGER shapefile
my %attr; # used to read the dbf file
my $polygon; # used to read the shp file
my @points; # used to read the shape into an array of points
my $coor; # used to read the coordinates of each point
my $x;
my $y;
my $pcount; # counter for the coordinates being written into output
my $totalpoints; # total number of coordinates in each polygon
# Variables for time keeping
my $start = time;
my $duration;
# Prepare the TIGER spreadsheet.
open (TIGER, ">tiger.txt") or die "Failed to open";
print TIGER "BLOCKID\tPolygon\n"; # make a header
# Load the TIGER shapefile
$shapefile = 0;
$shapefile = Geo::ShapeFile->new ("tabblock2010_42_pophu"); # use the
+TIGER census block special release (population & housing units only)
$totalblocks = $shapefile->shapes();
for (1 .. $totalblocks) {
%attr = $shapefile->get_dbf_record($_);
$blockid = $attr{BLOCKID10};
$bcount++;
print TIGER "$blockid\t\"geometry\":{\"type\":\"Polygon\",\"coordi
+nates\":[["; # The polygon column is in the GEOJSON format
$polygon = $shapefile->get_shp_record($_);
@points = $polygon->points();
$pcount = 0;
$totalpoints = scalar (@points);
foreach $coor (@points) {
$x = $coor->get_x();
$y = $coor->get_y();
print TIGER "[$x,$y]";
$pcount++;
if ($pcount != $totalpoints) { # All but the last coordinate s
+hould be followed by a comma
print TIGER ",";
}
}
print TIGER "]]}\n"; # finish the entry
print "Done $bcount out of $totalblocks.\n";
%attr = ();
$blockid = 0;
$polygon = 0;
@points = ();
$totalpoints = 0;
}
close (TIGER);
$shapefile = 0;
select()->autoflush(1);
# Show stopwatch
$duration = time - $start;
$duration = round ($duration / 60);
print "\n\nThis script took $duration minutes\n\n";
exit;
This has been working well for smaller states like Connecticut. But when I run this for a large shapefile, the program accumulates a huge amount of memory, and cannot go to the end of the script. I have a piece at the end of the script that prints a stopwatch (execution time). For example, Pennsylvania has 421,545 census blocks, the script will take about 7 minutes to run, and it will display:
...
Done 421543 out of 421545.
Done 421544 out of 421545.
Done 421545 out of 421545.
And it will get stuck there, never finishing and never displaying the stopwatch. And in my Activity Monitor will say perl is taking 7 GB of memory. But my output file is perfectly fine. Geo::ShapeFile doesn't have a close file command, so I suspect there's some problem with the ShapeFile remaining open and taking up all of the memory.