PerlMonks  

Re^3: Extracting a (UK) Address

by gone2015 (Deacon)
on Jan 02, 2009 at 12:06 UTC


in reply to Re^2: Extracting a (UK) Address
in thread Extracting a (UK) Address

So you are looking for three or more lines together, the last ending in something that looks like a post code...

$letter =~ m/((?:[^\n]+\n){2,}[^\n]*?[a-zA-Z]+[0-9]+\s+[0-9]+[a-zA-Z]+\s*?\n)\s*?\n/

...seemed to do the trick, where the entire letter was read into $letter. Obviously this will miss addresses with no post code or really rubbish post codes. You could just extract all groups of 3 or more lines, and then apply some more cunning address recogniser to the result -- perhaps from one of the modules recommended elsewhere.
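As a rough illustration of both ideas, here is a minimal sketch. The regex is the one above, unchanged; the sub names `first_address` and `candidate_blocks` and the sample letter are made up for the example.

```perl
use strict;
use warnings;

# The regex from the post, unchanged: two or more non-empty lines, then a
# line ending in something postcode-like, then a blank line.
my $ADDRESS_RE = qr/((?:[^\n]+\n){2,}[^\n]*?[a-zA-Z]+[0-9]+\s+[0-9]+[a-zA-Z]+\s*?\n)\s*?\n/;

# Return the first address-like block in $letter, or undef if there is none.
sub first_address {
    my ($letter) = @_;
    return $letter =~ $ADDRESS_RE ? $1 : undef;
}

# The looser alternative: every run of three or more non-empty lines,
# to be handed to a separate recogniser afterwards. Call in list context.
sub candidate_blocks {
    my ($letter) = @_;
    return $letter =~ m/^((?:[^\n]+\n){3,})/mg;
}

# Hypothetical sample letter, invented for this example.
my $letter = <<'END';
Dear Sir,

Please send the goods to:

Mr J Smith
12 High Street
Manchester
M1 1AE

Yours faithfully,
A Customer
END

my $address = first_address($letter);
print defined $address ? $address : "no address found\n";

my @blocks = candidate_blocks($letter);
print scalar(@blocks), " candidate block(s)\n";
```

Note that the pattern expects the digits of the first half to sit directly before the space, so a code such as "SW1A 1AA" is not matched. That is one reason the looser "all groups of 3 or more lines" pass plus a separate recogniser may be the safer route.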

(I haven't tried to figure out how much work this is asking the regex engine to do on difficult input. I'd worry about that only if it becomes a problem.)

Replies are listed 'Best First'.
Re^4: Extracting a (UK) Address
by jvector (Friar) on Jan 04, 2009 at 20:33 UTC
    The module Geo::Postcode may be of use in recognising the last line of a block as a (UK) postcode. Apparently there are a few gotchas among UK postcodes that diverge from the expected patterns.

    It may be a bit of a sledgehammer to crack a nut, since the module can also do lots of good geo stuff you don't need. From its documentation:

    "Geo::Postcode will accept full or partial UK postcodes, validate them against the official spec, separate them into their significant parts, translate them into map references or co-ordinates and calculate distances between them. It does not check whether the supplied postcode exists: only whether it is well-formed according to British Standard 7666."

    But it could still be helpful.
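
    A minimal sketch of how that check might look, assuming Geo::Postcode is installed from CPAN and that its `new` and `valid` methods behave as its documentation describes. The sub name `ends_in_postcode` is made up for the example, and it assumes the postcode sits alone on the last line. If the module is missing, the sketch falls back to a deliberately crude pattern so that it still runs.

```perl
use strict;
use warnings;

# Assumption: Geo::Postcode is available. If not, $have_geo is false and a
# crude stand-in pattern is used instead (it is not a full validator).
my $have_geo = eval { require Geo::Postcode; 1 };

# True (1) if the last non-blank line of $block looks like a well-formed
# UK postcode, false (0) otherwise.
sub ends_in_postcode {
    my ($block) = @_;
    my @lines = grep { /\S/ } split /\n/, $block;
    return 0 unless @lines;

    # Trim leading and trailing whitespace from the last line.
    (my $last = $lines[-1]) =~ s/^\s+|\s+$//g;

    if ($have_geo) {
        my $pc = Geo::Postcode->new($last);
        return (defined $pc && $pc->valid) ? 1 : 0;
    }

    # Fallback only: one or two letters, a digit, an optional digit or
    # letter, optional space, then a digit and two letters.
    return $last =~ /^[A-Za-z]{1,2}[0-9][0-9A-Za-z]?\s*[0-9][A-Za-z]{2}$/ ? 1 : 0;
}

# Hypothetical candidate blocks, e.g. from a "3 or more lines" extraction.
my @candidates = (
    "Mr J Smith\n12 High Street\nManchester\nM1 1AE\n",
    "Yours faithfully,\nA Customer\nAccounts Dept\n",
);
for my $block (grep { ends_in_postcode($_) } @candidates) {
    print $block, "\n";
}
```

    Used this way, the module only answers "is the last line well-formed?"; it says nothing about whether the address actually exists.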

    This signature will be ready by a Christmas
