http://qs321.pair.com?node_id=38259


in reply to (Ovid) Re: Searching for web sites
in thread Searching for web sites

If you’re using such a thorough regex that checks for dots and allowable characters, you may wish to ditch the http:// completely. People are more likely to list websites in their .plan files without it (for example, "I visit perlmonks.org" rather than "I visit http://www.perlmonks.org"). Personally I’d feel safe putting anchor tags around anything that looks like xxx.xxx, although you could also include a list of allowable Top Level Domains, something like:

@TLDs = ("com","net","org","edu","us","nl","de","it","se","ch","uk","ca","hr","ae","br","jp","be","au","ie","ar","fi","mil","gov","sg","es","mx","no","pt","dk","il","ru","nz","th","pl","id","cy","in","kw","at","za","cn","fr","is","ro","kr","gr","co","ph","bo","hu","cr","pe","cl","tr","arpa","tw","eg","ee","ge","ua","om","ec","hk","ve","ag","cz","ni","to","nu","sm","lt","yu","bg","ba","do","qa","ck","mt","bf","lu","su","bh");
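As a rough sketch of the idea (not tested against real .plan files), the list can be joined into one alternation and used to wrap bare host names in anchor tags. The sub name linkify, the shortened @TLDs list, and the choice to default to http:// when no scheme is given are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical, shortened allowlist; the full list from the post would go here.
my @TLDs = qw(com net org edu gov mil uk de nl);

# One alternation built from the list, with each entry escaped.
my $tld_re = join '|', map { quotemeta($_) } @TLDs;

my $host_re = qr{
    (?<![\w.\@/-])                # do not start inside a word, e-mail address or path
    ((?:https?://)?)              # optional scheme (first capture)
    ((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:$tld_re))   # labels plus an allowed TLD
    (?![\w-])                     # the TLD has to end here
}xi;

# Wrap every match in an anchor tag, adding http:// when the text had no scheme.
sub linkify {
    my ($text) = @_;
    $text =~ s{$host_re}{
        my $url = ($1 ? $1 : 'http://') . $2;
        qq{<a href="$url">$1$2</a>};
    }ge;
    return $text;
}

print linkify("I visit perlmonks.org daily."), "\n";
# prints: I visit <a href="http://perlmonks.org">perlmonks.org</a> daily.
```

Something like file.txt is left alone because txt is not in the list, which is the main thing the allowlist buys you over a plain xxx.xxx match.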

RE: RE: (Ovid) Re: Searching for web sites
by mirod (Canon) on Oct 25, 2000 at 15:39 UTC

    Isn't this a little dangerous? Any time new TLDs are added you will need to go and change the list. Plus, I cannot see .cx in this list, and it is home to a bunch of free software projects.

    Matching on http://, or at least on www(\..+)+\.\w+, seems the safest approach.
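A minimal sketch of that alternative, assuming host labels are restricted to letters, digits and hyphens (tighter than the .+ in the pattern above, which would happily run to the end of the line). The names $label, $url_re and find_urls are made up for the example.

```perl
use strict;
use warnings;

# Assumption: a host label is letters, digits and inner hyphens only.
my $label = qr/[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i;

my $url_re = qr{
    (?<![\w/.-])                          # do not start inside a word or path
    (
        https?://$label(?:\.$label)*      # explicit scheme, any host
      |
        www(?:\.$label)+\.[a-z]{2,}       # bare host that starts with www
    )
    (?![\w-])                             # host has to end here
}xi;

# Return every URL-like string found in the text, in order.
sub find_urls {
    my ($text) = @_;
    my @found;
    push @found, $1 while $text =~ /$url_re/g;
    return @found;
}

my @urls = find_urls("I visit www.perlmonks.org and http://perlmonks.org, not perlmonks.org");
print "$_\n" for @urls;
```

With no TLD list to maintain, a bare perlmonks.org is skipped, which is the trade-off being argued about in this thread.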

      Let's not forget, either, that InterNIC just released the .god domain.