Where did you find the URL?
I cobbled it together from the base URL and the file I wanted.
If I hover over the file and copy the link,
I get the same thing. What I realize from your and bliako's posts is that I underused the power of the browser to figure this out.
Using this URL instead of the one you used also stores a list of words to the output file, which I guess is the output you had expected.
Thx, choroba, that is indeed what I seek for my wordgames. With the correct URL, my script gets the English dictionary. I decided to try it out with an older source post of yours: Re^7: Words in Words. "Correct" entries are words that have a properly-encompassing word. A hybrid is this:
Source:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use 5.016;
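# The URL must be a single string; the "+" in the post was only a line-wrap marker.
my $url = 'https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/dotnetperls-controls/enable1.txt';
my $file = '/home/hogan/Documents/phone/from_laptop/my_data/bb.txt';
# getstore returns the HTTP status code; stop early if the download failed.
my $status = getstore($url, $file);
die "Download failed with status $status" unless is_success($status);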
##
open my $IN, '<', $file or die "$!";
my %words;
while (my $word = <$IN>) {
    chomp $word;
    undef $words{$word};
}
my %reported;
for my $word (keys %words) {
    my $length = length $word;
    for my $pos (0 .. $length - 1) {
        my $skip_itself = ! $pos;
        for my $len (1 .. $length - $pos - $skip_itself) {
            my $subword = substr($word, $pos, $len);
            next if exists $reported{$subword};
            next if $word eq $subword . q{s}
                 or $word eq $subword . q{'s};
            if (exists $words{$subword}) {
                say "$subword";
                undef $reported{$subword};
            }
        }
    }
}
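To sanity-check the logic without downloading anything, here is a minimal sketch of the same substring scan wrapped in a sub and run on a handful of words. The sub name contained_words and the sample list are my own inventions for illustration; they are not part of choroba's script or of enable1.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.016;

# Return every listed word that occurs as a proper substring of another
# listed word. Plural and possessive pairs are skipped, as in the script above.
# (contained_words is a hypothetical helper name.)
sub contained_words {
    my @list  = @_;
    my %words = map { $_ => undef } @list;
    my %found;
    for my $word (@list) {
        my $length = length $word;
        for my $pos (0 .. $length - 1) {
            # At position 0, stop one character short so a word never matches itself.
            my $max = $length - $pos - ($pos == 0 ? 1 : 0);
            for my $len (1 .. $max) {
                my $subword = substr $word, $pos, $len;
                next if $word eq $subword . 's' or $word eq $subword . q{'s};
                undef $found{$subword} if exists $words{$subword};
            }
        }
    }
    return sort keys %found;
}

# Hypothetical sample data, not read from the dictionary file.
my @sample = qw(quid quids squids liquid cat);
say for contained_words(@sample);
```

With this sample it prints quid and quids: quid turns up inside squids and liquid, and quids inside squids. The quid/quids pair on its own would not count, because of the plural filter.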
Logophiles like me play gladly with such output. I speak English natively, so I'm rarely challenged with English vocabulary. The resulting list is fascinating:
$ grep phosphorylating bb.txt
dephosphorylating
phosphorylating
$ grep aerially bb.txt
aerially
subaerially
$ grep physiology bb.txt
ecophysiology
electrophysiology
histophysiology
neurophysiology
pathophysiology
physiology
psychophysiology
$ grep quids bb.txt
equids
liquids
nonliquids
quids
semiliquids
soliquids
squids
$ grep consciouses bb.txt
consciouses
preconsciouses
subconsciouses
unconsciouses
$
Who knew that there were 4 different consciouses? I couldn't find an example that failed to have a larger including word.
Anyways, thanks for your comment that got me on the right track and also for the fun of replicating your "words within words" script.
"Perl scripting: great for pandemics...."