If it's only looking for a simple pattern, it may well be doable in a reasonable amount of time. There are extractors for the Open Directory Project (ODP) and Wikipedia dumps, both of which run to many gigabytes, that process them very quickly, even on relatively old machines. Some 10 years ago I was pulling all of the music content out of ODP in a few minutes at most, on a Mac laptop that was reasonably current at the time. I don't recall how long it took to pull all the music topics out of Wikipedia, but I think it was quite reasonable.
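For what it's worth, here is a rough sketch of the kind of streaming scan I mean. It is not the code from any of those extractors, just an illustration of why a simple pattern is cheap: the file is read one line at a time, so memory use stays flat no matter how big the dump is. The function names `scan_lines` and `scan_file` are made up for this example, and it assumes the pattern can be matched within a single line.

```python
import re
from typing import Iterator, TextIO


def scan_lines(stream: TextIO, pattern: str) -> Iterator[str]:
    """Yield every line of `stream` that matches `pattern`.

    Hypothetical helper: only one line is held in memory at a time,
    which is what keeps a multi-GB scan tractable.
    """
    regex = re.compile(pattern)  # compile once, reuse for every line
    for line in stream:
        if regex.search(line):
            yield line.rstrip("\n")


def scan_file(path: str, pattern: str, encoding: str = "utf-8") -> Iterator[str]:
    """Open `path` and stream its matching lines (hypothetical wrapper)."""
    # errors="replace" is an assumption: it keeps a stray bad byte in a
    # huge dump from aborting the whole run.
    with open(path, encoding=encoding, errors="replace") as handle:
        yield from scan_lines(handle, pattern)
```

You would loop over `scan_file("dump.txt", r"^Top/Arts/Music")` and handle each hit as it arrives; the file name and the pattern there are placeholders. If a record spans several lines, a line-by-line scan like this won't cut it and you'd want an incremental parser instead.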