Just as a text file is both a set of lines and a stream of bytes, an HTML document is both a tree and a stream of elements. HTML::Parser extracts the latter, which is equivalent to walking the DOM tree in document order (a depth-first traversal). The advantage of using HTML::Parser for an application like this is the same as the advantage of processing a text file line-by-line without reading the whole file into memory.
A single HTML document will almost always fit into memory on a client, but our questioner could be building something that runs on a server, with one instance of the program for each concurrent client connection. The aggregate memory use can quickly become very large if many clients are active. In this case, building the entire tree in memory is unnecessary because the transformation to be applied is very simple: find and mark occurrences of certain text in a finite sliding window. If this is running on a server, building the DOM tree in memory is both wasteful and foolish, because it creates an opportunity for easy DoS attacks.
Put simply, if you do not actually need the DOM tree, do not waste time and memory building it!
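To make the streaming approach concrete, here is a minimal sketch and not a finished solution. The helper name `mark_word`, the choice of `<mark>` as the wrapper, and the 4096-byte read size are my own assumptions for illustration; only the HTML::Parser calls (`new` with `api_version => 3`, `text_h`, `default_h`, `unbroken_text`, `parse`, `eof`) come from the module itself. It wraps every case-insensitive occurrence of a word found in text events and copies all other markup through unchanged, without ever building a tree.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not part of HTML::Parser): read HTML from $in in
# fixed-size chunks and write it to $out with every occurrence of $word
# in text content wrapped in <mark> tags.
sub mark_word {
    my ($word, $in, $out) = @_;

    my $p = HTML::Parser->new(
        api_version => 3,
        # Text events: highlight matches in the raw source text, but
        # leave <script>/<style>/CDATA content alone.
        text_h => [
            sub {
                my ($text, $is_cdata) = @_;
                $text =~ s{(\Q$word\E)}{<mark>$1</mark>}gi unless $is_cdata;
                print {$out} $text;
            },
            'text, is_cdata',
        ],
        # Every other event (tags, comments, declarations): copy verbatim.
        default_h => [
            sub { print {$out} $_[0] if defined $_[0] },
            'text',
        ],
    );

    # Deliver each run of text between two tags as a single event, so a
    # word is not missed because it straddles a chunk boundary.
    $p->unbroken_text(1);

    # Feed the parser one chunk at a time; memory use is bounded by the
    # chunk size plus the longest run of text between two tags.
    while (read($in, my $buf, 4096)) {
        $p->parse($buf);
    }
    $p->eof;
    return;
}
```

Calling `mark_word('needle', \*STDIN, \*STDOUT)` would then filter a document from standard input to standard output. Note the limits of the sketch: it matches against the raw source text (so entities are not decoded first), and it does not find a match that is split across tags. Handling that case would need the sliding window over several text events that the question calls for.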