In a few places I have code that takes a significant piece of text, such as a knowledgebase article, and splits off a short section from the beginning to use as an abstract. The code I generally use came from an answer to "Splitting long text for Template".
But now, for a different application, I want to create an abstract of text which may contain HTML tags. The difficulty is that I do not want to split an HTML tag in two: I want either all of it or none of it. This is the code I am using:
sub abstract {
    my $text = shift;
    # Cut at the last word boundary within the first 200 characters
    if (length $text > 200 and $text =~ /^(.{0,200}\b)(.*)$/s) {
        $text = "$1...";
    }
    # Check we have not split an HTML tag
    my $lt = $text =~ tr/<//;
    my $gt = $text =~ tr/>//;
    if ($lt != $gt) {
        # /s lets . match newlines, so the last '<' is found in multi-line text
        my ($keep, $strip) = $text =~ /(.*)<(.*)/s;
        $text = "$keep...";
    }
    return $text;
}
It does exactly what I want.
However, I cannot help thinking that the code could be more succinct...
Can you suggest a better way to do it?
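As a starting point for discussion, here is a minimal sketch of a more compact variant. It is not a drop-in replacement: it assumes that every < in the text opens a tag (literal angle brackets are escaped as &lt; and &gt;), so a split tag can only show up as a trailing < with no > after it. The optional $max argument is a hypothetical addition for illustration; the code above hard-codes 200.

```perl
use strict;
use warnings;

# Sketch only: assumes '<' and '>' appear solely as tag delimiters.
# $max is a hypothetical optional limit, defaulting to 200 characters.
sub abstract {
    my ($text, $max) = @_;
    $max //= 200;

    # Short text is returned untouched.
    return $text if length($text) <= $max;

    # Cut at the last word boundary within the first $max characters;
    # if there is no such boundary, give the text back unchanged.
    return $text unless $text =~ /^(.{0,$max}\b)/s;
    my $short = $1;

    # Drop a trailing partial tag: a '<' with no '>' before the end.
    $short =~ s/<[^>]*\z//;

    return "$short...";
}

print abstract('Click <a href="x">here</a> now', 10), "\n";
```

With a limit of 10 the cut lands inside the opening <a ...> tag, so the whole partial tag is dropped and the call prints "Click ...". Instead of counting brackets, the sketch only looks for an unclosed < at the end of the cut text. It also does nothing about elements left open by the cut, such as a <b> whose </b> falls beyond the limit.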