> seems like a TOC with both a few keywords for each topic AND the markup needed would be appropriate... but a big PITA.
For a start:
Hacked some code augmenting H-tags and generating a TOC:
YMMV
HTH! :)
use strict;
use warnings;
use feature 'say';
use HTML::Entities;
use Data::Dump qw/pp/;
my $text;
my %anchor_count;
my @aoh_toc;
my ($min,$max) = (3,6);
my @stack;
$stack[$min-1] = \@aoh_toc; # root
while (my $line = <DATA>) {
    if ( $line =~ m# ^ (\s*) < \s* h([$min-$max]) \s* > (.*) </ \s* h(\2) \s* > \s*$ #xi ) {
        #say $line;
        my ($indent,$level,$text) = ($1,$2,$3);
        if ( $indent ) { # ignore indented <h*>
            warn "Skipping $line";
        }
        else {
            my $anchor = text2anchor($text);
            warn "Duplicate '$anchor' " if $anchor_count{$anchor}++;
            $line = "<h$level><a name='$anchor'>$text</a></h$level>\n";
            my $a_sub = [];
            push @{$stack[$level-1]},
              {
                text  => $text,
                link  => $anchor,
                sub   => $a_sub,
                level => $level,
              };
            $stack[$level] = $a_sub;
        }
        #say $line;
    }
    $text .= $line;
}
# pp \@aoh_toc;
say "<ul>";
create_toc(\@aoh_toc);
say "</ul>";
print $text;
sub text2anchor {
    my ($text) = @_;
    # trim
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    # make valid identifier
    my $encoded = encode_entities( $text );
    # replace whitespaces
    $encoded =~ s/\s/_/g;
    return $encoded;
}
sub create_toc {
    my ($aoh_toc) = @_;
    for my $h_line (@$aoh_toc) {
        my $indent = " " x ($h_line->{level}-$min);
        say qq~$indent<li><a href='#$h_line->{link}'>$h_line->{text}</a></li>~;
        my $a_sub = $h_line->{sub};
        if ( @$a_sub ) {
            say "$indent<ul>";
            create_toc($a_sub);
            say "$indent</ul>";
        }
    }
}
__DATA__
...
update:
Something is wrong: for a reason I do not understand yet, the anchors are being filtered ...
I need to check the allowed tags in the monastery... (under construction)
For the produced HTML, see the reply Re^3: TOCs and deeplinks for our house rules
update:
needed extra <a name="$target"> tags
update:
Forgot to mention, I had to change the input at some places:
- Some h3 needed to be h4 to be properly nested
- Some headings were just examples and were indented so that they get ignored
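To illustrate why the nesting matters for the script, here is a minimal sketch of the same stack-of-array-refs idea, reduced to [level, text] pairs. The helper name build_toc and the sample headings are made up for this illustration and are not part of the posted code. As far as I can tell, a heading is always attached to the most recent heading exactly one level up, so a skipped level ends up in an orphaned array and silently drops out of the TOC. Whether that was the reason for each h3-to-h4 change is an assumption on my side.

```perl
use strict;
use warnings;

my $min = 3;    # same lowest heading level as in the script

# Hypothetical helper: build the nested TOC from [level, text] pairs,
# using one array ref per level like @stack in the script.
sub build_toc {
    my @headings = @_;
    my @toc;
    my @stack;
    $stack[$min-1] = \@toc;          # root collects the top-level entries
    for my $h (@headings) {
        my ($level, $text) = @$h;
        my $sub = [];
        push @{ $stack[$level-1] },
          { text => $text, level => $level, sub => $sub };
        $stack[$level] = $sub;       # children of this heading land here
    }
    return \@toc;
}

# Properly nested: h3 followed by h4
my $ok  = build_toc( [3, 'Intro'], [4, 'Details'] );

# Skipped level: h3 followed directly by h5
my $bad = build_toc( [3, 'Intro'], [5, 'Details'] );

printf "nested:  %d child(ren) under '%s'\n",
    scalar @{ $ok->[0]{sub} },  $ok->[0]{text};
printf "skipped: %d child(ren) under '%s'\n",
    scalar @{ $bad->[0]{sub} }, $bad->[0]{text};
```

In the second call the h5 is pushed onto a freshly autovivified array that nothing links to, so it never shows up when the TOC is walked from the root.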
An included heading needs to start a line without leading whitespace and can't be followed by anything else but whitespace.
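The rule above can be tried out in isolation with a small sketch that reuses the regex from the script. The classify helper and the sample lines are hypothetical and only there for illustration; the three labels are my own naming for the three paths through the main loop.

```perl
use strict;
use warnings;

my ($min, $max) = (3, 6);    # same heading range as in the script

# Hypothetical helper: report what the main loop would do with a line.
#   'included' - heading gets an anchor and a TOC entry
#   'skipped'  - heading matches but is indented, so it is left alone
#   'ignored'  - line does not match the heading pattern at all
sub classify {
    my ($line) = @_;
    if ( $line =~ m# ^ (\s*) < \s* h([$min-$max]) \s* > (.*) </ \s* h(\2) \s* > \s*$ #xi ) {
        return $1 ? 'skipped' : 'included';
    }
    return 'ignored';
}

for my $line (
    "<h3>Title</h3>\n",
    "  <h4>Example</h4>\n",
    "<h3>Title</h3> trailing text\n",
    "<h2>Too high</h2>\n",
) {
    (my $shown = $line) =~ s/\n\z//;     # drop the newline for display
    printf "%-10s %s\n", classify($line), $shown;
}
```

Note that the closing tag has to carry the same level as the opening one because of the \2 backreference, so a line like <h3>...</h4> is not treated as a heading either.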