note
LanX
<I>> seems like a TOC with both a few keywords for each topic AND the markup needed would be appropriate... but a big PITA.</I><P>
For a start: <P>
I hacked up some code that adds anchors to the H tags and generates a TOC from them:<P>
YMMV<P>
HTH! :)<P>
<readmore>
<c>
use strict;
use warnings;
use feature 'say';
use HTML::Entities;
use Data::Dump qw/pp/;

my $text = '';            # accumulates the rewritten document
my %anchor_count;         # detects duplicate anchors
my @aoh_toc;              # top-level TOC entries (array of hashes)
my ($min,$max) = (3,6);   # heading levels to include
my @stack;                # $stack[$n] collects the children of the latest level-$n heading
$stack[$min-1] = \@aoh_toc; # root

while (my $line = <DATA>) {
    if ( $line =~ m# ^ (\s*) < \s* h([$min-$max]) \s* > (.*) </ \s* h(\2) \s* > \s*$ #xi ) {
        #say $line;
        # named $title so it does not shadow the outer $text buffer
        my ($indent,$level,$title) = ($1,$2,$3);
        if ( $indent ) { # ignore indented <h*>
            warn "Skipping $line";
        }
        else {
            my $anchor = text2anchor($title);
            warn "Duplicate '$anchor' " if $anchor_count{$anchor}++;
            $line = "<h$level><a name='$anchor'>$title</a></h$level>\n";
            my $a_sub = [];
            push @{$stack[$level-1]},
              {
                text  => $title,
                link  => $anchor,
                sub   => $a_sub,
                level => $level,
              };
            $stack[$level] = $a_sub;
        }
        #say $line;
    }
    $text .= $line;
}

# pp \@aoh_toc;
say "<ul>";
create_toc(\@aoh_toc);
say "</ul>";
print $text;

sub text2anchor {
    my ($text) = @_;
    # trim
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    # make valid identifier
    my $encoded = encode_entities( $text );
    # replace whitespaces
    $encoded =~ s/\s/_/g;
    return $encoded;
}

sub create_toc {
    my ($aoh_toc) = @_;
    for my $h_line (@$aoh_toc) {
        my $indent = " " x ($h_line->{level}-$min);
        say qq~$indent<li><a href='#$h_line->{link}'>$h_line->{text}</a></li>~;
        my $a_sub = $h_line->{sub};
        if ( @$a_sub ) {
            say "$indent<ul>";
            create_toc($a_sub);
            say "$indent</ul>";
        }
    }
}
__DATA__
...
</c>
</readmore><P>
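To make the nesting trick easier to see, here is a stripped-down sketch of just the level stack. The headings are made-up <C>[level, text]</C> pairs and <C>outline</C> is a hypothetical helper that exists only for this illustration; it prints an indented outline instead of HTML.<P>
<c>
use strict;
use warnings;

my $min = 3;                       # top heading level, as in the script above

# Hypothetical headings as [level, text] pairs (not from the real input)
my @headings = ( [3, 'Intro'], [4, 'Setup'], [4, 'Usage'], [3, 'FAQ'] );

my @toc;                           # top-level TOC entries
my @stack;
$stack[$min - 1] = \@toc;          # parent slot for level-$min headings

for my $h (@headings) {
    my ($level, $text) = @$h;
    my $children = [];
    # attach to whatever array the parent level last registered
    push @{ $stack[$level - 1] }, { text => $text, level => $level, sub => $children };
    $stack[$level] = $children;    # deeper headings that follow attach here
}

# Hypothetical helper: flatten the tree into indented lines
sub outline {
    my ($entries) = @_;
    my @lines;
    for my $e (@$entries) {
        push @lines, ('  ' x ($e->{level} - $min)) . $e->{text};
        push @lines, outline($e->{sub});
    }
    return @lines;
}

print "$_\n" for outline(\@toc);
</c>
Note that a heading which skips a level (say an h5 directly under an h3) gets attached to whatever the stack slot above it last pointed to, so the input has to be properly nested.<P>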
<H5> update:</H5><P>
Something is wrong: for a reason I do not understand yet, the anchors are being filtered ...<P>
I need to check the allowed tags in the monastery ... (under construction)<P>
For the produced HTML, see the reply at [id://1202630].<P>
<H5> update</H5><P>
It needed extra <C><a name="$target"></C> tags.<P>
<H5> update</H5><P>
Forgot to mention: I had to change the [href://?displaytype=xml;node_id=674668|input] in some places:
<UL>
<LI> Some h3 needed to be h4 to be properly nested.
<LI> Some headings were just examples and were indented so that they are ignored.
</UL>
A heading is included only if it starts a line with no leading whitespace and is followed by nothing but whitespace. <!-- Wiki2Monks {"version":1.1416} -->
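<P>
A small sketch of that rule, using the same pattern as the script. The sample lines are invented and <C>classify</C> is a hypothetical helper for this illustration only.<P>
<c>
use strict;
use warnings;

my ($min, $max) = (3, 6);   # same heading range as the script

# Hypothetical sample lines, not taken from the real input
my @samples = (
    "<h3>Getting started</h3>\n",      # included
    "  <h3>Indented example</h3>\n",   # matched, but skipped because of the indent
    "<h4>Options</h4> see below\n",    # unmatched: text after the closing tag
    "<h2>Too high</h2>\n",             # unmatched: level outside 3..6
);

# Returns 'included', 'skipped' or 'unmatched' for one input line
sub classify {
    my ($line) = @_;
    if ( $line =~ m# ^ (\s*) < \s* h([$min-$max]) \s* > (.*) </ \s* h(\2) \s* > \s*$ #xi ) {
        return length($1) ? 'skipped' : 'included';
    }
    return 'unmatched';
}

printf "%-10s %s", classify($_), $_ for @samples;
</c>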
1202462
1202492