

in reply to Programatically reparagraphinating text

A couple of ideas come to mind based on your examples, although I don't really expect them to catch 100% of the cases that should be left alone:

- If presented with N or more lines of the same length, it's likely a binary dump or a pinout diagram, so leave it alone. (I'd probably start with N=3, but most dumps and diagrams run longer than that, so you could probably use a larger value of N safely.)

- Multiple consecutive lines with leading whitespace are likely to be ASCII art or columnar text, so leave them alone. (Just one line with leading whitespace is more likely to be the start of a paragraph. For extra credit, if a block of indented lines includes one non-indented line, leave that line alone, too, since it's likely part of the ASCII art.)
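
To make the two rules concrete, here is a rough sketch in Python (the same logic ports easily to Perl). Everything in it is my own assumption rather than anything from your code: the function names, treating blank lines as block separators, requiring the same-length lines to be consecutive, and using `textwrap.fill` as the stand-in reflow step. Because the decision is made per block, the "extra credit" case falls out for free: a lone non-indented line inside an indented block is kept along with the rest.

```python
import textwrap


def has_same_length_run(lines, n=3):
    """Rule 1: True if n or more consecutive non-empty lines share a length."""
    run = 0
    prev_len = None
    for line in lines:
        length = len(line.rstrip())
        if length > 0 and length == prev_len:
            run += 1
        else:
            # Start a new run on a non-empty line, reset on an empty one.
            run = 1 if length > 0 else 0
        prev_len = length
        if run >= n:
            return True
    return False


def has_indented_run(lines):
    """Rule 2: True if two or more consecutive lines start with whitespace."""
    indented = [line[:1] in (" ", "\t") for line in lines]
    return any(a and b for a, b in zip(indented, indented[1:]))


def should_leave_alone(block, n=3):
    """A block is left untouched if either heuristic fires."""
    return has_same_length_run(block, n) or has_indented_run(block)


def split_blocks(text):
    """Split text into lists of lines, using blank lines as separators (assumed)."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def reparagraph(text, width=72, n=3):
    """Reflow ordinary paragraphs; pass through blocks the rules protect."""
    out = []
    for block in split_blocks(text):
        if should_leave_alone(block, n):
            out.append("\n".join(block))
        else:
            joined = " ".join(line.strip() for line in block)
            out.append(textwrap.fill(joined, width=width))
    return "\n\n".join(out)
```

A single indented line followed by flush-left text is still treated as a paragraph start and reflowed, which matches the second rule. Ordinary prose that happens to contain N consecutive equal-length lines will be wrongly protected, so a larger N trades fewer of those false positives against missing short dumps.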

(I know these aren't modules, which is what you said you were looking for, but it looked like you might be looking for rules, too.)