
Re: Vow Triptych

by Arunbear (Prior)
on Dec 31, 2008 at 16:29 UTC (#733499)

in reply to Vow Triptych

For any comparative linguists, here is what it looks like in Python (it even works):
import re
import sys
from collections import defaultdict

wordsInOrder = []
for line in sys.stdin:
    wordsInOrder.extend( re.findall(r'\w+', line.lower()) )

single = defaultdict(int)
double = defaultdict(int)
triple = defaultdict(int)

for i, word in enumerate(wordsInOrder):
    single[word] += 1
    try:
        next_word = wordsInOrder[i+1]
        double[word + ' ' + next_word] += 1
        next_next_word = wordsInOrder[i+2]
        triple[word + ' ' + next_word + ' ' + next_next_word] += 1
    except:
        pass

def sort_by_frequency(d):
    return sorted(d.iterkeys(), cmp = lambda x,y: cmp(d[y], d[x]))

for singlet in sort_by_frequency(single):
    print singlet
    for doublet in sort_by_frequency(double):
        if not doublet.startswith(singlet + ' '):
            continue
        print "\t", doublet
        for triplet in sort_by_frequency(triple):
            if not triplet.startswith(doublet + ' '):
                continue
            print "\t\t", triplet
I needed amusement ;-)
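For anyone trying this on a current interpreter: the script above is Python 2 (print statements, iterkeys, a cmp argument to sorted). Below is a rough Python 3 sketch of the same idea, not the original author's code. The names count_ngrams, by_frequency and report are made up for illustration, and it takes any iterable of lines and returns the output rows as a list rather than reading sys.stdin and printing directly.

```python
import re
from collections import Counter


def count_ngrams(lines):
    """Count single words, adjacent pairs and adjacent triples (hypothetical helper)."""
    words = []
    for line in lines:
        words.extend(re.findall(r"\w+", line.lower()))
    single = Counter(words)
    # zip() stops at the shortest input, so no IndexError handling is needed.
    double = Counter(" ".join(pair) for pair in zip(words, words[1:]))
    triple = Counter(" ".join(t) for t in zip(words, words[1:], words[2:]))
    return single, double, triple


def by_frequency(counts):
    # Most frequent first; the sort is stable, so ties keep first-seen order.
    return sorted(counts, key=counts.get, reverse=True)


def report(lines):
    """Return the indented word / pair / triple listing as a list of rows."""
    single, double, triple = count_ngrams(lines)
    rows = []
    for singlet in by_frequency(single):
        rows.append(singlet)
        for doublet in by_frequency(double):
            if not doublet.startswith(singlet + " "):
                continue
            rows.append("\t" + doublet)
            for triplet in by_frequency(triple):
                if triplet.startswith(doublet + " "):
                    rows.append("\t\t" + triplet)
    return rows
```

To get the behaviour of the original script, something like `print("\n".join(report(sys.stdin)))` after an `import sys` should do. Like the original, the nested loops rescan every pair and triple for each word, which is fine for a poem-sized input but slow for a large text.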
