Collaborative filtertering

Re: Collaborative filtertering by kvale (Monsignor) on Aug 19, 2004 at 00:15 UTC
I don't know of any natural language understanding system that could read the nodes you look at, extract the semantic content, and then direct you to other semantically similar web pages. That is outside the boundary of current technology. But simpler systems are possible and in fact already exist at Perlmonks.
One method of extracting relevant nodes is to extract keyword distributions of well-liked nodes and compare them with the keyword distributions of other nodes for similarity. The SMART information retrieval system at Cornell uses an inner product metric for the similarity measure, for instance. Perlmonks has a simple version of this: Super Search. Just pick keywords from nodes you like and Super Search for nodes with those keywords.
Another method of retrieving relevant nodes is to take an approach used by the semantic web people: create ontologies through the use of meta information added to the nodes. Perlmonks has this too! The meta information comes in the form of categorization. In the Code Catacombs, Q and A, and Tutorials sections, nodes are organized by category and it is very easy to find nodes on a desired subject. Other sources of meta information on Perlmonks are the author of the node, the children of that node, the Best/Worst nodes of a time period, reputation, etc. Perlmonks is quite rich in meta information.
There is work by Naftali Tishby's group on the automatic classification of newspaper articles using an information-theoretic clustering algorithm. The algorithm came up with surprisingly sensible clusters. Many clusters could be identified with a particular subject; others with the reporter who wrote the article. It would be fun to apply such a scheme to the Perlmonks universe. Could such a nonparametric algorithm distinguish a meditation from a tutorial? Positive reputation nodes from negative?
-Mark
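The keyword-distribution comparison described above can be sketched roughly as follows. This is a minimal illustration, not how SMART or Super Search is actually implemented: the stop-word list, the function names, and the idea of pooling liked nodes into a single profile are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "with"}


def keyword_counts(text):
    """Return a Counter of lowercase keywords in text, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def inner_product(a, b):
    """Inner product of two sparse keyword-count vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    # A Counter returns 0 for keywords it has never seen.
    return sum(count * b[word] for word, count in a.items())


def rank_by_similarity(liked_texts, candidates):
    """Rank candidate nodes (id -> text) against the pooled keywords of liked nodes."""
    profile = Counter()
    for text in liked_texts:
        profile.update(keyword_counts(text))
    scores = {
        node_id: inner_product(profile, keyword_counts(text))
        for node_id, text in candidates.items()
    }
    # Highest score first; ties broken by node id for a stable order.
    return sorted(scores, key=lambda n: (-scores[n], n))
```

A raw inner product favours long nodes simply because they contain more words; dividing by the vector lengths (cosine similarity) is one common way to correct for that.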
Re: Collaborative filtertering by chanio (Priest) on Aug 19, 2004 at 04:16 UTC
As a PopFile fan, let me remind you that the Bayesian filtering it uses to classify spam and various personal topics in received email works in a very similar way to the system proposed here for PM. Instead of rating every answer, it asks you to assign a category to every received email. In a few months the system has learned a lot. It can make very accurate guesses at how you would classify a message, and you don't have to correct it much any more. I can't imagine what it would learn with several levels of classification inside PM. Say, a general level fed by all the members, and then a personal level for each separate member... It wouldn't have to classify spam, just your interests. If you are curious about it, you should visit their forum to see the alternative uses people have found for their nice open source PopFile!
.{\('v')/}
_`(___)'
__________________________
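For readers unfamiliar with this kind of classifier, here is a minimal naive Bayes sketch of the "assign a category, let it learn" loop. It is not PopFile's actual code; the class name, the tokenizer, and the smoothing choice are assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word counts
        self.doc_counts = Counter()              # category -> training texts seen
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, category):
        """Record one text that the user has filed under category."""
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the most probable category, or None if nothing was trained."""
        if not self.doc_counts:
            return None
        total_docs = sum(self.doc_counts.values())
        words = self.tokenize(text)
        best, best_score = None, -math.inf
        for category in sorted(self.doc_counts):
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Work in log space to avoid underflow on long texts.
            score = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                # Add-one smoothing, with one extra slot reserved for unseen words.
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab) + 1))
            if score > best_score:
                best, best_score = category, score
        return best
```

The two levels suggested above could be approximated by keeping one classifier trained by all members and one per member, and consulting the personal one first once it has seen enough examples.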
Re: Collaborative filtertering by phydeauxarff (Priest) on Aug 18, 2004 at 21:27 UTC
Sounds like you want something like an AI::Categorizer plugin for Perlmonks. While I suppose it would be technically possible to implement, I suspect that the overhead on the database would prevent this from ever being deployed in a workable manner. Very intriguing idea though.
Re^2: Collaborative filtertering by artist (Parson) on Aug 19, 2004 at 17:45 UTC
It will be based on whether people like the node, so it won't be necessary to use 'AI::Categorizer' or any similar tool. It can also be built externally to avoid putting load on the database.
Re^3: Collaborative filtertering by phydeauxarff (Priest) on Aug 20, 2004 at 15:23 UTC
I had assumed that you were proposing being able to "mark" a node as something you like, sort of like I can mark shows I like in TiVo and it decides to record suggestions based on my preferences and habits. Marking a node that you like would be pretty easy; you could even just use the current voting system, as voting up a node assumes you like it (although I hear that some folks use votes to gain XP or affect others' XP <grin>). The challenge, as I understand it, is then parsing new nodes to offer users suggestions on nodes they might be interested in. This is where I would imagine the most significant amount of overhead to the system.
Re: Collaborative filtertering by davido (Cardinal) on Aug 19, 2004 at 02:46 UTC
Lurking deep within the code of the Monastery there is a keywords feature, only half-implemented. It's pretty much just awaiting further conceptualization, brainstorming, and coding. It's not really functional yet, and nobody seems to really have a good idea of what to do with it. But when/if it ever gets completed, it will probably facilitate the type of thing you're suggesting (again, with additional code). The keyword feature allows people to assign keywords to nodes, and those keywords can (in theory) be used to search for nodes of like content. As I said, it's pretty rough right now, and there really isn't any search facility attached to it from what I can tell. But maybe someday it'll come to fruition.
Dave
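To make the "search for nodes of like content" idea concrete, here is a small sketch of a keyword index. It says nothing about how the Monastery's half-finished feature is really built; the class, its method names, and the "count shared keywords" ranking are all hypothetical.

```python
from collections import defaultdict


class KeywordIndex:
    """User-assigned keywords per node, with a reverse lookup (hypothetical design)."""

    def __init__(self):
        self.nodes_by_keyword = defaultdict(set)
        self.keywords_by_node = defaultdict(set)

    def tag(self, node_id, keyword):
        """Attach a keyword to a node; keywords are case-insensitive."""
        keyword = keyword.strip().lower()
        if keyword:
            self.nodes_by_keyword[keyword].add(node_id)
            self.keywords_by_node[node_id].add(keyword)

    def similar(self, node_id):
        """Other nodes sharing at least one keyword, most shared keywords first."""
        shared = defaultdict(int)
        for keyword in self.keywords_by_node.get(node_id, ()):
            for other in self.nodes_by_keyword[keyword]:
                if other != node_id:
                    shared[other] += 1
        return sorted(shared, key=lambda n: (-shared[n], n))
```

Keeping both directions of the mapping means a "nodes like this one" lookup only touches nodes that share a keyword, rather than scanning every node.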
Re: Collaborative filtertering by eric256 (Parson) on Aug 21, 2004 at 23:58 UTC
Late to the discussion, but: you could use the current voting system. To find nodes you like, find all the people who plussed nodes that you did (score them based on how many nodes both of you ++). Doing this would give you a group of people who liked the same stuff you did. Now take this group and get all the nodes that they +'ed and that you haven't -'ed. Now you have a list of nodes that people with similar likes liked. Score and order by score and you have your list of nodes :). Add in a link to the Newest Nodes list and you would have a list of new nodes that you would probably like.
___________
Eric Hodges
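The steps above map fairly directly onto code. The sketch below assumes per-user vote data is available as a plain mapping from user to the set of node ids they ++'ed, which is an assumption about the data, not something the site is known to expose. The function name and parameters are made up, and skipping nodes you have already ++'ed is an extra assumption on top of the description.

```python
from collections import Counter


def recommend(me, upvotes, my_downvotes=frozenset(), top_n=10):
    """Suggest nodes liked by users whose ++ votes overlap with mine.

    upvotes maps each user to the set of node ids they voted up.
    """
    mine = upvotes.get(me, set())

    # Step 1: score every other user by how many nodes we both ++'ed.
    neighbours = {}
    for user, nodes in upvotes.items():
        if user == me:
            continue
        overlap = len(mine & nodes)
        if overlap:
            neighbours[user] = overlap

    # Step 2: collect the nodes they ++'ed that I have not --'ed
    # (and, as an extra assumption, that I have not already ++'ed).
    scores = Counter()
    for user, weight in neighbours.items():
        for node in upvotes[user] - mine - set(my_downvotes):
            scores[node] += weight

    # Step 3: order by score, breaking ties by node id.
    return sorted(scores, key=lambda n: (-scores[n], n))[:top_n]
```

Intersecting the result with a list of recent node ids would give the "new nodes you would probably like" view.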
Re^2: Collaborative filtertering by jdalbec (Deacon) on Aug 22, 2004 at 02:47 UTC
"find all the people who plussed nodes that you did"
How does one do that? I didn't know there was a way to find out who voted on a node.
Re^3: Collaborative filtertering by eric256 (Parson) on Aug 22, 2004 at 02:57 UTC
I don't think there is any way currently. It was just an idea for how that could be done with the existing info.
___________
Eric Hodges
Re^2: Collaborative filtertering by artist (Parson) on Aug 23, 2004 at 14:42 UTC
Good direction, but: a person could have voted on thousands of nodes just because he/she had extra votes remaining for the day, or just liked the node in general rather than for anything specific. IMHO, for practical purposes we cannot use the existing voting system. There is also no capability in the existing system to vote negative after you have voted positive on a node, or vice versa.
Re^3: Collaborative filtertering by eric256 (Parson) on Aug 23, 2004 at 15:21 UTC
Well, it wouldn't be perfect, just as the voting system isn't now. With enough people voting, though, you would be looking at rough averages much more than at actual individuals. The idea is to spread the factors out enough that a node is suggested only if most of the people who agree with you most of the time like it. That way individual ++/-- votes mean less and less; it's the overall trend of your group of "friends" that decides which nodes you might like.
___________
Eric Hodges

