
Re^4: RFC: Peer to Peer Conceptual Search Engine

by soonix (Canon)
on Jan 29, 2020 at 16:21 UTC ( #11112024 )

in reply to Re^3: RFC: Peer to Peer Conceptual Search Engine
in thread RFC: Peer to Peer Conceptual Search Engine

Your "conceptual indexing" sounds a lot like what's today called "social bookmarking", which tries to apply to webpages a process similar to the one used in libraries. The Wikipedia page has a section "Comparison with search engines".

The Search API was probably derived from (or is the same as) the WiserEarth API, which is still in the Internet Archive (FAQ and Documentation).

I don't think there's active scrubbing going on; the "normal" entropic force is strong enough already, especially if the information in question needs active maintenance.

Replies are listed 'Best First'.
Re^5: RFC: Peer to Peer Conceptual Search Engine
by PerlGuy(Tom) (Acolyte) on Jan 30, 2020 at 11:38 UTC
    I misspoke. What I meant was the open-source WiserEarth platform: the program(s) that ran the site, the backend rather than the frontend.

    I did just find it on SourceForge (I think; it looks like it, at least).

    There are, I suppose, some parallels between my indexing system and social bookmarking, but social bookmarking in practice generally requires some proprietary methodology on a specific platform with an inaccessible database. Delicious has gone by the wayside somewhere after passing through different hands.

    Along with it went 180 million bookmarks.

    Presumably, that will happen sooner or later with every such proprietary service or "black box" type database on the internet.

    What is needed, IMO, is an internet standard, similar to the Dewey Decimal System used for books in public libraries.

    What I've endeavored to produce is something more along the lines of Ranganathan's Colon Classification system.

    Such a faceted metadata structure is compact and concise yet comprehensive, flexible and extensible enough to encompass everything on the internet for the foreseeable future, yet structured enough to be computer readable; i.e., it can be easily and reliably isolated from whatever else appears in a website's source code (using regular expressions).
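To illustrate that last point, here's a minimal sketch (in Python, though the same regex works in Perl). The `<meta name="facet-index">` tag is an invented placeholder, not part of any actual proposal; the point is only that a consistently patterned facet string can be pulled out of arbitrary page source with one regular expression:

```python
import re

# Hypothetical page markup: the facet string is assumed to sit in a
# <meta> tag. The tag name "facet-index" is invented for this example.
html = ('<html><head><meta name="facet-index" '
        'content="L,45;421:6;253:f.44\'N5"></head><body>...</body></html>')

# Because the facet string follows a consistent pattern, one regular
# expression isolates it from everything else in the source.
match = re.search(r'name="facet-index"\s+content="([^"]+)"', html)
if match:
    facet_code = match.group(1)
    print(facet_code)  # L,45;421:6;253:f.44'N5
```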


      Significantly, the Wikipedia article gives this example of Ranganathan's metadata-compacting method:

      All that becomes this: L,45;421:6;253:f.44'N5

      The particular concepts incorporated into the example are not particularly relevant to internet indexing. In fact, I would say that only three facets are needed: Topic (hierarchically structured), Location (think geocoding, Google Earth), and Time (think event-calendar data, scheduled future events included).

      We can dispense with delimiters if the pattern structure remains consistent.

      This style of encoding data could incorporate any number of facets.

      What else might be included?

      I've incorporated into the program those facets I consider important and useful, but nothing is written in stone.

