PerlMonks  

Re^5: RFC: Peer to Peer Conceptual Search Engine

by PerlGuy(Tom) (Acolyte)
on Jan 30, 2020 at 11:38 UTC


in reply to Re^4: RFC: Peer to Peer Conceptual Search Engine
in thread RFC: Peer to Peer Conceptual Search Engine

I misspoke. What I meant was the open source WiserEarth platform: the program(s) that ran the site, the backend rather than the frontend.

I did just find it on SourceForge (I think; at least it looks like it).

https://sourceforge.net/projects/wiserplatform/files/wiserplatform/

There are, I suppose, some parallels between my program (or indexing system) and social bookmarking. In practice, though, social bookmarking generally requires some proprietary methodology on a specific platform with an inaccessible database. Delicious went by the wayside at some point after passing through several different hands.

Along with it went 180 million bookmarks.

Presumably, that will happen sooner or later with every such proprietary service or "black box" type database on the internet.

What is needed, IMO, is an internet standard, similar to the Dewey Decimal System for books in public libraries around the world, but tailored to the kind of information, resources and services on the internet, and which does not neglect not-for-profit, world-betterment social needs and general human needs in deference to commercial interests.

I remember back in the early days of the internet, there was general agreement that advertising was not and would not ever be allowed on the internet, and for good reason.

Commercial interests were known to have an undesirable influence on nearly every communications medium in existence.

The censorship issues on the internet today are not so much due to the platforms themselves as to pressure from advertisers. That's how it always was with magazines, newspapers, radio and TV before the internet, from what I remember of the discussions about why it was decided not to allow advertising on the then-budding internet. I'm talking, I think, about the late '80s, when the only "social network" was the Whole Earth 'Lectronic Link as a direct dial-up service and the World Wide Web was not even a glimmer in somebody's eye.

I think The WELL (founded circa 1985), sticking to the concept, still does not allow advertising. It is perhaps the only social network left on the internet that is uncorrupted by advertising dollars.

So, IMO, it is best that this "Search Engine" indexing, metadata or tagging system not be of a commercial or proprietary nature.

It should be as free and ubiquitous as other internet standards, like HTML, or the structure of a hyperlink.

Clearly, though, the Dewey Decimal System is no longer adequate even for books in public libraries, so it is certainly inadequate for the kind of resources on the internet. Besides that, OCLC thinks they own it.

https://www.questia.com/magazine/1G1-111069497/oclc-sues-library-hotel-for-trademark-infringement

This sort of overzealous litigation over "trademark rights" for an indexing system that is over 100 years old, and rightly in the public domain, has had a "chilling effect" on the use of the DDS on the internet. The DDS could, perhaps, be better than nothing, but OCLC might feel they have to start filing more lawsuits.

What I've endeavored to produce is something more along the lines of Ranganathan's "colon classification system".

https://en.m.wikipedia.org/wiki/Faceted_classification

Such a faceted metadata structure is compact and concise yet comprehensive. It is flexible and extensible enough to encompass everything on the internet for the foreseeable future, yet structured enough to be computer readable; i.e., it can be easily and reliably isolated from whatever else appears in the source code of a website (using regular expressions).
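To make the "isolated with regular expressions" point concrete, here is a minimal sketch in Python (the same pattern carries over to Perl almost unchanged). The tag syntax `{{idx T:... L:... D:...}}`, the facet letters, and the function name `extract_tags` are all made up for illustration; they are not the format my program actually uses.

```python
import re

# Hypothetical tag format, invented for this example only:
#   {{idx T:<topic/path> L:<lat>,<lon> D:<YYYY-MM-DD>}}
TAG_RE = re.compile(
    r"\{\{idx\s+"
    r"T:(?P<topic>[A-Za-z0-9./-]+)\s+"                       # hierarchical topic
    r"L:(?P<lat>-?\d+(?:\.\d+)?),(?P<lon>-?\d+(?:\.\d+)?)\s+"  # location
    r"D:(?P<date>\d{4}-\d{2}-\d{2})"                          # date
    r"\}\}"
)

def extract_tags(page_source):
    """Return one dict per tag found anywhere in the page source."""
    tags = []
    for m in TAG_RE.finditer(page_source):
        tags.append({
            "topic": m.group("topic").split("/"),  # path -> list of levels
            "lat": float(m.group("lat")),
            "lon": float(m.group("lon")),
            "date": m.group("date"),
        })
    return tags

html = """
<html><head><title>River cleanup</title></head>
<body><p>Join us!</p>
<!-- {{idx T:environment/water/cleanup L:44.98,-93.27 D:2020-05-16}} -->
</body></html>
"""

print(extract_tags(html))
```

Because the tag has a rigid shape, the pattern ignores the surrounding HTML entirely, and a page with no tag simply yields an empty list.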

Tom

Replies are listed 'Best First'.
Re^6: RFC: Peer to Peer Conceptual Search Engine
by PerlGuy(Tom) (Acolyte) on Jan 30, 2020 at 18:24 UTC
    Significant in the Wikipedia article is this example of Ranganathan's metadata compacting method:

    The article's worked example, a full sentence describing a research subject, becomes this: L,45;421:6;253:f.44'N5

    The particular concepts incorporated into the example are not especially relevant to internet indexing. In practice, I would say that only three facets need to be represented: Topic (hierarchically structured), Location (think geocoding, Google Earth), and Time (think event calendar data, scheduled future events included).

    We can dispense with delimiters if the pattern structure remains consistent.

    This style of encoding data could incorporate any number of facets.
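    As a rough sketch of the delimiter-free idea (Python here, but nothing about it is language specific): if every facet has a fixed width and a fixed position, the string can be cut apart by position alone. The facet names, the widths, the `_` padding character and the sample codes below are hypothetical, chosen only for the example.

```python
# Hypothetical fixed-width layout: (facet name, width in characters).
LAYOUT = [("topic", 6), ("place", 7), ("date", 8)]

PAD = "_"  # assumed padding character; values must not end with it


def encode(record, layout=LAYOUT):
    """Concatenate the facets, padding each one to its fixed width."""
    parts = []
    for name, width in layout:
        value = record[name]
        if len(value) > width:
            raise ValueError(f"{name!r} is longer than {width} characters")
        parts.append(value.ljust(width, PAD))
    return "".join(parts)


def decode(code, layout=LAYOUT):
    """Slice the string by position; no delimiters are needed."""
    total = sum(width for _, width in layout)
    if len(code) != total:
        raise ValueError(f"expected {total} characters, got {len(code)}")
    record, pos = {}, 0
    for name, width in layout:
        record[name] = code[pos:pos + width].rstrip(PAD)
        pos += width
    return record


record = {"topic": "ENVWAT", "place": "9zvxvfd", "date": "20200516"}
code = encode(record)
print(code)
print(decode(code))
```

    Adding another facet is just one more (name, width) pair in the layout, as long as every reader and writer agrees on the same layout.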

    What else might be included?

    I've incorporated those I consider important and useful in the program, but it isn't written in stone.

    Tom
