Parckwart’s Computer Stuff


Published: October 21, 2017

description.json is not a good idea

The website VisiTOR, one of many search engines for Tor onion services, proposes a method to put description files in the root directory of every web onion service. It’s supposed to enhance the searchability of sites within the onion space. Apparently, many onion services have followed the proposal lately. According to VisiTOR themselves, it’s about one hundred right now. Here’s why I won’t.

What description.json doesn’t solve

The idea is that every onion service puts up a file called description.json in the web server root directory, similar to robots.txt. This JSON file contains the following values: title, description, relation, keywords, type, language, and contactInformation.

I will now go through these values one by one and explain whether, and if so how, the underlying problem has already been solved by existing internet standards or has been in the process of being solved for years.

title is (pretty self-explanatory) supposed to be the title of the site. This is of course pretty trivial to solve without description.json: Simply use the title HTML tag of the home page, as any other search engine has always done!

description should contain a short description of the site. Major search engines have supported something like this for years, and it is a standard metadata name in HTML5: the meta HTML tag with the name description.

relation is supposed to contain different domains for the same site. In the case of onion services, these are most likely normal DNS domains (like www.parckwart.de for parckwartvo7fskp.onion). The Tor Project is working on the problem of linking DNS domains with onion domains right now, not just for web services, but any onion service in general.

keywords is intended to hold a bunch of tags for the site. And here again: The meta tag of the same name has been in use since 1995 and was the most popular meta tag in 1997, although nowadays search engines don’t use it, because it’s too easy to abuse.

type is the only aspect of this thing that kind of makes sense to me. It’s less useful for search engines than it is for Yahoo-like index sites from the old days of the web. However, pre-defined categories would probably be a good idea here.

language is the language code for the language a site is written in. In HTML, this is done by adding a lang attribute to the html tag.

contactInformation can contain an email address or similar. I don’t really see the point of collecting this information for a search engine. I have to admit that I don’t know of any standard which quite covers this use(less) case. There’s the address tag in HTML5, but that’s not quite the same thing.
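To make the point concrete, here is a minimal Python sketch of how a crawler could read four of these values (title, description, keywords and language) straight from a home page’s HTML, using only the standard library. The names HeadMetadataParser and extract_metadata and the sample page are made up for illustration; this is not how VisiTOR or any particular search engine actually works, just one way it could be done.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Hypothetical helper: collects the title, the description and keywords
    meta tags, and the lang attribute of the html tag."""

    def __init__(self):
        super().__init__()
        self.metadata = {
            "title": None,
            "description": None,
            "keywords": None,
            "language": None,
        }
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            # <html lang="..."> carries the document language.
            self.metadata["language"] = attrs.get("lang")
        elif tag == "title" and self.metadata["title"] is None:
            # Only the first title element is taken into account.
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.metadata[name] = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.metadata["title"] = "".join(self._title_parts).strip()

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html_text):
    """Return a dict with title, description, keywords and language."""
    parser = HeadMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Made-up page, purely for illustration.
SAMPLE_PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
<title>Example Onion Site</title>
<meta name="description" content="A made-up page used for illustration.">
<meta name="keywords" content="example, onion, demo">
</head>
<body><p>Hello.</p></body>
</html>"""

if __name__ == "__main__":
    print(extract_metadata(SAMPLE_PAGE))
```

Any value the page doesn’t provide simply stays None, and no extra request for a separate file is needed.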

Conclusion

In my humble opinion, description.json is a fairly useless and actually kind of problematic proposal. It’s problematic because it causes even more useless 404 errors in web server logs, and because more solutions for the same problem are never a good thing: the more there are, the more a webmaster needs to support, and thus the higher the complexity. I don’t even understand how one comes up with this, as it’s apparent that these are decades-old problems, since onion search engines are mostly the same thing as regular search engines. So it should be obvious that somebody must have thought of this before, shouldn’t it?