Tuesday, June 18, 2013

Semantic Dictionary

One of my “unfinished businesses” is a semantic dictionary. Basically, it will be a gigantic database of all words in all their forms (as entities). Each word will have a set of attributes defining what kind of word it is (subject, verb...) and a set of relations defining how it relates to other words (synonym, antonym, the same word in a different language...).
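A minimal sketch of how such entries might be stored, assuming a simple in-memory model. The names `Word`, `SemanticDictionary`, `set_attribute`, `relate` and `related` are hypothetical and only illustrate the idea of words as entities with attributes and typed relations; this is not an actual design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One word form in one language, treated as an entity."""
    text: str
    language: str


class SemanticDictionary:
    """Hypothetical in-memory store: attributes per word plus typed relations."""

    def __init__(self):
        self.attributes = defaultdict(dict)  # Word -> {attribute name: value}
        self.relations = defaultdict(set)    # (Word, relation type) -> {Word}

    def set_attribute(self, word, name, value):
        self.attributes[word][name] = value

    def relate(self, a, relation, b, symmetric=True):
        # Assumption: synonym, antonym and translation links hold both ways.
        self.relations[(a, relation)].add(b)
        if symmetric:
            self.relations[(b, relation)].add(a)

    def related(self, word, relation):
        return self.relations.get((word, relation), set())


d = SemanticDictionary()
big, large, small = Word("big", "en"), Word("large", "en"), Word("small", "en")
gross = Word("groß", "de")

d.set_attribute(big, "kind", "adjective")
d.relate(big, "synonym", large)
d.relate(big, "antonym", small)
d.relate(big, "translation", gross)
```

Keying words by both text and language keeps the same spelling in two languages apart, though it does not yet separate several meanings of one word within a single language.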

I expect its primary use to be in semantic search. The search engine will use some kind of universal meta-language, into which all natural-language queries will be transformed.

I can’t say exactly how (or if) it will work; it’s just something I’ve been thinking about for quite a long time. I’m aware there are some nuances and pickles, like the same word having different meanings (and not only across different languages), or how to deal with phrases in search queries (my original thought was to just analyze them word by word, but it’s not that easy).

These are some of the expected unknowns that I haven’t researched any further, so I don’t have a comprehensive overview of them.

And last but not least, it will be a nice performance test :)

Tuesday, April 30, 2013

Public service

For a few years I’ve had a vision of opening QPDB to the public, not as a particle database but more like a farm of databases. My inspiration was the concept of Wikia.com, where everyone can start their own wiki. I decided to do a similar thing, only with semantic/structured databases instead of wikis.

The crucial part is unfortunately my biggest weakness: making it easy for the user. I began by stripping down the user interface and ultimately reduced the offered functionality to the thinnest core. My target was to make it “just enough”: to keep the power in the backend and offer a bunch of presets, so to speak.

The application layers are: Databases > Classifications > Records (entities) > Attributes and relations.

In a database, the user will be able to create classifications (in fact entity types; the only available setting is the name), attribute types (name, unit for numeric values, and data type) and relation types (name only).
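The layering described above could be modelled roughly like this. It is only a sketch under my own assumptions: the class and field names (`Database`, `Classification`, `Record`, `AttributeType`) and the sample data are made up for illustration and do not reflect how QB actually stores things.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeType:
    name: str
    data_type: str   # e.g. "number" or "text" (assumed type names)
    unit: str = ""   # only meaningful for numeric values


@dataclass
class Record:
    name: str
    attributes: dict = field(default_factory=dict)  # attribute type name -> value
    relations: dict = field(default_factory=dict)   # relation type name -> list of Records


@dataclass
class Classification:
    name: str  # the name is the only user-facing setting
    records: list = field(default_factory=list)


@dataclass
class Database:
    name: str
    classifications: dict = field(default_factory=dict)  # name -> Classification
    attribute_types: dict = field(default_factory=dict)  # name -> AttributeType
    relation_types: set = field(default_factory=set)     # relation types have a name only


db = Database("Family")
db.classifications["Person"] = Classification("Person")
db.attribute_types["Height"] = AttributeType("Height", "number", "cm")
db.relation_types.add("Parent")

alice = Record("Alice", attributes={"Height": 170})
bob = Record("Bob", relations={"Parent": [alice]})
db.classifications["Person"].records += [alice, bob]
```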

Additionally, the user can pick everything from preset templates, one set for home and the other for business. Templates can offer settings beyond those available by default, like a value range (e.g. 0-120 for the age of a person).
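As an illustration of a template carrying a value range, here is a small sketch. `TemplateAttribute`, `accepts` and `HOME_TEMPLATE` are hypothetical names, and the "years" unit is my assumption; only the 0-120 range for age comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemplateAttribute:
    name: str
    data_type: str
    unit: str = ""
    min_value: Optional[float] = None  # None means no lower bound
    max_value: Optional[float] = None  # None means no upper bound

    def accepts(self, value):
        """Return True if value lies inside the (inclusive) allowed range."""
        if self.min_value is not None and value < self.min_value:
            return False
        if self.max_value is not None and value > self.max_value:
            return False
        return True


# A preset from a hypothetical "home" template set.
HOME_TEMPLATE = {
    "Age": TemplateAttribute("Age", "number", "years", min_value=0, max_value=120),
}

age = HOME_TEMPLATE["Age"]
print(age.accepts(35))   # inside the range
print(age.accepts(130))  # outside the range
```

Keeping the range on the template rather than on the user-created attribute type matches the idea of presets offering more than the stripped-down interface does.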

For relations, I had to deal with unwanted entity types in the suggest list. But how do you get rid of them when no such setting is available in this installation? Well, I made a little hack in the suggest algorithm: when the name of a relation type matches the name of any classification (entity type), only entities of that classification are shown.
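The hack can be sketched in a few lines. This is not the actual QB code: the `suggest` function, its parameters, the prefix matching and the case-insensitive comparison are all assumptions made for the sake of the example.

```python
def suggest(prefix, relation_type, entities, classifications):
    """Suggest entity names starting with prefix.

    entities is a list of (name, classification) pairs. If the relation
    type is named like an existing classification, only entities of that
    classification are suggested; otherwise all matches are returned.
    """
    known = {c.lower() for c in classifications}
    wanted = relation_type.lower()
    restrict = wanted if wanted in known else None

    results = []
    for name, classification in entities:
        if not name.lower().startswith(prefix.lower()):
            continue
        if restrict is not None and classification.lower() != restrict:
            continue
        results.append(name)
    return results


entities = [("Prague", "City"), ("Peter", "Person"), ("Paris", "City")]
classifications = ["City", "Person"]

# The relation type "City" matches a classification, so only cities are shown.
cities_only = suggest("p", "City", entities, classifications)

# "Owner" matches no classification, so nothing is filtered out.
everything = suggest("p", "Owner", entities, classifications)
```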

It’s the first public release of QB aimed particularly at end users as content creators. QB had a public release two years ago, as Particle Database, but that was mainly to receive some feedback (which it did). Although I kinda like the current style of QB, I feel this isn’t “it”.

I can imagine it’s too far off from end users’ expectations, and that it may be confusing and hard to comprehend. But I’m still trying to think outside the box, and I at least figured out some nifty stuff this time.