"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Saturday, April 02, 2005

Triples, Quads, Quints?

Bill de hÓra writes about RDF, triples, and quads. I think the point is that RDF represents triples (X-Y-Z), but he wants to introduce another party into the relationship, e.g. he wants to distinguish (A-(X-Y-Z)) from (B-(X-Y-Z)). Apparently RDF cannot do this concisely, i.e. you need more than A, B, X, Y, and Z in order to denote that A and B each relate to the same triple (X-Y-Z) in their own way.

Triples, quads, etc. I have to say I've not used RDF at all explicitly.

I guess "quads" are intended to support the (A-(X-Y-Z)) and (B-(X-Y-Z)) relationships concisely.
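As a minimal sketch of that guess (plain Python, not any real RDF library; the `Triple` and `Quad` types and the `contexts_of` helper are hypothetical names made up for illustration), a quad can be modeled as a triple plus one extra slot for the fourth party:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Quad(NamedTuple):
    # The fourth party (A or B in the text) sits alongside the triple.
    context: str
    subject: str
    predicate: str
    obj: str

    @property
    def triple(self) -> Triple:
        return Triple(self.subject, self.predicate, self.obj)


# (A-(X-Y-Z)) and (B-(X-Y-Z)) using only the symbols A, B, X, Y, Z.
quads = {
    Quad("A", "X", "Y", "Z"),
    Quad("B", "X", "Y", "Z"),
}


def contexts_of(triple: Triple, store: set) -> list:
    """Return every fourth party attached to the given triple, sorted."""
    return sorted(q.context for q in store if q.triple == triple)


who = contexts_of(Triple("X", "Y", "Z"), quads)
```

Here `who` holds both A and B for the one shared triple, with no extra identifiers needed.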

But, the engineer in me ignorantly wonders, won't we soon run into a desire for a fifth party in this relationship? Why stop at quads?

Quints, hexes, septs. How many is enough?

I wonder if we need a concise way of representing many occurrences of Many-Many relationships. Something like a Star Schema would simplify bulk Many-Many relationships. Bonus, it implies certain expectations about the contents of the relationships, i.e. you can speak of measurements, occurrences, hierarchies, allocations, headers / details.

Star schemas can be seen as a shorthand notation for many implied individual RDF triples. Bonus, they are proven to be incredibly scalable and support high performance with fairly simple database implementations. The column-based storage approach goes back at least to the 1970s. Another example is Kdb, which is "array based" and columnar, although not explicitly star schema based.
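To make the "shorthand" idea concrete, here is a rough sketch in plain Python. The fact row, its column names, and the `fact_row_to_triples` helper are all invented for illustration and do not come from any particular database or RDF toolkit. Each column of a fact row expands to one implied (row, column, value) triple:

```python
def fact_row_to_triples(row_id, row):
    """Expand one fact-table row into the triples it implies.

    The row identifier becomes the subject, each column name a predicate,
    and each cell value an object.
    """
    return [(row_id, column, value) for column, value in row.items()]


# A hypothetical fact row: three dimension keys and one measurement.
sale = {
    "date": "2005-04-02",
    "store": "Portland",
    "product": "widget",
    "units": 3,
}

triples = fact_row_to_triples("sale:1", sale)

for triple in triples:
    print(triple)
```

One row with four columns stands in for four triples, and the table's column list plays the role of a shared vocabulary of predicates.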

I'm just thinking RDF triples are probably too primitive, and thinking out loud about what might be more expressive. Mostly I am confused about a triple to quad to (?) progression trying to represent increasingly complex relationships. Star schemas bring a discipline, style, and simplicity to complex relationships that can scale and evolve.

I'll have to learn more about the triple / quad issue and think about this before I can become more coherent on comparing these modeling approaches.

1 comment:

Richard said...

Triples are powerful enough to do anything, assuming the semantics are present. Indeed, RDF's rdf:Statement and co allow you to describe a triple.

The difficulty comes when attempting to refer to an existing triple --- one that has been asserted. You might want to do this for provenance, security, annotation, signing, attribution...

A reified rdf:Statement is not asserted, and can't be mapped to one that is asserted. So we have no way of referring to an existing statement (or, more generally, a graph) within plain RDF itself.
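A small sketch of that point, using plain Python tuples rather than a real triple store. The `reify` helper, the blank node label `_:stmt1`, and the `ex:vouchesFor` predicate are assumptions made up for illustration; `rdf:type`, `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` are the standard RDF reification terms:

```python
def reify(statement_id, triple):
    """Describe a triple with the RDF reification vocabulary.

    Returns four triples *about* the statement; the statement itself
    is not among them.
    """
    s, p, o = triple
    return {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    }


original = ("X", "Y", "Z")

# A graph holding only the description, plus A's remark about it.
graph = reify("_:stmt1", original)
graph.add(("A", "ex:vouchesFor", "_:stmt1"))

# The description does not assert the original triple.
is_asserted = original in graph
print(len(graph), is_asserted)
```

Even if (X-Y-Z) were also added to the graph, nothing in it would tie `_:stmt1` to that particular asserted occurrence, which is the gap the fourth element of a quad fills.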

Most triple stores use quads internally, and provide APIs for working with the fourth node (which is normally regarded as "source" or "context") to, e.g., find triples loaded from a certain file. Redland, Wilbur, and others all take this approach.

There has been lots of discussion about this on the RDF Interest Group (now semanticweb@w3.org) list.


About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the pacific northwest of the u.s. I write for myself only and unless otherwise specified my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.