The Semantic Web & RDF

The Semantic Web is a way of putting structured information into Web pages, so that software can extract specific information without resorting to data mining and guessing. The semantic information may overlap with the content that is visible to the user, or it may be separate. The Semantic Web uses the Resource Description Framework (RDF) to characterize the information.

RDF data organization

RDF organizes information in triples, also called statements. A triple consists of a subject, a predicate, and an object. The subject is an identifier for a resource. The predicate is an attribute of the subject or a relationship which it has with something else. The object is the value of that attribute or relationship. For example, “George Washington” might be the subject, “date of birth” the predicate, and “February 22, 1732” the object. RDF is a data model, not a data format, and it can be expressed in many different ways.
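As a rough illustration of the difference between the data model and a data format, the sketch below holds the George Washington statement as one triple and writes it out in two ways. The Triple class and both output layouts are assumptions made for this example; neither layout is an official RDF serialization.

```python
import json
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement (hypothetical helper for illustration)."""
    subject: str
    predicate: str
    obj: str


# The example statement from the text.
birth = Triple("George Washington", "date of birth", "February 22, 1732")


def to_line(t: Triple) -> str:
    # Line-oriented layout: three quoted terms followed by a full stop.
    return f'"{t.subject}" "{t.predicate}" "{t.obj}" .'


def to_json(t: Triple) -> str:
    # The same statement as a JSON object keyed by field name.
    return json.dumps(t._asdict())


print(to_line(birth))
print(to_json(birth))
```

Both outputs carry the same subject, predicate, and object; only the surface notation differs.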

Regardless of the format, triples have the same significance. The subject, predicate, and object can each be a URI, which lets one triple refer to resources described by other triples; the object may instead be a literal value such as a string or a date. The predicate can be a qualified name in a namespace, making it part of a vocabulary. RDF Schema (RDFS) provides a standard vocabulary for describing other vocabularies, covering things like classes, data types, property hierarchies, and cross-references. Other vocabularies standardize particular usages. For example, the FOAF (friend of a friend) vocabulary is useful in describing people and the interconnections among people or things.
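The way a qualified name stands for a full URI can be sketched as follows. The two namespace URIs are the published ones for RDF Schema and FOAF; the expand function and the example.org subject and object are hypothetical names chosen for this illustration.

```python
# Prefix table: short namespace prefix -> namespace URI.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(qname: str, prefixes=PREFIXES) -> str:
    """Turn a qualified name such as 'foaf:knows' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {qname!r}")
    return prefixes[prefix] + local


# A triple whose three parts are all URIs (example.org is a placeholder).
triple = (
    "http://example.org/alice",
    expand("foaf:knows"),
    "http://example.org/bob",
)

print(triple[1])  # http://xmlns.com/foaf/0.1/knows
```

Because the predicate expands to a globally unique URI, two documents that both use foaf:knows are understood to mean the same relationship.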

RDFa

A widely used way to embed RDF in Semantic Web pages is the RDFa format. It uses HTML attributes to define triples. RDFa attributes can be attached to any element, so that the element's content does double duty as human-readable information and RDF data. These attributes have no effect on the appearance of the page, unless its CSS or JavaScript specifically makes use of them.
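A small example may make the double duty concrete. The page fragment below marks up the George Washington statement with the RDFa attributes vocab, resource, typeof, property, and content, and a deliberately simplified parser built on Python's standard html.parser collects the triples. The schema.org vocabulary, the "#washington" identifier, and the SimpleRDFaParser class are assumptions for this sketch; it handles only flat, un-nested property elements and is not a conforming RDFa processor.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Visible text and RDF data share the same elements.
PAGE = """
<p vocab="https://schema.org/" resource="#washington" typeof="Person">
  <span property="name">George Washington</span> was born on
  <span property="birthDate" content="1732-02-22">February 22, 1732</span>.
</p>
"""


class SimpleRDFaParser(HTMLParser):
    """Collects (subject, predicate, object) tuples from a tiny RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.vocab = ""
        self.subject = ""
        self._open = None  # predicate of the property element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "vocab" in a:
            self.vocab = a["vocab"]
        if "resource" in a:
            self.subject = a["resource"]
        if "typeof" in a:
            self.triples.append((self.subject, RDF_TYPE, self.vocab + a["typeof"]))
        if "property" in a:
            predicate = self.vocab + a["property"]
            if "content" in a:
                # A content attribute overrides the visible text.
                self.triples.append((self.subject, predicate, a["content"]))
            else:
                self._open = predicate
                self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None:
            value = "".join(self._text).strip()
            self.triples.append((self.subject, self._open, value))
            self._open = None


parser = SimpleRDFaParser()
parser.feed(PAGE)
parser.close()
for t in parser.triples:
    print(t)
```

The reader sees "February 22, 1732", while the content attribute gives software the same date in an unambiguous form.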

JavaScript code or an HTML processor can extract the RDFa from a page for any desired use, such as constructing a table. Once the triples have been extracted, the SPARQL query language lets them be treated like a database, with requests for information based on combinations of criteria.
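The idea of querying triples by a combination of criteria can be sketched without a SPARQL engine. The code below is plain Python pattern matching that imitates the general approach, not SPARQL itself; the sample data, the shortened predicate names, and the match and names_born_before functions are all hypothetical.

```python
# Triples as they might look after extraction from several pages.
TRIPLES = [
    ("#washington", "name", "George Washington"),
    ("#washington", "birthDate", "1732-02-22"),
    ("#lincoln", "name", "Abraham Lincoln"),
    ("#lincoln", "birthDate", "1809-02-12"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def names_born_before(triples, cutoff):
    """Combine two patterns: a birth date before cutoff, then the name."""
    results = []
    for subject, _, date in match(triples, p="birthDate"):
        # ISO 8601 dates with four-digit years compare correctly as strings.
        if date < cutoff:
            for _, _, name in match(triples, s=subject, p="name"):
                results.append(name)
    return results


print(names_born_before(TRIPLES, "1800-01-01"))
```

The second pattern reuses the subject found by the first, which is how the two criteria are joined on the same resource.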

The Semantic Web improves the searchability of pages by storing information with precise syntax and formal definitions, without interfering with their human readability.