Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication is intended to let computers understand the structure, and even the meaning, of the published information, making information search and data integration more efficient.

Although semantic publishing is not specific to the Web, it has been driven by the rise of the semantic web, a web of data. In the semantic web, information published on the web is accompanied by metadata describing the published information, thus providing a semantic context. As the semantic web is further developed and adopted, adding semantic markup to published data will become an important part of web publishing.

Although semantic publishing has the potential to change the face of web publishing, when this will happen depends on when killer applications emerge. Current technologies are already capable of building web sites whose entire content is available in both HTML and a semantic format. Examples are mindswap, UMBC ebiquity, and open lab. However, semantic web sites are not yet common. An earlier version of the news feed format, RSS 1.0, is in RDF (a semantic web standard) format, although it has become less popular than RSS 2.0 and Atom feeds. A newer attempt is trying to apply the RDF standard more broadly to various data feeds: anyone can use the free online service ufeed to create and provide RDF data resources and data feeds for products, news, events, jobs and studies.

Semantic publishing also has the potential to revolutionize scientific publishing. Tim Berners-Lee predicted in 2001 that the semantic web “will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine” [1]. Revisiting the semantic web in 2006, he and his colleagues believed the semantic web “could bring about a revolution in how, for example, scientific content is managed throughout its life cycle” [2]. One simple idea that may radically change scientific communication is for researchers to self-publish their experimental data directly on the web in a semantic format. In one scenario, a scientist could design and run an experiment, and share the experiment information with the world in real time by publishing the data as a semantic object on the web. Semantic search engines would then put these semantic data at everyone’s fingertips. The W3C interest group in healthcare and life sciences is currently exploring this idea of self-publishing experiments, and a demo is available.


Two different approaches to semantic publishing

  • Publish information as data objects using semantic web languages like RDF and OWL. An ontology is usually developed for a specific information domain and is then used to formally represent the data in that domain. Semantic publishing of more general information, such as product information, news, and job openings, uses a so-called shallow ontology, as exemplified by the free Ufeed online tool. The W3C SWEO Linking Open Data Project maintains a list of data sources that follow this approach, as well as a list of Semantic Publishing Tools.
  • Embed formal metadata in documents using markup languages like RDFa and Microformats.
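
The two approaches can be contrasted with a minimal Python sketch. It takes one job-opening record and publishes it both ways: as standalone RDF statements (serialized here as N-Triples) and as an HTML fragment carrying the same statements in RDFa attributes. The vocabulary URI, the resource URI, the property names and the helper functions `to_ntriples` and `to_rdfa` are all hypothetical and chosen only for illustration; a real deployment would use an established ontology and an RDF library.

```python
from html import escape

# Hypothetical vocabulary and resource URIs, used only for illustration.
EX = "http://example.org/vocab#"
JOB = "http://example.org/jobs/42"

job = {"title": "Research Engineer", "employer": "Example Lab"}


def to_ntriples(subject, props):
    """Approach 1: serialize the record as standalone RDF (N-Triples)."""
    lines = []
    for name, value in props.items():
        # Escape only backslashes and double quotes; assumes single-line values.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{EX}{name}> "{literal}" .')
    return "\n".join(lines)


def to_rdfa(subject, props):
    """Approach 2: embed the same statements in HTML via RDFa attributes."""
    spans = "".join(
        f'<span property="{escape(name)}">{escape(value)}</span>'
        for name, value in props.items()
    )
    # 'vocab' gives the default vocabulary, 'about' names the subject.
    return f'<div vocab="{EX}" about="{subject}">{spans}</div>'


print(to_ntriples(JOB, job))
print(to_rdfa(JOB, job))
```

In the first form the data is a separate resource that a semantic search engine can fetch directly; in the second, the human-readable page and the machine-readable statements are the same document.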

Examples of ontologies and vocabularies for publishing

Examples of free or open source tools and services

  • Semantic MediaWiki: An extension to the wiki application MediaWiki that allows users to semantically annotate data on the wiki, and then republish it in formats such as RDF/XML.
  • Swoogle: A search engine for ontologies and instance data on the Web.
  • Ufeed: A free online tool for publishing data resources and data feeds in RDF, including product information, news, events, jobs and studies.
  • D2R Server: Tool for publishing relational databases on the Semantic Web as Linked Data and SPARQL endpoints.
  • BigBlogZoo: 60,000 XML sources are regularly crawled, and their articles are reaggregated under a semantic URL. Articles are categorized using the DMOZ RDF classification schema.
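
To make the idea of publishing a relational database as RDF more concrete, here is a minimal Python sketch that turns each row of a table into N-Triples statements. It is not how D2R Server or any other tool listed above is implemented or configured; the base URI, the URI pattern, the `product` table and the `rows_to_triples` helper are all assumptions made for illustration.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI for generated resources


def rows_to_triples(conn, table, key):
    """Yield one N-Triples line per non-key, non-NULL column value.

    The table name is interpolated into the query, so it must be trusted.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        # Each row becomes a resource identified by its primary key.
        subject = f"<{BASE}{table}/{record[key]}>"
        for col, value in record.items():
            if col == key or value is None:
                continue
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            # Each column becomes a property; values are plain literals.
            yield f'{subject} <{BASE}{table}#{col}> "{literal}" .'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.5)")
triples = list(rows_to_triples(conn, "product", "id"))
print("\n".join(triples))
```

The sketch treats every value as an untyped string literal; a fuller mapping would attach datatypes and turn foreign keys into links between resources.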

See also


  1. W3C: W3C is developing semantic web infrastructures and standards through its many semantic web activities.
  2. Resource Description Framework (RDF): a language for representing information about resources in the World Wide Web.
  3. Web Ontology Language (OWL): OWL facilitates greater machine interoperability of Web content.
  4. Scientific publishing on the ‘semantic web’, by Tim Berners-Lee and James Hendler, Nature 410, 1023 - 1024 (26 Apr 2001).
  5. The Semantic Web Revisited, by Nigel Shadbolt, Tim Berners-Lee and Wendy Hall, IEEE Intelligent Systems 21(3) pp. 96-101, May/June 2006.


