| | |
|---|---|
| Developer(s) | University of Leipzig, Freie Universität Berlin, OpenLink Software |
| Initial release | 23 January 2007 |
| Stable release | DBpedia 3.4 / 11 November 2009[1] |
| Written in | PHP, Java, VSP |
| Operating system | Virtuoso Universal Server |
| Type | Semantic Web, Linked Data |
| License | GNU General Public License |
| Website | dbpedia.org |
DBpedia is a project that aims to extract structured content from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web.[2] DBpedia allows users to query relationships and properties associated with Wikipedia resources, including links to other related datasets.[3] DBpedia has been described by Tim Berners-Lee as one of the more famous parts of the Linked Data project.[4]
The project was started by people at the Free University of Berlin and the University of Leipzig, in collaboration with OpenLink Software[5], and the first publicly available dataset was published in 2007. It is made available under free licences, allowing others to reuse the dataset.
Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables, categorisation information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.
As of November 2009, the DBpedia dataset describes more than 2.9 million things, including at least 282,000 persons, 339,000 places (including 241,000 populated places), 88,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations (including 20,000 companies and 29,000 educational institutions), 130,000 species and 4,400 diseases. The DBpedia knowledge base features labels and abstracts for these things in 91 different languages, 807,000 links to images, 3,840,000 links to external web pages, 4,878,100 external links into other RDF datasets, 415,000 Wikipedia categories, and 75,000 YAGO categories. Information spread across multiple pages can be extracted from this dataset. For example, book authorship can be put together from the pages about the work and the pages about the author.
The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information. As of November 2009, the DBpedia dataset consists of around 479 million pieces of information (RDF triples) out of which 190 million were extracted from the English edition of Wikipedia and 289 million were extracted from other language editions.[6]
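To make the triple model concrete, here is a minimal sketch in Python. It treats each RDF triple as a (subject, predicate, object) tuple and shows how facts stated on different pages can be joined. The `ex:` identifiers and the helpers `match` and `author_birthplaces` are hypothetical names invented for this illustration; they are not part of DBpedia or of its extraction framework.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical, hand-written triples; not real DBpedia data.
TRIPLES: List[Triple] = [
    ("ex:Book_A", "ex:author", "ex:Author_X"),       # from the page about the work
    ("ex:Book_B", "ex:author", "ex:Author_X"),
    ("ex:Author_X", "ex:birthPlace", "ex:City_Y"),   # from the page about the author
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple whose fields equal all non-None arguments."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def author_birthplaces(triples: List[Triple], work: str) -> List[str]:
    """Join two patterns: work -> author, then author -> birth place."""
    authors = {obj for _, _, obj in match(triples, s=work, p="ex:author")}
    return sorted(
        obj for subj, _, obj in match(triples, p="ex:birthPlace") if subj in authors
    )
```

Calling `author_birthplaces(TRIPLES, "ex:Book_A")` combines a fact recorded about the work with a fact recorded about its author, which is the kind of cross-page lookup that a uniform triple store makes possible.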
DBpedia extracts factual information from Wikipedia pages, allowing users to find answers to questions where the information is spread across many different Wikipedia articles. Data is accessed using an SQL-like query language for RDF called SPARQL. For example, imagine you were interested in the Japanese shōjo manga series Tokyo Mew Mew, and wanted to find the genres of other works written by its illustrator. DBpedia combines information from Wikipedia's entries on Tokyo Mew Mew, Mia Ikumi and on works such as Super Doll Licca-chan and Koi Cupid. Since DBpedia normalises information into a single database, the following query can be asked without needing to know exactly which entry carries each fragment of information, and will list related genres:
    PREFIX dbprop: <http://dbpedia.org/property/>
    PREFIX db: <http://dbpedia.org/resource/>
    SELECT ?who ?work ?genre WHERE {
      db:Tokyo_Mew_Mew dbprop:illustrator ?who .
      ?work dbprop:author ?who .
      OPTIONAL { ?work dbprop:genre ?genre } .
    }
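A query like this is usually sent to a SPARQL endpoint over HTTP. The following Python sketch shows one way to do that with the standard library only. The endpoint URL `http://dbpedia.org/sparql` is an assumption (the text gives only the site name), and the function names `build_request_url` and `run_query` are hypothetical. The query string uses a single prefix name, `dbprop:`, throughout.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request, urlopen

# Assumed endpoint location; not stated in the text above.
ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """\
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?work ?genre WHERE {
  db:Tokyo_Mew_Mew dbprop:illustrator ?who .
  ?work dbprop:author ?who .
  OPTIONAL { ?work dbprop:genre ?genre } .
}
"""


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query in the `query` parameter defined by the SPARQL protocol."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(query: str, endpoint: str = ENDPOINT) -> dict:
    """Send the query and parse a JSON result set (requires network access)."""
    request = Request(
        build_request_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

Only `build_request_url` is free of side effects. `run_query` depends on the endpoint being reachable and on it honouring the `Accept` header, so it should be treated as a sketch rather than a guaranteed recipe.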
The dataset is interlinked at the RDF level with various other Open Data datasets on the Web. This enables applications to enrich DBpedia data with data from these datasets. As of November 2009, there are more than 3.7 million interlinks between DBpedia and external datasets, including Freebase, OpenCyc, UMBEL, GeoNames, Musicbrainz, CIA World Fact Book, DBLP, Project Gutenberg, DBtune Jamendo, Eurostat, Uniprot, Bio2RDF, and US Census data.[7][8] The Thomson Reuters initiative OpenCalais, the Linked Open Data project of the New York Times, and the Zemanta API also include links to DBpedia.[9][10][11] The BBC uses DBpedia to help organize its content.[12][13]
Amazon provides a DBpedia Public Data Set that can be integrated into Amazon Web Services applications.[14]