Blank Node
A blank node, or bnode, is a subject or object in a Resource Description Framework (RDF) graph for which no Uniform Resource Identifier (URI) or literal is given. Blank nodes offer a way to create a container that collates disparate information about an entity without minting a new URI, but they can introduce complexity when querying data with SPARQL or merging data from different sources. In the first case, SPARQL queries over datasets that express the same information with and without blank nodes can return different results, because blank nodes may express undefined or redundant values. In the second case, blank node identifiers often have only local scope, so merging datasets can duplicate or conflate blank nodes, or leave redundant blank nodes that could have been merged. For these reasons, blank nodes may be used as local identifiers within a specific dataset, but should be properly declared (provided with a URI) when that dataset is combined with others. LINCS does not include blank nodes in its ingested datasets: all entities are identified with URIs, either through the reconciliation process or by minting new URIs.
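The merging problem described above can be illustrated with a minimal sketch in plain Python, with no RDF library involved. Everything in it is an assumption made for illustration: the two datasets, the property names, the `https://example.org/entity/` base, and the helper functions are hypothetical and are not part of any LINCS tooling. Triples are modelled as tuples, and strings beginning with `_:` stand in for blank node labels.

```python
from itertools import count

# Two hypothetical datasets that happen to reuse the same local label "_:b1".
dataset_a = [
    ("_:b1", "name", "Alice"),
    ("_:b1", "city", "Guelph"),
]
dataset_b = [
    ("_:b1", "name", "Bob"),
    ("_:b1", "city", "Toronto"),
]


def is_blank(term):
    """Treat any string starting with '_:' as a blank node label."""
    return isinstance(term, str) and term.startswith("_:")


def naive_merge(*datasets):
    """Concatenate triples without touching labels, so equal labels collide."""
    return [triple for dataset in datasets for triple in dataset]


def replace_blank_nodes(dataset, base, counter):
    """Give each blank node label in one dataset its own URI-like identifier."""
    mapping = {}

    def rename(term):
        if not is_blank(term):
            return term
        if term not in mapping:
            mapping[term] = f"{base}{next(counter)}"
        return mapping[term]

    return [(rename(s), p, rename(o)) for s, p, o in dataset]


def subjects(triples):
    """Collect the distinct subjects that appear in a list of triples."""
    return {s for s, _, _ in triples}


# Merging as-is conflates two different people into one node.
conflated = naive_merge(dataset_a, dataset_b)

# Replacing blank nodes per dataset, with a shared counter, keeps them apart.
counter = count(1)
base = "https://example.org/entity/"  # hypothetical namespace
safe = naive_merge(
    replace_blank_nodes(dataset_a, base, counter),
    replace_blank_nodes(dataset_b, base, counter),
)

print(len(subjects(conflated)))  # 1
print(len(subjects(safe)))       # 2
```

In the naive merge, Alice and Bob end up described by a single node because both datasets used the label `_:b1`. Once each dataset's blank nodes are replaced with distinct identifiers before merging, the two entities stay separate.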
Examples
- Hogan et al. (2016) “Everything You Always Wanted to Know About Blank Nodes”: An example graph from this paper states that the tennis player :Federer won the :FrenchOpen in 2009. It also states that he won :Wimbledon, with one such win in 2003. Each blank node represents a winning event that links Federer to a specific tournament (Wimbledon or the French Open) and, in two cases, to the year of the win.
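The graph described in this example can be approximated as a list of triples in plain Python. This is a reconstruction from the description above rather than a copy of the paper's figure: the predicate names `:wins`, `:event`, and `:year`, the labels `_:b1` to `_:b3`, and the assumption that there are exactly three winning events are all illustrative, as is the hypothetical helper `wins`.

```python
# Triples as (subject, predicate, object); "_:" labels stand for blank nodes.
graph = [
    (":Federer", ":wins", "_:b1"),
    ("_:b1", ":event", ":FrenchOpen"),
    ("_:b1", ":year", 2009),
    (":Federer", ":wins", "_:b2"),
    ("_:b2", ":event", ":Wimbledon"),
    ("_:b2", ":year", 2003),
    (":Federer", ":wins", "_:b3"),
    ("_:b3", ":event", ":Wimbledon"),  # no year stated for this win
]


def first_object(graph, subject, predicate):
    """Return the first object for a subject/predicate pair, or None."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def wins(graph, player):
    """Return one (tournament, year) pair per winning event of a player."""
    results = []
    for s, p, win in graph:
        if s == player and p == ":wins":
            tournament = first_object(graph, win, ":event")
            year = first_object(graph, win, ":year")
            results.append((tournament, year))
    return results


federer_wins = wins(graph, ":Federer")
for tournament, year in federer_wins:
    print(tournament, year)
```

Under these assumptions, the lookup yields two Wimbledon entries, one with a year and one without. This is the kind of undefined or redundant value that can make query results differ from those over a dataset that expresses the same information without blank nodes.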
Further Resources
- Blank Node (Wikipedia)
- Chen (2012) “Blank Nodes in RDF”
- W3C (2014) “3.5 Replacing Blank Nodes with IRIs”