I haven't been spending very much time following the developments regarding the Bibframe vocabulary and only follow the Bibframe mailing list sporadically. That's why I am happy when someone else makes the effort to take a deeper look at the vocabulary and reports the results. That is what Robert Sanderson did with his text titled "Differences between BibFrame and other Linked Open Data Approaches".
The Problem: Using strings as/instead of identifiers
In his text, Rob wants to "point out the differences between BibFrame's use of RDF and other more common usage patterns". This blog post only discusses the first of the differences Rob points out in the chapter "String Authorities rather than Identifiers": Instead of putting the focus on interlinking things (bibliographic resources, persons, concepts, organisations etc.) identified by URIs, Bibframe emphasizes an approach that rather builds on using blank node identifiers and defining authorities by a canonical string like "Tolkien, J. R. R. (John Ronald Reuel), 1892-1973". Rob writes:
"BibFrame tries to make use of existing authority records and canonical string-based labels due to its background in MARC, a format designed to be as compact as possible for adding metadata to strings. Unfortunately, this does not map well into Linked Data which makes use of identifiers to globally and uniquely distinguish real world and digital entities. These two world-views collide in the use of Authorities in BibFrame."
By and large, I agree with Rob's diagnosis. It is important to note that this "string approach" isn't limited to Bibframe (as Rob says himself when he refers to MADS) but characterizes a whole approach to representing authority data in RDF. Also, this isn't solely a MARC thing but is rooted in Anglo-American cataloging practice in general. In fact, one can easily use MARC with an identifier-based authority approach, as a look at some MARC records from German libraries shows. (More about this further down.)
There are different practices of authority cataloging
What became clear to me while reading Rob's text is that German libraries might be better off avoiding Bibframe, at least if its focus stays on a string-based authority approach. The following statement by Rob may be true for the Anglo-American cataloging practice:
An "authority is about the approved form in which the person's name should be recorded as a string, it does not identify the person directly. (...) This is a fundamental difference between regular Linked Open Data and BibFrame's use of RDF. BibFrame relies on strings, due to its heritage, whereas LOD makes use of identity."
Accordingly, the Library of Congress describes its name authority file as follows (my emphasis):
"The Library of Congress Name Authority File (NAF) file provides authoritative data for names of persons, organizations, events, places, and titles."
Accordingly, this is what the authority RDF for Tolkien looks like in the Library of Congress name authority (snippet):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#> .
@prefix identifiers: <http://id.loc.gov/vocabulary/identifiers/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://id.loc.gov/authorities/names/n79005673>
    a madsrdf:Authority, madsrdf:PersonalName, skos:Concept ;
    identifiers:lccn "n 79005673" ;
    identifiers:oclcnum "oca00239830" ;
    madsrdf:authoritativeLabel "Tolkien, J. R. R. (John Ronald Reuel), 1892-1973"@en ;
    madsrdf:elementList (
        [ madsrdf:elementValue "Tolkien, J. R. R."@en ; a madsrdf:FullNameElement ]
        [ madsrdf:elementValue "(John Ronald Reuel),"@en ; a madsrdf:FullNameElement ]
        [ madsrdf:elementValue "1892-1973"@en ; a madsrdf:DateNameElement ]
    ) ;
    madsrdf:hasExactExternalAuthority <http://viaf.org/viaf/sourceID/LC%7Cn+79005673#skos:Concept> ;
    madsrdf:identifiesRWO [
        madsrdf:birthdate "18920103" ;
        madsrdf:deathdate "19730902" ;
        madsrdf:hasAffiliation
            [ madsrdf:affiliatedWith "University of Leeds" ; a madsrdf:Affiliation ],
            [ madsrdf:affiliatedWith "University of Oxford" ; a madsrdf:Affiliation ] ;
        a madsrdf:RWO, <http://xmlns.com/foaf/0.1/Person>
    ] .
One may find information about the "real world object" in this RDF, but it doesn't get much attention: it does not even get its own URI and is only identified by a blank node. (We will see further down that the Library of Congress's approach is unique in this respect compared to other RDF authority files worldwide.)
About cataloging practice in Germany and Austria, though, one cannot say that it "relies on strings". A central tool for German-speaking catalogers is the German Integrated Authority File, created and curated by many different institutions in the German-speaking world. The Integrated Authority File has existed since 2012 and is the product of integrating three different authority files for persons, corporate bodies and subject headings. It is described by the Deutsche Nationalbibliothek (DNB) as follows:
"The Integrated Authority File (GND) contains data records representing persons, corporate bodies, congresses, geographic entities, topics and works."
This already sounds a bit different and doesn't mention "names" at all. Let's take a deeper look at the German cataloging practice regarding authorities.
German ID-based authority practice
In the Integrated Authority File a numeric ID (GND ID) is used to identify an authority record. Likewise, each bibliographic record that references this authority record uses the GND ID. For an example, take a look at these two MARC XML records from DNB.
This cataloging practice emerged in the 1990s and makes German library data "linked data ready". The Integrated Authority File data was one of the first linked data publications in the German library world. In publishing the authority data, the Deutsche Nationalbibliothek chose a different approach than the Library of Congress. Instead of just publishing authority records in RDF and assigning URIs (Uniform Resource Identifiers) to these records, they created URIs for the things the authority records describe, i.e. for persons, corporate bodies, topics etc. "http://d-nb.info/gnd/" is used as the namespace to which the respective GND ID is appended. For example: The ID for Tolkien's authority record is '118623222' and his linked data URI is 'http://d-nb.info/gnd/118623222'. You can fetch the following RDF information from this URI (snippet, in Turtle notation):
GND authority data in RDF
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix gndo: <http://d-nb.info/standards/elementset/gnd#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://d-nb.info/gnd/118623222>
    gndo:gndIdentifier "118623222" ;
    gndo:preferredNameEntityForThePerson [
        gndo:forename "J. R. R." ;
        gndo:surname "Tolkien"
    ] ;
    gndo:preferredNameForThePerson "Tolkien, J. R. R." ;
    a gndo:DifferentiatedPerson ;
    owl:sameAs <http://dbpedia.org/resource/J._R._R._Tolkien>, <http://viaf.org/viaf/95218067> ;
    foaf:page <http://de.wikipedia.org/wiki/J._R._R._Tolkien> .
As you can see, the German Integrated Authority File (GND) has its own ontology (GND ontology, see here for an overview of the ontology's class hierarchy) created and maintained by the DNB. The owl:sameAs links to DBpedia and VIAF (which models person authorities as persons, not as strings) clearly show that GND defines name authorities as persons with an ID and not simply as name authorities with canonical strings.
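The URI pattern described above is simple enough to sketch in a few lines of Python. This is only an illustration: the helper names `gnd_uri` and `gnd_id_from_uri` are made up for this post and are not part of any DNB tooling. The only facts taken from the text are the `http://d-nb.info/gnd/` namespace and Tolkien's GND ID.

```python
# Namespace taken from the example URI in the text.
GND_NAMESPACE = "http://d-nb.info/gnd/"


def gnd_uri(gnd_id: str) -> str:
    """Build the linked data URI for the entity a GND record describes."""
    return GND_NAMESPACE + gnd_id.strip()


def gnd_id_from_uri(uri: str) -> str:
    """Recover the GND ID from such a URI (hypothetical helper)."""
    if not uri.startswith(GND_NAMESPACE):
        raise ValueError(f"not a GND URI: {uri}")
    return uri[len(GND_NAMESPACE):]


tolkien = gnd_uri("118623222")
print(tolkien)
print(gnd_id_from_uri(tolkien))
```

The point of the sketch is that the same ID that sits in a MARC field can be turned into a link to the person without any string matching on names.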
Linking to GND using Dublin Core & MARC relators
From 2010 on a handful of libraries and library service centers in Germany started publishing their bibliographic records as linked data. As one might expect reading the previous paragraph, it was quite easy for them to not only produce RDF but to link to other datasets, at least to the GND. Just take a look at these examples from two German union catalogs in RDF: lobid and b3kat.
In 2012, the DNB started publishing the German national bibliography as linked data. Also in 2012, different linked data publishers from the German-speaking library world started working together within the KIM-DINI working group (KIM = Competence Centre Interoperable Metadata) to promote best practices for the RDF representation of bibliographic records, which resulted in a set of recommendations (German, pdf) first published in 2013. Following these recommendations, an RDF representation of a DNB title record currently looks like this (snippet):
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix marcRole: <http://id.loc.gov/vocabulary/relators/> .

<http://d-nb.info/1022176307>
    marcRole:ill <http://d-nb.info/gnd/156605406> ;
    marcRole:trl <http://d-nb.info/gnd/110833732> ;
    dc:title "Der kleine Hobbit" ;
    dcterms:alternative "The hobbit <dt.>" ;
    dcterms:creator <http://d-nb.info/gnd/118623222> ;
    dcterms:issued "2012" .
You can see how DC terms and MARC relator properties are used to directly link to the persons in the GND authority file. (And yes, the German version of "The hobbit" was named "The small hobbit" though Bilbo is actually of average height — at least for a hobbit.)
Linking to GND using Bibframe (test data)
For some weeks now, DNB has been providing Bibframe representations of the title records along with the linked data just mentioned. Here is the RDF that you get when requesting it for the same resource:
@prefix bf: <http://bibframe.org/vocab/> .

<http://d-nb.info/1034321757>
    bf:dimensions "20 cm" ;
    bf:ean "9783423715669" ;
    bf:editionStatement "Neuausg." ;
    bf:extent "397 S." ;
    bf:frequency <http://marc21rdf.info/terms/continuingfre%23/u> ;
    bf:illustrativeContentNote "Ill." ;
    bf:instanceOf <http://d-nb.info/bf_temp/work_1034321757> ;
    bf:isbn10 "3423715669" ;
    bf:isbn13 "9783423715669" ;
    bf:modeOfIssuance "Einbändiges Werk" ;
    bf:nbn "13,A46", "13,N20" ;
    bf:responsibilityStatement "John Ronald R. Tolkien. Aus dem Engl. von Walter Scherf. Mit Vignetten von Max Meinzold" ;
    bf:title "Der kleine Hobbit" ;
    a bf:Instance .

<http://d-nb.info/bf_temp/work_1034321757>
    bf:associatedAgent
        [ bf:hasGNDLink <http://d-nb.info/gnd/118623222> ;
          bf:label "Tolkien, J. R. R." ;
          bf:resourceRole <http://id.loc.gov/vocabulary/relators/aut> ;
          a bf:Person ],
        [ bf:hasGNDLink <http://d-nb.info/gnd/110833732> ;
          bf:label "Scherf, Walter" ;
          bf:resourceRole <http://id.loc.gov/vocabulary/relators/trl> ;
          a bf:Person ],
        [ bf:hasGNDLink <http://d-nb.info/gnd/156605406> ;
          bf:label "Hehn-Kynast, Juliane" ;
          bf:resourceRole <http://id.loc.gov/vocabulary/relators/ill> ;
          a bf:Person ],
        [ bf:hasGNDLink <http://d-nb.info/gnd/1022774611> ;
          bf:label "Meinzold, Max" ;
          bf:resourceRole <http://id.loc.gov/vocabulary/relators/ill> ;
          a bf:Person ] ;
    bf:hasInstance <http://d-nb.info/1034321757> ;
    bf:title "Der kleine Hobbit", "The Hobbit" ;
    bf:uniformTitle "The Hobbit, dt." ;
    a bf:Work .
The most obvious difference compared to the DC-based RDF above is that there are actually two resources — a Bibframe instance and a work. Taking a look at the links to authority data you see what Rob is complaining about: Instead of a simple dcterms:creator link between a bibliographic resource and a person you get a blank node for a Bibframe person that then links to the GND with bf:hasGNDLink. Doesn't look like any sane person would prefer this data over the RDF shown above.
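To make the practical difference concrete, here is a small Python sketch of what a data consumer has to do in each case. It is an illustration under simplifying assumptions, not anyone's actual tooling: triples are plain tuples instead of an RDF library, the blank node is given the made-up label `_:b0`, and the function names are hypothetical. The URIs and property names are the ones from the two snippets.

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library needed.
DCT_CREATOR = "http://purl.org/dc/terms/creator"
BF = "http://bibframe.org/vocab/"
REL_AUT = "http://id.loc.gov/vocabulary/relators/aut"
TOLKIEN = "http://d-nb.info/gnd/118623222"
DC_RECORD = "http://d-nb.info/1022176307"
BF_WORK = "http://d-nb.info/bf_temp/work_1034321757"

# DC-based pattern: the title record links straight to the person.
dc_graph = [(DC_RECORD, DCT_CREATOR, TOLKIEN)]

# Bibframe test data pattern: the work points to a blank node ("_:b0" is
# an arbitrary label), which carries the role and the link to the GND.
bf_graph = [
    (BF_WORK, BF + "associatedAgent", "_:b0"),
    ("_:b0", BF + "hasGNDLink", TOLKIEN),
    ("_:b0", BF + "label", "Tolkien, J. R. R."),
    ("_:b0", BF + "resourceRole", REL_AUT),
]


def objects(graph, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def creators_direct(graph, record):
    """One hop: follow dcterms:creator."""
    return objects(graph, record, DCT_CREATOR)


def authors_via_blank_node(graph, work):
    """Two hops plus a filter: agent node, check its role, then follow the GND link."""
    found = []
    for agent in objects(graph, work, BF + "associatedAgent"):
        if REL_AUT in objects(graph, agent, BF + "resourceRole"):
            found.extend(objects(graph, agent, BF + "hasGNDLink"))
    return found


print(creators_direct(dc_graph, DC_RECORD))
print(authors_via_blank_node(bf_graph, BF_WORK))
```

Both functions end up at the same GND URI, but the second one needs an intermediate node that nobody outside this document can refer to.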
How do others do it?
So we have these two different practices of representing name authority data in RDF and see that the Bibframe initiative, which calls itself the "foundation for the future of bibliographic description that happens on the web and in the networked world", chose a rather impractical approach. This raises the question of how others do this. I understand Bibframe as an initiative with an international scope, so I guess it should meet the demands of, and be in line with, authority practices around the world.
Below are examples of some RDF representations of records from other name/person authority files (snippets). I won't go into much detail discussing these but will comment on the general approach taken.
VIAF follows the same Linked Data compatible approach as DNB to represent authorities in RDF. Tolkien is typed as foaf:Person and can be directly linked to using properties like dcterms:creator.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix viaf: <http://viaf.org/ontology/1.1/#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdaGr2: <http://rdvocab.info/ElementsGr2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdaEnt: <http://rdvocab.info/uri/schema/FRBRentitiesRDA/> .

<http://viaf.org/viaf/95218067>
    rdaGr2:dateOfBirth "1892-01-03" ;
    rdaGr2:dateOfDeath "1973-09-02" ;
    a rdaEnt:Person, foaf:Person ;
    owl:sameAs <http://d-nb.info/gnd/15818212X>,
        <http://data.bnf.fr/ark:/12148/cb11926763j#foaf:Person>,
        <http://dbpedia.org/resource/J._R._R._Tolkien>,
        <http://libris.kb.se/resource/auth/97224>,
        <http://www.idref.fr/027164918/id> ;
    foaf:name "J.R.R Tolkien", "JRR Tolkien", "John Ronald Reuel Tolkien", "T'olk'in, J. R. R. 1892-1973", ... .
Looking to Sweden's Libris catalog, we see a lot of similarities to VIAF. Libris primarily types person authorities as foaf:Person but adds a skos:Concept with its own URI which is linked to the person with foaf:focus.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia: <http://dbpedia.org/property/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rda: <http://RDVocab.info/ElementsGr2/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix libris: <http://libris.kb.se/vocabulary/experimental#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://libris.kb.se/resource/auth/97224>
    rda:dateOfBirth "1892" ;
    rda:dateOfDeath "1973" ;
    dbpedia:birthYear "1892" ;
    dbpedia:deathYear "1973" ;
    libris:key "Tolkien, J. R. R., 1892-1973" ;
    rdf:seeAlso <http://en.wikipedia.org/wiki/J._R._R._Tolkien> ;
    a foaf:Person ;
    rdfs:isDefinedBy <http://data.libris.kb.se/open/auth/97224.rdf> ;
    owl:sameAs <http://dbpedia.org/resource/J._R._R._Tolkien>,
        <http://id.loc.gov/authorities/names/325978>,
        <http://viaf.org/viaf/95218067> ;
    foaf:name "J. R. R Tolkien",
        "John R. R Tolkien",
        "John Ronald Reuel Tolkien",
        "Tolkien, J. R. R., 1892-1973",
        "Tolkien, John R. R., 1892-1973",
        "Tolkien, John Ronald Reuel, 1892-1973" .

<http://libris.kb.se/resource/auth/97224#concept>
    a skos:Concept ;
    skos:altLabel "J. R. R Tolkien",
        "John R. R Tolkien",
        "John Ronald Reuel Tolkien",
        "Tolkien, J. R. R., 1892-1973",
        "Tolkien, John R. R., 1892-1973",
        "Tolkien, John Ronald Reuel, 1892-1973" ;
    skos:exactMatch <http://viaf.org/viaf/95218067/#skos:Concept> ;
    foaf:focus <http://libris.kb.se/resource/auth/97224> .
National Diet Library (NDL), Japan
An authority file of the National Diet Library looks quite similar to Libris' authorities. Interestingly, foaf:primaryTopic is used instead of foaf:focus to link the skos:Concept to the foaf:Person.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix rda: <http://RDVocab.info/ElementsGr2/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dct: <http://purl.org/dc/terms/> .

<http://id.ndl.go.jp/auth/entity/00000047>
    rda:dateOfBirth "1931" ;
    a foaf:Person ;
    foaf:name "阿部洋" .

<http://id.ndl.go.jp/auth/ndlna/00000047>
    dct:created "1979-04-01" ;
    dct:modified "2005-01-05T10:44:08" ;
    dct:source "奥付", "韓国と台湾の教育開発 / 阿部宗光, 阿部洋 編" ;
    a skos:Concept ;
    rdfs:label "阿部, 洋, 1931-" ;
    skos:exactMatch <http://viaf.org/viaf/sourceID/NDL%7C00000047> ;
    skos:inScheme <http://id.ndl.go.jp/auth#personalNames> ;
    xl:prefLabel [
        ndl:transcription "Abe, Hiroshi, 1931-"@ja-latn, "アベ, ヒロシ, 1931-"@ja-kana ;
        xl:literalForm "阿部, 洋, 1931-"
    ] ;
    foaf:primaryTopic <http://id.ndl.go.jp/auth/entity/00000047> .
Bibliothèque nationale de France
BNF does it the other way around compared to Libris. Here, a name authority is primarily typed as skos:Concept. This skos:Concept is linked to the person Tolkien using foaf:focus and, thus, enables direct linking to person authorities.
@prefix bio: <http://vocab.org/bio/0.1/> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdagroup2elements: <http://RDVocab.info/ElementsGr2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://data.bnf.fr/ark:/12148/cb11926763j>
    a skos:Concept ;
    rdfs:seeAlso <http://catalogue.bnf.fr/ark:/12148/cb11926763j>,
        <http://fr.wikipedia.org/wiki/J._R._R._Tolkien> ;
    owl:sameAs <http://dbpedia.org/resource/J._R._R._Tolkien>,
        <http://isni-url.oclc.nl/isni/0000000121441970>,
        <http://www.idref.fr/027164918> ;
    skos:altLabel "John Ronald Renel Tolkien (1892-1973)"@fr ;
    skos:prefLabel "John Ronald Reuel Tolkien (1892-1973)"@fr ;
    foaf:focus <http://data.bnf.fr/ark:/12148/cb11926763j#foaf:Person> .

<http://data.bnf.fr/ark:/12148/cb11926763j#foaf:Person>
    a foaf:Person ;
    rdagroup2elements:biographicalInformation "Romancier. - Professeur de langue et littérature anglo-saxonnes" ;
    rdagroup2elements:dateOfBirth <http://data.bnf.fr/date/1892/> ;
    rdagroup2elements:dateOfDeath <http://data.bnf.fr/date/1973/> ;
    rdagroup2elements:fieldOfActivityOfThePerson <http://dewey.info/class/800/>, "Littératures" ;
    rdagroup2elements:languageOfThePerson <http://id.loc.gov/vocabulary/iso639-2/eng> ;
    dc:date "1892-1973" ;
    bio:Birth "1892-01-03" ;
    bio:Death "1973-09-02" ;
    owl:sameAs <http://viaf.org/viaf/95218067> ;
    foaf:birthday "01-03" ;
    foaf:depiction <http://upload.wikimedia.org/wikipedia/commons/thumb/d/d9/Tolkien_1916-2.jpg/200px-Tolkien_1916-2.jpg> ;
    foaf:familyName "Tolkien" ;
    foaf:gender "male" ;
    foaf:givenName "John Ronald Reuel" ;
    foaf:name "John Ronald Reuel Tolkien" ;
    foaf:page <http://data.bnf.fr/ark:/12148/cb11926763j> .
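Libris, NDL and BNF all pair a skos:Concept with a person that has its own URI, so a consumer can always get from the concept to the person in one step. A rough Python sketch of that step follows; the function name `person_for` and the tuple-based graph are assumptions made for illustration, while the URIs and the two FOAF properties come from the snippets above.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
# Properties the snippets use to get from a skos:Concept to the person.
CONCEPT_TO_PERSON = (FOAF + "focus", FOAF + "primaryTopic")

triples = [
    # Libris: concept --foaf:focus--> person
    ("http://libris.kb.se/resource/auth/97224#concept", FOAF + "focus",
     "http://libris.kb.se/resource/auth/97224"),
    # NDL: concept --foaf:primaryTopic--> person
    ("http://id.ndl.go.jp/auth/ndlna/00000047", FOAF + "primaryTopic",
     "http://id.ndl.go.jp/auth/entity/00000047"),
    # BNF: concept --foaf:focus--> person
    ("http://data.bnf.fr/ark:/12148/cb11926763j", FOAF + "focus",
     "http://data.bnf.fr/ark:/12148/cb11926763j#foaf:Person"),
]


def person_for(uri, graph):
    """Return the person a concept points to, or the URI itself if there is no such link."""
    for s, p, o in graph:
        if s == uri and p in CONCEPT_TO_PERSON:
            return o
    return uri


for concept in ("http://libris.kb.se/resource/auth/97224#concept",
                "http://id.ndl.go.jp/auth/ndlna/00000047",
                "http://data.bnf.fr/ark:/12148/cb11926763j"):
    print(concept, "->", person_for(concept, triples))
```

Whichever resource is treated as primary, the result is a URI that can be used directly as the object of something like dcterms:creator.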
Biblioteca Nacional de España
Looking at the BNE authority data, what sticks out is the use of IFLA's FRBR and FRAD vocabularies. Obviously, BNE follows existing Linked Data practices and doesn't emphasize the canonical strings, as it types name authorities as persons. This becomes evident from the owl:sameAs links to GND, VIAF, Libris etc.
Some help for the people who don't know IFLA's FRBR and FRAD properties by heart:
- ifla-frbr:C1005 = Person
- ifla-frbr:P3039 = hasNameOfPerson
- ifla-frad:P4031 = hasOtherVariantNamePerson
- ifla-frbr:P3040 = hasDatesOfPerson
Here is a turtle snippet:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ifla-frbr: <http://iflastandards.info/ns/fr/frbr/frbrer/> .
@prefix ifla-frad: <http://iflastandards.info/ns/fr/frad/> .
@prefix locmads: <http://www.loc.gov/mads/rdf/v1#> .

<http://datos.bne.es/resource/XX933704>
    a ifla-frbr:C1005 ;
    ifla-frbr:P3039 "Tolkien, J. R. R." ;
    ifla-frad:P4031 "Tolkien", "Tolkien, John Ronald Reuel" ;
    ifla-frbr:P3040 "1892-1973" ;
    owl:sameAs <http://d-nb.info/gnd/118623222>,
        <http://dbpedia.org/resource/J._R._R._Tolkien>,
        <http://libris.kb.se/resource/auth/97224>,
        <http://viaf.org/viaf/95218067>,
        <http://www.idref.fr/027164918/id> .
As the overview shows, one currently cannot find any other institution that follows an authority approach similar to the Library of Congress's focus on canonical strings. I guess that if Bibframe wants to be picked up by a broad range of institutions internationally, it will have to adapt to the existing environment, which would mean rethinking Bibframe authorities by putting the primary focus on an approach that supports direct linking to persons as authorities. Today's mails on the Bibframe list by Kevin Ford of LoC (especially this one) give some hope that this may actually happen.
In Kevin Ford's and Ray Denenberg's first reply to Rob Sanderson's text on the Bibframe mailing list they write:
"BIBFRAME has explicitly stated that bf:Authority is not designed to compete with existing library authority efforts or replicate traditional library authorities. Furthermore, nowhere is it ever asserted that 'bf:Person != foaf:Person' and nowhere is it said that 'bf:Authority == madsrdf:Authority'. Nothing, either way, is actually declared presently. Rather, bf:Authority is an abstraction allowing the implementer to reference a traditional authority. It is these traditional authorities that include the strings in question.".
It is correct that there is no explicit and formal statement that a bf:Authority cannot be a foaf:Person and must be a madsrdf:Authority. But the use of a vocabulary is not only (and probably not even primarily) guided by its RDFS/OWL representation. Examples and tools can have a lot more power in directing the use of a vocabulary. The example Bibframe data from the Deutsche Nationalbibliothek shown above makes clear that even early implementers (at least one) obviously did understand Bibframe authorities as string-centric (otherwise they would have put a direct link to the GND into the data).
Besides the DNB Bibframe test data, there exist other examples suggesting Bibframe is primarily dealing with string authorities:
- Definitions and names currently have a lot of "controlled name" in them, e.g. http://bibframe.org/vocab/Person.html.
- Examples in the Bibframe vocabulary documentation, e.g. at http://bibframe.org/vocab/creator.html.
- Output of the Bibframe Editor. Here is some example output I get when I choose a LoC person authority to link to (converted to Turtle):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://example.org/46bf66bc-51e0-4c80-9444-79141c2e28cc>
    <http://bibframe.org/vocab/language> <http://id.loc.gov/vocabulary/languages/eng> ;
    <http://bibframe.org/vocab/title> "Ways of Worldmaking" ;
    <http://id.loc.gov/vocabulary/relators/aut> <http://example.org/6636d72e-9933-649a-b769-2ef41d241937> ;
    a <http://bibframe.org/vocab/Work> .

<http://example.org/6636d72e-9933-649a-b769-2ef41d241937>
    <http://bibframe.org/vocab/authoritySource> <http://id.loc.gov/authorities/names> ;
    <http://bibframe.org/vocab/authorizedAccessPoint> "Goodman, Nelson"@en ;
    <http://bibframe.org/vocab/hasAuthority> <http://id.loc.gov/authorities/names/n50037322> ;
    a <http://bibframe.org/vocab/Person> .
If Bibframe wants to make clear that it in fact does NOT require people to use blank nodes with string authorities, and that it supports and may even prefer direct interlinking of works and associated agents, then it should act accordingly and replace the current definitions, examples and output of the Bibframe editor. It would be even better if the Library of Congress changed its approach to modeling its own authorities and added URIs for the real-world objects (persons, corporate bodies etc.) to its authority data so that one could directly link to them.