Descriptive Metadata for
Audio-Oriented Digital Collections

Literature Review

Traditional Approaches to Cataloguing in Sound Archives

Metadata requirements for audio recordings overlay the challenges of description that sound archives have always faced. Understanding the landscape of descriptive practice in cataloguing for sound recordings helps contextualize the question of requirements by highlighting the diversity of disciplinary practice.

Standard bibliographic and archival rules for description (AACR2, RAD, DACS) include rules for describing recordings, and these are used by most general libraries and archives. Specialized rules for recordings have also been developed to supplement the main standards. AACR2 has proved a valuable touchstone, providing a set of rules that other standards have built upon. The Canadian Rules for Archival Description (RAD) and the U.S. equivalent, Describing Archives: A Content Standard (DACS), both relate to AACR2 in this way; RAD mimics the chapter structure of AACR2. Specific cataloguing rules for archival sound recordings have been issued by the Association for Recorded Sound Collections (ARSC, 1995) and the International Association of Sound and Audiovisual Archives (IASA, 1999), and both of these also incorporate references to the AACR2 chapter structure. The IASA rules, though, reflect a more international perspective, making reference to ISBD rules and drawing on other frameworks, including FRBR, RAD, the ARSC rules, and the SAA's Oral History Cataloging Manual, amongst others.

While the notion of 'sound archives' suggests a network of collections united by a common format, in fact the term amalgamates a diversity of disciplines. An IASA guide to sound archives (Lance, 1983) identifies several categories of sound archives, including: broadcasting, commercial recordings, dialect and linguistic collections, ethnomusicology, folklore, oral history and natural history. The extensive set of examples in Appendix B of the IASA cataloguing rules (IASA, 1999) also bears out this diversity. This unity in diversity has not always been recognized. For instance, the preface of the ARSC rules indicates that, when they were developed by the Associated Audio Archives, that organization was "at this time most interested in cataloging 78 rpm and cylinder recordings" (ARSC, 1995, p. ix). Public Archives Canada published a guide to procedures for sound archives in 1979 (Public Archives Canada [PAC], 1979), but it was primarily a description of how that institution managed its holdings. The guide distinguished PAC, in its function of collecting spoken word recordings such as debates, from the National Library, which collected published recordings, and the Museum of Man, which collected recordings that "document the folk music and folk culture of Canada" (PAC, 1979, p. 2). The guide predates RAD by eight years, and so does not make reference to AACR2, but the narrow focus it takes is clearly related to the institutional structure from which it originates. The boundaries between different kinds of sound archives and institutions remain difficult to pin down today.

Some institutions have contended with collections of mixed commercial and archival recordings from the beginning. As one of the first ethnomusicology archives in North America, the Indiana Archive for Traditional Music, founded in 1936 at Columbia University by George Herzog, followed the cataloguing model in use at the Berlin Phonogramm-Archiv, where Herzog had worked under Erich von Hornbostel (Archives of Traditional Music [ATM], 1975). The collection's focus on oral tradition did not discriminate between commercial recordings, field recordings and broadcast transcriptions, so the cataloguing method had to accommodate all three. The unit of description was the fonds, and specific rules governed the construction of a title. The catalogue was arranged by accession number, but supplemented with indexes to geographical areas, culture groups, subjects, collectors, performers, recording companies and the like. With the exception of subjects, all of these elements were included in the constructed title, with the first element being the geographical area. In this way, all three types of recordings could be captured by a single cataloguing system.
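
The cataloguing approach described above can be illustrated with a minimal Python sketch. It is not a reconstruction of the archive's actual rules: the field names, the " / " separator, and the order of title elements after the geographical area are assumptions made for illustration, and the sample data in the usage note is invented. The sketch shows a catalogue ordered by accession number, a constructed title that omits subjects, and supplementary indexes that point back to accession numbers.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One catalogue entry. All field names are hypothetical."""
    accession_number: str
    geographic_area: str
    culture_group: str = ""
    collector: str = ""
    performers: List[str] = field(default_factory=list)
    recording_company: str = ""  # left empty for field recordings
    subjects: List[str] = field(default_factory=list)  # indexed, not in title

    def constructed_title(self) -> str:
        # Geographical area comes first, as the source states. The order of
        # the remaining elements and the separator are assumptions.
        parts = [
            self.geographic_area,
            self.culture_group,
            self.collector,
            ", ".join(self.performers),
            self.recording_company,
        ]
        return " / ".join(p for p in parts if p)


def build_catalogue(entries):
    """Return (entries sorted by accession number, supplementary indexes)."""
    # Plain string sort; a real accession scheme may need a custom key.
    catalogue = sorted(entries, key=lambda e: e.accession_number)
    indexes = defaultdict(lambda: defaultdict(list))
    for e in catalogue:
        terms_by_index = {
            "geographic_area": [e.geographic_area],
            "culture_group": [e.culture_group],
            "subject": e.subjects,
            "collector": [e.collector],
            "performer": e.performers,
            "recording_company": [e.recording_company],
        }
        for index_name, terms in terms_by_index.items():
            for term in terms:
                if term:  # skip elements that do not apply to this entry
                    indexes[index_name][term].append(e.accession_number)
    return catalogue, indexes
```

With invented entries such as `Entry("54-001", "Canada", "Cree", "Roe, R.")`, calling `build_catalogue` returns the entries in accession order, and a lookup like `indexes["collector"]["Roe, R."]` yields the matching accession numbers. Because a commercial disc and a field recording differ only in which optional elements are filled in, one record shape serves every type of recording.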

Digital Sound Archives

If traditional descriptive practice in archival audio collections was usually determined by individual approaches, based on local needs, the transition to the digital environment has brought new challenges on top of the old. Bradley observes that "much of the effort devoted to metadata in the heritage sector has focused on descriptive metadata as an offshoot of traditional cataloguing" (Bradley, 2009, p. 21), and warns against approaching metadata as a traditional cataloguing problem that can be solved with a greater level of detail. Sound archives need to address the issue of metadata, however, because digital archives are only going to grow in importance for preservation and access in the coming years. A survey of the state of audio collections in academic libraries (Smith, Allen, & Allen, 2004) found that unique and fragile collections are at risk and critically need both preservation and access through cataloguing. Because there are no new physical recording formats being developed for the archival community, "there is little choice for sound preservation except digital storage approaches" (Bradley, 2009, p. 4), so institutions are looking to digital archives for both preservation and access.


Metadata supports essential processes in the digital environment beyond mere description. A popular definition calls metadata "data associated with either an information system or an information object for purposes of description, administration, legal requirements, technical functionality, use and usage, and preservation" (Baca, 1998). Relationship metadata, provenance and content ratings have also been cited as relevant categories (Lagoze, Lynch, & Daniel, 1996). Metadata implementation is different from cataloguing, in that the abundance of schemas means choices must be made even before description can begin. In making choices, Zeng and Qin recommend considering "the nature of collection objects; anticipated user needs; and constraints upon metadata creation, implementation and quality control" (Zeng & Qin, 2008, p. 88). Zeng and Qin also remind metadata designers to consider the needs of both end-users and system users, as metadata must act "both as inventory and user access tool" (Zeng & Qin, 2008, p. 93). A thorough understanding of these factors supports schema selection and the identification of the desired elements within the chosen schema.

Multimedia Metadata

Choosing a metadata schema for multimedia is even more challenging, because "the multimedia domain is far too wide for any single standard" (Smith & Schirling, 2006, p. 86). Even amongst music schemas (e.g. ID3, MPEG-7), the optimal choice can vary greatly depending on intended function (Corthaut, Govaerts, Verbert, & Duval, 2008). The CUIDADO project found that, for managing large music collections, a combination of automated metadata and bottom-up descriptions was useful (Vinet, Herrera, & Pachet, 2002). Another approach, pioneered by the well-regarded Variations2 Indiana University Digital Music Library, has been to incorporate a work-based approach, following the FRBR model, as a way to resolve ambiguities that arise from music content specifically (Notess & Dunn, 2004).

Nevertheless, the need to harmonize approaches across the community of practitioners has been recognized (Bradley, 2009; Lai et al., 2007; Smith & Schirling, 2006). Lai et al. found that the heterogeneity of schemas and systems in use amongst three sound repositories hindered the operation of a federated search protocol. Bradley argues for versatility and extensibility, amongst other principles, as important for sound-related metadata, so that schemas can be combined as needed in application profiles and customized for particular needs while remaining interoperable. One emerging area of interoperability is the use of RDF for semantic web queries. A combination of conventional metadata enhanced with contextual, user-supplied information has been considered as a possible path forward, especially for music-related resources (Moutselakis & Karakos, 2009).
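
The idea of an application profile that combines elements from more than one schema while remaining interoperable can be sketched in a few lines of Python. This is an illustrative sketch only, not a profile drawn from any of the cited works. The Dublin Core element namespace is real, but the local namespace `http://example.org/sound-archive/terms/`, the local field names, and the choice of mappings are all assumptions.

```python
# Dublin Core Metadata Element Set namespace (real).
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical local namespace for audio-specific elements.
LOCAL = "http://example.org/sound-archive/terms/"

# Hypothetical application profile: each local field name is mapped to an
# element drawn from one of the combined schemas.
PROFILE = {
    "title": (DC, "title"),
    "collector": (DC, "contributor"),
    "recorded": (DC, "date"),
    "carrier": (LOCAL, "carrier"),
    "culture_group": (LOCAL, "cultureGroup"),
}


def apply_profile(record, profile=PROFILE):
    """Re-key a local record with fully qualified element names."""
    unknown = set(record) - set(profile)
    if unknown:
        # Reject fields the profile does not define rather than guess.
        raise KeyError(f"fields not in profile: {sorted(unknown)}")
    return {
        profile[name][0] + profile[name][1]: value
        for name, value in record.items()
    }


def shared_view(qualified, shared_namespace=DC):
    """Keep only the elements that other repositories can be expected to share."""
    return {
        key: value
        for key, value in qualified.items()
        if key.startswith(shared_namespace)
    }
```

A record such as `{"title": "Example field recording", "carrier": "wax cylinder"}` keeps its audio-specific `carrier` element in the full output of `apply_profile`, while `shared_view` exposes only the Dublin Core portion. That split is one way to serve local needs without blocking cross-repository searching.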

The need to address music and sound retrieval in the archival and folklore areas as a multi-faceted problem has recently been recognized. The EASAIER project (Enabling Access to Sound Archives Through Integration, Enrichment and Retrieval) has explored ways to use multiple techniques, each suited to the retrieval of music recordings, speech recordings or 'cross-media' materials (like printed scores) (Damnjanovic, Barry, & Reiss, 2008). The heterogeneity of folklore collections that include sound recordings has been identified as a challenge to using pre-existing metadata schemas. Development of application profiles that integrate multiple schemas has been attempted as a possible solution (Lourdi & Papatheodorou, 2004).

This website was created by John Huck in March, 2010
School of Library and Information Studies
University of Alberta