Descriptive Metadata for
Audio-Oriented Digital Collections



The research question called for an empirical method to examine current practice in the field of online audio-oriented collections. The process included the following steps:

  • identifying candidate collections for the study
  • establishing selection criteria
  • selecting 18 collections for inclusion in the study
  • collecting sample records and documentation from each collection
  • analyzing the data in the collection information and element sets

Identifying Candidate Collections

In this study the selection criteria were not established first, because it was not clear at the outset what kinds of collections were actually online. It would have been difficult to establish an a priori definition of online audio-oriented collections that included all of the examples ultimately selected. The discovery of resources was, in fact, a significant aspect of the project.

The process of identifying candidates began with the examination of member lists of the following associations:

Next, online resource lists were explored, including the following:

Selection Criteria

The discovery process resulted in a list of over thirty candidate collections, which ranged from national archives to digital libraries on specialized topics and everything in between. After the characteristics of these collections were evaluated, criteria were developed to select sample collections that would be easy to compare with each other. The selected collections all had the following characteristics:

  • Content focus on recorded sound
  • Search interface
  • Availability of a significant number of recordings for streaming or downloading, either full recordings or short excerpts
  • Publicly accessible and free to use
  • Relatively stable and permanent set of content
  • Operating within the law

The criteria excluded entities like torrent sites for mp3s, commercial services like iTunes or Naxos Music Online, sites with changing content like YouTube or online sellers like, online catalogues without sound files like Indiana University’s Archives of Traditional Music, password protected sites like the Variations3 digital library, and websites that provided only a limited number of sample recordings. Where full recordings were not available, the extent of available samples was considered when deciding whether to include a collection. In fact, the provision of excerpted content was not uncommon, and usually reflected a practical approach to the challenge of offering access to copyright material. The remaining collections shared a common set of characteristics, but remained quite diverse in content. Further purposeful sampling was conducted to arrive at a manageable sample size that maintained a diversity of content. Eighteen online collections were included in the final sample selection.

Note that the collections will be referred to throughout this study with acronyms, which may be found on the list of selected collections.

Collecting Data

Data collection took place between May and November 2009, and consisted of two parts. First, basic information was gathered about each collection, including information about its host or sponsor, the types of content it included, significant collection features, and any available documentation. Next, sample records were taken from each collection. Wherever possible, a variety of search approaches were employed to find the sample records. For most collections, five records were collected. In three cases, more than five records were collected because of the variety of types of recordings offered in the collections. These collections were ASR (8 records), CHARM (8 records) and DLA (6 records). The process for capturing each record involved displaying the record, copying the text on the page to a plaintext file and taking one or more screenshots to aid in later interpretation.

Analyzing Data

Data was analyzed in three phases. The first phase compiled into tables the information about collection hosts (Table 1), content types (Table 2), and collection features (Table 3). The second phase analyzed the metadata elements on a collection level, which allowed rough comparisons to be made between collections, in terms of the level of detail in the metadata and the consistency of the cataloguing. The results of this process are recorded in Table 4. The third phase analyzed the metadata elements on an aggregate level to look for commonalities and general trends within the sample set of collections. The results of this process are recorded in Table 6 and summarized in Table 5. The findings recorded in these tables will be discussed in detail in the following section, but a few words should be said about the procedures used in the second and third phases of analysis.

In phase two, elements in each sample record were counted, and the mean number of elements per record for each collection was calculated. The sets of elements and their values from all the sample records for a given collection were then compiled into a master list of all elements observed in that collection. Duplicates were eliminated, and each element was marked with a number to indicate the number of sample records it appeared in. Special note was made of elements that occurred in more than 75% of the samples for a given collection.
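
The phase-two tallies can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: each sample record is assumed to be a dict mapping element names to values, and the function name summarize_collection, the element names and the sample values are all hypothetical. The threshold is applied as "more than 75%", following the wording above.

```python
from collections import Counter


def summarize_collection(records, threshold=0.75):
    """Tally metadata elements across the sample records of one collection.

    `records` is a list of dicts mapping element names to values
    (an assumed representation, not the study's actual data format).
    """
    # Mean number of elements per record.
    mean_elements = sum(len(r) for r in records) / len(records)

    # Master list: each distinct element with the number of records it appears in.
    occurrences = Counter(name for r in records for name in r)

    # Elements occurring in more than `threshold` of the sample records.
    frequent = sorted(
        name for name, n in occurrences.items() if n / len(records) > threshold
    )
    return mean_elements, occurrences, frequent


# Hypothetical sample of five records from a single collection.
sample = [
    {"Title": "a", "Performer": "b", "Date": "c"},
    {"Title": "d", "Performer": "e"},
    {"Title": "f", "Performer": "g", "Date": "h"},
    {"Title": "i", "Performer": "j", "Notes": "k"},
    {"Title": "l", "Date": "m"},
]
mean_elements, occurrences, frequent = summarize_collection(sample)
```

With five sample records, an element must appear in at least four of them to pass the threshold, so in this hypothetical sample only Title and Performer would be flagged.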

In phase three, all of the elements that had occurred in at least 75% of the samples for a given collection (usually meaning four or five occurrences) were compiled into a master list of 223 elements. A reference to the collection each element came from was maintained. The list was then manually sorted into 17 groupings. Using a kind of affinity grouping, the groupings emerged from the data set, and then names were assigned to them. The number of elements included in each grouping and the number of collections represented in each grouping were recorded. Finally, the groupings themselves were divided among seven metadata categories identified by Lagoze, Lynch, & Daniel (1996), becoming de facto sub-categories of those categories. These authors did not intend for their categories to be considered definitive and complete, since they were merely illustrative rather than exclusive. However, the categories have proved useful over time and have been used in this way by others (Greenberg, 2005).
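
The bookkeeping for the phase-three groupings might look like the following sketch. The sorting into groupings was done by hand in the study, so the code only counts the results of that manual step. The tuple representation, the function name summarize_groupings and the element names are assumptions made for illustration; the collection acronyms and grouping names are taken from this study, but their pairing here is hypothetical.

```python
from collections import defaultdict


def summarize_groupings(assignments):
    """Count elements and distinct collections for each grouping.

    `assignments` is a list of (collection, element, grouping) tuples,
    where each grouping has already been assigned manually.
    """
    element_counts = defaultdict(int)
    collections_seen = defaultdict(set)
    for collection, element, grouping in assignments:
        element_counts[grouping] += 1
        collections_seen[grouping].add(collection)
    # Map each grouping to (number of elements, number of collections).
    return {g: (element_counts[g], len(collections_seen[g])) for g in element_counts}


# Hypothetical entries from the master list of frequently occurring elements.
assignments = [
    ("ASR", "Performer", "Contributor/Author"),
    ("ASR", "Composer", "Contributor/Author"),
    ("CHARM", "Matrix number", "Identification Numbers"),
    ("CHARM", "Artist", "Contributor/Author"),
    ("DLA", "Call number", "Identification Numbers"),
]
summary = summarize_groupings(assignments)
```

Keeping the collection reference alongside each element is what allows both counts to be derived from the same list.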

Considerations in Determining Affinity Groupings (Phase Three)

Greenberg (2005) points out that some kinds of metadata serve dual purposes, and this was found to be true when considering identification numbers and format descriptions. It was decided that the grouping "Identification Numbers" would include retrieval numbers, shelf numbers, and the like, but also issue numbers and matrix numbers. The logic behind combining these numbers into a single category was that, while issue and matrix numbers are important for describing published recordings, equivalent numbers for archival recordings, such as reel numbers and accession numbers, are less relevant for description and play a role as retrieval numbers. Ultimately, though, the value of these numbers is that they act as unique identifiers for the recordings, regardless of whether for discographic or retrieval purposes, and it seemed better to create a single grouping for them on the basis of this commonality, and to consider them as a form of administrative metadata. This does go against the spirit of the ARSC rules, which identify label and issue or matrix numbers as vital descriptive elements, but those rules were primarily created to catalogue commercial recordings, so it seems reasonable to make some allowances.

Elements that dealt with the formats of the original sound carriers were combined with elements indicating file components and assigned to the structural metadata category. While this category usually refers to the sequence and relations of segments of electronic files, it was interpreted more freely in this case. The use of the seven categories of Lagoze, Lynch, & Daniel (1996) was, in any event, an adaptation of sorts, since this study was only concerned with the publicly displayed metadata, which tended to be mostly descriptive in nature.

Finally, the grouping of Contributor/Author was used to group all references to people who played a role in the creation of the audio file, without distinguishing between the major and minor contributors. This included performers, recordists, composers, lyricists, transcribers, and uploaders. Some writers have pointed to a certain tension between traditional notions of authorship that AACR2 has supported, especially through its determination of main and added entries, and the distributed authorship that exists amongst a whole team of originators with digital content (Cwiok, 2005). This profusion of roles is common in the production of sound recordings as well. However, all of these contributors could be accounted for in an AACR2 description, in one way or another. So for the purposes of this study, the use of a general grouping for all authors and contributors was not seen as problematic.

Other Challenges

Some challenges were encountered in working with the metadata elements. In some cases, a decision had to be made as to whether information about an object was metadata or whether it was dynamically generated contextual information, of the 'viewers who looked at this also looked at' variety. Usually, this kind of information was determined not to be metadata, and so was not considered an element for the purposes of analysis. The presence of these kinds of features was recorded instead as a collection feature.

Another challenge resulted from the fact that some collections, such as DEKKMMA (Digitalisatie van het Etnomusicologisch Klankarchief van het Koninklijk Museum voor Midden-Afrika / Digitization of the Ethnomusicological Sound Archive of the Royal Museum for Central Africa), display all fields all the time, inserting null values as needed. This was taken as a sign that a database rather than digital library technology was being used. The empty fields were taken as an indication of a desired level of description, but for the purposes of this study, the null values were considered non-elements when calculating the mean number of elements. Likewise, repeating fields, such as subject fields, were counted only once, since in other cases, a string of subject headings might be contained within a single element field. For archival-style hierarchical records, where information about an item was contained in a linked pair of parent/child records (usually a fonds/collection level record and an item level record), elements from both records were combined into a single record, and duplicate fields were eliminated.
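
The counting rules described above can be illustrated with a short Python sketch. It rests on several assumptions: records are represented as lists of (element name, value) pairs, a "duplicate" field is taken to mean one with the same element name in the parent and the item record, and the function name normalize_record and the sample values are hypothetical. It is not a description of how the records were actually processed.

```python
def normalize_record(item_fields, parent_fields=()):
    """Return the set of element names counted for one sample record.

    Both arguments are sequences of (element_name, value) pairs, an assumed
    representation of the text copied from a displayed record.
    """
    counted = set()
    # Parent (fonds/collection level) and item level fields are combined.
    for name, value in list(parent_fields) + list(item_fields):
        # Empty or missing values are treated as non-elements.
        if value is None or not str(value).strip():
            continue
        # A set counts repeating and duplicated fields only once.
        counted.add(name)
    return counted


# Hypothetical item level record with a repeating field and two empty fields.
item = [
    ("Title", "Song"),
    ("Subject", "Folk"),
    ("Subject", "Dance"),
    ("Notes", ""),
    ("Date", None),
]
# Hypothetical parent record linked to the item.
parent = [("Title", "Fonds X"), ("Repository", "Archive")]

elements = normalize_record(item, parent)
```

The length of the resulting set is the figure that would feed into the mean number of elements per record.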

This website was created by John Huck in March, 2010
School of Library and Information Studies
University of Alberta