30 October 2010

On bringing Europe’s cultural heritage online

This text is an expanded version of a position paper and an oral statement to the Comité des Sages at their public hearing on 27 November 2010.


The younger generations listen to music all the time, yet, as Jill Cousins pointed out in her statement to the Comité des Sages, only 2% of Europeana’s content is music. This has got to change.

I live in a black hole: an archive of unpublished recordings of live performances of contemporary art music. Such an archive faces specific barriers in identifying and obtaining the rights to digitize these documents and make them available on the Web on our own sites, and all the more so in communicating them to Europeana.

I also live in a no-man’s-land squeezed out from the public-private dichotomy, namely that of the not-for-profit private organizations: they don’t benefit from full government financial support1 on the one hand, and can’t develop commercial or marketing strategies on the other hand. This limits their capacity to raise adequate funds to digitize their collections.

One of my tasks at IRCAM has been to design and operate the Portal of contemporary music resources in France, a project which has been partially financed by the French Ministry of Culture. It is a partnership of more than thirty providers (archives, libraries, museums, conservatories, music ensembles, centers for contemporary music…), all but a couple of which are not-for-profit organizations. All hold interesting and sometimes unique music archives2 of contemporary works, but most have no librarians, archivists or specialists in computing and digitization. Most are small – in size and in budget – and appreciate the added visibility such a portal gives their heritage; they don’t see it as a threat.

This Portal aggregates all the information about their holdings and the events they organize, whether or not that information points to actually digitized documents. This makes it possible to locate relevant resources, such as books, periodicals, music scores and archival documents, which might never be digitized in the foreseeable future. We believe that cultural heritage is not only digital cultural heritage, echoing the concern voiced earlier today by Lucie Verachten from the Belgian Archives, and so “bringing it online” must also include bringing online metadata with no associated data.

This is not what Europeana does (unfortunately, in my opinion). As a consequence, this Portal, which is harvested by Europeana, provides it only with those records that point to digital resources made available by our partners. Yet we can’t let Europeana access our digital audio items, or even just provide thumbnails for them.
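To make the distinction concrete, here is a minimal sketch in Python of the selection just described. The Record type, its field names and the sample entries are hypothetical illustrations, not the Portal’s or Europeana’s actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One catalogue entry aggregated by a portal (hypothetical model)."""
    title: str
    provider: str
    # None means "metadata only": the item exists but has not been digitized.
    digital_url: Optional[str] = None


def records_for_harvest(records: List[Record]) -> List[Record]:
    """Keep only the records that point to a digital resource."""
    return [record for record in records if record.digital_url]


# Invented sample data: one undigitized score, one digitized recording.
catalogue = [
    Record("Example score", "Provider A"),
    Record("Example concert recording", "Provider B",
           "https://example.org/recordings/1"),
]

# The portal can list both entries; a harvester that only accepts
# digital objects receives just the second one.
print(len(catalogue), len(records_for_harvest(catalogue)))
```

The point of the sketch is only that the first entry remains findable in the Portal while being invisible to a harvester that requires a digital object.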

How come? The exceptions to the 2001 Directive on the harmonisation of certain aspects of copyright and related rights in the information society3 and the Commission’s recommendation of 24 August 2006 on the digitisation and online accessibility of cultural material and digital preservation4 do not go far enough in addressing some aspects more specifically related to networked digital libraries, to archived musical material and to the kind of cultural institutions that can benefit from exceptions.

I’ll briefly address three implications of this problem, which creates barriers to bringing Europe’s cultural heritage online.

1. Copying for the purpose of indexing and semantic enrichment

Unlike other major digital libraries such as Gallica, HathiTrust, Project Gutenberg, Scribd or Google Books, Europeana cannot provide full-text search within the many digitized books it references, only within its metadata.

Why this limitation? Europeana, as a networked library, does not hold these documents on its own computers, whereas those digital libraries do. The current laws and their exceptions do not automatically entitle Europeana to copy them from their holders, its content providers and aggregators, although these documents are already made available either publicly on their web sites or on dedicated terminals on their premises.

This is a very serious disadvantage, as it means that Europeana is, in effect, no more than a large catalogue (of digital objects, to be sure), not a full-fledged digital library. Were it able to access the contents, it could not only provide that important missing search feature, but also semantically enrich the collection in many innovative ways arising from the analysis of the contents rather than just of the metadata.

One could envision a more generous exception that would allow Europeana and similar cultural digital libraries – in public institutions and in not-for-profit private organizations (which are not mentioned in the exception, see note 5) – to make a temporary copy of the documents5, even copyright-protected ones, so as to allow for their full indexing and meaningful semantic enrichment. As a result, a search for a phrase (say) would return the list of books in which that phrase occurs, and would let the public either access these books on their providers’ sites, if they are publicly accessible (partially or totally), or just identify where in the books the phrase can be found (as Google Books does for books under copyright).

Such an exception should also cover non-textual material, i.e., sound and audiovisual cultural material (contrary to the opinion of one of the earlier speakers today). Many information retrieval techniques are already applicable to musical (and other audio) content.
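As a rough illustration of what such a temporary copy would make possible, here is a minimal sketch in Python of a positional index with phrase search. It is not how Europeana or any of the libraries named above actually works: the fetch function, the book identifiers and the sample texts are all hypothetical, and a real system would need far more (language handling, ranking, scale).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(fetch_text, book_ids):
    """Index each book from a temporary copy, keeping only word positions.

    fetch_text is a hypothetical callable that returns a book's full text
    from its provider; the text itself is not stored in the index.
    """
    index = defaultdict(lambda: defaultdict(list))
    for book_id in book_ids:
        temporary_copy = fetch_text(book_id)
        for position, word in enumerate(tokenize(temporary_copy)):
            index[word][book_id].append(position)
        del temporary_copy  # only the index is retained
    return index


def search_phrase(index, phrase):
    """Return {book_id: [word offsets]} for books containing the phrase."""
    words = tokenize(phrase)
    if not words:
        return {}
    results = {}
    for book_id, starts in index.get(words[0], {}).items():
        following = [set(index.get(word, {}).get(book_id, ()))
                     for word in words[1:]]
        matches = [start for start in starts
                   if all(start + offset + 1 in positions
                          for offset, positions in enumerate(following))]
        if matches:
            results[book_id] = matches
    return results


# Invented sample texts standing in for documents held by providers.
texts = {
    "book-a": "Music is the art of sound in time.",
    "book-b": "The art of the fugue; sound archives.",
}
index = build_index(texts.get, texts)
print(search_phrase(index, "art of sound"))  # {'book-a': [3]}
```

The result says which books contain the phrase and at which word offset, which is enough either to send the reader to the provider’s site or simply to show where in the book the phrase occurs.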

2. Rights clearance for musical sound archives

Several speakers, and in particular Renate Dörr from the ZDF, have already addressed some of the especially difficult aspects of clearing the rights needed to digitize and make available musical sound recordings. In her subsequent statement, the representative of the AEPO, Guenaëlle Collet, asserted that this was always feasible, and that the rights collecting societies had the knowledge and resources to help do so. This is probably true (to some extent at least) for commercial recordings, yet it is far from true for unpublished, archived contents.

In order to digitize their own sound archives and make them available on “dedicated terminals on their premises”, our content providers have to go through a sometimes endless, even impossible, process of identifying all rights holders. This is all the more difficult for large archives, and when the holders – small cultural organizations such as musical ensembles – do not have, and never had, archivists or librarians who would have thought of keeping exceedingly detailed records and known how to do so. They don’t have a publisher to turn to, and have to do the work themselves.

Consider, e.g., a recording of a live concert they performed or produced 30 years ago: they would have to identify not only the works and the composers (that’s usually not a problem), but also all the soloists and musicians who played in the orchestra or sang in the choir on that date, the publishers of the scores that were used then (sometimes impossible to determine as well: some premières might have been played from manuscript scores before their subsequent publication) and whether the published score had been bought or rented for that performance…

This in effect prevents them from fully clearing the rights, and is one of the two principal causes of the scarcity of digitized sound heritage. The other is the high cost of digitization, which small institutions can’t afford, all the more so when they are not-for-profit private ones (whence the pressing need for increased public support, as Ben White from the British Library already mentioned).

My recommendation would be to address this barrier by, e.g., extending the principle of, and provisions for, orphan works to sound and audiovisual material6 (to echo Ben White again), or, alternatively, by allowing collection-level rights management (rather than on an individual-archive basis) or even global licences in such cases.

3. Music thumbnails

When one launches a search in Europeana, the results pages show, for each item found, a thumbnail with a descriptive text caption. This is at least true of text (typically a small image of a page), of still images (a reduced view of the whole or of a part) and of video (a frame). But in the case of music, there is no general method or algorithm for producing a visual display, all the more so for a sound archive without any accompanying printed documentation7.

Yet it is technically feasible to produce a sound excerpt automatically, and to embed a player (displayed as a play button) of any size – in this case the size of the thumbnails used by Europeana – in the results pages. By clicking on this player, the user would be able to listen to the excerpt. This could significantly enhance the search and discovery experience in Europeana by making digital sound recordings “equal citizens” alongside the other Europeana assets.
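To show how little is needed for the first step, here is a minimal sketch in Python that cuts a fixed-length excerpt out of a recording using only the standard wave module. It assumes an uncompressed WAV source; the function name, the 30-second offset and the 15-second duration are arbitrary choices made for this illustration, not values used by our Portal or by Europeana.

```python
import wave


def write_excerpt(source, target, start_seconds=30.0, duration_seconds=15.0):
    """Copy a short excerpt of a WAV recording into a new WAV file.

    source and target may be file names or open binary file objects.
    Returns the actual length of the excerpt in seconds, which is shorter
    than requested when the recording ends early.
    """
    with wave.open(source, "rb") as reader:
        rate = reader.getframerate()
        total = reader.getnframes()
        # Clamp the excerpt so it never runs past the end of the recording.
        first = min(int(start_seconds * rate), total)
        count = min(int(duration_seconds * rate), total - first)
        reader.setpos(first)
        frames = reader.readframes(count)
        with wave.open(target, "wb") as writer:
            # Keep the audio format of the original recording.
            writer.setnchannels(reader.getnchannels())
            writer.setsampwidth(reader.getsampwidth())
            writer.setframerate(rate)
            writer.writeframes(frames)
    return count / rate
```

With hypothetical file names, a call such as write_excerpt("concert.wav", "concert_excerpt.wav") would produce the file that a small embedded player could then offer behind its play button. Choosing a musically meaningful passage, rather than a fixed offset, is a separate problem that this sketch does not address.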

Yet while it is possible to provide free quotations or thumbnails of text (and arguably of still images), this is definitely not the case for recorded music (nor for printed music scores). In order to allow the users of our Portal to listen to such excerpts and get an idea of what the music sounds like (and looks like, for music scores), our providers have to pay an extra fee8 to the various rights collecting societies (authors, performers…), which they do.

But this does not allow them to use the excerpt on any web site other than their library’s and the Portal: even to use an excerpt on another of their own web sites, they would have to pay an additional fee. As a result, they can’t allow Europeana to include the excerpt in either the results page or the record page. So in addition to the scarcity of musical content referenced by Europeana (see point 2 above), what is there is not adequately “displayed”.

I would recommend addressing this barrier by, e.g., extending the public interest exception for cultural non-commercial purposes to allow free, limited quotations of musical (and audiovisual) works, as is the case for text (or at least to allow the free non-commercial reuse of such quotations once rights have been cleared for their first use). This would not be detrimental in any way to the composers and the performers. On the contrary, such excerpts act as promotional material, as the many messages to the Portal show: people who have listened to them ask where they can listen to the full work, where they can buy a commercial recording thereof or where they can buy or rent the score in order to perform the work.

Thank you.

1 The French Plan national de numérisation grants 100% of the budget of digitization projects to public institutions, but only up to 50% to private not-for-profit organizations.

2 Sound recordings, but also music scores, program notes, etc.

3 Related to preservation purposes, for the benefit of people with a disability and for research and private study on the premises.

4 “To improve conditions for digitisation of, and online accessibility to, cultural material by” addressing the issue of orphan works, and “make provision in their legislation so as to allow multiple copying and migration of digital cultural material by public institutions for preservation purposes”.

5 I had suggested such a mechanism in a 2005 letter to the then President of the French National Library, Jean-Noël Jeanneney (available here), in which I outlined how such a networked digital library could be designed.

6 Fees would still be paid to rights collecting societies, and the monies would be put in escrow.

7 In the case of a commercial CD, for example, the cover of the booklet may be displayed.

8 I.e., in addition to what they have already paid in order to digitize the sound archive and provide full access to it on dedicated terminals on site only.

