CIHA London 2000.
Thirtieth International Congress of the History of Art
Art History for the Millennium: Time.
Section 23
Digital Art History Time
London, 3-8 September 2000

Murtha Baca <mbaca@getty.edu>, Head, Standards Program, Getty Research Institute.

"Seek, and Ye Shall Find": Working Toward Integrated Access to Online Art History Resources.

Murtha Baca (left) using backup transparencies while Will Vaughan, Vice-President of Section 23, tries to repair the computer's connection to the projection unit.
 © each author has full responsibility in owning copyright on the texts and on the images they publish on this Web site

Many of us first started doing work for the Web as if we were blindfolded. "Digital art history time" is serendipitous and highly idiosyncratic, not to say schizophrenic. Databases and Web sites have tended to spring up in a haphazard way, even within a single organization such as a university or the Getty.

Thus many organizations now find themselves in the situation of having "sites within a site," each with a different design and even different navigational schemes. In addition, though in theory network technology makes it possible to access diverse online resources as if they were all "in one place," the reality is that it is often technology itself that has created barriers to integrated access. Many proprietary databases are impenetrable by Web search engines. Many organizations have multiple databases on multiple hardware and software platforms. This is the case at the Getty as well, at least at the moment.

I believe that there are five key ingredients for any digital art history project to be successful:

1. intellectually "curated" content, with end-users always in mind from the beginning

2. judicious use of information technology

3. metadata and mapping

4. vocabulary tools

5. interdisciplinary teamwork (as already mentioned by Hans-Jörg Heusser in his paper)

The primary goal of the work that we have begun at the Getty under the aegis of a project called "getty.edu" is to provide integrated access to all Getty online resources, regardless of where they may reside physically, technically, or institutionally within the Getty institutes and programs. In addition, the Getty program of which I am a part, the Getty Research Institute, is working to create integrated intellectual access to its many and varied online resources, which are presently not all searchable from a single point - i.e., they are not yet technically nor virtually unified and integrated. We recently began the initial data analysis and mapping to move all of the Research Library online resources to a single collections management system for our library catalogue, digital resources, conservation information, and digital assets; the process promises to be a long and arduous one, but well worth the effort. Because of the physical unavailability of many of the Research Institute materials, especially our large archival collections and rare books, making digital surrogates of these items online is particularly important for us; to again echo a point made by Hans-Jörg Heusser, technology should help us to better fulfill our educational mission and to reach a much larger audience.

The Getty's current Web presence strongly reflects the diverse ways in which our various programs have created their Web sites to date: the Getty Museum, the Getty Conservation Institute, and the Getty Research Institute, as well as the Getty Grant Program, ArtsEdNet, and all of the pages with visitor information. As of this writing (January 2001), all of our Web pages are undergoing a process of re-design and unification. The first phase of the re-designed Getty sites should go live during the early months of 2001 (note: The new Getty Web site went live on February 20th, 2001.). In this paper, I will focus on the Web resources of the Getty Museum and the Getty Research Institute.

The Web pages for the Getty Museum collections are generated from the same database that is used to create Art Access, the interactive kiosk system that is available to museum visitors. The Museum was able to automatically generate Keyword and Description metatags for these pages, to be indexed and displayed by our site-wide search engine. The fact that metadata standards and controlled vocabularies are used behind the scenes in the museum's collection management database helps to make this possible.
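As a rough illustration of how Description and Keyword metatags might be generated from a collection record, here is a minimal Python sketch. The field names (`title`, `artist`, `date`, `object_type`, `subjects`), the function `build_metatags`, and the sample record are all hypothetical; they are not taken from the Getty Museum's actual database or page templates.

```python
from html import escape


def build_metatags(record: dict) -> str:
    """Return Description and Keyword meta tags for one object record.

    All field names are assumptions made for this sketch.
    """
    # Keywords come from controlled fields, so every page is tagged consistently.
    keywords = [record["artist"], record["object_type"], *record.get("subjects", [])]
    description = f'{record["title"]}, {record["artist"]}, {record["date"]}'
    return "\n".join([
        f'<meta name="description" content="{escape(description)}">',
        f'<meta name="keywords" content="{escape(", ".join(keywords))}">',
    ])


# Placeholder record, invented purely for illustration.
sample_record = {
    "title": "Example Title",
    "artist": "Example Artist",
    "date": "about 1500",
    "object_type": "painting",
    "subjects": ["portrait"],
}

print(build_metatags(sample_record))
```

Because the tag values are drawn from fields that are already governed by metadata standards and controlled vocabularies, no page-by-page manual tagging would be needed in a setup like this.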

The data in the Getty Museum collection management system is based on metadata standards such as Categories for the Description of Works of Art and controlled vocabularies like the Art & Architecture Thesaurus (AAT), Union List of Artist Names (ULAN), and Thesaurus of Geographic Names (TGN). The system also includes indices by subject, artist, and object type, which provide broad categories for browsing by users of both our interactive museum kiosk system and our Web site.

The Research Library of the Getty Research Institute has large and varied collections. Records for our 800,000-volume library are available via an online catalogue called IRIS, which is currently in a proprietary database system that is not searchable by the search engine that is being used on the main Getty "gateway." Records for our Photo Study Collection, which consists of some two million study photographs, are in yet a different proprietary database system that is also impenetrable by the main Getty search engine, as are the Provenance Index databases. Not only is this typical of the way that many art-historical resources are built; it also points out the rather obvious fact that Web technology is not the answer to everything.

The Collections Integrated Catalogue or CIC is the central gathering place for all Research Library online resources that are available to on-site users or users with a Getty Internet address. This veritable treasure trove includes links to our library catalog, research databases, EAD finding aids, digital library resources, vocabulary databases, and many other research resources, including CD-ROMs and subscription databases such as the Bibliography of the History of Art, the Avery Index to Architectural Periodicals, the Encyclopaedia Britannica, the Grove Dictionary of Art, and many others.

Our Special Collections and Visual Resources Finding Aids can be searched by the main search engine, as they are in HTML format (converted from their original SGML form), but the Encoded Archival Description (EAD) format creates significant problems with regard to usability and interpretability; in addition, the SGML-to-HTML conversion fragments the finding aids so that they are largely indecipherable in conventional search results lists. Also, explicit links from the finding aids to corresponding records in the library catalogue and vice versa do not yet exist except in a few cases. As I mentioned earlier, all of these assets will be migrated to our new collections management system over the next several years, and we hope to solve many of these problems.

At the Getty Research Institute we have begun to produce online resources based on the collections of the Research Library and the activities and programs of the Research Institute as a whole. Research Institute digital projects include Web resources, often based on exhibitions, focusing on a particular artist, subject, collection, or group of collections; they include images, contextual and historical information, and links to related resources both inside and outside the Getty. All digital resources are created under the intellectual direction of one or more of the Getty Research Library's collections curators or scholars; project management and workflow coordination are provided by the Getty Standards Program.

In other digital projects, individual items are scanned, cataloged, and made accessible via the Getty Research Library's databases and online catalogs; the Study Images of Tapestries in the Photo Study Collection database is one such resource. The Photo Study Collection recently began to add digitized images to its database. Some 9000 photographs of medieval and early modern tapestries from American and European collections were digitized at a high resolution. These are now being catalogued by a tapestries specialist, and will be incrementally uploaded into the Photo Study Collection database. The images - many of which are historic in nature - constitute one of the few comprehensive visual resources for the study of tapestries, and as such it is the Getty Research Library's intention to promote the study of tapestries by disseminating them to the widest possible audience. Of course this type of project, if it is to be truly useful to researchers, is very labor-intensive; unless the digitized images are catalogued by a skilled cataloguer with subject expertise, they will be of limited use for research purposes. The cataloguing records in the Photo Study Collection database are complex records, with long descriptive "titles."

Cataloguing these images has brought unexpected benefits, such as creating links to related GRI materials; for example, we have been able to create indices for the French & Company business archive, which is part of the Research Institute's Special Collections.

To date, we have created the following digital resources:

19th-Century Photography of Ancient Greece

The Edible Monument: The Art of Food for Festivals

Irresistible Decay: Ruins Reclaimed

Mexico from Empire to Revolution

Monuments of the Future: Designs by El Lissitzky

Located in our Research Library's Special Collections, the Gary Edwards Collection consists of some 1500 photographs and slides, dating from 1842 to 1959 and documenting ancient Greek architecture and sculpture; most date from the nineteenth century; the collection also documents photographic processes such as calotypes and salt prints. In the digital resource 19th-Century Photography of Ancient Greece, we make available approximately 200 of the photographs from the collection, focusing on Athens.

The Edible Monument: The Art of Food for Festivals presents materials from the Getty Research Library Special Collections depicting ephemeral art created for festivals in European courts and cities from the sixteenth century through the nineteenth century. For example, Juan de la Mata, dessert chef to Philip V and Ferdinand VI of Spain, published his cookbook Arte de repostería (Art of Making Desserts) in 1747. The site includes de la Mata's design for a table with 100 place settings. On the Web, we are able to include enlarged images, as well as translations from the original texts, recipes, etc.

In Monuments of the Future: Designs by El Lissitzky, we used Web technology to show meaningful groupings of El Lissitzky's works. The arrow/pictogram that is used as a visual and navigational element throughout the site is El Lissitzky's own, as is his "color coding."

Web technology allows us to present an enlarged, readable image of El Lissitzky's address book from the 1920s, which we hold in our Special Collections. Thus we have begun to use the Web to provide remote access to potentially millions of users, to objects that only a handful of researchers can consult in person.

Technology also enabled us to show the accordion foldout that Lissitzky designed of the Soviet Pavilion's installation at the International Press Exhibition in Cologne in 1928. The foldout, consisting of five photomontages, was described by the artist as "a typographic cinematic show."

Resources like these are created by teams that include staff from the Getty Research Library, Special Collections, Collections Cataloguing, the Research Institute Visual Media Services department and Conservation and Preservation department, the Getty Museum Digital Imaging Lab, and the Getty Standards Program.

We have also begun to exploit the Web as a new "publishing venue"; it is particularly useful for publishing resources that we want to be able to disseminate but also update in a timely manner. We recently did an extensive update of the site devoted to Categories for the Description of Works of Art, a metadata standard for art objects and their visual surrogates. We have included detailed, illustrated cataloguing examples in the new online edition of CDWA, including suggestions for the use of controlled vocabularies and local authority files. A Guide to the Description of Architectural Drawings is another standards-based publication that we make available on the Web.

Another online publication that we recently revised and updated is Introduction to Metadata. This is one of the most frequently accessed online resources of the Getty Research Institute. To facilitate downloading and printing by our end-users, we have made the individual sections of the site, including the glossary and metadata "crosswalks," available as PDF files.

Now let's return to the main Getty search page. Through the coordinative, "big picture" work of the getty.edu project, and the "in-the-trenches" work of groups such as the Getty-wide Standards Working Group, which I lead, we have begun to address the issue of integrated access in a systematic, phased way. A first step was to develop guidelines for implementing Description and Keyword metatags across all Getty programs, to optimize our site-wide search engine. Because our pages are clearly and consistently metatagged, they produce brief, easy-to-decipher search results. As I mentioned before, the metatags for the objects and artists in the collections of the Getty Museum are generated automatically; thus we have a combination of human- and machine-generated metatags on our site.

The main search engine provides the closest thing we have to site-wide access at the moment (as I mentioned previously, it does not search proprietary databases such as our online library catalogue, but only Web pages). For example, a simple keyword search for "holbein" from the main gateway retrieves pages from both the permanent collections and two exhibitions in the Getty Museum, as well as relevant publications from Getty Trust Publications. Because our sites are carefully metatagged, even searches on common misspellings or variants of artists' names yield meaningful results. Thus a search on "lissitsky" (a common misspelling) retrieves the El Lissitzky digital resource as well as an archived press release, because the misspelling is included in the Keyword metatags on those pages, as are other transliterated forms of the Russian artist's name. Similarly, a search on "food in art" or "trionfi" will retrieve the Edible Monument site, even though those exact words do not appear on the home page of the site, because the words and expressions included in the Keyword tag on the Edible Monument home page are carefully formulated.
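The effect of carefully formulated Keyword metatags can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page paths, the `PAGES` structure, the `search` function, and the particular variant spellings listed as keywords are hypothetical, and a real site-wide search engine would index pages very differently.

```python
# Hypothetical pages: visible text plus the contents of a Keyword metatag.
PAGES = [
    {
        "url": "/lissitzky/",  # assumed path, for illustration only
        "text": "Monuments of the Future: Designs by El Lissitzky",
        "keywords": ["el lissitzky", "lissitsky", "lisitskii"],
    },
    {
        "url": "/edible-monument/",  # assumed path, for illustration only
        "text": "The Edible Monument: The Art of Food for Festivals",
        "keywords": ["food in art", "trionfi", "festivals"],
    },
]


def search(query, pages=PAGES):
    """Return URLs whose visible text or Keyword metatag contains the query."""
    q = query.strip().lower()
    return [
        page["url"]
        for page in pages
        if q in page["text"].lower() or any(q in kw for kw in page["keywords"])
    ]


print(search("lissitsky"))    # matched through the Keyword metatag only
print(search("food in art"))  # phrase does not occur in the visible text
```

In this sketch the misspelling and the phrase "food in art" are found only because someone deliberately placed them in the keyword list, which is the point made above about human-formulated metatags.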

One of the five key elements for successful Web work that I mentioned at the beginning of this paper is the use of structured vocabularies and thesauri. At the Getty we have been building vocabulary databases for art, architecture, and material culture for a couple of decades now. Of course we also use vocabularies and classification systems like Library of Congress Subject Headings, Library of Congress Name Authority File, the ICONCLASS system for iconographic description, and others.

As we all know, searching for information and images on the Web is usually frustrating and often fruitless. If end-users don't know the name of what they are looking for (a not unusual situation), or don't use the exact form of name or term that is used in the resources they are searching, they can miss a great deal of relevant material.

We make our three vocabulary databases - the AAT, ULAN, and TGN - available free of charge on the Web as lookup tools and cataloguing aids; the data is updated on a monthly basis. This is an example of what Ken Hamma referred to as the union of theory and practice in his opening remarks for this session - where we merge the processes of development and implementation, in this case with regard to our vocabulary databases. We are constantly adding to our vocabularies, both manually and via batch contributions from Getty programs (Provenance Index, Photo Study Collection, IRIS, Getty Museum) and external contributing organizations; but at the same time we are also working on making them available as searching assistants to enhance end-user access. Thus we have begun to work with the getty.edu team on using a combination of information technology and our own potentially very powerful "knowledge bases" to give our users better, more accurate access to our online resources.

I would like to give a few examples of the potential power of our vocabularies as assistants to end-user searching. Let's say an Italian visitor to the Getty Museum remembers an illusionistic ceiling painting he has seen in our galleries. He enters "gherardo delle notti" and retrieves an image of a painting by Gerrit van Honthorst. This is because the Italian nickname (and numerous variations thereof) is "clustered" with all of the other name forms associated with the Dutch artist who was active in Italy and a follower of Caravaggio, noted for his nocturnal scenes.
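The "clustering" of name forms can be pictured as a lookup table in which every variant points to the same artist record. The Python sketch below assumes a tiny hand-made sample (`NAME_CLUSTERS`) and a hypothetical function `preferred_name`; it does not reflect the actual structure or content of the ULAN database.

```python
# Tiny illustrative sample: one preferred name with a few variant forms.
NAME_CLUSTERS = {
    "Gerrit van Honthorst": [
        "Gerrit van Honthorst",
        "Gherardo delle Notti",
        "Gerard van Honthorst",
    ],
}

# Invert the clusters so that any variant, lowercased, leads to the preferred form.
VARIANT_INDEX = {
    variant.lower(): preferred
    for preferred, variants in NAME_CLUSTERS.items()
    for variant in variants
}


def preferred_name(query):
    """Return the preferred name for any known variant, or None if unknown."""
    return VARIANT_INDEX.get(query.strip().lower())


print(preferred_name("gherardo delle notti"))
```

A search front end could then use the preferred name, or the whole cluster, to retrieve the relevant museum page regardless of which form the visitor typed.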

An example from the AAT shows the power inherent in the hierarchical structure of a thesaurus. Let's imagine that another Getty Museum visitor has a vivid mental image of a unique object she saw in our decorative arts galleries. She vaguely recalls that the piece of furniture resembled a cabinet and had a French name and had several compartments, but she doesn't recall the name of the artist who created it or any other language-based details she could use for searching online. She can simply search on "cabinet" in the AAT, browse through the narrower terms under cabinet in the hierarchy, select the descriptors that are French words, view the full records with their definitions, and find the exact term she is looking for, cartonnier. Once she uses this term to search the Getty Museum Web pages, she retrieves the page for the cartonnier designed by the ébéniste Bernard van Risenburgh. Including the broader term "cabinet" in the Keyword tag on the relevant Web page also achieves the same result, without the end-user even having to consult the AAT.
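Browsing from a broad term down to its narrower terms amounts to walking a tree. The following Python sketch uses a small invented hierarchy; the `Term` class, the `descendants` function, and the sample terms and language codes are assumptions made for illustration and are not the actual AAT hierarchy under "cabinet".

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Term:
    label: str
    language: str  # language of the descriptor, e.g. "en" or "fr"
    narrower: List["Term"] = field(default_factory=list)


def descendants(term: Term) -> Iterator[Term]:
    """Yield every narrower term below `term`, depth first."""
    for child in term.narrower:
        yield child
        yield from descendants(child)


# Invented sample hierarchy, not the real AAT structure.
cabinet = Term("cabinet", "en", [
    Term("display cabinet", "en", [Term("vitrine", "fr")]),
    Term("cartonnier", "fr"),
    Term("filing cabinet", "en"),
])

# Narrow the browse to descriptors that are French words.
french_terms = [t.label for t in descendants(cabinet) if t.language == "fr"]
print(french_terms)
```

The visitor in the example would then read the definitions of the few French candidates and pick the one that fits her memory of the object.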

The TGN offers both the power of variant names (a search for "ghent" will retrieve records with the vernacular form "gand," "florencia" or "florence" will retrieve "firenze," etc.) and the hierarchical structure of a thesaurus (e.g., retrieve all of the places in the province of Viterbo, region of Lazio, in Italy).

Our latest work on vocabulary-enhanced searching includes adding some "middleware" between our search engine and our vocabulary databases before launching a user's search of our Web pages. Thus a user could enter a blatant misspelling like "holbine" and still retrieve relevant results. We have already tested and proven that this works not only on our own Web pages, but also on external pages that have not been indexed using our vocabularies. For example, a user enters "ducrux" from the Getty search engine; this term is passed first to the middleware program, which finds the closest likely matches in the ULAN database. Then the user's original search expression, along with all of the name forms from the matching ULAN record(s), is passed to the search engine, and used to search all Getty Web pages. This retrieves the Web page with the self-portrait of the French artist Joseph Ducreux in the Getty Museum. If the user then selects the search engine option to run the same search terms against the World Wide Web, he retrieves five Web sites with works by Joseph Ducreux. If he had simply entered his original search, "ducrux," without the middleware and the ULAN as searching assistants, he would have retrieved no results.
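The middleware step described above can be approximated as: find the closest name forms in the vocabulary, then expand the user's query with every name form from the matching record(s). The Python sketch below uses the standard library's `difflib.get_close_matches` as a stand-in for whatever matching technique the real middleware uses; the sample records, the function `expand_query`, and the 0.7 cutoff are assumptions for illustration only.

```python
import difflib

# Tiny illustrative sample of name clusters; not actual ULAN data.
NAME_RECORDS = {
    "Joseph Ducreux": ["Joseph Ducreux", "Ducreux"],
    "Hans Holbein the Younger": ["Hans Holbein the Younger", "Holbein"],
}


def expand_query(query, records=NAME_RECORDS, cutoff=0.7):
    """Return the original query plus all name forms of closely matching records."""
    # Map every lowercased name form back to its record's preferred name.
    index = {
        form.lower(): preferred
        for preferred, forms in records.items()
        for form in forms
    }
    terms = [query]  # the user's own expression is always kept
    matches = difflib.get_close_matches(query.lower(), list(index), n=3, cutoff=cutoff)
    for match in matches:
        for form in records[index[match]]:
            if form not in terms:
                terms.append(form)
    return terms


# The expanded list would then be handed to the search engine as alternatives.
print(expand_query("ducrux"))
print(expand_query("holbine"))
```

Keeping the original expression in the list means that a query with no vocabulary match still runs unchanged, so the expansion step can only add results in this sketch, never remove them.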

We still have a long way to go. But if we continue to work in interdisciplinary teams to use technology, metadata, and vocabulary tools, with the intellectual oversight and content curation of scholars and art historians to deliver our content via the World Wide Web, one day we will achieve our goal of integrated access for our many and varied users.
 
 


Abstract

This paper will focus on the use of metadata standards (Categories for the Description of Works of Art, VRA Core Categories, Dublin Core, and others) and controlled vocabularies (Art & Architecture Thesaurus, Union List of Artist Names, Thesaurus of Geographic Names, ICONCLASS) to make possible accurate end-user searches of diverse art-historical materials online. It will also examine the processes and methodologies, both human and technical, for intellectually integrating a variety of resources via the World Wide Web. The overarching concern in all of these processes and methods should be to provide high-quality access for the myriad of end-users whom art information professionals can now reach via worldwide information networks. User interface issues will also be addressed.
 
 


Art History Webmasters Association
Research and Communication Tools in Art History
Since November 14, 1997.
