Medium Data Photo History

Medium-data photographic histories

Analyzing and curating large(ish) datasets of photographic objects 

 
 

In their efforts to preserve and restore artworks, art conservators collect lots of data on the objects they study. Using sophisticated analytical techniques—like x-ray fluorescence spectroscopy, high-resolution imaging, and scanning electron microscopy—conservators can learn much about an object’s material history, about how it’s been treated in the past, and about how it’s likely to change in the future.

Conservators and art historians can also use all of this data to learn more about an artist’s process, style, and historical moment. But this is not always easy, because historians must blend technical data with their own expert knowledge about an artist to gain the best insights. Traditional big-data approaches to analyzing unstructured, multidimensional datasets (like the kind you might get from single-cell RNA-seq experiments or complex financial models) aren’t well suited to this kind of task.

A case in point is the early 20th-century American photographer Clarence H. White. Born in Ohio in 1871, White worked at a time when photography was evolving rapidly from a specialized technique into a widely accessible technology that anyone could get their hands on. Over the course of his 30-year career, White experimented with a huge range of photographic techniques. This makes his surviving body of work extremely diverse and endlessly fascinating.

 

Clarence H. White, The Dancers from Barnard College (The Greek Games). Palladium print (1922). Princeton University Art Museum, x1983-725

 

The Princeton University Art Museum was organizing an exhibition of White’s photography in 2017, and they enlisted students and researchers from Yale’s Institute for the Preservation of Cultural Heritage to help them conduct analytic work on their collection. We gathered our instruments and headed down to New Jersey for a weekend of art historical data collection.

We used x-ray fluorescence spectroscopy to gather data on the elemental composition of the image-forming layers in White’s photographs, which can tell us something about the photographic process that White used to print each image. We collected raking-light images of the photographs’ surfaces to visualize their textures. These textures help us understand which photographs might have been printed on similar papers, and thus at similar times. And we consulted with experts on White’s life and art to supplement our analytic datasets with crucial biographical details about the photographer.

Some raking-light photographic surface textures

Some x-ray fluorescence spectra

All of this led us to a problem: How can we use this data in a way that tells us something new about Clarence White as an artist?

These kinds of unstructured, multidimensional datasets are pretty rare in the universe of art history, but they’re going to become more prevalent as the analytic technology for studying the material histories of art objects becomes more commonplace. We needed a way to structure this dataset—to visualize it, stretch it, knead it and shape it into something that yields insights into the artist it represents.

And we needed to do this in a way that allows us to incorporate pieces of crucial historical information about the artists that we’re studying. There’s a lot of expert knowledge that goes into understanding an artist—knowledge that historians gain over the course of decades by reading artists’ letters, visiting artists’ archives, meeting artists’ family members, organizing exhibits, and speaking with other historians. We wanted to make sure that our approach could accommodate all of this valuable information.

So we hand-built a network illustrating the relationships between White’s photographs across all the dimensions of data that we collected.

 

We visualized each photograph as a point in space.

 

We clustered those points according to the year that expert art historians think the photograph was made. The data point in the middle represents a photograph that doesn’t have a date of production associated with it.

 

And we color-coded those points to indicate the process that White used to make each photograph.

 

Finally, we drew links between points that represent photographs with similar paper textures. Thicker lines represent stronger similarities, and thinner lines represent weaker ones. The end result looks like this:

Clarence H. White, Untitled (Anne Brigman posing, Seguinland School of Photography, Five Islands, Maine). Platinum print (n.d.). Princeton University Art Museum, x1983-662

We can start to see things about White’s career that we might not have been able to see before, and to ask and answer questions that we had never thought about. The one undated photograph that we analyzed, for example, has a paper texture very similar to that of an early photograph by White in Princeton’s collection. The undated photograph is of Anne Brigman, a fellow American photographer active after 1901. Thanks to some biographical details on the lives of Brigman and White, we know that the undated photograph can’t possibly come from so early in White’s photographic life. So the photograph with the similar paper texture, an image that White captured as an illustration for a book published in 1903, probably comes from later in White’s career.
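We built our network by hand, but the same structure can be sketched in a few lines of code. The Python example below is only an illustration under stated assumptions: the Photo record, its field names, the texture feature vectors, the cosine similarity measure, and the 0.9 threshold are all hypothetical stand-ins, not the measurements or the comparison method used for the Princeton prints. It treats each photograph as a node, links pairs with similar textures, and lists the dated photographs linked to any undated one.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt
from typing import Optional, Tuple


@dataclass(frozen=True)
class Photo:
    """One photograph, i.e. one node in the network (illustrative fields)."""
    photo_id: str
    year: Optional[int]          # None for an undated print
    process: str                 # e.g. "platinum", "palladium"
    texture: Tuple[float, ...]   # hypothetical texture feature vector


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_links(photos, threshold=0.9):
    """Return {(id_a, id_b): weight} for pairs with similar textures."""
    links = {}
    for a, b in combinations(photos, 2):
        weight = cosine_similarity(a.texture, b.texture)
        if weight >= threshold:  # threshold is an arbitrary assumption
            links[(a.photo_id, b.photo_id)] = weight
    return links


def neighbours_of_undated(photos, links):
    """For each undated photo, list linked dated photos, strongest first."""
    by_id = {p.photo_id: p for p in photos}
    result = {}
    for p in photos:
        if p.year is not None:
            continue
        matches = []
        for (id_a, id_b), weight in links.items():
            if p.photo_id in (id_a, id_b):
                other = by_id[id_b if id_a == p.photo_id else id_a]
                if other.year is not None:
                    matches.append((other.photo_id, other.year, weight))
        result[p.photo_id] = sorted(matches, key=lambda m: m[2], reverse=True)
    return result


# Made-up records for illustration only; not data from the actual study.
photos = [
    Photo("A", 1903, "platinum", (0.9, 0.1, 0.3)),
    Photo("B", 1922, "palladium", (0.1, 0.8, 0.5)),
    Photo("C", None, "platinum", (0.88, 0.12, 0.31)),
]
links = build_links(photos)
undated_matches = neighbours_of_undated(photos, links)
```

A structure like this only surfaces candidate relationships. Deciding what a strong link actually means, as with the Brigman photograph, still depends on the biographical knowledge that historians bring to it.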

This kind of visualization allows us to consider a fairly large number of art objects simultaneously. While an art historian might typically embark on an intensive study of two or three artworks, our methods allow historians to think about 100 times that number of objects. This represents nowhere near the amount of information that demands “big data” analytics, but it’s certainly more information than art historians typically deal with. It’s medium data. It’s a lot of data, but not so much that it becomes intractable for a human being to parse it and gain insights from individual data points.