Archives, Data, and Humanities: A Philosopher’s Reflections

This week our Digital Humanities seminar served as a good reminder of the possibilities and breadth of data in humanities fields. Miriam Posner’s blog post “Humanities Data: A Necessary Contradiction” was not only an excellent introduction to the notion that all objects bear metadata, but also a further case for why philosophers should consider data-based projects: data speaks, and data can argue. Though I am not a historian and do not actively seek out archival projects, I have had a few experiences with archival-turned-dataset research projects that have taught me a great deal about the local history of Dutch Holland, Michigan, Portuguese influence in the former colony of Goa, India, and the vast works of Indian philosopher and writer Rabindranath Tagore. Visiting archives was both astounding and concerning, for different reasons. (As an aside, there was something profoundly sad about visiting the Goan archives in India and seeing worn, worm-eaten, molding diaries falling further into decay. The loss of cultural history like that hurts the soul.) As discussed in class, it is important to remember that an archive tells a story, and that those in control of this narrative actively decide how to sculpt it. Remembering that archives are the results of decisions made by specific people is crucial to pushing against problematic understandings of history and modern culture. One must challenge the easy excuse that historically oppressed or marginalized communities were not participating in events and narratives, because more often than not these communities have been intentionally curated out of such narratives. I was struck by this fact during my sophomore year of college, when I was faced with the task of producing a Holland-based digital humanities project.
I was concerned about the lack of visibility of the Hispanic/Latino community in Holland, both in terms of businesses and physical design and in terms of presence (or rather lack thereof) in the archives. According to the most recent census, this community comprises nearly 30% of the Holland population, and yet there are next to no references to it in the archives. There was a stark contradiction between what the Tihle Archives said was the history of Holland and what the actual communities, physical architectures, and ongoing traditions like Fiesta said was the history. I loved the Data Feminism book by Catherine D’Ignazio and Lauren Klein because it specifically addresses the ways in which data can be shaped to ignore or, in contrast, intentionally reveal undocumented narratives. This focus on articulating narratives, especially counternarratives to the dominant historical discourse, was one I pursued going forward into actual data-centered projects like Ethics of Expropriated Art, which involved museum permanent collection data that demonstrated power dynamics and complex international relationships in art expropriation. This project taught me about the challenges of data curation and standardization. The readings by Gilliland, Tanner, Milligan, and the Library of Congress all pointed to various facets of data and metadata curation standards and practices; they were insightful and would have been incredibly helpful when I was designing my project! Though I am still not 100% clear on all of what TEI does, from what I do understand, it is one more tool to help systematize, organize, and standardize data to make it accessible and computer-analyzable, which is fantastic. These readings also reminded me of how much I love that Omeka lets users add their own metadata categories. That flexibility is so valuable for big, messy projects!

When considering the new capabilities of big dataset curation, I am fascinated by the new possibilities for research approaches. Specifically, with data analysis and visualization tools like Voyant, Palladio, and RAWGraphs, plugging datasets or text files into these programs can actually prompt questions, not just attempt to reveal answers. I liked reading Franco Moretti’s book Graphs, Maps, Trees in my undergraduate years because he dissects the ways in which computer readings of texts present new perspectives and questions for exploration that might not have been realized through close readings. As mentioned above, philosophy does not lend itself to many obvious avenues for data-based projects, so I have not had extensive time to devote to this method of work; however, my understanding of this type of research was broadened by Moretti and then greatly enhanced when I designed and taught the datasets unit of the Mellon seminar. I reengaged in the process with my students as they chose sources like rare books or Twitter hashtags to curate into spreadsheets, asked research questions of the data, ran the data through analysis and visualization programs, and drafted prospectuses for projects based on these initial findings. Taking students through this process was tedious for them, much more so than working with pre-made datasets, but I think it was valuable for them to see from just how many sources they can glean valuable information and compelling research topics. Most importantly, and most relevant to being a philosopher, this work can form arguments, and strong ones at that! For me and many of my students, this was the first argument-based work we had done that did not rely solely on citations and occasional statistics. Graphs, maps, tables, charts, and figures coalesced into robust statements that translate to broader audiences.
I love this aspect of the digital humanities, and though philosophy may not be an obvious or easy fit with this type of work, when the two can come together, I think there is great potential for powerful projects, especially in the areas I am interested in: Latina Feminism and decolonial/anticolonial studies.

