Serendipity in the Digital Age

I have admitted previously to being somewhat of a skeptic (or, at the very least, a doubter) on the topic of “The Digital”: I often find myself rather alarmed by the rapid progression of technology within what would seem just the past 20 years or so, and I usually wonder whether the benefits of convenience necessarily outweigh the potential negative effects that we perhaps do not even fully understand yet.  I am thus probably a bit of a (gasp) traditionalist in this respect.  So on some level, the process Stephen Ramsay describes in “In Praise of Pattern” showcases just what I had feared: that much of the purpose of such Digital Humanities projects just might be mere curiosity.  What, I was poised to ask, might be the cost of such curiosity for its own sake? (I will still ask this, but maybe with less apparent certainty/disdain, less as if I already know the answer, because I honestly don’t.) (I do know, too, that I am oversimplifying the issue, that there are certainly reasons besides curiosity alone, and that curiosity may, in fact, be a blessed thing at times.) After all, Ramsay himself notes that stage directions for Shakespeare’s plays are notoriously sparse, and we also have reason to believe that sets often did not change from scene to scene, instead leaving any change of scene to the imaginations of the play’s viewers.  He thus acknowledges that a certain level of guesswork was necessary in order to explore the topic he was interested in: “And so, faced with this computationally intractable buffet of confusion, I did what any good humanist scholar would do: I guessed” (182).  But to what ultimate purpose does one work at “guessing” in such wise?

These were my thoughts upon initially encountering Ramsay’s article.  What I didn’t expect, however, was to be struck by the genuine earnestness of his argument.  I found what he writes about neutrality particularly refreshing, for instance: not that we should strive eternally for neutrality in our work (which I sometimes feel is desired but inevitably impossible), but instead that we should openly acknowledge the inherently subjective nature of anything created by an individual person (in this case, software), a subjectivity that even so does not discount the value of one’s work: “This does not imply that the software should be neutral, as many tools and web sites in digital humanities try to be. It cannot be neutral in this regard, since there is no level at which assumption disappears. It must, rather, assert its utter lack of neutrality with candor, so that the demonstrably non-neutral act of interpretation can occur” (182).  Additionally, I was very charmed by the sense of discovery Ramsay describes and claims to experience while occupied in The Search (he even calls these moments of unearthing something new “epiphanies,” as strong a description as I have ever heard!), and by the very human, even endearing, manner in which he describes first feeling rather sheepish about discussing his curiosity with those outside the field of English, before ultimately finding greater worth in his own internal yearning for discovery.  The candor with which Ramsay discusses the scholar’s particular joy in the serendipitous encounter was, I found, very compelling, and, from my perspective, quite a nice argument for the worth of such research methods.

Ramsay, Assumptions, and Neutrality

I rather enjoyed the Stephen Ramsay reading and was particularly struck by his view that DH tools and algorithms cannot be objective or neutral, nor should they be. "It cannot be neutral...since there is no level at which assumption disappears. It must, rather, assert its utter lack of neutrality with candor, so that the demonstrably non-neutral act of interpretation can occur" (182). This ties into the topic of selection bias and preservation we were talking about with the archive, but comes up repeatedly in Ramsay's essay as well. He stresses early on that the point of DH is not to find objective answers to interpretive problems, which is an easy leap to make when bringing algorithms and math into the humanities. Rather, data visualization is another way to present and sort through data that already exists, opening new avenues for interpreting patterns that would be harder to notice otherwise.

It feels good to see a scholar outright acknowledge and interrogate the notion of neutrality in algorithms. As this is all in favor of interpretation and analysis, the tools being used and/or created are being crafted for that specific purpose, and will thus be 'tainted' by our academic assumptions from the outset. People have a tendency to assume algorithms and math are impartial and immune to bias, and it is oddly validating to see Ramsay outright saying that is silly and everything is biased by our assumptions in some way.


I'll make this short and sweet. It's yet another Monday and, yes, another blog post in which I will mention that I am, at the moment, very tired and very much fighting the urge to float cartoonlike towards my bed. It is a hard-fought battle, one that I am certain to lose - the question is only when. 

I very much enjoyed the Stephen Ramsay article, specifically when he touches on serendipity (that friend everyone in academia likes to pretend they don't hang out with) and the relationship between the positivism of "real" scientific pursuits and the necessarily interpretive quality of literary studies (a relationship that I think is presented rather problematically in the Jockers reading). I often feel that, when we are prompted or wanting to bolster and defend the sociocultural significance of literature studies, the spectre of STEM lingers in the background and, consequently, any defense of literature is necessarily framed as a comparison between the humanities and the sciences. Thus, Ramsay's acknowledgment of a difference but potential symbiosis between these studies provided a nice moment of optimism.

I've got some more thoughts but I'll save those for class tomorrow. The battle is over. Therefore, time for sleep. 

Patterns and Evidence

(Full disclosure, the download link for the Ramsay reading was not working, so I went online and found a copy of his article "In Praise of Pattern.")

I read the Ramsay first, and it read as a great example of the type of methodological write-up that I would want to turn in to Dr. Drouin for our final semester projects. It was easy to read, he had quite a few lit-nerd jokes in there, and I walked away understanding his central premise. That is, we can use digital humanities to find larger strokes in the source texts (in Ramsay's case, the use of scenes in Shakespeare's plays and how it differed across genres), but those larger strokes don't necessarily scientifically prove anything out of hand. So what if comedies, tragedies, romances, and history plays have different average numbers of distinct scene locations? To say that the scene variance means something thematically is not, then, a scientific argument. It is a literary/humanistic interpretation of the fact that comedies, tragedies, etc. have differing numbers of locations. Ramsay is, essentially, making sure that we understand what the tools of digital humanities scholars give us. They give us hard, scientific data, yes. But they do not give us the key to some unimpeachable literary position. Our theses, our journal articles, those are still the same interpretative statements that we have been making since the dawn of the academy.

The Jockers seems to easily line up with this. Jockers spends time stating that digital humanities tools are, essentially, strip-mining texts for broad tendencies. Digital humanities is, essentially, best used for gathering evidence on a large scale, and less useful for taking on the intricacies of a single novel. It makes sense on a certain level -- why use text miners on a 50-page short story when you could be using them on an author's (or authors') body of text? Again, what you find doesn't necessarily prove anything out of hand. But you can use them as keystones to search out material in the text, or alongside material in the text, to formulate your opinion and lend it the weight of scientific evaluation.

The biggest thing I take away from these readings is that I want my project for the semester to focus on collating/interpreting a large collection of data. My focus on late Victorian/early Modern texts will make that an easy ask when Project Gutenberg .txt files exist. The readings are, also, an important reminder that an English scholar in digital humanities cannot forget their roots in literary criticism. Graphs don't mean anything without the interpretation that we do every day, and the data we uncover does not limit our ideas. As Ramsay says, "We are so careful with our software and with our mathematics -- so eager to stay within the tightly circumscribed bounds of what the data “allows” -- that we are sometimes afraid (or we forget) that all of this is meant to lead us to that area of inquiry where such caution and such tentativeness has no place."

Discussing Derrida and Decay

Before this week I hadn't read much Derrida, and, to my surprise, I found our section of Archive Fever quite interesting (I also hope this was the correct reading, since I wasn't sure about page numbers!). The aspect of Derrida's argument that I found most compelling is his discussion of why we archive the way that we do; from whence does this urge to catalog pour forth?  What drives our seemingly inherent completionism?  In Archive Fever, Derrida asserts that archiving is an attempt to work against the impending and ever-present threat of loss.

Although this thought has permeated much of our discussion in this class, I was quite struck by it here.  Of late I've been considering the processes of loss and mourning, and wondering about how these might filter into all of a life that is ever in a process of decay.  Fascinatingly, Derrida suggests that one of the ways we might combat this knowledge of decay and impermanence is through the archive, for indeed:

“There would indeed be no archive desire without the radical finitude, without the possibility of a forgetfulness which does not limit itself to repression. Above all, and this is the most serious, beyond or within this simple limit called finiteness or finitude, there is no archive fever without the threat of this death drive, this aggression and destruction drive” (19).

As we have been working through various journals and other archives lately, and further discussing just how fluid, subjective, and difficult the archival process is, I've personally been hit with a few doubts as to the possibility of ever "properly" documenting or archiving anything.  What is the purpose, after all, if one cannot be sure that one's attempt at preservation is ever good enough?  But when I consider the possibility that we archive to work against the decay of what is inherently physical, this process begins to take on new significance for me, and I almost begin to feel a greater respect for the yearning expressed in the attempt than for the result itself.

Two Masses and Voyant tool

I tried to compare two issues of The Masses: the June 1914 issue and the June 1915 issue. One of the most salient differences in word frequency was in the use of the words 'women' and 'men'. The frequent words in the 1914 issue included both 'men' and 'women' in a similar ratio, but in the 1915 issue only 'men' occurred frequently. I want to ask about the reason for this difference, because it could bear on the relationship between class and women. In addition, in the 1914 issue I could find some words related to nationality, such as 'Mexico' and 'American', but in the 1915 issue I could not find such words. This seems an important theme to me, because it could be evidence that the magazine also tries to encompass different social movements in different countries. Both issues seem to emphasize the words ‘masses’ and ‘new’. From this, I gathered that the magazine seems to focus on the movement of the masses and on making a new change in society, and I want to study this further.
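
A comparison like this can also be checked outside of Voyant. What follows is only a minimal sketch in Python, not what Voyant does internally: the function name `relative_frequencies` is hypothetical, and the two short strings are invented placeholders standing in for the full text of each issue. It counts each term per 1,000 tokens so that issues of different lengths can be compared fairly.

```python
import re
from collections import Counter

def relative_frequencies(text, terms):
    """Return occurrences per 1,000 tokens for each term of interest."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {term: 1000 * counts[term] / total for term in terms}

# Invented placeholder text, NOT quotations from The Masses.
issue_1914 = "The men and the women of Mexico marched; women and men alike."
issue_1915 = "The men spoke, and the men voted, and the new masses listened."

terms = ["men", "women"]
freq_1914 = relative_frequencies(issue_1914, terms)
freq_1915 = relative_frequencies(issue_1915, terms)

for term in terms:
    print(term, round(freq_1914[term], 1), round(freq_1915[term], 1))
```

Because the text is split into whole tokens, 'women' is not accidentally counted as an occurrence of 'men', which a simple substring search would do.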

Variety and Similarity

Is one purpose of the archive simply to preserve the variety of any given age?

This question about variety led me to consider what similarities might exist between journals that would seem to be in opposition to one another, for instance, a self-proclaimed "conservative" journal like The Owl and an undeniably political journal such as The Egoist. Thus, I took three issues from each journal and, after putting them into Voyant Tools, took a look at the terms they have in common, featured below:


I'm going to come back to this post and add some more. But, for right now, I'm appreciating the movement that Voyant Tools allows for this issue of The Masses. I've been reading some queer theory for Dr. McLaughlin's class, and one thing that has really spoken to me is the idea of stasis as a tool of potential oppression used by political hegemonies. Perhaps, through reanimation (digitization, playing with words, etc.), there is a reimbuing of political potential.

Lab 2/12

I was most curious about the potential political ramifications and biases that an archive may be subject to. So rather than asking whether there are any biases, I decided for this exercise to ask where biases manifest, given the nature of modernism.

For this lab, I put the June 1914 issues of Blast, The Crisis, and The Masses into a corpus and pumped the whole thing into Voyant Tools. By doing this, I hoped to examine the points of friction between the issues and, potentially, any political biases in the archiving of modernist journals.

For starters, the word "like" appears with a higher frequency in Blast than in the other journals, and the word links map has it connected to the term "God," indicating a tendency to compare people to higher beings, which further indicates a sense of superiority coming from the Vorticists. Meanwhile, terms such as "negro" and "colored" are almost exclusively confined to The Crisis, highlighting the Anglocentric perspective of Blast and The Masses.
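
A link such as the one between "like" and "God" could be double-checked by counting which words fall near a target word. The sketch below is an assumption-laden illustration in Python, not Voyant's own method: the function `collocates`, the window size, and the sample sentence are all made up for demonstration.

```python
import re
from collections import Counter

def collocates(text, target, window=5):
    """Count the words appearing within `window` tokens of each `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    nearby = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            start = max(0, i - window)
            # Words before the target plus words after it, target excluded.
            nearby.update(tokens[start:i] + tokens[i + 1:i + 1 + window])
    return nearby

# Invented sample text, NOT a quotation from Blast.
sample = "We are like God. They move like a machine, like God in motion."
links = collocates(sample, "like", window=2)
print(links.most_common(3))
```

A raw count of neighbors does not say how the words are being used, so a high count for "god" near "like" would still need a close reading of the passages to confirm that a comparison is really being made.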

Trying to Explain Why Archives Make The Choices They Do

On the general question of whether an archive must explain itself and the choices it makes, the largest objection I could think of is time. Digitizing and/or cataloging items already takes quite a bit of time. Add in having to explain why you made the choice you did, and you're now spending even more time. And what explanation is long enough? Or too short?


I think a potential solution to this problem is also a potential side step. By linking together texts/objects with metadata tags, you are showing a general theme through specific items that can span the entire archive (as well as helping with search and user access). Perhaps you do not have to excuse your choice as much as you have to prove that said choice has a place in the archive. Maybe that's enough.

That alone is a big ask, as which terms to use as metadata tags for which journal can be rather subjective. I have decided to explore using major terms found in the text itself as potential metadata tags. That can take a bit of time, so you would want to automate it.

To illustrate this, I have used Voyant Tools to automatically read the PDF copies of Blast issue 1, Camera Work number 5, and The Dome, vol. 1 no. 5. By doing so I was able to gather a word map which I have provided below:

The "TermsBerry" that Voyant provides is also useful in this regard, but Voyant does not like making an image of it that I can link here, apparently.

As you can see, there are plenty of words that we could consider using as metadata tags. However, there are also spelling errors, and some less useful words like "good" or "new." We discussed methods in class that we can use to clean up XML files, and I feel that is a good way to avoid the spelling errors. This may work rather well if I were to expand the scope even further and include, say, all of the MJP's offerings. Then we can get search tags that appear in (at least) a majority of the magazines, and show a clear link between the offerings.
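
The "majority of the magazines" idea can be sketched in a few lines of Python. This is only one possible approach under stated assumptions: the function names `candidate_terms` and `shared_tags`, the tiny stopword list, the `top_n` cutoff, and the three placeholder texts are all hypothetical, and real input would be the cleaned text of each magazine.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "good", "new"}

def candidate_terms(text, top_n=50):
    """Return the most frequent non-stopword terms in one magazine."""
    tokens = [
        t for t in re.findall(r"[a-z]+", text.lower())
        if t not in STOPWORDS and len(t) > 2
    ]
    return {term for term, _ in Counter(tokens).most_common(top_n)}

def shared_tags(texts_by_title, top_n=50):
    """Keep terms that are candidates in more than half of the magazines."""
    doc_freq = Counter()
    for text in texts_by_title.values():
        doc_freq.update(candidate_terms(text, top_n))
    threshold = len(texts_by_title) / 2
    return sorted(term for term, n in doc_freq.items() if n > threshold)

# Invented placeholder texts, NOT the contents of any real magazine.
magazines = {
    "Magazine A": "Art and the new art of painting. Painting is art.",
    "Magazine B": "Photography is an art; the camera makes art.",
    "Magazine C": "Poems and drawings, good poems, drawings of poems.",
}
tags = shared_tags(magazines)
print(tags)
```

Filtering out words like "good" and "new" with a stopword list handles the less useful terms, but it does nothing for spelling errors from the scans, which would still need to be cleaned up beforehand.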