Identification and Verification

Part I

1. Words

Poetry

  • Most frequent words: poetry, verse, poems, new, magazine
  • Most notable peaks: magazine, poets, monroe, men, poet
  • Most distinctive words: poetry, king, magazine, verse, volume, net, art, death, english, american...

The most obvious indication these words give is that Poetry revolves primarily around art and poetry, which is of course wholly unsurprising.  Digging a little deeper, however, I was struck by how often a magazine titled "Poetry" mentions its own subject matter.  It also makes very heavy use of words like "verse" and "volume," which suggested to me that it was a far more self-referential periodical than one that simply published poems.  I predicted that the magazine was concerned with the discussion of poetry and poetics, which, from a brief skim of the magazine, seems to be more or less correct.

The Egoist

  • Most frequent words: life, man, new, said, time
  • Most notable peaks: new, given, way, course, le (?)
  • Most distinctive words: law, diomedes, interest, liberty, art, pleasure, goodwill, artist, progress, men, believe, think, mother, chastity, life, love, people, dedalus, dante

Based on the word frequencies, The Egoist seems to be largely concerned with the human spirit, with such lofty questions as life and mortality, love, art, pleasure, morality, and time.  In short, The Egoist is truly quite the egoist among magazines!  Skimming over the material, I got the impression that the magazine wants very badly to be an authority on matters of art, morality, politics to some extent, and living.  It is a rather humanist magazine.  (Sidenote: I can only interpret the frequency of le to mean that the magazine contains an overabundance of French expressions which, along with the many Greek mythological references, only serves to support my theory!)
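
I don't know exactly how the tool decides which words count as "distinctive," but one common approach is TF-IDF, which rewards words that are frequent in one document and rare in the others.  Here is a minimal sketch under that assumption; the function names and the two toy "magazines" are made up for illustration and are not real magazine text.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each word per document: term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        scores[name] = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

def most_distinctive(scores, name, k=3):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    ranked = sorted(scores[name].items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Hypothetical toy documents standing in for two magazines.
docs = {
    "magazine_a": "poetry verse new poetry verse volume".split(),
    "magazine_b": "life new law liberty life".split(),
}
scores = tf_idf(docs)
print(most_distinctive(scores, "magazine_a", 2))  # ['poetry', 'verse']
```

In this toy case "new" scores zero because it appears in both documents, which may be why a word can sit near the top of the most frequent list for both magazines without appearing among either one's distinctive words.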

2. Documents

Poetry

  • Number of documents: 123
  • Longest issues: Volume 17.6, Volume 14.3
  • Highest vocabulary density: Volume 2.4, Volume 1.4

The Egoist

  • Number of documents: 74
  • Longest issues: Volume 1.6, Volume 1.16
  • Highest vocabulary density: Volume 5.8, Volume 5.9
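
For anyone curious about what "vocabulary density" measures, I am assuming it is the usual ratio of unique words to total words.  The sketch below shows that calculation; the tokenizer is deliberately crude, the function names are my own, and the sample strings are invented.

```python
import re

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def vocabulary_density(text):
    """Return unique words divided by total words, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Toy strings standing in for a short issue and a longer one (not real text).
short_issue = "the new verse and the new poets"
long_issue = short_issue + " and the new verse and the new poets again"

print(round(vocabulary_density(short_issue), 3))  # 0.714
print(round(vocabulary_density(long_issue), 3))   # 0.375
```

Longer texts repeat common words more, so their density tends to fall as they grow.  That might be part of why the longest issues and the densest issues are different ones for both magazines, though I have not checked this against the issues themselves.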

3. Graphs

Poetry

  • Man, Woman

  • New, Old

The Egoist

  • Man, Woman

  • New, Old

"Man" and "woman" for Poetry is easy!  The first major spike for both "man" and "woman" takes place in 1916, right in the middle of World War I.  The loosening of gender roles got a major push during this time due to the need for women to fill jobs traditionally held by men while the men were away at war.  The second spike occurs in 1920, which, of course, is the year the U.S. ratified the Nineteenth Amendment and instituted women's suffrage.  The dates of the major points for The Egoist regarding "man" and "woman" are less clearly correlated, but they both happen during WWI, again emphasizing the war's influence on gender roles.  What I find most intriguing, though, is that the correlation is far clearer in Poetry, a magazine seemingly concerned only with poetics and aesthetics, than in The Egoist, the magazine that tries to claim a role in all the lofty issues.

"New" and "old," however, presented what looked like interesting spikes on their own, but less notable ones against one another.  The only point of much interest is in Poetry, where "old" far outstrips "new" for a change; this takes place in 1920, which, aside from women's suffrage, also saw the start of Prohibition.  My theory is that such major events, with the amount of resistance they met, might have triggered a good deal of concern and nostalgia for the past.

Part II

1. Word and document patterns

  • Most frequent words: new, colored, negro, man, men
  • Most notable peaks: year, negro, given, years, cents
  • Most distinctive words: colored, negro, new, york, people, race, white, south...
  • Number of documents: 508
  • Longest issues: Blast 1, Crisis 18.2
  • Highest vocabulary density: Others 3.6, Others 3.5

Given what I've seen of Poetry and The Egoist, I'd be quite wary of taking this collection of words as an accurate representation of the corpus--which confirms what Professor Drouin said in class about one or two magazines skewing the whole.  The weight of a magazine affects the whole, which can be dangerous when trying to place the corpus into types or themes!
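
To see how one heavy magazine can skew the whole, compare two ways of measuring a word: pooling every magazine into one big bag of words, versus averaging each magazine's own rate so that every title counts equally.  This is only a sketch with invented numbers and hypothetical names ("big", "small_a", "small_b"), not a description of how the tool actually works.

```python
def pooled_frequency(word, magazines):
    """Relative frequency of `word` when all magazines are merged together."""
    total = sum(len(tokens) for tokens in magazines.values())
    hits = sum(tokens.count(word) for tokens in magazines.values())
    return hits / total

def per_magazine_mean(word, magazines):
    """Average of each magazine's own relative frequency (every title weighs the same)."""
    rates = [tokens.count(word) / len(tokens) for tokens in magazines.values()]
    return sum(rates) / len(rates)

# Hypothetical toy corpus: one large magazine and two small ones.
magazines = {
    "big": ["race"] * 30 + ["other"] * 270,      # 300 tokens, 10% "race"
    "small_a": ["verse"] * 10 + ["other"] * 40,  # 50 tokens, no "race"
    "small_b": ["art"] * 10 + ["other"] * 40,    # 50 tokens, no "race"
}

pooled = pooled_frequency("race", magazines)   # 30 / 400
balanced = per_magazine_mean("race", magazines)  # (0.10 + 0 + 0) / 3
print(pooled, balanced)
```

In the toy corpus the pooled figure is more than double the balanced one, purely because the large magazine contributes most of the words.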

2. Graphs

  • Man, Woman

  • New, Old

One thing I find worth noting is that, in both cases, the words that are less frequent have the most notable peaks.  While "old" is less frequent than "new" overall, its four most notable peaks surpass those of "new" by a good deal; additionally, the two highest peaks for "old" are almost double and triple the highest peaks for "new."  In the case of "man" and "woman," "man" has three major spikes; however, the single most notable spike for "woman" is a little over a full 1% of the words for its point in time, while the highest for "man" is a little over 0.7%.  I'm not quite sure what to make of this, though...
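
One possible explanation, which I have not verified against the corpus, is that each point on the graph is a percentage of the words at that point in time, so a handful of uses in a short issue can produce a taller spike than many uses in a long one.  The sketch below illustrates the arithmetic with invented issues and my own function names.

```python
def relative_frequency(word, tokens):
    """Share of an issue's tokens that are `word`, as a percentage."""
    if not tokens:
        return 0.0
    return 100.0 * tokens.count(word) / len(tokens)

def trend(word, issues):
    """Map each issue label to the word's percentage in that issue."""
    return {label: relative_frequency(word, tokens) for label, tokens in issues.items()}

def peak(series):
    """Return the (label, value) pair with the highest value."""
    return max(series.items(), key=lambda item: item[1])

# Hypothetical issues: two long ones and one short one.
issues = {
    "issue_1": ["man"] * 4 + ["woman"] * 1 + ["x"] * 195,  # 200 tokens
    "issue_2": ["man"] * 3 + ["woman"] * 1 + ["x"] * 196,  # 200 tokens
    "issue_3": ["woman"] * 2 + ["x"] * 48,                 # 50 tokens
}

print(peak(trend("man", issues)))    # ('issue_1', 2.0)
print(peak(trend("woman", issues)))  # ('issue_3', 4.0)
```

Here "man" appears seven times overall and "woman" only four, yet "woman" has the higher peak because its uses are concentrated in the short issue.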

3. Further study

The pair of terms that caught my interest while I was studying the full corpus was "work" and "art."  What struck me was the overall steadiness of "work," which was used at a more or less regular rate through the years, as opposed to "art," which was in general slightly less frequent but had spikes of usage that far outstripped even the best moments of "work."

What this told me was that "work" was a constantly relevant subject, used because it is a part of life and because it is necessary to address.  "Art," on the other hand, was a highly emotional word, going through spikes of use as a country goes through revolutions and uprisings.  It is a highly charged, passionate word that is quietly well-loved all the time, but which also has the potential to rally enormous numbers of supporters at the right moment.

This led me to go back to The Egoist and Poetry to see how they held up against the general trend.  The Egoist was quite straightforward, giving a similar tendency--frequent, steady usage of "work," occasional uprisings of "art."

Poetry, however, threw me for a bit of a loop.

In Poetry, "art" is actually more frequent than "work," which surprised me at first, since this is a break from the trend found in The Egoist and the full corpus; however, I realized it is rather useful to take the title of the magazine into consideration: Poetry is, of course, an ostensibly more art-oriented magazine, with a greater focus on aesthetics, poetics, and art than some of the more politically oriented zines in the collection.  Under such a consideration, the exception makes much better sense.

4. Conclusion

My general conclusion from these analyses is that graphing is a fantastic, fascinating tool for literary study, as it confirms some theories and suggests new ones; however, it is vitally important that any work done with graphs be taken back to the texts.  The graphs can suggest trends, significant dates, and seemingly related data; however, any of this could prove to be coincidence or a matter of sample sizes, insufficiently varied samples, or something of the sort.  Weighted data can prove problematic and misleading.  Anyone working with these sorts of data analyses must be certain to check back against the text itself--this is why I checked my theories about the texts against a skim of the actual corpus material (of course, actually reading might be better!).  I'm quite excited about the potential of such work, though--it could prove helpful in identifying works that are important in ways other than thematic significance!  We can begin to find important books by word relevance, for example.  The possibilities are escaping the top of my head, but I do believe there are a good many of them!


Ack, sorry this turned out so long!  It's the images, I swear!