Identification and Verification

Part I

1. Words

Poetry

  • Most frequent words: poetry, verse, poems, new, magazine
  • Most notable peaks: magazine, poets, monroe, men, poet
  • Most distinctive words: poetry, king, magazine, verse, volume, net, art, death, english, american...

The most obvious indication these words give is that Poetry revolves primarily around art and poetry, which is of course wholly unsurprising.  Digging a little deeper, however, I considered the fact that a magazine titled "Poetry" would contain so many mentions of its own subject matter.  In addition, it contains a very high usage of words like "verse" and "volume," which suggested to me that it was a far more self-referential periodical than one that simply published poems.  I formulated a prediction that the magazine was concerned with the discussion of poetry and poetics, which from a brief skim of the magazine seems to be more or less correct.

The Egoist

  • Most frequent words: life, man, new, said, time
  • Most notable peaks: new, given, way, course, le (?)
  • Most distinctive words: law, diomedes, interest, liberty, art, pleasure, goodwill, artist, progress, men, believe, think, mother, chastity, life, love, people, dedalus, dante

Based on the word frequencies, The Egoist seems to be largely concerned with the human spirit, with such lofty questions as life and mortality, love, art, pleasure, morality, and time.  In short, The Egoist is truly quite the egoist among magazines!  Skimming over the material, I got the impression that the magazine wants very badly to be an authority on matters of art, morality, politics to some extent, and living.  It is a rather humanist magazine.  (Sidenote: I can only interpret the frequency of le to mean that the magazine contains an overabundance of French expressions which, along with the many Greek mythological references, only serves to support my theory!)

2. Documents

Poetry

  • Number of documents: 123
  • Longest issues: Volume 17.6, Volume 14.3
  • Highest vocabulary density: Volume 2.4, Volume 1.4

The Egoist

  • Number of documents: 74
  • Longest issues: Volume 1.6, Volume 1.16
  • Highest vocabulary density: Volume 5.8, Volume 5.9

3. Graphs

Poetry

  • Man, Woman

  • New, Old

The Egoist

  • Man, Woman

  • New, Old

"Man" and "woman" for Poetry are easy!  The first major spike for both "man" and "woman" takes place in 1916, right in the middle of World War I.  The loosening of gender roles got a major push during this time, since women were needed to fill jobs traditionally held by men while the men were away at war.  The second spike occurs in 1920, which, of course, is the year the U.S. instituted women's suffrage.  The dates of the major points for The Egoist regarding "man" and "woman" are less clearly correlated, but they both happen during WWI, again emphasizing the war's role in shifting gender roles.  What I find most intriguing, though, is the fact that the correlation is far clearer in Poetry than it is in The Egoist--a magazine seemingly concerned only with poetics and aesthetics shows it better than the magazine that tries to claim a role in all the lofty issues.

"New" and "old," however, presented what looked like interesting spikes on their own, but less notable against one another.  The only point of much interest is the one in Poetry where "old" far outstrips "new" for a change; this takes place in 1920, which aside from women's suffrage is home to the institution of Prohibition.  My theory is that such major events, with the amount of resistance they met, might have triggered a good deal of concern and nostalgia for the past.

Part II

1. Word and document patterns

  • Most frequent words: new, colored, negro, man, men
  • Most notable peaks: year, negro, given, years, cents
  • Most distinctive words: colored, negro, new, york, people, race, white, south...
  • Number of documents: 508
  • Longest issues: Blast 1, Crisis 18.2
  • Highest vocabulary density: Others 3.6, Others 3.5

Given what I've seen of Poetry and The Egoist, I'd be quite wary of taking this collection of words as an accurate representation of the corpus--which confirms what Professor Drouin said in class about one or two magazines skewing the whole.  The weight of a magazine affects the whole, which can be dangerous when trying to place the corpus into types or themes!
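To make the skewing concrete, here is a minimal sketch of how one heavy magazine can dominate a pooled word list. The magazine names and texts below are made-up placeholders, not data from the actual corpus, and this is not how Voyant itself computes its summary; it only contrasts counting every token together with averaging each word's share per magazine.

```python
from collections import Counter

# Hypothetical toy corpus: one large magazine and two small ones.
# Names and texts are placeholders, not taken from the real collection.
corpus = {
    "big_magazine": ["colored school negro school colored work"] * 50,
    "small_magazine_a": ["poetry art verse art"] * 5,
    "small_magazine_b": ["life art man art"] * 5,
}

def magazine_counts(issues):
    """Count the tokens across all issues of a single magazine."""
    counts = Counter()
    for text in issues:
        counts.update(text.split())
    return counts

def pooled_counts(corpus):
    """Count every token in the corpus, ignoring which magazine it came from."""
    counts = Counter()
    for issues in corpus.values():
        counts.update(magazine_counts(issues))
    return counts

def averaged_shares(corpus):
    """Compute each word's share within each magazine, then average across magazines."""
    shares = Counter()
    for issues in corpus.values():
        counts = magazine_counts(issues)
        total = sum(counts.values())
        for word, count in counts.items():
            shares[word] += (count / total) / len(corpus)
    return shares

# Pooled counts are dominated by the big magazine's vocabulary.
pooled_top = [word for word, _ in pooled_counts(corpus).most_common(2)]
# Averaging per magazine lets a word shared by the small magazines surface.
averaged_top = averaged_shares(corpus).most_common(1)[0][0]
print(pooled_top, averaged_top)
```

In this toy case the pooled list is led by the big magazine's words, while the per-magazine average is led by the word the two small magazines share, which is the kind of difference worth checking before reading the whole-corpus list as representative.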

2. Graphs

  • Man, Woman

  • New, Old

One thing I find worth noting is that, in both cases, the words that are less frequent have the most notable peaks.  While "old" is less frequent than "new" overall, its four most notable peaks surpass "new" a good deal; additionally, the two highest peaks for "old" are almost double and triple the highest peaks for "new."  In the case of "man" and "woman," "man" has three major spikes; however, the single most notable spike for "woman" is a little over a full 1% of the words for its point in time, while "man" is a little over .7%.  I'm not quite sure what to make of this, though...
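One possible way to see how a less frequent word can still have the higher peak is to compute relative frequencies by hand. The sketch below assumes that a "peak" is simply a word's share of all the words in one issue; the three issues are invented for illustration and are not from the corpus.

```python
def relative_frequency(word, text):
    """Share of the tokens in `text` that are exactly `word`."""
    tokens = text.lower().split()
    return tokens.count(word) / len(tokens)

# Hypothetical issues, six words each, invented for illustration.
issues = [
    "new new old man new work",
    "old old old woman old work",
    "new art new new work man",
]

def overall_count(word):
    """Total occurrences of `word` across all issues."""
    return sum(text.split().count(word) for text in issues)

def highest_peak(word):
    """Largest single-issue relative frequency of `word`."""
    return max(relative_frequency(word, text) for text in issues)

for word in ("new", "old"):
    print(word, overall_count(word), round(highest_peak(word), 3))
```

Here "new" occurs more often overall because it is spread across two issues, while "old" is concentrated in one issue and so peaks higher there. Concentration in a few issues, rather than overall popularity, may be what the tall spikes are showing.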

3. Further study

The pair of terms that caught my interest while I was studying the full corpus was that of "work" and "art."  What struck me was the overall steadiness of the use of work, which was used more or less at a regular rate through the years, as opposed to art, which was in general slightly less frequently used, but which had spikes of usage which far outstripped that of work's best moments. 

What this told me was that work was a constantly relevant subject, used because it is a part of life and because it is necessary to address.  Art, on the other hand, was a highly emotional word, going through spikes of use as a country goes through revolutions and uprisings.  It is a highly charged, passionate word that is well-loved all the time in the dark, but which also has the potential to rally up enormous numbers of supporters at the right time.

This led me to go back to The Egoist and Poetry to see how they held up against the general trend.  The Egoist was quite straightforward, giving a similar tendency--frequent, steady usage of "work," occasional uprisings of "art."

Poetry, however, threw me for a bit of a loop.

In Poetry, art is actually more frequent than work, which surprised me at first, since this is a break from the trend found in The Egoist and the full corpus; however, I realized it is rather useful to take the title of the magazine into consideration—Poetry is, of course, an ostensibly more art-oriented magazine, with a greater focus on aesthetics, poetics, and art than some of the more politically-oriented zines in the collection.  Under such a consideration, the exception makes much better sense.

4. Conclusion

My general conclusion from these analyses is that graphing is a fantastic, fascinating tool for literary study as it confirms some theories and suggests new ones; however, it is vitally important that any work done with graphs be taken back to the texts.  The graphs can suggest trends, significant dates, seemingly related data; however, any of this could prove to be coincidence or a matter of sample sizes, insufficiently varied samples, or something of the sort.  Weighted data can prove problematic and misleading.  Anyone working with these sorts of data analyses must be certain to check back against the text itself--this is why I checked my theories about the texts against skimming of the actual corpus material (of course, actually reading might be better!).  I'm quite excited about the potential of such work, though--it could prove helpful in identifying works that are important in different ways from thematic significance!  We can begin to find important books by word relevance, for example.  The possibilities are escaping the top of my head, but I do believe there are a good many of them!

Lab 10/24

Part I
 
1.  Others is a poetry magazine, so I imagine that words are even more important (or loaded) than they might be in a prose publication.  The five most frequent words used in the magazine are “old,” “night,” “little,” “love,” and “eyes.”  The three words with the most notable peaks are “new,” “shall,” and “things.” Among the most distinctive words were “miggles” (?), “revenge,” and “river.”  Based on the data I have just mentioned, I can likely assume that the magazine uses metaphors and description to convey its message.
 
2. The longest issue is Vol 5 Issue 6, and the shortest is Vol 3 Issue 6.  Supposedly, the shortest issue has the highest vocabulary density. However, that really does not make sense based on the issues I have looked at. Ultimately, the spectrum demonstrates to me that all issues of Others are relatively short.
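The link between short issues and high vocabulary density may be less strange than it looks. Here is a minimal sketch, assuming that vocabulary density means unique word forms divided by total words (a type-token ratio); that definition is my assumption, and the sample text is invented. The "long issue" is just the short one repeated, which is an extreme case, but it shows the direction of the effect.

```python
def vocabulary_density(text):
    """Unique word forms divided by total words (a type-token ratio)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens)

# Hypothetical sample: the long issue is the short one repeated ten times.
short_issue = "the old night and the little river"
long_issue = " ".join([short_issue] * 10)

short_density = vocabulary_density(short_issue)
long_density = vocabulary_density(long_issue)
print(round(short_density, 3), round(long_density, 3))
```

Because common words keep recurring as a text grows while new words arrive more slowly, longer documents tend to score lower on this ratio, so the shortest issue having the highest density could be an effect of length rather than of richer language.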
 
3.  Based on the Word trends graph created by the data, it is interesting that between the words “old” and “new,” the word “old” is the one to experience two medium-level inclines and two sharp inclines in usage.  The word “new” does show minuscule dips and increases, but generally, its usage does not vary so drastically.

1.  In The Crisis magazine, the most frequently used words are “colored,” “negro,” “new,” “white,” and “school.”  Distinctive words from various issues are most commonly “negro” and “colored.”  Other distinctive words are of interest as well, such as “woman,” “law,” “social,” and “work.”  From this information, I think it would be safe to assume that readers of these publications (and the public in general) closely associate collective identity with skin color, and not necessarily ethnicity.
 
2.  This archive has 148 documents in it, making it a rather large collection to analyze.  The longest publication is from Volume 18 Issue 2, and the shortest publication is from Vol 14 Issue 3.  Once again, the issue with the highest vocabulary density is from the shortest publication in the archive.  I think the size of the shortest issue of The Crisis (52 pages) indicates that the magazine had an extremely resourceful and perhaps better trained staff than other publications.  Fifty-two pages of information is still a significant amount to produce in a month’s time.
 
3.  I found it more difficult to interpret the graphs I created from various tag-cloud words.  There is so much more information represented by The Crisis graph, so I’ve had a harder time finding significant trends or irregularities.  I do think there might be something interesting to investigate in the following graph, however.  I input the words “time” and “work” and found that for the most part, the word “work” takes precedence within The Crisis.  However, there are two issues in particular in which this is not the case.  I’m curious whether further investigation of the data might indicate some reason as to why “time” is more interesting or important to the reader than “work.”
 
Based on the data gathered from both of my sources, it is obvious that the two publications are concerned with very different things.  Whereas The Crisis indicates the importance of issues, Others seems to concern itself more with the way in which issues are recounted.  Although Others uses the word “old” more often than the word “new,” The Crisis seems to be more concerned with immediate words (like “negro,” “school,” etc.).
 
Part II

When I click on the link to view Voyant for the entire corpus, I am not shown a tag-cloud; however, I do see statistics displayed, such as the following:

1.  The word patterns throughout the corpus look very much like the word patterns I found in The Crisis.  For instance, the most frequently used words were “new,” “colored,” “negro,” “man,” and “men,” respectively.  In the case of the first magazine I researched (Others), the corpus indicates a difference in the frequency of the use of the word “new” (as “new” was not used so much as was the word “old”).

2.  The longest document in the corpus actually comes from Blast; however, the shortest document in the corpus comes from Others (which I researched during Part I of the lab).  The document with the highest vocabulary density is from the same issue of Others.  I do not recall the word “year” showing a notable peak in frequency during the first part of the lab, so its usage may have come primarily from a publication I have yet to research.  It does appear as though researching The Crisis gave me a good idea of what the corpus as a whole might contain.

3.  It does seem as though The Crisis steers our study of the corpus as a whole, mainly because it composes the largest of all of the collections.  The Crisis seems to be a publication that is not necessarily modernist; rather, it was merely produced during the modernist era, so The Crisis might pervert a researcher's understanding of general trends in modernist literature.  A question I have is why The Crisis is considered a modernist journal.  Nevertheless, a question I have that is based on the data is whether the number of times the term “man” is used (when “woman” is not) indicates that modern publications were not as gender-progressive as I would have suspected.  Of course, I remember our discussion of the romantic era’s obsession with the word “reason,” so I know that my question might prove to be inadequate.

However, the usage of “men” is much greater than the usage of “people,” and “people” is much more common than “woman.” This doesn’t seem avant-garde to me.

 

Voyant Tools Search

(I could not figure out how to embed my skins, so I took snapshots of them, sorry. I'm also not sure if I did this right, so here I go!)

Part I.

 

The Egoist

1)      The most frequent words in The Egoist are life, man, new, said, and time. The most notable peaks in frequency are new, given, way, course, and le. The most distinctive words are law, art, progress, chastity, and love, as well as many others, but those are the top ones for each issue.

 

This tells us that the words used most frequently in the magazine are terms that coincide with ideas of what femininity is. However, if you look at the works in the volumes, they do not really show this at all. In fact, most of the works seem not to be about women, despite this being a sister magazine to “The New Freewoman,” which I found to be an interesting contrast.

 

2)      There are 74 documents contained in the corpus. The longest documents in the corpus are vol. 1 no. 6 and vol.1 no. 16. The issues with the most vocabulary density are vol. 5 no. 18 and vol. 5 no. 9.

 

These issues coincide with one another in that they present similar contributors; William Carlos Williams, for example, appears in all of them. He seems to be a very popular poet for the magazine to use.

 

3)      Life and man are words that occurred frequently as the top words in this magazine and in The New Freewoman. I find that interesting considering the magazines were for women. After all the title of the magazine was “The New Freewoman”. The summary was pretty accurate with these words in being the most frequent and having them be distinctive words in certain issues.

4)     

 

The New Freewoman

1)       The most frequent words are man, men, life, women, new. The most notable peaks in frequency are make, little, say, things, think. The top five most distinctive words are freedom, art, democracy, women and women again.

 

The magazine, while a female-focused magazine, seems to refer to men quite often, just as The Egoist did. I found it interesting that “man” was the most frequent word for this magazine, unlike in The Egoist. “Life” wasn’t the second most frequent, as “men” came before it, which was even more intriguing since, once again, this was a women’s magazine.

2)       There are only 13 documents in the corpus. The longest issue is vol. 1 no. 9. The issues with the highest vocabulary are vol.1 no.12 and vol.1 no. 11.

 

I found the information peculiar, as the issues with the highest vocabulary density were in the same volume but in reverse order: no. 12 was higher than no. 11. When I looked into the issues I was surprised. No. 12 was a lot more serious in topic than no. 11; although no. 11 was serious as well, it was just less deeply in tune with the outer situations of womanhood.

3)       I chose this magazine because it was before The Egoist and because they had similar trends with the word man and life despite being a female magazine. It seems during the time of the magazines life was a topic repeated throughout each issue and

4)       

Blast

1)       The most frequent words are art, life, great, man and war. The most notable peaks in frequency are world, form, men, nature and new. Words with the highest vocabulary density are none. The top two words that are the most distinctive are arghol and war.

 

This information reveals that The Blast does not have a lot of words like the longer magazines I’ve chosen. Yet the words life and man are once again in the top of the most frequent words, which is something I noticed about The Blast magazine and chose it as well to compare with the others. Life and man during their times seemed to be important topics. However war seems to be dominant as well in the corpus being not just frequent but a top distinctive word.

 

2)       There are only 2 documents in the corpus. Blast 1 is the longest issue by word. Blast 2 has highest vocabulary density.

 

This information shows that while there are not a lot of issues in The Blast, the last issue in the corpus has a strong use of words, which is interesting, as one would think the first issue would be the one.

 

3)      Life and man have been my focus for this assignment, and while The Blast has nothing to do with The Egoist and The New Freewoman, I feel that it is important to look at a third source where life and man were important. It shows that during its time the concentration of life and man in written works was an important part of the

4)       

 

Part II.

 

1)      The most frequent words used in the whole of the corpus are new, colored, Negro, man, and men. The most notable peak in frequency was the word year. The top distinctive words were colored and Negro.

 

The corpus as a whole seemed concerned with the colored man, and life. This concern was interesting as it did not show up that clearly in the magazines I had chosen. While man was indeed a frequent subject, the colored man was not. Life was still high in word use even though it wasn’t in the top five of the most frequent words, unlike man, which ended up in the top five.

 

2)      There are 508 documents in the corpus. The longest document was The Blast 1. The highest vocabulary document was “Others” vol. 3.6.

 

Although Blast 1 on its own corpus was the longest as well I just was not expecting it to be the longest for the whole of it.

 

 

I was greatly surprised by the number of documents in the corpus as a whole. I was also surprised that the magazine “Others” turned out to be the one with the highest vocabulary density, and even more surprised that Blast 1 was the longest document.

 

3)      Once again my focus was on life and man as the terms for the whole of the lab. These terms interested me because they always manage to be in the top. Even when life wasn’t on the top of the corpus as a whole man still managed to be in the top. Man seems to be an important word for the written works through the corpus as a whole as it’s often a highly used word. So it makes me think that the subject of man was important for people long ago even more so than today although man will always be an important subject in the end since we are man.

4)      

 

 

Men vs Women

http://voyeurtools.org/tool/TypeFrequenciesChart/?corpus=1329252640907.4758&stopList=stop.en.taporware.txt&type=people&mode=corpus

I am having technical difficulties this morning with images. I wanted to compare the differences between man and woman in the journals. I thought it was interesting to find that men dominated in all of the journals except for three issues. I could not get the images to work on the journal to further my interest but I would like to go back and look at it later.

The three issues in which woman came up much more than man were: 1.11 1915 -Feb; 7.1 1920 -May/June; and 7.2 1920 -July/Aug.

I also compared people and people sticks to the middle in between man and woman. In the majority of the issues men are spiked and women are not as high. As soon as the images are back up I'd like to figure out what some of the issues are about and why those three particular issues are focused on woman and the rest are mostly male based. 

Feelings for Death: Heaven and Hell and Social Consciousness

http://voyeurtools.org/tool/TypeFrequenciesChart/?corpus=1329252640907.4758&stopList=stop.en.taporware.txt&type=heaven&type=hell&mode=corpus&freqsMode=raw

 

I am having SO many technical issues, I think it best just to embed the URL for this particular chart.  Anyway, I compared "heaven" and "hell" in order to see how often the "Little Review" published material relevant to either concept, and to see which concept won out.  In a time of strife and war, it is possible that a society's notion of Heaven or Hell, and their concern with one or the other may reflect their opinions about the war. 

 

Interestingly, under the September 1914 "Little Review," the Voyant chart shows a peak in usage of both of the terms, and the content reflects this correlation.  While the table of contents does little to explain the form of the graph, upon looking at the contents, one sees material such as Ford Madox Hueffer's "Hell: (A Part of Heaven Overlooked by Ford Madox Hueffer)"

 

Hell

(A Part of Heaven Overlooked by Ford Madox Hueffer.)

Heaven and Hell are together.

As we walk home on a street in Heaven, in the evening,

Those in Hell will stalk past us

(For Hell is a condition, not a place)

And when we return at dawn will we still see them—

Men bearing infants born dead,

Kissing the inert purple cheeks ;

(For the kiss will be the one punishment of Hell) ;

Men and women holding the severed heads of those they once spat on.

Before a king kissing the head of his queen will we stop,

To give him a kind word ;

Or before an anarchist clasping the head of the king;

Or before a woman carrying the head of the anarchist—

Each unaware of the other's presence.

We will see them walking up and down the streets of Heaven

For countless years,

Till the day when the heads will disappear,

And the head-bearers build homes next to our own.

 

It is hard to decipher precisely the attitude intended by the man who would be Ford Madox Ford, but perhaps that is precisely the point.  The society is ambivalent about WWI, which had just begun two months prior.  But death is certain, even if everyone is not entirely aware.  It seems that, perhaps, Heaven and Hell can coexist because suffering and happiness can coexist in a society. 

The Little Review

http://voyeurtools.org/tool/TypeFrequenciesChart/?corpus=1329252640907.4758&stopList=stop.en.taporware.txt&type=great&type=good&mode=corpus

I found this exercise to be very interesting, in part because I understood how to navigate Voyeur more easily than I did Gephi. Manovich's article also helped, because I was able to see what Manovich referred to as "media visualization" (i.e. the tag cloud by which we select words to research).

After a few random tries, I became interested in the graph that resulted from my search of the words "good" and "great." In particular, in March of 1917, there is a decline in the usage of the word "good," and a sharp increase in the usage of the word "great."  I was curious why this could be. (An educated guess led me to question whether the reason might be articles about the "Great War.")

Interestingly, there was no mention of the term "Great War" in the magazine.  However, it appears that the word "great" may have entered the subconscious of the public as they anticipated the US entry into the War in April of 1917. This may be supported by the fact that only about half of the uses of the word "great" in The Little Review held a positive connotation.  The term "great" occurs more often as a descriptive word with a negative context, or as a descriptive word with no obvious bent towards either a positive or negative view.

I think it is noteworthy that there were three different books advertised in the magazine in which the word "Great" appears in the title.  Perhaps there was great anxiety as well as an understanding that the time at hand was distinguishable from any the world had ever seen.

Lastly, I should mention that the only uses of the word "good" in the magazine were in reference to art (Examples: "good musician," "good work of art," "good art," etc.)  My guess is that the writers and editor of the magazine considered the humanities to be one of the only things a person might label as "good."  World War I expanded the human imagination in darker ways.  People, governments, militias, etc., were found to be "bad." People must have been floundering to identify something as inherently "good," and in the case of The Little Review this label was concretely attached to the discipline of art.

Graphing The Little Review

 

Using the Voyant Tools graph, I discovered something interesting when I separated the words “soul,” “artist,” “light,” “music,” and “hand” from the rest of the words. They were at the very bottom of the word list, although they still occurred a large number of times. The word “soul” ended up with the highest peaks; in fact, in volume 6 no. 5 of “The Little Review,” “soul” reached its highest point and outdid all of the other words combined. When I looked at the issue, I realized that the first item might have been the cause: what I am calling a “poetic short story,” titled “Cast Iron Lover,” by Else Baroness von Freytag-Loringhoven. She uses “soul” constantly in her work, for example in the repeated phrase “mine soul,” and this is clearly one reason “soul” reached its highest peak.

It is the word “light,” however, that continues to climb above the other words at the end of the graph. This is due to the last issue of “The Little Review” on ModJourn, volume 9 no. 5. Many of the works of literature in this issue reflect a combination of light and dark by infusing light with night.  I found this to be extremely interesting, as it was a winter issue of “The Little Review.” So the top words of the combination I created were “soul” and “light.” “Music,” “hand,” and “artist” were the lowest in the list. I found that surprising, considering one would think “artist” at least would have been higher. But I guess if the artists referred to themselves as such it might seem disrespectful, or perhaps it just wasn’t in the mood of their works? “Music” and “hand” were more understandable.

Reading the graphs of “The Little Review,” I discovered that even though the words I chose were used less in the whole magazine, they were still pretty important to the works that the magazine published. I was surprised by certain peaks, such as soul’s triumphant peaks and then light outdoing soul at the very end. I found that the graph could definitely be used in arguments, since there was data to back it up, and I was pleased by the result. I really liked the fact, too, that the graphs were pretty clear cut. Voyant Tools showed different ways for people to view the usage of words, and I felt that it was a good way of allowing people to see how the words were used in “The Little Review,” and that this could be explored further from there.

Death

 

I have been having trouble with this program in several ways, but one of the things that piques my interest is that Death happens to be linked with just about everything. It is the biggest piece in "The Little Review". With the readings in this journal it shouldn't come as much of a surprise that death has a big hand, but the fact that it touches a small piece of everything in the magazine is interesting.

The other side of that is when you filter religion, death goes away. Death is linked to both Greatness and Aesthetics, but when you filter aesthetics or greatness one or the other disappears. Death appears to be the link between many different pieces, but just because it is the main link does not mean those other individual pieces are linked to each other through death. Death holds everything together and without it the majority of the pieces fall away. 

Gephi: The Little Review

 

My first objective was to group the dots similar in size together.  I discovered the vast difference in size between similarly related dots.  Death and decay (located in the bottom right corner), for example, are two that differ in size the most, but seem to be closely related otherwise.  Further investigating this oddity, I isolated the themes directly linked with death:

 

This produced even more surprising results.  Death has a stronger connection with greatness than with war, decay, senility, aging, or dissolution, which are all minor blips on the graph.  But, even this connection with greatness is somewhat inconsistent, as mediocrity takes second place. 

I felt the need to include religion, the lone strayer, in this model of death.  The theme seems appropriate with the topic of death, as even praise and memorial might suggest (but not necessarily require).  However, strangely enough, religion lacks a connection with any and all of these themes.

On the weakness of quantitative methods for qualitative data

I've had a lot of trouble figuring out how to go about acquiring usable information from Gephi—while I have traditionally done well in English and decently in math, the areas of qualitative and quantitative study are extremely segregated in my mind.  As a result, I spent a good deal of time fiddling with settings, messing things up, reopening the file, and trying again.  When it came down to it, changing the settings didn't reveal anything new for me.

What I ended up doing was to consider Moretti's analysis—removing points of data and looking at their effect on the overall network—and see where I could go with that.  What I did was a series of removals with a particular sequence of steps.

1) Screenshot.

2) Delete the most heavily linked pieces of data—those that show up large and red.

3) Screenshot.

4) Reapply the node size and colour parameters so that the next largest pieces of data move up to attention.

5) Screenshot.

6) Repeat steps 2-5 until no points remain.
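The removal steps above can be sketched outside Gephi as a small loop. This is only an illustration under my own assumptions: the node names and links are invented, "most heavily linked" is taken to mean highest degree (number of connections), and the function names are hypothetical rather than anything Gephi provides.

```python
def build_graph(edges):
    """Turn a list of undirected links into a node -> set-of-neighbours map."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def remove_nodes(graph, doomed):
    """Return a copy of the graph without the doomed nodes or their links."""
    return {
        node: {n for n in neighbours if n not in doomed}
        for node, neighbours in graph.items()
        if node not in doomed
    }

def peel(graph):
    """Repeatedly delete the most-linked nodes until none remain.

    Returns the nodes removed in each round and the round after which
    some remaining node first had no links left (None if that never happens).
    """
    rounds = []
    first_isolation = None
    while graph:
        top = max(len(neighbours) for neighbours in graph.values())
        hubs = sorted(n for n, neighbours in graph.items() if len(neighbours) == top)
        rounds.append(hubs)
        graph = remove_nodes(graph, set(hubs))
        if first_isolation is None and any(not nb for nb in graph.values()):
            first_isolation = len(rounds)
    return rounds, first_isolation

# Hypothetical theme and work nodes, invented for illustration.
edges = [
    ("death", "art"), ("death", "greatness"), ("death", "war"),
    ("death", "poem_a"), ("art", "greatness"), ("art", "poem_b"),
    ("war", "poem_a"),
]
rounds, first_isolation = peel(build_graph(edges))
print(rounds, first_isolation)
```

In this toy network the first rounds remove one hub at a time and the later rounds remove several equally linked nodes at once, and a node only becomes isolated after the second round.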

I have put together a gif to provide an overview of the process.  (It seemed a bit saner than trying to upload 16 similar screenshots…)



Some observations… In the earlier stages of this process, it was easy to identify a few nodes at a time to remove: themes and genres early on, eventually authors, then finally particular works.  As I went on, the number of nodes to be removed became larger and larger, indicating a growing uniformity in the "value" of those nodes.  By the time the values were that unified, the connections were almost the minimum necessary to keep the remaining points all part of the network.

I found it interesting that, going about in this manner, it took until the 6th iteration to isolate a node from the rest of the network.  The entire web was so interconnected that it took a lot of removal to isolate anything.   This is one of the primary reasons I find Gephi a difficult medium to extract information from—the complex interconnectedness of the points, growing out of a subjective process of tagging, does not lend itself easily to identifying structural centers.  Using the language of Moretti's article, the central points stand out quite easily—death, art, greatness, etc.—but the structural centers are much more elusive.

I think as far as quantitative analysis goes, I would be far more interested in looking at the frequency of particular words in texts or something along those lines, as this would be far more objective and would reveal something that hadn't already been revealed by the process of labeling and sorting, which is the case with the thematic tagging we have here.
