Green and Yellow in The Egoist and The Crisis

Megan Grier and Katie Boul


We decided to look at the words yellow and green (our two favorite colors) in The Crisis and The Egoist.

We were surprised to find that yellow was a frequently used word in The Crisis. When we looked at the KWIC panel, it became apparent that yellow was referring to the Japanese race. Given that The Crisis was a magazine aimed at the advancement of African Americans, and therefore dealt mainly with issues of racism and prejudice, it made sense that yellow would appear so often. Surprisingly, green was used even more frequently than yellow, but on further investigation it became clear that most of the uses were last names, which skewed the data significantly.

The use of the words yellow and green follows a similar trend in The Egoist as in The Crisis. Green occurs most often (12 times) in the February 1914 issue, and the primary user of the word is James Joyce, in his first installment of A Portrait of the Artist as a Young Man. In this volume, at least, green is never used as a last name, which sets it apart from The Crisis. Yellow is used a bit less, peaking at 8 times in the December 1919 issue. Interestingly, James Joyce is again the author who accounts for the majority of the appearances of yellow in the magazine. The Joyce piece in this issue is Episode X of Ulysses.

One other note on the word yellow in the December 1919 issue: Voyant reports that yellow is used 8 times, but searching for the word in a PDF version of the document turns up 9 occurrences. This may be an instance of the “dirty” data that Dr. Drouin referenced.
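One way to probe a mismatch like this is to compare two ways of counting. The sketch below is only an illustration under our own assumptions: it is not how Voyant or any PDF viewer actually counts, and the sample sentence and function names are made up. It shows that a plain substring search (which also matches words such as "yellowish") can return a higher number than a whole-word count, which is one possible source of an off-by-one difference alongside dirty data.

```python
import re

def substring_count(text, word):
    """Count every place the letters of `word` appear, even inside longer words."""
    return text.lower().count(word.lower())

def whole_word_count(text, word):
    """Count only occurrences of `word` that stand alone as a full word."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Made-up sample text, not taken from The Egoist.
sample = "A yellow dress, a yellowish light, and yellow gloves."

print(substring_count(sample, "yellow"))   # counts "yellowish" as well
print(whole_word_count(sample, "yellow"))  # ignores "yellowish"
```

Running both counts over the text of an issue and comparing them would show whether this kind of difference is in play, or whether the gap comes from the OCR itself.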

With regard to the entire corpus, green was still a frequently used term, but again largely because it is a last name, so the count is not a true representation of the term in the whole corpus. Yellow was used fairly frequently as well (though not as often as green). This probably has something to do with the fact that The Crisis, in which yellow was a frequently used term, accounts for a huge chunk of the corpus.

One question that came out of this is whether there is some way to filter out certain uses of a word. If we could have filtered out all uses of green as a last name, we would have gotten a better representation of the sense of the term we were interested in. We’re not sure whether that is already possible within Voyant, but it would be really neat to be able to filter out certain uses of a word, and we think it would give a truer representation when you’re looking for one specific sense of a term.
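A rough version of such a filter can be sketched outside Voyant. The Python below is a minimal sketch under our own assumptions, not a Voyant feature: it treats a capitalized match as a likely surname when it follows a title such as "Mr." or when it appears in the middle of a sentence, and counts everything else as the ordinary word. The function name, the list of titles, and the sample sentence are all hypothetical, and the heuristic will misjudge some cases (for example, a surname that opens a sentence).

```python
import re

# Hypothetical list of titles that usually precede a surname.
HONORIFICS = ("Mr.", "Mrs.", "Ms.", "Miss", "Dr.", "Prof.")

def count_word_uses(text, word):
    """Return (common_uses, likely_surname_uses) for `word` in `text`."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    common, proper = 0, 0
    for match in pattern.finditer(text):
        token = match.group(0)
        before = text[:match.start()].rstrip()
        after_title = before.endswith(HONORIFICS)
        starts_sentence = before == "" or before[-1] in ".!?"
        # Capitalized after a title, or capitalized mid-sentence: probably a name.
        if token[0].isupper() and (after_title or not starts_sentence):
            proper += 1
        else:
            common += 1
    return common, proper

# Made-up sample text, not taken from The Crisis.
sample = ("Mr. Green spoke first. Green leaves covered the green lawn, "
          "and Mrs. Green agreed.")

common, proper = count_word_uses(sample, "green")
print("ordinary uses:", common, "| likely surnames:", proper)
```

A capitalization rule like this is crude, and dedicated named-entity recognition tools would likely do better, but it shows the general idea of separating the color from the surname before counting.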




Remember to embed your KWIC or Word Trends widgets in your post! It's part of the exercise.

Good observations on the use of words in context, especially since terms that are common names can have such an impact on visualization graphs. As you note, there needs to be a way to distinguish proper names from common terms automatically, so let's make sure to broach the subject of automated name extraction.

I agree that a filter for certain uses of words would make the word search tool more helpful and more accurate, especially the ability to filter out proper nouns. On the flip side, it would be nice to be able to search for full names. This would be possible if the tool added the ability to search the corpus for phrases.