American writers of the 20th century
Corpus description
This corpus covers British and American literature from the second half of the 19th century and the first half of the 20th century. It includes 53 books: 23 by British writers (Joseph Conrad, Thomas Hardy, James Joyce, Rudyard Kipling, George MacDonald, George Orwell and Virginia Woolf) and 24 by American writers (George Washington Cable, William Faulkner, F. Scott Fitzgerald, Ernest Hemingway, William Dean Howells, Louisa May Alcott and Mark Twain). The remaining 6 books are by Henry James, who is an interesting case for my analysis because he is considered an Anglo-American writer. I chose this corpus because I wanted to compare British and American literature of this period and see what differences or similarities emerge. I chose these particular texts because they are easy to find on the Project Gutenberg website and are as well known as their authors, which helps when describing the results.
The authors selected for the project are well known, and they lived and wrote at around the same time. I therefore expect to see strong similarities, especially among the writers from the turn of the century and among those who wrote only in the 20th century. Moreover, I assume that the style of the modernist writers (Virginia Woolf, Joseph Conrad, Ernest Hemingway and William Faulkner) and of the realist writers (George Washington Cable and William Dean Howells) will show up as networks of stronger similarities. I expect to see clear differences between American and British writers in one of my analyses. However, I cannot predict whether James’s texts will show more American or more British tendencies. Additionally, I expect to notice some similarities between Rudyard Kipling and George MacDonald, as both are associated with children’s books. I am curious about the tendencies of James Joyce’s texts because he is associated with the stream-of-consciousness method of writing and the modernist avant-garde.
All texts analysed in this project are listed in Table 1.
Table 1: British and American corpus
| Author | Title | Publication date |
| --- | --- | --- |
| George Washington Cable | Dr. Sevier | 1897 |
| | The Grandissimes | 1899 |
| | Kincaid’s Battery | 1908 |
| Joseph Conrad | Lord Jim | 1900 |
| | Nostromo | 1917 |
| | Under Western Eyes | 1911 |
| William Faulkner | As I Lay Dying | 1930 |
| | Collected Stories | 1950 |
| | The Unvanquished | 1938 |
| F. Scott Fitzgerald | Flappers and Philosophers | 1920 |
| | Tales of the Jazz Age | 1922 |
| | The Beautiful and Damned | |
| | This Side of Paradise | |
| Thomas Hardy | A Pair of Blue Eyes | |
| | Far from the Madding Crowd | |
| | The Hand of Ethelberta | 1895 |
| | The Return of the Native | 1912 |
| Ernest Hemingway | A Farewell to Arms | 1929 |
| | For Whom the Bell Tolls | 1940 |
| | The First Forty-Nine Stories | |
| | The Sun Also Rises | 1926 |
| William Dean Howells | A Hazard of New Fortunes | 1909 |
| | Indian Summer | 1886 |
| | Venetian Life | 1867 |
| Henry James | The Europeans | 1878 |
| | The Portrait of a Lady | 1881 |
| | The Turn of the Screw | 1898 |
| | Washington Square | 1907 |
| | What Maisie Knew | |
| James Joyce | A Portrait of the Artist as a Young Man | 1916 |
| | Dubliners | 1914 |
| | Ulysses | |
| Rudyard Kipling | Indian Tales | 1937 |
| | Kim | 1901 |
| | Plain Tales from the Hills | 1888 |
| George MacDonald | At the Back of the North Wind | 1871 |
| | Sir Gibbie | 1879 |
| | The Portent: A Story of the Inner Vision of the Highlanders | 1864 |
| Louisa May Alcott | An Old-Fashioned Girl | 1869 |
| | Jack and Jill | 1880 |
| | Rose in Bloom | 1876 |
| George Orwell | Animal Farm | 1945 |
| | Down and Out in Paris and London | 1933 |
| | Nineteen Eighty-Four | 1949 |
| Mark Twain | A Tramp Abroad | |
| | The Adventures of Huckleberry Finn | 1884 (UK and Canada), 1885 (USA) |
| | The Innocents Abroad | |
| | What Is Man? and Other Essays | 1906 |
| Virginia Woolf | Jacob’s Room | |
| | Monday or Tuesday | 1921 |
| | Night and Day | 1919 |
| | The Voyage Out | 1915 (UK), 1920 (US) |
Method
The analyses were conducted in R, a software environment commonly used for statistical computing and data analysis, which also produced the graphical representations of the data gathered in the corpus. In R I carried out a cluster analysis, a bootstrap consensus tree analysis and an oppose analysis.
The first of these, the cluster analysis, produced a vertical dendrogram in which I could investigate the relationships between the texts on the basis of the most frequent words used in the books. The bootstrap consensus tree analysis presented a network of the texts and marked their relationships by grouping more or less similar texts into “branches”, again on the basis of the most frequent words. The oppose analysis, by contrast, compared British and American writers directly and at the same time tested which group James’s texts belong to.
The last analysis was carried out in Gephi, a program for the interactive visualisation of networks and complex systems, including dynamic and hierarchical graphs. The data file was created in R and transferred to Gephi. The aim of this analysis was to display not only the nearest neighbours (as in the previous analyses) but also weaker connections, using colours and lines of different thickness.
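To make the phrase "on the basis of the most frequent words" concrete, here is a minimal sketch of the table and the pairwise distances that a cluster analysis of this kind could start from. It is written in Python for illustration only, since the project itself used R. The text does not say which distance measure was used, so the mean absolute difference of z-scored frequencies (the classic Burrows's Delta) is an assumption, and the function names and toy texts are hypothetical.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def mfw_table(texts, n_words):
    """Return the n most frequent words of the whole corpus and,
    for each text, the relative frequency of each of those words."""
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    total = Counter()
    for counter in counts.values():
        total.update(counter)
    vocab = [word for word, _ in total.most_common(n_words)]
    table = {}
    for name, counter in counts.items():
        size = sum(counter.values())
        table[name] = [counter[word] / size for word in vocab]
    return vocab, table


def delta_distances(table):
    """Pairwise distance between texts: mean absolute difference of
    z-scored word frequencies (assumed measure, see lead-in)."""
    names = sorted(table)
    n_words = len(next(iter(table.values())))
    z = {name: [] for name in names}
    for i in range(n_words):
        column = [table[name][i] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            z[name].append((table[name][i] - mu) / sigma if sigma else 0.0)
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for a in names
        for b in names
    }


# Toy stand-ins for full books; real input would be the Gutenberg texts.
toy_texts = {
    "A_one": "the sea and the ship and the storm",
    "A_two": "the ship and the sea and the wind",
    "B_one": "a house in a town in a street",
}
vocab, table = mfw_table(toy_texts, n_words=5)
distances = delta_distances(table)
```

In this toy corpus the two "A" texts share the same frequent words and end up closer to each other than either is to "B_one". A dendrogram is built by repeatedly merging the closest texts or groups in such a distance table.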
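The step of preparing a data file for Gephi can be sketched as follows. This is an illustrative Python version, not the R code used in the project: it assumes that each text is linked to its k nearest neighbours and that closer neighbours receive heavier weights, which is one way of showing both strong and weaker connections. The weighting scheme, the function names and the toy distances are all hypothetical. The only thing taken from Gephi itself is that its CSV import accepts an edge list with Source, Target and Weight columns.

```python
import csv
import io


def neighbour_edges(distances, names, k=3):
    """Link each text to its k nearest neighbours. The nearest neighbour
    adds weight k, the next k-1, and so on; weights for the same pair
    accumulate, so mutual neighbours get thicker lines."""
    edges = {}
    for source in names:
        others = sorted(
            (n for n in names if n != source),
            key=lambda n: distances[(source, n)],
        )
        for rank, target in enumerate(others[:k]):
            pair = tuple(sorted((source, target)))
            edges[pair] = edges.get(pair, 0) + (k - rank)
    return edges


def write_edge_csv(edges, handle):
    """Write an undirected edge list with Source, Target, Weight columns."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])


# Toy symmetric distances between three hypothetical texts.
toy = {("A", "B"): 0.2, ("A", "C"): 0.9, ("B", "C"): 0.7}
distances = {}
for (a, b), d in toy.items():
    distances[(a, b)] = distances[(b, a)] = d

names = ["A", "B", "C"]
edges = neighbour_edges(distances, names, k=2)

buffer = io.StringIO()
write_edge_csv(edges, buffer)
```

With real data, the buffer would be replaced by a file opened with `newline=""`, and the resulting CSV could be loaded into Gephi as an undirected edge table, with edge thickness mapped to the Weight column.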