
Computational Literary Analysis
This article was first published on Atabey Kaygun's personal blog, and kindly contributed to

Description of the problem

One of Kurt Vonnegut’s brilliant ideas was that one can track the plot of a literary work using graphical methods. He did this intuitively. Today’s question is: can we track the plot changes in a text using computational or algorithmic methods?

Overlaps between units of text

The basic idea is to split a text into fundamental units (sentences or paragraphs, depending on the document) and then convert each unit into a hash table whose keys are the stemmed words in the unit and whose values are the number of times each stemmed word appears in that unit.
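To make the conversion concrete, here is a minimal sketch in Python, taking sentences as the unit for illustration. Everything named here is an assumption of mine rather than part of the original experiment: `split_sentences` is a naive punctuation-based splitter, and `crude_stem` is a toy suffix stripper standing in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strip a few common suffixes,
    # but only if at least three characters remain.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def unit_to_counts(unit):
    # Hash table for one unit: stemmed word -> number of occurrences.
    words = re.findall(r"[a-z']+", unit.lower())
    return Counter(crude_stem(w) for w in words)

text = "The dog barked. The dogs were barking at the cat!"
tables = [unit_to_counts(s) for s in split_sentences(text)]
for table in tables:
    print(dict(table))
```

Here "barked" and "barking" both land on the key `bark`, and "dogs" lands on `dog`, so the two sentences end up sharing keys even though their surface forms differ. That is the reason for stemming before counting.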

My hypothesis, which I will test in the experiment below, is that the amount of overlap (the number of common words) between two consecutive units tells us how the plot is advancing.
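One way to read "the number of common words" is as the number of distinct stemmed words that appear in both units. The sketch below, in Python, assumes that reading; the function names and the hand-written sample tables are hypothetical and only illustrate the computation.

```python
from collections import Counter

def overlap(a, b):
    # Assumed measure: the number of distinct keys present in both tables.
    return len(a.keys() & b.keys())

def consecutive_overlaps(tables):
    # Overlap of each unit with the unit that follows it.
    return [overlap(a, b) for a, b in zip(tables, tables[1:])]

# Hand-written word-count tables for three consecutive units.
tables = [
    Counter({"the": 1, "dog": 1, "bark": 1}),
    Counter({"the": 2, "dog": 1, "cat": 1}),
    Counter({"rain": 1, "fell": 1}),
]
overlaps = consecutive_overlaps(tables)
print(overlaps)
```

A text with n units yields n - 1 overlap values. If repeated words should weigh more, `sum((a & b).values())` gives the multiset intersection instead, which also takes the counts into account.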

Below, I will take the sentence as the fundamental unit.