Probabilistic Detection of Character Voices in Fiction

In James Joyce's novel Ulysses, the school headmaster Mr. Deasy quotes Shakespeare in a lecture on financial responsibility to his employee Stephen Dedalus. “[W]hat does Shakespeare say?” he asks, “Put but money in thy purse” (Joyce 1986, 25). As Stephen...

A Generator of Socratic Dialogues

In the influential 1948 paper “A Mathematical Theory of Communication,” the mathematician Claude Shannon conducts a thought experiment to construct an algorithmic approximation of language. The algorithm can be described like this: Choose a book at random from your bookshelf,...

Chapterize: a Tool for Automatically Splitting Electronic Texts into Chapters

If you do computational analyses of books and need to break a book’s text file into its constituent chapters, I’ve just released a tool that you might find useful. It’s called chapterize, and it breaks a book into chapters....

A Macro-Etymological Analysis of Milton’s Paradise Lost

One of Milton’s terms for the expansive, empty gulf separating the Earth from Hell is the “abyss.” The word appears eighteen times in Paradise Lost, and in seven of the poem’s twelve books. It is variously described as...

Macroetym: a Command-Line Tool for Macro-Etymological Textual Analysis

I'm proud to introduce macroetym, a command-line tool for macro-etymological textual analysis, which is now available for download with the Python package manager, pip. It's a complete rewrite of The Macro-Etymological Analyzer, the web tool for macro-etymological analysis I wrote...