My NaNoWriMo Stats

Post created 2013-12-21 15:12 by Gabe Koss.

On a total whim, I decided to participate in National Novel Writing Month (NaNoWriMo), a month-long writing marathon in which participants attempt to write a 50,000-word novel during November. I cheated a little and started on October 26.

Total Words: 54173
October Words: 2917
November Words: 51256
Avg Words/Day (Nov): 1709
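
The figures above are easy to cross-check. Here is a minimal Ruby sketch, assuming the daily average divides the November count by the 30 days of November and rounds to the nearest word. The helper name `average_per_day` is made up for this illustration.

```ruby
# Average words per day, rounded to the nearest whole word.
# Assumption: the divisor is the number of calendar days in the month.
def average_per_day(words, days)
  (words.to_f / days).round
end

october_words  = 2917    # head start from October 26 onward
november_words = 51_256

puts "Total words: #{october_words + november_words}"
puts "Avg words/day (Nov): #{average_per_day(november_words, 30)}"
```

Run as is, this prints 54173 and 1709, matching the numbers reported above.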

Progress Over Time

The vertical axis shows the story's word count as it grew. Each bar marks the total word count reached on that day, and hovering over a bar shows the exact figure for that date. The light line traces the word count at each substantial save.
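
Both series in the chart can come from one list of save points. The sketch below is an assumption about how that reduction might look in Ruby, not the original script. `snapshots` is a hypothetical array of `[time, word_count]` pairs (for example, one per Git commit), and the sample values are invented purely for illustration. It also assumes that a day's bar is the count at the last save of that day.

```ruby
require "date"
require "time"

# Reduce save points to one value per day: the word count at the
# last save made on each date. Input: array of [Time, Integer] pairs.
def daily_totals(snapshots)
  snapshots
    .group_by { |time, _count| time.to_date }
    .map { |date, saves| [date, saves.max_by { |time, _count| time }.last] }
    .sort_by { |date, _count| date }
end

# Invented sample data, not the real history of the story.
snapshots = [
  [Time.parse("2013-11-01 09:00"), 3100],
  [Time.parse("2013-11-01 22:30"), 4650],
  [Time.parse("2013-11-02 21:15"), 6200],
]

daily_totals(snapshots).each { |date, count| puts "#{date}  #{count}" }
```

The unreduced `snapshots` list would feed the light line, and the reduced list would feed the bars.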

Common Words

After excluding common English stop words such as "that" or "is", the 10 most common words in my story were:

sage: 631 instances
out: 315 instances
rama: 249 instances
back: 184 instances
one: 165 instances
down: 144 instances
looked: 139 instances
here: 138 instances
more: 125 instances
know: 124 instances

Common Bigrams

Bigrams are two-word units such as "depraved heathen" or "kind soul". The most common two-word groupings were:

of the: 390 instances
in the: 222 instances
to the: 188 instances
on the: 163 instances
into the: 151 instances
she had: 107 instances
was a: 104 instances
from the: 92 instances
out of: 91 instances
she was: 90 instances

Code Snippets

I wrote the story with Vim and tracked my progress with Git. I did the analysis on this data using a combination of Ruby, D3.js and the Linux command line. Much of my data analysis was inspired by the classic Unix for Poets.

Here are some of the tools I used to do this analysis.

Extract top 10 words:

tr -sc '[A-Z][a-z]' '[\012*]' < story.md | tr '[A-Z]' '[a-z]' | sort | grep -E -v '^.{,2}$' | grep -E -v -f ../stop_words.grep | uniq -c | sort -n | tail -n 10
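
The same steps can be sketched in Ruby for readers less at home with shell pipelines. This is an illustrative equivalent, not the script behind the numbers above. The method name `top_words` and the tiny inline stop list are assumptions, standing in for the patterns in `stop_words.grep`, which are matched here as exact words.

```ruby
# Count the most frequent words in a text, following the pipeline's steps.
def top_words(text, stop_words, limit = 10)
  counts = Hash.new(0)
  text
    .scan(/[A-Za-z]+/)                       # split on anything that is not a letter
    .map(&:downcase)                         # fold case
    .reject { |w| w.length <= 2 }            # drop words of two letters or fewer
    .reject { |w| stop_words.include?(w) }   # drop stop words (exact match)
    .each { |w| counts[w] += 1 }
  # Most frequent first; ties broken alphabetically.
  counts.sort_by { |word, count| [-count, word] }.first(limit)
end

# Invented sample text and stop list, for illustration only.
sample = "Sage looked out. Sage looked back at the hill, and the hill looked back."
top_words(sample, %w[the and], 3).each { |word, count| puts "#{word} #{count}" }
```

Unlike the shell version, this lists the most frequent word first instead of last.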

Extract top 10 bigrams:

# One word per line
tr -sc '[A-Z][a-z]' '[\012*]' < story.md > nano.words
# Drop the first line so each word lines up with the word that follows it
tail -n +2 nano.words > nano.next
paste nano.words nano.next | sort | uniq -c | sort -n > nano.bigrams
tail -n 10 nano.bigrams
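
The idea behind these commands is to pair every word with the one that follows it and then count the pairs. Here is a rough Ruby sketch of that idea, offered as an illustration and not as the original tooling. The name `top_bigrams` is made up, and like the commands above it does not fold case.

```ruby
# Count adjacent word pairs (bigrams) in a text.
def top_bigrams(text, limit = 10)
  words = text.scan(/[A-Za-z]+/)   # split on anything that is not a letter
  counts = Hash.new(0)
  # each_cons(2) yields every word together with its successor.
  words.each_cons(2) { |left, right| counts["#{left} #{right}"] += 1 }
  # Most frequent first; ties broken alphabetically.
  counts.sort_by { |bigram, count| [-count, bigram] }.first(limit)
end

# Invented sample sentence, for illustration only.
sample = "she was in the house and she was on the hill"
top_bigrams(sample, 3).each { |bigram, count| puts "#{bigram} #{count}" }
```

`each_cons(2)` plays the role of pasting the word list next to a copy of itself shifted by one line.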