Frequency transforms of text #132

danuep · 2017-12-01T07:45:13Z

I didn't even get the idea until a couple of days ago, and mostly I'm hoping I can get this uploaded before midnight...

Reading @aparrish at #23 talk about hoping to get a meaningful average novel got me thinking about the scales of variation in play, which led to wavelet transforms, which led to

Haar of Darkness

which is unfortunately 2000 words short of the limit, so in honor of a brilliant woman of letters and a brilliant woman of numbers:

The Wavelets, a Daubechies transform of The Waves, by Virginia Woolf

[edit: now with correct link to The Wavelets]

danuep · 2017-12-01T14:32:23Z

(now that I've slept)

I'm grateful to @aparrish for sharing her word vectors generated from Project Gutenberg. I wouldn't have had the time to pull this together without that resource. If I had more time, I'd go back and be more content-aware about tokenizing the source texts -- I split on spaces and at each non-letter character, and the vector file contains entries for tokens like '--' and contractions. Entertainingly enough, The Waves isn't in Project Gutenberg, and so my lookup error log was a nice list of words that Woolf coined in that book. For those, I greedily matched valid sub-words starting from the beginning of the word.
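For readers who want the tokenizing and fallback steps spelled out, here is a minimal sketch. It is not taken from the scripts: the function names `tokenize` and `greedy_subwords` are hypothetical, it assumes non-letter characters are dropped rather than kept as tokens, and it assumes that a character with no matching sub-word is simply skipped.

```python
import re


def tokenize(text):
    # Assumption: every run of non-letter characters (spaces, dashes,
    # apostrophes, digits) is a separator and is discarded.
    return [t for t in re.split(r"[^A-Za-z]+", text) if t]


def greedy_subwords(word, vocab):
    """Split an out-of-vocabulary word into known pieces, left to right.

    At each position the longest prefix found in `vocab` wins.
    Assumption: if nothing in `vocab` starts at a position, that one
    character is skipped and matching resumes at the next one.
    """
    pieces = []
    start = 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            start += 1  # no known sub-word begins here
    return pieces


if __name__ == "__main__":
    vocab = {"sea", "holly", "hol", "a"}  # toy vocabulary, not the real vector file
    print(tokenize("The waves--broke; don't"))
    print(greedy_subwords("seaholly", vocab))
```

Note that this kind of splitting breaks "don't" into "don" and "t", which is the mismatch with the vector file's contraction entries mentioned above.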

I used JWave for the Haar and Daubechies transforms, and Annoy for the nearest-neighbor matching.
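To give a feel for the two building blocks without JWave or Annoy, here is a rough pure-Python sketch. It is an illustration only, not the code in the scripts: `haar_forward`, `haar_inverse`, and `nearest_word` are hypothetical names, the Haar transform is the standard orthonormal one on a power-of-two-length list, and the nearest-neighbor lookup is an exact brute-force search rather than Annoy's approximate index. How the scripts actually chain the transform and the lookup together is not stated in this thread; one plausible reading is that each dimension of the word-vector sequence is transformed separately.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(values):
    """Full orthonormal Haar decomposition of a power-of-two-length list."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = list(values)
    coeffs = []
    while len(current) > 1:
        averages = [(current[i] + current[i + 1]) / SQRT2
                    for i in range(0, len(current), 2)]
        details = [(current[i] - current[i + 1]) / SQRT2
                   for i in range(0, len(current), 2)]
        coeffs = details + coeffs  # coarser details go in front
        current = averages
    return current + coeffs


def haar_inverse(coeffs):
    """Invert haar_forward."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        pos += len(current)
        rebuilt = []
        for a, d in zip(current, details):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        current = rebuilt
    return current


def nearest_word(vector, embeddings):
    # Brute-force squared Euclidean distance; a stand-in for an Annoy query.
    return min(
        embeddings,
        key=lambda w: sum((a - b) ** 2 for a, b in zip(embeddings[w], vector)),
    )


if __name__ == "__main__":
    signal = [4.0, 2.0, 6.0, 8.0]
    coeffs = haar_forward(signal)
    print(coeffs)
    print(haar_inverse(coeffs))
    toy = {"wave": [1.0, 0.0], "shore": [0.0, 1.0]}  # made-up vectors
    print(nearest_word([0.9, 0.2], toy))
```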

hugovk · 2017-12-01T14:34:14Z

🎈

Is the source available somewhere?

danuep · 2017-12-01T14:56:02Z

I'll put it up later today--was mostly rushing to meet the deadline (which I now see was UTC, not local, so oh well).

danuep · 2017-12-01T22:08:48Z

Scripts are up at https://github.com/danuep/nanogenmo2017

hugovk added the completed label Dec 1, 2017

cpressey mentioned this issue Oct 18, 2018

Language survey 2017 #135

Open
