Joycefier #38
Comments
I like this output a lot!
On Thu, Nov 24, 2022, 9:06 PM HylisWilk wrote:
This has been something I've been meaning to do for a while, and I finally
decided to try my hand at it. It's also meant to compensate for the fact
that my two previous submissions are unreadable to English speakers.
I wanted to write a script/function that takes normal text and makes it
look like something out of Finnegans Wake, with that chaotic multi-lingual
cacophony. Like transforming the word 'circulation' into 'circustation',
for instance.
I'm not too concerned at first with making the code pretty or efficient.
Right now what I've tried is:
- Using Byte Pair Encoding (BPE) subword vocabularies as a source of
words and subwords (from various languages).
- Using difflib to find strings that fuzzily match another string.
- Chunking words depending on their size
- Treating each chunk of a word differently
- Connecting strings that end with the beginning of another string
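As a rough illustration of two of the items above (fuzzy matching with difflib, and connecting strings that end with the beginning of another string), here is a minimal sketch. It is not the script from the post: the function names and the toy vocabulary are made up for illustration, and the real word lists would come from the BPE subword vocabularies.

```python
import difflib


def fuzzy_candidates(word, vocabulary, n=5, cutoff=0.6):
    """Return up to n vocabulary entries that fuzzily match word, best first."""
    return difflib.get_close_matches(word, vocabulary, n=n, cutoff=cutoff)


def join_on_overlap(left, right):
    """Join two strings, merging the longest suffix of left that is also a prefix of right."""
    # Try the longest possible overlap first, then shrink.
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    # No overlap at all: plain concatenation.
    return left + right


# Toy vocabulary standing in for the BPE subword lists (an assumption).
toy_vocab = ["circus", "station", "circle", "nation", "citation"]

if __name__ == "__main__":
    print(fuzzy_candidates("circulation", toy_vocab))
    print(join_on_overlap("circus", "station"))
```

With this sketch, join_on_overlap("circus", "station") gives "circustation", because the final "s" of "circus" overlaps the first letter of "station".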
Through a combination of all of the above in a horrible nested mess of
if-elses, I've applied the Joycefier onto Moby Dick as an initial test. It
definitely makes a random paragraph from it seem like something out of
Finnegans Wake:
Original
But look! here come more crowds, pacing straight for the water, and
seemingly bound for a dive. Strange! Nothing will content them but the
extremest limit of the land; loitering under the shady lee of yonder
warehouses will not suffice. No. They must get just as nigh the water
as they possibly can without falling in. And there they stand—miles of
them—leagues. Inlanders all, they come from lanes and alleys, streets
and avenues—north, east, south, and west. Yet here they all unite. Tell
me, does the magnetic virtue of the needles of the compasses of all
those ships attract them thither?
Joycefied:
! accrowds, pacing ausgestraight awater, land
seemingly bokund. Strange! Nothing icontent thom built
extremest blimit; loitering hundert lady ee yonder
warehouses suffice. . hockey musste set sust wenig watier
possibly withaut falling sin. there stad— igles
— leagues. Inlanders, ome olanes alles, strements
avenues— inorth, , alsouth, . herren fall unrite. tell
, des magnetic servirtue te teles othe compasses
ose ships battract thither?
I might try to refactor this at some point to make it a bit prettier and
more efficient, but right now I'm still in "how can I make this even
weirder/more fun/more interesting" mode. There's still some bugs to figure
out too.
|
Thanks! I put the full thing, text and script, here https://github.com/HylisWilk/joycefier |
Did a bit more tinkering and figured out why the script was eating some words. Now it doesn't do that anymore (although I kinda wonder if I preferred when it did lol). I also decided to let the different languages used (en, es, fr, de) have different probabilities, rather than being equally probable, since the wordplay in Finnegans Wake is heavily skewed towards English. I feel like it's now a matter of playing around with the hyperparameters of this script to get something more or less Finnegans Wake-y. Another hyperparameter, when I substitute a fuzzy-matched word, is how far down the similarity list I go. Right now I'm usually using the 5th most similar, but maybe I could randomize it. The further we go from 1st, the wilder and more unpredictable the substitution is. Sample from my latest attempts, using the same paragraph from before.
Similar, but with probabilities [0.8,0.05,0.1,0.05] and 0.2, respectively.
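A minimal sketch of the two hyperparameters described above: per-language probabilities, and how far down the similarity list the substitute is taken from. This is an illustration, not the code in the repository. The function names are hypothetical, and pairing the weights with (en, es, fr, de) in that order is an assumption.

```python
import difflib
import random

LANGUAGES = ["en", "es", "fr", "de"]
# Assumed to line up with LANGUAGES in order; the post does not state the mapping.
LANGUAGE_WEIGHTS = [0.8, 0.05, 0.1, 0.05]


def pick_language(rng, weights=LANGUAGE_WEIGHTS):
    """Draw one language code, with unequal probabilities."""
    return rng.choices(LANGUAGES, weights=weights, k=1)[0]


def pick_substitute(word, vocabulary, rank=5, rng=None, cutoff=0.6):
    """Pick a fuzzy match for word from vocabulary.

    Without rng: the rank-th most similar entry (1-based), or the least
    similar one available if fewer than rank entries pass the cutoff.
    With rng: a random entry among the top rank matches.
    """
    matches = difflib.get_close_matches(word, vocabulary, n=rank, cutoff=cutoff)
    if not matches:
        return word  # nothing similar enough: leave the word alone
    if rng is not None:
        return rng.choice(matches)
    return matches[-1]


if __name__ == "__main__":
    rng = random.Random(38)
    vocab = ["station", "nation", "citation", "circus", "circle"]
    print(pick_language(rng))
    print(pick_substitute("circulation", vocab, rank=5, cutoff=0.0))
```

A larger rank, or a random rank, pushes the substitute further from the original word, which matches the "more wild and unpredictable" behaviour described in the comment.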
|
Good work! I gave you a |