abandoned idea: base80 asciilification #2

Open: wants to merge 1 commit into base: master

@buhman buhman commented May 26, 2020

This implements a base80 symbol set using a 41/32 encoding (41 output characters per 32 input bytes).

After implementing this, I realized my original math was incorrect:

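256^32 / 80^41 ≈ 0.1089, which means only 10.89% of the possible 41-character base80 values are used for a given 32-byte input. I did more math and came up with this table for all idealized baseN encoding schemes between 64 and 85 symbols, for all <=256-bit inputs. Each row gives the symbol count, the ratio of output characters to input bytes, and the percentage of the output space that is actually used: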

symbols: 85 5/4=1.250 96.798%
symbols: 83 39/31=1.258 64.773%
symbols: 82 34/27=1.259 89.703%
symbols: 82 29/23=1.261 77.431%
symbols: 81 24/19=1.263 89.726%
symbols: 80 19/15=1.267 92.234%
symbols: 79 33/26=1.269 98.298%
symbols: 79 14/11=1.273 83.919%
symbols: 78 37/29=1.276 67.836%
symbols: 77 23/18=1.278 90.998%
symbols: 77 32/25=1.280 68.912%
symbols: 76 41/32=1.281 89.191%
symbols: 75 9/7=1.286 95.968%
symbols: 74 40/31=1.290 76.943%
symbols: 74 31/24=1.292 71.052%
symbols: 73 22/17=1.294 88.507%
symbols: 73 35/27=1.296 64.000%
symbols: 72 13/10=1.300 86.512%
symbols: 71 30/23=1.304 71.083%
symbols: 70 17/13=1.308 87.187%
symbols: 69 38/29=1.310 91.768%
symbols: 69 21/16=1.312 82.415%
symbols: 68 25/19=1.316 87.869%
symbols: 68 29/22=1.318 68.948%
symbols: 67 33/25=1.320 88.213%
symbols: 67 37/28=1.321 73.443%
symbols: 67 41/31=1.323 61.147%
symbols: 64 4/3=1.333 100.000%
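
The percentage column can be reproduced with exact integer arithmetic. The following is a minimal sketch, not the script that generated the table; the helper name `utilization` is an assumption made for illustration, and the rule that selected which rows appear in the table is not stated above, so only a few known rows are checked here.

```python
from fractions import Fraction


def utilization(symbols: int, chars: int, nbytes: int) -> Fraction:
    """Fraction of the symbols**chars output space that nbytes of input can reach.

    Hypothetical helper: 256**nbytes possible inputs divided by
    symbols**chars possible outputs.
    """
    return Fraction(256 ** nbytes, symbols ** chars)


# Print a few schemes in the same layout as the table rows.
for symbols, chars, nbytes in [(80, 41, 32), (85, 5, 4), (75, 9, 7), (64, 4, 3)]:
    used = float(utilization(symbols, chars, nbytes))
    print(f"symbols: {symbols} {chars}/{nbytes}={chars / nbytes:.3f} {used:.3%}")
```

The first tuple is the 41/32 base80 scheme from this PR; the others correspond to the base85, base75 and base64 rows.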

base75 seems like a clear winner:

  • 9/7 is implementable with 64-bit integer math
  • a base75 value of length 6 has about 2.59x as many possible values as a base64 value of length 6 ((75/64)^6 ≈ 2.59)
  • has at least one symbol set plausibly usable in a URL, unlike base85
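
To illustrate the 64-bit point: a 7-byte block is a 56-bit integer, and the largest 9-digit base75 value is still below 2^64, so one block can be converted with plain division and remainder. This is a rough sketch, not the code in this PR. The `ALPHABET` below is a hypothetical 75-symbol set chosen only for the example, and `encode_block` / `decode_block` are assumed names.

```python
import string

# Hypothetical alphabet (not from this PR): 62 alphanumerics + 13 punctuation marks.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_.~!*'()$+,;"
assert len(ALPHABET) == 75 and len(set(ALPHABET)) == 75

# Every 9-digit base75 value fits in an unsigned 64-bit integer.
assert 75 ** 9 - 1 < 2 ** 64


def encode_block(block: bytes) -> str:
    """Encode exactly 7 bytes as 9 base75 digits, most significant digit first."""
    if len(block) != 7:
        raise ValueError("expected a 7-byte block")
    value = int.from_bytes(block, "big")  # always < 2**56
    digits = []
    for _ in range(9):
        value, rem = divmod(value, 75)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_block(text: str) -> bytes:
    """Decode 9 base75 digits back into 7 bytes, rejecting unused code points."""
    if len(text) != 9:
        raise ValueError("expected 9 characters")
    value = 0
    for ch in text:
        value = value * 75 + ALPHABET.index(ch)  # ValueError on unknown symbols
    if value >= 2 ** 56:
        raise ValueError("value does not correspond to any 7-byte block")
    return value.to_bytes(7, "big")
```

Python integers are arbitrary precision, so the second assertion is what stands in for the 64-bit bound here; in a fixed-width language the same loop would work on a `uint64`. Padding of a trailing partial block is deliberately left out of the sketch.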

Note that base66 is not present in this table: despite the 2 additional symbols, it cannot encode any more input than base64 does in a 4/3 encoding scheme.
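
One way to read the base66 remark is that, for inputs of up to 32 bytes, no base66 scheme achieves a better characters-per-byte ratio than 4/3. The sketch below checks that reading with exact integer arithmetic; `min_chars` and `best_ratio` are hypothetical helper names, and the 32-byte limit is taken from the table's <=256-bit bound.

```python
from fractions import Fraction


def min_chars(symbols: int, nbytes: int) -> int:
    """Smallest output length whose value space covers every nbytes-byte input."""
    chars = 1
    while symbols ** chars < 256 ** nbytes:
        chars += 1
    return chars


def best_ratio(symbols: int, max_bytes: int = 32) -> Fraction:
    """Lowest characters-per-byte ratio reachable with inputs of 1..max_bytes bytes."""
    return min(Fraction(min_chars(symbols, n), n) for n in range(1, max_bytes + 1))


for symbols in (64, 66, 67):
    print(symbols, best_ratio(symbols))
```

Under these assumptions base64 and base66 both bottom out at 4/3, while base67 reaches 33/25, which matches the best base67 row in the table.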

buhman commented May 27, 2020

4b0512a
