Corrupted files can be generated when worksheet names of a certain length contain unicode #588

Open
aardvarkk opened this issue May 25, 2018 · 0 comments · May be fixed by #589
Labels
Done in caxlsx This has already been solved in the caxlsx fork.


@aardvarkk

Can be reproduced with the following example:

require 'axlsx'

Axlsx::Package.new do |p|
  p.workbook.add_worksheet(:name => "good")
  p.serialize('good.xlsx')
end

Axlsx::Package.new do |p|
  p.workbook.add_worksheet(:name => "\u{1F1EB 1F1F7}1234567890123456789012345678")
  p.serialize('bad.xlsx')
end

The file good.xlsx works fine. The file bad.xlsx displays an error upon opening in Microsoft Excel and asks to repair a corrupted file.

The root cause seems to be the 31-character worksheet name limit. When the worksheet name contains Unicode, the extra bytes of multibyte characters don't count toward that limit, because the check looks at the string's length in characters. So in this case, a 28- or 29-character "normal" string (after the Unicode portion) will not raise an error but will generate a corrupted file. I would recommend changing the too-long-name check to use the string's length in bytes instead of its length in characters.
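To make the mismatch concrete, here is a minimal sketch that measures the name from the bad.xlsx example in a few different ways. The helper names `sheet_name_lengths` and `sheet_name_too_long?` are hypothetical and are not part of axlsx. The UTF-16 count is included only for comparison; the issue does not establish which of these measures Excel actually enforces, so treat that as an assumption.

```ruby
# Hypothetical helper (not part of axlsx): report several length measures
# for a worksheet name so they can be compared side by side.
def sheet_name_lengths(name)
  {
    chars: name.size,                                   # code points, the measure described in the issue
    bytes: name.bytesize,                               # UTF-8 bytes, the measure proposed in the issue
    utf16_units: name.encode("UTF-16LE").bytesize / 2   # UTF-16 code units, shown for comparison (assumption)
  }
end

# Hypothetical guard using the byte-based check suggested above.
# The limit of 31 comes from the issue text.
def sheet_name_too_long?(name, limit: 31)
  name.bytesize > limit
end

# The name from the bad.xlsx example: two regional-indicator code points
# followed by 28 ASCII digits.
bad_name = "\u{1F1EB 1F1F7}1234567890123456789012345678"

puts sheet_name_lengths(bad_name).inspect
puts sheet_name_too_long?(bad_name)
puts sheet_name_too_long?("good")
```

For this name the character count stays under 31 while the byte count does not, which matches the behavior described: a character-based check lets the name through, and a byte-based check would reject it. Note that a byte-based check is stricter than a character-based one and may also reject some non-ASCII names that a character-based check would accept.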

aardvarkk pushed a commit to aardvarkk/axlsx that referenced this issue May 25, 2018
size returns length in characters, but doesn't factor in multibyte Unicode characters.
By switching to bytesize, we check the relevant measure of how many bytes the worksheet name is.

Fixes randym#588.
@aardvarkk aardvarkk linked a pull request May 25, 2018 that will close this issue
fmluizao pushed a commit to fmluizao/axlsx-alt that referenced this issue Sep 17, 2018
size returns length in characters, but doesn't factor in multibyte Unicode characters.
By switching to bytesize, we check the relevant measure of how many bytes the worksheet name is.

Fixes randym#588.
aardvarkk pushed a commit to aardvarkk/axlsx that referenced this issue Dec 16, 2019
- `size` returns length in characters, but doesn't factor in multibyte Unicode characters.
By switching to `bytesize`, we check the relevant measure of how many bytes the worksheet name is.
- Fixes randym#588
- Copy of PR against original axlsx
(randym#589)
noniq pushed a commit to caxlsx/caxlsx that referenced this issue Dec 17, 2019
- `size` returns length in characters, but doesn't factor in multibyte Unicode characters.
By switching to `bytesize`, we check the relevant measure of how many bytes the worksheet name is.
- Fixes randym/axlsx#588
- Copy of PR against original axlsx
(randym/axlsx#589)
@noniq noniq added the Done in caxlsx This has already been solved in the caxlsx fork. label Dec 17, 2019