Add support for text formatting information extraction #185

Closed
rayburn opened this issue Mar 26, 2015 · 7 comments

@rayburn

rayburn commented Mar 26, 2015

It would be great if the gem could optionally preserve cell text formatting during parsing. The current parser seems to convert everything to plain text. I need to be able to preserve the text formatting of a cell.

For example, the following cell contents currently get converted to 'This text is bold and this is not' without any formatting information.

**This text is bold** and this is not

What would be great is if it stored the text as the following:

<b>This text is bold</b> and this is not

Perhaps have support for bold, italic, underline, subscript, superscript and other common text formats.

Thanks for the consideration.

@stevendaniels
Contributor

I agree it would be better to have some method that preserves the cell's text formatting. I'm still getting familiar with how Roo deals with formatting.

It probably makes sense to work on adding them after updating Roo on rubygems. (#172)

@rayburn
Author

rayburn commented Mar 27, 2015

Thx stevendaniels. Let me know how I can help. In the semiconductor industry there is increasing usage of spreadsheets for content authoring and the text formatting information is key. Current workaround is to save the file as Excel 2003 XML and find the information there and do a substitution. Of course that is ugly...

@stevendaniels
Contributor

In case you're interested in scratching your own itch, you can get the formatting details in the sharedStrings.xml file. lib/roo/excelx/shared_strings.rb extracts those strings, but it looks like it just throws away the format information.

Here's a link to the sharedStrings schema.
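
As a rough illustration of the suggestion above, here is a minimal sketch of reading rich text runs out of a sharedStrings.xml document and emitting the kind of markup requested in this issue. It is not Roo's code: the method names (`shared_strings_with_markup` and its helpers), the tag mapping, and the sample XML are all hypothetical, and it uses Ruby's REXML only to stay self-contained. It relies on the SpreadsheetML layout in which each `<si>` entry holds either a plain `<t>` or a series of `<r>` runs, each with optional `<rPr>` run properties such as `<b/>`, `<i/>`, `<u/>` and `<vertAlign val="..."/>`.

```ruby
require 'rexml/document'

# Run-property elements (children of <rPr>) mapped to markup tags.
# The tag names are an illustrative choice, not something Roo defines.
FLAG_TAGS = { 'b' => 'b', 'i' => 'i', 'u' => 'u' }.freeze
VERT_ALIGN_TAGS = { 'subscript' => 'sub', 'superscript' => 'sup' }.freeze
# A val of "0", "false" or "none" switches a property off.
OFF_VALUES = %w[0 false none].freeze

# Direct child elements of `element` whose local name is `name`.
def children_named(element, name)
  element.elements.to_a.select { |child| child.name == name }
end

# Markup tags that apply to a single <r> (rich text run).
def tags_for_run(run)
  props = children_named(run, 'rPr').first
  return [] unless props

  props.elements.to_a.map do |prop|
    val = prop.attributes['val']
    if prop.name == 'vertAlign'
      VERT_ALIGN_TAGS[val]
    elsif FLAG_TAGS.key?(prop.name) && !OFF_VALUES.include?(val)
      FLAG_TAGS[prop.name]
    end
  end.compact
end

# Wraps the text of one run in its tags. The text is not HTML-escaped.
def run_to_markup(run)
  text = children_named(run, 't').map { |t| t.text.to_s }.join
  tags = tags_for_run(run)
  opening = tags.map { |tag| "<#{tag}>" }.join
  closing = tags.reverse.map { |tag| "</#{tag}>" }.join
  "#{opening}#{text}#{closing}"
end

# Converts one <si> entry: plain strings pass through, rich text runs
# are converted one by one and concatenated.
def shared_string_to_markup(si)
  runs = children_named(si, 'r')
  return children_named(si, 't').map { |t| t.text.to_s }.join if runs.empty?

  runs.map { |run| run_to_markup(run) }.join
end

# Returns one string per <si> entry, in document order.
def shared_strings_with_markup(xml)
  doc = REXML::Document.new(xml)
  children_named(doc.root, 'si').map { |si| shared_string_to_markup(si) }
end

# Hand-written sample, not taken from a real workbook.
SAMPLE_XML = <<~XML
  <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="2" uniqueCount="2">
    <si><t>Plain text</t></si>
    <si>
      <r><rPr><b/><sz val="11"/></rPr><t>This text is bold</t></r>
      <r><rPr><sz val="11"/></rPr><t xml:space="preserve"> and this is not</t></r>
    </si>
  </sst>
XML

puts shared_strings_with_markup(SAMPLE_XML)
```

For the sample above, the second entry comes back as `<b>This text is bold</b> and this is not`. A real implementation would still need to decide how to escape the cell text and how to expose this as an option alongside the existing plain-text behaviour.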

@rayburn
Author

rayburn commented Mar 27, 2015

Sounds good, I think I will start scratching. It will be good to get my first pull request out of the way!

@stevendaniels
Contributor

@rayburn Any progress on this issue?

@lukas2342

I suppose the lack of posts here means that this feature is not available yet?

@stevendaniels
Contributor

Closing due to lack of interest.
