-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #30 #254
base: master
Are you sure you want to change the base?
fix #30 #254
Conversation
Will review, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestions to improve the code... at least the formatting should be fixed, and any refinement you can make for the other comments would be great!
@@ -186,8 +187,30 @@ public String getString(long offset) { | |||
|
|||
|
|||
public String getString(long offset, int maxLength, Charset cs) { | |||
final byte[] bytes = IO.getZeroTerminatedByteArray(address() + offset, maxLength); | |||
return cs.decode(ByteBuffer.wrap(bytes)).toString(); | |||
if(cs == StandardCharsets.UTF_8){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reformat according to more common Java formatting standards:
if (cond) {
...
} else {
...
for (stuff) {
...
Otherwise we get a mix of formats and future patches will likely include unhelpful formatting updates.
@@ -186,8 +187,30 @@ public String getString(long offset) { | |||
|
|||
|
|||
public String getString(long offset, int maxLength, Charset cs) { | |||
final byte[] bytes = IO.getZeroTerminatedByteArray(address() + offset, maxLength); | |||
return cs.decode(ByteBuffer.wrap(bytes)).toString(); | |||
if(cs == StandardCharsets.UTF_8){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic could apply to any Charset that has a minimum character width of 1 byte, correct? All such encodings should use a single-byte \0
for C string termination I think?
At the very least this should include the ISO-8859 encodings, which are all single-byte (their CharsetEncoder.maxBytesPerChar will all be 1.0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree that I should include other single byte terminator charsets, but CharsetEncoder.maxBytesPerChar won't be sufficient because it does not resolve to 1.0 for utf-8
}else{ | ||
byte[] bytes = new byte[maxLength]; | ||
IO.getByteArray(address() + offset, bytes, 0, maxLength); | ||
final byte[] nullCharBytes = new String("\0").getBytes(cs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could possibly be cached for common encodings (UTF-16, UTF-32) or it might be valid to just use a width of \0
equal to the encoding's CharsetEncoder.maxBytesPerChar value.
…tly with more charsets, fix DirectMemoryIO.getString()
Request another review when you are ready. Thanks for keeping at it! |
this part of the code is quite tricky, will keep you posted, thanks |
btw is the irc channel still alive ? |
…ngth() that works with wide chars, default max length to max int
@demon36 Ah no I expect the IRC channel, being on now-defunct FreeNode, is probably dead. I should set up a new channel on Matrix or libera. For now if you want to chat with those of us maintaining these libraries, stop by the #jruby channel on Matrix! |
@headius good to know that, the PR is ready for a review, please also take a look at Struct.java |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking near completion. Main issues are:
- formatting throughout does not match Java conventions or the rest of the codebase
- no tests provided for the behavior
Almost there!
return cs.decode(ByteBuffer.wrap(bytes)).toString(); | ||
long baseAddress = address() + offset; | ||
int nullTermSize = StringUtil.terminatorWidth(cs); | ||
if(nullTermSize == 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting throughout should match the rest of the code, which is intended to match typical Java coding conventions:
- four space indentation for blocks of code, 8-space indentation for line continuation
- space between keywords like
if
andwhile
and their parenthesized conditionals - spaces around operators like
+
and%
It's nitpicky but if we don't maintain consistent code formatting then we end up with future commits and PRs that have lots of unrelated changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, will take care of that
@@ -64,7 +66,8 @@ public static CharSequence getCharSequence(ByteBuffer buf, Charset charset) { | |||
final ByteBuffer buffer = buf.slice(); | |||
// Find the NUL terminator and limit to that, so the | |||
// StringBuffer/StringBuilder does not have superfluous NUL chars | |||
int end = indexOf(buffer, (byte) 0); | |||
final byte[] nullCharBytes = new byte[StringUtil.terminatorWidth(charset)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can cache the three known lengths and not reallocate this byte[] every time?
Looks like this might need a merge or rebase from master to pick up that missing import too. |
@demon36 Should we just close this in light of the work to do this all natively in jffi? I am inclined to keep this as fallback code when we do not have an updated jffi binary, but what are your thoughts going forward? |
fix #30, DirectMemoryIO.getString() fails for non UTF-8