Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache record names to avoid hitting class loader #219

Merged
merged 2 commits into from
Sep 19, 2020
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@

public abstract class AvroSchemaHelper
{
private static final Map<String, String> SCHEMA_NAME_CACHE = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns about this.

Since it is static it is per-ClassLoader, and ideally nothing in Jackson would use that. If at all possible this should be tied to something else, probably AvroFactory (or AvroMapper if practical).

Second, all caches should be bound (not with unlimited size). This is easy to solve by just using

com.fasterxml.jackson.databind.util.LRUMap

with whatever size settings (default, max size) makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also another significant problem is that access is not synchronized: since this will get called from multiple threads, and it is not read-only, access must be made thread-safe. LRUMap uses ConcurrentHashMap which solves that issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question from a Java non-expert: will the call Class.forName(nestedClassName) (called in the method that resolves the name) use the default class loader? If so, no matter how we cache the names, it will still be broken.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would use the default class loader of... well. It's probably the context class loader (alternative being class loader that loaded given class; usually these are the same but not always).


/**
* Dedicated mapper for handling default values (String &lt;-&gt; JsonNode &lt;-&gt; Object)
*
Expand Down Expand Up @@ -332,32 +334,54 @@ public static String getTypeId(Schema schema) {
*/
public static String getFullName(Schema schema) {
switch (schema.getType()) {
case RECORD:
case ENUM:
case FIXED:
String namespace = schema.getNamespace();
String name = schema.getName();
if (namespace == null) {
return schema.getName();
}
if (namespace.endsWith("$")) {
return namespace + name;
}
StringBuilder sb = new StringBuilder(1 + namespace.length() + name.length());
// 28-Feb-2020: [dataformats-binary#195] somewhat complicated logic of trying
// to support differences between avro-lib 1.8 and 1.9...
// Check if this is a nested class
String nestedClassName = sb.append(namespace).append('$').append(name).toString();
try {
Class.forName(nestedClassName);
return nestedClassName;
} catch (ClassNotFoundException e) {
// Could not find a nested class, must be a regular class
sb.setLength(0);
return sb.append(namespace).append('.').append(name).toString();
}
case RECORD:
case ENUM:
case FIXED:
String namespace = schema.getNamespace();
String name = schema.getName();
String key = namespace + "." + name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace might be null here (not when a schema is generated from a java class, but if using hand-crafted or schemas generated from avdl files)

Copy link
Contributor

@baharclerode baharclerode Sep 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checked, schema.getFullName() should be suitable as a cache key here without any additional checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there @baharclerode! Thank you for additional comments -- and yes, I realize that Avro's choice of "interesting" ways of dealing with stuff sort of forces our hand here.
Would be happy to merge a PR for removing helper class as key, use schema.getFullName() if that should work?

One other related question: if we had a setting to select between "1.8-and-earlier" vs "1.9-and-later" modes, would it be possible to use different logic, avoiding Class lookup or at least alleviate it?
Or does the problem exist regardless of this, with 1.9 and above?
(I realize that right now lack of context makes it impossible to have such a setting, but I still hope we can make AvroSchemaHelper become non-static and/or take some context object we control which could pass configuration, and ideally contain per-mapper (or per-schema or whatever) cache).

String schemaName = SCHEMA_NAME_CACHE.get(key);

if (schemaName == null) {
schemaName = resolveFullName(schema);
SCHEMA_NAME_CACHE.put(key, schemaName);
}

return schemaName;

default:
return schema.getType().getName();
return schema.getType().getName();
}
}

private static String resolveFullName(Schema schema) {
String namespace = schema.getNamespace();
String name = schema.getName();

if (namespace == null) {
return schema.getName();
}

if (namespace.endsWith("$")) {
return namespace + name;
}

StringBuilder sb = new StringBuilder(1 + namespace.length() + name.length());

// 28-Feb-2020: [dataformats-binary#195] somewhat complicated logic of trying
// to support differences between avro-lib 1.8 and 1.9...
// Check if this is a nested class
String nestedClassName = sb.append(namespace).append('$').append(name).toString();

try {
Class.forName(nestedClassName);

return nestedClassName;
} catch (ClassNotFoundException e) {
// Could not find a nested class, must be a regular class
sb.setLength(0);

return sb.append(namespace).append('.').append(name).toString();
}
}

Expand Down