Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a config to limit the code-gen and class loading #83

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

volauvent
Copy link
Collaborator

The config sets the maximum number of fast SerDes classes
generated and loaded. It helps to limit the metaspace and codecache
usage brought by fast-avro runtime code-gen and class loading.

The limit config can be set via FastSerdeCache constructors.

It sets the maximum number of fast SerDes classes generated and
loaded. It helps to limit the metaspace and codecache usage
brought by fast-avro runtime code-gen and class loading.

The limit config can be set via FastSerdeCache constructors.
@codecov-commenter
Copy link

Codecov Report

Merging #83 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #83   +/-   ##
=========================================
  Coverage     53.42%   53.42%           
  Complexity      275      275           
=========================================
  Files            39       39           
  Lines          1662     1662           
  Branches        206      206           
=========================================
  Hits            888      888           
  Misses          692      692           
  Partials         82       82           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80ecbc2...5a79fa9. Read the comment docs.

} else {
LOGGER.warn("Loaded fast serdes classes number {}, with limit set to {}", loadClassNum, loadClassLimit);
}
return null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be clearer to just throw the exception instead of returning null.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will fix it.

* @param limit
* custom number {@link #generatedFastSerDesLimit}
*/
public FastSerdeCache(int limit) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add other two constructors? Maybe this one is good enough.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No harm to add these two constructors :) For exmaple, I am using FastSerdeCache(Executor executorService, int limit) in the test.

T result = null;
if (this.generatedSerDesNum.get() < this.generatedFastSerDesLimit) {
result = supplier.get();
} else if (this.generatedSerDesNum.get() == this.generatedFastSerDesLimit) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing there are two counters, which are being maintained independently in FastSerdeCache and FastSerdeBase, and I think we should combine and just use one.
The idea in my mind:

  1. Maintaining the counter and limit in FastSerdeCache.
  2. Passing the limit enforcement function to FastSerdeBase via all the Generator classes.
  3. The limit enforcement function can be this way:
private int generatedSerDeNum;
  private int generatedFastSerDeLimit;
  Predicate<Boolean> limitPredicate = (whetherIncrementCounter) -> {
    synchronized (this) {
      if (generatedSerDeNum >= generatedFastSerDeLimit) {
        return false;
      }
      if (whetherIncrementCounter) {
        ++generatedSerDeNum;
      }
      return true;
    }
  };
  1. In FastSerdeCache and all kinds of generators before the fast class generation, we should call limitPredicate(false), and fail fast when this predicate returns false.
  2. In FastSerdeBase, before loading the new generated class, we should call limitPredicate(true), and fail fast if the predicate return false.

Essentially, the idea is to keep the counting logic in a single place to make it consistent.
Invoking this function in FastSerdeCache and various Generators is to try to avoid useless work.

Let me know if you want to have a sync up about this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explaination!

This approach indeed helps to reduce two counters to one. However, it also leads to redundant code-gens and compilations which are CPU intensive tasks. IIUC, there's no limit for the code-gen and compilation before we really load generatedFastSerDeLimit number of fast classes. So the corner case is fast-avro may generate and compile a great number of fast classes, and then throw them away.

For the current implementation, we at most generate extra N - 1 fast classes, N is the threads number of FastSerdeCache Executor. So I think the current implementation is better. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants