-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial implementation of ContinuationLocal #1
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,46 @@ | ||
# continuation-local-storage-jvm | ||
A common JVM implementation of continuation-local storage | ||
|
||
This is a common implementation of continuation-local storage for the JVM. | ||
It is based on Twitter's [`com.twitter.util.Local`](https://github.com/twitter/util/blob/master/util-core/src/main/scala/com/twitter/util/Local.scala), but this is a Java port | ||
without any external dependencies (notably; no dependency on Scala). | ||
|
||
The ContinuationLocal can be used for any data that should be persisted in | ||
a continuation scope, but is especially useful for various monitoring and tracing | ||
implementations. | ||
|
||
## Usage example | ||
|
||
```java | ||
import com.github.distributedtracing.ContinuationLocal; | ||
|
||
public final class MyProgram { | ||
private static final ContinuationLocal<Integer> userId | ||
= new ContinuationLocal<Integer>(); | ||
|
||
public static void main(String[] args) { | ||
userId.set(123); | ||
printUserId(); | ||
} | ||
|
||
private static void printUserId() { | ||
System.out.println(userId.get()); | ||
} | ||
} | ||
``` | ||
|
||
## Design differences between com.twitter.util.Local and ContinuationLocal: | ||
|
||
- The value of `com.twitter.util.Local<T>` is of type `scala.Option<T>`, | ||
where `Some(value)` holds the value, while `None` indicates that the | ||
value is absent. In `com.github.distributedtracing.ContinuationLocal<T>`, | ||
the value is not an Option, instead being a nullable of type `T`. | ||
- In `com.twitter.util.Local` the inner storage mechanism is a `ThreadLocal`, | ||
while in `ContinuationLocal`, it is a `InheritableThreadLocal`. This means | ||
that a value that would not be passed on in `com.twitter.util.Local` will be | ||
in a `ContinuationLocal`, in case a new thread is started in the same | ||
context as the `ContinuationLocal` was set. | ||
- `com.twitter.util.Local`'s `update` method is not included. The similar | ||
method `set` should instead be used. `update` in `c.t.u.Local` uses `Option` | ||
types, while `set` in `ContinuationLocal` deals with nullable values directly. | ||
- `com.twitter.util.Local`'s (static) `letClear` function has been renamed to | ||
`letAllClear` to avoid a naming collision with the non-static method `letClear`. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
|
||
<parent> | ||
<groupId>org.sonatype.oss</groupId> | ||
<artifactId>oss-parent</artifactId> | ||
<version>7</version> | ||
</parent> | ||
|
||
<properties> | ||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
</properties> | ||
|
||
<groupId>com.github.distributedtracing</groupId> | ||
<artifactId>continuation-local-storage</artifactId> | ||
<version>1.0-SNAPSHOT</version> | ||
<packaging>jar</packaging> | ||
|
||
<name>continuation-local-storage</name> | ||
<url>https://github.com/DistributedTracing/continuation-local-storage-jvm</url> | ||
|
||
<description> | ||
This is a Java port of Twitter's com.twitter.util.Local. The purpose of | ||
ContinuationLocal is to be able to pass state implicitly between continuations; | ||
much like a ThreadLocal, but one that also works in thread pools and event loops. | ||
</description> | ||
|
||
<licenses> | ||
<license> | ||
<name>The Apache Software License, Version 2.0</name> | ||
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url> | ||
<distribution>repo</distribution> | ||
</license> | ||
</licenses> | ||
|
||
<developers> | ||
<developer> | ||
<id>eirslett</id> | ||
<name>Eirik Sletteberg</name> | ||
<email>[email protected]</email> | ||
</developer> | ||
</developers> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>junit</groupId> | ||
<artifactId>junit</artifactId> | ||
<version>4.12</version> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
|
||
<build> | ||
<plugins> | ||
<plugin> | ||
<groupId>org.codehaus.mojo</groupId> | ||
<artifactId>animal-sniffer-maven-plugin</artifactId> | ||
<version>1.14</version> | ||
<executions> | ||
<execution> | ||
<phase>test</phase> | ||
<goals> | ||
<goal>check</goal> | ||
</goals> | ||
</execution> | ||
</executions> | ||
<configuration> | ||
<signature> | ||
<groupId>org.codehaus.mojo.signature</groupId> | ||
<artifactId>java16</artifactId> | ||
<version>1.1</version> | ||
</signature> | ||
</configuration> | ||
</plugin> | ||
</plugins> | ||
|
||
<pluginManagement> | ||
<plugins> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add animalsniffer? <plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>animal-sniffer-maven-plugin</artifactId>
<version>1.14</version>
<executions>
<execution>
<phase>test</phase>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
<configuration>
<signature>
<groupId>org.codehaus.mojo.signature</groupId>
<artifactId>java16</artifactId>
<version>1.1</version>
</signature>
</configuration>
</plugin> |
||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-compiler-plugin</artifactId> | ||
<version>3.1</version> | ||
<configuration> | ||
<source>1.6</source> | ||
<target>1.6</target> | ||
</configuration> | ||
</plugin> | ||
</plugins> | ||
</pluginManagement> | ||
</build> | ||
</project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
package com.github.distributedtracing; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this ultimately what we want? Might as well state your intent and go for org.continuation from the start There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 to not having it in the com.github namespace. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do +1 it myself. |
||
|
||
public final class ContinuationLocal<T> { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. some things in the readme could be moved here I think. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree. It would be good to add some javadoc here. |
||
private static InheritableThreadLocal<Object[]> localCtx = | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is passive inheritance desired? Is it better to be explicit about passing context so that the mechanism by which context carries over thread boundaries is not dependent on how the thread was created. Or to put it another way, everyone should use executors or wrap runnables There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are you referring to ThreadLocal vs InheritableThreadLocal? (The only difference being that InheritableThreadLocal passes state from one thread to another thread if the second thread was created from the first thread) It's a design decision whether or not passing state between threads should be explicit or implicit. The ThreadLocal in Java does it implicitly, so I guess that's the "standard" Java way. Another point of view is that we want to be able to trace requests without having to make lots of changes to userland code. Even more, to instruct developers to manually wrap all their threads and thread pools. In a multi-million-line code base, it's a pain to search-replace every place you need to wrap runnables and so on. (Unfortunately, that's what we have to do anyways, unless the Java standard library adopts something like ContinuationLocal and wraps ExecutorServices natively). The underlying mechanism for passing state is the save() and restore() stuff, and the ExecutorService wrapper included in this library is simply a default implementation for thread pools. Likewise, InheritableThreadLocal could be thought of as a default implementation for passing state into new threads, I suppose? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm just questioning whether implicit passing via InheritableThreadLocal is a fools errand for the reasons you mention. I consider wrapping Executors to be explicit passing btw and something most modern application do as a matter of routine. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does seem There are also issues with object retention, in that the ThreadPoolExecutor, even when properly wrapped, would be retaining references to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mariusae was it a conscious decision to use ThreadLocal insted of InheritableThreadLocal, in the first place? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @eirslett yes. we wanted all context saving and restoring to be explicit. |
||
new InheritableThreadLocal<Object[]>(); | ||
private static volatile int size = 0; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better to use AtomicInteger so you don't have to hold the class monitor which risks contention. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point! Will change that. |
||
|
||
/** | ||
* Return a snapshot of the current ContinuationLocal state. | ||
*/ | ||
public static Object[] save() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is likely to violate the mutability expectations of users when state is moved between threads. In particular the fact that some inner code can directly replace the bound value at any time for any ContinuationLocal regardless of visibility rules. I'd strongly consider encapsulating this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, also a trade-off here; Twitter solved this by using a type alias: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems dropping some type safety and returning That isn't to say that I think returning an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good solution! |
||
return localCtx.get(); | ||
} | ||
|
||
/** | ||
* Restore the ContinuationLocal state to a given set of values. | ||
*/ | ||
public static void restore(Object[] saved) { | ||
localCtx.set(saved); | ||
} | ||
|
||
private static synchronized int add() { | ||
size += 1; | ||
return size - 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. return size++? |
||
} | ||
|
||
private static void set(int i, Object v) { | ||
assert i < size; | ||
Object[] ctx = localCtx.get(); | ||
|
||
if(ctx == null) { | ||
ctx = new Object[size]; | ||
} else { | ||
Object[] oldCtx = ctx; | ||
ctx = new Object[size]; | ||
System.arraycopy(oldCtx, 0, ctx, 0, oldCtx.length); | ||
} | ||
|
||
ctx[i] = v; | ||
localCtx.set(ctx); | ||
} | ||
|
||
private static Object get(int i) { | ||
Object[] ctx = localCtx.get(); | ||
if(ctx == null || ctx.length <= i) { | ||
return null; | ||
} | ||
|
||
return ctx[i]; | ||
} | ||
|
||
private static void clear(int i) { | ||
set(i, null); | ||
} | ||
|
||
/** | ||
* Clear all locals in the current context. | ||
*/ | ||
private static void clearAll() { | ||
localCtx.set(null); | ||
} | ||
|
||
/** | ||
* Execute a block with the given ContinuationLocals, | ||
* restoring current values upon completion. | ||
*/ | ||
public static <U> U let(Object[] ctx, Producer<U> f) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is not the usual meaning of |
||
Object[] saved = save(); | ||
restore(ctx); | ||
U result; | ||
try { | ||
result = f.apply(); | ||
} finally { | ||
restore(saved); | ||
} | ||
return result; | ||
} | ||
|
||
/** | ||
* Execute a block with all ContinuationLocals clear, restoring | ||
* current values upon completion. | ||
*/ | ||
public static <U> U letAllClear(Producer<U> f) { | ||
return ContinuationLocal.let(null, f); | ||
} | ||
|
||
/** | ||
* Convert a Producer<R> into another producer of the same | ||
* type whose ContinuationLocal context is saved when calling `closed` | ||
* and restored upon invocation. | ||
*/ | ||
public static <R> Producer<R> closed(final Producer<R> fn) { | ||
final Object[] closure = ContinuationLocal.save(); | ||
return new Producer<R>() { | ||
public R apply() { | ||
Object[] save = ContinuationLocal.save(); | ||
ContinuationLocal.restore(closure); | ||
R result; | ||
try { | ||
result = fn.apply(); | ||
} finally { | ||
ContinuationLocal.restore(save); | ||
} | ||
return result; | ||
} | ||
}; | ||
} | ||
|
||
private final int me = add(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While integer keys are efficient the data density tradeoff might be undesirable if folks think that they can create and discard new continuation local instances all the time. If the performance characteristics of indexed lookup are critical then perhaps place an upper bound on the array size and fallback to a spare map implementation once that size is exceeded. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's right that developers shouldn't make new ContinuationLocals everywhere, because they have an overhead, and aren't really garbage collected either. But I'm not sure the implementation details should document how ContinuationLocal is supposed to be used, maybe that should be a warning in the JavaDoc instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ContinuationLocal instances appear to be garbage collectable (am I missing something?) I agree you want to avoid too much coupling between implementation details and interface but there is a clear risk here. Guidance in doc is probably sufficient for most cases but defense in depth is also worth considering. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The objects that are referred to by a ContinuationLocal is stored inside the context (the current/temporary implementation of context is Object[]). So the ContinuationLocal itself is garbage collected, but not the value it's bound to. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just as a data point: we're operating really rather huge codebases with |
||
|
||
/** | ||
* Update the ContinuationLocal with the given value. | ||
*/ | ||
public void set(T value) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is continuous mutability the goal or an anti-pattern here? We explicitly chose not to allow it in because we believe the aggregate set of 'context' passed to a unit of work should be stable and that it was an exceptional event to need to mutate it. If folks need mutability for some reason they can have the stored state be mutable. If your goal is to have the 'feel' of ThreadLocal then it will have to accept the risks that come with that choice. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a very good point! Mutability can be achieved by using a mutable inner value. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mariusae wouldn't this set() method on the Local mean it's mutable in the same sense that a ThreadLocal is mutable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's really hard to compare, because ThreadLocals don't propagate across their delimitation's boundaries (starting a new thread) whereas Locals do. Thus, if I change a Local with Even so, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, the implication of this, is that if you do want to share values that mutate across contexts, you do have to use inner refs. Finagle's context API provides a clearer API for all of this. (It does more--but it might inform how I'd write |
||
set(me, value); | ||
} | ||
|
||
/** | ||
* Get the ContinuationLocal's value. | ||
*/ | ||
public T get() { | ||
return (T)get(me); | ||
} | ||
|
||
/** | ||
* Alias for get() | ||
*/ | ||
public T apply() { | ||
return get(); | ||
} | ||
|
||
/** | ||
* Execute a block with a specific ContinuationLocal value, | ||
* restoring the current state upon completion. | ||
*/ | ||
public <U> U let(T value, Producer<U> f) { | ||
T saved = get(); | ||
set(value); | ||
U result; | ||
try { | ||
result = f.apply(); | ||
} finally { | ||
set(saved); | ||
} | ||
return result; | ||
} | ||
|
||
/** | ||
* Execute a block with the ContinuationLocal cleared, restoring | ||
* the current state upon completion. | ||
*/ | ||
public <U> U letClear(Producer<U> f) { | ||
T saved = get(); | ||
clear(); | ||
U result; | ||
try { | ||
result = f.apply(); | ||
} finally { | ||
set(saved); | ||
} | ||
return result; | ||
} | ||
|
||
/** | ||
* Clear the ContinuationLocal's value. Other ContinuationLocal's are not modified. | ||
*/ | ||
public void clear() { | ||
ContinuationLocal.clear(me); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
com.github probably isn't the right group id, as we won't be able to publish against that. @rtyler correct me if I am wrong on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of packages out there released from github: http://search.maven.org/#search%7Cga%7C1%7Ccom.github
But maybe we should aquire a domain like "distributedtracing.org" and use "org.distributedtracing" as a groupId? Then we could point that domain to a gh-pages here in the github repo. (And the specification won't be tied to github)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I suppose I could have googled that :) I don't know what's best, but I suspect that not having github in package names or group ids could lead to later flexibility (thinking apache etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I suggest that we buy that domain, and change the groupId.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
com.github.distributedtracing is certainly legitimate and can be used for publishing. So can "distributedtracing" or anything in between.
Group IDs are only conventionally supposed to be reverse DNS to the authors domain, and distributedtracing.github.com is a resolvable name and will point to the github pages repo for this organization when it is set up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think the gh-pages will be
distributedtracing.github.io
so it would be better to start with a maven groupio.github.distributedtracing
if that was the rationale. We should be able to control static content in the domain/subdomain that we claim, so I think ".com" is wrong.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsyer both are equivalent, foo.github.io and foo.github.com are the same.
In the case of a custom CNAME, like we use for @lookout, lookout.github.io, lookout.github.com and hackers.lookout.com all end up at the same place