[GitHub] [groovy] danielsun1106 opened a new pull request #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

classic Classic list List threaded Threaded
35 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 opened a new pull request #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
danielsun1106 opened a new pull request #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
danielsun1106 commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570689193
 
 
   @blackdrag
   Jochen, it would be great if you could review the PR :-)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] blackdrag commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
blackdrag commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570808378
 
 
   Just to be sure I understand the code correctly... We install a CachedCallableCallsite, which will have similar logic to before, meaning we do the normal method selection, then install a method as primary target and if all the guards fail (or all callsites get reset by meta class logic) we fallback to now not a new select, but to another guard that first checks with cacheHits if the cache contains the method and sets the new target in a ThreadLocal. If it exists we use fromCache, which basically gets the handle from the ThreadLocal and in case of a cache miss we do select again, but select will always add the new selections to the cache. To determine if it is a cache hit we use the class name of the receiver. Oh, and select will also (as before) set a new target for the callsite.
   
   Assuming this is correct I do have a few problems with this:
   - you should always have a micro benchmark for this sort of thing. Yes, micro benchmarks are evil, but in this case you are trying to optimize something very small, so it is going to be micro, if not nano.
   - unless changed ThreadLocal is a performance killer. It introduces a barrier which is preventing method inlining.
   - a call x.foo(y) may add a handle X#foo(Y1) in the cache, but if y changes to Y2 your cache will still return X#foo(Y1), even though y may not be an instance of Y1. You will have to check the argument types. In java and static Groovy the receiver check would be enough, in Groovy (because of double dispatch) not. And of course if X changes to a class of same name but different loader, then the handle for X#foo(Y1) would be invalid as well.
   
   Let me suggest an alternative approach... if the end up to select the method again, then use old callsite target as fallback for the new handle. That way you create a giant method handle that is the cache, forcing the JVM to optimize the whole thing. Of course there are x more steps to improve that, but it would be a simple start. (we may want to limit the length, reorder according to the number of calls, maybe have some weakly referenced to avoid class unloading issues, and maybe other things, but all that needs a benchmark to go by and prove it is an improvement really)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 commented on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(...);
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(...);
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(...);
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache again, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(MethodType.methodType(Object.class, Object[].class));
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache again, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   >a call x.foo(y) may add a handle X#foo(Y1) in the cache, but if y changes to Y2 your cache will still return X#foo(Y1), even though y may not be an instance of Y1. You will have to check the argument types.
   
   We cache the methodhandle with method type `MethodType.methodType(Object.class, Object[].class)`, so we will not encounter the issue(the following code runs well) ;-)
   ```groovy
   def same(obj) { return obj }
   [1, 1.0f, '1.0'].each { same(it) }
   ```
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(MethodType.methodType(Object.class, Object[].class));
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache again, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   >a call x.foo(y) may add a handle X#foo(Y1) in the cache, but if y changes to Y2 your cache will still return X#foo(Y1), even though y may not be an instance of Y1. You will have to check the argument types.
   
   We cache the methodhandle with method type `MethodType.methodType(Object.class, Object[].class)`, so we will not encounter the issue, the following code runs well  ;-)
   ```groovy
   def same(obj) { return obj }
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { same(it) }
   }
   ```
   (before tuning: 9s, after tuning: 6s, on my machine)
   
   ```groovy
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { it.toString() }
   }
   ```
   (before tuning: 6s, after tuning: 2s, on my machine)
   
   More strict performance will be done with JMH, I have no idea how to enable indy for performance tests for now...
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(MethodType.methodType(Object.class, Object[].class));
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache again, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   >a call x.foo(y) may add a handle X#foo(Y1) in the cache, but if y changes to Y2 your cache will still return X#foo(Y1), even though y may not be an instance of Y1. You will have to check the argument types.
   
   We cache the methodhandle with method type `MethodType.methodType(Object.class, Object[].class)`, so we will not encounter the issue, the following code runs well  ;-)
   ```groovy
   def same(String obj) { return obj }
   def same(int obj) { return obj }
   def same(float obj) { return obj }
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { same(it) }
   }
   ```
   (before tuning: 9s, after tuning: 6s, on my machine)
   
   ```groovy
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { it.toString() }
   }
   ```
   (before tuning: 6s, after tuning: 2s, on my machine)
   
   More strict performance will be done with JMH, I have no idea how to enable indy for performance tests for now...
   
   Also, we have inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [groovy] danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic

GitBox
In reply to this post by GitBox
danielsun1106 edited a comment on issue #1135: GROOVY-8298: Slow Performance Caused by Invoke Dynamic
URL: https://github.com/apache/groovy/pull/1135#issuecomment-570811643
 
 
   Jochen, the main idea can be described as the following code:
   
   ```java
   if (cacheHits(receiverCassName)) { // Try to find the cached MethodHandle
         // the main logic of `fromCache`
         MethodHandle mh = lruCache.get(receiverCassName); // find the cached MethodHandle again...
         return mh.invokeExact(...);
   } else { // fallback, and put the fallback methodhandle to the inline cache(LRU) for reuse
         // the main logic of `selectMethod`
         Selector selector = Selector.getSelector(...);
         selector.setCallSiteTarget();
         MethodHandle mh = selector.handle.asSpreader(...).asType(MethodType.methodType(Object.class, Object[].class));
         lruCache.put(receiverCassName, mh); // cache the methodhandle
         return mh.invokeExact(...);
   }
   ```
   
   As the inline cache is implemented as LRU cache, and `cacheHits` and `fromCache` are executed in two steps(not atomic), so if `cacheHits` returns `true`(means cached methodhandle found), `fromCache` find the same cached methodhandle from LRU cache again, the result probably is `null` as cached methodhandle may be cleared according to the rule of LRU... I use `ThreadLocal` to avoid the above concurrent issue.
   
   Also, we have the inline cache for callsite, so `setTarget` can be avoided, which causes the poor performance as the GROOVY-8298 shown.
   
   >a call x.foo(y) may add a handle X#foo(Y1) in the cache, but if y changes to Y2 your cache will still return X#foo(Y1), even though y may not be an instance of Y1. You will have to check the argument types.
   
   We cache the methodhandle with method type `MethodType.methodType(Object.class, Object[].class)`, so we will not encounter the issue, the following code runs well  ;-)
   ```groovy
   def same(String obj) { return obj }
   def same(int obj) { return obj }
   def same(float obj) { return obj }
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { same(it) }
   }
   ```
   (before tuning: 9s, after tuning: 6s, on my machine)
   
   ```groovy
   for (int i = 0; i < 100000; i++) {
       [1, 1.0f, '1.0'].each { it.toString() }
   }
   ```
   (before tuning: 6s, after tuning: 2s, on my machine)
   
   More strict performance will be done with JMH, I have no idea how to enable indy for performance tests for now...
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
1234