Memory leak in ClassInfo when using MetaClasses

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
32 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

Memory leak in ClassInfo when using MetaClasses

candrews
This post has NOT been accepted by the mailing list yet.
I'm trying to track down a memory leak in Grails and I'm pretty sure I've discovered the root cause of the leak is in Groovy. For reference, here's the thread on the grails-dev list on this topic: http://grails.1312388.n4.nabble.com/Easily-reproducible-memory-leak-in-Grails-td4655816.html

The problem is that when a class is loaded, then a metaclass is assigned, then the class is no longer reachable, the metaclass remains in memory. When this happens many times, lots of metaclasses are leaked, which becomes a significant problem. I discovered this problem because this how Grails implements GSPs - by loading the GSP as a groovy class, then when the GSP changes, loading a new class. The old class is unreachable, except for some internal Groovy structures... hence the memory leak.

Here's a junit test that reproduces the problem:
-----------------------------------------------------
package bugreport;

import groovy.lang.ExpandoMetaClass;
import groovy.lang.GroovyClassLoader;
import groovy.lang.GroovySystem;
import groovy.lang.MetaClass;
import groovy.lang.MetaClassRegistry;

import org.junit.Test;

public class LeakTest {
        @Test
        public void testForLeak() throws Exception{
                for(int i=1; i <= 10000 ; i++){
                        GroovyClassLoader classLoader = new GroovyClassLoader(Thread.currentThread().getContextClassLoader());
                        Class<?> clazz = classLoader.parseClass("print 'hello world'");
                        ExpandoMetaClass metaClass = getExpandoMetaClass(clazz);
                        System.gc();
                        Thread.sleep(50);
                        System.out.println("Done " + i + " iterations");
                }
        }
        //--Begin copied unmodified from GrailsMetaClassUtils--//
    public static ExpandoMetaClass getExpandoMetaClass(Class<?> aClass) {
        MetaClassRegistry registry = getRegistry();

        MetaClass mc = registry.getMetaClass(aClass);
        if (mc instanceof ExpandoMetaClass) {
            ExpandoMetaClass emc = (ExpandoMetaClass) mc;
            registry.setMetaClass(aClass, emc); // make permanent
            return emc;
        }

        registry.removeMetaClass(aClass);
        mc = registry.getMetaClass(aClass);
        if (mc instanceof ExpandoMetaClass) {
            return (ExpandoMetaClass)mc;
        }

        ExpandoMetaClass emc = new ExpandoMetaClass(aClass, true, true);
        emc.initialize();
        registry.setMetaClass(aClass, emc);
        return emc;
    }
   
    public static MetaClassRegistry getRegistry() {
        return GroovySystem.getMetaClassRegistry();
    }
        //--End copied unmodified from GrailsMetaClassUtils--//
       
}
-----------------------------------------------------
Eventually, that results in an OutOfMemoryError. Using a tool like visualvm, it is clear that the number of classes constantly rises, none are ever unloaded, and the permgen keeps getting used. The heap also rises. In my tests, it usually dies after about 5,000 iterations.

In my opinion, this shouldn't happen - this test case should run without leaking memory. Groovy should allow the MetaClasses to be garbage collected once the Class is no longer referenced by anything else. In other words, Groovy should hold a weak reference to the MetaClass/ClassInfo/etc for the Class based on the reachability of the Class.

By taking a heap dump in visualvm, I found that there are lots of instances of org.codehaus.groovy.runtime.metaclass.MetaMethodIndex$Entry.

The problem areas (that are GC roots) seem to be:
org.codehaus.groovy.reflection.ClassInfo's static field modifiedExpandos
org.codehaus.groovy.reflection.ClassInfo's static field globalClassSet

The classes that area created by classLoader.parseClass also stick around, which is why the permgen is leaking.

Can someone please help me determine how to fix this problem in Groovy (assuming there is agreement that this is a problem)?
Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

candrews
This post has NOT been accepted by the mailing list yet.
I have reported this issue at http://jira.codehaus.org/browse/GROOVY-6704
Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

candrews
Does anyone have any thoughts?

In my opinion, this is a significant issue, and it seems to impact all versions of Groovy. Fixing this should reduce the memory use of all Grails applications and probably a reasonable improvement for non-Grails Groovy ones too (it should reduce memory use in normal use, and fix a case where application die due to a fatal memory leak).

I'm willing to work on this, but I could really use some guidance but even general thoughts could be helpful too.

Thanks!
Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

Jim White
Hi Craig.

Your previous posts haven't shown up on the list, so my first questions were "this is in reply to what?" and "which (of many) leaks?".  But google knows all:



As you're learning, the problem of leakage in metaobject code is tricky and tough.  Java 8 is even changing the way PermGen is allocated so that at least it won't run out prematurely like it does now.

You can fix one of the leaks in your test by not generating an unbounded number of class names.  The GroovyClassLoader.parseClass(String) method is a convenience method and as you'll notice there is no file name specified, so it makes one up for you.  Of course that will happen every time and the class gets stored under that name which is why you're seeing GCL referencing a ton of classes.  

So the first thing is to use a particular name with the parseClass(String, String) method :

   Class<?> clazz = classLoader.parseClass("print 'hello world'", "TheOneTrueHelloWorld");

If you comment out the call to getExpandoMetaClass(clazz) then you'll see you can run indefinitely.

This does point out what is could be considered something of a bug with parseClass(String) which is that it would be nicer to use a unique hash of the code rather than timestamped names.  That way the application could rely on the GCL caching to be truly helpful.  The trouble is that would depend on including the unique hash of all the dependencies too (like Nixpkg does and as the Gondor grid dataflow processing tool I'm developing will do too).  Could happen for some future Groovy.

As for EMC, I don't have any particular tips but perhaps someone who knows more about that bit will help.

Jim


On Wed, Apr 16, 2014 at 2:48 PM, candrews <[hidden email]> wrote:
Does anyone have any thoughts?

In my opinion, this is a significant issue, and it seems to impact all
versions of Groovy. Fixing this should reduce the memory use of all Grails
applications and probably a reasonable improvement for non-Grails Groovy
ones too (it should reduce memory use in normal use, and fix a case where
application die due to a fatal memory leak).

I'm willing to work on this, but I could really use some guidance but even
general thoughts could be helpful too.

Thanks!



--
View this message in context: http://groovy.329449.n5.nabble.com/Memory-leak-in-ClassInfo-when-using-MetaClasses-tp5719218p5719253.html
Sent from the groovy - dev mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

Jim White
Oh, and another tip.  Run your tests with a smaller max heap so they fail quicker.  For example this:

JAVA_OPTS=-Xmx20m groovy LeakTest.groovy

Makes your original test fail in under 300 iterations for me on my Mac:

$ java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

Jim



On Wed, Apr 16, 2014 at 6:12 PM, Jim White <[hidden email]> wrote:
Hi Craig.

Your previous posts haven't shown up on the list, so my first questions were "this is in reply to what?" and "which (of many) leaks?".  But google knows all:



As you're learning, the problem of leakage in metaobject code is tricky and tough.  Java 8 is even changing the way PermGen is allocated so that at least it won't run out prematurely like it does now.

You can fix one of the leaks in your test by not generating an unbounded number of class names.  The GroovyClassLoader.parseClass(String) method is a convenience method and as you'll notice there is no file name specified, so it makes one up for you.  Of course that will happen every time and the class gets stored under that name which is why you're seeing GCL referencing a ton of classes.  

So the first thing is to use a particular name with the parseClass(String, String) method :

   Class<?> clazz = classLoader.parseClass("print 'hello world'", "TheOneTrueHelloWorld");

If you comment out the call to getExpandoMetaClass(clazz) then you'll see you can run indefinitely.

This does point out what is could be considered something of a bug with parseClass(String) which is that it would be nicer to use a unique hash of the code rather than timestamped names.  That way the application could rely on the GCL caching to be truly helpful.  The trouble is that would depend on including the unique hash of all the dependencies too (like Nixpkg does and as the Gondor grid dataflow processing tool I'm developing will do too).  Could happen for some future Groovy.

As for EMC, I don't have any particular tips but perhaps someone who knows more about that bit will help.

Jim


On Wed, Apr 16, 2014 at 2:48 PM, candrews <[hidden email]> wrote:
Does anyone have any thoughts?

In my opinion, this is a significant issue, and it seems to impact all
versions of Groovy. Fixing this should reduce the memory use of all Grails
applications and probably a reasonable improvement for non-Grails Groovy
ones too (it should reduce memory use in normal use, and fix a case where
application die due to a fatal memory leak).

I'm willing to work on this, but I could really use some guidance but even
general thoughts could be helpful too.

Thanks!



--
View this message in context: http://groovy.329449.n5.nabble.com/Memory-leak-in-ClassInfo-when-using-MetaClasses-tp5719218p5719253.html
Sent from the groovy - dev mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email




Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

Jim White
Ah, I see you've seen the behavior of the generation of unique anonymous class names of GCL.parseClass(String) and propose to change it:


As I alluded to in my previous reply, the class name must be unique wrt to everything that may affect the generated bytecode for the class.  Since that depends on not only the contents of the class source but also all of the classes on which it is dependent.  Since there is no way to for the compiler to know that before compiling the class and the java.lang.ClassLoader API requires the name to be known at loadtime, generating unique names for these "globally anonymous" classes is the only practical solution.  

Your proposal to simply remove the timestamp without providing a new implementation that satisfies the requirements for which it is doing work is not adequate.  There are bound to be some jiras that discuss the timestamp feature and perhaps some alternatives.  You are welcome to find one of those or start a new one if you think there is a better answer or at least there should be a search for one (as I mentioned before, there are purely functional solutions which are superior than this timestamp thing - but they are not a simple delete of a few characters in Groovy's code).

Jim 


On Wed, Apr 16, 2014 at 6:15 PM, Jim White <[hidden email]> wrote:
Oh, and another tip.  Run your tests with a smaller max heap so they fail quicker.  For example this:

JAVA_OPTS=-Xmx20m groovy LeakTest.groovy

Makes your original test fail in under 300 iterations for me on my Mac:

$ java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

Jim



On Wed, Apr 16, 2014 at 6:12 PM, Jim White <[hidden email]> wrote:
Hi Craig.

Your previous posts haven't shown up on the list, so my first questions were "this is in reply to what?" and "which (of many) leaks?".  But google knows all:



As you're learning, the problem of leakage in metaobject code is tricky and tough.  Java 8 is even changing the way PermGen is allocated so that at least it won't run out prematurely like it does now.

You can fix one of the leaks in your test by not generating an unbounded number of class names.  The GroovyClassLoader.parseClass(String) method is a convenience method and as you'll notice there is no file name specified, so it makes one up for you.  Of course that will happen every time and the class gets stored under that name which is why you're seeing GCL referencing a ton of classes.  

So the first thing is to use a particular name with the parseClass(String, String) method :

   Class<?> clazz = classLoader.parseClass("print 'hello world'", "TheOneTrueHelloWorld");

If you comment out the call to getExpandoMetaClass(clazz) then you'll see you can run indefinitely.

This does point out what is could be considered something of a bug with parseClass(String) which is that it would be nicer to use a unique hash of the code rather than timestamped names.  That way the application could rely on the GCL caching to be truly helpful.  The trouble is that would depend on including the unique hash of all the dependencies too (like Nixpkg does and as the Gondor grid dataflow processing tool I'm developing will do too).  Could happen for some future Groovy.

As for EMC, I don't have any particular tips but perhaps someone who knows more about that bit will help.

Jim


On Wed, Apr 16, 2014 at 2:48 PM, candrews <[hidden email]> wrote:
Does anyone have any thoughts?

In my opinion, this is a significant issue, and it seems to impact all
versions of Groovy. Fixing this should reduce the memory use of all Grails
applications and probably a reasonable improvement for non-Grails Groovy
ones too (it should reduce memory use in normal use, and fix a case where
application die due to a fatal memory leak).

I'm willing to work on this, but I could really use some guidance but even
general thoughts could be helpful too.

Thanks!



--
View this message in context: http://groovy.329449.n5.nabble.com/Memory-leak-in-ClassInfo-when-using-MetaClasses-tp5719218p5719253.html
Sent from the groovy - dev mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email





Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

Jochen Theodorou
In reply to this post by candrews
Am 16.04.2014 23:48, schrieb candrews:

> Does anyone have any thoughts?
>
> In my opinion, this is a significant issue, and it seems to impact all
> versions of Groovy. Fixing this should reduce the memory use of all Grails
> applications and probably a reasonable improvement for non-Grails Groovy
> ones too (it should reduce memory use in normal use, and fix a case where
> application die due to a fatal memory leak).
>
> I'm willing to work on this, but I could really use some guidance but even
> general thoughts could be helpful too.


so there are two ways to tackle this. The unique class name is something
Jim explained already and this won't work. The other is the permament
meta class.

Now since there is a call registry.setMetaClass(aClass, emc); Groovy
must keep a hard reference to the meta class, this code forces it to do
so. In theory of course the meta class has to exist only as long as the
class does exist. If you imagine a Java class, we have not really a way
of injecting something into it to force it storing the meta class
itself. JDK7 has ClassValue, that provides that actually, but since we
target JDK5/6 we don't use it yet.

The simple idea I had though is to have a static field in code compiled
by us and then use that to store the meta class. For ClassInfo it would
then in these cases be enough to store them with a weak reference. This
does not solve the problem for all kinds of classes, only for classes
produced by us... but in this case here, this is good enough imho.

What do you think?

bye blackdrag

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email


Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

candrews
Jim White wrote
Hi Craig.

Your previous posts haven't shown up on the list, so my first questions
were "this is in reply to what?" and "which (of many) leaks?".  But google
knows all:

http://groovy.329449.n5.nabble.com/Memory-leak-in-ClassInfo-when-using-MetaClasses-td5719218.html

http://jira.codehaus.org/browse/GROOVY-6704
Sorry about that... I thought I was a list member, but then realized much too late that I was not.

Jim White wrote
As you're learning, the problem of leakage in metaobject code is tricky and
tough.  Java 8 is even changing the way PermGen is allocated so that at
least it won't run out prematurely like it does now.
The memory management changes in Oracle Java 8 will just delay this problem, not prevent it, as you know. I'm also working on IBM Java 6 and the OutOfMemoryError definitely does happen much later than when I run on Oracle < Java 8 (as we would expect).

Jim White wrote
You can fix one of the leaks in your test by not generating an unbounded
number of class names.  The GroovyClassLoader.parseClass(String) method is
a convenience method and as you'll notice there is no file name specified,
so it makes one up for you.  Of course that will happen every time and the
class gets stored under that name which is why you're seeing GCL
referencing a ton of classes.

So the first thing is to use a particular name with the parseClass(String,
String) method :

   Class<?> clazz = classLoader.parseClass("print 'hello world'",
"TheOneTrueHelloWorld");
Since the GroovyClassLoader is created inside the loop, it shouldn't be leaking. If you leave the line as:
and just comment out:
getExpandoMetaClass(clazz)
then you shouldn't get an OutOfMemoryError.
But, in the interest of not confusing things, I think testing with:
   Class<?> clazz = classLoader.parseClass("print 'hello world'",
"TheOneTrueHelloWorld");
as you said is a better test, as it keeps the 2 issues separate. Thanks for pointing that out, I should have presented it that way in the first place.

Jochen Theodorou wrote
Am 16.04.2014 23:48, schrieb candrews:
> Does anyone have any thoughts?
>
> In my opinion, this is a significant issue, and it seems to impact all
> versions of Groovy. Fixing this should reduce the memory use of all Grails
> applications and probably a reasonable improvement for non-Grails Groovy
> ones too (it should reduce memory use in normal use, and fix a case where
> application die due to a fatal memory leak).
>
> I'm willing to work on this, but I could really use some guidance but even
> general thoughts could be helpful too.


so there are two ways to tackle this. The unique class name is something
Jim explained already and this won't work. The other is the permament
meta class.

Now since there is a call registry.setMetaClass(aClass, emc); Groovy
must keep a hard reference to the meta class, this code forces it to do
so. In theory of course the meta class has to exist only as long as the
class does exist. If you imagine a Java class, we have not really a way
of injecting something into it to force it storing the meta class
itself. JDK7 has ClassValue, that provides that actually, but since we
target JDK5/6 we don't use it yet.

The simple idea I had though is to have a static field in code compiled
by us and then use that to store the meta class. For ClassInfo it would
then in these cases be enough to store them with a weak reference. This
does not solve the problem for all kinds of classes, only for classes
produced by us... but in this case here, this is good enough imho.

What do you think?
That sounds like a really great solution... I wish I knew more about how to implement it, as it would be fun to give that a shot. A potentially even better approach would be to use ClassValue (when it's available), fall back to the static field in compiled code at runtime (in cases where that's possible), and finally if neither is doable in a specific case log an message saying so making the user aware that memory leaks may occur for this reason.

"only for classes produced by us" - I was under the impression that ClassInfo only applies to Groovy classes, and that the Groovy compiler/runtime (aka "us") produces all of those. What situation would this solution not work for?
Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

Jochen Theodorou
Am 17.04.2014 20:45, schrieb candrews:
[...]
> Jochen Theodorou wrote
[...]

>> The simple idea I had though is to have a static field in code compiled
>> by us and then use that to store the meta class. For ClassInfo it would
>> then in these cases be enough to store them with a weak reference. This
>> does not solve the problem for all kinds of classes, only for classes
>> produced by us... but in this case here, this is good enough imho.
>>
>> What do you think?
>
> That sounds like a really great solution... I wish I knew more about how to
> implement it, as it would be fun to give that a shot. A potentially even
> better approach would be to use ClassValue (when it's available), fall back
> to the static field in compiled code at runtime (in cases where that's
> possible), and finally if neither is doable in a specific case log an
> message saying so making the user aware that memory leaks may occur for this
> reason.

any Java class is such a case... won't do good to log that ;)

You can make a proof of concept if you dare and just use a script with a
static field where the meta class will be stored in by the runtime. For
a normal Groovy class this can be done. I won't say it is easy to do,
but maybe worth a try.

> "only for classes produced by us" - I was under the impression that
> ClassInfo only applies to Groovy classes, and that the Groovy
> compiler/runtime (aka "us") produces all of those. What situation would this
> solution not work for?

ClassInfo is similar to MetaClass, a runtime companion to a Class. It
stores information about methods and properties. If you for example do
"ffoo".toUppercase() in Groovy, you need the meta class of String, and
with that the ClassInfo of String, or else you won't be able to make the
method call. Being able to use ExpandoMetaClass to add methods to for
example Integer should be proof enough to know, that this system is not
limited to Groovy classes. I am pretty sure, that your GSP will require
quite many meta classes for classes from java.lang

bye blackdrag

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email


Reply | Threaded
Open this post in threaded view
|

Re: Memory leak in ClassInfo when using MetaClasses

candrews
I started working on implementing this change... and I'm having some trouble. Many of the tests fail, so that's a decent indicator (in my experience) that I'm doing something wrong :-)

I'm using a backport of ClassValue to Java 6 that I found at https://code.google.com/p/jvm-language-runtime/source/browse/trunk/invokedynamic-backport/src/jsr292/java/lang/ClassValue.java I believe the license is LGPLv2, and since Groovy is Apache license, that's a problem. If my approach is found to be sound, I'll contact the author to see if he'll re-license it. But for now, I figured it's okay to use it my experiments.

BTW,  the tests fail in the same way using the backport ClassValue as java.lang.ClassValue.

My intent is to eventually switch between backport ClassValue and java.lang.ClassValue based on a runtime determination of if java.lang.ClassValue is available.

I changed Groovy's ClassInfo class to use ClassValue, basically replacing globalClassSet with it. Then I had to implement some other things to keep track of ClassInfo's to continue to support the existing public methods.

Would someone please take a look at my work, provide some feedback, and perhaps give some thoughts on where I'm going to horribly wrong regarding the test failures? :)

https://github.com/candrews/groovy-core/tree/ClassValue

Thanks!
1234