Performance of a simple script

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance of a simple script

Martin C. Martin-2
Hi,

I have a simple script to get a histogram of values in an XML document.
  In Groovy, I do this using:

// export CLASSPATH=".:bin-groovy:xalan.jar"

import org.apache.xpath.XPathAPI
import groovy.xml.DOMBuilder

hist = [:]

use (XPathAPI) {
    new File("qpxd.log.2007-01-07T12_00_00").eachLine { line ->
       if (line.indexOf('<QPXQuery') > 0) {
          def xml = line[line.indexOf('<QPXQuery') ..<
line.lastIndexOf(']')].replace(/\n/, "\n")
          def doc = DOMBuilder.parse(new StringReader(xml)).documentElement
          def names = doc.selectNodeList('//Foo//Bar/text()').collect {
it.nodeValue }
          names.each { if (hist.containsKey(it)) hist[it] += 1; else
hist[it] = 1 }
       }
    }
    println hist
}

And it takes 128 seconds on my machine.  The equivalent thing in Python
takes only 20 seconds:

import libxml2

hist = {}

for line in file("qpxd.log.2007-01-07T12_00_00"):
    if line.find("<QPXQuery") >= 0:
        xmltext =
line[line.find("<QPXQuery"):line.rfind("]")].replace(r'\n', '\n')
        xml = libxml2.parseDoc(xmltext)
        codes = [code.getContent() for code in
xml.xpathEval('//Foo//Bar/text()')]
        for c in codes:
            hist[c] = hist.get(c, 0) + 1
        xml.free()

print hist

Any idea why the Groovy takes so much longer?  The output from -Xprof is
below.  The single biggest "hot spot" is
java.security.AccessController.doPrivileged, with
java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we getting
Boolean attributes on a UnixFileSystem so often?

Java 1.5.0_07 on Linux (FC3).

Best,
Martin


Flat profile of 128.68 secs (7803 total ticks): main

   Interpreted + native   Method
   1.3%     0  +   101    java.lang.Throwable.fillInStackTrace
   0.7%    51  +     0    org.apache.xpath.VariableStack.reset
   0.2%     0  +    13    java.lang.Object.clone
   0.2%     8  +     4    java.security.AccessController.doPrivileged
   0.1%     0  +     8    java.lang.System.arraycopy
   0.1%     8  +     0    org.apache.xpath.compiler.OpMapVector.<init>
   0.1%     0  +     7    java.lang.Class.getInterfaces
   0.1%     3  +     4    java.security.AccessController.doPrivileged
   0.1%     1  +     5    java.io.UnixFileSystem.getBooleanAttributes0
   0.1%     6  +     0    org.apache.xerces.util.SymbolTable$Entry.<init>
   0.1%     0  +     5    java.lang.ClassLoader.findLoadedClass0
   0.1%     5  +     0
org.apache.xerces.impl.XMLEntityManager$CharacterBuffer.<init>
   0.1%     4  +     0    org.apache.xerces.util.SymbolTable.<init>
   0.0%     0  +     3    java.lang.Object.getClass
   0.0%     3  +     0    java.nio.HeapByteBuffer.<init>
   0.0%     3  +     0    java.io.FileInputStream.read
   0.0%     3  +     0
org.apache.xerces.parsers.SecuritySupport.getFileExists
   0.0%     3  +     0    org.apache.xpath.VariableStack.<init>
   0.0%     3  +     0    org.apache.xml.utils.ObjectVector.<init>
   0.0%     3  +     0    java.io.BufferedReader.<init>
   0.0%     3  +     0    java.util.regex.Matcher.<init>
   0.0%     0  +     2    java.lang.Thread.currentThread
   0.0%     0  +     2    java.lang.Object.hashCode
   0.0%     0  +     2    java.util.zip.Inflater.reset
   0.0%     0  +     2    java.util.zip.ZipEntry.initFields
   4.4%   171  +   174    Total interpreted (including elided)

      Compiled + native   Method
   3.4%   213  +    55    org.apache.xpath.XPathContext.<init>
   3.4%   266  +     0    java.util.regex.Matcher.replaceAll
   3.1%   240  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.addNode
   2.6%   200  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.nextNode
   2.3%   183  +     0    sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
   2.2%   173  +     1
org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity
   2.1%   161  +     0    org.apache.xpath.patterns.NodeTest.execute
   1.4%   112  +     0    java.io.BufferedReader.readLine
   1.4%   108  +     1    java.util.regex.Matcher.appendReplacement
   1.2%    93  +     0    java.lang.StringBuffer.append
   1.1%    89  +     0    org.apache.xpath.axes.AxesWalker.nextNode
   1.1%    88  +     0    org.apache.xerces.impl.XMLEntityScanner.scanQName
   1.1%    82  +     0
org.apache.xerces.dom.ElementNSImpl.getNamespaceURI
   0.8%    60  +     0
org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement
   0.7%    55  +     0
org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren
   0.7%    52  +     0
org.apache.xml.dtm.ref.DTMDefaultBaseTraversers$DescendantTraverser.next
   0.7%    52  +     0
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
   0.6%    44  +     0
org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString
   0.6%    43  +     0
org.apache.xerces.util.ParserConfigurationSettings.addRecognizedFeatures
   0.6%    43  +     0    java.net.URLStreamHandler.parseURL
   0.5%    42  +     0    org.apache.xml.dtm.ref.DTMDefaultBase.indexNode
   0.5%    41  +     0    adapters
   0.5%    41  +     0    java.lang.String.indexOf
   0.5%    41  +     0    groovy.lang.MetaClassImpl.chooseMethod
   0.5%    39  +     0    groovy.lang.MetaClassImpl.getMethodWithoutCaching
  59.3%  4493  +   124    Total compiled (including elided)

          Stub + native   Method
   5.0%   218  +   174    java.security.AccessController.doPrivileged
   4.5%    23  +   325    java.io.UnixFileSystem.getBooleanAttributes0
   4.0%     0  +   309    java.lang.String.intern
   2.1%     0  +   165    java.util.zip.ZipFile.getEntry
   1.9%     0  +   147    java.lang.Object.clone
   1.6%     0  +   125    java.lang.Object.hashCode
   1.4%     0  +   108    java.util.zip.Inflater.inflateBytes
   1.2%     0  +    91    java.lang.ClassLoader.findLoadedClass0
   1.0%     0  +    79    java.lang.Class.isAssignableFrom
   1.0%     0  +    77    java.lang.Throwable.fillInStackTrace
   0.7%     0  +    58    java.lang.Class.isArray
   0.5%     0  +    36    java.lang.Class.isPrimitive
   0.5%    17  +    19    java.security.AccessController.doPrivileged
   0.3%    12  +     9    java.security.AccessController.doPrivileged
   0.2%     0  +    15    java.lang.Class.isInterface
   0.2%     0  +    13    java.util.zip.Inflater.reset
   0.2%     0  +    13    java.util.zip.ZipEntry.initFields
   0.1%     0  +    11    java.util.zip.ZipFile.freeEntry
   0.1%     0  +    11    java.nio.Bits.copyToByteArray
   0.1%     0  +     5    java.lang.Class.getSuperclass
   0.1%     0  +     5    java.util.zip.ZipFile.getCSize
   0.1%     0  +     4    java.util.zip.Inflater.getBytesWritten
   0.0%     0  +     3    java.lang.Class.getClassLoader0
   0.0%     0  +     3    java.util.zip.ZipFile.getMethod
   0.0%     0  +     2    java.util.zip.ZipFile.getEntryOffset
  26.7%   270  +  1808    Total stub (including elided)

   Thread-local ticks:
   0.2%    14             Blocked (of total)
   8.6%   670             Interpreter
   0.0%     2             Compilation
   1.0%    77             Unknown: thread_state


Flat profile of 0.02 secs (1 total ticks): DestroyJavaVM

   Thread-local ticks:
100.0%     1             Blocked (of total)


Global summary of 128.73 seconds:
100.0%  7859             Received ticks
   0.6%    49             Received GC ticks
   3.9%   306             Compilation
   0.1%     4             Other VM operations
   8.5%   670             Interpreter
   1.0%    77             Unknown code

real    2m8.865s
user    2m7.864s
sys     0m4.544s


---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email

Reply | Threaded
Open this post in threaded view
|

Re: Performance of a simple script

graemer
JVM load time?

Cheers

On 1/17/07, Martin C. Martin <[hidden email]> wrote:

> Hi,
>
> I have a simple script to get a histogram of values in an XML document.
>   In Groovy, I do this using:
>
> // export CLASSPATH=".:bin-groovy:xalan.jar"
>
> import org.apache.xpath.XPathAPI
> import groovy.xml.DOMBuilder
>
> hist = [:]
>
> use (XPathAPI) {
>     new File("qpxd.log.2007-01-07T12_00_00").eachLine { line ->
>        if (line.indexOf('<QPXQuery') > 0) {
>           def xml = line[line.indexOf('<QPXQuery') ..<
> line.lastIndexOf(']')].replace(/\n/, "\n")
>           def doc = DOMBuilder.parse(new StringReader(xml)).documentElement
>           def names = doc.selectNodeList('//Foo//Bar/text()').collect {
> it.nodeValue }
>           names.each { if (hist.containsKey(it)) hist[it] += 1; else
> hist[it] = 1 }
>        }
>     }
>     println hist
> }
>
> And it takes 128 seconds on my machine.  The equivalent thing in Python
> takes only 20 seconds:
>
> import libxml2
>
> hist = {}
>
> for line in file("qpxd.log.2007-01-07T12_00_00"):
>     if line.find("<QPXQuery") >= 0:
>         xmltext =
> line[line.find("<QPXQuery"):line.rfind("]")].replace(r'\n', '\n')
>         xml = libxml2.parseDoc(xmltext)
>         codes = [code.getContent() for code in
> xml.xpathEval('//Foo//Bar/text()')]
>         for c in codes:
>             hist[c] = hist.get(c, 0) + 1
>         xml.free()
>
> print hist
>
> Any idea why the Groovy takes so much longer?  The output from -Xprof is
> below.  The single biggest "hot spot" is
> java.security.AccessController.doPrivileged, with
> java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we getting
> Boolean attributes on a UnixFileSystem so often?
>
> Java 1.5.0_07 on Linux (FC3).
>
> Best,
> Martin
>
>
> Flat profile of 128.68 secs (7803 total ticks): main
>
>    Interpreted + native   Method
>    1.3%     0  +   101    java.lang.Throwable.fillInStackTrace
>    0.7%    51  +     0    org.apache.xpath.VariableStack.reset
>    0.2%     0  +    13    java.lang.Object.clone
>    0.2%     8  +     4    java.security.AccessController.doPrivileged
>    0.1%     0  +     8    java.lang.System.arraycopy
>    0.1%     8  +     0    org.apache.xpath.compiler.OpMapVector.<init>
>    0.1%     0  +     7    java.lang.Class.getInterfaces
>    0.1%     3  +     4    java.security.AccessController.doPrivileged
>    0.1%     1  +     5    java.io.UnixFileSystem.getBooleanAttributes0
>    0.1%     6  +     0    org.apache.xerces.util.SymbolTable$Entry.<init>
>    0.1%     0  +     5    java.lang.ClassLoader.findLoadedClass0
>    0.1%     5  +     0
> org.apache.xerces.impl.XMLEntityManager$CharacterBuffer.<init>
>    0.1%     4  +     0    org.apache.xerces.util.SymbolTable.<init>
>    0.0%     0  +     3    java.lang.Object.getClass
>    0.0%     3  +     0    java.nio.HeapByteBuffer.<init>
>    0.0%     3  +     0    java.io.FileInputStream.read
>    0.0%     3  +     0
> org.apache.xerces.parsers.SecuritySupport.getFileExists
>    0.0%     3  +     0    org.apache.xpath.VariableStack.<init>
>    0.0%     3  +     0    org.apache.xml.utils.ObjectVector.<init>
>    0.0%     3  +     0    java.io.BufferedReader.<init>
>    0.0%     3  +     0    java.util.regex.Matcher.<init>
>    0.0%     0  +     2    java.lang.Thread.currentThread
>    0.0%     0  +     2    java.lang.Object.hashCode
>    0.0%     0  +     2    java.util.zip.Inflater.reset
>    0.0%     0  +     2    java.util.zip.ZipEntry.initFields
>    4.4%   171  +   174    Total interpreted (including elided)
>
>       Compiled + native   Method
>    3.4%   213  +    55    org.apache.xpath.XPathContext.<init>
>    3.4%   266  +     0    java.util.regex.Matcher.replaceAll
>    3.1%   240  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.addNode
>    2.6%   200  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.nextNode
>    2.3%   183  +     0    sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
>    2.2%   173  +     1
> org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity
>    2.1%   161  +     0    org.apache.xpath.patterns.NodeTest.execute
>    1.4%   112  +     0    java.io.BufferedReader.readLine
>    1.4%   108  +     1    java.util.regex.Matcher.appendReplacement
>    1.2%    93  +     0    java.lang.StringBuffer.append
>    1.1%    89  +     0    org.apache.xpath.axes.AxesWalker.nextNode
>    1.1%    88  +     0    org.apache.xerces.impl.XMLEntityScanner.scanQName
>    1.1%    82  +     0
> org.apache.xerces.dom.ElementNSImpl.getNamespaceURI
>    0.8%    60  +     0
> org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement
>    0.7%    55  +     0
> org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren
>    0.7%    52  +     0
> org.apache.xml.dtm.ref.DTMDefaultBaseTraversers$DescendantTraverser.next
>    0.7%    52  +     0
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
>    0.6%    44  +     0
> org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString
>    0.6%    43  +     0
> org.apache.xerces.util.ParserConfigurationSettings.addRecognizedFeatures
>    0.6%    43  +     0    java.net.URLStreamHandler.parseURL
>    0.5%    42  +     0    org.apache.xml.dtm.ref.DTMDefaultBase.indexNode
>    0.5%    41  +     0    adapters
>    0.5%    41  +     0    java.lang.String.indexOf
>    0.5%    41  +     0    groovy.lang.MetaClassImpl.chooseMethod
>    0.5%    39  +     0    groovy.lang.MetaClassImpl.getMethodWithoutCaching
>   59.3%  4493  +   124    Total compiled (including elided)
>
>           Stub + native   Method
>    5.0%   218  +   174    java.security.AccessController.doPrivileged
>    4.5%    23  +   325    java.io.UnixFileSystem.getBooleanAttributes0
>    4.0%     0  +   309    java.lang.String.intern
>    2.1%     0  +   165    java.util.zip.ZipFile.getEntry
>    1.9%     0  +   147    java.lang.Object.clone
>    1.6%     0  +   125    java.lang.Object.hashCode
>    1.4%     0  +   108    java.util.zip.Inflater.inflateBytes
>    1.2%     0  +    91    java.lang.ClassLoader.findLoadedClass0
>    1.0%     0  +    79    java.lang.Class.isAssignableFrom
>    1.0%     0  +    77    java.lang.Throwable.fillInStackTrace
>    0.7%     0  +    58    java.lang.Class.isArray
>    0.5%     0  +    36    java.lang.Class.isPrimitive
>    0.5%    17  +    19    java.security.AccessController.doPrivileged
>    0.3%    12  +     9    java.security.AccessController.doPrivileged
>    0.2%     0  +    15    java.lang.Class.isInterface
>    0.2%     0  +    13    java.util.zip.Inflater.reset
>    0.2%     0  +    13    java.util.zip.ZipEntry.initFields
>    0.1%     0  +    11    java.util.zip.ZipFile.freeEntry
>    0.1%     0  +    11    java.nio.Bits.copyToByteArray
>    0.1%     0  +     5    java.lang.Class.getSuperclass
>    0.1%     0  +     5    java.util.zip.ZipFile.getCSize
>    0.1%     0  +     4    java.util.zip.Inflater.getBytesWritten
>    0.0%     0  +     3    java.lang.Class.getClassLoader0
>    0.0%     0  +     3    java.util.zip.ZipFile.getMethod
>    0.0%     0  +     2    java.util.zip.ZipFile.getEntryOffset
>   26.7%   270  +  1808    Total stub (including elided)
>
>    Thread-local ticks:
>    0.2%    14             Blocked (of total)
>    8.6%   670             Interpreter
>    0.0%     2             Compilation
>    1.0%    77             Unknown: thread_state
>
>
> Flat profile of 0.02 secs (1 total ticks): DestroyJavaVM
>
>    Thread-local ticks:
> 100.0%     1             Blocked (of total)
>
>
> Global summary of 128.73 seconds:
> 100.0%  7859             Received ticks
>    0.6%    49             Received GC ticks
>    3.9%   306             Compilation
>    0.1%     4             Other VM operations
>    8.5%   670             Interpreter
>    1.0%    77             Unknown code
>
> real    2m8.865s
> user    2m7.864s
> sys     0m4.544s
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this list please visit:
>
>     http://xircles.codehaus.org/manage_email
>
>


--
Graeme Rocher
Grails Project Lead
http://grails.org

---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email

Reply | Threaded
Open this post in threaded view
|

Re: Performance of a simple script

tugwilson
In reply to this post by Martin C. Martin-2

On 17 Jan 2007, at 17:13, Martin C. Martin wrote:

> Any idea why the Groovy takes so much longer?  The output from -
> Xprof is below.  The single biggest "hot spot" is  
> java.security.AccessController.doPrivileged, with  
> java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we  
> getting Boolean attributes on a UnixFileSystem so often?


java.security.AccessController.doPrivileged is almost certainly  
method dispatch via reflection. It's a prime target for optimisation.

java.io.UnixFileSystem.getBooleanAttributes0 - no idea I'm afraid.

libxml2 is very fast C code so I would expect the Python time to be  
good.

JVM and groovy start up takes quite a while too.


John Wilson
The Wilson Partnership
web http://www.wilson.co.uk
blog http://eek.ook.org



---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email

Reply | Threaded
Open this post in threaded view
|

Re: Performance of a simple script

Martin C. Martin-2
Thanks.  I don't think the XML parsing time is dominating the Groovy, as
you said it's probably method dispatch.

JVM and Groovy startup certainly take a little time, but hopefully not
90 seconds.  :)

Best,
Martin

John Wilson wrote:

>
> On 17 Jan 2007, at 17:13, Martin C. Martin wrote:
>
>> Any idea why the Groovy takes so much longer?  The output from -Xprof
>> is below.  The single biggest "hot spot" is
>> java.security.AccessController.doPrivileged, with
>> java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we
>> getting Boolean attributes on a UnixFileSystem so often?
>
>
> java.security.AccessController.doPrivileged is almost certainly method
> dispatch via reflection. It's a prime target for optimisation.
>
> java.io.UnixFileSystem.getBooleanAttributes0 - no idea I'm afraid.
>
> libxml2 is very fast C code so I would expect the Python time to be good.
>
> JVM and groovy start up takes quite a while too.
>
>
> John Wilson
> The Wilson Partnership
> web http://www.wilson.co.uk
> blog http://eek.ook.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this list please visit:
>
>    http://xircles.codehaus.org/manage_email
>

---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email

Reply | Threaded
Open this post in threaded view
|

Re: Performance of a simple script

Alex Popescu
In reply to this post by Martin C. Martin-2
On 1/17/07, Martin C. Martin <[hidden email]> wrote:

> Hi,
>
> I have a simple script to get a histogram of values in an XML document.
>   In Groovy, I do this using:
>
> // export CLASSPATH=".:bin-groovy:xalan.jar"
>
> import org.apache.xpath.XPathAPI
> import groovy.xml.DOMBuilder
>
> hist = [:]
>
> use (XPathAPI) {
>     new File("qpxd.log.2007-01-07T12_00_00").eachLine { line ->
>        if (line.indexOf('<QPXQuery') > 0) {
>           def xml = line[line.indexOf('<QPXQuery') ..<
> line.lastIndexOf(']')].replace(/\n/, "\n")
>           def doc = DOMBuilder.parse(new StringReader(xml)).documentElement
>           def names = doc.selectNodeList('//Foo//Bar/text()').collect {
> it.nodeValue }
>           names.each { if (hist.containsKey(it)) hist[it] += 1; else
> hist[it] = 1 }
>        }
>     }
>     println hist
> }
>
> And it takes 128 seconds on my machine.  The equivalent thing in Python
> takes only 20 seconds:
>
> import libxml2
>
> hist = {}
>
> for line in file("qpxd.log.2007-01-07T12_00_00"):
>     if line.find("<QPXQuery") >= 0:
>         xmltext =
> line[line.find("<QPXQuery"):line.rfind("]")].replace(r'\n', '\n')
>         xml = libxml2.parseDoc(xmltext)
>         codes = [code.getContent() for code in
> xml.xpathEval('//Foo//Bar/text()')]
>         for c in codes:
>             hist[c] = hist.get(c, 0) + 1
>         xml.free()
>
> print hist
>
> Any idea why the Groovy takes so much longer?  The output from -Xprof is
> below.  The single biggest "hot spot" is
> java.security.AccessController.doPrivileged, with
> java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we getting
> Boolean attributes on a UnixFileSystem so often?
>
> Java 1.5.0_07 on Linux (FC3).
>
> Best,
> Martin
>
>
> Flat profile of 128.68 secs (7803 total ticks): main
>
>    Interpreted + native   Method
>    1.3%     0  +   101    java.lang.Throwable.fillInStackTrace
>    0.7%    51  +     0    org.apache.xpath.VariableStack.reset
>    0.2%     0  +    13    java.lang.Object.clone
>    0.2%     8  +     4    java.security.AccessController.doPrivileged
>    0.1%     0  +     8    java.lang.System.arraycopy
>    0.1%     8  +     0    org.apache.xpath.compiler.OpMapVector.<init>
>    0.1%     0  +     7    java.lang.Class.getInterfaces
>    0.1%     3  +     4    java.security.AccessController.doPrivileged
>    0.1%     1  +     5    java.io.UnixFileSystem.getBooleanAttributes0
>    0.1%     6  +     0    org.apache.xerces.util.SymbolTable$Entry.<init>
>    0.1%     0  +     5    java.lang.ClassLoader.findLoadedClass0
>    0.1%     5  +     0
> org.apache.xerces.impl.XMLEntityManager$CharacterBuffer.<init>
>    0.1%     4  +     0    org.apache.xerces.util.SymbolTable.<init>
>    0.0%     0  +     3    java.lang.Object.getClass
>    0.0%     3  +     0    java.nio.HeapByteBuffer.<init>
>    0.0%     3  +     0    java.io.FileInputStream.read
>    0.0%     3  +     0
> org.apache.xerces.parsers.SecuritySupport.getFileExists
>    0.0%     3  +     0    org.apache.xpath.VariableStack.<init>
>    0.0%     3  +     0    org.apache.xml.utils.ObjectVector.<init>
>    0.0%     3  +     0    java.io.BufferedReader.<init>
>    0.0%     3  +     0    java.util.regex.Matcher.<init>
>    0.0%     0  +     2    java.lang.Thread.currentThread
>    0.0%     0  +     2    java.lang.Object.hashCode
>    0.0%     0  +     2    java.util.zip.Inflater.reset
>    0.0%     0  +     2    java.util.zip.ZipEntry.initFields
>    4.4%   171  +   174    Total interpreted (including elided)
>
>       Compiled + native   Method
>    3.4%   213  +    55    org.apache.xpath.XPathContext.<init>
>    3.4%   266  +     0    java.util.regex.Matcher.replaceAll
>    3.1%   240  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.addNode
>    2.6%   200  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.nextNode
>    2.3%   183  +     0    sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
>    2.2%   173  +     1
> org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity
>    2.1%   161  +     0    org.apache.xpath.patterns.NodeTest.execute
>    1.4%   112  +     0    java.io.BufferedReader.readLine
>    1.4%   108  +     1    java.util.regex.Matcher.appendReplacement
>    1.2%    93  +     0    java.lang.StringBuffer.append
>    1.1%    89  +     0    org.apache.xpath.axes.AxesWalker.nextNode
>    1.1%    88  +     0    org.apache.xerces.impl.XMLEntityScanner.scanQName
>    1.1%    82  +     0
> org.apache.xerces.dom.ElementNSImpl.getNamespaceURI
>    0.8%    60  +     0
> org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement
>    0.7%    55  +     0
> org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren
>    0.7%    52  +     0
> org.apache.xml.dtm.ref.DTMDefaultBaseTraversers$DescendantTraverser.next
>    0.7%    52  +     0
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
>    0.6%    44  +     0
> org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString
>    0.6%    43  +     0
> org.apache.xerces.util.ParserConfigurationSettings.addRecognizedFeatures
>    0.6%    43  +     0    java.net.URLStreamHandler.parseURL
>    0.5%    42  +     0    org.apache.xml.dtm.ref.DTMDefaultBase.indexNode
>    0.5%    41  +     0    adapters
>    0.5%    41  +     0    java.lang.String.indexOf
>    0.5%    41  +     0    groovy.lang.MetaClassImpl.chooseMethod
>    0.5%    39  +     0    groovy.lang.MetaClassImpl.getMethodWithoutCaching
>   59.3%  4493  +   124    Total compiled (including elided)
>
>           Stub + native   Method
>    5.0%   218  +   174    java.security.AccessController.doPrivileged
>    4.5%    23  +   325    java.io.UnixFileSystem.getBooleanAttributes0
>    4.0%     0  +   309    java.lang.String.intern
>    2.1%     0  +   165    java.util.zip.ZipFile.getEntry
>    1.9%     0  +   147    java.lang.Object.clone
>    1.6%     0  +   125    java.lang.Object.hashCode
>    1.4%     0  +   108    java.util.zip.Inflater.inflateBytes
>    1.2%     0  +    91    java.lang.ClassLoader.findLoadedClass0
>    1.0%     0  +    79    java.lang.Class.isAssignableFrom
>    1.0%     0  +    77    java.lang.Throwable.fillInStackTrace
>    0.7%     0  +    58    java.lang.Class.isArray
>    0.5%     0  +    36    java.lang.Class.isPrimitive
>    0.5%    17  +    19    java.security.AccessController.doPrivileged
>    0.3%    12  +     9    java.security.AccessController.doPrivileged
>    0.2%     0  +    15    java.lang.Class.isInterface
>    0.2%     0  +    13    java.util.zip.Inflater.reset
>    0.2%     0  +    13    java.util.zip.ZipEntry.initFields
>    0.1%     0  +    11    java.util.zip.ZipFile.freeEntry
>    0.1%     0  +    11    java.nio.Bits.copyToByteArray
>    0.1%     0  +     5    java.lang.Class.getSuperclass
>    0.1%     0  +     5    java.util.zip.ZipFile.getCSize
>    0.1%     0  +     4    java.util.zip.Inflater.getBytesWritten
>    0.0%     0  +     3    java.lang.Class.getClassLoader0
>    0.0%     0  +     3    java.util.zip.ZipFile.getMethod
>    0.0%     0  +     2    java.util.zip.ZipFile.getEntryOffset
>   26.7%   270  +  1808    Total stub (including elided)
>
>    Thread-local ticks:
>    0.2%    14             Blocked (of total)
>    8.6%   670             Interpreter
>    0.0%     2             Compilation
>    1.0%    77             Unknown: thread_state
>
>
> Flat profile of 0.02 secs (1 total ticks): DestroyJavaVM
>
>    Thread-local ticks:
> 100.0%     1             Blocked (of total)
>
>
> Global summary of 128.73 seconds:
> 100.0%  7859             Received ticks
>    0.6%    49             Received GC ticks
>    3.9%   306             Compilation
>    0.1%     4             Other VM operations
>    8.5%   670             Interpreter
>    1.0%    77             Unknown code
>

I am not sure I am correctly interpretting the data but from the above
table I would say that real execution time was only 8.5% + 3.9% which
makes it less than python execution. By the way what have you used to
generate this output?

TIA,

./alex
--
.w( the_mindstorm )p.


> real    2m8.865s
> user    2m7.864s
> sys     0m4.544s
>
>
> ---------------------------------------------------------------------
> To unsubscribe from this list please visit:
>
>     http://xircles.codehaus.org/manage_email
>
>

---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email

Reply | Threaded
Open this post in threaded view
|

Re: Performance of a simple script

Martin C. Martin-2
I used Sun's profiler, by running the script like this:

JAVA_OPTS=-Xprof groovy myscript.groovy

I'm not sure what "Compiled" means, I assume that's the JIT compiler, as
opposed to the bytecode interpreter.

I'm pretty sure that 8.5% + 3.9% doesn't include native and compiled
time, but it's hard to tell.

Best,
Martin

Alexandru Popescu wrote:

> On 1/17/07, Martin C. Martin <[hidden email]> wrote:
>> Hi,
>>
>> I have a simple script to get a histogram of values in an XML document.
>>   In Groovy, I do this using:
>>
>> // export CLASSPATH=".:bin-groovy:xalan.jar"
>>
>> import org.apache.xpath.XPathAPI
>> import groovy.xml.DOMBuilder
>>
>> hist = [:]
>>
>> use (XPathAPI) {
>>     new File("qpxd.log.2007-01-07T12_00_00").eachLine { line ->
>>        if (line.indexOf('<QPXQuery') > 0) {
>>           def xml = line[line.indexOf('<QPXQuery') ..<
>> line.lastIndexOf(']')].replace(/\n/, "\n")
>>           def doc = DOMBuilder.parse(new
>> StringReader(xml)).documentElement
>>           def names = doc.selectNodeList('//Foo//Bar/text()').collect {
>> it.nodeValue }
>>           names.each { if (hist.containsKey(it)) hist[it] += 1; else
>> hist[it] = 1 }
>>        }
>>     }
>>     println hist
>> }
>>
>> And it takes 128 seconds on my machine.  The equivalent thing in Python
>> takes only 20 seconds:
>>
>> import libxml2
>>
>> hist = {}
>>
>> for line in file("qpxd.log.2007-01-07T12_00_00"):
>>     if line.find("<QPXQuery") >= 0:
>>         xmltext =
>> line[line.find("<QPXQuery"):line.rfind("]")].replace(r'\n', '\n')
>>         xml = libxml2.parseDoc(xmltext)
>>         codes = [code.getContent() for code in
>> xml.xpathEval('//Foo//Bar/text()')]
>>         for c in codes:
>>             hist[c] = hist.get(c, 0) + 1
>>         xml.free()
>>
>> print hist
>>
>> Any idea why the Groovy takes so much longer?  The output from -Xprof is
>> below.  The single biggest "hot spot" is
>> java.security.AccessController.doPrivileged, with
>> java.io.UnixFileSystem.getBooleanAttributes0 second.  Why are we getting
>> Boolean attributes on a UnixFileSystem so often?
>>
>> Java 1.5.0_07 on Linux (FC3).
>>
>> Best,
>> Martin
>>
>>
>> Flat profile of 128.68 secs (7803 total ticks): main
>>
>>    Interpreted + native   Method
>>    1.3%     0  +   101    java.lang.Throwable.fillInStackTrace
>>    0.7%    51  +     0    org.apache.xpath.VariableStack.reset
>>    0.2%     0  +    13    java.lang.Object.clone
>>    0.2%     8  +     4    java.security.AccessController.doPrivileged
>>    0.1%     0  +     8    java.lang.System.arraycopy
>>    0.1%     8  +     0    org.apache.xpath.compiler.OpMapVector.<init>
>>    0.1%     0  +     7    java.lang.Class.getInterfaces
>>    0.1%     3  +     4    java.security.AccessController.doPrivileged
>>    0.1%     1  +     5    java.io.UnixFileSystem.getBooleanAttributes0
>>    0.1%     6  +     0    org.apache.xerces.util.SymbolTable$Entry.<init>
>>    0.1%     0  +     5    java.lang.ClassLoader.findLoadedClass0
>>    0.1%     5  +     0
>> org.apache.xerces.impl.XMLEntityManager$CharacterBuffer.<init>
>>    0.1%     4  +     0    org.apache.xerces.util.SymbolTable.<init>
>>    0.0%     0  +     3    java.lang.Object.getClass
>>    0.0%     3  +     0    java.nio.HeapByteBuffer.<init>
>>    0.0%     3  +     0    java.io.FileInputStream.read
>>    0.0%     3  +     0
>> org.apache.xerces.parsers.SecuritySupport.getFileExists
>>    0.0%     3  +     0    org.apache.xpath.VariableStack.<init>
>>    0.0%     3  +     0    org.apache.xml.utils.ObjectVector.<init>
>>    0.0%     3  +     0    java.io.BufferedReader.<init>
>>    0.0%     3  +     0    java.util.regex.Matcher.<init>
>>    0.0%     0  +     2    java.lang.Thread.currentThread
>>    0.0%     0  +     2    java.lang.Object.hashCode
>>    0.0%     0  +     2    java.util.zip.Inflater.reset
>>    0.0%     0  +     2    java.util.zip.ZipEntry.initFields
>>    4.4%   171  +   174    Total interpreted (including elided)
>>
>>       Compiled + native   Method
>>    3.4%   213  +    55    org.apache.xpath.XPathContext.<init>
>>    3.4%   266  +     0    java.util.regex.Matcher.replaceAll
>>    3.1%   240  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.addNode
>>    2.6%   200  +     0    org.apache.xml.dtm.ref.dom2dtm.DOM2DTM.nextNode
>>    2.3%   183  +     0    sun.nio.cs.UTF_8$Decoder.decodeArrayLoop
>>    2.2%   173  +     1
>> org.apache.xerces.dom.DeferredDocumentImpl.ensureCapacity
>>    2.1%   161  +     0    org.apache.xpath.patterns.NodeTest.execute
>>    1.4%   112  +     0    java.io.BufferedReader.readLine
>>    1.4%   108  +     1    java.util.regex.Matcher.appendReplacement
>>    1.2%    93  +     0    java.lang.StringBuffer.append
>>    1.1%    89  +     0    org.apache.xpath.axes.AxesWalker.nextNode
>>    1.1%    88  +     0    
>> org.apache.xerces.impl.XMLEntityScanner.scanQName
>>    1.1%    82  +     0
>> org.apache.xerces.dom.ElementNSImpl.getNamespaceURI
>>    0.8%    60  +     0
>> org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement
>>    0.7%    55  +     0
>> org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren
>>    0.7%    52  +     0
>> org.apache.xml.dtm.ref.DTMDefaultBaseTraversers$DescendantTraverser.next
>>    0.7%    52  +     0
>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
>>
>>    0.6%    44  +     0
>> org.apache.xerces.dom.DeferredDocumentImpl.getNodeValueString
>>    0.6%    43  +     0
>> org.apache.xerces.util.ParserConfigurationSettings.addRecognizedFeatures
>>    0.6%    43  +     0    java.net.URLStreamHandler.parseURL
>>    0.5%    42  +     0    org.apache.xml.dtm.ref.DTMDefaultBase.indexNode
>>    0.5%    41  +     0    adapters
>>    0.5%    41  +     0    java.lang.String.indexOf
>>    0.5%    41  +     0    groovy.lang.MetaClassImpl.chooseMethod
>>    0.5%    39  +     0    
>> groovy.lang.MetaClassImpl.getMethodWithoutCaching
>>   59.3%  4493  +   124    Total compiled (including elided)
>>
>>           Stub + native   Method
>>    5.0%   218  +   174    java.security.AccessController.doPrivileged
>>    4.5%    23  +   325    java.io.UnixFileSystem.getBooleanAttributes0
>>    4.0%     0  +   309    java.lang.String.intern
>>    2.1%     0  +   165    java.util.zip.ZipFile.getEntry
>>    1.9%     0  +   147    java.lang.Object.clone
>>    1.6%     0  +   125    java.lang.Object.hashCode
>>    1.4%     0  +   108    java.util.zip.Inflater.inflateBytes
>>    1.2%     0  +    91    java.lang.ClassLoader.findLoadedClass0
>>    1.0%     0  +    79    java.lang.Class.isAssignableFrom
>>    1.0%     0  +    77    java.lang.Throwable.fillInStackTrace
>>    0.7%     0  +    58    java.lang.Class.isArray
>>    0.5%     0  +    36    java.lang.Class.isPrimitive
>>    0.5%    17  +    19    java.security.AccessController.doPrivileged
>>    0.3%    12  +     9    java.security.AccessController.doPrivileged
>>    0.2%     0  +    15    java.lang.Class.isInterface
>>    0.2%     0  +    13    java.util.zip.Inflater.reset
>>    0.2%     0  +    13    java.util.zip.ZipEntry.initFields
>>    0.1%     0  +    11    java.util.zip.ZipFile.freeEntry
>>    0.1%     0  +    11    java.nio.Bits.copyToByteArray
>>    0.1%     0  +     5    java.lang.Class.getSuperclass
>>    0.1%     0  +     5    java.util.zip.ZipFile.getCSize
>>    0.1%     0  +     4    java.util.zip.Inflater.getBytesWritten
>>    0.0%     0  +     3    java.lang.Class.getClassLoader0
>>    0.0%     0  +     3    java.util.zip.ZipFile.getMethod
>>    0.0%     0  +     2    java.util.zip.ZipFile.getEntryOffset
>>   26.7%   270  +  1808    Total stub (including elided)
>>
>>    Thread-local ticks:
>>    0.2%    14             Blocked (of total)
>>    8.6%   670             Interpreter
>>    0.0%     2             Compilation
>>    1.0%    77             Unknown: thread_state
>>
>>
>> Flat profile of 0.02 secs (1 total ticks): DestroyJavaVM
>>
>>    Thread-local ticks:
>> 100.0%     1             Blocked (of total)
>>
>>
>> Global summary of 128.73 seconds:
>> 100.0%  7859             Received ticks
>>    0.6%    49             Received GC ticks
>>    3.9%   306             Compilation
>>    0.1%     4             Other VM operations
>>    8.5%   670             Interpreter
>>    1.0%    77             Unknown code
>>
>
> I am not sure I am correctly interpretting the data but from the above
> table I would say that real execution time was only 8.5% + 3.9% which
> makes it less than python execution. By the way what have you used to
> generate this output?
>
> TIA,
>
> ./alex
> --
> .w( the_mindstorm )p.
>
>
>> real    2m8.865s
>> user    2m7.864s
>> sys     0m4.544s
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe from this list please visit:
>>
>>     http://xircles.codehaus.org/manage_email
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe from this list please visit:
>
>    http://xircles.codehaus.org/manage_email
>

---------------------------------------------------------------------
To unsubscribe from this list please visit:

    http://xircles.codehaus.org/manage_email