Reading the contents of a File without impacting heap space (Reading the file into memory?)

Dave McGee
Hi all,

I am parsing the contents of a log file to check for the occurrence of a string (e.g. "CRITICAL alarm"), and I was hoping to keep the application as lightweight as possible in terms of memory usage. As a test, I created a 250 MB log file and noticed that I get a "java.lang.OutOfMemoryError: Java heap space" error when trying to process it. I reproduced this with my maximum heap size set to 128 MB. It looks like the whole file is being read into memory. Does anyone know of a way around this?

Thanks!

Regards,
Dave.

Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

Guillaume Laforge-4
Hi Dave,

Perhaps you should show us the code that you use for that task.

What I mean, for example, is that if you use something like new
File("log.txt").text or new File("log.txt").readLines(), then Groovy
will try to read everything into memory! But if you use something like
new File("log.txt").eachLine { ... }, it will read one line at a time
and won't fill up the memory.
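
To make the difference concrete, here is a minimal sketch of the line-at-a-time approach, written in plain Java (which Groovy can call directly). The class and method names are hypothetical, and UTF-8 is an assumed encoding; this is not the eachLine implementation itself, only the same idea of holding a single line in memory at any moment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: scans a file one line at a time.
final class LineSearch {

    // Returns true as soon as any line contains the search string.
    // Only the current line is held in memory, so heap use depends on
    // the longest line rather than on the size of the file.
    static boolean anyLineContains(Path file, String needle) throws IOException {
        // UTF-8 is an assumption; use the charset the log is written in.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    return true; // stop early, no need to read the rest
                }
            }
        }
        return false;
    }
}
```

Note that this only keeps memory low when the file really is broken into lines of moderate length.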

Guillaume

--
Guillaume Laforge
Groovy Project Manager
Head of Groovy Development at SpringSource
http://www.springsource.com/g2one

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email


Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

Mikhail Antonov
In reply to this post by Dave McGee
Do you run your log analyzer in parallel (in the background) with the real app generating the log? Either way, I'd suggest two things.

1) Enable automatic log file rolling with a limit of, say, 50 or 100 MB in the app generating the log. This should be doable through the logger's configuration (Log4J or whatever you are using): upon reaching the limit, the application starts a new log file.

2) Read the file in a streaming fashion, in fixed-size chunks of, say, 10 MB.
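
Suggestion 2 could look roughly like the sketch below, in plain Java. The names are hypothetical and the chunk size is counted in characters, not bytes. The one detail that is easy to miss is that a match can straddle two chunks, so the last needle.length() - 1 characters of each chunk are carried over into the next one.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: searches a character stream in fixed-size chunks.
final class ChunkSearch {

    // Returns true if needle occurs anywhere in the stream. Memory use is
    // bounded by chunkSize (plus a temporary String copy of each chunk),
    // regardless of how long the lines in the input are.
    static boolean contains(Reader in, String needle, int chunkSize) throws IOException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        if (needle.isEmpty()) {
            return true;
        }
        int overlap = needle.length() - 1;       // tail kept between chunks
        char[] buf = new char[overlap + chunkSize];
        int carried = 0;                         // chars carried from the previous chunk
        int read;
        while ((read = in.read(buf, carried, chunkSize)) != -1) {
            int filled = carried + read;
            if (new String(buf, 0, filled).contains(needle)) {
                return true;
            }
            // Keep the tail so a match spanning the chunk boundary is still found.
            carried = Math.min(overlap, filled);
            System.arraycopy(buf, filled - carried, buf, 0, carried);
        }
        return false;
    }
}
```

Because this never looks for line terminators, it also copes with a file that consists of one very long line.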

-Mikhail

--
Thanks,
Michael Antonov
Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

Dave McGee
In reply to this post by Guillaume Laforge-4
Hi Guillaume,

Thanks for the quick response! Apologies - I forgot to include my code. I think I am using the approach that you mentioned (new File("log.txt").eachLine { ... })

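    // id (the search string) and alarmIdentified are assumed to be defined earlier in the script
    File file = new File(/C:\Users\Dave\workspace\MyApplication\test.txt/)

    // eachLine passes one line at a time to the closure
    file.eachLine { line ->
      if (line.contains(id)) {
        alarmIdentified = 1
      }
    }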

This still seems to be giving an OutOfMemoryError. Can anyone spot anything I am doing incorrectly?

Regards,
Dave.

Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

denis.zhdanov
You can use a streaming approach instead. Please have a look at the standard java.util.Scanner class: http://download.oracle.com/javase/6/docs/api/java/util/Scanner.html
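
A minimal sketch of what that might look like, in plain Java. The class and method names are hypothetical and UTF-8 is an assumed encoding. Scanner still buffers a whole line before returning it, so like eachLine this helps only when the file is actually split into lines.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

// Hypothetical helper: line-by-line search using java.util.Scanner.
final class ScannerSearch {

    // Returns true as soon as a line containing needle is found.
    static boolean anyLineContains(File file, String needle) throws FileNotFoundException {
        // "UTF-8" is an assumption; pass the charset the log is written in.
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            while (scanner.hasNextLine()) {
                if (scanner.nextLine().contains(needle)) {
                    return true;
                }
            }
            // Scanner swallows read errors; scanner.ioException() reports
            // the last one, if the caller needs to check for it.
        }
        return false;
    }
}
```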

Denis
Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

Daniel Henrique Alves Lima
In reply to this post by Dave McGee
    Stupid question: Does your file seem odd? Does it contain just
a single big line?

    You can see the eachLine implementation here:

http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.groovy/groovy-all/1.8.0-beta-1/org/codehaus/groovy/runtime/DefaultGroovyMethods.java#DefaultGroovyMethods.eachLine%28java.io.File%2Cgroovy.lang.Closure%29
http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.groovy/groovy-all/1.8.0-beta-1/org/codehaus/groovy/runtime/DefaultGroovyMethods.java#DefaultGroovyMethods.eachLine%28java.io.File%2Cjava.lang.String%2Cint%2Cgroovy.lang.Closure%29
(...)
http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.groovy/groovy-all/1.8.0-beta-1/org/codehaus/groovy/runtime/DefaultGroovyMethods.java#DefaultGroovyMethods.eachLine%28java.io.Reader%2Cint%2Cgroovy.lang.Closure%29


Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

Dave McGee
Hi Daniel, all,

Good catch (and to John also!). I needed to add a "\n" to the end of each log entry in the file I was generating for testing purposes. It looks like eachLine was just going over a single huge line, but I didn't spot it because the text wrapped to fill my Notepad window, which created the illusion of lines in the random text! Schoolboy error!
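
For anyone generating a similar test file, a sketch along these lines (plain Java, hypothetical names, UTF-8 assumed) terminates every entry explicitly, so line-based readers see one entry per line.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical generator for a test log file.
final class TestLogWriter {

    // Writes the same entry `entries` times, one entry per line.
    static void write(Path file, int entries, String entry) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 0; i < entries; i++) {
                out.write(entry);
                out.newLine(); // without this, the whole file is a single line
            }
        }
    }
}
```

Checking the result with a line count, rather than by eye in an editor that wraps long lines, avoids the same trap.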

Thanks for all your help,

Regards,
Dave.

Re: Reading the contents of a File without impacting heap space (Reading the file into memory?)

danmickey28
This post has NOT been accepted by the mailing list yet.
In reply to this post by Daniel Henrique Alves Lima
If a file contains a single big line (40-50 MB), then eachLine will not work.
How can I avoid the heap space error in this scenario?

My code is below:

def input = new File('C:\\actorids.txt')
def strIds = input.getText()

which throws this error:

Exception thrown 3 Jan, 2012 12:02:43 PM org.codehaus.groovy.runtime.StackTraceUtils sanitize
WARNING: Sanitizing stacktrace:
java.lang.OutOfMemoryError: Java heap space
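
One possible direction, offered only as a sketch: if the single line is a delimited list (comma-separated IDs is purely an assumption here, as the format of the file is not shown), java.util.Scanner can hand out one token at a time instead of loading the whole text with getText(). The names below are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.function.Consumer;

// Hypothetical helper: streams delimiter-separated tokens from a file.
final class TokenStream {

    // Passes each token to `action` and returns how many tokens were seen.
    // Only the current token (plus Scanner's internal buffer) is in memory.
    static long forEachToken(File file, String delimiterRegex, Consumer<String> action)
            throws FileNotFoundException {
        long count = 0;
        // "UTF-8" is an assumption about the file's encoding.
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                action.accept(scanner.next());
                count++;
            }
        }
        return count;
    }
}
```

A call such as TokenStream.forEachToken(new File("C:\\actorids.txt"), "\\s*,\\s*", id -> process(id)) would then handle each ID as it is read, where process stands in for whatever the script does with an ID. If the IDs are not delimited in a regular way, reading fixed-size chunks through a Reader is the more general fallback.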