File counting methods

Merlin Beedell

This might not be a suitable question for the Groovy users list, but I am using it in a Groovy script, so I thought it might elicit some useful feedback and alternative ideas.

 

In Java’s NIO there are a couple of ways to walk a tree of files and return a file count. There must be lots of others! I don’t know whether Groovy has a better built-in method; I could not find one.

This one should be the fastest, since it uses a parallel stream. However, if the set of files changes during the count, it throws an error. I have no idea how to catch and ignore that error.

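import java.nio.file.Files
import java.nio.file.Path

// Recursively counts every entry under dir that is not a directory.
long fileCountV2(Path dir) {
    return Files.walk(dir)
                .parallel()
                .filter(p -> !Files.isDirectory(p))
                .count();
}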
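
One possible way to ignore entries that disappear or become unreadable during the count is to replace the stream with `Files.walkFileTree`, which reports each failure to a callback instead of aborting. The following is a minimal sketch in plain Java (callable from a Groovy script), not a confirmed fix for the error described above. The class name `TolerantFileCounter` is hypothetical, and the walk is single-threaded, so it trades the parallel stream for robustness.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: counts entries under a directory and skips anything that fails.
final class TolerantFileCounter {

    static long count(Path dir) throws IOException {
        AtomicLong total = new AtomicLong();
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Invoked once for each entry that is not walked as a directory.
                total.incrementAndGet();
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // The entry vanished or could not be read: ignore it and keep walking.
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) {
                // exc is non-null when listing d failed part-way; ignore that too.
                return FileVisitResult.CONTINUE;
            }
        });
        return total.get();
    }
}
```

Calling `TolerantFileCounter.count(dir)` returns whatever could be counted. Entries that fail are silently left out, so the result is a best-effort figure on a tree that is changing underneath the walk.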

 

This one works regardless of any file changes, and it iterates a DirectoryStream, so memory overhead stays low. It could do with concurrency to speed it up, but I really cannot figure out how to apply GPars to it for performance. This script did once fail with a “too many open files” error on Linux (on a directory with 200,000+ files). I can’t determine how: no files are actually opened!

 

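// Uses java.nio.file.Files and java.nio.file.Path.
// Counts the immediate entries of dir (subdirectories included); it does not recurse.
long fileCount(Path dir) {
  long i = 0
  try {
    Files.newDirectoryStream(dir).each { i++ }
  } catch (e) {
    println(e)
  }
  return i
}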
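
A plausible contributor to the “too many open files” failure, though not confirmed for this script, is that a `DirectoryStream` keeps the directory itself open until `close()` is called, and the snippet above never closes it. If the function is called for many directories, those handles can accumulate. The sketch below, in plain Java (callable from Groovy), closes the stream with try-with-resources. The class name `DirectoryEntryCounter` is hypothetical, and the counting behaviour is otherwise meant to match the original: immediate entries only, with errors printed rather than rethrown.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: counts the immediate entries of one directory.
final class DirectoryEntryCounter {

    static long count(Path dir) {
        long n = 0;
        // try-with-resources releases the directory handle even if iteration fails.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path ignored : entries) {
                n++;
            }
        } catch (IOException | DirectoryIteratorException e) {
            // Same policy as the original snippet: report the problem and carry on.
            System.out.println(e);
        }
        return n;
    }
}
```

In a Groovy script the same effect should be available by wrapping the iteration in `withCloseable`, assuming a Groovy version that provides that method.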

 

Merlin Beedell