DGM method to avoid blocking processes?

Jochen Theodorou
Hi,

I noticed that we have no method to discard the output of a process. Some
of you may know that when Java creates a process via Runtime#exec, the
Process writes its error and normal output into a buffer. The size of
that buffer is about 1k, maybe less or more, I am not sure. The point is
that if the buffer is full, the process is stopped. In this state the
process does not end, nor is it possible to kill it with the destroy method.

Basically I see two options here:

1) make a method in DGM, that starts 2 Threads reading the input and
error stream of the process
2) make a new exec method, that includes option 1)
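
Option 1 could look roughly like the following Java sketch. The class name `StreamSwallower` and the methods `drain`, `drainInBackground` and `swallow` are hypothetical names chosen for illustration and are not part of DGM or the JDK; only `Process#getInputStream` and `Process#getErrorStream` are existing API.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of DGM): discards a process's output so
// that the child never blocks on a full output buffer.
class StreamSwallower {

    // Reads the stream until EOF, throws the data away and returns the byte count.
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    // Starts a daemon thread that drains the given stream in the background.
    static Thread drainInBackground(InputStream in) {
        Thread t = new Thread(() -> {
            try {
                drain(in);
            } catch (IOException ignored) {
                // The stream may be closed when the process goes away; nothing to do.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Drains both the normal output and the error output, one thread each.
    static void swallow(Process process) {
        drainInBackground(process.getInputStream());
        drainInBackground(process.getErrorStream());
    }
}
```

A caller would invoke `StreamSwallower.swallow(process)` right after starting the process. Daemon threads are an assumption here, chosen so that a forgotten process does not keep the JVM alive.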

so, any thoughts?


bye blackdrag
Re: DGM method to avoid blocking processes?

tog
blackdrag

Can't we have option 1 such that, when no explicit output stream is defined, one thread is created and put in charge of reading this stream, so that the process is never stopped?

cheers
tog

---------- Forwarded message ----------
From: Jochen Theodorou <[hidden email]>
Date: Mar 9, 2006 12:52 PM
Subject: [groovy-dev] DGM method to avoid blocking processes?
To: [hidden email]

RE: DGM method to avoid blocking processes?

Dierk König
In reply to this post by Jochen Theodorou
now on the real thread:

cool submit.

I'd suggest a small renaming, since 'Process' is
already clear from the object this method is called upon

 proc.consumeStreams() //?
 proc.swallow()        //?

How about making it return the Process object itself such that we can do

 'cat'.execute().swallow() << 'Lasagne'

cheers
Mittie

> -----Original Message-----
> From: Jochen Theodorou [mailto:[hidden email]]
> Sent: Thursday, 9 March 2006 12:53
> To: [hidden email]
> Subject: [groovy-dev] DGM method to avoid blocking processes?

Re: DGM method to avoid blocking processes?

Jochen Theodorou
In reply to this post by tog
tog wrote:
> blackdrag
>
> Can't we have option 1 where when there is no explicit output stream
> defined one thread is created and in charge of reading this stream so
> that the process is never stopped.

The problem is that we don't know if there is such a stream. I mean, the
process defines them, and whether or not you want to connect another
stream is not the business of the Process object. Subclassing Process
wouldn't help here; the only thing would be to write a Groovy version of
the Process object which enables you to do this.

bye blackdrag
Re: DGM method to avoid blocking processes?

Jochen Theodorou
In reply to this post by Dierk König
Dierk Koenig wrote:
> now on the real thread:
>
> cool submit.
>
> I'd suggest a small renaming since 'Process' is
> already clear from the object this method is called upon
>
>  proc.consumeStreams() //?
>  proc.swallow()        //?

swallow looks nice to me

> How about making it return the Process object itself such that we can do
>
>  'cat'.execute().swallow() << 'Lasagne'

yes, I think returning the Process object is a good idea.

Maybe we should also extend it so that not both streams are swallowed;
maybe someone wants to keep just the error stream, or just the input
stream. For example with:

proc.swallowOut()
proc.swallowErr()

There is a bit of a naming conflict here, as the output of the process is
in the inputStream, so we have to be careful about what we decide here.

Another good idea may be to be able to redirect the streams, for example
err to out.

proc.err2out()
proc.out2err()

OK, this would only work if we had control over the InputStreams. We
could of course return our own streams here and replace
proc.getInputStream() and such. DGM allows us to do this.

Easier would be to redirect the process output to another stream:

proc.out2stream(System.out)
proc.err2stream(System.err)

Instead of System.* we could also have a FileOutputStream or a Writer...
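
To make the redirect idea concrete, here is a rough Java sketch. The names `out2stream` and `err2stream` are taken from the proposal above and do not exist in DGM; the class `StreamRedirect` and the helpers `copy` and `pump` are hypothetical as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of the proposed redirect methods; none of these
// names exist in DGM or the JDK.
class StreamRedirect {

    // Copies everything from 'from' to 'to' until EOF. The target is
    // flushed but not closed, so System.out stays usable afterwards.
    static void copy(InputStream from, OutputStream to) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = from.read(buffer)) != -1) {
            to.write(buffer, 0, n);
        }
        to.flush();
    }

    // Runs copy() on a background daemon thread and returns that thread.
    static Thread pump(InputStream from, OutputStream to) {
        Thread t = new Thread(() -> {
            try {
                copy(from, to);
            } catch (IOException ignored) {
                // Source or target was closed; stop pumping.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // The process's normal output is read from getInputStream().
    static Thread out2stream(Process process, OutputStream target) {
        return pump(process.getInputStream(), target);
    }

    // The process's error output is read from getErrorStream().
    static Thread err2stream(Process process, OutputStream target) {
        return pump(process.getErrorStream(), target);
    }
}
```

Returning the pumping thread is one possible design: the caller can `join()` it to be sure all output has arrived before inspecting the target.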

Of course the naming of all these has to be discussed, as well as whether
we really want to do that.

bye blackdrag
Re: DGM method to avoid blocking processes?

tog


On 3/9/06, Jochen Theodorou <[hidden email]> wrote:
> Dierk Koenig wrote:
> > How about making it return the Process object itself such that we can do
> >
> >  'cat'.execute().swallow() << 'Lasagne'
>
> yes, I think returning the Process object is a good idea.

And (again a stupid) question: why can't we have something more like
'cat'.execute() << 'Lasagne' without having to call swallow() explicitly?

We could even have 'cat'.execute() << 'Lasagne' >> 'somewhere else'

cheers
tog


Re: DGM method to avoid blocking processes?

Jochen Theodorou
tog wrote:
>
[...]
> And (again a stupid) question why can't we have something more like
> 'cat'.execute() << 'Lasagne' without to explicitly put swallow()
>
> We could even have 'cat'.execute() << 'Lasagne' >> 'somewhere else'

It is the same answer as last time ;)
Besides that, the above can't work, as << returns an OutputStream and
>> isn't defined for that.

But OK, let us discuss this in a little more detail...

Have you ever tried to read from an InputStream with 2 threads? It's
annoying: it is possible that only one thread will get the output, but
the output is never duplicated by this. And how should that work? I mean,
the InputStream is not distributing its content; it just has a read
method that is called from outside, and the InputStream doesn't know who
the caller is. The reading program on the other side expects to be the
only one reading the InputStream, and why shouldn't it expect this? It
means it can read the InputStream fully without thinking about others
accessing the stream too.
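
The point about two readers can be checked with a small Java experiment. This is only a sketch: the class `SharedStreamDemo` and its method `readConcurrently` are made-up names, and a `ByteArrayInputStream` stands in for a process's output stream.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: several threads read from one shared InputStream and
// each counts the bytes it received.
class SharedStreamDemo {

    // Starts 'readers' threads on the same stream, waits for all of them
    // and returns the number of bytes each thread got.
    static long[] readConcurrently(InputStream in, int readers) throws InterruptedException {
        long[] counts = new long[readers];
        Thread[] threads = new Thread[readers];
        for (int i = 0; i < readers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                byte[] buffer = new byte[64];
                int n;
                try {
                    while ((n = in.read(buffer)) != -1) {
                        counts[id] += n; // each thread writes only its own slot
                    }
                } catch (IOException ignored) {
                    // Treat a read failure as end of input for this reader.
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // after join() the counts are safely visible here
        }
        return counts;
    }
}
```

With a `ByteArrayInputStream`, whose read methods are synchronized, the per-thread counts add up to the length of the source: every byte goes to exactly one reader and nothing is duplicated. How the bytes are divided between the readers varies from run to run.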

As two reading programs can't use 1 InputStream, we could only duplicate
the stream, or ensure there is only 1 reading program.

Duplicating the stream is not a good solution, as it means 2 streams must
be read, and at different speeds, where one speed could be 0. And if it
is 0, what do we do if the buffer is full? In Java it was decided to stop
the process then - not an option for us.

And we can't ensure there is only one reading program either: if we start
reading by default, then there is always one reading program. And even if
we could know that there is a different reading program, what about the
input that has already been read? Such input would be lost.

No, I think the solution would be to have a new execute method taking,
for example, 2 booleans marking which streams should be swallowed. For
example:

'cat'.execute(true,false)

This will create another 5 methods in DGM for the ones already in there
and an additional 6 for the Runtime methods. -> 11 new methods.

And the decision can't be reversed, but that is no different from the
swallow solution.

bye blackdrag

Re: DGM method to avoid blocking processes?

tog
Jochen,

I am a bit lost now! Let me reformulate the problem to see what is fuzzy here in my brain.

Let's assume you have something like:
'something to do'.execute()

This is something like an external process that can have input, output & error streams.
You say that this causes a problem when you don't explicitly define the output and error streams, due to the buffer limitation ...

In Java I do this:
        ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
        Map<String, String> env = pb.environment();
        env.put("GLOUBI", "BOULGA");
        //env.remove("OTHERVAR");
        env.put("GLOUBI", env.get("BLOUBI") + "_YABON");
        pb.directory(new File("C:\\eclipse\\"));

        try {
            Process p = pb.start();
            BufferedReader br = new BufferedReader(new InputStreamReader(p.getInputStream()));
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }

Maybe there is something better to be done than what currently exists. We have an AntBuilder, a SwingBuilder, ... why not a process builder like in Java 5?

And in Groovy I would do:
     p = new ProcessBuilder(dir:".", env:[], mergeStderrStdOut:true)
     output = p.output
     p.start() << input
     output.each{println it}
     while (p.isFinished())
     // continue here

Something like that ... this probably needs to be refined.
If you don't specify any output, then you should either create one and throw away everything that comes in on it, or use the stdout of your script or class.

Does that make sense?
cheers
tog

Re: DGM method to avoid blocking processes?

Jochen Theodorou
tog wrote:

> Jochen,
>
> I am a bit lost now! Let me reformulate the problem to see what is
> fuzzy here in my brain.
>
> Let's assume you have something like:
> 'something to do'.execute()
>
> This is something like an external process that can have input, output &
> error streams.
> You say that this causes a problem when you don't explicitly define the
> output and error streams, due to the buffer limitation ...

Not define but empty the streams, but yes. The process will be stopped
and changes into a state where the destroy method from Java will not
succeed.

> In Java I do this:
>         ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
>         Map<String, String> env = pb.environment();
>         env.put("GLOUBI", "BOULGA");
>         //env.remove("OTHERVAR");
>         env.put("GLOUBI", env.get("BLOUBI") + "_YABON");
>         pb.directory(new File("C:\\eclipse\\"));
>        
>         try {
>             Process p = pb.start();
>             BufferedReader br = new BufferedReader(new InputStreamReader(p.getInputStream()));
>             String line;
>             while ((line = br.readLine()) != null) {
>                 System.out.println(line);
>             }
>         } catch (IOException ex) {
>             ex.printStackTrace();
>         }

If you do it that way, your program may block, as you don't empty the
error stream.
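
One way to avoid this with the Java 5 API is to merge the error stream into the normal output, so that a single reading loop empties both. The following is a minimal sketch under that assumption; the class `MergedOutput` and its `run` method are made-up names, while `ProcessBuilder#redirectErrorStream(boolean)` is existing API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs a command with stderr merged into stdout and
// collects every line the process prints.
class MergedOutput {

    static List<String> run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);   // error output now arrives via getInputStream()
        Process p = pb.start();
        p.getOutputStream().close();    // we send no input to the process

        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);        // a single loop drains both streams
            }
        }
        p.waitFor();                    // safe: all output has been consumed
        return lines;
    }
}
```

The trade-off is that normal and error output can no longer be told apart; when they must stay separate, each stream needs its own reader thread.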

> Maybe there is something better to be done than what currently
> exists. We have an AntBuilder, a SwingBuilder, ... why not a process
> builder like in Java 5?
>
> And in Groovy I would do:
>      p = new ProcessBuilder(dir:".", env:[], mergeStderrStdOut:true)
>      output = p.output
>      p.start() << input
>      output.each{println it}
>      while (p.isFinished())
>      // continue here

It is not much of a builder. The important part here is that we don't
operate on the Process object, but on an object from Groovy. Maybe that
is really the best way; it opens up more possibilities.

> something like that ... probably this need to be refined.
> If you don't specify any output then you should either create one and
> throw away all that come on it or use the Stdout from your script or class

If we want this, then we should do it more like this:

def processOutput=null
def p = new ProcessBuilder().create() {
   dir "."
   command "echo hello World"
   env []
   redirectErr2Out()
   processOutput = output()
}
processOutput.each {println it}

dir, env, redirect*, command and output would then be methods of the
builder. command would automatically choose a shell if possible; maybe
an additional shell(String) method would be good to override the default.

The shortest form of this would then be:

def p = new ProcessBuilder().create() {
   command "echo hello World"
}

The assignment to processOutput inside the closure is there because this
way we know at the end of the closure whether the streams have been
assigned, and can do that ourselves if not.

bye blackdrag
Re: DGM method to avoid blocking processes?

Yuri Schimke
In reply to this post by tog
Hey,

Does grash do anything like this?

Otherwise, I did some work on this kind of thing a while ago, 2 years
1 month ago to be precise. I was planning to start a Unix-style shell
with pipes etc. I ran out of time, and two small kids keep me busy.
Also, some of the syntax sugar added for java.lang.Process in
DefaultGroovyMethods made the common cases simple enough.

But it's checked in at groovy/modules/process/

I doubt it's still working, but it might not take too much to get it
working again. By default it handles stderr, unless you redirect it
elsewhere.

Alternatively, some extra syntax sugar in DefaultGroovyMethods may be
more effective.

Anyway some examples.

By default it assumes methods are processes to execute:

gsh.cat('test_scripts/blah.txt').toStdOut();
gsh.cat('test_scripts/blah.txt').toFile(new java.io.File('blah.out'));
gsh.cat().fromStdIn().toStdOut();

There are some special commands, to process part of the pipe in groovy.

f = gsh.find('.', '-name', '*.java', '-ls');
total = 0;
lines = gsh.grid { values,w |
  x = values[2,4,6,10];
  s = x.join('  ');
  w.println(s);
  total += Integer.parseInt(values[6]);
};

f.pipeTo(lines);
lines.toStdOut();
 
System.out.println("Total: " + total);



--
Yuri Schimke