Groovy for data science, a killer application for Groovy?

Re: Groovy for data science, a killer application for Groovy?

Jochen Theodorou
On 06.01.2015 11:19, Dylan Cali wrote:

> On Tue, Jan 6, 2015 at 1:57 AM, Jochen Theodorou <[hidden email]> wrote:
>> On 06.01.2015 00:38, Russel Winder wrote:
>> But if the key point here is Fortran and C++ computation
>> frameworks, then things look indeed bad. Funny thing is that I did talk with
>> Cedric about native code binding just a few weeks ago and that I was
>> wondering if it is not possible to provide something better than what the
>> JVM has today...
>
> Is JNA not an option here? I've had good success using it to leverage
> native libraries, and it is much, much easier to get going with than
> JNI.

Well, you want both easy usage and good performance. JNA performs even
worse than JNI.

bye blackdrag

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: Groovy for data science, a killer application for Groovy?

Stergios Papadimitriou

Hi all,

In my experience, pure Java generally performs better than
unoptimized C, and optimized C is only slightly faster.

In GroovyLab, matrix multiplication is multithreaded.
On Linux it uses native BLAS, and only MATLAB offers higher speed.
On Windows, however, the native routines do not work well,
so I use a pure Java multithreaded matrix multiplication instead.
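
As a rough illustration of what a pure-Java multithreaded matrix
multiplication can look like (a minimal sketch, not GroovyLab's actual
code; the class and method names are made up for this example, and it
assumes Java 8 parallel streams), the rows of the result can be computed
in parallel:

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical helper class, not part of GroovyLab or any library.
final class ParallelMatMul {

    /** Multiplies a (n x k) by b (k x m); rows of the result are computed in parallel. */
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("incompatible matrix dimensions");
        }
        final int n = a.length;
        final int k = b.length;
        final int m = b[0].length;
        final double[][] c = new double[n][m];

        // Each task owns exactly one row of c, so no synchronisation is needed.
        IntStream.range(0, n).parallel().forEach(i -> {
            double[] rowA = a[i];
            double[] rowC = c[i];
            // i-p-j loop order walks b row by row, which is friendlier to the cache.
            for (int p = 0; p < k; p++) {
                double aip = rowA[p];
                double[] rowB = b[p];
                for (int j = 0; j < m; j++) {
                    rowC[j] += aip * rowB[j];
                }
            }
        });
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

Since Groovy can call Java classes directly, a helper like this could be
used from a Groovy script unchanged.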


Today I tried SCaVis, which is Jython-based.
I didn't like the Python programming style, and it doesn't seem fast
either.

Regards

Stergios




Re: Groovy for data science, a killer application for Groovy?

dracodoc
In reply to this post by Russel Winder-3
Actually, Julia is a perfect example. Julia is very new and was created by a small group, without much library support.

Of course Groovy will be a very different case from Julia; however, I think that as a language Groovy has both advantages and disadvantages compared to Julia. In terms of library support, Groovy actually has much better potential than Julia.

If somebody had told you in 2012 that they would create Julia from scratch in an already crowded field, without any library support, would you have believed they would succeed?

Re: Groovy for data science, a killer application for Groovy?

Paolo Di Tommaso
In general I would agree with Russel: data science or, more generally, scientific computing is a very competitive field that is difficult to enter.

But it is also true that this is a very broad area with many different requirements.

The people who need top performance generally use supercomputers and program in C/C++/Fortran/MPI.

Then there's a vast area in which performance is not critical, and more importance is given to problem modelling or fast application prototyping. Here most people use Perl, Python, R and, more recently, Julia, because they are easy to use and (almost) portable.

The scientific community has made a huge investment in developing or integrating data analysis and visualisation libraries, above all for Python, so it is very unlikely to switch to a different environment without a clear benefit.

Also, for this class of applications, the ability to compose them within other scripts, or simply through Unix command-line pipes, is very important.

Here, unfortunately, the slow bootstrap time of the Groovy runtime is a serious handicap, because it makes it inefficient to invoke Groovy scripts from another tool or from the command line in a timely manner (and it makes Groovy be *perceived* as slow). In my opinion, if this were resolved, it would be much easier to propose Groovy as an alternative scripting language in the data science community.


However, this does not mean that Groovy cannot be used profitably for data science. Indeed, together with GPars, it provides a compelling set of features that are valuable in a range of scientific applications.

As Jim said in a previous email, I'm using both of them in Nextflow, a DSL for data analysis pipelines, with good feedback. http://www.nextflow.io

Also, though Python is the reference language in this field, there are plenty of libraries and resources (for ML, big data, genomics, etc.) that run on the JVM, and for which Groovy is, by definition, a perfect match, for the reasons we know.



Cheers,
Paolo

 




Re: Groovy for data science, a killer application for Groovy?

Dylan Cali
On Tue, Jan 6, 2015 at 12:59 PM, Paolo Di Tommaso
<[hidden email]> wrote:

> Also, for this class of applications, the ability to compose them within
> other scripts, or simply through Unix command-line pipes, is very
> important.
>
> Here, unfortunately, the slow bootstrap time of the Groovy runtime is a
> serious handicap, because it makes it inefficient to invoke Groovy scripts
> from another tool or from the command line in a timely manner (and it
> makes Groovy be *perceived* as slow). In my opinion, if this were resolved,
> it would be much easier to propose Groovy as an alternative scripting
> language in the data science community.

This is probably a silly idea, but thought I'd throw it out there:
could the bootstrap time be addressed by using a 'daemon' approach,
similar to how the Gradle daemon is used to speed up launching Gradle
build scripts?


Re: Groovy for data science, a killer application for Groovy?

Anther Astimony


> This is probably a silly idea, but thought I'd throw it out there:
> could the bootstrap time be addressed by using a 'daemon' approach,
> similar to how the Gradle daemon is used to speed up launching Gradle
> build scripts?


Already accomplished by the GroovyServ project:

http://kobo.github.io/groovyserv/

~~~ astimony


Re: Groovy for data science, a killer application for Groovy?

Dylan Cali
On Sun, Jan 11, 2015 at 5:02 PM, Anther Astimony <[hidden email]> wrote:

>
>
>> This is probably a silly idea, but thought I'd throw it out there:
>> could the bootstrap time be addressed by using a 'daemon' approach,
>> similar to how the Gradle daemon is used to speed up launching Gradle
>> build scripts?
>>
>
> Already accomplished by the GroovyServ project:
>
> http://kobo.github.io/groovyserv/
>
> ~~~ astimony
>

Nice!


Re: Groovy for data science, a killer application for Groovy?

Paolo Di Tommaso
In reply to this post by Anther Astimony
I think this is not the right solution if we want to propose Groovy as an alternative programming language in the data science community. These are people who work at the command line most of the time, and they need tools that are fast and can be composed efficiently.

From experience, when you propose evaluating Groovy to some of these guys and they notice that it takes 700 ms to print "Hello world", most of them will stop there, because they get the impression that it is slow. You won't have a second chance.
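
For anyone who wants to check a startup figure like this on their own machine, here is a rough sketch of a timer that launches a command and measures wall-clock time. The class and method names are made up for this example, and the default command assumes a `groovy` launcher is on the PATH; the result will vary with hardware, JVM and Groovy version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration only.
final class StartupTimer {

    /** Runs the command once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(List<String> command) {
        // inheritIO() lets the child process print to this process's console.
        ProcessBuilder builder = new ProcessBuilder(command).inheritIO();
        long start = System.nanoTime();
        try {
            Process process = builder.start();
            process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the command", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Assumption: `groovy` is installed and on the PATH; pass another command to override.
        List<String> command = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("groovy", "-e", "println 'Hello world'");
        System.out.println("elapsed: " + timeMillis(command) + " ms");
    }
}
```

A single run includes process creation and disk caching effects, so repeating the measurement a few times gives a fairer picture.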



Cheers, p






Re: Groovy for data science, a killer application for Groovy?

Russel Winder-3

On Mon, 2015-01-12 at 11:28 +0100, Paolo Di Tommaso wrote:
> […]
> won't have a second chance.

s/second //
>
--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:[hidden email]
41 Buckmaster Road    m: +44 7770 465 077   xmpp: [hidden email]
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder



Re: Groovy for data science, a killer application for Groovy?

Paolo Di Tommaso
:) 


However, I disagree on this. For example, in the genomics field, though the most used programming languages are C and Python, there are really important pieces of software written in Java.

In this context Groovy could really play a role as an easy-to-use scripting alternative to Java, with well-founded parallelisation libraries and good performance (not as fast as C code, but much better than Python).


Cheers,
Paolo

 


