[jira] [Comment Edited] (GROOVY-7464) Iterable.intersect() is not commutative (and allows duplicates)

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Comment Edited] (GROOVY-7464) Iterable.intersect() is not commutative (and allows duplicates)

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/GROOVY-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585761#comment-14585761 ]

Paul King edited comment on GROOVY-7464 at 6/15/15 10:42 AM:
-------------------------------------------------------------

Not sure the best way to go just yet - still pondering - maybe others have some better ideas.

But just on the history, quite some time ago someone contributed some code to make the method more efficient - from memory, calling {{contains}} especially in conjunction with {{NumberAwareComparator}} was particularly slow, so the goal was to call it the least number of times. But clearly, in hindsight, we should have vetted that contribution more critically since reversing the left and right params clearly doesn't make it behave anymore like the original intention of:
{code}
a.findAll{ b.contains(it) }
{code}
Ironically, the behavior would be the same if we were removing duplicates.

It definitely needs fixing for several reasons. Not only above but also consider this:
{code}
TreeSet vowels = 'aeiou'.toList()   // or Set
def letters = 'abcdefg'.toList()
displayResult(vowels.intersect(letters))

def displayResult(Set answer) {
  println answer
}
{code}
What would you expect to happen above? Particularly since we added these variants in 2.4.0:
{code}
Set<T> intersect(Set<T> left, Iterable<T> right)
SortedSet<T> intersect(SortedSet<T> left, Iterable<T> right)
{code}
This highlights another issue - I think concrete types favored over interfaces.


was (Author: paulk):
Not sure the best way to go just yet - still pondering - maybe others have some better ideas.

But just on the history, quite some time ago someone contributed some code to make the method more efficient - from memory, calling {{contains}} especially in conjunction with {{NumberAwareComparator}} was particularly slow, so the goal was to call it the least number of times. But clearly, in hindsight, we should have vetted that contribution more critically since reversing the left and right params clearly doesn't make it behave anymore like the original intention of:
{code}
a.findAll{ b.contains(it) }
{code}
Ironically, the behavior would be the same if we were removing duplicates.

It definitely needs fixing for several reasons. Not only above but also consider this:
{code}
TreeSet vowels = 'aeiou'.toList()
def letters = 'abcdefg'.toList()
displayResult(vowels.intersect(letters))

def displayResult(Set answer) {
  println answer
}
{code}
What would you expect to happen above? Particularly since we added these variants in 2.4.0:
{code}
Set<T> intersect(Set<T> left, Iterable<T> right)
SortedSet<T> intersect(SortedSet<T> left, Iterable<T> right)
{code}
This highlights another issue - I think concrete types favored over interfaces.

> Iterable.intersect() is not commutative (and allows duplicates)
> ---------------------------------------------------------------
>
>                 Key: GROOVY-7464
>                 URL: https://issues.apache.org/jira/browse/GROOVY-7464
>             Project: Groovy
>          Issue Type: Bug
>          Components: groovy-jdk
>    Affects Versions: 2.4.3
>            Reporter: Peter Ledbrook
>            Assignee: Guillaume Laforge
>
> The {{intersect()}} method on {{Iterable}} has different results depending on the order of the receiver object and the argument. For example,
> {code}
> def a = [1, 1, 2, 4]
> def b = [1, 2, 2, 3]
> println a.intersect(b)
> println b.intersect(a)
> {code}
> prints out two different results ({{[1, 1, 2]}} and {{[1, 2, 2]}}).
> Assuming that the operation should be commutative, i.e. the results should be the same for both, the question then becomes whether duplicates should even be returned at all.
> I think the method should always return a set as intersection is a set operation. In other words, {{Iterable.intersect()}} becomes a convenience wrapper that converts the receiver object and the argument to sets.
> This issue started as a discussion on the [dev mailing list|http://mail-archives.apache.org/mod_mbox/incubator-groovy-dev/201506.mbox/browser].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)