JVM Advent 2018

The JVM Programming Advent Calendar

API Design of Eclipse Collections

Eclipse Collections is an open source Java Collections Framework which enables writing functional, fluent code in Java.

Eclipse Collections started as a collections framework named Caramel at Goldman Sachs in 2004. The framework evolved over the years, and in 2012 it was open sourced on GitHub as a project called GS Collections. Around 40 developers from the same company have contributed to it. To take full advantage of being an open source project, GS Collections was migrated to the Eclipse Foundation and rebranded as Eclipse Collections in 2015. The framework is now fully open to the community and accepts contributions!

Design Goals
Eclipse Collections was designed to provide a rich, functional, fluent and fun API along with memory-efficient data structures, all while remaining interoperable with Java Collections. It also provides types that Java Collections lacks, such as Bag, Multimap, Stack, BiMap and Interval.

Evolution of the Framework
Over the past 14+ years, the framework has matured, and its top-level interface, RichIterable, now has more than 100 methods. These methods were added to the interface after careful deliberation. Below are the aspects we consider when adding an API:

1. Use case: The majority of the methods added to the framework are motivated by user requirements. Users raise either an Issue or a Pull Request on the project, and the discussion starts from there.

2. Static Utility vs API: Eclipse Collections has static utility classes like Iterate, ListIterate, etc. These static utility classes allow us to prototype features before adding them as an API. If a static utility method is heavily used, we try to implement it in a subsequent release as an API on the collection interface to provide a rich and fluent coding experience.

For example: Iterate#groupByAndCollect() is currently implemented only on the static utility. Since the method is used frequently, there is a good case for adding it as an API on RichIterable to provide a rich, functional and fluent coding experience. There is an open issue in case you would like to help us out.
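To make the utility-versus-API distinction concrete, here is a minimal plain-JDK sketch. The names GroupingSketch, FluentIterable and fluent() are hypothetical stand-ins and not Eclipse Collections code; the sketch only contrasts how a call site looks with a static helper versus a method on the collection interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a fluent collection interface (not Eclipse Collections code).
interface FluentIterable<T> extends Iterable<T> {
    // API form: the operation lives on the collection, so calls read left to right.
    default <K, V> Map<K, List<V>> groupByAndCollect(
            Function<? super T, ? extends K> groupBy,
            Function<? super T, ? extends V> collect) {
        return GroupingSketch.groupByAndCollect(this, groupBy, collect);
    }
}

final class GroupingSketch {
    private GroupingSketch() {
    }

    // Static utility form: accepts any Iterable, which makes it a cheap place to prototype.
    static <T, K, V> Map<K, List<V>> groupByAndCollect(
            Iterable<T> iterable,
            Function<? super T, ? extends K> groupBy,
            Function<? super T, ? extends V> collect) {
        Map<K, List<V>> result = new LinkedHashMap<>();
        for (T each : iterable) {
            result.computeIfAbsent(groupBy.apply(each), key -> new ArrayList<>())
                    .add(collect.apply(each));
        }
        return result;
    }

    // Wraps a List so that it exposes the hypothetical fluent interface.
    static <T> FluentIterable<T> fluent(List<T> list) {
        return list::iterator;
    }
}
```

With the utility, the collection is an argument: GroupingSketch.groupByAndCollect(words, String::length, String::toUpperCase). With the interface method, the collection is the receiver: GroupingSketch.fluent(words).groupByAndCollect(String::length, String::toUpperCase). Both produce the same result; only the second chains naturally with other calls.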

3. Covariant Overrides: We override API methods so that each interface returns the type that is true to its behavior.

For example: RichIterable has an API called select(), similar to filter(), which returns all elements of the collection for which the Predicate evaluates to true. Below is how the API is defined on each interface:

// RichIterable
RichIterable<T> select(Predicate<? super T> predicate);

// ListIterable
ListIterable<T> select(Predicate<? super T> predicate);

// MutableList
MutableList<T> select(Predicate<? super T> predicate);

As you can see, select() on:
RichIterable returns a RichIterable
ListIterable returns a ListIterable
MutableList returns a MutableList
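The practical payoff of covariant overrides is that the caller never has to cast. The sketch below shows the mechanism with a miniature hierarchy; SketchIterable, SketchListIterable and SketchMutableList are hypothetical names invented for illustration, and the real interfaces carry far more methods.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical miniature hierarchy (not the Eclipse Collections interfaces).
interface SketchIterable<T> extends Iterable<T> {
    SketchIterable<T> select(Predicate<? super T> predicate);
}

interface SketchListIterable<T> extends SketchIterable<T> {
    T get(int index);

    int size();

    // Covariant override: same parameters, narrower return type.
    @Override
    SketchListIterable<T> select(Predicate<? super T> predicate);
}

final class SketchMutableList<T> implements SketchListIterable<T> {
    private final List<T> delegate = new ArrayList<>();

    @SafeVarargs
    static <T> SketchMutableList<T> with(T... elements) {
        SketchMutableList<T> list = new SketchMutableList<>();
        for (T element : elements) {
            list.add(element);
        }
        return list;
    }

    boolean add(T element) {
        return this.delegate.add(element);
    }

    @Override
    public T get(int index) {
        return this.delegate.get(index);
    }

    @Override
    public int size() {
        return this.delegate.size();
    }

    @Override
    public Iterator<T> iterator() {
        return this.delegate.iterator();
    }

    // Narrowed once more: the result is still mutable, so add() remains available.
    @Override
    public SketchMutableList<T> select(Predicate<? super T> predicate) {
        SketchMutableList<T> result = new SketchMutableList<>();
        for (T each : this.delegate) {
            if (predicate.test(each)) {
                result.add(each);
            }
        }
        return result;
    }
}
```

A caller holding a SketchMutableList can write SketchMutableList.with(1, 2, 3, 4).select(each -> each % 2 == 0).add(6) without a cast, while a caller holding only a SketchIterable reference sees the more general return type.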

4. Overloads with Target: Sometimes we need a different collection type than the one an API returns. To make this efficient and fluent, we create an overloaded method that accepts a target collection. The results are accumulated in the target collection, which is then returned.

For example: As described above, the select() method on a MutableList returns a MutableList. However, what if you want a MutableSet? There is an overloaded select() method available which takes in a target collection which can be a set.

MutableList<Integer> integers = Lists.mutable.with(
        1, 2, 2, 3, 3, 3, 4, 4, 4, 4);
MutableList<Integer> evens = integers.select(each -> each % 2 == 0);
Assert.assertEquals(Lists.mutable.with(2, 2, 4, 4, 4, 4), evens);

MutableSet<Integer> uniqueEvens = integers.select(
        each -> each % 2 == 0,
        Sets.mutable.empty());
Assert.assertEquals(Sets.mutable.with(2, 4), uniqueEvens);
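The shape of a target overload can be sketched in plain JDK terms. TargetOverloadSketch below is a hypothetical helper, not the library's implementation; it assumes only that the target is some java.util.Collection. The key detail is the second type parameter, which lets the method return exactly the type the caller passed in.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper illustrating the overload pair (not Eclipse Collections code).
final class TargetOverloadSketch {
    private TargetOverloadSketch() {
    }

    // Default form: the method chooses the result type for the caller.
    static <T> List<T> select(Iterable<T> source, Predicate<? super T> predicate) {
        return select(source, predicate, new ArrayList<T>());
    }

    // Target form: matching elements are added straight into the caller's collection,
    // and that same collection is returned with its static type R preserved.
    static <T, R extends Collection<T>> R select(
            Iterable<T> source,
            Predicate<? super T> predicate,
            R target) {
        for (T each : source) {
            if (predicate.test(each)) {
                target.add(each);
            }
        }
        return target;
    }
}
```

Because results go directly into the target, no intermediate list is built and then copied into a set, which is where the efficiency claim comes from.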

5. Symmetry: Eclipse Collections offers primitive collections. We try to maintain symmetry between the object collections and primitive collections to provide a complete user experience.
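As a rough illustration of what symmetry means, the sketch below gives a primitive int container a select() that mirrors the object version, taking an IntPredicate and storing values in an int[] so nothing is boxed. SketchIntList is a hypothetical class written for this post, not one of the library's primitive collections.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

// Hypothetical primitive counterpart to an object list (not Eclipse Collections code).
final class SketchIntList {
    private int[] items = new int[0];
    private int size;

    static SketchIntList with(int... elements) {
        SketchIntList list = new SketchIntList();
        list.items = Arrays.copyOf(elements, elements.length);
        list.size = elements.length;
        return list;
    }

    void add(int element) {
        if (this.size == this.items.length) {
            // Grow the backing array; starts at 4 slots when empty, then doubles.
            this.items = Arrays.copyOf(this.items, Math.max(4, this.size * 2));
        }
        this.items[this.size++] = element;
    }

    int size() {
        return this.size;
    }

    // Same name and semantics as the object API, but with a primitive predicate.
    SketchIntList select(IntPredicate predicate) {
        SketchIntList result = new SketchIntList();
        for (int i = 0; i < this.size; i++) {
            if (predicate.test(this.items[i])) {
                result.add(this.items[i]);
            }
        }
        return result;
    }

    int[] toArray() {
        return Arrays.copyOf(this.items, this.size);
    }
}
```

A user who already knows select() on an object list can guess how it behaves here, which is the point of keeping the two families of collections aligned.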

Implementing an API in Practice

Let us implement a simple API, RichIterable#countBy(), which was added in Eclipse Collections release 9.0.0. The motivation for this API was users mentioning that they had to collect() a collection into a Bag. In Eclipse Collections, collect() is similar to map(), and Bag is a data structure that maintains a mapping of an object to its count.

MutableList<String> strings = Lists.mutable.with(
        "1", "2", "2", "3", "3", "3", "4", "4", "4", "4");
Bag<Integer> integers = strings.collect(
        Integer::valueOf,
        Bags.mutable.empty());
Assert.assertEquals(1, integers.occurrencesOf(1));
Assert.assertEquals(2, integers.occurrencesOf(2));
Assert.assertEquals(3, integers.occurrencesOf(3));
Assert.assertEquals(4, integers.occurrencesOf(4));

The above solution for counting the integers worked; however, it was not intuitive. An inexperienced developer might have a hard time arriving at it. So we decided to add countBy(), and now the code looks more functional, fluent and, above all, intuitive.

MutableList<String> strings = Lists.mutable.with(
        "1", "2", "2", "3", "3", "3", "4", "4", "4", "4");
Bag<Integer> integers = strings.countBy(Integer::valueOf);
Assert.assertEquals(1, integers.occurrencesOf(1));
Assert.assertEquals(2, integers.occurrencesOf(2));
Assert.assertEquals(3, integers.occurrencesOf(3));
Assert.assertEquals(4, integers.occurrencesOf(4));
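For readers curious about what countBy() does conceptually, here is a plain-JDK approximation. It is a sketch under stated assumptions, not the library's implementation: CountBySketch is a hypothetical class, and a Map from key to Integer count stands in for Bag.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical approximation of countBy(); a Map<K, Integer> stands in for Bag.
final class CountBySketch {
    private CountBySketch() {
    }

    static <T, K> Map<K, Integer> countBy(
            Iterable<T> source,
            Function<? super T, ? extends K> function) {
        Map<K, Integer> counts = new LinkedHashMap<>();
        for (T each : source) {
            // Transform the element, then increment the count kept for the result.
            counts.merge(function.apply(each), 1, Integer::sum);
        }
        return counts;
    }
}
```

Here counts.getOrDefault(key, 0) plays the role of occurrencesOf(key), including returning 0 for a key that never appeared.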


In this blog I explained the evolution strategy of a mature Java collections library. The aspects we look at are use case, static utility vs API, covariant overrides, overloads with target, and lastly symmetry.

It is a personal goal to get 1000 stars on our GitHub project, so if you like the framework, show your support and put a star on the repository.



Author: Nikhil Nanivadekar

Lead Eclipse Collections: eclipse.org/collections, Java Champion. I enjoy hiking, skiing, reading. All opinions stated by me are my own.
