In Kotlin, sequences provide a way to perform lazy and efficient transformations on collections. Unlike regular collections, which eagerly evaluate all their elements when created, sequences only evaluate elements as needed, making them a powerful tool for working with large data sets or performing complex transformations on collections.
In this blog post, we will explore the concept of Kotlin sequences, their benefits, and how to use them effectively.
What are Sequences in Kotlin?
A sequence is an interface in Kotlin that represents a collection of elements that can be iterated over lazily. Each element is evaluated only when it is accessed, and not when the sequence is created.
public interface Sequence<out T> {
public operator fun iterator(): Iterator<T>
}
The Sequence interface is a generic interface, which means that it can work with any type of element. The type parameter T represents the type of elements in the sequence.
The Sequence interface has a single method, iterator(), which returns an Iterator that can be used to iterate over the elements of the sequence. The Iterator interface has two methods: hasNext() and next(). The hasNext() method returns true if there are more elements in the sequence to be iterated over, and false otherwise. The next() method returns the next element in the sequence.
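To make the interface concrete, here is a minimal sketch of a hand-written Sequence. The class name Countdown is hypothetical and chosen only for illustration; it is not part of the standard library.

```kotlin
// Hypothetical example class: a sequence that counts down from `start` to 1.
class Countdown(private val start: Int) : Sequence<Int> {
    // Each call returns a fresh iterator, so this sequence can be iterated repeatedly.
    override fun iterator(): Iterator<Int> = object : Iterator<Int> {
        private var current = start

        override fun hasNext(): Boolean = current > 0

        override fun next(): Int {
            if (!hasNext()) throw NoSuchElementException()
            return current--
        }
    }
}

fun main() {
    val countdown = Countdown(3)

    // Manual iteration using the two Iterator methods.
    val iterator = countdown.iterator()
    while (iterator.hasNext()) {
        println(iterator.next()) // prints 3, 2, 1 on separate lines
    }

    // Standard sequence operations work on any Sequence implementation.
    println(countdown.map { it * 10 }.toList()) // [30, 20, 10]
}
```

Because iterator() creates new state on every call, the second pass over countdown in main starts again from 3.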
It is important to note that sequences can generally be iterated multiple times. However, some sequence implementations may only allow a single iteration over the elements and will throw an exception if iterator() is called a second time. This behavior is documented for each such implementation (e.g. the generateSequence overload that takes no seed value), and it is generally preserved by sequence operations like map, filter, etc.
In the example below, iterating the sequence a second time fails with an IllegalStateException and the message "This sequence can be consumed only once."
var count = 3
val sequence = generateSequence {
    (count--).takeIf { it > 0 } // returns null once the value becomes non-positive,
                                // which terminates the sequence
}
println(sequence.toList()) // [3, 2, 1]
sequence.forEach { } // throws IllegalStateException: the sequence has already been consumed
In contrast, a regular collection such as a list or set eagerly evaluates all its elements when created. This can be wasteful if you only need a subset of the elements, or if you need to perform complex transformations on the elements.
Sequences provide a more efficient way to work with collections, as they only evaluate the elements that are accessed. This makes them ideal for working with large data sets or performing complex transformations on collections.
BTW, What are eager and lazy operations?
Eager operations in Kotlin are operations that are performed immediately, as soon as they are called. Each one creates an intermediate collection to hold its results. For example, the filter and map functions are eager when they are called on a collection such as a List.
Here is an example of an eager operation:
val numbers = listOf(1, 2, 3, 4, 5)
val doubled = numbers.filter { it % 2 == 0 }.map { it * 2 }
println(doubled) // Output: [4, 8]
In this example, the filter function is applied to the numbers list to get a new list of even numbers, and then the map function is applied to that list to double each number. The list returned by filter is an intermediate collection: it exists only to feed map and is discarded afterwards. The final result, doubled, is printed on the console.
Lazy operations in Kotlin, on the other hand, are operations that are not performed immediately. They return a sequence that remembers the operations until the results are actually needed. The operations are performed only when you iterate over the sequence. This is often more efficient for large collections because it avoids creating intermediate collections.
Here is an example of a lazy operation:
val numbers = listOf(1, 2, 3, 4, 5)
val doubled = numbers.asSequence().filter { it % 2 == 0 }.map { it * 2 }
println(doubled.toList())
In this example, the asSequence function is used to convert the numbers list to a sequence. Then, the filter and map functions are applied to the sequence to get a new sequence of even numbers, and then to double each number. No intermediate collection is created. The doubled sequence is only evaluated when the toList function is called, which triggers the sequence to be iterated and the operations to be performed.
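The difference in evaluation order can be observed directly by recording each lambda call. The following is a small sketch; the function names eagerTrace and lazyTrace are made up for this illustration.

```kotlin
// Records the order of filter/map calls for an eager List pipeline.
fun eagerTrace(): List<String> {
    val trace = mutableListOf<String>()
    listOf(1, 2, 3, 4)
        .filter { trace += "filter $it"; it % 2 == 0 }
        .map { trace += "map $it"; it * 2 }
    return trace
}

// Records the order of filter/map calls for a lazy Sequence pipeline.
fun lazyTrace(): List<String> {
    val trace = mutableListOf<String>()
    listOf(1, 2, 3, 4)
        .asSequence()
        .filter { trace += "filter $it"; it % 2 == 0 }
        .map { trace += "map $it"; it * 2 }
        .toList() // terminal operation: without it, nothing would be recorded
    return trace
}

fun main() {
    // Eager: filter finishes for the whole list before map starts.
    println(eagerTrace()) // [filter 1, filter 2, filter 3, filter 4, map 2, map 4]
    // Lazy: each element travels through the whole chain before the next one starts.
    println(lazyTrace())  // [filter 1, filter 2, map 2, filter 3, filter 4, map 4]
}
```

The list version processes the data step by step, while the sequence version processes it element by element.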
Internally, how do eager and lazy operations work?
Let's take a closer look at the map function for Iterable (in the case of collections) and for Sequence in Kotlin:
The map function for Iterable eagerly creates a new ArrayList with the expected size of the resulting list, then applies the given transform function to each element in the iterable and adds the transformed element to the new ArrayList. Finally, it returns the resulting list.
public inline fun <T, R> Iterable<T>.map(transform: (T) -> R): List<R> {
    return mapTo(ArrayList<R>(collectionSizeOrDefault(10)), transform)
}
Here the map function for Iterable takes a lambda function as its argument, which it applies to each element in the collection, and returns a new list containing the transformed elements. The resulting list is eagerly created and stored in memory, which means that each transformation operation is performed immediately, and the entire list is stored in memory at once. This can be inefficient for large collections, especially if multiple intermediate lists are created in a chained operation.
On the other hand, the map function for Sequence lazily creates a new TransformingSequence object that wraps the original sequence and applies the given transform function to each element on demand when the resulting sequence is iterated over. The transformed elements are not stored in a new collection; instead, they are calculated on the fly as needed. Finally, it returns the resulting sequence.
public fun <T, R> Sequence<T>.map(transform: (T) -> R): Sequence<R> {
    return TransformingSequence(this, transform)
}
Here the map function for Sequence returns a new sequence containing the transformed elements. The sequence is lazily evaluated, which means that the transformation operation is not performed immediately. Instead, the elements are transformed on demand as they are needed, such as when iterating over the sequence or calling a terminal operation. This can be more efficient for large collections, as it avoids the creation of intermediate lists and only performs the transformations that are actually needed.
Looking carefully at both implementations, we can see that the map function for Iterable creates a new ArrayList whose initial capacity is the size of the source when the source is a Collection (and 10 otherwise), then calls the mapTo function to perform the transformation and store the results in the newly created list.
In contrast, the map function for Sequence returns a new TransformingSequence object, which wraps the original sequence and the transformation lambda. When a terminal operation is performed on the sequence, such as toList() or forEach(), the iterator method is called on the TransformingSequence object, which in turn calls the iterator method on the original sequence and applies the transformation lambda to each element as it is retrieved.
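The wrapping idea can be sketched in a few lines. The class below is a simplified, hypothetical stand-in written for this article; the names MappingSequence and lazyMap are assumptions, and the real TransformingSequence in the standard library is internal and differs in detail.

```kotlin
// Simplified, hypothetical stand-in for the standard library's internal TransformingSequence.
class MappingSequence<T, R>(
    private val source: Sequence<T>,
    private val transform: (T) -> R
) : Sequence<R> {
    override fun iterator(): Iterator<R> = object : Iterator<R> {
        // The source iterator is requested only when iteration of the wrapper starts.
        private val upstream = source.iterator()

        override fun hasNext(): Boolean = upstream.hasNext()

        // The transformation runs for one element at a time, when it is requested.
        override fun next(): R = transform(upstream.next())
    }
}

// Hypothetical extension mirroring the shape of Sequence.map.
fun <T, R> Sequence<T>.lazyMap(transform: (T) -> R): Sequence<R> =
    MappingSequence(this, transform)

fun main() {
    var calls = 0
    val mapped = sequenceOf(1, 2, 3).lazyMap { calls++; it * 2 }

    println(calls)          // 0: building the pipeline transforms nothing
    println(mapped.first()) // 2
    println(calls)          // 1: only the requested element was transformed
}
```

Creating the wrapper costs one small object regardless of how many elements the source contains, which is why chaining sequence operations is cheap until a terminal operation runs.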
Creating sequences
We can create sequences using several different approaches. Here are some of the most common ways:
From elements
To create a sequence from a list of elements, call the sequenceOf() function, listing the elements as its arguments.
//sequenceOf: Creates a sequence from a list of elements
val numberSequence = sequenceOf(1, 2, 3, 4)
From an Iterable
If you already have an Iterable object (such as a List or a Set), you can create a sequence from it by calling asSequence().
// listOf().asSequence(): Converts a list to a sequence.
val list = listOf(1, 2, 3, 4)
val numberSequence = list.asSequence()
From a function
One more way to create a sequence is by building it with a function that calculates its elements. To build a sequence based on a function, call generateSequence() with this function as an argument.
//generateSequence: Creates a sequence from a seed value and a function that generates the next element based on the previous element.
val infiniteSequence = generateSequence(1) { it + 1 }
The generateSequence function takes a seed value and a function that generates the next element based on the previous element. In this example, we start with a seed value of 1 and generate the next element by adding 1 to the previous element. This creates an infinite sequence of natural numbers.
To create a finite sequence with generateSequence(), provide a function that returns null after the last element you need.
val finiteSequence = generateSequence(1) { if (it < 18) it + 1 else null }
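The seed does not have to be a single number. As a further sketch of seed-based generation, the example below carries a pair of values as its state; the function name fibonacci is illustrative and not taken from the text above.

```kotlin
// Hypothetical helper: an infinite Fibonacci sequence built with generateSequence.
// The seed is a pair (current, next); each step shifts the pair forward by one.
fun fibonacci(): Sequence<Int> =
    generateSequence(Pair(0, 1)) { (current, next) -> Pair(next, current + next) }
        .map { it.first }

fun main() {
    // take() bounds the infinite sequence before the terminal toList() call.
    println(fibonacci().take(8).toList()) // [0, 1, 1, 2, 3, 5, 8, 13]
}
```

Since the sequence is infinite and uses Int, it should always be bounded with something like take() before a terminal operation; far enough along, the values would overflow.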
From chunks
Finally, there is a function that lets you produce sequence elements one by one or in chunks of arbitrary size: the sequence() function.
val oddNumbers = sequence {
    yield(1)
    yieldAll(listOf(3, 5))
    yieldAll(generateSequence(7) { it + 2 })
}
println(oddNumbers.take(5).toList()) // output : [1, 3, 5, 7, 9]
The above code snippet creates a sequence of odd numbers and prints the first five elements of the sequence.
The sequence function is used to create the sequence. The lambda passed to sequence contains a series of yield and yieldAll statements that define the elements of the sequence.
The yield function is used to emit a single value from the sequence. In this case, the sequence starts with the odd number 1.
The yieldAll function is used to emit multiple values from the sequence. In the first yieldAll call, a list of odd numbers [3, 5] is emitted. In the second yieldAll call, the generateSequence function is used to emit an infinite sequence of odd numbers starting from 7. The lambda passed to generateSequence takes the previous number and adds 2 to it to generate the next number in the sequence.
Finally, the take function is used to get the first five elements of the sequence, and the toList function is used to convert the sequence into a list. The output of the code snippet is [1, 3, 5, 7, 9], which is the first five odd numbers.
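The body of the sequence builder also runs lazily: it is suspended at each yield and resumed only when the next element is requested. The sketch below makes this visible with a log; the function name tracedSequence and the log parameter are assumptions introduced for the illustration.

```kotlin
// Hypothetical helper: a builder-based sequence that logs before producing each element.
fun tracedSequence(log: MutableList<String>): Sequence<Int> = sequence {
    log += "producing 1"
    yield(1)
    log += "producing 2"
    yield(2)
    log += "producing 3"
    yield(3)
}

fun main() {
    val log = mutableListOf<String>()
    val numbers = tracedSequence(log)

    println(log)                      // []: the builder body has not started yet
    println(numbers.take(2).toList()) // [1, 2]
    println(log)                      // [producing 1, producing 2]: the third step never ran
}
```

This is what allows the earlier example to call yieldAll with an infinite sequence without hanging.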
Sequence operations
Operations on a sequence are generally divided into two categories: intermediate and terminal.
Intermediate Operations:
Intermediate operations are operations that return a new sequence which transforms the elements of the original sequence. Many of them, such as map(), filter(), flatMap(), take(), and drop(), are stateless or need only a small, constant amount of state. Others, such as distinct() and sorted(), are stateful: distinct() has to remember the elements it has already seen, and sorted() has to read the whole source sequence before it can produce its first element.
Intermediate operations can be chained to form a sequence of operations to perform on a sequence. However, none of these operations are executed until a terminal operation is called.
For example, consider the following code:
val numbers = listOf(1, 2, 3, 4, 5)
val result = numbers.asSequence()
    .map { it * 2 }
    .filter { it > 5 }
    .toList()
In this example, the asSequence() function is used to convert the List into a sequence. The map and filter operations are intermediate operations that return a new sequence that knows how to transform the elements of the original sequence. The toList() function is a terminal operation that returns a List containing the transformed elements of the sequence. The map and filter operations are not executed until the toList() function is called, and the resulting list is [6, 8, 10].
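One subtlety of chaining is that an intermediate operation such as sorted() still defers all work until a terminal operation runs, but once it does run it has to pull every element from its source. The sketch below counts how many elements pass through map in each case; the function name mappedCount is hypothetical.

```kotlin
// Counts how many source elements pass through map when only two results are requested.
fun mappedCount(sortFirst: Boolean): Int {
    var mapped = 0
    val source = sequenceOf(5, 3, 1, 4, 2).map { mapped++; it }
    val pipeline = if (sortFirst) source.sorted() else source

    // Nothing has been mapped so far; take(2).toList() triggers the evaluation.
    pipeline.take(2).toList()
    return mapped
}

fun main() {
    println(mappedCount(sortFirst = false)) // 2: take(2) stops after two elements
    println(mappedCount(sortFirst = true))  // 5: sorted() must read everything first
}
```

Placing operations like sorted() as late as possible, after filters have reduced the data, usually keeps more of the pipeline lazy.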
Here are a few more common intermediate operations:
map: Transforms each element of the sequence by applying a function.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val squares = numbers.map { it * it }
println(squares.toList()) // Output: [1, 4, 9, 16, 25]
filter: Returns a sequence that contains only the elements that satisfy a predicate.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val evenNumbers = numbers.filter { it % 2 == 0 }
println(evenNumbers.toList()) // Output: [2, 4]
take: Returns the first n elements of the sequence.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val firstThree = numbers.take(3)
println(firstThree.toList()) // Output: [1, 2, 3]
drop: Returns a sequence that contains all elements except the first n elements.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val withoutFirstTwo = numbers.drop(2)
println(withoutFirstTwo.toList()) // Output: [3, 4, 5]
flatMap: Maps each element of a sequence to a new sequence and flattens the resulting sequences into a single sequence.
val numbers = sequenceOf(listOf(1, 2), listOf(3, 4), listOf(5, 6))
val flattened = numbers.flatMap { it.asSequence() }
println(flattened.toList()) // Output: [1, 2, 3, 4, 5, 6]
distinct: Returns a new sequence with only the distinct elements of the original sequence.
val numbers = sequenceOf(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
val distinctNumbers = numbers.distinct()
println(distinctNumbers.toList()) // Output: [1, 2, 3, 4]
sorted: Returns a new sequence with the elements sorted in ascending order.
val numbers = sequenceOf(3, 5, 1, 4, 2)
val sortedNumbers = numbers.sorted()
println(sortedNumbers.toList()) // Output: [1, 2, 3, 4, 5]
groupBy: Groups the elements of the sequence into a map based on a given key selector function. Unlike the operations above, groupBy returns a Map rather than a sequence, so it is a terminal operation.
data class Person(val name: String, val age: Int)
val seq = sequenceOf(Person("Amol", 25), Person("Baban", 30), Person("Chetan", 25), Person("Dada", 30))
val groupedSeq = seq.groupBy { it.age }
groupedSeq.forEach { (age, people) -> println("$age: ${people.joinToString(", ") { it.name }}") }
// prints "25: Amol, Chetan", "30: Baban, Dada"
windowed: Returns a sequence of sliding windows of a given size over the elements of the sequence.
val seq = sequenceOf(1, 2, 3, 4, 5)
val windowedSeq = seq.windowed(3)
windowedSeq.forEach { println(it) } // prints "[1, 2, 3]", "[2, 3, 4]", "[3, 4, 5]"
zip: Returns a sequence of pairs of elements from two sequences that have the same index.
val seq1 = sequenceOf(1, 2, 3)
val seq2 = sequenceOf("one", "two", "three")
val zippedSeq = seq1.zip(seq2)
zippedSeq.forEach { println(it) } // prints "(1, one)", "(2, two)", "(3, three)"
Terminal Operations:
Terminal operations are operations that produce a result from the sequence rather than another sequence. How much memory they need depends on the operation: count() or sum() only keep a running value, while toList() or toSet() must hold every element of the result. Examples of terminal operations are toList(), toSet(), sum(), max(), min(), count(), any(), and all().
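When a terminal operation is called, each element is passed through the chain of intermediate operations, in the order they were chained, as it is requested. Most sequences can then be reused, and each new terminal operation re-runs the whole chain from the start. Only sequences that are constrained to a single iteration, as described earlier, are consumed by their first terminal operation and cannot be reused.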
For example, consider the following code:
val numbers = listOf(1, 2, 3, 4, 5)
val result = numbers.asSequence()
    .map { it * 2 }
    .filter { it > 5 }
    .count()
In this example, the asSequence() function is used to convert the List into a sequence. The map and filter operations are intermediate operations that return a new sequence that knows how to transform the elements of the original sequence. The count() function is a terminal operation that returns the number of elements in the sequence. The map and filter operations are not executed until the count() function is called, and the resulting count is 3.
Here are a few more examples of terminal operations on sequences:
toList: Converts a sequence to a list.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val numberList = numbers.toList()
println(numberList) // output: [1, 2, 3, 4, 5]
toSet: Converts a sequence to a set.
val numbers = sequenceOf(1, 2, 3, 2, 4, 5, 3)
val numberSet = numbers.toSet()
println(numberSet) // output: [1, 2, 3, 4, 5]
sum: Returns the sum of all elements in a sequence.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val sum = numbers.sum()
println(sum) // output: 15
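max: Returns the largest element in a sequence. In Kotlin 1.7 and later, max() throws a NoSuchElementException for an empty sequence; use maxOrNull() if the sequence may be empty.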
val numbers = sequenceOf(1, 2, 3, 4, 5)
val max = numbers.max()
println(max) // output: 5
count: Returns the number of elements in a sequence.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val count = numbers.count()
println(count) // output: 5
any: Returns true if at least one element in a sequence matches a given predicate.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val anyEven = numbers.any { it % 2 == 0 }
println(anyEven) // output: true
all: Returns true if all elements in a sequence match a given predicate.
val numbers = sequenceOf(1, 2, 3, 4, 5)
val allEven = numbers.all { it % 2 == 0 }
println(allEven) // output: false
Streams or Sequences
If you’re familiar with Java 8 streams, you’ll see that sequences are exactly the same concept. Kotlin provides its own version of the same concept because Java 8 streams aren’t available on platforms built on older versions of Java, such as Android. If you’re targeting Java 8, streams give you one big feature that isn’t currently implemented for Kotlin sequences: the ability to run a stream operation (such as map or filter) on multiple CPUs in parallel.
Kotlin sequences do not natively support parallel processing on multiple CPUs. However, on the JVM you can convert a sequence to a Java Stream with asStream() and then call parallel() on the stream to run operations on multiple CPUs in parallel.
import kotlin.streams.asStream
val numbers = sequenceOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// Convert the sequence to a Java Stream
val stream = numbers.asStream()
// Switch the stream to parallel mode
val sum = stream.parallel()
    .filter { it % 2 == 0 }
    .mapToInt { it * it }
    .sum()
println(sum) // Output: 220
In this example, we first convert a Kotlin sequence to a Java Stream using the asStream() extension function from the kotlin.streams package. Then, we use the parallel() method to run the filter and map operations on multiple CPUs in parallel. Finally, we calculate the sum of the resulting elements using the sum() method of IntStream, which is why the mapping step uses mapToInt().
Additionally, Kotlin sequences provide some additional features that Java 8 streams don’t, such as the ability to iterate over the sequence multiple times.
Ultimately, the choice between using Java 8 streams or Kotlin sequences depends on your specific requirements and the Java version you’re targeting. If you need parallel processing capabilities and are targeting Java 8 or later, then streams may be the better choice. If you’re targeting older Java versions or want the ability to iterate over a sequence multiple times, then Kotlin sequences may be the better option.
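The difference in reusability mentioned above can be checked with a short JVM-only sketch. The helper name secondRunSucceeds is hypothetical; constrainOnce() is the standard library function that turns any sequence into a single-use one.

```kotlin
import java.util.stream.Stream

// Hypothetical helper: runs a terminal operation twice and reports
// whether the second run completed without throwing.
fun secondRunSucceeds(terminalOp: () -> Unit): Boolean {
    terminalOp()
    return runCatching { terminalOp() }.isSuccess
}

fun main() {
    // An ordinary sequence pipeline can be evaluated again and again.
    val doubled = sequenceOf(1, 2, 3).map { it * 2 }
    println(secondRunSucceeds { doubled.toList() }) // true

    // A sequence wrapped with constrainOnce() rejects a second iteration.
    val once = doubled.constrainOnce()
    println(secondRunSucceeds { once.toList() }) // false

    // A Java Stream is always single-use: a second terminal operation throws.
    val stream = Stream.of(1, 2, 3).map { it * 2 }
    println(secondRunSucceeds { stream.count() }) // false
}
```

Reusing a sequence re-runs the whole chain each time, so for expensive pipelines it can be better to store the result with toList() once.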
The Benefits of Using Sequences
Using sequences has several benefits over regular collections:
- Lazy evaluation: Sequences are evaluated lazily, which means that only the elements that are accessed are evaluated, rather than all the elements at once. This can be more memory-efficient and faster than eagerly evaluating all the elements.
- Intermediate operations: Sequences provide a set of intermediate operations such as map, filter, sorted, and distinct, which allow you to transform and manipulate the elements of the sequence without creating intermediate collections.
- Short-circuiting: Sequences support short-circuiting, which means that if a terminal operation only needs to access a subset of the elements, the remaining elements will not be evaluated.
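- Read-only: The Sequence interface exposes no operations that modify the underlying elements, which makes sequence pipelines easy to reason about. This does not by itself make a sequence thread-safe, because a sequence may still read from a mutable source or a stateful generator function.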
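The short-circuiting benefit from the list above can be measured by counting predicate calls. This is a minimal sketch; the function name checksUntilFirstMatch and its parameter are made up for the illustration.

```kotlin
// Returns how many elements the filter predicate inspected before first() returned.
fun checksUntilFirstMatch(useSequence: Boolean): Int {
    var checks = 0
    val numbers = (1..1_000).toList()
    if (useSequence) {
        // Lazy: processing stops as soon as first() has found its element.
        numbers.asSequence().map { it * 2 }.filter { checks++; it > 10 }.first()
    } else {
        // Eager: map and filter each process all 1,000 elements before first() runs.
        numbers.map { it * 2 }.filter { checks++; it > 10 }.first()
    }
    return checks
}

fun main() {
    println(checksUntilFirstMatch(useSequence = true))  // 6
    println(checksUntilFirstMatch(useSequence = false)) // 1000
}
```

With the sequence, the sixth element (12) is the first one greater than 10, so the remaining 994 elements are never doubled or checked.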
Collection vs Sequence
Here are some key differences between collections and sequences in Kotlin:
Eager vs. Lazy Operations:
Collections in Kotlin are eager, which means that any operation performed on them is executed immediately, and a new collection is returned as a result. This can be inefficient for large collections because intermediate collections are created in memory, which can cause performance issues.
Sequences, on the other hand, are lazy. Operations on a sequence are not executed immediately, but instead, they are executed only when they are needed, and the result is returned as a sequence again. This means that sequences do not create intermediate collections in memory, which can be more efficient for large collections.
API Methods:
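Collections and sequences share most of the functional operations, such as filter, map, and reduce. The difference is that collections also expose their contents directly: they have a size, lists support get by index, and mutable collections support add and remove. A sequence has none of these members, because its elements are only produced during iteration.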
Iteration:
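Both collections and sequences can be iterated over using a for loop or an explicit iterator, since both provide an iterator() operator function. The difference is that iterating over a collection reads elements that already exist, while iterating over a sequence computes each element on demand.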
Conversions:
Collections can be converted to sequences using the asSequence() function, and sequences can be converted to collections using the toList() function.
Best suited for:
Collections are best suited for small to medium-sized data sets and operations that require immediate execution, while sequences are best suited for large data sets and operations that can be executed lazily.
Conclusion
Sequences in Kotlin are a powerful tool for working with collections, especially when dealing with large data sets or performing complex transformations on collections. They allow for lazy evaluation, which can be more memory-efficient and faster than eagerly evaluating all the elements.
Sequences provide a set of intermediate and terminal operations that allow you to transform and manipulate the elements of the sequence. Intermediate operations are lazy and do not evaluate the elements until a terminal operation is called.
By using sequences effectively, you can write more efficient and concise code that is easier to reason about and maintain.