Programs as Values, Part III: Explicit Control Flow

At the end of Part II, we arrived at a description of the programs as values paradigm: we represent effects as datatypes (i.e. programs as values), and we use functions to assemble big programs out of smaller ones, until the last instruction in main translates them to actual side effects.

Before diving more deeply in the structure of these datatypes, I’d like to expand on the two fundamental ideas:

We can represent execution through explicit combinators. We will see how this is possible even in the execution as evaluation paradigm.
Programs as values datatypes really do behave as values. We will show this with very simple examples based on cats.effect.IO.

Making execution explicit

In this section we will look at how to express execution via explicit combinators. We don’t yet know how to do it in programs as values, so we are going to do it in execution as evaluation. In particular we are looking to express the following program:

def p = {
  println("What's your name?")
  Thread.sleep(1000)
  val yourName = scala.io.StdIn.readLine()
  println(s"Hello, $yourName!")
  yourName.length
}

This type of sequential control flow is very natural in execution as evaluation, but on deeper thought it relies on two “magical” features: one is that a newline/semicolon between two instructions A and B means “execute A before B”, and the other is that we can refer to the result of an instruction in subsequent instructions by binding it to a val.

So here’s a challenge: in a strict language, is it possible to express p without relying on the two magical features I just described? Remember that by strict language we mean a language where the evaluation semantics are call-by-value, which means (informally) that the arguments of a function are evaluated left to right before the body of that function.

Actually step away from the page and spend a few minutes thinking how we could do this.

Did you manage? Nice!, but don’t worry if you didn’t: it’s counterintuitive at first, but we can get there by looking at call-by-value in more detail.

Let’s take this example:

def two = 2
def three = 3
def add(a: Int, b: Int): Int = a + b

add(two, three)

and add printlns to trace evaluation:

def two = {
 println("Evaluating 2")
 2
}

def three = {
 println("Evaluating 3")
 3
}

def add(a: Int, b: Int): Int = {
  println("Evaluating addition")
  a + b
}

add(two, three)
// Evaluating 2
// Evaluating 3
// Evaluating addition
// res1: Int = 5

The execution trace closely resembles sequential control flow, which gives us an idea on how we can sequence two instructions by exploiting the evaluation order of the arguments of a function:

def seq1[A, B](fst: A, snd: B): Unit = ()

seq1(
  println("first"),
  seq1(
    println("second"),
    println("third")
  )
)
// first
// second
// third

But that’s not general enough: when using the normal newline/semicolon, the last expression becomes the overall result. So we can slightly modify our seq1 to return the second argument instead:

def seq2[A, B](a: A, b: B): B = b

seq2(
  println("one"),
  42
)
// one
// res3: Int = 42

We are getting there, but we still cannot express the second “magical” feature, the ability to refer to the result of an instruction in subsequent instructions. This program:

val s = scala.io.StdIn.readLine()
s.length + 1

would translate to:

seq2(
  scala.io.StdIn.readLine(),
  s.length + 1
)
// error: not found: value s
//   s.length + 1
//   ^

To get to the final solution, we need to focus on the role of val yourName in this snippet:

 val yourName = scala.io.StdIn.readLine()
 println(s"Hello, $yourName!")
 yourName.length

val yourName conceptually divides the program in two halves: the first half computes yourName, and the second half depends on it.

seq2 cannot represent this idea because the two halves of the program are represented as A and B, and there is no relationship between them:

def seq2[A, B](a: A, b: B): B = b

but that shows us the path to a solution, we need to express the idea that B depends on A, i.e. that B is a function of A:

def seq3[A, B](a: A)(f: A => B): B = f(a)

and we can now write:

seq3(scala.io.StdIn.readLine())(s => s.length + 1)

seq3 also works for the simple sequencing case, by ignoring its argument:

seq3(println("hello"))(_ => println("world"))

which makes sense, normal semicolons can also be seen as a special case of named results, i.e.

println("hello")
println("world")

can be seen as:

val _ = println("hello")
val _ = println("world")

We’ve solved our problem, and can now express the original program:

def seq[A, B](a: A)(f: A => B): B = f(a)

def original = {
  println("What's your name?")
  Thread.sleep(1000)
  val yourName = scala.io.StdIn.readLine()
  println(s"Hello, $yourName!")
  yourName.length
}

def explicit =
  seq(println("What's your name?")) { _ =>
    seq(Thread.sleep(1000)) { _ =>
      seq(scala.io.StdIn.readLine()) { yourName =>
        seq(println(s"Hello, $yourName!")) { _ =>
          yourName.length
        }
      }
    }
  }

So, why did we do this? The implementation of seq still relies on evaluation, and the effectful building blocks like println aren’t represented as datatypes/values, but explicit shows that we can describe execution through combinators, and that structure can be applied to things which are values, like cats.effect.IO:

import cats.effect.IO
import scala.concurrent.duration._

 // works entirely differently from `seq`,
 // but expresses the same idea
def seqIO[A, B](a: IO[A])(f: A => IO[B]): IO[B] =
  a.flatMap(f)

def p2 =
  seqIO(IO.println("What's your name")) { _ =>
    seqIO(IO.sleep(1.second)) { _ =>
      seqIO(IO.readLine) { yourName =>
        seqIO(IO.println(s"Hello, $yourName!")) { _ =>
          IO.pure(yourName.length)
        }
      }
    }
  }

// or, idiomatically:

IO.println("What's your name") >>
IO.sleep(1.second) >>
IO.readLine.flatMap { yourName =>
  IO.println(s"Hello, $yourName!").as(yourName.length)
}

IO is a value

In programs as values effects are represented as datatypes, and IO is the most fundamental datatype in the paradigm: an IO[A] represents a program that can perform arbitrary effects, and produce a result of type A.

In the previous section I said that cats.effect.IO is a value, so what does this mean?

Let’s take a look at this program:

import cats.effect.IO

val a: IO[Unit] = IO.println("hello")
// a: IO[Unit] = IO(...)

As you can see, evaluating a doesn’t print anything, it literally just returns an instance of the IO datatype. In fact, it isn’t any different from:

val b: Int = 42
// b: Int = 42

IO gets translated into actual side-effects only in main, and cats-effect defines an IOApp trait as the entry point of your application:

import cats.effect.IOApp

object MyApp extends IOApp.Simple {
  def run: IO[Unit] = a
}

The translation happens when the JVM calls main:

MyApp.main(Array())

// hello

Because IOs are values, relying on evaluation to sequence two IOs has no effect:

val a: IO[Unit] = {
  IO.println("hey")
  IO.println("hello")
}

MyApp.main(Array())

// hello

Which is very surprising at first, but it’s equivalent to saying that the number 34 is discarded in:

val c = {
  34
  "something"
}
// c: String = "something"

Both IO.println("hey") and 34 are just values, and have no effect if you do nothing with them.

Sequencing has to be explicit through combinators, just like we saw in the previous section:

val a: IO[Unit] =
  IO.println("hey").flatMap { _ =>
    IO.println("hello")
  }

// can be shortened to
 IO.println("hey") >> IO.println("hello")

MyApp.main(Array())

// hey
// hello

And above all, remember that values are referentially transparent, so the two programs below describe the same behaviour, which is to print twice:

IO.println("hello") >> IO.println("hello")

val p = IO.println("hello")
p >> p

just like the two expressions below describe the same result:

2 + 2

val x = 2
x + x

You can always give a name to a value, and you can always replace a name with its value.

Explicit control flow with values

In this post we saw that using datatypes to describe effects requires modelling execution with explicit combinators, and that we can readily emulate sequential flow with this approach.

In most execution as evaluation languages, enriching control flow beyond that requires additional features, such as try/catch, for loops, or async functions, but those concepts can all be represented as combinators over values:

ioA.handleErrorWith(ioB)
listA.traverse(i => myIO(i))
(ioA, ioB).parMapN(f)

and the same approach scales all the way to very rich, compositional apis:

val randomWait =
  Stream
    .eval(Random.scalaUtilRandom[IO])
    .flatMap { random => Stream.repeatEval(random.nextInt) }
    .evalMap { n => IO.sleep(n.nanos) }

val hello = IO.println("hello")

Stream
  .repeatEval(hello)
  .zipLeft(randomWait)
  .take(10)
  .interruptAfter(10.seconds)

as well as to other kinds of effectful programs, such as parsers:

val listSep: Parser[Unit] =
  Parser.char(',').surroundedBy(whitespace.rep.void).void

Next time, we are going to start looking in detail at the structure of programs as values datatypes, by talking about algebras. See you then!