Why Does Referential Transparency Matter?
It’s not always obvious what referential transparency is or why it matters. It’s often presented alongside “complex” Functional Programming frameworks.
Here, I start from a simple case every developer deals with (constants), show the “equivalent” for a more complex case (a class), and look at the impact of having Referential Transparency or not.
Constants are easy to reason about
Constant values are easy to reason about: we know the value is fixed, so we can replace the constant with its value. We often use UPPERCASE notation to signal this behavior to the reader. It’s the same in every language:
// C
#include <stdbool.h>
#define THRESHOLD 200
bool test(int i) { return i > THRESHOLD; }

// JavaScript
const THRESHOLD = 200
function test(i) { return i > THRESHOLD }

// Scala
// val is ubiquitous in Scala so the uppercase is never used
val threshold = 200
def test(i: Int) = i > threshold
When we replace THRESHOLD with its value, we get equivalent programs:
bool test(int i) { return i > 200; }
function test(i) { return i > 200 }
def test(i: Int) = i > 200
The C preprocessor is the most straightforward implementation of this behavior: it works as a “simple find & replace” and doesn’t need to understand the program.
Can we treat everything as a constant, so that we can replace expressions with their values as we read them, without altering the meaning of the program?
That’s what referential transparency is all about.
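As a minimal sketch of that idea in Scala (the names `SubstitutionDemo`, `square`, `withName`, and `inlined` are made up for this illustration), a pure expression can either be bound to a name or inlined at each use, and the result stays the same:

```scala
object SubstitutionDemo {
  // A pure function: its result depends only on its argument.
  def square(x: Int): Int = x * x

  // Version 1: bind the expression to a name and reuse the name.
  def withName: Int = {
    val s = square(3)
    s + s
  }

  // Version 2: replace every occurrence of the name with the expression itself.
  def inlined: Int = square(3) + square(3)

  def main(args: Array[String]): Unit = {
    println(withName) // 18
    println(inlined)  // 18
  }
}
```

Both versions compute the same value, which is exactly the property we had with THRESHOLD.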
Not everything acts as a constant
Here is a trivial class (idiomatic in Java but, hopefully, not in Scala):
class Person(var name: String) {
  def setName(n: String): Unit = name = n
  def getName(): String = name
}
It’s easy to work with:
val p = new Person("john")
println(p.getName())
// "john"
We can substitute p with its value:
println(new Person("john").getName())
// "john"
It gets more cumbersome when our function holds another reference to this Person, as in this example:
val p = new Person("john")
val p2 = p
p2.setName("henry")
println(p.getName()) // "henry"
println(p2.getName()) // "henry"
Nothing fancy: it does what we expect, because p and p2 point to the same reference. What happens if we follow the same trail as with the constant, and replace p with its value?
// we have replaced p with: new Person("john")
val p2 = new Person("john")
p2.setName("henry")
println(new Person("john").getName()) // "john"
println(p2.getName()) // "henry"
Ouch! We don’t get the same result! It happened because we are not working with the same reference anymore (it’s gone): each new Person("john") creates a distinct instance.
We just demonstrated that “Person” is not referentially transparent: we can’t replace a reference to a Person with the expression that created it.
What is causing this behavior?
Mutability drives us nuts
Person is mutable: its name is updated through setName. That shared, mutable state is what we lost during our find/replace.
As I said, this example is not idiomatic Scala; we would rather have written:
case class Person(name: String)
We no longer have getters and setters, but we have .copy:
val p = new Person("john")
val p2 = p.copy(name = "henry")
println(p.name) // "john"
println(p2.name) // "henry"
copy returns a new value and doesn’t alter the original instance. This is why we see “john” and “henry” in the output: p and p2 are different instances.
If we do our find/replace trick, we get the same result:
val p2 = new Person("john").copy(name = "henry")
println(new Person("john").name) // "john"
println(p2.name) // "henry"
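The comparison above can also be checked mechanically. The following is only a sketch: the wrapper object `ImmutableSubstitution` and the methods `original` and `substituted` are hypothetical names introduced here to put both versions of the program side by side.

```scala
object ImmutableSubstitution {
  final case class Person(name: String)

  // Original program: `p` is bound once and referenced twice.
  def original: (String, String) = {
    val p = Person("john")
    val p2 = p.copy(name = "henry")
    (p.name, p2.name)
  }

  // Same program after replacing every `p` with the expression Person("john").
  def substituted: (String, String) = {
    val p2 = Person("john").copy(name = "henry")
    (Person("john").name, p2.name)
  }

  def main(args: Array[String]): Unit = {
    println(original)                // (john,henry)
    println(substituted)             // (john,henry)
    println(original == substituted) // true
  }
}
```

Because nothing can mutate a Person after construction, it doesn’t matter whether both uses share one instance or each gets its own.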
Immutability is necessary for referential transparency.
It’s not the only prerequisite: functions must not introduce any side-effects either.
Side-effects drive us nuts
Let’s add one line to our Person:
case class Person(name: String) {
  println(s"I'm a new Person: $name")
}
This is enough to break referential transparency. The same program as before now prints more than it used to:
val p = new Person("john")
val p2 = p.copy(name = "henry")
println(p.name)
println(p2.name)

I'm a new Person: john
I'm a new Person: henry
john
henry
With our find/replace trick, we replace p with its value:
val p2 = new Person("john").copy(name = "henry")
println(new Person("john").name)
println(p2.name)

I'm a new Person: john
I'm a new Person: henry
I'm a new Person: john
john
henry
Bummer! The output is different. We created 3 instances instead of 2: new Person("john") now runs twice, and copy still creates its own instance (due to the immutability).
It’s “just” a println here, but replace it with an UPDATE to some database or an increment of some counter, and you have just introduced a bug in your program: the effect is performed twice.
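To make the counter case concrete, here is a small sketch. The `created` variable is a hypothetical stand-in for any external effect (a database row, a metric), and `CounterDemo`, `original`, and `substituted` are names made up for this example.

```scala
object CounterDemo {
  // Hypothetical stand-in for an external effect such as a DB update or a metric.
  var created: Int = 0

  final case class Person(name: String) {
    created += 1 // side effect: runs every time an instance is constructed
  }

  // Original program: `p` is bound once. Returns both names and the counter.
  def original(): (String, String, Int) = {
    created = 0
    val p = Person("john")
    val p2 = p.copy(name = "henry")
    (p.name, p2.name, created)
  }

  // Same program after replacing every `p` with Person("john").
  def substituted(): (String, String, Int) = {
    created = 0
    val p2 = Person("john").copy(name = "henry")
    val firstName = Person("john").name
    (firstName, p2.name, created)
  }

  def main(args: Array[String]): Unit = {
    println(original())    // (john,henry,2)
    println(substituted()) // (john,henry,3)
  }
}
```

The names are identical in both versions, yet the counter differs: the refactoring looks harmless from the return values alone, and that is what makes this kind of bug easy to miss.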
Solutions exist to encapsulate such side-effects and defer them. I’ll write about them in a future post (hint: IO).
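As a rough preview of what “deferring” could look like, here is a hand-rolled sketch. The `IO` class below is hypothetical and deliberately tiny; it is not the API of any real library, and `log`, `greet`, and `unsafeRun` are names assumed for this illustration.

```scala
object DeferredEffectDemo {
  // Hypothetical, minimal IO: a description of an effect, not the effect itself.
  final class IO[A](thunk: () => A) {
    // Nothing happens until this is called explicitly.
    def unsafeRun(): A = thunk()
  }

  object IO {
    // The by-name parameter keeps `body` unevaluated at construction time.
    def apply[A](body: => A): IO[A] = new IO[A](() => body)
  }

  // In-memory log standing in for the console, so the effect can be observed.
  var log: Vector[String] = Vector.empty

  // Building the description performs no side effect.
  def greet(name: String): IO[Unit] =
    IO { log = log :+ s"I'm a new Person: $name" }

  def main(args: Array[String]): Unit = {
    val g = greet("john") // nothing logged yet
    println(log.size)     // 0
    g.unsafeRun()         // the effect happens here, explicitly
    println(log.size)     // 1
  }
}
```

In this sketch, calling greet("john") twice merely builds two inert descriptions, so substituting the expression for a reference to it changes nothing until something is explicitly run.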
Conclusion
We are used to working with references and mutability, but that doesn’t mean it’s the proper way.
Code that is not referentially transparent is difficult to refactor: we can’t be sure the behavior won’t change. Even excellent test coverage doesn’t protect us from unpredictable behaviors.
This is why things that are not referentially transparent are not “easy to reason about”. It’s a best practice to always provide referential transparency: we need to think less about those technical digressions and can think more about the business-oriented code, which is what we are working for.
—
If you want to dive deeper into referential transparency and into the “purity” of types, I’ve written a series of articles about it. Feel free to read them.
Thanks for reading!