The Functional Subset of D
As I wrote about in an earlier post, the future of D lies in the field of functional programming. More specifically, what the creators of D are trying to do, is to construct a pure functional subset that can be utilized within the otherwise imperative language.
Let’s take a closer look at that functional subset that is taking form in the experimental 2.0 version of D.
Immutable data
The most fundamental difference between a purely functional language and an imperative one is how they treat data. Many of us are used to think of data in terms of state, where variables can be changed through assignments. But in a functional program there are no states. There are only constant values and functions that operate on them.
In D, immutability is achieved with the invariant keyword, either as a storage class:
invariant int i = 3;
or as a type constructor:
invariant(int) i = 3;
Transitive invariance
The side-effect free nature that comes with immutable data has some great advantages. For one thing it simplifies testing since the result of a function only depends on its input. There are also some optimizations that can be done by the compiler, but the biggest advantage is that programs written in this way are thread-safe by default.
To take advantage of these things the compiler needs to be able to trust the immutability of our data. This is where transitivity comes in. In D, invariance is transitive, which basically means that no data reachable from an invariant reference can be changed. Here’s an example.
int ga = 2; // mutable struct A { int *p = &ga; // pointer to mutable } invariant(A) a; // a is immutable A b; // b is mutable // invariant is transitive a = b; // ERROR, a is immutable a.p = null; // ERROR, a.p is immutable *a.p = 3; // ERROR, data pointed to by a.p is also immutable
Garbage collection
Since data must never change in a functional program, and consequently must not be destroyed while the data is in use, it’s usually a good idea for a functional language to utilize automatic memory management. Like most functional languages, D features garbage collection (alongside with explicit memory management.)
Higher class functions
In order to do anything interesting in a purely functional language you need higher order functions, or – in other words – the ability to send functions as arguments to other functions. For this we can use the function pointers (or delegates for methods and nested functions).
As an example, let’s say that we want to create a function that calculates the sum of two adjacent Fibonacci numbers. Here’s one way to do that.
int nth_fib(int n) { if(n == 0) return 0; if(n == 1) return 1; return nth_fib(n-1) + nth_fib(n-2); } int add_next_fib(int n) { return nth_fib(n) + nth_fib(n+1); }
Now, let’s say that we want to do the same operation on a different sequence, for example natural numbers. Well, we could use the good old copy and paste but that isn’t very DRY. Let’s make add_next a higher order function instead so that it could be used with any sequence function.
int add_next(int n, <em>int function(int) nth</em>) { return nth(n) + nth(n+1); } int i = add_next(3, &amp;nth_fib); // i is 8 (3+5)
Now, we can write any sequence function we want and have add_next apply it.
// Sequence function for natural numbers int nth_nat(int n) { return n; } int i = add_next(3, &amp;nth_nat); // i is 7 (3+4)
Note: For methods and nested functions, the keyword function is replaced with the keyword delegate, otherwise it’s the same syntax.
Closures
Closures is another indispensable feature of functional languages. In short, it’s the ability to extract a function pointer for later use, and when invoked the function will still have access to the context in which it was created, even though that context has gone out of scope.
In D, closures are created with the delegate keyword.
int delegate() create_closure() { int x = 3; int f() { return x; } return &amp;f; } int delegate() a_closure = create_closure(); int i = a_closure(); // i is 3
Note that the extracted function f (referenced by the a_closure variable) accesses the local variable x, although it has gone out of scope at the time of execution. D got this ability with the 2.07 version, before that it didn’t have real closures.
Currying
Closures provide an easy way to do currying, which is common in functional languages. Simply put, currying is a technique where functions take a general function and return a new, more specialized one.
For instance, we could curry our add_next function in our previous example and create a specialized version of it, say add_next_fib (and thus get back to where we started).
int delegate(int) curry_add_next(int function(int) nth) { int curry_f(int n) { return add_next(n, nth); } return &amp;curry_f; } int delegate(int) add_next_fib = curry_add_next(&amp;nth_fib); int i = add_next_fib(5); // i is still 8
Pure functions
These features are all we need to write purely functional code, but in order to take full advantage of the functional programming paradigm some major things remain unsolved.
For one thing, the compiler needs to know whether or not our code is functional in order to apply possible optimizations. The easiest way to do this is to give the programmer a keyword to tell the compiler she wishes purity, and then have the compiler enforce it. In D this is the purpose of the pure storage class.
pure int a_pure_square_function(invariant(int) x) { return x * x; }
A pure function must not access non-invariant data, and may not invoke other non-pure functions. As per D2.13, the pure storage class has not had its semantics implemented and are therefore not enforcing purity. I sense this is not a trivial matter, so it may take some time before we have it.
Cheers!
To guarantee immutability, shouldn’t b = a be denied as well? Otherwise I could do b = a; b.p = null?
Good point! D handles this case too, but not all the way. This is a topic I’m planning to delve into soon, in a different post.
You are confusing immutable data with pure functions, they are not related. Just because your function arguments are immutable doesn’t mean that a function only depends on its input.
I see your point, but I’m not confusing immutable data with pure functions. I expressed myself rather sloppy though. What I meant to say was that if all data in your program is immutable, some great advantages come. I didn’t particularly mean arguments, but rather all data that the function utilizes.
In a language where all data is immutable, there is no concept of “pure” functions, since they all are. So in that sense, immutable data and pure functions are related.
Hmmm, I’m not sure that follows. Data can be proven to never be needed again, I guess you could call that “garbage”. It can also be recalculated as needed.
So true, I really should rewrite that part. Thank you for pointing it out.