D Gotchas
I have been playing around with the D Programming Language lately, and I love it. D combines the low-level control of C with modern productivity features like garbage collection, a built-in unit-testing framework and – the most recent feature – real closures.
But D is still a young language, and as such a little rough around the edges. It jumps out and bites me every now and then, forcing me to change some of my most common coding habits. Here are a couple of gotchas that made me trip more than once.
Premature Conversion
What’s the value of the variable after the assignment below?
real a = 5/2;
In D, the answer is 2. The reason for this unintuitive behavior is D’s arithmetic conversion rules, which only take the operand types into consideration. A division between two integers results in an integer as well. The type of the variable receiving the result is disregarded.
To get the desired result we need to convert at least one of the operands to a floating-point number. This can be done in two ways. Either literally:
real a = 5.0/2; // a=>2.5
Or with an explicit cast:
real a = cast(real)5/2; // a=>2.5
Note that you must convert the operand, not the result. So this won’t work:
real a = cast(real)(5/2); // a=>2
The Premature Conversion gotcha is a particularly nasty one. It compiles and runs, which means only testing can reveal this bug.
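The three cases above can be put side by side in a small program. This is a minimal sketch, assuming a current D2 compiler; the function names are made up for this example.

```d
import std.stdio;

// Hypothetical helpers, named for this example only.
real divideInts(int x, int y)
{
    return x / y;              // int / int truncates, then widens to real
}

real castAfterDivide(int x, int y)
{
    return cast(real)(x / y);  // the cast arrives after the truncation
}

real castBeforeDivide(int x, int y)
{
    return cast(real) x / y;   // cast binds to x, so the division is floating-point
}

void main()
{
    writeln(divideInts(5, 2));        // prints 2
    writeln(castAfterDivide(5, 2));   // prints 2
    writeln(castBeforeDivide(5, 2));  // prints 2.5
}
```

Only the last variant converts an operand before the division takes place.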
Testing for identity
I’m a defensive programmer. I like to add assertions wherever I make assumptions in my code. By far the most common assumption I make is that an object is assigned. Here’s how I normally do it:
assert(o != null, "o should be assigned!");
In D, this is a big gotcha. The code above works as long as o is not null. If o is unassigned, we’ll get a nasty Access Violation Error. Here’s another example:
SomeObject o = null;
if (o == null) // <= Access Violation
    o = new SomeObject;
The reason is that D supports overloaded operators, in this case the equality operators (== and !=). Unlike Java, D converts the equality operator into a method call without checking for null references. So, internally, the above code gets converted to the following:
SomeObject o = null;
if (o.opEquals(null))
    o = new SomeObject;
Since o is null, the call to opEquals results in an Access Violation. Instead, you should use the is operator to check for identity.
if(o is null) ...
Or
assert(o !is null, ...)
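To make the difference between the two questions concrete, here is a minimal sketch, assuming a current D2 compiler. Point and isAssigned are hypothetical names invented for the example; they are not part of any library.

```d
// Hypothetical class used only to illustrate equality versus identity.
class Point
{
    int x;
    this(int x) { this.x = x; }

    // Equivalence: two distinct Point objects with the same x compare equal.
    override bool opEquals(Object other)
    {
        auto p = cast(Point) other;   // null if other is not a Point
        return p !is null && p.x == x;
    }
}

// Identity test against null; no method is called on o.
bool isAssigned(Object o)
{
    return o !is null;
}

void main()
{
    Point a = new Point(1);
    Point b = new Point(1);
    Point c = null;

    assert(a == b);          // equivalent values
    assert(a !is b);         // but two different objects
    assert(isAssigned(a));
    assert(!isAssigned(c));  // safe even though c is null
}
```

The is and !is checks never dereference the reference, which is why they are safe on an unassigned object.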
Despite the tripping, I actually like the idea of a separate identity operator. After all, “is a equivalent to b?” is a different question than “are a and b the same object?”. But, as we say in Sweden, It’s difficult to teach old dogs to sit.
Cheers!
Your D posts are great. D seems to have so much potential and your writings really help shine a light on it. Your blog is one of the few I look forward to reading.
Regarding premature conversion. I agree that the behavior is not ideal, but other languages have the same issue/feature, such as C#. I tried to research why it was implemented as such in C#, but could find no answers. It does make me think there was a reason for that behavior though. Perhaps another reader can shine a light on it.
…Michael…
Thank you, your comment gave me even more inspiration!
The problem comes from pointer hiding which leads to a confusion between object and value comparison semantics (something I wrote about a few weeks ago on my site).
Many OO languages have this problem because they don’t allow differentiation between a user defined value type and a user defined object type. This tends not to come up in value based languages.
The sub-heading above “Testing for identity” hits the nail right on the head. D converts what looks like an identity test into an equivalence test – whoops.
Your argument about “Premature Conversion” is incorrect, and I think that the answer 2 is what most programming languages (I only tested in C++ and Python) will give. In my understanding, the assignment expression is equivalent to:
auto t = 5/2; // t’s type deduced
real a = t; // proper conversion performed
It’s illogical to expect the compiler to enforce the left side’s type on each operand of the right side before the right side is evaluated.
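The two-step reading described in this comment can be checked directly. This is a small sketch, assuming a current D2 compiler; the variable names follow the comment.

```d
void main()
{
    auto t = 5 / 2;                       // type deduced from the operands only
    static assert(is(typeof(t) == int));  // verified at compile time
    real a = t;                           // the int value 2 is widened to real
    assert(a == 2);
}
```

The static assert shows that the right-hand side already has type int before the assignment to a real is considered.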
I realize this is an it-depends-on-how-you-see-it kind of issue. If you take the compiler’s point of view, I guess the behavior is logical. But, from my point of view, my intentions are rather clear.
I wanted a floating-point division, and a compiler is certainly capable of figuring that out. So, the rationale behind the behavior must be something else. I suspect it’s performance.
Python won’t stay that way much longer. C-style integer division is a misfeature which will be removed in 3.0. You can also get it in Python 2.5:
from __future__ import division
a = 5/2 # 2.5
b = 5//2 # 2 (old behavior with new operator)
The reason 5/2 is 2 is probably consistency. It behaves the same way as if a and b were int variables. A simple integer literal is of the type int. The expression ‘5/2;’ all by itself is a valid statement, at least in C. It makes sense that putting it in a context doesn’t change its meaning. I guess you’re aware how different floats and ints are for the CPU, both semantically and performance-wise.
Of course, changing it to be the way you suggest, has some advantages.
And about assert(o != null). I think if you do the corresponding operation (trying to use a null reference) in Java, you get a NullPointerException. The difference is that in D, you need to run it in the debugger to get the line number and file name. That is probably the most important reason some people use the phobos backtrace hack. Python uses the same operators that D does here (== and is). 🙂
Thank you for your comment. The 5/2 could definitely be exchanged for integer variables, but I chose the literals for sake of simplicity.
In Java, if o is null the == operator resorts to an identity check.
Ouch. That == is an ugly one. And why o !is null rather than !o is null? !(o is null)?
!(o is null) works too, but I prefer the (o !is null). As an analogy to the != operator.
Suppose the compiler read your intentions and converted 5/2 to floating point in this case before doing the operation. It would be a special rule. Now suppose you actually want to do an integer division and assign it to a real. What would that look like?
real a = cast(int) 5/2; //???
real a = 5/2; a = floor(a); //hmm
Wouldn’t that be nasty?
‘o !is null’ is just another syntax for ‘!(o is null)’, similar to ‘!=’
Well, I have no problem with floor(5/2) or round(5/2), since division is most likely used with floating-points anyway. And the rounding functions are more expressive than the current behavior.
I think Object Pascal solves this in a nice way though, by having a special operator for integer division:
a := 5/2; // => 2.5
a := 5 div 2; // => 2
Actually, in Java the == operator is always an identity check. You have to use the equals() method for equivalence test.
What’s interesting is that both of these features are from Visual Basic, and the syntax and behavior is almost identical. Of course, liking D is cool and stylish, while saying something nice about VB is certainly not :). Anyway…since as far back as I can remember:
Vb:
If object = Nothing Then ‘…
>What’s interesting is that both of these features are from Visual
> Basic, and the syntax and behavior is almost identical.
They were in VB, not from it. Python, which had its first release a couple months prior to the release of VB in 1991, had the “is” operator.
No, the “is” operator is from Delphi (Object Pascal).
All the rest of the stuff is taken directly from C.
I can’t imagine how bad the confusion would be if D didn’t have 3/2 == 1. For a language that keeps such syntactic closeness to C, it would be a nightmare.
No, even Python never actually moved to having the default be “true division”. The great thing about integer division is how much more accurate it is than floating point, simply because you can keep exact answers by using the modulus, something floating point cannot do.
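The point about the modulus can be illustrated with a short sketch, assuming a current D2 compiler. QuotRem and divide are hypothetical names invented for the example.

```d
// Hypothetical pair type holding both parts of an integer division.
struct QuotRem
{
    int quot;
    int rem;
}

// Keeps the remainder alongside the truncated quotient.
QuotRem divide(int n, int d)
{
    return QuotRem(n / d, n % d);
}

void main()
{
    auto r = divide(7, 2);
    assert(r.quot == 3 && r.rem == 1);
    assert(r.quot * 2 + r.rem == 7);  // the original value is recovered exactly
}
```

Quotient and remainder together lose no information, so the dividend can always be reconstructed.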
I agree. This is of course the reason.
Presumably D is aimed at programmers who know C and C++. The premature conversion you describe is not a bug, it’s how C and C++ already work. The expression itself has a type regardless of what it is ultimately assigned to. The type of dividing one integer by another is an integer. Assigning that value to a real causes the type conversion.
In C:
double x;
x = 5 / 2;
printf("%f", x);
Outputs 2.000000.
If D’s handling of conversions were the opposite of what C and C++ (and other languages) do, it would confuse a lot of programmers.
If you think this through I believe you’ll conclude that C, C++, and D are doing the right thing, and the alternative you propose is actually more confusing.
I totally agree!
I don’t agree on this one. The result of a division between two integers is most likely not an integer, so why should that be the default?
There are some other unintuitive things, such as
uint a;
int b;
a = 1;
b = -3;
writefln(a + b); // 4294967294
This may not seem like a problem at first sight, but when working with SDL, for example, where rectangle sizes are unsigned and positions are signed, it regularly produces hard-to-track errors in my code.
Even when I think of it while writing the code, I find it ugly to have to cast that much.
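One way to contain the casting is to do it in a single helper. This is a minimal sketch, assuming a current D2 compiler; signedSum is a hypothetical name invented for the example.

```d
import std.stdio;

// Hypothetical helper: widen to long so the sum cannot wrap around.
long signedSum(uint a, int b)
{
    return cast(long) a + b;  // a becomes long first, then b is promoted to long
}

void main()
{
    uint a = 1;
    int b = -3;
    writeln(a + b);            // prints 4294967294: b is converted to uint first
    writeln(signedSum(a, b));  // prints -2
}
```

Widening to long works here because both uint and int fit in it without loss.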
Addition: You are right that similarity to C/C++ also matters for intuitiveness. But maybe an optional flag that warns about such implicit conversions, or something along those lines, would be nice.
Really good and really interesting post. I expect (and other readers maybe :)) new useful posts from you!
Good luck and successes in blogging!