Archive for the ‘programming’ Category

PowerShell TDD: Testing CmdletBinding and OutputType

July 11th, 2022

A while ago, I decided to add PowerShell to my automation toolbox. I believe the best way to learn a new language is to do test-driven development, and PowerShell has a fantastic module for this, called Pester. The framework is super intuitive, easy to do mocking in, and allows you to develop your code design from tests.

However, one thing bugged me. I didn’t seem to be able to write code that tested whether a function had declared CmdletBinding (i.e. was an Advanced Function), or if it had declared OutputType.

# How can I test this?
function MyFunction {
    [CmdletBinding(SupportsShouldProcess, ConfirmImpact='High')]
    [OutputType('String')]
    ...
}

Google didn’t have anything useful on this subject, so I tried my luck on Stack Overflow, and mclayton pointed me in the right direction (although, as mclayton puts it, it’s a bit of a mouthful). It turns out that I can use the built-in Abstract Syntax Tree (AST), and specifically the param block attributes, to find out whether CmdletBinding is declared.

Testing CmdletBinding

The below function takes a command as input and looks for a CmdletBinding param block attribute.

function Test-CmdletBinding {
    [OutputType([Bool])]
    [CmdletBinding()]
    param (
        # The command to test
        [Parameter(Mandatory)]
        [System.Management.Automation.CommandInfo]
        $Command
    )

    $attribute = $Command.ScriptBlock.Ast.Body.ParamBlock.Attributes |
        Where-Object { $_.TypeName.FullName -eq 'CmdletBinding' }

    $null -ne $attribute
}

You can then use the helper function in a Pester test (inside a Describe block), like this.

It "Should be an advanced function" {
    $c = Get-Command -Name MyCommand
    Test-CmdletBinding $c | Should -BeTrue
}

Testing CmdletBinding Arguments

That’s great, but what about arguments, like SupportsShouldProcess or ConfirmImpact?

[CmdletBinding(SupportsShouldProcess, ConfirmImpact='High')]

How can I test for those? Well, that’s where the mouthful bits come in, I guess, but the good news is that it’s doable. Here’s a helper function that can test those scenarios. It takes a command, an argument name, and an optional argument value, and returns true if the command meets those conditions.

function Test-CmdletBindingArgument {
    [CmdletBinding()]
    [OutputType([Bool])]
    param (
        # The command to test
        [Parameter(Mandatory)]
        [System.Management.Automation.CommandInfo]
        $Command,

        # The name of the CmdletBinding argument, e.g. 'SupportsShouldProcess'
        [Parameter(Mandatory)]
        [string]
        $ArgumentName,

        # The expected argument value, e.g. 'High'
        [Parameter()]
        [string]
        $ArgumentValue
    )

    $attribute = $Command.ScriptBlock.Ast.Body.ParamBlock.Attributes |
        Where-Object { $_.TypeName.FullName -eq 'CmdletBinding' }

    if ($attribute) {
        $argument = $attribute.NamedArguments |
            Where-Object { $_.ArgumentName -eq $ArgumentName }

        if ($null -eq $argument) {
            # The attribute does not have the argument, return false
            $false
        } elseif ($argument.ExpressionOmitted) {
            # e.g. [CmdletBinding(SupportsShouldProcess)]
            $ArgumentValue -eq '' -or $ArgumentValue -eq $true
        } elseif ($argument.Argument.Extent.Text -eq '$true') {
            # e.g. [CmdletBinding(SupportsShouldProcess=$true)]
            $ArgumentValue -eq '' -or $ArgumentValue -eq $true
        } elseif ($argument.Argument.Extent.Text -eq '$false') {
            # e.g. [CmdletBinding(SupportsShouldProcess=$false)]
            $ArgumentValue -eq $false
        } else {
            # e.g. [CmdletBinding(ConfirmImpact='High')]
            $ArgumentValue -eq $argument.Argument.Value
        }
    } else {
        # No such attribute exists on the command, return false
        $false
    }
}

The code handles both implicit and explicit values, e.g.

[CmdletBinding(SomeArgument)]
[CmdletBinding(SomeArgument=$true)]  # same as declaring the argument without a value
[CmdletBinding(SomeArgument=$false)] # same as not declaring the argument at all
[CmdletBinding(SomeArgument='Some Value')]

Here are a couple of examples of Pester tests using the helper function.

It "Should have CmdletBinding with ConfirmImpact set to High" {
    $c = Get-Command -Name MyCommand
    Test-CmdletBindingArgument $c -ArgumentName 'ConfirmImpact' -ArgumentValue 'High' | Should -BeTrue
}

It "Should have SupportShouldProcess declared" {
    $c = Get-Command -Name MyCommand
    Test-CmdletBindingArgument $c -ArgumentName 'SupportShouldProcess' | Should -BeTrue
}

Testing OutputType

Testing OutputType requires a slightly different approach. Since the OutputType attribute declares a type, we have to access it through the positional arguments (instead of the named arguments we used for CmdletBinding).

Here’s a helper function to verify that a given command has declared a given type as its output type.

function Test-OutputType {
    [OutputType([Bool])]
    [CmdletBinding()]
    param (
        # The command to test
        [Parameter(Mandatory)]
        [System.Management.Automation.CommandInfo]
        $Command,

        # The expected output type name, e.g. 'String'
        [Parameter(Mandatory)]
        [string]
        $TypeName
    )

    $attribute = $Command.ScriptBlock.Ast.Body.ParamBlock.Attributes |
        Where-Object { $_.TypeName.FullName -eq 'OutputType' }

    if ($attribute) {
        $argument = $attribute.PositionalArguments | Where-Object {
            if ($_.StaticType.Name -eq 'String') {
                # Declared as a string, e.g. [OutputType('Bool')]
                $_.Value -eq $TypeName
            } elseif ($_.StaticType.Name -eq 'Type') {
                # Declared as a type literal, e.g. [OutputType([Bool])]
                $_.TypeName.Name -eq $TypeName
            } else {
                $false
            }
        }
        $null -ne $argument
    } else {
        $false
    }
}

Note that a type can be declared either as a type literal or as a string, but the helper function handles both cases:

[OutputType([Bool])]
[OutputType('Bool')]

And here’s an example of how it can be used within a Pester test.

It "Should have Output type Bool" {
    $c = Get-Command -Name MyCommand
    Test-TDDOutputType $c -TypeName 'Bool' | Should -BeTrue
}

Introducing TDDUtils

I have created a PowerShell module called TDDUtils, which contains the above functions (although named with a TDD prefix), as well as more general versions that allow you to test other attributes than CmdletBinding and OutputType.

You can install it from an elevated (Administrator) PowerShell session with the command below.

Install-Module TDDUtils

I plan to add more useful functions to that module as I go on with my PowerShell TDD journey, and if you want to contribute or just give me some suggestions, feel free to contact me on Twitter.

New Xcode Tutorial: Turning View-based Into Navigation-based

September 30th, 2011

The other day I started a new iPhone project to create an app that I had been thinking about for quite some time now. I meant to build a Navigation-based app, but for some reason I used the general View-based application template to create my project.

Nemas problemas, I thought, transforming it into a Navigation-based project should be simple enough. Call me stupid, or stubborn, but what seemed like an easy task took me the better part of a day to achieve; Xcode can be very unintuitive at times, and the error messages not very helpful 🙂

Anyway, I thought I’d better put my achievement down in writing in case some of you out there have the same need one day. Then maybe you’ll thank me for writing the Turning View-based into Navigation-based tutorial.

Cheers!


Steve and Nat: Encapsulate Generics

February 21st, 2011

I’m reading Growing Object-Oriented Software, Guided by Tests, by Steve Freeman and Nat Pryce. I was happy to find that the authors share a similar view on constructed generic types as I do.

Our rule of thumb is that we try to limit passing around types with generics […]. Particularly when applied to collections, we view it as a form of duplication. It’s a hint that there’s a domain concept that should be extracted into a type.

A less blunt way of saying what I tried to say in my own post.

If you’re interested in the book I can strongly recommend it. Although I’m only half-way through it I can already tell it’s the best TDD book I have read so far. I’ll have a brief review up in a coming post.

Cheers!


Generic Types are Litter

July 28th, 2010

The short version of this post:

Do not spread generic types around. They are ugly and primitive.

Allow me to elaborate.

Generics are great

Before we got generics (2004 in Java, 2005 in C#, and 2009 in Delphi), the way to enforce type safety on a collection type was to encapsulate it and use delegation.

class OrderList
{
  private ArrayList items = new ArrayList();

  public void Add(OrderItem item)
  {
    items.Add(item);
  }

  // Plus lots more delegation code
  // to implement Remove, foreach, etc.
}

Nowadays, we can construct our own type safe collection types with a single line of code.

List<OrderItem> orders = new List<OrderItem>();

That alone is good reason to love generics, and therein lies the problem: many developers have come to love their generics a little too much.

Generics are ugly

The expressiveness of generics comes at a price, the price of syntactic noise. That’s natural. We must specify the parameterized types somehow, and the current syntax for doing so is probably as good as anything else.

Still, my feeling when I look at code with generics is that the constructed types don’t harmonize with the rest of the language. They kind of stand out, and are slightly detrimental to code readability.

OrderList orders = new OrderList();
// as compared to
List<OrderItem> orders = new List<OrderItem>();

This might not be so bad, but when we start to pass the constructed types around they become more and more like litter.

List<OrderItem> FetchOrders()
{
  //…
}

double CalculateTotalSum(List<OrderItem> orders)
{
  //…
}

void PrintOrders(List<OrderItem> orders)
{
  //…
}

Besides being noisy, the constructed types expose implementation decisions. In our case, the list of orders is implemented as a straight list. What if we found out that a hash table implementation would perform better? Then we’d have to change the declaration in all those places.

Dictionary<OrderItem, OrderItem> FetchOrders()
{
  //…
}

double CalculateTotalSum(Dictionary<OrderItem, OrderItem> orders)
{
  //…
}

void PrintOrders(Dictionary<OrderItem, OrderItem> orders)
{
  //…
}

Comments redundant.

Generics are primitive

A related problem is that the constructed generic types encourage design that is more procedural than object-oriented. As an example, consider the CalculateTotalSum method again.

double CalculateTotalSum(List<OrderItem> orders)
{
  //…
}

Clearly, this method belongs in the List<OrderItem> type. Ideally, we should be able to invoke it using the dot operator.

orders.TotalSum();
// instead of
CalculateTotalSum(orders);

But we cannot make that refactoring. We cannot add a TotalSum method to the List<OrderItem> type. (Well, maybe we could make it an extension method, but I wouldn’t go there.) In that sense, constructed generic types are primitive types, closed and out of our control.

So we should hide them

Don’t get me wrong. I use generics a lot. But for the previously given reasons I do my best to hide them. I do that using the same encapsulation technique as before, or, at the very least, by inheriting from the constructed types.

class OrderList : List<OrderItem>
{
  public double TotalSum()
  {
    //…
  }
}

So, the rule I go by to mitigate the downsides of generics is this:

Constructed types should always stay encapsulated. Never let them leak through any abstraction layers, be it methods, interfaces or classes.
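To make the rule concrete, here’s a minimal sketch of what the encapsulating alternative might look like. The TotalSum logic and the Price property are my own assumptions for illustration; the point is only that List<OrderItem> never escapes the class.

using System.Collections;
using System.Collections.Generic;

// Sketched here only so the example compiles;
// in the post it is assumed to exist already.
class OrderItem
{
    public double Price { get; set; } // hypothetical property
}

// The constructed type List<OrderItem> stays an implementation detail;
// callers only ever see the domain concept: OrderList.
class OrderList : IEnumerable<OrderItem>
{
    private readonly List<OrderItem> items = new List<OrderItem>();

    public void Add(OrderItem item)
    {
        items.Add(item);
    }

    // The behavior that used to float around as CalculateTotalSum(orders)
    // now lives with the data it operates on.
    public double TotalSum()
    {
        double sum = 0;
        foreach (OrderItem item in items)
            sum += item.Price;
        return sum;
    }

    public IEnumerator<OrderItem> GetEnumerator()
    {
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Switching to, say, a hash table internally would now touch a single class instead of every signature in the code base.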

That is my view on Generics. What is yours?

Cheers!

P.S.

Just to be clear, the beef I have with generics is with constructed types alone. I have nothing against using open generic types in public interfaces, although that is more likely to be a tool for library and framework developers.

The Bad Practices of Exception Handling

October 31st, 2009

Exception handling has truly been a blessing to us software developers. Without it, dealing with special conditions and writing robust programs was a lot more painful. But, like any powerful tool, badly used it can cause more harm than good. This article names the top three on my exception handling bad practices list, all of which I’ve practiced in the past but now stay away from.

Swallowing Exceptions

Have you ever come across code like this?

try
{
  DoSomeNonCriticalStuff();
}
catch (Exception e)
{
  // Ignore errors
}

DoStuffThatMustBeDoneDespiteAnyErrorsAbove();

Of all the bad exception handling practices, this is the worst, since its effect is the complete opposite of the programmer’s intention. The reasoning goes something like this: catching exceptions where they don’t hurt makes my program more robust, since it’ll continue working even when conditions aren’t perfect.

The reasoning could have been valid if it weren’t for fatal exceptions, here described by Eric Lippert:

Fatal exceptions are not your fault, you cannot prevent them, and you cannot sensibly clean up from them. They almost always happen because the process is deeply diseased and is about to be put out of its misery. Out of memory, thread aborted, and so on. There is absolutely no point in catching these because nothing your puny user code can do will fix the problem. Just let your “finally” blocks run and hope for the best.

Catching and ignoring these fatal exceptions makes your program less robust since it will try to carry on as if nothing happened in the worst of conditions, most likely making things worse. Not very Fail fastish.

So, am I saying that ignoring exceptions is bad and should always be avoided? No, the bad practice is catching and ignoring general exceptions. Catching specific exceptions, on the other hand, is quite OK.

try
{
  DoSomeNonCriticalStuff();
}
catch (FileNotFoundException e)
{
  // So we couldn't find the settings file, big deal!
}

Bad example, I know, but you get the point.

Throwing Exception

Here’s another bad practice I come across every now and then.

throw new Exception("No connection!");


The problem is that in order to handle this exception we have to catch Exception, and if we catch Exception we have to be prepared to handle every other type of exception, including the fatal exceptions that we discussed in the previous section.

So, if you feel the need to throw an exception, make sure it’s specific.

throw new NoConnectionException();


If the idea of defining lots of specific exceptions puts you off, then the very least you should do is define your own application exception to use instead of the basic Exception. This is nothing I recommend though, since general exceptions, like ApplicationException, violate the Be specific rule of the Three Rules for Effective Exception Handling. It’ll make you depend heavily on the message property to tell different errors apart, so don’t go there.
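For completeness, here’s a minimal sketch of what such a specific exception might look like. The class name and constructors are illustrative conventions, not taken from any particular library.

using System;

// A specific exception type: callers can catch exactly this condition
// without also swallowing fatal exceptions.
public class NoConnectionException : Exception
{
    public NoConnectionException()
        : base("No connection!") { }

    public NoConnectionException(string message)
        : base(message) { }

    public NoConnectionException(string message, Exception inner)
        : base(message, inner) { }
}

A caller can then write catch (NoConnectionException) and let everything else propagate.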

Overusing exceptions

Exceptions are a language construct for handling exceptional circumstances: situations where a piece of code discovers an error but doesn’t have the context necessary to handle it. They should not be used to handle valid program states (see the sketch after this list). The reasons are:

  1. Performance. Throwing an exception, with all that’s involved, like building a stack trace, is far more expensive than returning a value.
  2. Annoyance. Debugging code where exceptions are a part of the normal execution flow can be frustrating.
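As an example of what overuse looks like, compare the two sketches below. The method names are hypothetical, but int.TryParse is the standard .NET way to express that invalid input is a normal state rather than an exceptional one.

// Vexing: invalid user input is a perfectly normal state,
// yet the control flow runs on exceptions.
int ReadNumberVexing(string input)
{
  try
  {
    return int.Parse(input);
  }
  catch (FormatException)
  {
    return 0;
  }
}

// Better: the "not a number" state is handled as a return value.
int ReadNumber(string input)
{
  int number;
  if (int.TryParse(input, out number))
    return number;
  return 0;
}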

Eric Lippert calls those exceptions vexing exceptions, which I consider a great name given the second point above. Make sure you check out his article (link at the beginning of this article).

Those were the three misuses of exception handling I consider worst. What’s on your list?

Cheers

Tools of the Effective Developer: Error Handling Infrastructure

September 3rd, 2009

I often come across code with little or no infrastructure for error handling. This is a big mistake, one that’ll make you pay increasingly as the code base grows. Why? Because your code’ll end up with loads of this:

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  MessageBox.Show("Error while parsing.", "My Application",
         MessageBoxButtons.OK, MessageBoxIcon.Error);

  // Skipping the rest due to the error
  return;
}

It may look OK but there are several problems with the error handling code above, some more severe than others.

  • Code duplication
    A message box for displaying error messages to the user should almost always have an error icon, the application name as the title, and an OK button. Repeating this all over the code base is not very DRY and should be avoided.
  • Inconsistency
    A related problem is that when you leave programmers with only the general error handling routines of the platform, chances are they’ll end up doing things differently, resulting in an inconsistent user experience; for example showing some error dialogs with an error icon and some without.
  • Information loss
    A more serious problem is the fact that the code in the example, by ignoring the exception object, drops information that could be important for tracking down a bug. We could of course mitigate that problem somewhat by showing the error message of the exception to the user (by concatenating e.Message), but we’d still miss an important piece of information: the stack trace.
    Since dumping a pile of function calls in the face of the users is a no-no, what we really need is a way to put the details in a non-intrusive place, for example a log file. If the error handling infrastructure doesn’t make this easy, we’re likely to leave that kind of information out completely. For shame.
  • Not automation-friendly
    But the most severe problem with spread-out usage of UI methods like MessageBox is that it makes your code impossible to automate. If someone has to monitor a scheduled run of, for example, unit tests, and every now and then click an OK button, it kind of defeats the purpose of automation.

So here’s my advice: Implement a strategy for handling errors at the earliest possible time. That is, right after setting up continuous integration for the Hello world application.

Properties of an Error Handling Infrastructure

So what should an error handling infrastructure be like? There’s only one rule: it has to be easy!
If it isn’t simple to use, it won’t be used, and programmers will end up using the general purpose message methods again, or worse, not doing any error handling at all.

I usually build my error handling interfaces around two use cases.

  1. Displaying error messages
  2. Logging error messages

These are often used in combination, for example

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  ApplicationEnvironment.ShowErrorMessage("Error while parsing: " + e.Message);
  ApplicationEnvironment.LogError(e);
  return;
}

Or, to avoid code duplication, combined into a single convenient method.

try
{
  ParseText(SomeText);
}
catch (Exception e)
{
  ApplicationEnvironment.HandleException("Error while parsing", e);
  return;
}

When you design your error handling interfaces, don’t be afraid to add lots of convenience methods. Remember, it has to be easy to use, and that’s what convenience methods are all about, as opposed to utility methods, which generally need to be given more arguments. In this case I prefer the Humane Interface design style over a minimal interface approach.

Needs to be configurable

Furthermore, the error handling infrastructure needs to be configurable, or support some other kind of dependency-breaking technique like dependency injection. For instance, you should be able to mute all error messages intended for the user and write them to the error log instead. This way your code will be able to run in a scripted environment, like during unit testing.
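Here’s a minimal sketch of one way to get that configurability. The interface and handler class names are mine, invented for illustration; only ApplicationEnvironment comes from the examples above.

using System;

// The error handling strategy hides behind an interface...
public interface IErrorHandler
{
  void ShowErrorMessage(string message);
  void LogError(Exception e);
}

// ...so the interactive handler used in production...
public class DialogErrorHandler : IErrorHandler
{
  public void ShowErrorMessage(string message)
  {
    // Show the standard message box here: error icon,
    // application name as the title, OK button.
  }

  public void LogError(Exception e)
  {
    // Append e.Message and e.StackTrace to the log file.
  }
}

// ...can be swapped for a silent one during scripted runs.
public class SilentErrorHandler : IErrorHandler
{
  public void ShowErrorMessage(string message)
  {
    Console.Error.WriteLine(message); // no blocking UI
  }

  public void LogError(Exception e)
  {
    Console.Error.WriteLine(e);
  }
}

public static class ApplicationEnvironment
{
  private static IErrorHandler handler = new DialogErrorHandler();

  // Test runs plug in a SilentErrorHandler here.
  public static IErrorHandler Handler
  {
    get { return handler; }
    set { handler = value; }
  }

  public static void HandleException(string message, Exception e)
  {
    handler.ShowErrorMessage(message + ": " + e.Message);
    handler.LogError(e);
  }
}

With that in place, a test fixture can set ApplicationEnvironment.Handler = new SilentErrorHandler(); once, and no message box will ever block an automated run.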

There are many ways to design an error handling infrastructure. You could create your own message dialogs that allow the user to send a report right away, you could use the built-in application log of the operating system, or just plain text files, and so on.

Whatever you choose to do, don’t forget to do it early and make it easy.

Cheers!

Previous posts in the Tools of The Effective Developer series:

  1. Tools of The Effective Developer: Personal Logs
  2. Tools of The Effective Developer: Personal Planning
  3. Tools of The Effective Developer: Programming By Intention
  4. Tools of The Effective Developer: Customer View
  5. Tools of The Effective Developer: Fail Fast!
  6. Tools of The Effective Developer: Make It Work – First!
  7. Tools of The Effective Developer: Whetstones
  8. Tools of The Effective Developer: Rule of Three
  9. Tools of The Effective Developer: Touch Typing

Method Reference Events? Nah, Not Yet.

June 5th, 2009

In my previous post I discussed the problem that method pointers aren’t able to store references to anonymous methods. Unfortunately that limitation makes anonymous methods less useful, since they can’t be used to set up event handlers on the fly, like so:

// Wouldn't it be great
// if we could do something
// like this?

Memo1.OnKeyPress :=
  procedure(Sender: TObject; var Key: Char)
  begin
    Key := Chr(Ord(Key) + 1);
  end;

One thing I wasn’t aware of when I wrote that previous post is that while method pointers can’t be used to store anonymous methods, the opposite is true: Method references can indeed store both anonymous methods and instance methods. Many thanks to Barry Kelly who pointed it out to me in a comment.

Could this be the way to go? Shall we all start using method references instead of method pointers when we declare the events of our components from now on? Let’s take a look, shall we? But just so that we’re clear about what we mean (there are many ways to say the same thing), here are the definitions I’m using in this post.

First of all, when I say method I mean either a function or a procedure, and it can be any of the types below. This differs from the common definition, which states that methods are subroutines associated with either a class or an object.

Plain methods

These are functions and procedures declared outside the context of a class.

procedure PlainProcedure;
begin
  // Has no Self pointer
end;

Instance methods

Functions or procedures associated with an object instance.

procedure TSomeClass.InstanceProcedure;
begin
  // Self is an instance of TSomeClass
end;

Class methods

Functions or procedures associated with a class. In other languages these are also called static methods.

class procedure TSomeClass.ClassProcedure;
begin
  // Self is TSomeClass
end;

Anonymous methods

Functions or procedures declared as they are assigned in an execution context. These methods capture the surrounding context and may use local variables and arguments even if they’re out of execution scope.

SomeItems.Sort(
  function(Item1, Item2: TSomeItem): Integer;
  begin
    if Item1.Value > Item2.Value then
      Result := 1
    else if Item1.Value < Item2.Value then
      Result := -1
    else
      Result := 0;
  end
);

For these four types of methods there are three reference types that can store them for later invocation.

Plain method pointer

type TPlainMethodPointer =
  procedure(ASender: TObject);

Method pointers

type TMethodPointer =
  procedure(ASender: TObject) of object;

Method references

type TMethodReference =
  reference to procedure(ASender: TObject);

This compatibility table shows which method types can be stored with each reference type.

                        Plain  Instance  Class  Anonymous
  Plain method pointer   YES     no       no      no
  Method pointer         no      YES      YES     no
  Method reference       YES     YES      YES     YES

From this table we can see that the method reference type is the unifying type that can store all four kinds of methods. It seems like the way to go is to embrace method references and render the other reference types obsolete. Is it possible? The answer is yes but no.

Take the event example from my last post. The VCL cannot be changed to use method references, for backward compatibility reasons (like the widespread use of TMethod casts), so there’s nothing to do about it there. But what about our own components, and components created from now on?

type 
  TNotifyMethod = reference to procedure(Sender: TObject);

  TMyComponent = class(TComponent)
    ...
  published
    property OnChange: TNotifyMethod read FOnChange write FOnChange;
  end;

Well, it works fine as long as you set up the event handlers (OnChange in the above example) dynamically, but method reference type events do not play well with the Delphi 2009 IDE. The Object Inspector doesn’t recognize possible event handlers and won’t create one automatically if you double-click the event property. If you try to force the assignment by giving the name of an existing event handler method explicitly, the IDE throws an ugly Invalid Property Error dialog.

[Screenshot: the Invalid Property Error dialog]

As much as I’d like to be able to assign anonymous methods as event handlers to my components, I’m not prepared to sacrifice the IDE integration. Hopefully CodeGear will fix this issue in future releases, but until then anonymous methods will remain less useful than they could be.

Cheers!


CodeGear, Please Fix the Anonymous Method Asymmetry

June 3rd, 2009

As I noted in my previous post, anonymous methods are a big new feature in Delphi 2009 for the Win32 platform. While “closures” are natural and much appreciated in other languages, most notably Ruby, the Delphi community is still a bit reluctant and hesitant. Segunda Feira put it into words in his post on the subject.

I am still not convinced that this convenience is worth getting all that excited about – It has not enabled anything that was previously impossible, nor even especially difficult either to implement or to understand

[…]

anonymous methods should be used only when you absolutely have to, and so far I have not yet seen an example where anonymous methods are the only way to achieve anything

I agree that more often than not a problem can be solved with equal elegance using a pure object-oriented approach, but there are situations where anonymous methods may actually be the better alternative. One situation that comes to mind is the setting up of test fixtures.
Say, for instance, that we want to test the event firing mechanism of a component. An easy way to set this up could be like the following.

procedure TestSomeComponent.TestOnChange;
var
  SUT: TSomeComponent;
  OnChangeFired: Boolean;
begin
  OnChangeFired := False;
  // Set up the fixture
  SUT := CreateTheComponentToTest;
  SUT.OnChange :=
    procedure(ASender: TObject)
    begin
      OnChangeFired := True;
      CheckSame(SUT, ASender, 'The Sender event argument should be set');
    end;
  // Perform operation
  SUT.Text := 'Some text';
  // Check the result
  CheckTrue(OnChangeFired, 'Setting the Text property should fire the OnChange event');
end;

The above code checks that the component’s OnChange event is fired when the Text property is set, and that the component is set as the Sender argument to the event handler. Besides being more compact, the biggest advantage of using anonymous methods in this scenario is that we avoid the Obscure Test smell, and that we don’t have to clutter our test class with an instance variable (like FOnChangeFired) to tell whether the event was really fired or not.

The only problem is: it doesn’t work. Why? Well, because the OnChange property is most likely of an instance method reference type (i.e. TNotifyEvent), meaning it doesn’t accept a reference to an anonymous method even if it has the same signature.

type TNotifyEvent = procedure(ASender: TObject) of object;

For our code to compile we would need to redeclare TNotifyEvent, removing the “of object” keywords and using the method reference type instead.

type TNotifyEvent = reference to procedure(ASender: TObject);

But of course, that’s not an option. It would mean that the traditional way of setting up an event handler (by using an instance method) would not work.

I see a definite problem with how the Delphi language explicitly forces you to distinguish between instance methods (and class methods, for that matter) and anonymous method references, even though they share the same signature.
This is most unfortunate, since I feel that the situations where we’d like to support both kinds of event handlers are quite common. And with the current semantics we have to resort to code duplication to achieve that, like the two Synchronize methods of the built-in TThread class.

In Delphi 2009 an overloaded variant of the TThread.Synchronize method was introduced, one that makes use of anonymous methods. Here are snippets from that code:

type
  TThreadMethod = procedure of object;
  TThreadProcedure = reference to procedure;
...
  TThread = class
    ...
    procedure Synchronize(AMethod: TThreadMethod); overload;
    procedure Synchronize(AThreadProc: TThreadProcedure); overload;
    ...
  end;

The two methods have almost identical implementations. The only real difference is the type of the argument.

procedure TThread.Synchronize(AMethod: TThreadMethod);
begin
  FSynchronize.FThread := Self;
  FSynchronize.FSynchronizeException := nil;
  FSynchronize.FMethod := AMethod;
  FSynchronize.FProcedure := nil;
  Synchronize(@FSynchronize);
end;

procedure TThread.Synchronize(AThreadProc: TThreadProcedure);
begin
  FSynchronize.FThread := Self;
  FSynchronize.FSynchronizeException := nil;
  FSynchronize.FMethod := nil;
  FSynchronize.FProcedure := AThreadProc;
  Synchronize(@FSynchronize);
end;

I may be overly sensitive, but code like that really disturbs me. Unfortunately it gets worse. If we follow the execution chain down into the Synchronize class method that is invoked by the two, we find this.

class procedure TThread.Synchronize(ASyncRec: PSynchronizeRecord; QueueEvent: Boolean = False);
var
  SyncProc: TSyncProc;
  SyncProcPtr: PSyncProc;
begin
  if GetCurrentThreadID = MainThreadID then
  begin
    if Assigned(ASyncRec.FMethod) then
      ASyncRec.FMethod()
    else if Assigned(ASyncRec.FProcedure) then
      ASyncRec.FProcedure();
  end else
    …
end;

It would be a lot nicer if the two reference types were joined under a common reference type, and I can’t see why it couldn’t be done. When I look at the “of object” keywords I get the feeling that the language is leaking compiler implementation details through the language interface; information that is indifferent to the developer. What matters from a caller’s perspective is the method signature, not whether the method has a Self pointer or not.

I hope CodeGear recognizes this problem and finds a way to clean this asymmetry from the language. Anonymous methods would be so much more useful if they do.

Cheers!

The Functional Subset of D

May 20th, 2008

As I wrote about in an earlier post, the future of D lies in the field of functional programming. More specifically, what the creators of D are trying to do, is to construct a pure functional subset that can be utilized within the otherwise imperative language.

Let’s take a closer look at that functional subset that is taking form in the experimental 2.0 version of D.

Immutable data

The most fundamental difference between a purely functional language and an imperative one is how they treat data. Many of us are used to thinking of data in terms of state, where variables can be changed through assignments. But in a functional program there is no state. There are only constant values and functions that operate on them.

In D, immutability is achieved with the invariant keyword, either as a storage class:

invariant int i = 3;

or as a type constructor:

invariant(int) i = 3;

Transitive invariance

The side-effect-free nature that comes with immutable data has some great advantages. For one thing it simplifies testing, since the result of a function depends only on its input. There are also some optimizations that can be done by the compiler, but the biggest advantage is that programs written in this way are thread-safe by default.

To take advantage of these things, the compiler needs to be able to trust the immutability of our data. This is where transitivity comes in. In D, invariance is transitive, which basically means that no data reachable from an invariant reference can be changed. Here’s an example.

int ga = 2; // mutable

struct A {
  int *p = &ga; // pointer to mutable
}

invariant(A) a; // a is immutable
A b; // b is mutable

// invariance is transitive
a = b; // ERROR, a is immutable
a.p = null; // ERROR, a.p is immutable
*a.p = 3; // ERROR, data pointed to by a.p is also immutable

Garbage collection

Since data must never change in a functional program, and consequently must not be destroyed while it is in use, it’s usually a good idea for a functional language to utilize automatic memory management. Like most functional languages, D features garbage collection (alongside explicit memory management).

Higher order functions

In order to do anything interesting in a purely functional language you need higher order functions, that is, the ability to send functions as arguments to other functions. For this we can use function pointers (or delegates for methods and nested functions).

As an example, let’s say that we want to create a function that calculates the sum of two adjacent Fibonacci numbers. Here’s one way to do that.

int nth_fib(int n) {
  if(n == 0) return 0;
  if(n == 1) return 1;
  return nth_fib(n-1) + nth_fib(n-2);
}

int add_next_fib(int n) {
  return nth_fib(n) + nth_fib(n+1);
}

Now, let’s say that we want to do the same operation on a different sequence, for example natural numbers. Well, we could use the good old copy and paste but that isn’t very DRY. Let’s make add_next a higher order function instead so that it could be used with any sequence function.

int add_next(int n, int function(int) nth) {
  return nth(n) + nth(n+1);
}

int i = add_next(3, &nth_fib);
// i is 5 (2+3)

Now, we can write any sequence function we want and have add_next apply it.

// Sequence function for natural numbers
int nth_nat(int n) {
  return n;
}

int i = add_next(3, &nth_nat);
// i is 7 (3+4)

Note: For methods and nested functions, the keyword function is replaced with the keyword delegate, otherwise it’s the same syntax.

Closures

Closures are another indispensable feature of functional languages. In short, a closure is a function pointer extracted for later use, where the function, when invoked, still has access to the context in which it was created, even though that context has gone out of scope.

In D, closures are created with the delegate keyword.

int delegate() create_closure() {
  int x = 3;

  int f() {
    return x;
  }

  return &f;
}

int delegate() a_closure = create_closure();
int i = a_closure();
// i is 3

Note that the extracted function f (referenced by the a_closure variable) accesses the local variable x, although x has gone out of scope at the time of execution. D got this ability with version 2.07; before that it didn’t have real closures.

Currying

Closures provide an easy way to do currying, which is common in functional languages. Simply put, currying is a technique where a function takes a general function and returns a new, more specialized one.

For instance, we could curry the add_next function from our previous example and create a specialized version of it, say add_next_fib (and thus get back to where we started).

int delegate(int) curry_add_next(int function(int) nth) {
  int curry_f(int n) {
    return add_next(n, nth);
  }
  return &curry_f;
}

int delegate(int) add_next_fib = curry_add_next(&nth_fib);
int i = add_next_fib(3);
// i is still 5 (2+3)

Pure functions

These features are all we need to write purely functional code, but in order to take full advantage of the functional programming paradigm some major things remain unsolved.

For one thing, the compiler needs to know whether or not our code is functional in order to apply possible optimizations. The easiest way to do this is to give the programmer a keyword to tell the compiler she wishes purity, and then have the compiler enforce it. In D this is the purpose of the pure storage class.

pure int a_pure_square_function(invariant(int) x) {
  return x * x;
}

A pure function must not access non-invariant data and may not invoke other non-pure functions. As of D 2.13, the pure storage class has not had its semantics implemented and is therefore not enforcing purity. I sense this is not a trivial matter, so it may take some time before we have it.

Cheers!

Categories: D Programming Language, programming Tags:

The Future of D is Functional

April 16th, 2008

The D Programming Language has an impressive list of cool features. That is not always a good thing. A feature should contribute to the philosophy and foundation on which the language was built. If it doesn’t, harmony breaks down and the result is a language that is harder to learn and to use.

Some people think D suffers from such feature creep. To some degree I can agree. D has features that are unlikely to become widespread, but most of them are aligned towards a common goal. That goal is bringing productivity to the world of low-level programming.

Lately, a new goal has emerged from the D community, and it has triggered some real intense activity. The future of D seems to lie in the field of functional programming, making the imperative and the functional paradigms truly come together.

Why is this important? Let me quote Walter Bright from a discussion at the D forums:

The future of programming will be multicore, multithreaded. Languages that make it easy to program them will supplant languages that don’t.
[…]
The surge in use of Haskell and Erlang is evidence of this coming trend (the killer feature of those languages is they make it easy to do multiprogramming).

As we all know, multithreaded programming in an imperative language is a real pain. It’s complicated and easy to get wrong, but that is not the case in a functional language. A pure functional program is thread-safe by design, since it’s free from mutable state and side effects.

You never have to worry about deadlocks and race conditions because you don’t need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought!

Quote from Slava Akhmechet’s excellent article Functional Programming For the Rest of Us.

What the people behind D want to do is to create a true functional subset of the D Programming Language, with safe interfacing to the parts of the program that are imperative. The functional subset would enforce pure functional programming, like disallowing access to mutable global state and calls to non-pure functions. In effect, that would enable you to write the parts of your program that need to be thread-safe in a functional style, while using the style of programming most of us are used to for the rest.

But it might not stop there. Andrei Alexandrescu, one of the driving forces behind the functional subset, has suggested that the enforcements inside a pure function could be relaxed to allow mutability of what he calls “automatic state,” thus allowing imperative techniques to be used as long as the mutability doesn’t leak across the function barrier. Here’s an example from Andrei’s slides on the subject:

int fun(invariant(Node) n) pure {
  int i = 42;
  if (n.value) ++i;
  int accum = 0;
  for (i = 0; i != n.value; ++i) ++accum;
  return n.value + i;
}

This code doesn’t look one bit functional, but the result of fun() depends solely on its arguments, so seen from the outside it’s pure.

Mixing pure functional and mainstream imperative programming is certainly bleeding edge. As far as I know it has never been done in any other language. But even though I’m excited, I can’t help wondering whether this is the right way to go. There’s a risk that D spreads itself too thin, and that the pure functional subset of D will be too noisy; I anticipate heavy use of the invariant keyword, for instance.

Would it be a better option to create a completely new language, say Functional D, and make D and Functional D interoperable on a binary level? At least that would battle the noise by reducing the need to explicitly express things like immutability, which can be taken for granted in a functional language. I also suspect that two such compilers would be less difficult to implement than a single compiler that has to support both styles.

But then again, I don’t know much about making compilers. I just happen to be more of a minimalist when it comes to computer languages. (As opposed to my preferences for APIs and frameworks.)

Expect more posts on this subject as I plan to delve into details next.

Cheers!