The Paradigm.

Software engineering with elegance and precision. Technique, meticulousness, and pride in craftsmanship. The path to the stars is not a race to the bottom.

Why C# is Not My Favorite Programming Language

There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code.

— Flon’s Law

Abstract

This post is an attempt to recreate a similarly titled paper I published internally at Microsoft many years ago. The title is a reference (and homage) to Brian Kernighan’s 1981 paper about Pascal.

It is my opinion that C# is not the first language I would choose for a project, and through this post I will lay out my reasons for that. This is not an attempt to compare C# to any one language, but rather to discuss C# for what it is. I wrote C# code professionally for a few years, and much of this post will be based on my own personal experience.

Overview

I started writing C# code many years after I had been writing C, C++, and Haskell. The first project I undertook in C# was back in 2003, when my team and I tried to write a database engine in C#. I went into the project fairly optimistically, with no preconceived notions of C#. That turned out to be the first of many times I wished I were not using C#.

I have divided the rest of this post into the following categories:

  • Types
  • Resource management
  • Other language problems
  • .NET problems

For each of these, I will present my observations as to why some of the choices in the language are not ideal.

C# may have pushed Java to be a better language. It may have enabled programmers, otherwise not trained or qualified enough to develop C++ programs, to write code. It deserves credit for both of these things; but it falls short when it comes to developing complex systems.

Types

No type synonyms

One of the advantages (and purposes) of using types in the first place is to abstract away the representation of the underlying data, even if it’s trivial. Languages like C++ and Haskell provide a mechanism for creating new types that are synonymous with an existing type (typedef in C++, or type in Haskell).

Take the following code:

class Game {
public:
    typedef unsigned int Score;

    Score getScore() const;
};

Experienced developers use this construct extensively in order to increase readability, maintainability, and portability.

There is very simply no way to achieve the same effect in C#. To be fair, there is a using alias directive, but the alias is not exported outside of the current .cs file. The outside world will see the original type.
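
To illustrate the workaround, here is roughly what it looks like (a minimal sketch; the Score alias and the Game class simply mirror the C++ example above):

using Score = System.UInt32; // only visible inside this one .cs file

public class Game
{
    // To every other file and assembly, this method returns a plain uint.
    public Score GetScore()
    {
        return 0;
    }
}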

Heavy types

A common feature found in other languages is the ability to wrap an underlying type in a new type that is not interchangeable with it (the two types are deliberately not isomorphic).

For example, in C++ you could do something like this:

struct Name {
    string value;

public:
    explicit Name(const string &value) : value(value) {}
};

void print(Name n);

print(Name("Fred")); // ok
print("Fred"); // error

The beauty here is that Name (which is a compile-time concept) has no overhead over string. The size and performance of using both types is identical – the only difference is the compile-time check I get from it.

Sure, one could do the same in C#. The problem is that the newly defined type is not free. There is a non-trivial amount of overhead associated with defining a type in .NET. From the compiler, to the type metadata in the assembly, to the JIT compiler, to oftentimes the final machine code itself, defining new types is expensive.
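
For comparison, here is roughly what the same idea looks like in C# (a sketch; the Printer class is just a stand-in to keep the example self-contained). Name is now a full .NET type, with its own metadata baked into the assembly:

public struct Name
{
    public readonly string Value;

    public Name(string value)
    {
        Value = value;
    }
}

public static class Printer
{
    public static void Print(Name n)
    {
        System.Console.WriteLine(n.Value);
    }

    public static void Example()
    {
        Print(new Name("Fred")); // ok
        // Print("Fred");        // error: cannot convert from string to Name
    }
}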

Enums

Enums are a great way to make code more legible. This is another feature in programming languages that experienced developers will use extensively to make code more readable and maintainable. And they come with no overhead! They are replaced by the value at compile time.

enum class CacheMode : uint8_t {
    NO_CACHE,
    READ_THROUGH,
    CACHE_ONLY,
};

void read(Cache &cache, CacheMode mode);

read(cache, CacheMode::READ_THROUGH); // very easy to read

C# does have enums, but they are expensive, as they carry metadata (they are not a free compile-time check on top of an integral type). When writing C++ code, I use enums liberally, sometimes even replacing booleans for specific operations:

enum class ShouldFlush : bool {
    NO = false,
    YES = true,
};

void write(Data &data, ShouldFlush flush) {
    if (ShouldFlush::YES == flush) { // still no overhead
    }
}

write(data, ShouldFlush::NO); // easy to read!

As a side note, I happen to know that Anders was against including enums in the language, advocating instead that folks should use public const fields in a static class. He was obviously finally convinced otherwise by the team.

Structs and classes

In C#, the difference between a struct and a class is that structs are value types, whereas classes are reference types. What seems bizarre to me is that in most languages the differentiation is done at instantiation and parameter-passing time, not at declaration time.
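
For readers who have not written C#, here is a quick sketch of that difference (PointStruct, PointClass, and the program around them are hypothetical, purely for illustration):

using System;

struct PointStruct { public int X; }
class  PointClass  { public int X; }

static class ValueVsReference
{
    static void Main()
    {
        var a = new PointStruct { X = 1 };
        var b = a;              // copies the whole value
        b.X = 2;
        Console.WriteLine(a.X); // prints 1: 'a' is unaffected

        var c = new PointClass { X = 1 };
        var d = c;              // copies only the reference
        d.X = 2;
        Console.WriteLine(c.X); // prints 2: 'c' and 'd' are the same object
    }
}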

There are legitimate reasons why value types are used, for example, for performance reasons or as a view of a part of a data buffer.

C# imposes severe artificial restrictions on structs. For example, they cannot inherit from other structs, as can C++ types:

struct Character {
    Level  level;
    Points health;
    Money  money;
};

struct Wizard : Character {
    Points magic;
};

Wizard wizards[10];

Now, you could have an array of references in C#, but that is not equivalent. Arrays of reference types have no memory locality, and are a non-starter for applications that require high performance (such as graphics-intensive applications, or simulations).

The other argument that could be made is that inheritance is not all that important, since you could use containment (has-a) instead of inheritance (is-a). Well, that is an argument in favor of my point against the language forcing classes for everything.

No bit fields

C# does not support bit fields. To be fair, C# supports the Flags attribute on enumerations, which achieves the same effect in a much more error-prone and verbose way.

For example, in C++ you could do something like this:

struct Status {
    bool read_only : 1;
    bool deletable : 1;
    bool versioned : 1;
};

// Break the build if Status is not 1-byte long
static_assert(sizeof(Status) == 1, "Size of Status too large");

// somewhere later
if (!table.status.deletable) {
   ...

This representation is compact, efficient, and elegant. It’s also easy to read and maintain.

This particular example could be replicated in C# without bit fields. It would look something like:

[Flags]
public enum Status 
{
    None      = 0,
    ReadOnly  = 1,
    Deletable = 2,
    Versioned = 4
}

// somewhere later
if ((table.Status & Status.Deletable) == Status.None) {
   ...

The above example is obviously much more error-prone than its C++ counterpart, since the developer is required to do all the heavy lifting (also, don’t forget to zero the instance of Status upon initialization).

Some C# supporters criticize the use of bit fields as a relic from a bygone era when memory was expensive. This could not be further from the truth: first, consuming less memory and having better performance never hurts (especially if the language makes it easy and elegant to do so). Second, bit fields are extremely useful when dealing with low-level systems. A very good application is efficiently reading and writing binary files (and hey, some language needs to properly support writing systems such as the .NET CLR, no?).

Whereas the example above was somewhat easy to emulate (albeit more cumbersome and error-prone), more advanced use cases for bit fields become increasingly tricky to emulate. Take the following example:

namespace Zip {
    typedef uint32_t Signature;
    typedef uint16_t Version;

    enum class CompressionOption : uint8_t {
        // options
    };
    enum class CompressionMethod : uint16_t {
        NO_COMPRESSION    = 0,
        SHRUNK            = 1,
        REDUCED_FACTOR_1  = 2,
        // ...
    };

    struct Header {
        Signature             signature;
        Version               version;
        struct {
            bool              encrypted         : 1;
            CompressionOption compression       : 2;
            DataDescriptor    descriptor        : 1;
            bool              enhancedDeflation : 1;
            //
            // other bit fields
            //
        }                     flags;
        CompressionMethod     method;
    };

    static_assert(sizeof(Header) == 30, "Wrong size of ZIP header");
}

In the above example, there are no hacks and no messy annotations. It’s just code that’s semantic, clean, easy to read, and self-documenting. Reading and writing this type is also very fast: it can be done as a binary block (and should endianness correction be required for a platform, it can be done extremely efficiently after the block is read into main memory).

Poor support for unions

I’ll be the first to admit: unions can be abused, and if misused, can cause serious trouble. At the same time, if used correctly, they are an extremely powerful feature that helps write clean and semantic code.

I have used unions extensively when writing image manipulation software. You can then define the data types and operations very clearly and elegantly, as follows:

union RgbaColor {
    typedef uint8_t Channel;

    uint32_t    rgbaValue;
    uint32_t    rgbValue : 24;
    Channel     rgb[3];
    Channel     rgba[4];
    struct {
        Channel r;
        Channel g;
        Channel b;
        Channel a;
    }           channels;

    RgbaColor(uint32_t rgbaValue) :
        rgbaValue(rgbaValue)
    {
    }
    RgbaColor(uint32_t rgbValue, Channel alpha) :
        rgbValue(rgbValue)
    {
        channels.a = alpha;
    }
    RgbaColor(Channel r, Channel g, Channel b) :
        channels({r, g, b, 0})
    {
    }
    RgbaColor(Channel r, Channel g, Channel b, Channel a) :
        channels({r, g, b, a})
    {
    }
};

// In this example, it is convenient to use a sized array
// of channels, and then a separate alpha channel
RgbaColor toGrayScale(RgbaColor color) {
   RgbaColor::Channel value = average(color.rgb);
   RgbaColor result(value, value, value, color.channels.a);

   return result;
} 

// In this example, it is convenient to see the colors
// as one number
RgbaColor average(RgbaColor c1, RgbaColor c2) {
   RgbaColor result((c1.rgbValue + c2.rgbValue) / 2, 0);

   return result;
}

In C#, there’s no syntactic way of creating unions (which is weird, because C# has far more syntax than it needs). But if you really want to, you can use the StructLayout attribute to emulate them (the type is actually called StructLayoutAttribute, but don’t even get me started on the compiler schizophrenia of dropping the Attribute part of the name).

The StructLayout attribute, combined with the FieldOffset attribute, somewhat emulates the behavior of unions. The problem is that they only work on the simplest of examples. The language was designed to be used as an OO language with reference types, and has very limited support for anything outside of that.

In C#, the RgbaColor example above would look like this:

[StructLayout(LayoutKind.Explicit, Size=4)]
public unsafe struct RgbaColor {
    [FieldOffset(0)] public uint rgbaValue;

    // - no way to declare rgbValue
    // could add a getter/setter that leave the a channel unchanged

    [FieldOffset(0)] public fixed byte rgb[3];
    [FieldOffset(0)] public fixed byte rgba[4];
    [FieldOffset(0)] public Channels channels;

    public struct Channels {
        public byte r;
        public byte g;
        public byte b;
        public byte a;
    }
}

In order to use “fixed arrays” (which are C/C++ style arrays “inlined” in the data structure), the struct has to be marked with unsafe. This is inconvenient for two reasons: first, I now need to pass the /unsafe option to the compiler. But most importantly, unsafe is leaky: any code that reads that array must be marked as unsafe too.

Writing something actually useful with StructLayout and FieldOffset is difficult. But even if you succeed in doing so, another major obstacle is the slew of definite-assignment compiler errors you get when trying to access the fields in the union.

My suspicion is that since unions are not first-class concepts in the language, the code in the compiler that performs the definite assignment checks is unaware that a certain type is a union, and treats it like any other type. Furthermore, it would need to maintain a map of which memory areas of the object have been initialized, and which fields map to which areas in order for the checks to be accurate.

No const-correctness

The C++ FAQ starts the section on const-correctness with the words “A good thing”. Const-correctness tightens the belt on type safety, by restricting which operations are allowed on types modified with the const keyword. It is a way to ask the compiler to remind you that you do not wish to change a certain value.

In C#, if I try to do this:

const List<int> list = new List<int> { 1, 2, 3 };

I get the following compiler error:

error CS0134: 'list' is of type 'System.Collections.Generic.List<int>'.
   A const field of a reference type other than string can only be
   initialized with null.

The only possible value for a const reference variable is null. In other words, const-correctness does not work in C#. But it gets even worse, as the compiler does not enforce const-correctness at compile time. For example, this code compiles in C# (and then throws at runtime):

const List<int> list = null;
list.Add(1);

For comparison, the same code in C++:

const list<int> list = {1, 2, 3};
list.push_back(4);

The C++ compiler gives me the following error:

error: no matching member function for call to 'push_back'
note: candidate function not viable: 'this' argument has type
   'const std::list<int>', but method is not marked const

Finding errors during compile-time (instead of runtime) is very obviously desirable. But const-correctness also has another very important role: it serves as an important hint to the optimizer. Knowing that a certain value cannot change gives the compiler the opportunity to perform optimizations, such as (from Stack Overflow):

  • incorporating the object’s value directly into the machine instruction opcodes
  • complete elimination of code that can never be reached because the const object is used in a conditional expression that is known at compile time
  • loop unrolling if the const object is controlling the number of iterations of a loop

No multiple inheritance

If you buy into the OO paradigm, multiple inheritance is a very natural and desirable feature. A Button is both a Rectangle and a ClickTarget, regardless of what C# supports. Not supporting multiple inheritance just means that the code has to be designed in ways that make it harder to read and maintain.

Some people criticize multiple inheritance as inherently unsafe (pun intended). This feature is not unsafe or evil. The main problem with it is the ambiguity caused when resolving inherited members. The language must then provide disambiguation mechanisms, which are oftentimes confusing.

Here’s my problem with that argument: it restricts the capabilities in the language and caters to a few developers who would be confused at the expense of those who deem the feature useful. If you are a developer, and you are confused by a certain language feature, then here’s what you do: don’t use that feature.

Also, not supporting multiple inheritance does not mean the problems do not exist: for instance, Java 8 has become susceptible to the diamond problem by introducing default interface methods. The diamond problem is not insurmountable, and there are ways to solve it: for example, C++ follows each inheritance path separately and forces the programmer to disambiguate the path to the parent. Another related feature is virtual inheritance.

No templates

C# generics are not nearly as powerful as templates. From Wikipedia:

Although C++ templates, Java generics, and .NET generics are often considered similar, generics only mimic the basic behavior of C++ templates.[4] Some of the advanced template features utilized by libraries such as Boost and STLSoft, and implementations of the STL itself, for template metaprogramming (explicit or partial specialization, default template arguments, template non-type arguments, template template arguments, …) are not available with generics.

Templates are so powerful, and their use is so essential to a modern C++ developer that working in a language without them simply feels limiting.

With default template arguments, non-type arguments, template specialization, and the principle of SFINAE (which is useful for compile-time introspection), there’s absolutely nothing in the C# world that nearly resembles the power of templates.

No way to know the size of an object

It makes me deeply uncomfortable to not have a very good idea of how much memory my application is going to consume.

Let’s start with just something as simple as determining the size of an object in memory. In C++, it’s easy. There’s a built-in operator (sizeof) for that:

sizeof(obj);

In C#, one used to have to use unsafe code to do it:

RuntimeTypeHandle th = obj.GetType().TypeHandle;
unsafe
{
    int size = *(*(int**)&th + 1);
}

Starting in .NET 4, they made it slightly better to query the size of the object. I can now do it without resorting to unsafe code:

RuntimeTypeHandle th = obj.GetType().TypeHandle;
int size = Marshal.ReadInt32(th.Value, 4);

Even though this code no longer needs to be in an unsafe block, it is not any safer than the first version. There is, after all, a very implicit assumption about the size and position of that field.

Still, knowing the size of the object does not tell me much. It doesn’t tell me, for example, how the overhead in the object grows in comparison to the data in the object (if at all). It doesn’t tell me the overhead associated with the object in the garbage collector. It doesn’t tell me how the garbage collector itself grows over time.

Take the following example:

class Empty
{
}

Empty obj = new Empty();
RuntimeTypeHandle th = obj.GetType().TypeHandle;
int size = Marshal.ReadInt32(th.Value, 4);

In the example above, size is 12. If you replace Empty with Small (a class with one integer), the size is still 12.

Poor support for fixed arrays

In one of the examples above, I used fixed arrays in a union. Support for fixed arrays (which are commonly used in real-world applications) is limited to primitive numerical types in C#.

That means that something like this:

enum class Suit {
   HEARTS,
   DIAMONDS,
   CLUBS,
   SPADES,
};

enum class Value {
   ACE   = 1,
   TWO   = 2,
   THREE = 3,
   // ...
   JACK  = 10,
   QUEEN = 11,
   KING  = 12,
};

struct Card {
   Suit  suit;
   Value value;
};

struct Player {
    Card myHand[5];
};

would be difficult to represent so that sizeof(Player) == sizeof(Card[5]). There are cases where such a property is desirable.

Resource management

Before I start about C#, let me talk about C++. C++ has a brilliant concept called RAII (Resource Acquisition Is Initialization). The way it works is that objects are constructed when they come into scope, and destroyed when they go out of scope.

Sure, C# is garbage-collected. But garbage collection is an incomplete solution to a more comprehensive problem. Resources are more than just memory – resources are files, locks, network sockets, slots in a buffer, memory, and countless others.

In a sense, memory is the least interesting resource. If you are leaking memory, the effects are likely to be far less dire than leaking a semaphore, for instance. (And don’t get me wrong, I’m not saying it’s not bad).

Now this is not about C++, but let me set something straight. Constructors and destructors are absolutely the right way to go when it comes to managing resources. C++-style constructors and destructors, combined with copy constructors, allow for very efficient resource management. They allow for management of memory, as well as any other resource your program might use.

Before you tell me about memory leaks in C++, let me tell you this: Bjarne Stroustrup (the creator of C++) is the first to say that if you are doing manual mallocs and frees in C++, then you are doing it wrong. In C++, you are supposed to be doing RAII-style management of everything, including memory. You can have a full garbage collector if you so wish. But resources are guaranteed by the compiler to always be freed up.

int global_var = 0;
mutex global_var_mutex;

void some_function() {
    lock_guard<mutex> guard(global_var_mutex); // locks the mutex
    global_var++;
}

At the end of some_function, the mutex is always released. It doesn’t matter if the function returned explicitly, implicitly, or threw. The result is always the same: the mutex is released.

When talking about why you shouldn’t use manual mallocs and frees, Bjarne has a quote that I very much enjoy (I heard it from him verbally, and this might be the first transcription of the quote):

It doesn’t matter how good and disciplined you are when dealing with mallocs and frees. It only takes one omission out of the thousands of places you are doing it from to wreak havoc on the system. Let the compiler do it for you.

He is right. This is a good strategy, that we should adopt as much as possible.

In order to work around the lack of support for proper deterministic destruction, C# has the notion of finalizers (which it calls destructors, but which differ from C++ destructors in that they are non-deterministic) and an interface called IDisposable.

If having both Finalize (exposed as a destructor, with the same syntax as a C++ destructor) and IDisposable.Dispose sounds confusing, well, it is.

If your type has unmanaged resources (anything not managed by the garbage collector), then you are expected to implement a destructor (Finalize). If you can’t wait until the garbage collector runs, then you should also implement IDisposable.
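
The canonical pattern ends up looking roughly like this (a sketch using a hypothetical Connection type holding a hypothetical native handle; the usual guidance funnels both paths through a protected Dispose(bool) helper):

using System;

public class Connection : IDisposable
{
    private IntPtr nativeHandle; // hypothetical unmanaged resource
    private bool disposed;

    // Finalizer ("destructor"): runs whenever the garbage collector
    // gets around to it, if it runs at all.
    ~Connection()
    {
        Dispose(false);
    }

    // Deterministic cleanup: the caller has to remember to call this.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        // release nativeHandle here; touch managed members only if 'disposing' is true
        disposed = true;
    }
}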

Placing the burden on the user

Suppose you have a type that contains a private member that implements IDisposable. Because of that, your type should really also implement IDisposable, and your callers must know to call Dispose in a finally block (or wrap your object in a using block).

There are several problems with that. Remember Bjarne’s quote, “it only takes one omission to wreak havoc”? Here, the language places the entire burden of making your type IDisposable on you. You need to remember to do that (or at the very least, you need to run a tool that tells you to do that). It also places the burden of remembering to use it correctly on your users (whom you have no control over).

Leaky abstractions for managing resources

Now suppose you have a public type that does not contain an IDisposable member, but now for the next version of the API you need to add one.

Here’s the problem: your existing users are guaranteed to leak that resource, unless they are willing to go and find every reference to your type and make sure they are invoking Dispose.

Because of that, people have come up with rules such as “public types should always be marked as IDisposable”. Sounds like a lot of work for me. Also, “it only takes one omission”.

Resource management hacks

Obviously, all this is far from ideal. If there’s one resource you really don’t want to leak, it’s locks: they will cause your system to hang or deadlock very quickly.

So instead of providing a sound, general, and universal solution to resource management (such as RAII), C# went ahead and added a lock keyword (for critical sections). And a using keyword (for IDisposables).

So at this point we have Finalize, IDisposable, using, and lock, and we are still not at the same level of resource management that RAII provides.
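
For completeness, here is what those two special-cased constructs look like in use (a sketch; counter, counterLock, and the file path are placeholders):

using System.IO;

static class SpecialCases
{
    static readonly object counterLock = new object();
    static int counter;

    static void Example()
    {
        // 'lock' is RAII for exactly one kind of resource: monitors.
        lock (counterLock)
        {
            counter++;
        } // monitor released here, even if an exception was thrown

        // 'using' is RAII for exactly one interface: IDisposable.
        using (var reader = new StreamReader("data.txt"))
        {
            reader.ReadLine();
        } // Dispose called here, even if an exception was thrown
    }
}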

Other language problems

Scoping of functions

C# claims to be a multi-paradigm language, with support for object orientation. In reality, it is object-mandated, with some support for functional programming. I say that because everything must be contained in a class – even things that do not belong in classes.

C# itself ships with a number of those, such as Math.Floor or GC.Collect. As a matter of fact, both Math and GC are static classes (meaning they cannot be instantiated or inherited from). In other words: they are, in fact, not classes at all – they are namespaces.

If I wanted to implement my custom rounding in a language like C++ (don’t ask why I want to do it, it is sometimes necessary in some domains, like circuit design), that would simply be a few functions on a namespace, like so:

namespace math {
    float scaled_round(const float arg);
    double scaled_round(const double arg);
    long double scaled_round(const long double arg);
}

Furthermore, it’s trivial to augment an existing namespace, or to add new specializations to existing templates (more on that later).

In C#, I’m stuck providing a new class and new implementation of those methods.

namespace Project
{
    public static class MyMath
    {
        public static float ScaledRound(float arg)
        {
            // implementation
        }

        public static double ScaledRound(double arg)
        {
            // implementation
        }

        public static decimal ScaledRound(decimal arg)
        {
            // implementation
        }
    }
}

This approach has several problems. First, it tries to fit everything into a specific model – even when things don’t naturally fit into that model.

Secondly, even if you assume that everything should be object-oriented, this does not fall into the traditional definition of static methods. From Wikipedia:

Static methods are meant to be relevant to all the instances of a class rather than to any specific instance.

And that is very telling. Methods such as floor do not belong to all instances of the Math class. Quite the opposite! They belong to no instance of the Math class (which can’t even have instances!). It goes to show that these functions would be better off in a namespace. The .NET Framework does support functions on a namespace – it is C# as a language that doesn’t.

Thirdly, it highlights one of my pet peeves with C# – there’s too much unnecessary verbiage around my code (more on that later).

Lastly, and perhaps most weirdly, the same operations on slightly different types are exposed through entirely different static classes. Take the System.Data.OracleClient.OracleNumber struct, for instance. Compare and contrast:

// on a primitive number
Math.Floor(Math.PI);

// on an OracleNumber
OracleNumber n = OracleNumber.PI;
OracleNumber.Floor(n);

As a side note: in this particular case, I am left wondering why it is that the implementation of the Oracle DB driver didn’t hide the underlying representation of numbers from the user. I will admit I haven’t looked too much into this, but my first instinct tells me that all APIs should have exposed native types to the user.

Now suppose that we are writing the DB layer itself (and thus have to deal with the marshaling of numbers between different systems). In C++, we would have something like this:

class DBFloat {
    // ...
}; 

/*
 * we can define operations such as
 * - math::constants::pi<DBFloat>
 * - DBFloat std::floor(DBFloat)
 */

// The usage is now this:
// Much more natural than C#
floor(pi<float>());
floor(pi<DBFloat>());

Although the example above is in C++, many other languages have similar concepts. For example, Haskell has a very similar concept through the RealFrac type class.

No local static variables

In C++, if I need a constant (or sometimes a global variable, such as a mutex) that only really applies to one function, I can easily define it next to the place it’s used, as in the following example:

void my_function() {
    static mutex m;
    lock_guard<mutex> guard(m);

    // do something
}

void another_function() {
    static const float R5_RESOLUTIONS[] = {1.0, 1.6, 2.5, 4.0, 6.3};

    // R5_RESOLUTIONS is not created for every call to this function
    // it is read-only, compiler-allocated memory in the read-only globals
    // segment
}

In C#, I would have to declare these to be class-wide. The problem with that is that one is unnecessarily expanding the scope of that variable, thus making the code more error-prone, and less readable and maintainable.
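
A sketch of what that looks like (the names mirror the C++ example above and are otherwise hypothetical): the lock object and the table are now visible to every member of the class, not just the one method that actually needs them.

public class Resolutions
{
    // Both of these now live at class scope.
    private static readonly object MyFunctionLock = new object();
    private static readonly float[] R5Resolutions = { 1.0f, 1.6f, 2.5f, 4.0f, 6.3f };

    public void MyFunction()
    {
        lock (MyFunctionLock)
        {
            // do something
        }
    }

    public void AnotherFunction()
    {
        // reads R5Resolutions
    }
}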

Poor optimizations

Let’s be honest here: the C# compiler is not necessarily known for being a highly-optimizing compiler. In a blog post, Eric Lippert discusses the very few cases where the C# compiler optimizes the output, and talks about some of the cases where it purposefully eschews optimizations – for example, in order to preserve certain information in the debugging symbols (in my opinion, this is an argument for a better debugging symbol file format, one which supports temporal references – which the PDB format generated by the C# compiler does not). In his own words:

The /optimize flag does not change a huge amount of our emitting and generation logic. We try to always generate straightforward, verifiable code and then rely upon the jitter to do the heavy lifting of optimizations when it generates the real machine code.

Eric Lippert

By his own admission, the C# compiler relies on the JIT compiler to do most of the optimizations. There are several problems with that:

  • optimizations can be expensive, so ideally all optimizations would happen during build-time, not runtime (and don’t even mention ngen – it’s a joke)
  • just because the JIT can optimize code, it doesn’t mean it always does so
  • there are optimizations that are tied to language constructs, and therefore must be done by the language compiler, not by a later stage such as the JIT. Good examples include C’s adoption of the restrict keyword, and C++’s const keyword (discussed above).

Eric’s post continues. He lists a few optimizations performed by the C# compiler, and concludes with:

That’s pretty much it. These are very straightforward optimizations; there’s no inlining of IL, no loop unrolling, no interprocedural analysis whatsoever.

Eric Lippert

And again, they leave that to the JIT compiler. The JIT compiler does a terrible job at a lot of basic optimizations.

David Notario, a developer in the .NET JIT compiler team, discussed JIT optimizations in a blog post:

These are some of the reasons for which we won’t inline a method:
– Valuetypes: We have several limitations regarding value types and inlining. We take the blame here, this is a limitation of our JIT, we could do better and we know it. Unfortunately, when stack ranked against other features of Whidbey, [excuse omitted for brevity]
– Complicated flowgraph: We don’t inline loops, methods with exception handling regions, etc…

David Notario

So by the .NET JIT compiler team’s own admission, they don’t inline functions with loops, etc.? I really wonder what that etc. means.

So there you have it. Both the C# compiler and the JIT compiler lack in their ability to optimize code.

And don’t even get me started on things that cannot be optimized, such as the metadata arrangement in the assembly, which causes the whole file to be read into main memory and processed. No wonder startup times for .NET applications are so slow.

No separate linking stage

Very simply put: building a C# project does not scale. I have been in projects (when I was a developer at Microsoft) where it would take almost 30 mins to build the full project. And we are not talking about full builds vs. incremental builds – this is exactly the point: incremental builds do not exist for C#.

The C# compiler takes a list of source files, does a two-pass compilation on all of them, and produces a binary. The more files that are passed in, naturally the longer the compilation time.

One way to fix this problem is to generate separate DLLs for small parts of the project. This helps both by keeping the build time of each DLL small and by enabling parallel builds of independent DLLs. The problem with this approach is that the tax is now paid at runtime: you have to ship more DLLs (and deal with the GAC problems, DLL loading times, etc.).

Another ill-fated approach would be to use modules (some people call them netmodules, after their filename extension). My team at Microsoft tried that, and it was a nightmare; it did not work well at all.

One contrasting approach adopted by many other languages is to have two separate building stages: compiling and linking. The compiler transforms source code into objects. The linker then combines objects into a final binary. Linking is easier and cheaper than compiling.

When working with large projects, that makes a huge difference. Many modern tools exist that allow teams to store compiled objects in a network cache, so in order to get a working binary, downloading the objects and then linking them is all that is required. And if I make changes to one of the source files, only the affected files need to be re-compiled into objects and then linked together.

Verbiage

When writing C# code, I have this feeling that there is too much syntax around my code. Adding a new source file to a project usually consists of filling in a template containing using directives, the namespace declaration, and the class declaration. That is such a common pattern that most IDEs will automatically add those for you when you create a new file in the project.
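
A typical freshly created file already looks something like this before a single line of real logic is written (the namespace and class names here are placeholders):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MyCompany.MyProduct.MyComponent
{
    public class MyNewClass
    {
        // the code I actually wanted to write goes here
    }
}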

This phenomenon is not exclusive to C#, but C# definitely suffers a lot from it, especially with its properties, getters, and setters. Paul Graham summarized this phenomenon well:

Object-oriented programming generates a lot of what looks like work. Back in the days of fanfold, there was a type of programmer who would only put five or ten lines of code on a page, preceded by twenty lines of elaborately formatted comments. Object-oriented programming is like crack for these people: it lets you incorporate all this scaffolding right into your source code. Something that a Lisp hacker might handle by pushing a symbol onto a list becomes a whole file of classes and methods. So it is a good tool if you want to convince yourself, or someone else, that you are doing a lot of work.

Paul Graham, via Jeff Atwood

That a programming language shapes one’s way of thinking is a real phenomenon. Those who have switched to a new language have probably found themselves asking “How do I do X in this language?”, only to be answered something like “You are thinking in a different paradigm. In this language, you don’t have to do X”.

C# verbiage-driven development is also very real. I have seen it first hand. I have seen more people preoccupied with the scaffolding, defining properties and interfaces and pure virtual methods than I care to count.

Eric Lippert (one of the authors of the C# compiler at Microsoft, and a former member of the C# design committee) had the following to say:

What I sometimes see when I interview people and review code is symptoms of a disease I call Object Happiness. Object Happy people feel the need to apply principles of OO design to small, trivial, throwaway projects. They invest lots of unnecessary time making pure virtual abstract base classes — writing programs where IFoos talk to IBars but there is only one implementation of each interface! I suspect that early exposure to OO design principles divorced from any practical context that motivates those principles leads to object happiness. People come away as OO True Believers rather than OO pragmatists.

Eric Lippert, via Jeff Atwood

His was not a criticism of C#, but in my experience there is definitely a lot of Object Happiness in the C# community. I find it hard to make an argument that every single one of those people suffering from Object Happiness is a bad developer. It is easier to argue that there is an underlying force driving them to it – and I believe that force is the language design. I became convinced of that by noticing that this phenomenon is not observed nearly as often among enthusiasts of other languages (such as C++, Haskell, Scala, or even JavaScript!).

There is no partial application

C# LINQ extensions were a welcome functional addition to the language. Using them, however, often involves manually creating lambdas, just because C# doesn’t support partial application.

Suppose a function called translate that takes a source language f, a destination language t, and a string s and returns the translated string. Now suppose I want to translate every string in a list from English to French. In Haskell, I would easily write something like this:

map (translate English French) strings

Notice that the expression (translate English French) is a partial application of translate. It is a new function that takes 1 argument, of type string, and translates it from English to French. Other languages like C++ (through templates!) or JavaScript also have support for partial application.

In C#, however, the burden is on me:

strings.Select(s => Translate(Languages.English, Languages.French, s));

When needed often, writing these partially applied functions by hand becomes quite tedious. And they end up reducing readability (instead of increasing it!) due to the unnecessary syntax around the lambda.
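
One can, of course, build the missing piece by hand. Here is a sketch of a generic helper (Partial is a hypothetical name, not a framework method) that fixes the first two arguments of a three-argument function:

using System;

static class Functional
{
    // Returns a new function with the first two arguments already supplied.
    public static Func<T3, TResult> Partial<T1, T2, T3, TResult>(
        Func<T1, T2, T3, TResult> f, T1 a, T2 b)
    {
        return c => f(a, b, c);
    }
}

// Usage, assuming the Translate method and Languages enum from the example above:
// var toFrench = Functional.Partial<Languages, Languages, string, string>(
//     Translate, Languages.English, Languages.French);
// var translated = strings.Select(toFrench);

But a helper like this has to be written for every arity and delegate shape, which is exactly the kind of ceremony partial application is supposed to remove.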

There is no escape

This was also a section in the original paper. It is difficult to override the type mechanism when necessary. This same problem, originally observed in Pascal by Kernighan in 1981, persists in C# today:

There is no way to override the type mechanism when necessary, nothing analogous to the “cast” mechanism in C. This means that it is not possible to write programs like storage allocators or I/O systems.

Brian W. Kernighan

.NET problems

Signed integers for sizes

Even though C# has unsigned integers (unlike Java), the C# community does not seem to understand when to use them. And the example comes from the .NET Framework class library: things like Count on a list produce an int. How it is possible that a list contains -3 elements is anyone’s guess, and yet, that’s the type they use.

And this is spread consistently across the language. ICollection.Count returns an int. The norm is for indexers to take ints. Bizarrely, Object.GetHashCode() also returns an int, which half of the time yields negative hashes. While there’s nothing per se wrong with a hash value being negative, it’s not really semantic. And that type should be defined as a Hash type, synonymous (but not isomorphic) to uint.

The reason for this weird behavior is that unsigned numbers are not “CLS-compliant”. Still, those are artificial rules. And it doesn’t make the out-of-the-box C# experience any less weird.

Exceptions

Exceptions can be a useful feature, but they do not replace expected codepaths for error cases. The .NET Framework class library oftentimes raises exceptions when a status code would have been adequate (and desirable). For example, File.Open can throw a FileNotFoundException. A file not being there is not an exceptional case at all – I would expect any developer trying to open a file to handle that case as something that can – and does – happen.
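
The caller is forced into exception handling for an entirely ordinary outcome. A sketch (the file name is a placeholder; checking File.Exists beforehand is racy, so it is not a real substitute for a status code):

using System.IO;

static class OpenExample
{
    static void Example()
    {
        try
        {
            using (var stream = File.Open("settings.cfg", FileMode.Open))
            {
                // read the file
            }
        }
        catch (FileNotFoundException)
        {
            // a perfectly ordinary situation, handled as if it were exceptional
        }
    }
}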

Global Assembly Cache (GAC)

In my experience, more problems have been caused by the GAC than solved by it. Version mismatches are common. Of significant importance is the design decision of not being able to load a different assembly version from a file if there’s an assembly with a similar signature in the GAC.

In large systems (and large teams), the GAC tends to cause a lot of problems. Assemblies will oftentimes end up in the GAC as part of setup. Oftentimes (due to bugs during the development cycle, and other reasons) there are some assemblies left behind in the GAC, which makes developing difficult.

The problem happens even when requesting an assembly by filename, even if the full path is provided. In that case, the file is located and opened, and the assembly signature is then extracted. Next, the GAC is probed for an assembly of similar signature, and if one is found, then the file which had been requested by filename is closed and the GAC version is used instead.

The GAC should have been designed to be used as a true cache. Once a file is loaded, the results of JIT optimizing it should be cached for next time, all completely transparent to the application.

Loading untrusted code (plug-ins)

This is not a flaw in the design of the language, but an annoyance inextricably linked to it by virtue of its runtime. The way .NET loads and manages untrusted DLLs is frustrating to say the least. The problems are deep, systemic, and endemic.

Since this is not a problem with C# per se, I don’t want to spend a lot of time here – but some of the problems include not being able to unload individual DLLs (and having to use AppDomains), the very way AppDomains work, and how data is passed between different AppDomains.

Conclusion

I do acknowledge that a lot of these problems I presented also affect other languages, such as Java. But Java is not my favorite language either!

Through much of this post, I compared C# to C++. This is not to say that C++ is perfect – it has its significant share of problems as well. But in the aspects where I find C# problematic, C++ tended to be a good language to illustrate an alternative.

During my career as a professional software developer, I have written programs in a few languages: C, C++, C#, x86 assembly, JavaScript, and Haskell. I am not including DSLs, such as HTML, CSS, or SQL.

The most frustrating of them, by a mile, was C#. The limitations in the language meant that programs were written to satisfy the compiler, rather than long-term requirements, such as maintainability, or user requirements, such as performance.

The language seemed to direct the team into adopting coding practices I deemed inefficient, such as a liberal usage of design patterns (abstract type factories abounded in the code, as did bridges, proxies, etc.).

In many of those projects, the code was not what I consider good code: semantically clean, small, efficient, and easy to understand. It was a spaghetti of properties, constructors, and factories. Much of the code seemed to be written just so that we could write the code we wanted to in the first place, except in a more complicated way.

I do like many of C#’s features: I like that the compiler flags accidental assignments in if conditions. I like that I can easily check for overflow. I like that I don’t have to use header files. The tools (especially the Visual Studio IDE) are nice and well-finished.

On the gloomier side, in my opinion, none of those positives is enough to offset the severe limitations in the language – they were nice, but I derived little benefit from them.

The language might be suitable for the development of very small applications, but falls short for large systems. So unless the project is intended to remain perpetually small, the choice of C# is a trap.

The application domain should also be considered: those small applications should not include build tools. I have learned from experience that small tools executed over and over again during a build can significantly slow down the build pipeline, due to the high startup cost.

I feel like it’s a mistake to use C# for the development of large and complex systems. In that sense, it’s a toy language suitable for beginners and amateurs.

Lookahead exclusions from LALR(1) sets

When dealing with grammars (e.g., when writing parsers, or designing a programming language), it is not uncommon for the grammar to contain LR(0) conflicts. Oftentimes these conflicts require some creative thought to be resolved. This usually results in grammars that are hard to understand, and/or code that is hard to maintain, etc.

For example, consider the following very simple grammar (literal terminals are quoted, Identifier and Number are tokens produced by the scanner, everything else is a non-terminal, and (opt) marks an optional symbol):

program:
      function
    | statement

function:
      "function" Identifier

statement:
      expression
    | "{" expression "}"
    | ε

expression:
      "function" Identifier(opt)
    | Number
    | Identifier

If we compute the LALR(1) sets for it, some of the generated states will be:

State 0:

program ::= * function
program ::= * statement
function ::= * FUNCTION ID
statement ::= * expression
statement ::= * LCURLY expression RCURLY
statement ::= *
expression ::= * FUNCTION maybe_id
expression ::= * NUMBER
expression ::= * ID

Actions for state 0:

  FUNCTION shift  3
        ID shift  7
    LCURLY shift  1
    NUMBER shift  8
   program accept
  function shift  6
 statement shift  5
expression shift  11
 {default} reduce 5

State 3:

function ::= FUNCTION * ID
expression ::= FUNCTION * maybe_id
maybe_id ::= * ID
maybe_id ::= *

Actions for state 3:

        ID shift  13
  maybe_id shift  12
 {default} reduce 10

State 13:

function ::= FUNCTION ID *
maybe_id ::= ID *

Actions for state 13:

         $ reduce 9
 {default} reduce 2

In state 13 there’s a reduce/reduce conflict: once the parser has successfully gotten an ID from the scanner, it doesn’t know whether it should reduce using rule 9 or rule 2.

The reason for the conflict can be found by examining the states shown above:

  1. from the start state (state 0), the parser expects a function or a statement
  2. a function starts with the token function, followed by an Identifier
  3. statements might start with an expression, which in turn could start with the token function followed by an optional Identifier

Let’s say we decide to get rid of the conflict by saying that if a statement is an expression then it cannot start with function. Effectively, we are removing item 3 from the list above. This is a very common solution in grammar specifications. The grammars for many popular languages, such as JavaScript, use this technique. I call it “lookahead exclusion”.

In practice, lookahead exclusion is not supported by parser generators. The practical solution for implementing grammars specified using this technique is to hand-write the parser. But hand-written parsers can be buggier, harder to maintain and are potentially slower, since they are usually top-down (which requires backtracking).

Another problem with top-down parsers is that they usually require modifying the original grammar (transforming the LR(k) grammar into an LL(1) grammar) by removing left recursion, ambiguities, etc. I’ll leave the actual modification of the grammar above as an exercise to the reader.

Now consider the lookahead exclusion described above:

statement:
      expression   (lookahead ∉ { "function" })
    | "{" expression "}"
    | ε

When generating the LALR(1) sets, we can see how the corresponding states have changed:

State 0:

program ::= * function
program ::= * statement
function ::= * FUNCTION ID
statement ::= * expression
statement ::= * LCURLY expression RCURLY
statement ::= *
expression ::= * NUMBER
expression ::= * ID

Actions for state 0:

  FUNCTION shift  4
        ID shift  7
    LCURLY shift  1
    NUMBER shift  8
   program accept
  function shift  6
 statement shift  5
expression shift  12
 {default} reduce 5

State 4:

function ::= FUNCTION * ID

Actions for state 4:

ID shift  13

State 13:

function ::= FUNCTION ID *

Actions for state 13:

{default} reduce 2

The conflict is gone! Because we excluded function from the lookaheads of the production statement \rightarrow expression, when the keyword function is found at state 0, the parser is shifted to a state where it needs to find an Identifier in order to successfully reduce function, or the input is malformed.

I have made a change to the Lemon LALR(1) parser generator that allows one to specify exclusion sets for a given rule. The modified grammar above can be described as:

program ::= function.
program ::= statement.
function ::= FUNCTION ID.
statement ::= expression - [ FUNCTION ].
statement ::= LCURLY expression RCURLY.
statement ::= .
expression ::= FUNCTION maybe_id.
expression ::= NUMBER.
expression ::= ID.
maybe_id ::= ID.
maybe_id ::= .

The Lemon syntax for the exclusions is:

exclusions ::= MINUS LBRACKET list_of_exclusions maybe_comma RBRACKET
list_of_exclusions ::= exclusion
list_of_exclusions ::= list_of_exclusions COMMA exclusion
exclusion ::= terminal
maybe_comma ::= COMMA.
maybe_comma ::= .

Exclusions must come after all right-hand side symbols in a grammar rule, immediately before the dot.

The source for the Lemon parser generator can be found below:
lemon.c
lemon.c.diff

Have fun!

The Bazooka Text Processor

Today, I am proud to announce Bazooka 1.0RC. Bazooka is a text processor, designed to generate beautiful documents from markup. Those familiar with LaTeX should understand the concept.

Bazooka will transform a markup language very similar to that of Wikipedia into documents. It was designed to be a lightweight solution for generating documents, especially during software builds.

The images below show some examples of the Bazooka documentation, which was generated using Bazooka:

Bazooka can be downloaded from the following links:

The documentation:

Enjoy!

Update: if you try to run Bazooka and you get an error saying that msvcr100.dll is missing, then you will need to download and install the Visual C++ 2010 Runtime. The link to the 32-bit version is here and the 64-bit version is here.

On the performance of strlen

I was working on an optimization problem when I started wondering just how efficient the implementation of the C standard library (libc) is. I decided to start my investigation with one of the simplest functions in the whole libc: strlen.

The experiment I designed for measuring the performance of strlen is as follows:

  1. I generated a large (around 71K) file with text from Lorem Ipsum
  2. A program opens that file, fseeks to the end, ftells the current offset, fseeks to the beginning, allocates as many bytes as required, and freads the whole file into that memory region.
  3. Finally, it calls mystrlen 100,000 times.

mystrlen is a wrapper around strlen as follows:

size_t mystrlen(const char *text, size_t len) {
    size_t l;

    l = strlen(text);
    if (l != len) {
        fprintf(stderr,
            "error in size function\n"
            "found: %zu, expected: %zu\n",
            l,
            len);
        exit(1);
    }

    return l;
}

After running this function 100K times, we will have scanned through 7,186,600,000 characters (yes, that’s over 7 billion characters!). Now let’s see just how fast that can be done.

I defined 4 implementations of strlen with different performance characteristics:

  1. c
  2. std
  3. asm
  4. fastc

c is a vanilla C implementation of strlen, the way any freshman computer science student would implement it.
I have considered this the reference implementation, because being pure C, it will test the compiler’s ability to perform optimizations, as well as serve as a baseline for other less intuitive approaches.

size_t __cdecl strlen (const char* text) {
    size_t count = 0;
    while (*text++) count++;
    return count;
}

std is the implementation in libc. Its implementation is oftentimes machine-dependent and written in Assembly language directly.

asm is an Assembly language implementation of strlen. It uses x86-specific string instructions.
The implementation below uses gcc’s syntax for inline assembly and AT&T assembly syntax for the assembly language.
The same code could’ve easily been written in an assembly file as a procedure called _strlen, and then linked against the C object containing the main function.

size_t __cdecl strlen (const char* text) {
    size_t count;
    asm("subl   %%ecx, %%ecx;"
        "not    %%ecx       ;"
        "movl   %1, %%edi   ;"
        "sub    %%al, %%al  ;"
        "cld                ;"
        "repne  scasb       ;"
        "notl   %%ecx       ;"
        "decl   %%ecx       ;"
        "movl   %%ecx, %0   ;"
        :"=r"(count)            /* output */
        :"r"(text)              /* input */
        :"%eax", "%ecx", "%edi" /* clobbered registers */
    );
    return count;
}

fastc is a C implementation that is not at all obvious. It is very machine-dependent, but it is very interesting to us because it allows us to test the compiler’s optimizing abilities, especially against asm.
We’ll revisit the details of fastc later.

We’ll test each of these 4 implementations of strlen in 3 different compiler output modes:

  1. No optimizations, generic x86 target architecture
  2. Full optimizations (-O3), generic x86 target architecture
  3. Full optimizations (-O3), core2 target architecture (-march=core2)

With the problem introduced, let’s look at the results.

With no optimizations, targeting a generic x86 architecture, we got the following results:

Implementation   Total time
asm              10.8830 s
c                17.9130 s
fastc             4.5830 s
std               1.7960 s

These results are actually along the lines of what I expected. asm is faster than c, and std is the fastest one. But how can fastc be faster than asm?

The answer to that question is in what fastc does, and how it is different from asm and c.

If you take a closer look at both asm and c, you’ll notice that they both operate on individual bytes. In other words, there’s a lot of data being moved (remember, we are operating on more than 7 GB of textual data). Even with a large cache, each byte needs to be moved from cache to a register. It’s actually surprising we can process 7 GB of data in about 10 seconds!

Which brings us to fastc. fastc is an implementation that operates on the whole word at the same time:

size_t __cdecl strlen (const char* text) {
    const char *s;
    const uint32_t* pdwText;
    register uint32_t dwText;

    pdwText = (uint32_t*)text;
    while (1) {
        dwText = *pdwText;

        if ((dwText - 0x01010101UL) & ~dwText & 0x80808080UL) {
            s = (const char*)pdwText;
            while (*s) s++;
            return s - text;
        }

        pdwText++;
    }
}

This algorithm operates on as many bytes as can fit in a register (in the case of this 32-bit program, 4 bytes).
The if check is verifying whether any byte in that word is 0x00. This technique is described in Hacker’s Delight.

This algorithm can be further improved. Before we start analyzing words, we might want to make sure the pointer is word-aligned. If it’s not, we can fall back to the simple c algorithm until the pointer is aligned to the word boundary. Additionally, the inner while loop (the one executed once a 0x00 byte is found) can be optimized.

These results made sense, right? Now, when we turn on optimizations, things take a surprising turn.

With full optimizations (-O3), targeting a generic x86 architecture, we got the following results:

Implementation   Total time
asm              10.8750 s
c                 3.4280 s
fastc             1.7430 s
std              10.8650 s

Before we proceed, I must say that the performance numbers change from run to run (even when averaged out), so keep that in mind.

Unsurprisingly, the numbers for asm didn’t change. We manually wrote it in Assembly, so turning on compiler optimizations wouldn’t affect it.

The fully optimized c is now faster than fastc with no optimizations (which shows that the C compiler can do a fantastic job optimizing). Which brings me to my first conclusion: oftentimes, writing simple C code is good enough. Now, of course, fastc is still faster than c (although only twice as fast, as opposed to 4x before). Second conclusion: even with the best optimizing compilers, there’s still a difference between efficient code and inefficient code.

Both c and fastc are now faster than asm. That’s because the optimizer knows how to cut down on all those per-byte memory operations.

The big surprise here was std. It is unclear to me why std is so slow, but one thing is clear: there’s no statistical difference between the runtime of asm and std, which leads me to believe that std is doing it one byte at a time. This was so surprising to me that I re-compiled and re-ran the tests several times.

My theory about that is that because full optimizations were on, the compiler was at liberty to change the code as it saw fit. But targeting a generic x86 architecture, it ended up modifying it to something that would run fastest on lower-end machines, whereas fastc is hand-optimized for 32-bit machines. That theory can’t explain c being faster than std. If you can explain it, please leave a comment.

Lastly, with full optimizations (-O3), targeting the core2 architecture (-march=core2), we got the following results:

Implementation   Total time
asm              10.8500 s
c                 3.4270 s
fastc             1.7510 s
std               1.8500 s

asm, c and fastc haven’t changed: asm is hand-written assembly and shouldn’t change; fastc makes assumptions about the underlying architecture and therefore shouldn’t change either; c not changing is a bit surprising. std is back to normal. This brings us to our third conclusion: always set the target architecture type.

Interestingly, our fastc implementation was able to beat std. So does that mean that everyone should write their own strlen? No. Remember, we are operating on over 7 GB of textual data here. That means that for your average string, you will not be able to tell the difference between fastc and std, as long as you set the target architecture type!

In case you are curious, fastc64 (the modified version of fastc that operates on 64-bit words) running in 32-bit mode is slower than fastc32. The total time for fastc64 running in 64-bit mode is 0.8770 s, which makes it, very unsurprisingly, twice as fast as fastc32 running in 32-bit mode. Fourth conclusion: if you are compiling for 64-bit processors, make use of them. Almost nobody does.
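The fastc64 source isn’t reproduced here, but a minimal sketch of what it might look like (my own; the name strlen_fastc64 and the exact structure are assumptions) is simply fastc with the masks widened to 64 bits:

#include <stddef.h>
#include <stdint.h>

size_t strlen_fastc64(const char *text) {
    const uint64_t *pw = (const uint64_t *)text;
    for (;;) {
        uint64_t w = *pw;
        /* Non-zero iff any byte in this 8-byte word is 0x00. */
        if ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) {
            const char *s = (const char *)pw;
            while (*s) s++;
            return (size_t)(s - text);
        }
        pw++;
    }
}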

And last, but certainly not least, don’t run strlen 100,000 times over the same string. If you expect that you’ll need the length of a string several times, consider storing it. No code is faster than no code.

An efficient implementation of tables in C

Oftentimes I need to use tabular data in my programs. Runtime initialization is expensive, and keeping multiple tables in sync with each other is often hard.

Here’s a simple example. Consider the following table representing US coin names and values:

typedef struct {
   char *Name;
   unsigned short Count;
} Coin;

typedef enum {
   COIN_PENNY,
   COIN_NICKEL,
   COIN_DIME,
   COIN_QUARTER,
   COIN_HALF,
} Coins;

static Coin CoinInfo[] = {
   { "penny", 1 },
   { "nickel", 5 },
   { "dime", 10 },
   { "quarter", 25 },
   { "half-dollar", 50 },
};

Now accessing information about each coin is very easy:

Coin *c = &CoinInfo[COIN_DIME];
printf("You owe me %d cents (or a %s)\n", c->Count, c->Name);

The problem with that approach is that it can be tricky to keep both the enum (used as an index into the info array) and the array itself in sync, especially as they get bigger.
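To make the failure mode concrete, here is a hypothetical example of my own (not from the original post) of the two definitions drifting apart:

typedef enum {
   COIN_PENNY,
   COIN_NICKEL,
   COIN_DIME,
   COIN_QUARTER,
   COIN_HALF,
   COIN_DOLLAR,             /* new value added to the enum... */
} Coins;

static Coin CoinInfo[] = {
   { "penny", 1 },
   { "nickel", 5 },
   { "dime", 10 },
   { "quarter", 25 },
   { "half-dollar", 50 },
   /* ...but the matching row was never added here. */
};

/* CoinInfo[COIN_DOLLAR] now silently reads past the end of the array,
   and the compiler will not complain. */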

Here’s a trick to help with that. We start by defining a couple of helper macros:

#define _E(e, ...) e
#define _V(e, ...) { __VA_ARGS__ }

#define ENUM(name, vm) \
   typedef enum {\
      vm(_E)\
   } name
#define TABLE(type, name, vm) \
   static type name[] = {\
      vm(_V)\
   }

Next, we describe the table data:

typedef struct {
   char *Name;
   unsigned int BusinessDay : 1;
} Weekday;

Then we define the data accessor (enum identifier) and the data itself. Notice how they are both defined alongside each other, so they are always in sync:

#define WEEK(_m) \
   _m(WEEKDAY_SUNDAY,      "Sunday",       0), \
   _m(WEEKDAY_MONDAY,      "Monday",       1), \
   _m(WEEKDAY_TUESDAY,     "Tuesday",      1), \
   _m(WEEKDAY_WEDNESDAY,   "Wednesday",    1), \
   _m(WEEKDAY_THURSDAY,    "Thursday",     1), \
   _m(WEEKDAY_FRIDAY,      "Friday",       1), \
   _m(WEEKDAY_SATURDAY,    "Saturday",     0), \

Notice how the above statement is merely a definition. It’s a preprocessor macro that will be compiled into nothingness in the generated code, unless it’s referenced from somewhere, and that’s what we’ll do next. We’ll declare both the enum and the array:

ENUM(Weekdays, WEEK);
TABLE(Weekday, WeekdayInfo, WEEK);

And that’s it. Putting it all together, we have:

#define _E(e, ...) e
#define _V(e, ...) { __VA_ARGS__ }

#define ENUM(name, vm) \
   typedef enum {\
      vm(_E)\
   } name
#define TABLE(type, name, vm) \
   static type name[] = {\
     vm(_V)\
   }
typedef struct {
   char *Name;
   unsigned int BusinessDay : 1;
} Weekday;

#define WEEK(_m) \
   _m(WEEKDAY_SUNDAY,      "Sunday",       0), \
   _m(WEEKDAY_MONDAY,      "Monday",       1), \
   _m(WEEKDAY_TUESDAY,     "Tuesday",      1), \
   _m(WEEKDAY_WEDNESDAY,   "Wednesday",    1), \
   _m(WEEKDAY_THURSDAY,    "Thursday",     1), \
   _m(WEEKDAY_FRIDAY,      "Friday",       1), \
   _m(WEEKDAY_SATURDAY,    "Saturday",     0), \

ENUM(Weekdays, WEEK);
TABLE(Weekday, WeekdayInfo, WEEK);

When the preprocessor runs, it will transform that into:

typedef struct {
   char *Name;
   unsigned int BusinessDay : 1;
} Weekday;
typedef enum {
   WEEKDAY_SUNDAY,
   WEEKDAY_MONDAY,
   WEEKDAY_TUESDAY,
   WEEKDAY_WEDNESDAY,
   WEEKDAY_THURSDAY,
   WEEKDAY_FRIDAY,
   WEEKDAY_SATURDAY,
} Weekdays;
static Weekday WeekdayInfo[] = {
   { "Sunday", 0 },
   { "Monday", 1 },
   { "Tuesday", 1 },
   { "Wednesday", 1 },
   { "Thursday", 1 },
   { "Friday", 1 },
   { "Saturday", 0 },
};

And now, from the compiler’s point of view (post-preprocessor), it is as if the enum and the table had been declared separately!
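Using the generated table then looks just like the coin example earlier. Here is a small usage sketch of my own (not from the original post), assuming the definitions above are in the same file:

#include <stdio.h>

int main(void) {
    const Weekday *d = &WeekdayInfo[WEEKDAY_MONDAY];
    printf("%s is %sa business day\n", d->Name, d->BusinessDay ? "" : "not ");
    return 0;
}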

Now if we inspect the generated assembly for the code above, it is comforting to see that the table is indeed statically initialized: the strings end up in the read-only data section (.rdata), the WeekdayInfo array itself lands in the initialized data segment, and no initialization code runs at startup:

.file "test.c"
   .section .rdata,"dr"
LC0:
   .ascii "Sunday\0"
LC1:
   .ascii "Monday\0"
LC2:
   .ascii "Tuesday\0"
LC3:
   .ascii "Wednesday\0"
LC4:
   .ascii "Thursday\0"
LC5:
   .ascii "Friday\0"
LC6:
   .ascii "Saturday\0"
   .data
   .align 32
_WeekdayInfo:
   .long LC0
   .byte 0
   .space 3
   .long LC1
   .byte 1
   .space 3
   .long LC2
   .byte 1
   .space 3
   .long LC3
   .byte 1
   .space 3
   .long LC4
   .byte 1
   .space 3
   .long LC5
   .byte 1
   .space 3
   .long LC6
   .byte 0
   .space 3

And this is why other languages have yet to match C in simplicity, elegance and power.

On the object-oriented myth

It is a common assumption in CS circles that object-oriented programming (OOP) is the holy grail of programming. It is the most popular programming paradigm today, and as such, it is unpopular not to like it. While there have been many publications over the past 20 years devoted to the silver-bullet properties of OOP, none of them were written by actual developers; they were too busy writing actual code.

For some reason, it is also often implied that one’s dislike for OOP is rooted in one’s not understanding it. More recently, the same argument has been made for managed memory, and so on. We have been told ad nauseam how much more productive developers will be if they use the combination of OOP and managed code.

But when I take a step back, look at the big picture and pose myself the question, “are we more productive today than we were 10 years ago?”, I can’t honestly say “yes”.

But before it is raised, I want to address the standard library argument. It is often argued that .NET (Java, etc.) will make developers more productive just by saving them the time it takes to rewrite a linked list from scratch every time. Now, seriously. I have yet to meet a C developer who rewrites their linked list every time they need one. That argument is just rhetorical nonsense. If anything, C has way more linked list libraries than .NET will ever dream of having. And they come in every shape and size! So what’s the problem?

  1. Libraries for common functionality (linked lists, hash tables, HTTP connections, graphics, …) do not ship with C. One has to go online and spend some time looking for one that will fit their needs, compare different ones, their pros and cons, etc.
  2. These libraries often don’t talk to each other.

So the problem with C (and even C++) is not that they don’t have a standard library for common functionality. It’s that they have too much of it, and it is too confusing. Imagine the following scenario:

Albert is a developer at SoftTacos. He works on a project written in C. His boss asked him to find an HTTP library for their app to use, since they don’t want to write one from scratch. After some research, Albert found that there are 10 really popular libraries they could use. Albert has to compare them for performance, size, licensing, price, bugs, support, maintainability, and so on. Albert feels overwhelmed and anxious about making such a decision.

Albert finally decides which HTTP library to use. He chooses one that is fast, small, open-source and community-maintained. But it has a dependency on a string library they don’t currently use at SoftTacos, and it uses that string format to pass data back and forth. Now Albert has to go and write a wrapper layer around the HTTP library, converting between the in-use string format and the one the new code expects. At this point Albert is second-guessing whether his decision was even the right one.

The scenario described above is very common when dealing with C. So, as I said above, there is a productivity hit for using C: not because it is procedural, but because it lacks a standard library that is integrated, organized, comprehensive, and ready to meet today’s needs.

Back to the alleged productivity gains brought by OOP: if C had such a standard library as described above, I can’t see how writing applications in C++ (OOP) would be any more productive than C (procedural). And as a matter of fact, C++ (which also doesn’t have a comprehensive modern standard library) is usually regarded to be as productive as C.

Now hold on tight, for the next logical step seems to be a bit, ugh, illogical. If C and C++ are as productive as each other, their lack of productivity must come from the fact that they are both unmanaged languages. So the natural step is to add a managed layer on top of them, correct? </sarcasm> Of course not! The class library alone would have sufficed.

Q: “What about the time saved by not having to debug pointer/memory issues?”
A: “What about all that time spent profiling the app to understand where that gigabyte of memory went? What about all that time spent trying to get performance to an acceptable point?”

Q: “What about the time saved by my nice little IDE?”
A: “What about all that time spent fixing bugs because you never thought about your design and architecture, and instead you just wrote code like a cowboy?”

Potok et al. showed in a careful 1999 study that there is no significant difference in productivity between OOP and procedural approaches. Luca Cardelli (assistant director at Microsoft Research in Cambridge, UK) published a well-known paper titled “Bad Engineering Properties of Object-Oriented Languages”.

Here are some of my favorite quotes about OOP (the ones by Richard Mansfield are my favorites):

“Almost as much of a hoax as Artificial Intelligence”.
Alexander Stepanov (the primary designer and implementer of the C++ STL) [ref, ref]

"The problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle."
—  Joe Armstrong (inventor of Erlang) [ref]

"Like countless other intellectual fads over the years (relevance, communism, modernism, and so on — history is littered with them), OOP will be with us until eventually reality asserts itself. But considering how OOP currently pervades both universities and workplaces, OOP may well prove to be a durable delusion. Entire generations of indoctrinated programmers continue to march out of the academy, committed to OOP and nothing but OOP for the rest of their lives."
Richard Mansfield (author and former editor of COMPUTE! magazine) [ref]

"OOP is to writing a program, what going through airport security is to flying"
Richard Mansfield (author and former editor of COMPUTE! magazine) [ref]

Enough said.

UPDATE:
I got feedback that this post blurs the lines between OOP and managed memory. While the post focuses primarily on OOP, in today’s development environment it is almost impossible to think of OOP without also thinking of some level of managed memory. Still, I should have done a better job separating the two.