Overview
What is D?
D is a general purpose systems and applications programming language. It is a high level language, but retains the ability to write high performance code and interface directly with the operating system API's and with hardware. D is well suited to writing medium to large scale million line programs with teams of developers. D is easy to learn, provides many capabilities to aid the programmer, and is well suited to aggressive compiler optimization technology.
D is not a scripting language, nor an interpreted language. It doesn't come with a VM, a religion, or an overriding philosophy. It's a practical language for practical programmers who need to get the job done quickly, reliably, and leave behind maintainable, easy to understand code.
D is the culmination of decades of experience implementing compilers for many diverse languages, and attempting to construct large projects using those languages. D draws inspiration from those other languages (most especially C++) and tempers it with experience and real world practicality.
Why D?
Why, indeed. Who needs another programming language?
The software industry is continuously evolving at a breakneck pace. New ideas appear, and older ideas are either validated or discarded. What programmers need and want out of a programming language changes. Available memory and computing power have increased by orders of magnitude, as well as the scale of programs being developed.
Compilers are no longer terribly constrained by available computing resources, and so are able to do much more for the programmer. Much more powerful language features have become practical. These features can be difficult to retrofit into existing languages, and when there are enough of these, a new language is justified.
Major Design Goals of D
Everything in designing a language is a tradeoff. Keeping some principles in mind will help to make the right decisions.
- Enable writing fast, effective code in a straightforward manner.
- Make it easier to write code that is portable from compiler to compiler, machine to machine, and operating system to operating system. Eliminate undefined and implementation defined behaviors as much as practical.
- Provide syntactic and semantic constructs that eliminate or at least reduce common mistakes. Reduce or even eliminate the need for third party static code checkers.
- Support memory safe programming.
- Support multi-paradigm programming, i.e. at a minimum support imperative, structured, object oriented, generic and even functional programming paradigms.
- Make doing things the right way easier than the wrong way.
- Have a short learning curve for programmers comfortable with programming in C, C++ or Java.
- Provide low level bare metal access as required. Provide a means for the advanced programmer to escape checking as necessary.
- Be compatible with the local C application binary interface.
- Where D code looks the same as C code, have it either behave the same or issue an error.
- Have a context-free grammar. Successful parsing must not require semantic analysis.
- Easily support writing internationalized applications - Unicode everywhere.
- Incorporate Contract Programming and unit testing methodology.
- Be able to build lightweight, standalone programs.
- Reduce the costs of creating documentation.
- Provide sufficient semantics to enable advances in compiler optimization technology.
- Cater to the needs of numerical analysis programmers.
- Obviously, sometimes these goals will conflict. Resolution will be in favor of usability.
Transitioning to D
The general look of D is like C and C++. This makes it easier to learn and port code to D. Transitioning from C/C++ to D should feel natural. The programmer will not have to learn an entirely new way of doing things.
Using D will not mean that the programmer will become restricted to a specialized runtime vm (virtual machine) like the Java vm or the Smalltalk vm. There is no D vm, it's a straightforward compiler that generates linkable object files. D connects to the operating system just like C does. The usual familiar tools like make will fit right in with D development.
- The general look and feel of C/C++ is adopted. It uses the same algebraic syntax, most of the same expression and statement forms, and the general layout.
- D programs can be written in familiar paradigms, function-and-data, object-oriented, template metaprogramming, functional, or any mix of them.
- The compile/link/debug development model is carried forward from languages like C, although nothing precludes D from being compiled into bytecode and interpreted.
- Exception handling. More and more experience with exception handling shows it to be a superior way to handle errors than error codes.
- Runtime Type Identification. Fully supporting RTTI it enables better garbage collection, better debugger support, more automated persistence, etc.
- D maintains function link compatibility with the C calling conventions. This makes it possible for D programs to access operating system API's directly. Programmers' knowledge and experience with existing programming API's and paradigms can be carried forward to D with minimal effort.
- Limited support for function link compatibility with C++ calling conventions. In particular, there is compatibility with C++ name mangling, namespaces, templates, and exceptions.
- Operator overloading. D programs can overload operators enabling extension of the basic types with user-defined types.
- Template Metaprogramming. Parametric polymorphism (aka template metaprogramming) is straightforward in D.
- RAII (Resource Acquisition Is Initialization). RAII techniques are an essential component of writing reliable software.
- Custom memory management. There are times in systems programming where developers need alternatives to garbage collection. D also allows for manual memory management, techniques such as reference counting, and the use of specialized memory allocators.
- Down and dirty programming. D retains the ability to do down-and-dirty programming without resorting to referring to external modules compiled in a different language. Sometimes, it's just necessary to coerce a pointer or dip into assembly when doing systems work. D's goal is not to prevent down and dirty programming, but to minimize the need for it in solving routine coding tasks.
Features To Leave Behind
- Support for 16 bit computers. No consideration is given in D for mixed near/far pointers and all the machinations necessary to generate good 16 bit code. The D language design assumes at least a 32 bit flat memory space and supports 64 bit as well.
- Mutual dependence of compiler passes. Having a clean separation between the lexer, parser, and semantic passes makes for an easier to implement and understand language. This means no user-defined tokens and no user-defined syntax.
- Macro systems. Macros are an easy way for users to add very powerful features. Unfortunately, the complexity is pushed off on to the user trying to understand the macro usage, and the result is rarely worth it and not justifiable.
- C/C++ source code compatibility. The use of the preprocessor has made this difficult to achieve without inevitably requiring manual touch up. It's best to leave this to external tools.
- Multiple inheritance. It's a complex feature of debatable value. It's very difficult to implement in an efficient manner, and compilers are prone to many bugs in implementing it. Nearly all the value of MI can be handled with single inheritance coupled with interfaces and aggregation. What's left does not justify the weight of MI implementation.
Who is D For?
- Programmers who routinely use lint or similar code analysis tools to eliminate bugs before the code is even compiled.
- People who compile with maximum warning levels turned on and who instruct the compiler to treat warnings as errors.
- Programming managers who are forced to rely on programming style guidelines to avoid common bugs.
- Projects that need built-in testing and verification.
- Teams who write apps with a million lines of code in it.
- Programmers who think the language should provide enough features to obviate the continual necessity to manipulate pointers directly.
- Programmers who write half their application in scripting languages the other half in native langauges to speed up the bottlenecks. D has productivity features that make using such hybrid approaches unnecessary.
- D's lexical analyzer and parser are totally independent of each other and of the semantic analyzer. This means it is easy to write simple tools to manipulate D source perfectly without having to build a full compiler. It also means that source code can be transmitted in tokenized form for specialized applications.
Who D is Not For?
- Realistically, nobody is going to convert million line legacy programs into D. D, however, can connect directly to any code that exposes a C ABI interface or (to a limited extent) a C++ ABI interface.
- As a first programming language - Python or JavaScript is more suitable for beginners. D makes an excellent second language for intermediate to advanced programmers.
- Language purists. D is a practical language, and each feature of it is evaluated in that light, rather than by an ideal. For example, D has constructs and semantics that virtually eliminate the need for pointers for ordinary tasks. But pointers are still there, because sometimes the rules need to be broken. Similarly, casts are still there for those times when the typing system needs to be overridden.
Major Features of D
This section lists some of the more interesting features of D in various categories.
Object Oriented Programming
Classes
D's object oriented nature comes from classes. The inheritance model is single inheritance enhanced with interfaces. The class Object sits at the root of the inheritance hierarchy, so all classes implement a common set of functionality. Classes are instantiated by reference, and so complex code to clean up after exceptions is not required.
See the Classes page for more information.
Operator Overloading
Classes can be crafted that work with existing operators to extend the type system to support new types. An example would be creating a bignumber class and then overloading the +, -, * and / operators to enable using ordinary algebraic syntax with them.
See the Operator Overloading page for more information.
Functional Programming
Functional programming has a lot to offer in terms of encapsulation, concurrent programming, memory safety, and composition. D's support for functional style programming include:
- Pure functions
- Immutable types and data structures
- Lambda functions and closures
Productivity
Modules
Source files have a one-to-one correspondence with modules.
See the Module page for more information.
Declaration vs Definition
Functions and classes are defined once. There is no need for declarations when they are forward referenced. A module can be imported, and all its public declarations become available to the importer.
Example:
class ABC { int func() { return 7; } static int z = 7; } int q;
All members are defined in the class or struct, not separately.
class Foo { int foo(Bar c) { return c.bar; } } class Bar { int bar() { return 3; } }
Whether a D function is inlined or not is determined by the optimizer settings.
Templates
D templates offer a clean way to support generic programming while offering the power of partial specialization. Template classes and template functions are available, along with variadic template arguments and tuples.
See the Templates page for more information.
Associative Arrays
Associative arrays are arrays with an arbitrary data type as the index rather than being limited to an integer index. In essence, associated arrays are hash tables. Associative arrays make it easy to build fast, efficient, bug-free symbol tables.
See the Associative Arrays page for more information.
Documentation
Documentation has traditionally been done twice - first there are comments documenting what a function does, and then this gets rewritten into a separate html or man page. And naturally, over time, they'll tend to diverge as the code gets updated and the separate documentation doesn't. Being able to generate the requisite polished documentation directly from the comments embedded in the source will not only cut the time in half needed to prepare documentation, it will make it much easier to keep the documentation in sync with the code. Ddoc is the specification for the D documentation generator. This page was generated by Ddoc, too.
Although third party tools exist to do this for other languages. they have some serious shortcomings:
- It is difficult to parse the language correctly. Third party tools tend to parse only a subset of a language correctly, so their use will constrain the source code to that subset.
- Different compilers support different versions of a language and have different extensions to it. Third party tools have a problem matching all these variations.
- Third party tools may not be available for all the desired platforms, and they're necessarily on a different upgrade cycle from the compilers.
- Having it builtin to the compiler means it is standardized across all D implementations. Having a default one ready to go at all times means it is far more likely to be used.
Functions
D has the expected support for ordinary functions including global functions, overloaded functions, inlining of functions, member functions, virtual functions, function pointers, etc. In addition:
Nested Functions
Functions can be nested within other functions. This is highly useful for code factoring, locality, and function closure techniques.
Function Literals
Anonymous functions can be embedded directly into an expression.
Dynamic Closures
Nested functions and class member functions can be referenced with closures (also called delegates), making generic programming much easier and type safe.
In, Out, and Ref Parameters
Not only does specifying this help make functions more self-documenting, it eliminates much of the necessity for pointers without sacrificing anything, and it opens up possibilities for more compiler help in finding coding problems.
Such makes it possible for D to directly interface to a wider variety of foreign API's. There would be no need for workarounds like "Interface Definition Languages".
Arrays
C arrays have several faults that can be corrected:
- Dimension information is not carried around with the array, and so has to be stored and passed separately. The classic example of this are the argc and argv parameters to main(int argc, char *argv[]). (In D, main is declared as main(string[] args).)
- Arrays are not first class objects. When an array is passed to a function, it is converted to a pointer, even though the prototype confusingly says it's an array. When this conversion happens, all array type information gets lost.
- C arrays cannot be resized. This means that even simple aggregates like a stack need to be constructed as a complex class.
- C arrays cannot be bounds checked, because they don't know what the array bounds are.
- Arrays are declared with the [] after the identifier. This leads to
very clumsy
syntax to declare things like a pointer to an array:
int (*array)[3];
In D, the [] for the array go on the left:
int[3]* array; // declares a pointer to an array of 3 ints long[] func(int x); // declares a function returning an array of longs
which is much simpler to understand.
D arrays come in several varieties: pointers, static arrays, dynamic arrays, and associative arrays.
See the Arrays page for more information.
Strings
String manipulation needs direct support in the language. Modern languages handle string concatenation, copying, etc., and so does D. Strings are a direct consequence of improved array handling.
Ranges
D uses the concept of a range in lieu of iterators or generators found in other languages. A range is any type that provides a common interface to a sequence of values. The purpose of a range is to allow for a simpler way to write code that works on arbitrary data, thereby making it reusable.
The most basic type of range is called an input range, which provides three methods.
struct MyRange { auto front() { // return the next value in the sequence } void popFront() { // move the front of the sequence to the next value } bool empty() { // true if the range has no more values to return } }
To understand the power of this simple interface, let's run through an example. Say we wanted to write a program that took all of the employees in a company, took out the ones younger than 40, and grouped the remaining into an array of arrays by their organization.
struct Employee { uint id; uint organization_id; string name; uint age; } struct Employees { Employee[] data; this(Employee[] employees) { data = employees; } Employee front() { return data[0]; } void popFront() { data = data[1 .. $]; } bool empty() { return data.length == 0; } }
Here the data is coming from a constructor as a simple example, but it could come from any source, like a CSV or a database.
Please note that this code should not be used in actual code, because in D, the basic dynamic array also acts as a range, so any algorithm that accepts ranges also accepts arrays. However, static arrays are not considered ranges, as the operation popFront is based on mutating the length of the range, which is impossible with static arrays. To get a range out of a static array, create a slice containing all of its elements, like so:
int[4] array = [1, 2, 3, 4]; // not a range array[]; // valid range
Now that the range is defined, we can populate it and write the filtering code.
void main() { import std.algorithm.iteration : filter, chunkBy; Employees employees = Employees([ Employee(1, 1, "George", 50), Employee(2, 3, "John", 65), Employee(3, 2, "David", 40), Employee(4, 1, "Eli", 40), Employee(5, 2, "Hal", 35) ]); auto older_employees = employees .filter!(a => a.age > 40) // lambdas in D use the => syntax .chunkBy!((a,b) => a.organization_id == b.organization_id); }
All of the algorithms in std.algorithm work with ranges to avoid the problem of rewriting common functionality for every project. std.algorithm implements sorts, filters, maps, reductions, and more.
Because our Employees struct conforms to the input range definition, it can also be used in foreach loops, which automatically detects if the value passed is an input range.
foreach(employee; employees)
{
writeln(employee);
}
which is equivalent to
for(; !employees.empty; employees.popFront())
{
writeln(employees.front);
}
Types of Ranges
The input range is just the most basic form of range, there are also
- forward ranges
- bidirectional ranges
- random access ranges
- output ranges
Each of these ranges represents a distinct way of accessing the underlying data. Or, in the case of the output range, a way to send data to another source. To learn more about each type of range, see the range primitives page in the standard library documentation.
Each of these types of ranges give you access to different algorithms in the standard library. For example, a filter can be run on an input range, but not a sort, which requires a random access range with the slice operator overload defined. The requirements for each function can be seen in the function signature in the documentation in the template constraints. See the template page and the range primitives link above for more details on template constraints.
As stated before, dynamic arrays and associative arrays in D act as ranges. Specifically, they are random access ranges, so any of the functions in std.algorithm work with these basic types.
Advanced Range Code
The following example uses input ranges and an output range to take data from stdin, take only the unique lines, sort them, and then print the result to stdout.
// Sort lines import std.stdio; import std.array; import std.algorithm; void main() { stdin .byLine(KeepTerminator.yes) .uniq .map!(a => a.idup) .array .sort .copy(stdout.lockingTextWriter());
For more examples of range based code, see the Ranges chapter in Ali Çehreli's book "Programming In D". Also, see the article Component programming with ranges by H. S. Teoh.
Resource Management
Automatic Memory Management
D memory allocation is fully garbage collected. With garbage collection, programming gets much simpler. Garbage collection eliminates the need for tedious, error prone memory allocation tracking code. This not only means much faster development time and lower maintenance costs, but the resulting program frequently runs faster.
For a fuller discussion of this, see garbage collection.
Explicit Memory Management
Despite D being a garbage collected language, the new and delete operations can be overridden for particular classes so that a custom allocator can be used.
RAII
RAII is a modern software development technique to manage resource allocation and deallocation. D supports RAII in a controlled, predictable manner that is independent of the garbage collection cycle.
Performance
Lightweight Aggregates
D supports simple C style structs, both for compatibility with C data structures and because they're useful when the full power of classes is overkill.
Inline Assembler
Device drivers, high performance system applications, embedded systems, and specialized code sometimes need to dip into assembly language to get the job done. While D implementations are not required to implement the inline assembler, it is defined and part of the language. Most assembly code needs can be handled with it, obviating the need for separate assemblers or DLL's.
Many D implementations will also support intrinsic functions analogously to C's support of intrinsics for I/O port manipulation, direct access to special floating point operations, etc.
See the Inline Assembler page for more information.
Reliability
A modern language should do all it can to help the programmer flush out bugs in the code. Help can come in many forms; from making it easy to use more robust techniques, to compiler flagging of obviously incorrect code, to runtime checking.
Contracts
Contract Programming (invented by Dr. Bertrand Meyer) is a technique to aid in ensuring the correctness of programs. D's version of Contracts includes function preconditions, function postconditions, class invariants, and assert contracts. See Contracts for D's implementation.
Unit Tests
Unit tests can be added to a class, such that they are automatically run upon program startup. This aids in verifying, in every build, that class implementations weren't inadvertently broken. The unit tests form part of the source code for a class. Creating them becomes a natural part of the class development process, as opposed to throwing the finished code over the wall to the testing group.
Unit tests can be done in other languages, but the result is kludgy and the languages just aren't accommodating of the concept. Unit testing is a main feature of D. For library functions it works out great, serving both to guarantee that the functions actually work and to illustrate how to use the functions.
Consider the many library and application code bases out there for download on the web. How much of it comes with *any* verification tests at all, let alone unit testing? The usual practice is if it compiles, we assume it works. And we wonder if the warnings the compiler spits out in the process are real bugs or just nattering about nits.
Along with Contract Programming, unit testing makes D far and away the best language for writing reliable, robust systems applications. Unit testing also gives us a quick-and-dirty estimate of the quality of some unknown piece of D code dropped in our laps - if it has no unit tests and no contracts, it's unacceptable.
See the Unit Tests page for more information.
Debug Attributes and Statements
Now debug is part of the syntax of the language. The code can be enabled or disabled at compile time, without the use of macros or preprocessing commands. The debug syntax enables a consistent, portable, and understandable recognition that real source code needs to be able to generate both debug compilations and release compilations.
Exception Handling
The superior try-catch-finally model is used rather than just try-catch. There's no need to create dummy objects just to have the destructor implement the finally semantics.
Synchronization
Multithreaded programming is becoming more and more mainstream, and D provides primitives to build multithreaded programs with. Synchronization can be done at either the method or the object level.
synchronized int func() { ... }
Synchronized functions allow only one thread at a time to be executing that function.
The synchronize statement puts a mutex around a block of statements, controlling access either by object or globally.
Support for Robust Techniques
- Dynamic arrays instead of pointers
- Reference variables instead of pointers
- Reference objects instead of pointers
- Garbage collection instead of explicit memory management
- Built-in primitives for thread synchronization
- No macros to inadvertently slam code
- Inline functions instead of macros
- Vastly reduced need for pointers
- Integral type sizes are explicit
- No more uncertainty about the signed-ness of chars
- No need to duplicate declarations in source and header files.
- Explicit parsing support for adding in debug code.
Compile Time Checks
- Stronger type checking
- No empty ; for loop bodies
- Assignments do not yield boolean results
- Deprecating of obsolete API's
Runtime Checking
- assert() expressions
- array bounds checking
- undefined case in switch exception
- out of memory exception
- In, out, and class invariant Contract Programming support
Compatibility
Operator precedence and evaluation rules
D retains C operators and their precedence rules, order of evaluation rules, and promotion rules. This avoids subtle bugs that might arise from being so used to the way C does things that one has a great deal of trouble finding bugs due to different semantics.
Direct Access to C API's
Not only does D have data types that correspond to C types, it provides direct access to C functions. There is no need to write wrapper functions, parameter swizzlers, nor code to copy aggregate members one by one.
Support for all C data types
Making it possible to interface to any C API or existing C library code. This support includes structs, unions, enums, pointers, and all C99 types. D includes the capability to set the alignment of struct members to ensure compatibility with externally imposed data formats.
OS Exception Handling
D's exception handling mechanism will connect to the way the underlying operating system handles exceptions in an application.
Uses Existing Tools
D produces code in standard object file format, enabling the use of standard assemblers, linkers, debuggers, profilers, exe compressors, and other analyzers, as well as linking to code written in other languages.
Project Management
Versioning
D provides built-in support for generation of multiple versions of a program from the same text. It replaces the C preprocessor #if/#endif technique.
Deprecation
As code evolves over time, some old library code gets replaced with newer, better versions. The old versions must be available to support legacy code, but they can be marked as deprecated. Code that uses deprecated versions will be normally flagged as illegal, but would be allowed by a compiler switch. This will make it easy for maintenance programmers to identify any dependence on deprecated features.
Sample D Program (sieve.d)
/* Sieve of Eratosthenes prime numbers */ import std.stdio; void main() { size_t count; bool[8191] flags; writeln("10 iterations"); // using iter as a throwaway variable foreach (iter; 1 .. 11) { count = 0; flags[] = 1; foreach (index, flag; flags) { if (flag) { size_t prime = index + index + 3; size_t k = index + prime; while (k < flags.length) { flags[k] = 0; k += prime; } count += 1; } } } writefln("%d primes", count); }
NB: The expectation may be that array index x represents the number x, with i + i + 3 seeming odd at first glance. However, if one were to consider each index, it would mean that the first element would represent 0 + 0 + 3 = 3; the second element would represent 1 + 1 + 3 = 5; the third element would represent 2 + 2 + 3 = 7; and so on. So the numbers represented by the array actually go from 3 to (8190 + 8190 + 3), or 16383.