- Quick Examples
- ImportC Dialect
- Invoking ImportC
- Preprocessor
- Predefined Macros
- Preprocessor Directives
- Implementation
- Limitations
- Extensions
- Gnu and Clang Extensions
- Visual C Extensions
- Digital Mars C Extensions
- ImportC from D's Point of View
- Wrapping C Code
- Warnings
- ImportC++
- Other Solutions
- How ImportC Works
ImportC
ImportC is a C compiler embedded into the D implementation. It enables direct importation of C files, without needing to manually prepare a D file corresponding to the declarations in the C file. It directly compiles C files into modules that can be linked in with D code to form an executable. It can be used as a C compiler to compile and link 100% C programs.
Quick Examples
C code in file hello.c:
#include <stdio.h>
int main()
{
    printf("hello world\n");
    return 0;
}
Compile and run:
dmd hello.c
./hello
hello world
C function in file functions.c:
int square(int i) { return i * i; }
D program in file demo.d:
import std.stdio;
import functions;
void main()
{
    int i = 7;
    writefln("The square of %s is %s", i, square(i));
}
Compile and run:
dmd demo.d functions.c
./demo
The square of 7 is 49
ImportC Dialect
There are many versions of C. ImportC is an implementation of ISO/IEC 9899:2011, which will be referred to as C11. References to the C11 Standard are written as C11 followed by the paragraph number. Prior versions, such as C99, C89, and K&R C, are not supported.
Further adjustment is made to take advantage of some of the D implementation's capabilities.
Invoking ImportC
The ImportC compiler can be invoked:
- directly via the command line
- indirectly via importing a C file
ImportC Files on the Command Line
ImportC files have the extension .i or .c. If no extension is given, .i is tried first, then .c.
dmd hello.c
will compile hello.c with ImportC and link it to create the executable file hello (hello.exe on Windows), which can then be run.
Importing C Files from D Code
Use the D ImportDeclaration:
import hello;
which will, if hello is not a D file, and has an extension .i or .c, compile hello with ImportC.
Preprocessor
ImportC does not have a preprocessor. It is designed to compile C files after they have been first run through the C preprocessor. ImportC can automatically run the C preprocessor associated with the Associated C Compiler, or a preprocessor can be run manually.
Running the Preprocessor Automatically
If the C file has a .c extension, ImportC will run the preprocessor for it automatically.
- When compiling for Windows with the -m32omf switch, sppn.exe will be used as the preprocessor.
- When compiling for Windows with the -m32mscoff or the -m64 switch, cl.exe /P /Zc:preprocessor will be used as the preprocessor.
- When compiling for OSX, the clang -E preprocessor will be used.
- Otherwise the cpp preprocessor will be used.
The druntime file src/importc.h will automatically be #included.
The -v switch can be used to observe the command that invokes the preprocessor.
The -Ppreprocessorflag switch passes preprocessorflag to the preprocessor.
Running the Preprocessor Manually
If the C file has a .i extension, the file is presumed to be already preprocessed. Preprocessing can be run manually:
Digital Mars C Preprocessor sppn.exe
sppn.exe runs on Win32 and is invoked as:
sppn file.c
and the preprocessed output is written to file.i.
Gnu C Preprocessor
The Gnu C Preprocessor can be invoked as:
gcc -E file.c > file.i
Clang C Preprocessor
The Clang Preprocessor can be invoked as:
clang -E file.c -o file.i
Microsoft VC Preprocessor
The VC Preprocessor can be invoked as:
cl /P /Zc:preprocessor file.c -Fifile.i
and the preprocessed output is written to file.i.
dmpp C Preprocessor
The dmpp C Preprocessor can be invoked as:
dmpp file.c
and the preprocessed output is written to file.i.
Preprocessor Macros
ImportC collects all the #define macros from the preprocessor run when it is run automatically. The macros that look like manifest constants, such as:
#define COLOR 0x123456
are interpreted as D manifest constant declarations of the form:
enum COLOR = 0x123456;
The variety of macros that can be interpreted as D declarations may be expanded, but will never encompass all the metaprogramming uses of C macros.
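As a rough illustration of that distinction, the sketch below contrasts a constant-like macro with a function-like one. The names SCALE and scaled_color are invented for this example, and the comments about what D code can see are an assumption drawn from the description above, not a statement of exactly which macros a given compiler version translates.

```c
#include <assert.h>

/* Constant-like macro: the kind described above as being exposed
   to D as `enum COLOR = 0x123456;`. */
#define COLOR 0x123456

/* Function-like macro: assumed here not to be usable from D, so it is
   wrapped in an ordinary C function that D code can call instead. */
#define SCALE(x) ((x) * 2)

/* Hypothetical wrapper giving D code access to the macro's result. */
int scaled_color(void)
{
    int result = SCALE(COLOR);
    assert(result == 0x2468AC);   /* 0x123456 * 2 */
    return result;
}
```

Wrapping the macro in a function keeps the C header untouched while still giving the D side something it can call.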
Predefined Macros
ImportC does not predefine any macros.
To distinguish compilation by ImportC from compilation by some other C compiler, use:
#if __IMPORTC__
__IMPORTC__ is defined in src/importc.h which is automatically included when the preprocessor is run.
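A minimal sketch of how such a check might be used to select between two code paths follows. The function name built_with_importc is hypothetical; the only thing taken from the text above is the __IMPORTC__ macro itself.

```c
/* Returns 1 when the file was preprocessed with importc.h in effect
   (which defines __IMPORTC__), and 0 under a compiler that leaves the
   macro undefined; an undefined macro evaluates to 0 inside #if. */
int built_with_importc(void)
{
#if __IMPORTC__
    return 1;
#else
    return 0;
#endif
}
```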
Preprocessor Directives
ImportC supports these preprocessor directives:
Line control
C11 6.10.4
Linemarker
Linemarker directives are normally embedded in the output of C preprocessors.
pragma
The following pragmas are supported:
- #pragma pack ( )
- #pragma pack ( show )
- #pragma pack ( push )
- #pragma pack ( push , identifier )
- #pragma pack ( push , integer )
- #pragma pack ( push , identifier , integer )
- #pragma pack ( pop )
- #pragma pack ( pop PopList )
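The push and pop forms listed above are typically used in pairs around a declaration. The following sketch, using the invented struct names Loose and Tight, shows the usual effect of packing to a 1-byte boundary; the exact size of the unpacked struct depends on the target's alignment rules.

```c
#include <assert.h>
#include <stddef.h>

/* Default layout: padding is normally inserted after `tag`. */
struct Loose { char tag; int value; };

#pragma pack(push, 1)   /* save the current packing, then pack to 1 byte */
struct Tight { char tag; int value; };
#pragma pack(pop)       /* restore the saved packing */

/* With 1-byte packing there is no padding between the members. */
static_assert(sizeof(struct Tight) == sizeof(char) + sizeof(int),
              "Tight should have no padding");
static_assert(offsetof(struct Tight, value) == 1,
              "value should directly follow tag");

size_t loose_size(void) { return sizeof(struct Loose); }
size_t tight_size(void) { return sizeof(struct Tight); }
```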
Implementation
The implementation defined characteristics of ImportC are:
Enums
enumeration-constants are always typed as int.
The expression that defines the value of an enumeration-constant must be an integral type and evaluate to an integer value that fits in an int.
enum E { A = -10, B = 0x81231234 }; // ok
enum F { C = 0x812312345678 };      // error, doesn't fit in int
enum G { D = 1.0 };                 // error, not integral type
The enumerated type is int.
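The rule for enumeration-constants can be checked at compile time. This is a small sketch with an invented enum Color; it only asserts what C11 6.7.2.2 guarantees for the constants, since the size of the enumerated type itself is implementation defined in other C compilers.

```c
#include <assert.h>

enum Color { RED = -10, GREEN, BLUE = 0x1234 };

/* Every enumeration-constant has type int. */
static_assert(sizeof(RED) == sizeof(int), "enumeration-constants are int");
static_assert(sizeof(BLUE) == sizeof(int), "enumeration-constants are int");

/* GREEN has no explicit value, so it is one more than RED. */
int green_value(void)
{
    return GREEN;
}
```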
Bit Fields
There are many implementation defined aspects of C11 bit fields. ImportC's behavior adjusts to match the behavior of the associated C compiler on the target platform.
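Because layout is implementation defined, code shared between ImportC and another C compiler is safest when it relies only on the values stored in bit fields, not on their position or on the struct's size. The sketch below uses an invented struct Flags to show the portable part of the behavior.

```c
#include <assert.h>

struct Flags {
    unsigned ready : 1;
    unsigned mode  : 3;   /* holds values 0..7 */
};

/* Stores a value in the 3-bit field and reads it back. Values that do
   not fit are reduced modulo 8, as for any unsigned bit field. */
unsigned roundtrip_mode(unsigned mode)
{
    struct Flags f = { 0, 0 };
    f.mode = mode;
    assert(f.mode <= 7u);
    return f.mode;
}
```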
Implicit Function Declarations
Implicit function declarations:
int main()
{
    func();  // implicit declaration of func()
}
were allowed in K&R C and C89, but were removed in C99 and remain invalid in C11. Although many C compilers still support them, ImportC does not.
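Code that relies on implicit declarations can usually be adapted by adding a prototype ahead of the first call. A minimal sketch, with the hypothetical caller call_func added for illustration:

```c
int func(void);          /* explicit prototype, visible before the call */

int call_func(void)
{
    return func();       /* accepted: func has already been declared */
}

int func(void)
{
    return 42;
}
```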
#pragma STDC FENV_ACCESS
This is described in C11 7.6.1:
#pragma STDC FENV_ACCESS on-off-switch
on-off-switch:
    ON
    OFF
    DEFAULT
It is completely ignored.
Limitations
Exception Handling
ImportC is assumed to never throw exceptions. setjmp and longjmp are not supported.
Const
C11 specifies that const only applies locally. const in ImportC applies transitively, meaning that although
int *const p;
means in C11 that p is a const pointer to int, in ImportC it means p is a const pointer to a const int.
Volatile
The volatile type-qualifier (C11 6.7.3) is ignored. Using volatile to implement shared memory access is unlikely to work anyway; _Atomic is intended for that. To access a device register, call a separately compiled function that performs the access, or use the inline assembler.
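One way to follow that advice is to keep the volatile accesses in a small file of their own, built with the associated C compiler and linked in. The sketch below shows what such a file might contain; the names reg_read32 and reg_write32 are hypothetical, and the file is assumed to be compiled by a compiler that honors volatile.

```c
#include <stdint.h>

/* Reads a 32-bit device register. The volatile qualifier forces a real
   load on every call when built by a compiler that honors it. */
uint32_t reg_read32(const volatile uint32_t *reg)
{
    return *reg;
}

/* Writes a 32-bit device register with a real store. */
void reg_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}
```

ImportC or D code would then declare these two functions and call them, leaving the volatile semantics to the separately compiled object file.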
Restrict
The restrict type-qualifier (C11 6.7.3) is ignored.
_Atomic
The _Atomic type-qualifier (C11 6.7.3) is ignored. To perform atomic operations, call an externally compiled function or use the inline assembler.
Compatible Types
Compatible Types (C11 6.7.2) are identical types in ImportC.
Same only Different Types
On some platforms, C long and unsigned long are the same size as int and unsigned int, respectively. On other platforms, C long and unsigned long are the same size as long long and unsigned long long. long double and long double _Complex can be the same size as double and double _Complex. In ImportC, types that have the same size and signedness are treated as the same type.
_Generic
ImportC's handling of generic selection expressions (C11 6.5.1.1) differs from C11. The types in Same only Different Types are indistinguishable in the type-name parts of generic-association. Instead of giving an error for duplicate types per C11 6.5.1.1-2, ImportC will select the first compatible type-name in the generic-assoc-list.
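The sketch below shows a generic selection that separates int from long. The macro TYPE_CODE and the helper functions are invented for this example. The values in the comments are what a standard C11 compiler produces; going by the rule described above, ImportC on a platform where long is the same size as int would be expected to pick the int association for a long operand as well, because it comes first in the list.

```c
/* Maps the type of an expression to a small code. */
#define TYPE_CODE(x) _Generic((x), int: 1, long: 2, default: 0)

int code_of_int(void)    { return TYPE_CODE(0);   }  /* int association     */
int code_of_long(void)   { return TYPE_CODE(0L);  }  /* long association    */
int code_of_double(void) { return TYPE_CODE(0.0); }  /* default association */
```

Code that must behave identically under both compilers should avoid listing both members of such a pair in one generic-assoc-list.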
Extensions
Asm statement
For the D language, asm is a standard keyword, and its construct is shared with ImportC. For the C language, asm is an extension (J.5.10), and the recommendation is to instead use __asm__. All alternative keywords for asm are translated by the druntime file src/importc.h during the preprocessing stage.
The asm keyword may be used to embed assembler instructions; its syntax is implementation defined. The Digital Mars D compiler only supports the dialect of inline assembly described in the documentation of the D x86 Inline Assembler.
asm in a function or variable declaration may be used to specify the mangle name for a symbol. Its use is analogous to pragma mangle.
char **myenviron asm("environ") = 0;
int myprintf(char *, ...) asm("printf");
Using asm to associate registers with variables is ignored.
Forward References
Any declarations in scope can be accessed, not just declarations that lexically precede a reference.
Compile Time Function Execution
Evaluating constant expressions includes executing functions in the same manner as D's CTFE can. A constant-expression invokes CTFE.
Examples:
_Static_assert("\x1"[0] == 1, "failed");
int mint1() { return -1; }
_Static_assert(mint1() == -1, "failed");
const int a = 7;
int b = a; // sets b to 7
Function Inlining
Functions for which the function body is present can be inlined by ImportC as well as by the D code that calls them.
Enum Base Types
Enums are extended with an optional EnumBaseType:
EnumDeclaration:
    enum Identifier : EnumBaseType EnumBody
EnumBaseType:
    Type
which, when supplied, causes the enum members to be implicitly cast to the EnumBaseType.
enum S : byte { A };
_Static_assert(sizeof(A) == 1, "A should be size 1");
Register Storage Class
Objects with register storage class are treated as auto declarations.
Objects with register storage class may have their address taken. C11 6.3.2.1-2
Arrays can have register storage class, and may be enregistered by the compiler. C11 6.3.2.1-3
typeof Operator
The typeof operator may be used as a type specifier:
type-specifier:
    typeof-specifier
typeof-specifier:
    typeof ( expression )
    typeof ( type-name )
Import Declarations
Modules can be imported with a CImportDeclaration:
CImportDeclaration:
    __import ImportList ;
Imports enable ImportC code to directly access D declarations and functions without the necessity of creating a .h file representing those declarations. The tedium and brittleness of keeping the .h file up-to-date with the D declarations is eliminated. D functions are available to be inlined.
Imports also enable ImportC code to directly import other C files without needing to create a .h file for them, either. Imported C functions become available to be inlined.
The ImportList works the same as it does for D.
The ordering of CImportDeclarations has no significance.
An ImportC file can be imported; the name of the C file to be imported is derived from the module name.
All the global symbols in the ImportC file become available to the importing module.
If a name referred to in the importing file is not found there, the global symbols in each imported file are searched for the name. If it is found in exactly one module, that becomes the resolution of the name. If it is found in multiple modules, it is an error.
Preprocessor symbols in the imported module are not available to the importing module, and preprocessing symbols in the importing file are not available to the imported module.
A D module can be imported in the same manner as with an ImportDeclaration.
Imports can be circular.
__import core.stdc.stdarg; // get D declaration of va_list
__import mycode;           // import mycode.c
int foo()
{
    va_list x;    // picks up va_list from core.stdc.stdarg
    return 1 + A; // returns 4
}
mycode.c looks like:
enum E { A = 3 };
Gnu and Clang Extensions
gcc and clang are presumed to have the same behavior w.r.t. extensions, so gcc as used here refers to both.
__attribute__((noreturn))
__attribute__((noreturn)) marks a function as never returning. gcc treats this as an attribute of the function; it is not part of the function's type. In D, a function that never returns has the return type noreturn. The difference can be seen with the code:
__attribute__((noreturn)) int foo();
size_t x = sizeof(foo());
This code is accepted by gcc, but makes no sense for D. Hence, although it works in ImportC, it is not representable as D code, meaning one must use judgement in creating a .di file to interface with C noreturn functions.
Furthermore, the D compiler takes advantage of noreturn functions by issuing compile time errors for unreachable code. Such unreachable code, however, is valid C11, and the ImportC compiler will accept it.
Visual C Extensions
Digital Mars C Extensions
ImportC from D's Point of View
There is no one-to-one mapping of C constructs to D constructs, although it is very close. What follows is a description of how the D side views the C declarations that are imported.
Module Name
The module name assigned to the ImportC file is the filename stripped of its path and extension. This is just like the default module name assigned to a D module that does not have a module declaration.
extern (C)
All C symbols are extern (C).
Enums
The C enum:
enum E { A, B = 2 };
appears to D code as:
enum E : int { A, B = 2 }
alias A = E.A;
alias B = E.B;
The .min and .max properties are available:
static assert(E.min == 0 && E.max == 2);
Tag Symbols
Tag symbols are the identifiers that appear after the struct, union, and enum keywords (C11 6.7.2.3). In C, they are placed in a different symbol table from other identifiers. This means two different symbols can use the same name:
int S;
struct S { int a, b; };
S = 3;
struct S *ps;
D does not make this distinction. Given a tag symbol that is the only declaration of an identifier, that's what the D compiler recognizes. Given a tag symbol and a non-tag symbol that share an identifier, the D compiler recognizes the non-tag symbol. This is normally not a problem due to the common C practice of applying typedef, as in:
typedef struct S { int a, b; } S;
The D compiler recognizes the typedef applied to S, and the code compiles as expected. But when typedef is absent, as in:
int S;
struct S { int a, b; };
the most pragmatic workaround is to add a typedef to the C code:
int S;
struct S { int a, b; };
typedef struct S S_t; // add this typedef
Then the D compiler can access the struct tag symbol via S_t.
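On the C side the two names continue to coexist after the typedef is added. The sketch below, with the hypothetical function make_pair, uses both the int named S and the struct reached through S_t in the same file.

```c
int S;                          /* ordinary identifier */
struct S { int a, b; };         /* tag with the same spelling */
typedef struct S S_t;           /* extra name that D code can use */

/* Builds a struct S through the typedef and records the sum of its
   members in the unrelated int variable S. */
S_t make_pair(int a, int b)
{
    S_t pair = { a, b };
    S = a + b;                  /* refers to the int, not the struct */
    return pair;
}
```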
Wrapping C Code
Many difficulties with adapting C code to ImportC can be resolved without editing the C code itself: wrap the C code in another C file and then #include it. Consider the following problematic C file file.c:
void func(int *__restrict p);
int S;
struct S { int a, b; };
The problems are that __restrict is not a type qualifier recognized by ImportC (or C11), and that the struct S is hidden from D by the declaration int S;. To wrap file.c with a fix, create the file file_ic.c with the contents:
#define __restrict restrict
#include "file.c"
typedef struct S S_t;
Then use import file_ic; instead of import file;, and use S_t wherever struct S is desired.
Warnings
Many suspicious C constructs normally cause warnings to be emitted by default by typical compilers, such as:
int *p = 3; // Warning: integer implicitly converted to pointer
ImportC does not emit warnings. The presumption is the user will be importing existing C code developed using another C compiler, and it is written as intended. If C11 says it is legal, ImportC accepts it.
ImportC++
ImportC will not compile C++ code. For that, use dpp.
Other Solutions
dpp by Atila Neves
From the Article:
dpp is a compiler wrapper that will parse a D source file with the .dpp extension and expand in place any #include directives it encounters, translating all of the C or C++ symbols to D, and then pass the result to a D compiler (DMD by default).
Like DStep, dpp relies on libclang.
DStep by Jacob Carlborg
From the Article:
DStep is a tool for automatically generating D bindings for C and Objective-C libraries. This is implemented by processing C or Objective-C header files and outputting D modules. DStep uses the Clang compiler as a library (libclang) to process the header files.
htod by Walter Bright
htod converts a C .h file to a D source file, suitable for importing into D code. htod is built from the front end of the Digital Mars C and C++ compiler. It works just like a C or C++ compiler except that its output is source code for a D module rather than object code.
How ImportC Works
ImportC's implementation is based on the idea that D's semantics are very similar to C's. ImportC gets its own parser, which converts the C syntax into the same AST (Abstract Syntax Tree) that D uses. The lexer for ImportC is the same as for D, but with some modifications here and there, such as the keywords and integer literals being different. Where the semantics of C differ from D, there are adjustments in the semantic analysis code in the D compiler.
This co-opting of the D semantic implementation allows ImportC to be able to do things like handle forward references, CTFE (Compile Time Function Execution), and inlining of C functions into D code. Being able to handle forward references means it is not necessary to even write a .h file to be able to import C declarations into D. Being able to perform CTFE is very handy for testing that ImportC is working without needing to generate an executable. But, in general, the strong temptation to add D features to ImportC has been resisted.
The optimizer and code generator are, of course, the same as D uses.