This document specifies the C language dialect used by Xen and the assumptions Xen makes on the translation toolchain. It covers, in particular:
All points are of course relevant for portability. In addition, programming in C is impossible without a detailed knowledge of the implementation-defined behaviors. For this reason, Xen developers should be familiar with this document and the documentation it references.
This document needs maintenance and adaptation in the following circumstances:
Xen is written in C99 with extensions. The relevant ISO standard is
ISO/IEC 9899:1999/Cor 3:2007: Programming Languages - C, Technical Corrigendum 3. ISO/IEC, Geneva, Switzerland, 2007.
The following documents are referred to in the sequel:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build
The following table lists the extensions currently used in Xen. The table columns are as follows:
- Extension
a terse description of the extension;
- Architectures
a set of Xen architectures making use of the extension;
- References
when available, references to the documentation explaining the syntax and semantics of (each instance of) the extension.
The following table lists the translation limits that a toolchain has to satisfy in order to translate Xen. The numbers given are a compromise. On the one hand, many modern compilers have very generous limits (in several cases, the only limitation is the amount of available memory). On the other hand, we prefer limits that are not too high, because compilers are under no obligation to diagnose when a limit has been exceeded, and not too low, so as to avoid frequently updating this document. The table lists only the limits that go beyond the minima specified by the relevant C Standard.
The table columns are as follows:
- Limit
a terse description of the translation limit;
- Architectures
a set of relevant Xen architectures;
- Threshold
a value that the Xen project does not wish to exceed for that limit (this is typically below, and often much below, what the translation toolchain supports);
- References
when available, references to the documentation providing evidence that the translation toolchain honors the threshold (and more).
| Limit | Architectures | Threshold | References |
|---|---|---|---|
| Size of an object | ARM64, X86_64 | 8388608 | The maximum size of an object is defined in the MAX_SIZE macro, and for a 32-bit architecture is 8 MB. The maximum size for an array is defined in PTRDIFF_MAX, and for a 32-bit architecture is 2^30-1. See occurrences of these macros in GCC_MANUAL. |
| Characters in one logical source line | ARM64 | 5000 | See Section "11.2 Implementation limits" of CPP_MANUAL. |
| Characters in one logical source line | X86_64 | 12000 | See Section "11.2 Implementation limits" of CPP_MANUAL. |
| Nesting levels for #include files | ARM64 | 24 | See Section "11.2 Implementation limits" of CPP_MANUAL. |
| Nesting levels for #include files | X86_64 | 32 | See Section "11.2 Implementation limits" of CPP_MANUAL. |
| case labels for a switch statement (excluding those for any nested switch statements) | X86_64 | 1500 | See Section "4.12 Statements" of GCC_MANUAL. |
| Number of significant initial characters in an external identifier | ARM64, X86_64 | 63 | See Section "4.3 Identifiers" of GCC_MANUAL. |
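Because compilers are not obliged to diagnose an exceeded limit, a project can guard some thresholds itself. The following is a minimal sketch of such a guard for the "Size of an object" threshold. The names `OBJECT_SIZE_THRESHOLD`, `struct scratch_area` and `fits_object_threshold` are hypothetical and do not come from Xen. `_Static_assert` is a C11 feature and is assumed here to be accepted by the compiler as an extension when translating C99.

```c
#include <stddef.h>

/* Hypothetical name; the value is the "Size of an object" threshold
 * from the table above. */
#define OBJECT_SIZE_THRESHOLD 8388608u

/* Hypothetical example object: a 4 MiB scratch buffer. */
struct scratch_area {
    unsigned char bytes[4u * 1024u * 1024u];
};

/* Compile-time guard: translation fails if the object outgrows the
 * threshold. _Static_assert is C11 and is assumed to be available. */
_Static_assert(sizeof(struct scratch_area) <= OBJECT_SIZE_THRESHOLD,
               "struct scratch_area exceeds the object size threshold");

/* Run-time counterpart for sizes that are only known at run time.
 * Returns 1 if the size respects the threshold, 0 otherwise. */
static inline int fits_object_threshold(size_t size)
{
    return size <= OBJECT_SIZE_THRESHOLD;
}
```

A compile-time guard of this kind only covers object sizes. Limits such as the length of a logical source line or the #include nesting depth would need an external check, for example in a build script.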
The following table lists the C language implementation-defined behaviors relevant for MISRA C:2012 Dir 1.1 upon which Xen may possibly depend.
The table columns are as follows:
- I.-D.B.
a terse description of the implementation-defined behavior;
- Architectures
a set of relevant Xen architectures;
- Value(s)
for i.-d.b.'s with values, the values allowed;
- References
when available, references to the documentation providing details about how the i.-d.b. is resolved by the translation toolchain.
| I.-D.B. | Architectures | Value(s) | References |
|---|---|---|---|
| Allowable bit-field types other than _Bool, signed int, and unsigned int | ARM64, X86_64 | All explicitly signed integer types, all unsigned integer types, and enumerations. | See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL. |
| #pragma preprocessing directive that is documented as causing translation failure or some other form of undefined behavior is encountered | ARM64, X86_64 | pack, GCC visibility | |
| The number of bits in a byte | ARM64 | 8 | See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL. |
| The number of bits in a byte | X86_64 | 8 | See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. |
| Whether signed integer types are represented using sign and magnitude, two's complement, or one's complement, and whether the extraordinary value is a trap representation or an ordinary value | ARM64, X86_64 | Two's complement | See Section "4.5 Integers" of GCC_MANUAL. |
| Any extended integer types that exist in the implementation | X86_64 | See Section "6.9 128-bit Integers" of GCC_MANUAL. | |
| The number, order, and encoding of bytes in any object | ARM64 | See Section "4.15 Architecture" of GCC_MANUAL and Chapter 5 "Data types and alignment" of ARM64_ABI_MANUAL. | |
| The number, order, and encoding of bytes in any object | X86_64 | See Section "4.15 Architecture" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| Whether a bit-field can straddle a storage-unit boundary | ARM64 | See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "8.1.8 Bit-fields" of ARM64_ABI_MANUAL. | |
| Whether a bit-field can straddle a storage-unit boundary | X86_64 | See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| The order of allocation of bit-fields within a unit | ARM64 | See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "8.1.8 Bit-fields" of ARM64_ABI_MANUAL. | |
| The order of allocation of bit-fields within a unit | X86_64 | See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| What constitutes an access to an object that has volatile-qualified type | ARM64, X86_64 | See Section "4.10 Qualifiers" of GCC_MANUAL. | |
| The values or expressions assigned to the macros specified in the headers <float.h>, <limits.h>, and <stdint.h> | ARM64 | See Section "4.15 Architecture" of GCC_MANUAL and Chapter 5 "Data types and alignment" of ARM64_ABI_MANUAL. | |
| The values or expressions assigned to the macros specified in the headers <float.h>, <limits.h>, and <stdint.h> | X86_64 | See Section "4.15 Architecture" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| Character not in the basic source character set is encountered in a source file, except in an identifier, a character constant, a string literal, a header name, a comment, or a preprocessing token that is never converted to a token | ARM64 | UTF-8 | See Section "1.1 Character sets" of CPP_MANUAL. We assume the locale does not restrict which UTF-8 characters are part of the source character set. |
| The value of a char object into which has been stored any character other than a member of the basic execution character set | ARM64 | See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL. | |
| The value of a char object into which has been stored any character other than a member of the basic execution character set | X86_64 | See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| The value of an integer character constant containing more than one character or containing a character or escape sequence that does not map to a single-byte execution character | ARM64 | See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL. | |
| The value of an integer character constant containing more than one character or containing a character or escape sequence that does not map to a single-byte execution character | X86_64 | See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL. | |
| The mapping of members of the source character set | ARM64, X86_64 | See Section "4.4 Characters" of GCC_MANUAL and the documentation for -finput-charset=charset in the same manual. | |
| The members of the source and execution character sets, except as explicitly specified in the Standard | ARM64, X86_64 | UTF-8 | See Section "4.4 Characters" of GCC_MANUAL. |
| The values of the members of the execution character set | ARM64, X86_64 | See Section "4.4 Characters" of GCC_MANUAL and the documentation for -fexec-charset=charset in the same manual. | |
| How a diagnostic is identified | ARM64, X86_64 | See Section "4.1 Translation" of GCC_MANUAL. | |
| The places that are searched for an included < > delimited header, and how the places are specified or the header is identified | ARM64, X86_64 | See Chapter "2 Header Files" of CPP_MANUAL. | |
| How the named source file is searched for in an included " " delimited header | ARM64, X86_64 | See Chapter "2 Header Files" of CPP_MANUAL. | |
| How sequences in both forms of header names are mapped to headers or external source file names | ARM64, X86_64 | See Chapter "2 Header Files" of CPP_MANUAL. | |
| Whether the # operator inserts a character before the character that begins a universal character name in a character constant or string literal | ARM64, X86_64 | See Section "3.4 Stringizing" of CPP_MANUAL. | |
| The current locale used to convert a wide string literal into corresponding wide character codes | ARM64, X86_64 | See Section "4.4 Characters" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL. | |
| The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set | X86_64 | See Section "4.4 Characters" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL. | |
| The behavior on each recognized #pragma directive | ARM64, X86_64 | pack, GCC visibility | See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section "7 Pragmas" of CPP_MANUAL. |
| The method by which preprocessing tokens (possibly resulting from macro expansion) in a #include directive are combined into a header name | X86_64 | See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL. | |
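Some of the behaviors listed above can be observed directly from C code. The following is a minimal sketch, assuming a GCC-compatible compiler targeting one of the architectures in the table. All type and function names (`is_twos_complement`, `enum colour`, `struct flags`, `struct packed_pair`, `struct natural_pair`) are hypothetical and chosen for illustration only.

```c
#include <limits.h>

/* With two's complement, -1 has all value bits set, so (-1 & 3) is 3.
 * One's complement would give 2 and sign and magnitude would give 1. */
static inline int is_twos_complement(void)
{
    return (-1 & 3) == 3 && INT_MIN == -INT_MAX - 1;
}

/* Hypothetical enumeration used as a bit-field type below. */
enum colour { RED, GREEN, BLUE };

/* Bit-fields whose types are not _Bool, signed int or unsigned int.
 * Accepting these types is implementation-defined. */
struct flags {
    unsigned char      ready  : 1;
    unsigned long long offset : 40;
    enum colour        colour : 2;
};

/* #pragma pack is one of the two pragmas listed in the table. With a
 * packing of 1, no padding is inserted between the members. */
#pragma pack(push, 1)
struct packed_pair {
    char tag;
    int  value;
};
#pragma pack(pop)

/* The same members with the default layout, for comparison. */
struct natural_pair {
    char tag;
    int  value;
};
```

Comparing `sizeof(struct packed_pair)` with `sizeof(struct natural_pair)` shows the effect of the pragma. The exact layout of `struct flags` depends on the ABI documents referenced in the table.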
https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst
A summary table of data types, sizes and alignment is below:
| Type | Size | Alignment | Architectures |
|---|---|---|---|
| char | 8 bits | 8 bits | x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64, ARMv8-A AArch64, RV64, PPC64 |
| short | 16 bits | 16 bits | x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64, ARMv8-A AArch64, RV64, PPC64 |
| int | 32 bits | 32 bits | x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64, ARMv8-A AArch64, RV64, PPC64 |
| long | 32 bits | 32 bits | x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A |
| long | 64 bits | 64 bits | x86_64, ARMv8-A AArch64, RV64, PPC64 |
| long long | 64 bits | 32 bits | x86_32 |
| long long | 64 bits | 64 bits | x86_64, ARMv8-A AArch64, RV64, PPC64, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A |
| pointer | 32 bits | 32 bits | x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A |
| pointer | 64 bits | 64 bits | x86_64, ARMv8-A AArch64, RV64, PPC64 |
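The sizes in the table can be checked against the toolchain in use. The following is a minimal sketch that classifies the target by the sizes of `long` and of pointers. The function name `data_model_name` is hypothetical, and the labels "LP64" and "ILP32" are the conventional names for the two groups of rows. They are not used elsewhere in this document. Alignment is not checked here.

```c
#include <limits.h>
#include <stddef.h>

/* Returns "LP64" when long and pointers are 64 bits wide, "ILP32" when
 * both are 32 bits wide, and "other" when the sizes do not match any
 * row of the table above. */
static inline const char *data_model_name(void)
{
    /* char, short, int and long long have the same size on every
     * architecture listed in the table. */
    if (CHAR_BIT != 8 || sizeof(short) != 2 ||
        sizeof(int) != 4 || sizeof(long long) != 8)
        return "other";

    if (sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";

    if (sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";

    return "other";
}
```

A result of "other" would indicate a target whose data model is not covered by the table.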
END OF DOCUMENT.