Pointer provenance validity
CHERI C/C++ implement pointers using architectural
capabilities, rather than using conventional 32-bit or 64-bit integers.
This allows the provenance validity of language-level pointers to be
protected by the provenance properties of CHERI architectural capabilities:
only pointers implemented using valid capabilities can be dereferenced.
Other types that contain pointers, uintptr_t and intptr_t, are similarly implemented
using architectural capabilities, so that casts through these types
can retain capability properties.
When a dereference is attempted on a capability without a valid tag —
including load, store, and instruction fetch — a hardware exception fires
(see Capability-related faults).
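As a minimal sketch of the cast behavior described above, the following C fragment round-trips a pointer through uintptr_t and then dereferences it. The helper names (stash_pointer, load_through_cookie, round_trip) are hypothetical and exist only for illustration. On a conventional architecture the code simply round-trips an address; under CHERI C/C++ the expectation is that the capability's tag and bounds travel with the uintptr_t value, so the final dereference remains valid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: stores a pointer in an integer-typed slot.
 * In CHERI C/C++, uintptr_t is implemented as a capability, so the
 * pointer's metadata is expected to be carried along with the value. */
static uintptr_t stash_pointer(int *p) {
    return (uintptr_t)p;
}

/* Hypothetical helper: recovers the pointer and dereferences it. */
static int load_through_cookie(uintptr_t cookie) {
    int *p = (int *)cookie;
    return *p;
}

/* Round-trips a pointer to a local variable through uintptr_t. */
static int round_trip(int value) {
    int local = value;
    uintptr_t cookie = stash_pointer(&local);
    assert(cookie != 0); /* the address of a local is never null */
    return load_through_cookie(cookie);
}
```

Calling round_trip(42) returns 42, because the pointer recovered from the uintptr_t value still refers to the local variable.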
On the whole, the effects of pointer provenance validity are non-disruptive to C/C++ source code. However, a number of cases exist in language runtimes and other (typically less portable) C code where conflating integers and pointers can disrupt provenance validity. In general, generated code will propagate provenance validity in only two situations:
- Pointer types: The compiler will generate suitable code to propagate the provenance validity of pointers by using capability load and store instructions. This occurs when using a pointer type (e.g., void *) or an integer type defined as being able to hold a pointer (e.g., intptr_t). As with attempting to store 64-bit pointers in 32-bit integers on 64-bit architectures, passing a pointer through an inappropriate type will lead to truncation of metadata (e.g., the validity tag and bounds). It is therefore important that a suitable type be used to hold pointers. This pattern often occurs where an opaque field exists in a data structure (e.g., a long argument to a callback in older C code) that needs to be changed to use a capability-oblivious type such as intptr_t.
- Capability-oblivious code: In some portions of the C/C++ runtime and compiler-generated code, it may not be possible to know whether memory is intended to contain a pointer or not, and yet preserving pointers is desirable. In those cases, memory accesses must be performed in a way that preserves pointer provenance. In the C runtime itself, this includes memcpy, which must use capability load and store instructions to transparently propagate capability metadata and tags. A useful example of potentially surprising code requiring modification for CHERI C/C++ is qsort. Some C programs assume that qsort on an array of data structures containing pointers will preserve the usability of those pointers. As a result, qsort must be modified to perform memory copies using pointer-based types, such as intptr_t, when size and alignment require it.
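The opaque callback argument mentioned under "Pointer types" can be sketched as follows. All names here (callback_fn, struct counter, bump, invoke, demo_callback) are hypothetical and not taken from any particular code base. The opaque parameter is declared as intptr_t; had it been declared as long, the pointer's tag and bounds would be expected to be lost in the cast under CHERI C/C++, and the dereference inside the callback would then fault.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: the opaque argument is intptr_t, an
 * integer type defined as being able to hold a pointer. */
typedef int (*callback_fn)(intptr_t opaque);

struct counter {
    int hits;
};

/* Callback that recovers its state from the opaque argument. */
static int bump(intptr_t opaque) {
    struct counter *c = (struct counter *)opaque;
    return ++c->hits;
}

/* Hypothetical dispatcher that forwards the opaque argument untouched. */
static int invoke(callback_fn fn, intptr_t opaque) {
    return fn(opaque);
}

/* Invokes the callback twice on the same counter and reports the total. */
static int demo_callback(void) {
    struct counter c = { 0 };
    int first = invoke(bump, (intptr_t)&c);
    assert(first == 1);
    invoke(bump, (intptr_t)&c);
    return c.hits;
}
```

Here demo_callback() returns 2, since both invocations reach the same counter through the intptr_t argument.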
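The qsort point under "Capability-oblivious code" can be illustrated with a simplified element swap. This is a sketch under stated assumptions, not the code of any actual C library: swap_elems, struct entry, and demo_swap are hypothetical names. The swap moves data in intptr_t-sized units whenever the element size and both addresses permit it, and falls back to byte copies otherwise; under CHERI C/C++ the intptr_t accesses are expected to compile to capability loads and stores, so pointers embedded in the elements stay usable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical swap helper in the style of a qsort implementation.
 * Note: accessing arbitrary element types through intptr_t lvalues is
 * type punning; the demo below uses intptr_t members to stay within
 * the aliasing rules. */
static void swap_elems(void *a, void *b, size_t size) {
    size_t align = _Alignof(intptr_t);
    int word_ok = size % sizeof(intptr_t) == 0 &&
                  (size_t)(uintptr_t)a % align == 0 &&
                  (size_t)(uintptr_t)b % align == 0;

    if (word_ok) {
        /* Pointer-sized copies: these can carry capability metadata. */
        intptr_t *pa = a, *pb = b;
        for (size_t i = 0; i < size / sizeof(intptr_t); i++) {
            intptr_t tmp = pa[i];
            pa[i] = pb[i];
            pb[i] = tmp;
        }
    } else {
        /* Byte copies: such elements cannot hold aligned pointers. */
        unsigned char *pa = a, *pb = b;
        for (size_t i = 0; i < size; i++) {
            unsigned char tmp = pa[i];
            pa[i] = pb[i];
            pb[i] = tmp;
        }
    }
}

/* Hypothetical element type: 'name' holds a pointer to a string. */
struct entry {
    intptr_t name;
    intptr_t key;
};

/* Swaps two entries and checks that the embedded pointers still work. */
static int demo_swap(void) {
    static const char first[] = "alpha", second[] = "beta";
    struct entry e[2] = { { (intptr_t)first, 1 }, { (intptr_t)second, 2 } };

    swap_elems(&e[0], &e[1], sizeof e[0]);

    return strcmp((const char *)e[0].name, "beta") == 0 && e[0].key == 2 &&
           strcmp((const char *)e[1].name, "alpha") == 0 && e[1].key == 1;
}
```

A byte-only swap would produce the same result on a conventional architecture, which is why the problem tends to surface only when such code is first built for CHERI.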