CAPcelerate Project
Outline
CAPcelerate is a £1.2m project funded by the UK Industrial Strategy Challenge Fund's Digital Security by Design (DSbD) programme, led by Prof. Timothy Jones and Dr A. Theodore Markettos.
We are considering the implications of using CHERI capabilities in various classes of accelerators, such as GPUs, AI and crypto accelerators, and FPGAs.
In particular, we are interested in workloads with rich software stacks that interwork with CPU code in shared memory.
We are considering whether it is feasible to add capability support to such accelerators and their software stacks (compilers, drivers, libraries, etc.), or whether alternative schemes can be used to protect against accelerators that might be running malicious software, or that might themselves be malicious hardware (for example, a malicious external Thunderbolt GPU).
Team
News
- 1 Aug 2024: A paper about our SIMTight GPGPU has been accepted to ICCD 2024. This paper covers our baseline design and dynamic scalarisation features. A separate paper about adding CHERI support to SIMTight is in preparation.
- 16 Feb 2023: We implemented a low-level GPU device driver that runs in user space and uses one of the CHERI compartmentalisation techniques currently in development. We tested it with a set of OpenGL game traces on Morello. We plan to implement support for multiple client applications in the coming months.
- 25 Nov 2022: We finished a security review of the interfaces between our user space driver and the kernel, and between our driver and the user space layer above it.
- 4 Nov 2022: Our SIMTight GPGPU now
supports functional unit sharing between vector lanes. This can
be used to implement area-expensive instructions, such as CHERI
bounds-setting instructions, at low hardware cost, while still
allowing maximum run-time performance when the instruction is
scalarisable.
- 25 Oct 2022: We fixed a bug in CheriBSD's virtual memory system that caused significant performance issues with some of the game traces that we use for our device driver work.
- 12 Oct 2022: We presented a poster about GPU driver
compartments at the DSbD all hands meeting in Wolverhampton.
- 7 Oct 2022: Our SIMTight GPGPU now
fully implements register file compression by supporting a
hardware mechanism to dynamically spill registers to main memory when
the on-chip storage available to the register file is exhausted. This
extends our earlier work on dynamic scalarisation.
- 6 Sep 2022: We recorded OpenGL traces from a set of games for our GPU device driver work and ran them on Morello with a purecap graphics stack.
- 26 Aug 2022: Our SIMTight GPGPU now
supports a scalarised vector store buffer, reducing the cost of
compiler-inserted register spills at low hardware cost. This is
particularly useful when the register being spilled is a capability,
reducing the memory bandwidth overhead of CHERI's double-sized pointers.
This extends our earlier work on dynamic scalarisation.
- 21 May 2022: Our SIMTight GPGPU now
supports parallel scalar/vector pipelines, building upon our
earlier work on dynamic scalarisation. It allows scalarisable
instructions to be executed in parallel with vector ones, doubling
performance density in workloads with sufficient scalar
behaviour. In future, this feature may be used to implement
commonly-scalarisable CHERI instructions at low hardware cost.
- 12 May 2022: The first Morello board arrived.
- 7 Apr 2022: We presented a poster about capabilities for heterogeneous
accelerators at the DSbD all hands meeting in London.
- 7 Feb 2022: We implemented a proof of concept for our GPU device driver work that runs in user space a portion of the low-level GPU-specific code that is normally part of the kernel. We implemented this on Linux and an ARM development board. We used a set of compute kernels for preliminary performance investigations.
- 8 Dec 2021: Our SIMTight GPGPU now
implements dynamic scalarisation. This has the potential to
largely eliminate the on-chip storage overhead of CHERI's double-sized
registers by exploiting the redundancy of capability meta-data between
hardware threads.
- 10 Aug 2021: We have released SIMTight, a
fully-synthesisable RISC-V GPGPU with support for CHERI! This includes a
CUDA-like programming API called NoCL
along with a suite of benchmarks which run in pure capability
mode.
- 7 May 2021: We presented a poster about adding CHERI support to
GPGPUs at the DSbD all hands virtual meeting.
Posters
Releases
- SIMTight,
our fully synthesisable CHERI-enabled RISC-V GPGPU with high
performance density on Intel's Stratix 10 FPGA board.
- NoCL,
our CUDA-like programming API (and benchmark suite) in plain C++ that
runs in pure capability mode.
- Blarney,
our Haskell library for hardware description, used to implement
SIMTight.
Links