Appendix: Real-World Memory Safety Vulnerabilities in GPU/NPU Ecosystems

The six memory safety case studies in Section 6 demonstrate structural patterns where Rust prevents common mistakes. However, memory safety in accelerator code is not merely a theoretical concern — it has led to actively exploited zero-day vulnerabilities, production crashes, and security incidents across every major GPU/NPU vendor. This appendix documents concrete, citable cases.

A.1 ARM Mali GPU: Use-After-Free Exploited by Spyware (CVE-2023-4211)

A use-after-free in the ARM Mali GPU kernel driver’s VMA tracking allowed privilege escalation on billions of Android devices. An attacker could split a multi-page tracking VMA via munmap(), causing the teardown routine to null out kctx->process_mm while bookkeeping was still pending. Google TAG confirmed this was actively exploited by a commercial surveillance vendor. Safe Rust’s ownership model rules out this use-after-free by construction: the freed VMA would be consumed (moved and dropped), and any later reference to it would be rejected at compile time.

Sources: Google Project Zero; Arm Security Bulletin
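
The ownership argument above can be sketched in a few lines of safe Rust. This is a minimal illustration, not the Mali driver's code: TrackingVma and teardown are hypothetical names invented for the example, and the record is a plain struct rather than a kernel object.

```rust
// Hypothetical stand-in for a driver's VMA tracking record (assumption for this sketch).
struct TrackingVma {
    pages: usize,
}

// Teardown takes the record by value, so the caller gives up ownership here.
fn teardown(vma: TrackingVma) -> usize {
    let released = vma.pages;
    released // `vma` is dropped when this function returns
}

fn main() {
    let vma = TrackingVma { pages: 4 };
    let released = teardown(vma);
    println!("released {} pages", released);

    // Uncommenting the next line fails to compile with
    // error[E0382]: borrow of moved value: `vma`
    // println!("{}", vma.pages);
}
```

Because teardown consumes its argument, pending bookkeeping cannot hold on to the record afterwards unless ownership is shared explicitly.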

A.2 ARM Bifrost/Valhall GPU: Actively Exploited Zero-Day (CVE-2024-4610)

This is another use-after-free in ARM GPU drivers, this time affecting the Bifrost and Valhall architectures (r34p0–r40p0). CISA confirmed active exploitation in the wild across hundreds of millions of smartphones and embedded devices. In safe Rust, the borrow checker rejects any reference that could outlive the value it points to, so this dangling-reference pattern does not compile.

Source: CISA KEV Catalog

A.3 NVIDIA GPU Driver: Out-of-Bounds Write (CVE-2024-0090)

An out-of-bounds write in the NVIDIA GPU display driver for Linux and Windows enabled privilege escalation. Rust’s bounds checking on slice access would catch this with a safe panic rather than silent memory corruption.

Source: NVD; SecurityWeek
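
As a rough sketch of how bounds-checked access changes the failure mode, the helper below writes one byte at a caller-supplied offset. The function name write_at and the buffer layout are assumptions made for illustration and have nothing to do with NVIDIA's driver code.

```rust
// Hypothetical helper: writes `value` at `offset`, reporting whether the write happened.
fn write_at(buf: &mut [u8], offset: usize, value: u8) -> bool {
    match buf.get_mut(offset) {
        Some(slot) => {
            *slot = value;
            true
        }
        // An out-of-range offset is reported to the caller; nothing is written.
        None => false,
    }
}

fn main() {
    let mut buf = [0u8; 4];
    println!("in range:     {}", write_at(&mut buf, 3, 9));
    println!("out of range: {}", write_at(&mut buf, 4, 9));
    // Plain indexing, `buf[4] = 9`, would panic here instead of corrupting adjacent memory.
}
```

Using get_mut turns the out-of-range case into a value the caller must handle, while plain indexing turns it into a panic; neither path writes past the end of the buffer.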

A.4 AMDGPU Fence: Use-After-Free Race Condition (CVE-2023-51042)

A race condition in the Linux AMDGPU driver’s amdgpu_cs_wait_all_fences() allowed code to access a fence object after it was freed. This triggered kernel crashes and potential privilege escalation, requiring emergency patches from Red Hat, SUSE, and Ubuntu. Rust’s ownership model makes data races a compile-time error — the fence would be protected by Arc<Mutex<...>>, preventing both the use-after-free and the underlying race.

Source: NVD
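
A minimal sketch of the shared-ownership pattern mentioned above, assuming a simplified Fence struct with a single flag (the real amdgpu fence is a kernel C structure, and signal_and_wait is a hypothetical name). The fence lives as long as any Arc handle to it exists, and the Mutex serializes access.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical fence state (assumption for this sketch).
struct Fence {
    signaled: bool,
}

// Signals the fence from a worker thread while the caller keeps its own handle.
fn signal_and_wait() -> bool {
    let fence = Arc::new(Mutex::new(Fence { signaled: false }));

    let worker_handle = Arc::clone(&fence);
    let worker = thread::spawn(move || {
        worker_handle.lock().unwrap().signaled = true;
        // The worker's handle is dropped here; the fence stays alive
        // because the waiting side still holds its own Arc.
    });

    worker.join().unwrap();
    let signaled = fence.lock().unwrap().signaled;
    signaled
}

fn main() {
    println!("fence signaled: {}", signal_and_wait());
}
```

Sharing the Fence across threads without Arc and a lock (or another Sync wrapper) would be rejected by the compiler, which is the property the paragraph above relies on.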

A.5 NVIDIA CUDA Toolkit: Heap Buffer Overflow via Integer Overflow (CVE-2024-53873)

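Nine vulnerabilities were reported in NVIDIA CUDA Toolkit’s cuobjdump utility, caused by integer overflow during cubin file parsing that led to heap buffer overflow. Rust makes the overflow explicit rather than silent: arithmetic overflow panics in debug builds (and in release builds when overflow checks are enabled), while checked_mul and wrapping_mul let the code state which behavior it intends. Even if a size calculation wraps in a release build, Vec/slice bounds checking still prevents the subsequent heap corruption.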

Source: Palo Alto Unit42
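
The combination of checked arithmetic and bounds-checked slicing can be sketched as follows. The function read_table and its parameters are hypothetical and do not reflect cuobjdump's actual parsing logic; the example only assumes a parser that multiplies an entry count by an entry size taken from untrusted input.

```rust
// Hypothetical parser step: returns the bytes of a table with `count` entries
// of `entry_size` bytes each, or None if the request is not satisfiable.
fn read_table(data: &[u8], count: usize, entry_size: usize) -> Option<&[u8]> {
    // checked_mul yields None on overflow instead of wrapping to a small length.
    let len = count.checked_mul(entry_size)?;
    // get(..len) yields None when `len` exceeds the input, instead of reading past it.
    data.get(..len)
}

fn main() {
    let file = [0u8; 8];
    println!("{:?}", read_table(&file, 2, 4).map(|t| t.len()));
    println!("{:?}", read_table(&file, usize::MAX, 2).map(|t| t.len()));
    println!("{:?}", read_table(&file, 3, 4).map(|t| t.len()));
}
```

Both failure cases surface as None, so a malformed header has to be handled as a parse error rather than turning into an undersized allocation.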

A.6 Qualcomm Adreno GPU: Three Zero-Days Exploited in Targeted Attacks (CVE-2025-21479/21480/27038)

Three zero-day vulnerabilities were found in Qualcomm Adreno GPU drivers, including unauthorized GPU microcode command execution and a use-after-free during rendering. They were actively exploited in targeted attacks, and the affected drivers ship on billions of Android devices. Rust’s memory safety guarantees prevent the use-after-free, and the ownership model constrains which operations are possible on GPU resources.

Sources: The Hacker News; BleepingComputer

A.7 PyTorch CUDA Kernel: Silent Out-of-Bounds Access (Issue #37153)

In PyTorch’s Reduce.cuh, accessing iter.shape()[0] on a scalar input (where iter.shape() returns an empty array) caused an out-of-bounds memory read. This led to flaky test failures that were extremely difficult to reproduce or diagnose — a classic silent data corruption pattern. Rust’s slice indexing panics on empty-slice access rather than silently reading garbage memory.

Source: PyTorch Issue #37153
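
For illustration, here is how the same access looks in safe Rust. The function leading_dim is a hypothetical stand-in for reading the first dimension of a tensor shape; it is not part of PyTorch or any Rust binding to it.

```rust
// Hypothetical helper: returns the leading dimension, or None for a scalar
// whose shape is empty.
fn leading_dim(shape: &[i64]) -> Option<i64> {
    shape.first().copied()
}

fn main() {
    let matrix_shape = [3i64, 4];
    let scalar_shape: [i64; 0] = [];

    println!("{:?}", leading_dim(&matrix_shape));
    println!("{:?}", leading_dim(&scalar_shape));
    // Indexing directly, `scalar_shape[0]`, would panic with an index-out-of-bounds
    // message rather than returning whatever happens to sit in memory.
}
```

Either way, the scalar case fails deterministically on every run, which is what makes it diagnosable instead of flaky.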

A.8 TensorFlow GPU Kernels: Repeated Heap Buffer Overflows (CVE-2023-25668, CVE-2020-15198, CVE-2019-16778)

TensorFlow GPU kernels show a recurring pattern of heap buffer overflows: QuantizeAndDequantize reading past tensor bounds (CVE-2023-25668), SparseCountSparseOutput with mismatched tensor shapes (CVE-2020-15198), and UnsortedSegmentSum truncating int64 to int32 and producing negative indices (CVE-2019-16778). These are particularly dangerous because ML models loaded from untrusted sources can trigger them. Rust addresses all three: bounds checking catches out-of-range accesses, the type system can enforce shape consistency, and narrowing conversions never happen implicitly. A narrowing must be written out, either as an as cast (which still truncates, but visibly) or as a fallible conversion such as i32::try_from, which returns an error instead of a wrong value.

Sources: Snyk: CVE-2023-25668; GitHub Advisory: CVE-2019-16778
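
The int64-to-int32 case can be sketched with a fallible conversion. The function segment_index is a hypothetical name for this example and is unrelated to TensorFlow's implementation; the point is only the difference between a truncating cast and a checked one.

```rust
use std::convert::TryFrom;

// Hypothetical helper: converts a 64-bit segment id to a 32-bit index,
// returning None when the value does not fit.
fn segment_index(id: i64) -> Option<i32> {
    i32::try_from(id).ok()
}

fn main() {
    let big: i64 = 1 << 31; // one past i32::MAX

    // An `as` cast keeps only the low 32 bits, which here yields a negative index.
    println!("as cast:  {}", big as i32); // prints -2147483648
    // The checked conversion reports the problem instead.
    println!("try_from: {:?}", segment_index(big)); // prints None
    println!("try_from: {:?}", segment_index(7)); // prints Some(7)
}
```

The truncating form is still available in Rust, so the protection comes from choosing try_from (or a lint that flags as casts) at the points where indices are narrowed.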

A.9 GPU Memory Exploitation for Fun and Profit (USENIX Security 2024)

Academic research demonstrating that buffer overflows in CUDA kernel global memory can be exploited for code injection, return-oriented programming on GPU, and cross-tenant ML model weight corruption. Unlike CPUs, GPU memory spaces lack ASLR, stack canaries, and other standard protections. A malicious GPU kernel can corrupt another tenant’s model weights in shared GPU cloud deployments. Rust’s bounds checking prevents buffer overflows entirely in safe code — exactly the class of attack this paper demonstrates.

Source: USENIX Security 2024

Summary

| CVE | Component | Bug Class | Exploited? |
| --- | --- | --- | --- |
| CVE-2023-4211 | ARM Mali GPU driver | Use-after-free | Yes (spyware) |
| CVE-2024-4610 | ARM Bifrost/Valhall GPU | Use-after-free | Yes |
| CVE-2024-0090 | NVIDIA GPU driver | Out-of-bounds write | Patched |
| CVE-2023-51042 | AMDGPU Linux driver | Use-after-free (race) | Patched |
| CVE-2024-53873 | NVIDIA CUDA Toolkit | Heap buffer overflow | Patched |
| CVE-2025-21479 | Qualcomm Adreno GPU | Memory corruption / UAF | Yes (targeted) |
| #37153 | PyTorch CUDA kernels | Out-of-bounds read | N/A |
| CVE-2023-25668+ | TensorFlow GPU kernels | Heap buffer overflow | N/A |
| USENIX ’24 | CUDA memory model | Buffer overflow (cross-tenant) | Demonstrated |

Every major GPU/NPU vendor (NVIDIA, AMD, ARM, Qualcomm) has shipped memory safety vulnerabilities in its accelerator drivers and toolchains. At least four were actively exploited in the wild. The bug classes involved (use-after-free, out-of-bounds writes, buffer overflows, race conditions) are precisely the categories that safe Rust targets: the ownership model and borrow checker reject use-after-free and data races at compile time, and bounds checking turns out-of-range accesses into deterministic panics instead of silent memory corruption. This is the practical motivation for ascend-rs: not just cleaner code, but eliminating vulnerabilities that have real-world security consequences.