Appendix B: CVE Code Analysis — Vulnerable C++ vs Safe Rust Mitigations
This appendix presents the actual (or reconstructed) vulnerable C/C++ code from the CVEs documented in Appendix A, paired with ascend-rs-style Rust code that structurally prevents each vulnerability class.
B.1 Use-After-Free via Reference Count Drop (CVE-2023-51042, AMDGPU)
The Linux AMDGPU driver dereferences a fence pointer after dropping its reference count.
Vulnerable C code (from drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c, before fix 2e54154):
// Inside amdgpu_cs_wait_all_fences()
r = dma_fence_wait_timeout(fence, true, timeout);
dma_fence_put(fence); // Reference dropped — fence may be freed

if (r < 0)
    return r;
if (r == 0)
    break;

if (fence->error)     // USE-AFTER-FREE: fence already freed
    return fence->error;
ascend-rs mitigation — Rust’s ownership ensures the value is consumed, not dangled:
// ascend_rs host API pattern: Arc<Fence> enforces lifetime
fn wait_all_fences(fences: &[Arc<Fence>], timeout: Duration) -> Result<()> {
    for fence in fences {
        fence.wait_timeout(timeout)?;
        // fence.error is checked WHILE we still hold the Arc reference
        if let Some(err) = fence.error() {
            return Err(err);
        }
        // The Arc reference stays alive to the end of the loop iteration —
        // the compiler rejects any code that uses fence after it is dropped
    }
    Ok(())
}
Why Rust prevents this: Arc<Fence> is reference-counted. The compiler ensures you cannot access fence.error() after the Arc is dropped — the borrow checker rejects any reference to a moved/dropped value at compile time. There is no way to write the C pattern (use after put) in safe Rust.
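The reference-counting mechanics can be sketched in plain Rust. This is a minimal standalone example, not the actual ascend_rs API; `Fence` and `check_fence` here are hypothetical stand-ins:

```rust
use std::sync::Arc;

// Hypothetical stand-in for a driver fence object.
struct Fence {
    error: Option<i32>,
}

// The error is read while the caller still holds a strong Arc reference,
// so the allocation cannot be freed underneath this function.
fn check_fence(fence: &Arc<Fence>) -> Result<(), i32> {
    match fence.error {
        Some(e) => Err(e),
        None => Ok(()),
    }
}

fn main() {
    let fence = Arc::new(Fence { error: Some(-5) });
    let other = Arc::clone(&fence); // a second strong reference
    drop(other); // analogous to dma_fence_put: one reference released
    // Our own reference still pins the allocation:
    assert_eq!(Arc::strong_count(&fence), 1);
    assert_eq!(check_fence(&fence), Err(-5));
}
```

Dropping `other` mirrors the `dma_fence_put()` call in the C code; the difference is that the remaining `Arc` makes the later read provably safe.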
B.2 Out-of-Bounds Write via Unchecked User Index (CVE-2024-0090, NVIDIA)
The NVIDIA GPU driver accepts a user-supplied index via ioctl without bounds checking.
Vulnerable C code (reconstructed from CVE description):
// NVIDIA GPU driver ioctl handler
struct gpu_resource_table {
    uint32_t entries[MAX_GPU_RESOURCES];
    uint32_t count;
};

static int nvidia_ioctl_set_resource(struct gpu_resource_table *table,
                                     struct user_resource_request *req)
{
    // BUG: No bounds check on user-supplied index
    table->entries[req->index] = req->value; // OUT-OF-BOUNDS WRITE
    return 0;
}
ascend-rs mitigation — Rust slices enforce bounds at the type level:
// ascend_rs host API: DeviceBuffer<T> wraps a bounded slice
struct GpuResourceTable {
    entries: Vec<u32>, // Vec tracks its own length
}

impl GpuResourceTable {
    fn set_resource(&mut self, index: usize, value: u32) -> Result<()> {
        // Option 1: indexing panics on out-of-bounds (debug + release):
        //     self.entries[index] = value;
        // Option 2: .get_mut() returns None for out-of-bounds (graceful):
        *self.entries.get_mut(index)
            .ok_or(Error::IndexOutOfBounds)? = value;
        Ok(())
    }
}
Why Rust prevents this: Vec<u32> tracks its length. Indexing with [] performs a bounds check and panics (safe termination, not memory corruption). Using .get_mut() returns None for out-of-bounds access. There is no way to silently write past the buffer in safe Rust.
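The bounds-checked write can be exercised as a standalone sketch, stripped of the ascend_rs wrapper types (the error string is just a placeholder):

```rust
// Bounds-checked write: an attacker-controlled index past the end of the
// table is rejected with an error instead of corrupting memory.
fn set_resource(entries: &mut [u32], index: usize, value: u32) -> Result<(), &'static str> {
    *entries.get_mut(index).ok_or("index out of bounds")? = value;
    Ok(())
}

fn main() {
    let mut table = vec![0u32; 4];
    assert!(set_resource(&mut table, 3, 7).is_ok());
    assert_eq!(table[3], 7);
    // A hostile index like the one in the CVE scenario is caught, not written:
    assert!(set_resource(&mut table, 0xdead_beef, 7).is_err());
    assert_eq!(table.len(), 4); // table untouched
}
```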
B.3 Integer Overflow Leading to Heap Buffer Overflow (CVE-2024-53873, NVIDIA CUDA Toolkit)
The CUDA cuobjdump tool reads a 2-byte signed value from a crafted .cubin file, sign-extends it, and uses the corrupted size in memcpy.
Vulnerable C code (from Talos disassembly analysis):
// Parsing .nv_debug_source section in cubin ELF files
int16_t name_len_raw = *(int16_t*)(section_data); // e.g., 0xFFFF = -1
int32_t name_len = (int32_t)name_len_raw; // sign-extends to -1
int32_t alloc_size = name_len + 1; // -1 + 1 = 0
memcpy(dest_buf, src, (size_t)alloc_size); // HEAP BUFFER OVERFLOW
ascend-rs mitigation — Rust’s checked arithmetic catches overflow:
// ascend_rs: parsing NPU binary metadata with safe arithmetic
fn parse_debug_section(section: &[u8], dest: &mut [u8]) -> Result<()> {
    let name_len_raw = i16::from_le_bytes(
        section.get(0..2).ok_or(Error::TruncatedInput)?.try_into()?,
    );
    // checked_add returns None on overflow instead of wrapping;
    // usize::try_from rejects a sign-extended negative length
    let alloc_size: usize = i32::from(name_len_raw)
        .checked_add(1)
        .and_then(|n| usize::try_from(n).ok())
        .ok_or(Error::IntegerOverflow)?;
    // Slice bounds checking prevents buffer overflow
    let offset = 2; // payload starts after the 2-byte length field
    let src = section.get(offset..offset + alloc_size)
        .ok_or(Error::BufferOverflow)?;
    dest.get_mut(..alloc_size)
        .ok_or(Error::BufferOverflow)?
        .copy_from_slice(src);
    Ok(())
}
Why Rust prevents this: checked_add() returns None on overflow. usize::try_from() rejects negative values. Slice indexing with .get() returns None for out-of-bounds ranges. The entire chain is safe — no silent wrapping, no unchecked memcpy.
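The length computation can be tested in isolation. One refinement over the sketch above: converting to `usize` before adding 1 rejects the sign-extended -1 outright, rather than letting it collapse to a zero-byte copy:

```rust
// Safe reconstruction of the length computation: the 2-byte little-endian
// field sign-extends exactly as in the C code, but negative lengths are
// rejected by usize::try_from and the +1 cannot wrap.
fn checked_alloc_size(raw: [u8; 2]) -> Option<usize> {
    let name_len = i32::from(i16::from_le_bytes(raw)); // 0xFFFF -> -1
    usize::try_from(name_len).ok()?.checked_add(1)
}

fn main() {
    assert_eq!(checked_alloc_size([0x03, 0x00]), Some(4)); // len 3 -> alloc 4
    assert_eq!(checked_alloc_size([0xFF, 0xFF]), None);    // the crafted -1 case
}
```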
B.4 Out-of-Bounds Read on Empty Container (PyTorch Issue #37153)
PyTorch’s CUDA reduce kernel indexes into iter.shape(), which returns an empty array for scalar tensors.
Vulnerable C++ code (from aten/src/ATen/native/cuda/Reduce.cuh):
// iter.shape() returns empty IntArrayRef for scalar input
// iter.ndim() returns 0
int64_t dim0;
if (reduction_on_fastest_striding_dimension) {
    dim0 = iter.shape()[0]; // OUT-OF-BOUNDS: shape() is empty
    // dim0 = garbage value (e.g., 94599111233572)
}
ascend-rs mitigation — Rust’s Option type makes emptiness explicit:
// ascend_rs kernel: safe tensor shape access
fn configure_reduce_kernel(shape: &[usize]) -> Result<KernelConfig> {
    // .first() returns Option<&T> — None for empty slices:
    //     let dim0 = shape.first().copied()
    //         .ok_or(Error::ScalarTensorNotSupported)?;
    // Or use exhaustive pattern matching over the slice:
    let (dim0, dim1) = match shape {
        [d0, d1, ..] => (*d0, *d1),
        [d0] => (*d0, 1),
        [] => return Err(Error::EmptyShape),
    };
    Ok(KernelConfig { dim0, dim1 })
}
Why Rust prevents this: shape.first() returns Option<&usize>, forcing the caller to handle the empty case. The match on slice patterns is exhaustive — the compiler requires the [] (empty) arm. shape[0] on an empty slice panics with a clear message instead of reading garbage.
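A standalone sketch of the slice-pattern approach, with a placeholder error string in place of the ascend_rs error type:

```rust
// Exhaustive match over a shape slice: the compiler forces the empty
// (scalar) case to be handled, so no garbage dimension can leak out.
fn dims(shape: &[usize]) -> Result<(usize, usize), &'static str> {
    match shape {
        [d0, d1, ..] => Ok((*d0, *d1)),
        [d0] => Ok((*d0, 1)),
        [] => Err("scalar tensor: empty shape"),
    }
}

fn main() {
    let empty: &[usize] = &[];
    assert!(empty.first().is_none()); // Option makes emptiness explicit
    assert_eq!(dims(&[4, 8, 2]), Ok((4, 8))); // leading two dims
    assert_eq!(dims(&[5]), Ok((5, 1)));       // 1-D: second dim defaults to 1
    assert!(dims(&[]).is_err());              // the PyTorch scalar-tensor case
}
```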
B.5 Integer Truncation Bypassing Bounds Checks (CVE-2019-16778, TensorFlow)
TensorFlow’s UnsortedSegmentSum kernel implicitly truncates int64 tensor sizes to int32.
Vulnerable C++ code (from tensorflow/core/kernels/segment_reduction_ops.h):
template <typename T, typename Index> // Index = int32
struct UnsortedSegmentFunctor {
    void operator()(OpKernelContext* ctx,
                    const Index num_segments, // TRUNCATED: int64 → int32
                    const Index data_size,    // TRUNCATED: int64 → int32
                    const T* data, /* ... */)
    {
        if (data_size == 0) return; // Bypassed: truncated value ≠ 0
        // data_size = 1 (truncated from 4294967297)
        // Actual tensor has 4 billion elements — massive OOB access
    }
};
ascend-rs mitigation — Rust’s type system rejects implicit narrowing:
// ascend_rs: explicit conversions prevent silent truncation
fn unsorted_segment_sum(
    data: &DeviceBuffer<f32>,
    segment_ids: &DeviceBuffer<i32>,
    num_segments: usize, // Always full-width
) -> Result<DeviceBuffer<f32>> {
    let data_size: usize = data.len(); // usize, never truncated
    // If an i32 index is needed for the kernel, the conversion is explicit:
    let data_size_i32: i32 = i32::try_from(data_size)
        .map_err(|_| Error::TensorTooLarge {
            size: data_size,
            max: i32::MAX as usize,
        })?;
    // Rust rejects: let x: i32 = some_i64;        // ERROR: mismatched types
    // Rust flags:   let x: i32 = some_i64 as i32; // clippy::cast_possible_truncation
    // ... kernel launch using data_size_i32 elided ...
    Ok(output)
}
Why Rust prevents this: Rust has no implicit integer narrowing. let x: i32 = some_i64; is a compile error. The as cast exists but clippy::cast_possible_truncation warns on it. TryFrom/try_into() returns Err when the value doesn’t fit, making truncation impossible without explicit acknowledgment.
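The contrast between `as` and `TryFrom` can be checked directly on the CVE's value (2³² + 1); `narrow_index` is just an illustrative helper:

```rust
// Explicit narrowing: TryFrom fails loudly where `as` silently truncates.
fn narrow_index(data_size: i64) -> Result<i32, &'static str> {
    i32::try_from(data_size).map_err(|_| "tensor too large for i32 index")
}

fn main() {
    // What the C++ template did implicitly: 4294967297 truncates to 1,
    // which is exactly how the data_size checks were bypassed.
    assert_eq!(4_294_967_297i64 as i32, 1);
    // The checked conversion refuses instead:
    assert!(narrow_index(4_294_967_297).is_err());
    assert_eq!(narrow_index(1024), Ok(1024));
}
```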
B.6 Use-After-Free via Raw Pointer After Lock Release (CVE-2023-4211, ARM Mali)
The ARM Mali GPU driver copies a raw pointer from shared state, releases the lock, sleeps, then dereferences the now-dangling pointer.
Vulnerable C code (from mali_kbase_mem_linux.c, confirmed by Project Zero):
static void kbasep_os_process_page_usage_drain(struct kbase_context *kctx)
{
    struct mm_struct *mm;

    spin_lock(&kctx->mm_update_lock);
    mm = rcu_dereference_protected(kctx->process_mm, /*...*/);
    rcu_assign_pointer(kctx->process_mm, NULL);
    spin_unlock(&kctx->mm_update_lock);       // Lock released

    synchronize_rcu();                        // SLEEPS — mm may be freed by another thread
    add_mm_counter(mm, MM_FILEPAGES, -pages); // USE-AFTER-FREE
}
ascend-rs mitigation — Rust’s Arc + Mutex prevents dangling references:
// ascend_rs host API: device context with safe shared state
struct DeviceContext {
    process_mm: Mutex<Option<Arc<MmStruct>>>,
}

impl DeviceContext {
    fn drain_page_usage(&self, pages: i64) {
        // Take ownership of the Arc from the Mutex
        let mm = {
            let mut guard = self.process_mm.lock().unwrap();
            guard.take() // Sets the slot to None, returns Option<Arc<MmStruct>>
        };
        // Lock is released here (guard dropped)
        // If mm exists, we hold a strong reference — it CANNOT be freed
        if let Some(mm) = mm {
            synchronize_rcu();
            // mm is still alive — the Arc guarantees it
            mm.add_counter(MmCounter::FilePages, -pages);
        }
        // mm dropped here — Arc ref count decremented;
        // the MmStruct is freed only when the LAST Arc reference drops
    }
}
Why Rust prevents this: Arc<MmStruct> is a reference-counted smart pointer. Taking it from the Option gives us ownership of a strong reference. Even after the lock is released and other threads run, our Arc keeps the MmStruct alive. There is no way to obtain a dangling raw pointer from an Arc in safe Rust — the underlying memory is freed only when the last Arc is dropped.
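The take-from-Mutex pattern reduces to a small self-contained example. `MmState` is a hypothetical stand-in for the shared per-process state, and the RCU synchronization is omitted:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the shared per-process state.
struct MmState {
    file_pages: Mutex<i64>,
}

// Take the Arc out of the shared slot. The lock guard is a temporary that
// drops at the end of the first statement, but the returned strong
// reference keeps the allocation alive whatever other threads do next.
fn drain(slot: &Mutex<Option<Arc<MmState>>>, pages: i64) -> Option<i64> {
    let taken = slot.lock().unwrap().take(); // slot is now None
    taken.map(|mm| {
        // Other threads may run here; our Arc still pins `mm`.
        let mut count = mm.file_pages.lock().unwrap();
        *count -= pages;
        *count
    })
}

fn main() {
    let slot = Mutex::new(Some(Arc::new(MmState {
        file_pages: Mutex::new(10),
    })));
    assert_eq!(drain(&slot, 3), Some(7)); // counter decremented safely
    assert_eq!(drain(&slot, 3), None);    // slot already drained
}
```

Unlike the C version, there is no window in which the state is reachable only through a raw pointer: it is either inside the Mutex slot or owned via the taken `Arc`.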