
Conclusion

The ascend-rs project demonstrates that memory safety in NPU programming is achievable without sacrificing performance. Through Rust’s ownership system, lifetimes, and RAII patterns, we eliminate an entire class of memory safety errors at compile time — errors that traditional C++ NPU programming can only guard against through programmer experience and discipline.
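As a reminder of how ownership, lifetimes, and RAII combine, here is a minimal host-only sketch. `MockDevice`, `DeviceBuffer`, and `round_trip` are hypothetical names invented for this illustration and are not the ascend-rs API. A `Vec<f32>` stands in for device memory, and a counter stands in for the allocator.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an NPU device; it only counts live allocations.
#[derive(Default)]
struct MockDevice {
    live_allocations: Cell<usize>,
}

/// Hypothetical RAII handle. The `'dev` lifetime ties the buffer to the
/// device it was allocated on, so the buffer cannot outlive that device.
struct DeviceBuffer<'dev> {
    device: &'dev MockDevice,
    data: Vec<f32>, // host memory standing in for device memory
}

impl<'dev> DeviceBuffer<'dev> {
    /// "Allocate" a buffer and copy the host data into it.
    fn from_slice(device: &'dev MockDevice, src: &[f32]) -> Self {
        device.live_allocations.set(device.live_allocations.get() + 1);
        DeviceBuffer { device, data: src.to_vec() }
    }

    /// Copy the contents back to the host.
    fn to_host(&self) -> Vec<f32> {
        self.data.clone()
    }
}

impl Drop for DeviceBuffer<'_> {
    /// Runs exactly once when the buffer goes out of scope: no manual free,
    /// no double free, no leak on early return.
    fn drop(&mut self) {
        self.device.live_allocations.set(self.device.live_allocations.get() - 1);
    }
}

/// Upload, then download; the buffer is released when this function returns.
fn round_trip(device: &MockDevice, input: &[f32]) -> Vec<f32> {
    let buf = DeviceBuffer::from_slice(device, input);
    buf.to_host()
}

fn main() {
    let device = MockDevice::default();
    let out = round_trip(&device, &[1.0, 2.0, 3.0]);
    assert_eq!(out, vec![1.0, 2.0, 3.0]);
    // Every buffer created inside `round_trip` has already been dropped.
    assert_eq!(device.live_allocations.get(), 0);
}
```

In this sketch, using a buffer after its device has been dropped, or forgetting to release it, is not something the programmer must remember to avoid: the first is rejected by the borrow checker and the second cannot happen because `Drop` runs automatically.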

From Hello World to the vectorized softmax kernel, we’ve seen a complete pipeline from source to NPU execution: Rust source → MLIR intermediate representation → C++ with AscendC vector intrinsics → NPU binary → device execution → safe result retrieval. All 413 tests pass on Ascend 910B3 hardware (0 failures, 0 crashes) across all kernel categories, and benchmark results confirm that Rust vectorized kernels match the performance of hand-optimized C++, with zero overhead.

With the introduction of the ascend_compile crate, ascend-rs now extends its impact beyond Rust kernel authors. By providing a standalone, validated compilation library with C ABI and Python bindings, the project enables the broader Ascend ecosystem — TileLang, Triton, PyTorch, and future compiler frameworks — to share a common, well-tested compilation backend. The same validation passes that catch missing sync barriers and buffer overflows in Rust-generated kernels now protect kernels from any source.

The direction is clear: bring safety guarantees to every Ascend NPU user, whether they’re writing Rust kernels, Python DSLs, or integrating compiler toolchains — and make the entire ecosystem more reliable in the process.


About the Project

If you’re interested in memory-safe NPU or GPU programming, or in collaborating on the project, please contact the author.


Author: Yijun Yu