English | 中文版
2. Hello World: Your First NPU Program
Let’s start with the simplest possible example. This Hello World demonstrates the basics of the ascend-rs host API — safely initializing the NPU, creating execution contexts, and launching kernels from Rust.
Kernel Code (C++)
At this stage, Hello World uses a C++ kernel, which is the native approach for the CANN SDK:
// hello_world.cpp
#include "kernel_operator.h"
extern "C" __global__ __aicore__ void hello_world() {
AscendC::printf("Hello World!!!\n");
}
extern "C" void hello_world_do(uint32_t blockDim, void *stream) {
hello_world<<<blockDim, nullptr, stream>>>();
}
Here, __global__ marks the function as a host-callable entry point, and __aicore__ indicates it runs on the Ascend AI Core. The <<<...>>> syntax, similar to CUDA, specifies parallelism and execution stream.
Host Code (Rust)
The host code demonstrates ascend-rs’s most important design principle — RAII resource management and lifetime safety:
use ascend_rs::prelude::*;
use std::error::Error;
// Declare FFI interface to the C++ kernel
unsafe extern "C" {
fn hello_world_do(dim: u32, stream: *mut std::ffi::c_void);
}
fn main() -> Result<(), Box<dyn Error>> {
// Step 1: Initialize ACL runtime
let acl = Acl::new()?;
// Step 2: Select and initialize device
let device = Device::new(&acl)?;
// Step 3: Create execution context and stream
let context = AclContext::new(&device)?;
let stream = AclStream::new(&context)?;
// Step 4: Launch kernel (8 parallel blocks)
unsafe {
hello_world_do(8, stream.to_raw());
}
// Step 5: Synchronize and wait for kernel completion
stream.synchronize()?;
// Step 6: All resources automatically freed (RAII)
// Drop order: stream → context → device → acl
Ok(())
}
Key Design: Lifetime Chain
Notice the type signatures in this code:
Acl → Lifetime root
Device<'acl> → Must drop before Acl
AclContext<'d> → Must drop before Device
AclStream<'c> → Must drop before Context
If you try to use these resources in the wrong order, the code simply won’t compile. This is the power of Rust’s type system — guaranteeing correct resource management at compile time, whereas C++ can only rely on programmer discipline.
Comparison: Pitfalls in C++
The equivalent C++ code requires manual lifecycle management for every resource:
// C++ version: every resource requires manual cleanup
aclInit(nullptr);
aclrtSetDevice(0);
aclrtContext ctx;
aclrtCreateContext(&ctx, 0);
aclrtStream stream;
aclrtCreateStream(&stream);
hello_world_do(8, stream);
aclrtSynchronizeStream(stream);
// Must manually free in correct order, otherwise undefined behavior
aclrtDestroyStream(stream);
aclrtDestroyContext(ctx);
aclrtResetDevice(0);
aclFinalize();
If any step throws an exception or returns early, the subsequent cleanup code is skipped. In the Rust version, the Drop trait guarantees resources are always freed correctly, regardless of control flow changes.