Appendix G: CANN 8.5 Kernel Coverage — 998 Kernels
This appendix documents the coverage of CANN 8.5 built-in kernels by the ascendc-to-rs transpiler.
- 998 CANN kernel names: the real operator batch that feeds ascendc-to-rs; each kernel below is a distinct ops_<category>__<name>.rs produced by the transpiler.
- Two fidelity tiers:
  - Transpiled (real compute body): 247/998 (25%). The Rust body contains at least one compute intrinsic beyond the alloc/load/pipe_barrier/store boilerplate (e.g. ascend_add_f32, ub_reduce_max, tile_matmul_f16).
  - Registered (identity stub): 751/998 (75%). The body is load → barrier → store only: the transpiler parsed the C++ signature and produced a kernel that passes the compile gate, but did not yet lower the compute intrinsics. Shape, dtype, and kernel ABI are real; the body is a placeholder.
- This is compile-gate coverage: every kernel produces a valid kernel.acl.o through Rust → MLIR → AscendC → bisheng on Ascend 910B2. Numerical correctness against the reference CANN implementation is a separate (ongoing) gate.
- Reproducible: the interactive browser below is regenerated from the in-repo transpiled corpus at benchmarks/cann_kernels/ops_*__*.rs by blog/mdbook/scripts/appg_build_cbdata.py. Re-run that script after any re-transpile to refresh both the per-category table and the embedded CB_DATA in one step.
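The two-tier rule above can be illustrated with a small classifier. This is only a sketch of the idea, not the logic in appg_build_cbdata.py: it assumes the boilerplate calls are literally named alloc, load, pipe_barrier and store (the appendix names the roles, not the exact Rust identifiers), and it treats any other called name in the kernel body as a compute intrinsic. The function name `classify` is hypothetical.

```python
import re

# Assumed boilerplate call names; the real corpus may spell these differently.
BOILERPLATE = {"alloc", "load", "pipe_barrier", "store"}

# Rust keywords that may be followed by "(" without being a call.
NON_CALLS = {"fn", "if", "while", "for", "match", "return", "loop"}

# An identifier immediately followed by "(" is treated as a call.
CALL_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def classify(body: str) -> str:
    """Classify the text of a kernel body (between the braces) by tier.

    Returns "transpiled" if any call other than the assumed boilerplate
    appears, otherwise "registered" (identity stub).
    """
    calls = set(CALL_RE.findall(body)) - NON_CALLS
    compute = calls - BOILERPLATE
    return "transpiled" if compute else "registered"


stub = "let buf = alloc(n); load(buf, src); pipe_barrier(); store(dst, buf);"
real = "let buf = alloc(n); load(buf, src); ascend_add_f32(buf, buf, rhs); store(dst, buf);"
print(classify(stub), classify(real))
```

A purely textual check like this would also count method calls and tuple-struct constructors as compute, so a real implementation would more likely match against an explicit list of known intrinsics.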
Milestone (2026-04-20): all 998/998 kernels in the real ascendc-to-rs batch produce a valid kernel.acl.o (compile-gate pass). 247/998 of these carry non-identity bodies; the remaining 751/998 are identity stubs awaiting intrinsic-lowering work in rustc_codegen_mlir. Tag: ascendc-to-rs-998-working.
Note on category scheme: this appendix uses the real batch categories emitted by the ascendc-to-rs pipeline (ops_cv, ops_legacy, ops_math, ops_nn, ops_oam, ops_transformer). Earlier drafts showed a synthetic 8-category catalog (ops_index / ops_optimizer / ops_reduce / ops_resize) with no kernels in common with the tested set; it was replaced on 2026-04-20.
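As an illustration of how the category scheme maps onto the ops_<category>__<name>.rs file naming, the sketch below splits a corpus path into its category and kernel name and rejects anything outside the six real categories. The helper names and the example file name are hypothetical; only the naming pattern and the category list come from this appendix.

```python
from collections import Counter
from pathlib import PurePosixPath

# The six real batch categories listed in this appendix.
CATEGORIES = (
    "ops_cv", "ops_legacy", "ops_math",
    "ops_nn", "ops_oam", "ops_transformer",
)


def split_kernel_filename(path: str) -> tuple[str, str]:
    """Split '.../ops_<category>__<name>.rs' into (category, name)."""
    p = PurePosixPath(path)
    if p.suffix != ".rs":
        raise ValueError(f"not a Rust kernel file: {path}")
    # Split on the first double underscore; category names contain none.
    category, sep, name = p.stem.partition("__")
    if not sep or not name or category not in CATEGORIES:
        raise ValueError(f"unrecognised kernel file name: {path}")
    return category, name


def count_by_category(paths) -> Counter:
    """Tally kernels per category from an iterable of corpus paths."""
    return Counter(split_kernel_filename(p)[0] for p in paths)


# Hypothetical file name, used only to show the shape of the result.
print(split_kernel_filename("benchmarks/cann_kernels/ops_math__sinh_f32.rs"))
```

Under this sketch, a file from the retired synthetic catalog (for instance one starting with ops_index) raises ValueError instead of being counted silently.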
G.1 Kernel Inventory by Category
| Category | Total | Transpiled | Registered | Description |
|---|---|---|---|---|
| ops_cv | 41 | 5 | 36 | Computer-vision primitives (resize, colour convert, background replace, custom blends) |
| ops_legacy | 343 | 106 | 237 | Element-wise unary/binary ops across the CANN legacy library (exp, abs, add, mul, logical, per-dtype variants) |
| ops_math | 155 | 52 | 103 | Math / special functions (trig, hyperbolic, erf, gamma, power, per-dtype variants) |
| ops_nn | 306 | 81 | 225 | Neural-network ops (activations, norms, pooling, loss, optimizers, indexing, reductions, resize) |
| ops_oam | 3 | 0 | 3 | Operator-Adapter (OAM) bridge kernels |
| ops_transformer | 150 | 3 | 147 | Attention, matmul, flash-attention, MoE, MLA, quantized-linear variants |
| Total | 998 | 247 | 751 | |
“Transpiled” = body contains compute intrinsics beyond alloc/load/barrier/store. “Registered” = body is an identity stub (load → barrier → store) that passes the compile gate but does not yet express the original C++ compute. Among the large categories, ops_transformer is the furthest from full fidelity (3/150 transpiled; only the three-kernel ops_oam category is lower, at 0/3) because its kernels have complex inner loops (attention softmax, flash-attention tiling, matmul) that the transpiler does not yet lower; the legacy / math / nn categories fare better because their element-wise bodies already lower through today’s intrinsics. Closing the remaining gap is a rustc_codegen_mlir intrinsic-lowering task, not a transpiler-frontend one.
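The figures in the table can be cross-checked mechanically. The sketch below hard-codes the six rows from G.1 (it does not read the corpus), confirms that each row and the totals line add up, and ranks the categories by transpiled share. The function names are illustrative only.

```python
# (total, transpiled, registered) per category, copied from the G.1 table.
INVENTORY = {
    "ops_cv": (41, 5, 36),
    "ops_legacy": (343, 106, 237),
    "ops_math": (155, 52, 103),
    "ops_nn": (306, 81, 225),
    "ops_oam": (3, 0, 3),
    "ops_transformer": (150, 3, 147),
}


def inconsistent_rows(inv):
    """Return categories whose total is not transpiled + registered."""
    return [c for c, (total, t, r) in inv.items() if total != t + r]


def column_totals(inv):
    """Sum the total / transpiled / registered columns."""
    rows = list(inv.values())
    return tuple(sum(row[i] for row in rows) for i in range(3))


def by_transpiled_share(inv):
    """Categories sorted by transpiled fraction, lowest first."""
    shares = ((c, t / total) for c, (total, t, _r) in inv.items())
    return sorted(shares, key=lambda kv: kv[1])


total, transpiled, registered = column_totals(INVENTORY)
print(total, transpiled, registered)
for category, share in by_transpiled_share(INVENTORY):
    print(f"{category}: {share:.1%}")
```

Running a check like this after each re-transpile is one way to catch a table that has drifted out of sync with the headline 247/998 and 751/998 figures.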
G.2 Interactive Kernel Browser
Select a category and kernel to view the AscendC C++ source and transpiled Rust code. Click buttons to open in Playground.
998 kernels cataloged. Green = transpiled, grey = registered (source pending).