Appendix E: Complete Kernel Inventory
This appendix is generated automatically by scripts/generate_kernel_appendix.sh. To regenerate it, run: bash scripts/generate_kernel_appendix.sh --lang zh
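Overview
| Metric | Count |
|---|---|
| Compile-tested kernels | 486 |
| Deployable kernels | 19 |
| Total kernels | 505 |
| MultiKernelBench coverage | 300/300 (100%) |
| MKB category coverage | 15/15 (100%) |
| Memory-safety vulnerability patterns | 6 classes (with attack examples) |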
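Vulnerability Pattern Legend
| ID | Vulnerability type | C++ root cause | Rust prevention mechanism | Attack example |
|---|---|---|---|---|
| V1 | Type erasure | GM_ADDR erases all type information | Function signature encodes the element type | case1 |
| V2 | Buffer overflow | GetValue(i) performs no bounds check | Buffer ID API + explicit count | case2 |
| V3 | Integer overflow | u32 offset arithmetic wraps silently | wrapping_mul makes overflow explicit | case6 |
| V4 | Use-after-free | Stale LocalTensor accessed after FreeTensor() | No manual free in the API | case3 |
| V5 | Double free | FreeTensor() called more than once | No free operation | case5 |
| V6 | Missing synchronization | pipe_barrier() omitted | kernel_ops composite operators include barriers | case4 |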
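Row V3 can be illustrated with a minimal sketch that uses only the Rust standard library. The function names and the block-index times block-length layout are assumptions chosen for illustration; they are not ascend-rs APIs. The point is only that the wraparound is spelled out at the call site instead of happening silently.

```rust
/// Hypothetical helper: start offset of a block, with wraparound made explicit.
/// `wrapping_mul` states at the call site that the product may wrap modulo 2^32.
pub fn tile_offset_wrapping(block_idx: u32, block_len: u32) -> u32 {
    block_idx.wrapping_mul(block_len)
}

/// Hypothetical stricter variant: `checked_mul` returns None when the product
/// does not fit in u32, so the caller has to handle the overflow case.
pub fn tile_offset_checked(block_idx: u32, block_len: u32) -> Option<u32> {
    block_idx.checked_mul(block_len)
}
```

For example, tile_offset_wrapping(65_536, 65_536) returns 0 because 2^32 wraps to zero in u32, while tile_offset_checked returns None for the same inputs.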
Kernel Inventory by Category
Activation (17 kernels)
Applicable vulnerability patterns: V1 (type erasure), V2 (unchecked index), V6 (missing sync)
MKB reference: reference_kernels/activation/
Architecture (77 kernels)
Applicable vulnerability patterns: V1, V2, V3 (offset overflow), V6
MKB reference: reference_kernels/architecture/
Attention (23 kernels)
Applicable vulnerability patterns: V1, V2, V3, V6 (multi-stage sync)
MKB reference: reference_kernels/attention/
Broadcast (12 kernels)
Applicable vulnerability patterns: V1 (type erasure), V2 (bounds), V5 (double free)
MKB reference: reference_kernels/broadcast/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| add_bias | tests/compiletest/ui/broadcast_ops_kernel.rs | add_bias.py | PASS |
| elementwise_mul | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_mul.py | PASS |
| elementwise_div | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_div.py | PASS |
| elementwise_sub | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_sub.py | PASS |
| elementwise_max | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_max.py | PASS |
| clamp | tests/compiletest/ui/broadcast_ops_kernel.rs | — | PASS |
| elementwise_min | tests/compiletest/ui/broadcast_ops_kernel.rs | elementwise_min.py | PASS |
| elementwise_square | tests/compiletest/ui/broadcast_ops_kernel.rs | — | PASS |
| where_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | — | PASS |
| logic_and_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | logic_and_broadcast.py | PASS |
| power_broadcast | tests/compiletest/ui/broadcast_ext_kernel.rs | power_broadcast.py | PASS |
| scalar_mul | tests/compiletest/ui/scalar_mul_kernel.rs | scalar_mul.py | PASS |
Convolution (34 kernels)
Applicable vulnerability patterns: V2 (nested loop OOB), V3 (stride*index overflow)
MKB reference: reference_kernels/convolution/
Fuse (120 kernels)
Applicable vulnerability patterns: V1, V2, V4 (use-after-free in chain), V6 (inter-op sync)
MKB reference: reference_kernels/fuse/
Index (12 kernels)
Applicable vulnerability patterns: V2 (gather/scatter OOB), V3 (index calc overflow)
MKB reference: reference_kernels/index/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| argmax | tests/compiletest/ui/index_ops_kernel.rs | argmax.py | PASS |
| argmin | tests/compiletest/ui/index_ops_kernel.rs | argmin.py | PASS |
| gather | tests/compiletest/ui/index_ops_kernel.rs | gather.py | PASS |
| scatter | tests/compiletest/ui/index_ops_kernel.rs | scatter.py | PASS |
| scatter_add | tests/compiletest/ui/index_ops_kernel.rs | scatter_add.py | PASS |
| index_select | tests/compiletest/ui/index_ops_kernel.rs | index_select.py | PASS |
| index_copy | tests/compiletest/ui/index_ops_kernel.rs | index_copy.py | PASS |
| index_add | tests/compiletest/ui/index_ops_kernel.rs | index_add.py | PASS |
| embedding | tests/compiletest/ui/index_ops_kernel.rs | embedding.py | PASS |
| masked_fill | tests/compiletest/ui/index_ops_kernel.rs | masked_fill.py | PASS |
| inplace_update | tests/compiletest/ui/index_ops_kernel.rs | inplace_update.py | PASS |
| take_along_dim | tests/compiletest/ui/index_ops_kernel.rs | take_along_dim.py | PASS |
Loss (6 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduction sync)
MKB reference: reference_kernels/loss/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| mse_loss | tests/compiletest/ui/loss_ops_kernel.rs | mse_loss.py | PASS |
| huber_loss | tests/compiletest/ui/loss_ops_kernel.rs | huber_loss.py | PASS |
| hinge_loss | tests/compiletest/ui/loss_ops_kernel.rs | hinge_loss.py | PASS |
| cosine_similarity | tests/compiletest/ui/loss_ops_kernel.rs | cosine_similarity.py | PASS |
| cross_entropy_loss | tests/compiletest/ui/loss_ops_kernel.rs | cross_entropy_loss.py | PASS |
| kl_div_loss | tests/compiletest/ui/loss_ops_kernel.rs | kl_div_loss.py | PASS |
Math (5 kernels)
Applicable vulnerability patterns: V2 (cumulative bounds), V3 (offset overflow)
MKB reference: reference_kernels/math/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| matrix_scalar_mul | tests/compiletest/ui/math_ops_kernel.rs | matrix_scalar_mul.py | PASS |
| cumprod | tests/compiletest/ui/math_cumulative_kernel.rs | cumprod.py | PASS |
| cumsum | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum.py | PASS |
| cumsum_exclusive | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum_exclusive.py | PASS |
| cumsum_reverse | tests/compiletest/ui/math_cumulative_kernel.rs | cumsum_reverse.py | PASS |
Matmul (23 kernels)
Applicable vulnerability patterns: V1 (type erasure f16/f32), V2 (tile bounds), V3 (dim overflow), V6 (cube sync)
MKB reference: reference_kernels/matmul/
Normalization (10 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduce-normalize sync)
MKB reference: reference_kernels/normalization/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| rms_norm | tests/compiletest/ui/norm_ops_kernel.rs | rms_norm.py | PASS |
| l1_norm | tests/compiletest/ui/norm_ops_kernel.rs | l1_norm.py | PASS |
| l2_norm | tests/compiletest/ui/norm_ops_kernel.rs | l2_norm.py | PASS |
| l2_normalize | tests/compiletest/ui/norm_ops_kernel.rs | l2_normalize.py | PASS |
| layer_norm | tests/compiletest/ui/norm_ops_kernel.rs | layer_norm.py | PASS |
| batch_norm | tests/compiletest/ui/norm_extended_kernel.rs | — | PASS |
| group_norm | tests/compiletest/ui/norm_extended_kernel.rs | group_norm.py | PASS |
| instance_norm | tests/compiletest/ui/norm_extended_kernel.rs | instance_norm.py | PASS |
| frobenius_norm | tests/compiletest/ui/norm_extended_kernel.rs | frobenius_norm.py | PASS |
| layernorm | tests/compiletest/ui/layernorm_kernel.rs | layernorm.py | PASS |
Optimizer (6 kernels)
Applicable vulnerability patterns: V1, V2 (param bounds), V4 (in-place update UAF)
MKB reference: reference_kernels/optimizer/
| Kernel function | Source file | MKB reference | 910B3 status |
|---|---|---|---|
| sgd_update | tests/compiletest/ui/optimizer_ops_kernel.rs | sgd_update.py | PASS |
| sgd_momentum | tests/compiletest/ui/optimizer_ops_kernel.rs | sgd_momentum.py | PASS |
| adagrad_update | tests/compiletest/ui/optimizer_ops_kernel.rs | adagrad_update.py | PASS |
| rmsprop_update | tests/compiletest/ui/optimizer_ops_kernel.rs | rmsprop_update.py | PASS |
| adam_update | tests/compiletest/ui/optimizer_ops_kernel.rs | adam_update.py | PASS |
| lamb_update | tests/compiletest/ui/optimizer_ext_kernel.rs | lamb_update.py | PASS |
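Pooling (12 kernels)
Applicable vulnerability patterns: V2 (window OOB), V3 (stride overflow)
MKB reference: reference_kernels/pooling/
Reduce (5 kernels)
Applicable vulnerability patterns: V1, V2, V6 (reduction pipeline sync)
MKB reference: reference_kernels/reduce/
| Kernel function | Source file | MKB reference | 910B3 status |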
|---|---|---|---|
| reduce_max | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_max.py | PASS |
| reduce_min | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_min.py | PASS |
| reduce_sum | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_sum.py | PASS |
| reduce_mean | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_mean.py | PASS |
| reduce_prod | tests/compiletest/ui/reduce_ops_kernel.rs | reduce_prod.py | PASS |
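Resize (15 kernels)
Applicable vulnerability patterns: V2 (interpolation OOB), V3 (coordinate overflow)
MKB reference: reference_kernels/resize/
Tiled (16 kernels)
Applicable vulnerability patterns: V2 (tile boundary OOB), V6 (tile-boundary sync)
| Kernel function | Source file | 910B3 status |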
|---|---|---|
| relu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| sigmoid_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| gelu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| tanh_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| swish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| exp_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| vec_add_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| vec_mul_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| elu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| mish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| layernorm_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| softmax_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| selu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| leaky_relu_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| hardswish_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
| rmsnorm_tiled | tests/compiletest/ui/tiled_kernel.rs | PASS |
Multiblock (16 kernels)
Applicable vulnerability patterns: V2 (block partition OOB), V6 (cross-block sync)
F16 (14 kernels)
Applicable vulnerability patterns: V1 (f16/f32 type confusion)
| Kernel function | Source file | 910B3 status |
|---|---|---|
| relu_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| sigmoid_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| abs_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| exp_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| ln_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| sqrt_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| rsqrt_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reciprocal_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_add_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_sub_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_mul_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| vec_div_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reduce_max_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
| reduce_sum_f16 | tests/compiletest/ui/f16_activation_kernel.rs | PASS |
Unary_math (8 kernels)
Applicable vulnerability patterns: V1, V2
| Kernel function | Source file | 910B3 status |
|---|---|---|
| exp_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| ln_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| sqrt_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| rsqrt_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| reciprocal_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| negate_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| square_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
| cube_f32 | tests/compiletest/ui/f32_unary_kernel.rs | PASS |
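Deployable Kernels (with Host Code)
Memory Safety Case Studies
Each case pairs a vulnerable C++ kernel with a structurally safe Rust kernel.
| Case | Vulnerability type | C++ file | Rust file |
|---|---|---|---|
| 1 | Type confusion (GM_ADDR type erasure) | vulnerable.cpp | safe.rs |
| 2 | Buffer overflow (unchecked index) | vulnerable.cpp | safe.rs |
| 3 | Use-after-free (access after FreeTensor) | vulnerable.cpp | safe.rs |
| 4 | Missing synchronization (omitted pipe_barrier) | vulnerable.cpp | safe.rs |
| 5 | Double free (repeated FreeTensor) | vulnerable.cpp | safe.rs |
| 6 | Integer overflow (offset arithmetic wraps silently) | vulnerable.cpp | safe.rs |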
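As a rough illustration of case 1, the sketch below contrasts a signature that names its element type with a byte-level interface that has to guess it. Both functions are hypothetical stand-ins written against plain Rust slices; they are not taken from the case study's vulnerable.cpp or safe.rs, and the actual ascend-rs kernel signatures may differ.

```rust
/// Hypothetical typed entry point: the element type (f32) is part of the
/// signature, so passing a buffer of another element type fails to compile.
pub fn relu_f32(input: &[f32], output: &mut [f32]) {
    for (out, &x) in output.iter_mut().zip(input) {
        *out = x.max(0.0);
    }
}

/// Stand-in for a type-erased interface: the callee only sees bytes and
/// guesses the element width. Reading f32 data as 16-bit words yields garbage.
pub fn reinterpret_as_u16(bytes: &[u8]) -> Vec<u16> {
    bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect()
}
```

Reading the four bytes of 1.0f32 as two little-endian 16-bit words yields 0x0000 and 0x3F80, neither of which is the half-precision encoding of 1.0 (0x3C00).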
Performance Comparison (in progress)
| Kernel | ascend-rs time | AscendC C++ time | Ratio | Notes |
|---|---|---|---|---|
| softmax (256) | 0.077 ms | 0.078 ms | 0.99x | Zero overhead |
| softmax (16384) | 0.087 ms | 0.089 ms | 0.98x | Zero overhead |
| relu | — | — | — | Pending |
| matmul | — | — | — | Pending |
| layernorm | — | — | — | Pending |
| conv2d | — | — | — | Pending |
Performance evaluation is still in progress. The table above will be updated as results become available.
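The two measured rows are consistent with the ratio being the ascend-rs time divided by the AscendC C++ time, rounded to two decimals. That reading is an inference from the numbers, not something the generator script is known to do, and the helper names below are hypothetical.

```rust
/// Ratio of ascend-rs time to AscendC C++ time; values below 1.0 mean the
/// ascend-rs kernel took less time. (Assumed definition, see lead-in.)
pub fn time_ratio(ascend_rs_ms: f64, ascendc_ms: f64) -> f64 {
    ascend_rs_ms / ascendc_ms
}

/// Formats a ratio with two decimals and an "x" suffix, matching the table.
pub fn format_ratio(ratio: f64) -> String {
    format!("{:.2}x", ratio)
}
```

With the softmax (256) timings, format_ratio(time_ratio(0.077, 0.078)) gives 0.99x; with the softmax (16384) timings, 0.087 and 0.089, it gives 0.98x.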
This appendix is generated automatically by bash scripts/generate_kernel_appendix.sh --lang zh.
Kernel count: 486 compile-tested + 19 deployable = 505 total.