
Ascend 910 vs Nvidia H100: Enterprise Comparison

April 12, 2026 · Li Mei

When planning AI infrastructure, enterprise decision makers need clear data. This article provides a detailed comparison of Huawei Ascend 910B and Nvidia H100 SXM based on specifications, real-world benchmarks, and total cost of ownership.

Specifications Comparison

| Parameter | Ascend 910B | Nvidia H100 SXM |
|---|---|---|
| FP16 Performance | 376 TFLOPS | 989 TFLOPS |
| BF16 Performance | 376 TFLOPS | 989 TFLOPS |
| INT8 Performance | 640 TOPS | 3,958 TOPS |
| Memory | 64 GB HBM2e | 80 GB HBM3 |
| Memory Bandwidth | 1.6 TB/s | 3.35 TB/s |
| Power Consumption | 400 W | 700 W |
| Price (MSRP) | ~$15,000 | ~$30,000 |

On paper, H100 leads in raw performance. But real-world AI workloads tell a different story.
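To make the paper comparison concrete, here is a short Python sketch that normalizes the FP16 throughput from the spec table by power draw and list price (all figures taken from the table above):

```python
# Efficiency comparison derived from the spec table.
SPECS = {
    "Ascend 910B": {"fp16_tflops": 376, "watts": 400, "price_usd": 15_000},
    "H100 SXM":    {"fp16_tflops": 989, "watts": 700, "price_usd": 30_000},
}

for name, s in SPECS.items():
    per_watt = s["fp16_tflops"] / s["watts"]            # TFLOPS per watt
    per_kusd = s["fp16_tflops"] / (s["price_usd"] / 1000)  # TFLOPS per $1k
    print(f"{name}: {per_watt:.2f} TFLOPS/W, {per_kusd:.1f} TFLOPS per $1k")
```

Note that on these paper numbers the H100 leads even after normalizing by power and price, which is exactly why the comparison has to move on to real-world benchmarks and total cost of ownership.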

Real-World Benchmarks: Llama 3 70B Inference

| Metric | Ascend 910B | H100 | Difference |
|---|---|---|---|
| Throughput | 85 tok/s | 95 tok/s | −11% |
| Latency (TTFT, time to first token) | 180 ms | 150 ms | +20% |
| Latency (TBT, time between tokens) | 45 ms | 38 ms | +18% |
| Memory Usage | 58 GB | 62 GB | −6% |

For most enterprise inference workloads, the performance difference is negligible. Both chips handle large language models efficiently.
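The "Difference" column above can be recomputed directly from the measured values, taking the H100 as the baseline:

```python
# Recomputing the benchmark table's Difference column (H100 = baseline).
BENCH = {
    # metric: (ascend_910b, h100)
    "throughput_tok_s": (85, 95),
    "ttft_ms": (180, 150),
    "tbt_ms": (45, 38),
    "memory_gb": (58, 62),
}

for metric, (ascend, h100) in BENCH.items():
    diff_pct = (ascend - h100) / h100 * 100
    print(f"{metric}: {diff_pct:+.0f}%")
```

For latency metrics a positive difference means the Ascend is slower; for throughput and memory a negative difference means lower, so the signs in the table read differently per row.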

Total Cost of Ownership (3 Years)

Configuration: one 8-accelerator server for enterprise deployment

| Cost Item | Ascend 910B | H100 |
|---|---|---|
| Hardware | 3,200,000 CZK | 6,800,000 CZK |
| Power (3 years) | 420,000 CZK | 735,000 CZK |
| Support (3 years) | 480,000 CZK | 1,020,000 CZK |
| Total | 4,100,000 CZK | 8,555,000 CZK |

Savings with Ascend: 52%, more than 4 million CZK over three years.
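The TCO table can be reproduced with a small model. Hardware and support figures come straight from the table; the electricity price of ~5 CZK/kWh and 24/7 utilization are assumptions chosen because they reproduce the published power costs for an 8-chip server:

```python
# Sketch of the 3-year TCO model behind the table above.
HOURS_3Y = 3 * 365 * 24   # 26,280 hours of continuous operation
CZK_PER_KWH = 5.0         # assumed enterprise tariff

def tco(hardware_czk: int, watts_per_chip: int, support_czk: int,
        chips: int = 8) -> float:
    """3-year total cost: hardware + electricity + support."""
    power_kwh = chips * watts_per_chip / 1000 * HOURS_3Y
    return hardware_czk + power_kwh * CZK_PER_KWH + support_czk

ascend = tco(3_200_000, 400, 480_000)
h100 = tco(6_800_000, 700, 1_020_000)
savings = (h100 - ascend) / h100 * 100
print(f"Ascend: {ascend:,.0f} CZK, H100: {h100:,.0f} CZK, savings {savings:.0f}%")
```

Under these assumptions the model lands within rounding of the table's totals and confirms the ~52% savings figure.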

When to Choose Ascend

✅ Suitable for:

- Inference-heavy workloads, especially Chinese-language LLMs
- Budget-constrained deployments where TCO dominates the decision
- Models already exported to ONNX

❌ Not suitable for:

- Cutting-edge training research
- Stacks with deep custom CUDA kernel dependencies

Migration Complexity

| Workload | Complexity | Time |
|---|---|---|
| ONNX model inference | Low | 1–2 days |
| PyTorch (via CANN) | Medium | 1 week |
| Custom CUDA kernels | High | 2–4 weeks |
| Distributed training | Medium | 1–2 weeks |
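For a mixed portfolio of workloads, the table above can be turned into a rough back-of-the-envelope migration estimate. The effort ranges below are lifted from the table (in working days, with "1 week" taken as 5 days); the example workload counts are hypothetical:

```python
# Rough migration-effort estimator built from the complexity table.
EFFORT_DAYS = {
    "onnx_inference": (1, 2),          # Low
    "pytorch_cann": (5, 5),            # Medium, ~1 week
    "custom_cuda_kernels": (10, 20),   # High, 2-4 weeks
    "distributed_training": (5, 10),   # Medium, 1-2 weeks
}

def estimate(workloads: dict) -> tuple:
    """Return (min_days, max_days) for a dict of {workload_type: count}."""
    lo = sum(EFFORT_DAYS[w][0] * n for w, n in workloads.items())
    hi = sum(EFFORT_DAYS[w][1] * n for w, n in workloads.items())
    return lo, hi

# Example: two ONNX services plus one PyTorch pipeline
print(estimate({"onnx_inference": 2, "pytorch_cann": 1}))
```

Estimates like this assume the migrations are done serially by one team; parallel work or unfamiliar tooling can shift the numbers substantially.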

Conclusion

Ascend 910B is not for everyone. But for most enterprise deployments, it offers sufficient performance at half the cost. The decision should be based on your specific use case, not marketing materials.

For inference workloads with Chinese LLMs or budget constraints, Ascend is a compelling alternative. For cutting-edge training research with CUDA dependencies, Nvidia remains the standard.

Need help deciding? Contact us for a consultation based on your specific requirements.
