+40%
Faster DeepSeek inference
1st
DeepSeek-Ascend commercial API
70k
Ascend 910C chips deployed
The Opportunity
When DeepSeek emerged as a serious open-source competitor to GPT-4, the entire industry scrambled to deploy it. But there was a bottleneck: Nvidia GPU availability. H100 allocation wait times stretched to 6+ months, and prices skyrocketed.
SiliconStorm saw the gap. Instead of waiting in the Nvidia queue, they partnered with Huawei to build the world's first commercial DeepSeek inference platform running entirely on Ascend 910C NPUs, and the results exceeded everyone's expectations.
The Architecture
SiliconStorm's infrastructure is built on a massive Ascend cluster:
- 70,000+ Ascend 910C chips deployed across multiple data centers (~$2 billion in hardware value)
- Custom-optimized DeepSeek kernels compiled specifically for CANN 8.0, exploiting Ascend's native tensor operations
- MindIE inference engine with batch scheduling optimized for high-concurrency API serving
- Multi-region deployment including EU-based data centers for GDPR-compliant workloads
"Everyone was fighting over Nvidia H100s. We built on Ascend and went from zero to production in 3 months while our competitors were still waiting for GPU allocations."
Technology Stack
Ascend 910C
DeepSeek R1
CANN 8.0
MindIE
Custom CANN kernels (ported from CUDA)
Multi-region API
Why 40% Faster?
The 40% improvement over comparable Nvidia-based inference isn't magic; it's the result of deep hardware-software co-optimization:
- Native CANN compilation: Instead of running DeepSeek through generic CUDA-to-CANN translation layers, SiliconStorm rewrote critical inference kernels natively for the Ascend architecture
- DaVinci core utilization: Ascend 910C's DaVinci cores have different memory access patterns than CUDA cores; SiliconStorm's custom kernels exploit this for better cache utilization during attention computation
- Batch scheduling: MindIE's dynamic batching engine groups similar-length requests together, maximizing NPU utilization to >85% (vs. ~60% on generic deployments)
- Quantization: INT8/FP16 mixed-precision inference tuned specifically for DeepSeek's architecture
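The batch-scheduling idea described above, grouping requests of similar prompt length so that padding waste stays low, can be sketched in a few lines of Python. This is an illustrative sketch only, not MindIE's actual scheduler; the function name, bucket width, and batch-size limit are invented for the example.

```python
from collections import defaultdict

def bucket_requests(requests, bucket_width=128, max_batch=32):
    """Group pending requests by similar prompt length.

    A batch must be padded to its longest sequence, so mixing a
    30-token prompt with a 2,000-token prompt wastes most of the
    compute on padding. Bucketing by length keeps NPU utilization
    high. `requests` is a list of (request_id, prompt_len) pairs.
    """
    buckets = defaultdict(list)
    for req_id, prompt_len in requests:
        buckets[prompt_len // bucket_width].append((req_id, prompt_len))

    # Emit batches bucket by bucket, capped at max_batch requests each.
    batches = []
    for _, reqs in sorted(buckets.items()):
        for i in range(0, len(reqs), max_batch):
            batches.append(reqs[i:i + max_batch])
    return batches

# Example: two natural length clusters (0-127 and 128-255 tokens)
pending = [("a", 30), ("b", 45), ("c", 200), ("d", 110), ("e", 180)]
print(bucket_requests(pending))
```

A production scheduler would additionally rebatch continuously as tokens are generated and requests finish (continuous batching), but the length-bucketing shown here is the core reason grouped scheduling lifts accelerator utilization.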
Market Impact
SiliconStorm's success has triggered a cascade of industry moves:
- Alibaba, Baidu, and Tencent are all now testing Ascend 910C as a direct Nvidia H100 replacement, following SiliconStorm's proof of concept
- ~70,000 Ascend chips have been ordered by major cloud providers, representing ~$2 billion in hardware value
- European enterprises now have a viable path to run DeepSeek without Nvidia dependency, which is critical for organizations concerned about US export control risks
Relevance for European Enterprises
SiliconStorm's EU data center option is a game-changer for European organizations that want:
- DeepSeek-class AI capabilities without sending data to US or Chinese cloud providers
- No Nvidia dependency: no allocation queues, no export control risk
- Cost-competitive pricing: Ascend-based inference at 40-60% lower cost than equivalent Nvidia-based APIs
- GDPR compliance: data processed in EU-located data centers
Key Takeaway
SiliconStorm has proven that Ascend isn't just "comparable" to Nvidia; in optimized scenarios, it's measurably faster. For enterprises building AI strategies, this eliminates the last major objection: performance. Combined with lower cost and zero export control risk, the business case for Ascend has never been stronger.
Source: SiliconStorm product announcements, industry analyst reports (Q4 2025), Huawei Ascend 910C deployment data