+40%
Faster DeepSeek inference
1st
DeepSeek-Ascend commercial API
70k
Ascend 910C chips deployed
The Opportunity
When DeepSeek emerged as a serious open-source competitor to GPT-4, the entire industry scrambled to deploy it. But there was a bottleneck: Nvidia GPU availability. H100 allocation wait times stretched to 6+ months, and prices skyrocketed.
SiliconStorm saw the gap. Instead of waiting in the Nvidia queue, they partnered with Huawei to build the world's first commercial DeepSeek inference platform running entirely on Ascend 910C NPUs, and the results exceeded everyone's expectations.
The Architecture
SiliconStorm's infrastructure is built on a massive Ascend cluster:
- 70,000+ Ascend 910C chips deployed across multiple data centers (~$2 billion in hardware value)
- Custom-optimized DeepSeek kernels compiled specifically for CANN 8.0, exploiting Ascend's native tensor operations
- MindIE inference engine with batch scheduling optimized for high-concurrency API serving
- Multi-region deployment including EU-based data centers for GDPR-compliant workloads
"Everyone was fighting over Nvidia H100s. We built on Ascend and went from zero to production in 3 months while our competitors were still waiting for GPU allocations."
Technology Stack
Ascend 910C
DeepSeek R1
CANN 8.0
MindIE
Custom CANN kernels (ported from CUDA)
Multi-region API
Why 40% Faster?
The 40% improvement over comparable Nvidia-based inference isn't magic; it's the result of deep hardware-software co-optimization:
- Native CANN compilation: Instead of running DeepSeek through generic CUDA-to-CANN translation layers, SiliconStorm rewrote critical inference kernels natively for the Ascend architecture
- DaVinci core utilization: Ascend 910C's DaVinci cores have different memory access patterns than CUDA cores; SiliconStorm's custom kernels exploit this for better cache utilization during attention computation
- Batch scheduling: MindIE's dynamic batching engine groups similar-length requests together, maximizing NPU utilization to >85% (vs. ~60% on generic deployments)
- Quantization: INT8/FP16 mixed-precision inference tuned specifically for DeepSeek's architecture
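The batch-scheduling idea described above, grouping requests of similar prompt length so that padding waste stays low, can be sketched in a few lines of Python. This is an illustrative sketch only, not MindIE's actual scheduler; the function name, bucket width, and batch-size limit are invented for the example.

```python
from collections import defaultdict

def bucket_requests(requests, bucket_width=128, max_batch=32):
    """Group pending requests by similar prompt length.

    A batch must be padded to its longest sequence, so mixing a
    30-token prompt with a 2,000-token prompt wastes most of the
    compute on padding. Bucketing by length keeps NPU utilization
    high. `requests` is a list of (request_id, prompt_len) pairs.
    """
    buckets = defaultdict(list)
    for req_id, prompt_len in requests:
        buckets[prompt_len // bucket_width].append((req_id, prompt_len))

    # Emit batches bucket by bucket, capped at max_batch requests each.
    batches = []
    for _, reqs in sorted(buckets.items()):
        for i in range(0, len(reqs), max_batch):
            batches.append(reqs[i:i + max_batch])
    return batches

# Example: two natural length clusters (0-127 and 128-255 tokens)
pending = [("a", 30), ("b", 45), ("c", 200), ("d", 110), ("e", 180)]
print(bucket_requests(pending))
```

A production scheduler would additionally rebatch continuously as tokens are generated and requests finish (continuous batching), but the length-bucketing shown here is the core reason grouped scheduling lifts accelerator utilization.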
Market Impact
SiliconStorm's success has triggered a cascade of industry moves:
- Alibaba, Baidu, and Tencent are all now testing Ascend 910C as a direct Nvidia H100 replacement, following SiliconStorm's proof of concept
- ~70,000 Ascend chips have been ordered by major cloud providers, representing ~$2 billion in hardware value
- European enterprises now have a viable path to run DeepSeek without Nvidia dependency, which is critical for organizations concerned about US export control risks
Relevance for European Enterprises
SiliconStorm's EU data center option is a game-changer for European organizations that want:
- DeepSeek-class AI capabilities without sending data to US or Chinese cloud providers
- No Nvidia dependency: no allocation queues, no export control risk
- Cost-competitive pricing: Ascend-based inference at 40-60% lower cost than equivalent Nvidia-based APIs
- GDPR compliance: data processed in EU-located data centers
Key Takeaway
SiliconStorm has proven that Ascend isn't just "comparable" to Nvidia; in optimized scenarios, it's measurably faster. For enterprises building AI strategies, this eliminates the last major objection: performance. Combined with lower cost and zero export control risk, the business case for Ascend has never been stronger.
Source: SiliconStorm product announcements, industry analyst reports (Q4 2025), Huawei Ascend 910C deployment data