General Questions
What is Huawei Ascend?
Huawei Ascend is a family of AI accelerators (NPUs) and servers designed for training and
inference of large language models (LLMs) and other AI workloads. Ascend is a direct alternative
to Nvidia GPUs, offering comparable performance at significantly lower cost.
What is the difference between Ascend 910 and 310?
| Parameter | Ascend 910 | Ascend 310 |
|---|---|---|
| Target | LLM training | Inference |
| Performance | 256–376 TFLOPS FP16 | 16–22 TOPS INT8 |
| Memory | 32–64 GB HBM | 8–16 GB LPDDR4 |
| Power | 310–400W | 8–12W |
Why choose Ascend over Nvidia?
- Independence from US supply chains – no export restrictions
- Lower TCO – 20–40% cost savings over 3 years
- Local support – Czech technical support
- Integration – native support for Chinese LLMs (DeepSeek, Qwen)
Is Ascend compatible with existing models?
Yes. Models trained on Nvidia can be converted to Ascend using ONNX format, CANN toolkit, or
MindSpore framework. Most modern LLMs (Llama, DeepSeek, Qwen) are already optimized for Ascend.
Technical Questions
What is CANN toolkit?
CANN (Compute Architecture for Neural Networks) is Huawei's software stack for Ascend,
including:
- ACL (Ascend Computing Language) – runtime API for device, memory, and model management
- ATC (Ascend Tensor Compiler) – model conversion (Caffe/ONNX/TensorFlow → Ascend offline model)
- Profiling tools – performance analysis and optimization
How do I migrate from CUDA to CANN?
- Export model from PyTorch/TensorFlow to ONNX
- Convert using ATC compiler
- Optimize for Ascend NPU
- Test and benchmark
Read our detailed guide: Migrating from CUDA to CANN
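As a rough sketch of the conversion step, an ATC invocation for an ONNX model can be assembled like this. The flag values are illustrative (`--framework=5` selects ONNX input); check `atc --help` for your CANN version before relying on them:

```python
def build_atc_command(onnx_model: str, output: str, soc: str = "Ascend310") -> str:
    """Assemble an ATC command line for converting an ONNX model.

    --framework=5 selects ONNX as the input format; --soc_version must
    match the target NPU (e.g. Ascend310, Ascend910).
    """
    args = [
        "atc",
        f"--model={onnx_model}",
        "--framework=5",
        f"--output={output}",
        f"--soc_version={soc}",
    ]
    return " ".join(args)

# Hypothetical file names, for illustration only.
cmd = build_atc_command("llama.onnx", "llama_ascend", soc="Ascend910")
print(cmd)
```

The resulting `.om` offline model is what the Ascend runtime loads for inference.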
Which frameworks are supported?
- MindSpore – native, best performance
- PyTorch – via CANN backend
- TensorFlow – via CANN backend
- ONNX Runtime – universal inference
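As an illustration of the PyTorch path, a minimal device-selection sketch, assuming the `torch_npu` adapter is installed on the Ascend host (it falls back to CPU elsewhere):

```python
# Hedged sketch: pick an Ascend NPU device in PyTorch when the torch_npu
# adapter is present, otherwise fall back to CPU.
try:
    import torch
    import torch_npu  # noqa: F401  (Huawei's Ascend adapter for PyTorch)

    device = "npu:0" if torch.npu.is_available() else "cpu"
except ImportError:
    device = "cpu"  # Ascend stack not installed on this machine

print(device)  # tensors and models can then be moved with .to(device)
```

Existing PyTorch code typically needs little more than this device change plus a recompile of any custom CUDA kernels.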
What is the performance compared to Nvidia A100?
| Metric | Ascend 910B | Nvidia A100 |
|---|---|---|
| FP16 | 376 TFLOPS | 312 TFLOPS |
| INT8 | 640 TOPS | 624 TOPS |
| Memory BW | 1.6 TB/s | 2.0 TB/s |
| LLM inference | ~95% of A100 | 100% (baseline) |
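The raw ratios follow directly from the table (numbers taken from the rows above):

```python
# Spec ratios from the 910B vs. A100 comparison table above.
fp16 = 376 / 312        # peak FP16 throughput ratio
int8 = 640 / 624        # peak INT8 throughput ratio
mem_bw = 1.6 / 2.0      # memory bandwidth ratio

print(f"FP16: {fp16:.2f}x, INT8: {int8:.2f}x, Mem BW: {mem_bw:.2f}x")
```

LLM inference is often memory-bandwidth-bound rather than compute-bound, which is one plausible reason the end-to-end inference figure (~95%) sits below the raw FLOPS advantage.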
Pricing & Purchase
How much does an Ascend server cost?
Atlas 800 Training Server (8× 910B):
- Price: ~2,500,000–3,000,000 CZK
- Includes 3-year support
- Delivery: 4–6 weeks
Atlas 300I Pro Inference Card:
- Price: ~80,000–120,000 CZK
- PCI-Express card for inference deployment
Is rental/cloud available?
Yes, we offer:
- Managed hosting – our server in your data center
- Cloud instances – API access
- Lease – 3-year financing options
Is there a trial/POC available?
Yes:
- Remote demo – access to our server
- POC – 30-day testing period
- Benchmark – comparison with your current solution
Deployment & Operations
How long does deployment take?
| Phase | Duration |
|---|---|
| Hardware delivery | 4–6 weeks |
| Installation | 1–2 days |
| Configuration | 2–3 days |
| Model migration | 1–2 weeks |
| Testing | 1 week |
| Total | 6–10 weeks |
What are the data center requirements?
- Rack space: 2U–4U depending on configuration
- Power: 2× 3000W (for training server)
- Cooling: front-to-back airflow, 35–45 dB
- Network: 2× 10G/25G/100G Ethernet
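A quick sanity check on the power and cooling budget, using illustrative arithmetic based on the figures above:

```python
# Rough power/thermal budget for one training server (2x 3000W PSUs).
psu_watts = 3000
psus = 2
total_w = psu_watts * psus            # worst-case draw: 6000 W

# 1 W of dissipation is ~3.412 BTU/hr of heat the cooling must remove.
btu_per_hr = total_w * 3.412
print(f"{total_w} W -> ~{btu_per_hr:.0f} BTU/hr of cooling load")
```

Typical sustained draw will sit below this worst case, but racks and PDUs should be sized for it.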
Is clustering supported?
Yes, we support:
- Scale-up – up to 8× 910B in a single server
- Scale-out – multiple servers via RoCE/InfiniBand
- Kubernetes – container orchestration
- Slurm – HPC workload management
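For the Kubernetes path, a pod can request NPUs through Huawei's Ascend device plugin. A minimal sketch follows; the resource name `huawei.com/Ascend910` is what the device plugin typically advertises, and the image name is purely illustrative — verify both against your cluster's plugin version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ascend-training
spec:
  containers:
    - name: trainer
      image: mindspore-training:latest   # illustrative image name
      resources:
        limits:
          huawei.com/Ascend910: 8        # request all 8 NPUs on the node
```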
Security & Compliance
Is my data secure?
On-premise deployment means:
- Data never leaves your infrastructure
- No cloud dependency
- Full control over access
- Audit logs for compliance
Is it GDPR compliant?
Yes:
- Data sovereignty – data stays in the EU
- No third-party access
- Right to deletion – full control
- Audit trail – access logging
What certifications does Ascend have?
- ISO 27001 – information security
- ISO 9001 – quality management
- CE – European conformity
- RoHS – environmental standard
Still have questions?
Contact us directly. We're happy to help with any specific requirements or technical questions.
📞 +420 739 414 475 | 📧 [email protected]