Provide Training Solutions for Model Algorithms
TES AI provides a one-stop solution for training clusters. It builds on underlying technologies such as GPU topology awareness, affinity scheduling, and highly parallel file systems, supports multiple model training methods, and is compatible with mainstream AI frameworks. It also extends and customizes the industry's mainstream distributed training solutions to increase the volume of training data that can be handled and to shorten the model delivery cycle.

TES AI also provides a customized algorithm framework that uses code generation and other methods, together with a rich set of built-in operators and algorithms. These simplify dataset import, feature engineering, and the handling of pre-trained model dependencies, which improves AI training development efficiency.

For large-model training in a distributed environment, ZeRO and other technologies are used for memory optimization. They break down the barrier between GPU memory and host RAM and reduce the GPU memory overhead of training.
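To give a sense of why ZeRO-style partitioning reduces GPU memory overhead, the following is a minimal sketch of the commonly cited per-GPU accounting for model states. It is not TES AI's implementation. It assumes mixed-precision training with the Adam optimizer (2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for fp32 optimizer states), and the function name `zero_model_state_bytes` is hypothetical. Activations, temporary buffers, and fragmentation are not counted.

```python
def zero_model_state_bytes(num_params: int, num_gpus: int, stage: int) -> float:
    """Estimate per-GPU bytes used by model states under ZeRO partitioning.

    Assumes mixed-precision Adam (an assumption of this sketch):
      - fp16 parameters:       2 bytes per parameter
      - fp16 gradients:        2 bytes per parameter
      - fp32 optimizer states: 12 bytes per parameter
        (master weights, momentum, variance)

    stage 0: nothing is partitioned (plain data parallelism)
    stage 1: optimizer states are partitioned across GPUs
    stage 2: gradients are partitioned as well
    stage 3: parameters are partitioned as well
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")

    params = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params

    # Each higher stage partitions one more component across the GPUs.
    if stage >= 1:
        optim /= num_gpus
    if stage >= 2:
        grads /= num_gpus
    if stage >= 3:
        params /= num_gpus

    return params + grads + optim
```

Under these assumptions, a 7.5-billion-parameter model on 64 GPUs needs about 120 GB per GPU for model states without partitioning and about 1.9 GB per GPU at stage 3. Offloading partitioned states to host RAM is a separate step that this estimate does not model.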