The origins of TES AI
In the wave of the digital revolution, computing power has become a key driving force for technological progress and innovation across industries. TES AI emerged in response, combining cutting-edge distributed computing with blockchain technology to create an unprecedented computing power sharing ecosystem.
TES AI provides efficient, intelligent solutions for application scenarios with high computing power requirements in a decentralized way, while also creating a lucrative investment opportunity for ordinary users. By purchasing and running cloud nodes, users earn generous token rewards, and the computing power of those nodes supports tasks such as real-time data processing, artificial intelligence (AI) model training, and big data analysis. Usage fees paid on the platform are converted into token rewards through TES AI, ensuring that every participant can benefit from this efficient and well-incentivized ecosystem.
TES AI aims to build a world-leading decentralized computing power sharing platform, using distributed computing and smart contract technology to provide powerful computing support for real-time data processing, AI applications, and other scenarios with high computing power requirements. With the development of blockchain technology and artificial intelligence, TES AI is committed to promoting innovation and change in the entire industry and bringing users an unprecedented computing power sharing experience.
For the past 40 years, Moore's Law has driven the unprecedented growth of the computer industry: according to this law, processor performance doubles every 18 months. In recent years, however, performance growth has slowed to a meagre 10-20% over the same 18-month period. Yet even as Moore's Law has faltered, the demand for computing power has only increased. In response, computer architects have shifted their focus to developing domain-specific processors that prioritize performance over generality.
As the name suggests, domain-specific processors are optimized for specific workloads, sacrificing generality for performance. Deep learning is a prime example of such a workload, revolutionizing application domains including financial services, industrial control, medical diagnosis, manufacturing, and system optimization.
Companies have raced to create specialized processors, like Nvidia's GPUs and Google's TPUs, to support deep learning workloads. While accelerators such as GPUs and TPUs increase computational power, they only extend Moore's Law further into the future, rather than fundamentally increasing the rate of improvement.
The Triple Whammy of Deep Learning Application Demand
The demands of machine learning applications are growing at an astonishing pace. Three key workloads illustrate this: model training, hyperparameter tuning, and simulations.
Model training
According to a renowned OpenAI blog post, the computation required to achieve state-of-the-art machine learning results has roughly doubled every 3.4 months since 2012. This equates to an increase of almost 40x every 18 months, which is 20x more than Moore's Law! Thus, even without the end of Moore's Law, it would fall significantly short of meeting the demands of these applications.
This explosive growth isn't exclusive to niche machine-learning applications like AlphaGo. Similar trends are evident in mainstream applications like computer vision and natural language processing. For example, comparing the computational resources required by the seq2seq model from 2014 to a pretraining approach on tens of billions of sentence pairs from 2019 reveals a ratio of over 5,000x. This corresponds to an annual increase of 5.5x. These figures overshadow Moore's Law, which suggests an increase of only 1.6x per year.
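To make the arithmetic behind these comparisons explicit, the short Python sketch below simply re-derives the growth rates from the figures quoted above (a 3.4-month doubling time, and a 5,000x increase between 2014 and 2019); the figures themselves come from the cited analyses, not from this script.

```python
# Compute doubling every 3.4 months, expressed over an 18-month window.
per_18_months = 2 ** (18 / 3.4)
print(f"Growth over 18 months: ~{per_18_months:.0f}x")  # ~39x

# Moore's Law: 2x every 18 months, so the ratio is roughly 20x.
print(f"Ratio vs. Moore's Law: ~{per_18_months / 2:.0f}x")  # ~20x

# seq2seq (2014) -> large-scale pretraining (2019): 5,000x over 5 years.
annual_growth = 5_000 ** (1 / 5)
print(f"Implied annual growth: ~{annual_growth:.1f}x")  # ~5.5x

# Moore's Law per year: 2x per 18 months is 2^(12/18) per 12 months.
print(f"Moore's Law per year: ~{2 ** (12 / 18):.1f}x")  # ~1.6x
```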
Hyperparameter tuning
The situation is further exacerbated by the fact that models are not trained just once. The quality of a model often depends on various hyperparameters, such as the number of layers, the number of hidden units, and the batch size. To find the best model, developers must search over different hyperparameter settings. This process, called hyperparameter tuning, can be extremely resource-intensive.
For instance, RoBERTa, a robust technique for pretraining NLP models, uses at least 17 hyperparameters. Assuming a minimal two values per hyperparameter, the search space consists of over 130K configurations. Even partially exploring this space requires vast computational resources. Another example of a hyperparameter tuning task is neural architecture search, which automates the design of artificial neural networks by testing different architectures and selecting the best-performing one. Researchers report that designing even a simple neural network can take hundreds of thousands of GPU computing days.
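As a rough illustration of why tuning is so expensive, here is a hedged Python sketch: the 2**17 figure matches the RoBERTa example above, while the three-knob grid and the commented-out `train_and_evaluate` call are hypothetical placeholders for whatever training routine a team actually runs.

```python
import itertools
import random

# 17 hyperparameters with only two candidate values each already yields
# 2**17 = 131,072 configurations -- the "over 130K" figure quoted above.
print(2 ** 17)  # 131072

# Hypothetical grid over just three common knobs; each configuration would
# correspond to a full training run.
search_space = {
    "num_layers": [12, 24],
    "hidden_units": [768, 1024],
    "batch_size": [256, 512],
}
grid = list(itertools.product(*search_space.values()))
print(len(grid))  # 8 configurations -- and this covers only 3 of the 17 knobs

# In practice, random search samples a budget-limited subset of the space.
budget = 4
for config in random.sample(grid, budget):
    params = dict(zip(search_space.keys(), config))
    # train_and_evaluate(params)  # placeholder: each call is a full training run
    print(params)
```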
Simulations
While deep neural network models can typically leverage advances in specialized hardware, not all ML algorithms can. In particular, reinforcement learning algorithms involve numerous simulations. Due to their complex logic, these simulations are best executed on general-purpose CPUs (with GPUs only used for rendering), meaning they don't benefit from recent advances in hardware accelerators. For example, in a recent blog post, OpenAI reported using 128,000 CPU cores and just 256 GPUs (i.e., 500x more CPUs than GPUs) to train a model capable of defeating amateurs at Dota 2.
While Dota 2 is just a game, we're witnessing a surge in the use of simulations for decision-making applications, with startups like Pathmind, Prowler, and Hash.ai emerging in this area. As simulators strive for increasingly accurate environmental modelling, their complexity rises, adding another multiplicative factor to the computational complexity of reinforcement learning.
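The pattern described here, many independent CPU-bound rollouts feeding a comparatively small amount of GPU work, is straightforward to sketch in Python. The rollout body below is a hypothetical stand-in for real environment logic; the point is only the structure of fanning simulations out across all available CPU cores.

```python
import os
from concurrent.futures import ProcessPoolExecutor

def run_rollout(seed: int) -> float:
    """Hypothetical stand-in for one simulation rollout: branching,
    general-purpose logic that runs on a CPU core, not an accelerator."""
    total_reward = 0.0
    state = seed
    for _ in range(100_000):
        state = (1103515245 * state + 12345) % 2**31  # toy pseudo-random dynamics
        total_reward += (state % 3) - 1               # toy reward signal
    return total_reward

if __name__ == "__main__":
    # Fan rollouts out across every available CPU core; a GPU (if any) would
    # only be used afterwards, to update the policy on the collected data.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        rewards = list(pool.map(run_rollout, range(64)))
    print(f"Collected {len(rewards)} rollouts, mean reward {sum(rewards)/len(rewards):.2f}")
```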
Big data and AI are rapidly transforming our world. While technological revolutions bring risks, we see immense potential for this revolution to enhance our lives in ways we couldn't have imagined just a decade ago. However, to realize this promise, we must overcome the massive challenges posed by the rapidly growing gap between application demands and our hardware capabilities. To bridge this gap, distributing applications appears to be the only viable solution. This necessitates new software tools, frameworks, and curricula to train and enable developers to build such applications, marking the beginning of a thrilling new era in computing.
At TES AI, we develop innovative tools and distributed systems like Ray to guide application developers into this new era.
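As a minimal sketch of what this looks like with Ray (assuming `pip install ray` and, for the GPU task, a node that actually has a GPU), the example below distributes CPU-bound simulation tasks across a cluster and reserves a GPU only for the aggregation step. The task bodies are placeholders for illustration, not production code.

```python
import ray

ray.init()  # connects to a local or remote Ray cluster

# CPU-bound simulation task: Ray schedules one copy per reserved CPU core,
# whether those cores live on one machine or hundreds.
@ray.remote(num_cpus=1)
def simulate(seed: int) -> int:
    state = seed
    for _ in range(100_000):
        state = (1103515245 * state + 12345) % 2**31  # toy environment dynamics
    return state % 1000

# GPU-bound task: requesting num_gpus=1 tells the scheduler to place it on a
# node with a GPU, mirroring the lopsided CPU:GPU ratio described above.
# (Drop num_gpus=1 to try this sketch on a CPU-only machine.)
@ray.remote(num_gpus=1)
def train_on_results(results: list) -> float:
    return sum(results) / len(results)  # placeholder for a real training step

results = ray.get([simulate.remote(i) for i in range(1000)])
print(ray.get(train_on_results.remote(results)))
```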