Distributed Computing: the Right Way to Power Artificial Intelligence

Sentient Technologies uses distributed computing to power its evolutionary and deep learning AI solutions. We sat down with Adam Beberg, our Principal Architect, Distributed Computing, and asked him to walk us through it:

Q. Adam, you are relatively new to the company. What brought you to Sentient?
A. I’ve been building distributed systems that use idle computing power, like distributed.net and Folding@home, for a couple of decades now, and when I saw what Sentient was doing in this area with AI, I was impressed by the results. It is the first company I’d seen that successfully combined large-scale AI with this type of distributed computing system. The fact that Sentient has already demonstrated success using it in trading and medical research is really exciting, and shows the potential of AI in other areas.

Q. How does AI scale?
A. Usually, when you double the power of a distributed system, it can solve problems twice as fast, or a year earlier than others who wait for Moore’s Law to deliver the same power. When running AI on a distributed system, the results aren’t as linear. Doubling the compute power can result in AI that is four times smarter, and that presents many exciting possibilities. Other types of AI need to double their power to reduce errors by another few percent, and require truly vast scale to improve on existing results.

Q. Is cloud computing a solved problem?
A. In the ’90s a company could go to an ISP and rent machines and bandwidth by the month, then go through the complex process of setting up the database and other infrastructure. Today, with a cloud provider, you pay for compute by the hour and for storage and bandwidth by the gigabyte. All the software, APIs, and tools are free, so long as it’s a standard problem. But it’s not very exciting for engineers who want to build those systems, because the existing tools are unlikely to change radically and a handful of companies dominate the market.

Q. What is new about Sentient’s approach to scaling its AI?
A. If you ask a cloud provider for a million cores, they will laugh, then respond with a date a few years from now when those cores might be available. What we’re doing is harvesting idle cycles wherever they are, to solve problems with AI at a scale unheard of in the industry. AI is a new frontier, and the problems we have to solve to make a system like this work are really difficult. We’re having to rethink and re-imagine how distributed systems work from the ground up to enable the AI, and that’s a fun challenge.

Q. How big is Sentient’s system, and how many cores can it scale to?
A. We’re designing the next version of our system to scale to millions of cores, coordinated across different geographic regions, for months at a time. It’s also designed to handle general computation, not just the types of work needed for the AI.

Q. What types of challenges did Sentient face in building this system, and how were they overcome?
A. The biggest technical challenges are optimizing data movement across different geographic locations, security and data privacy issues, and making the entire system operate in the most efficient way possible by moving the computation to the data. These are problems that don’t exist when running at a cloud provider or a dedicated data center, but solving them allows us to work at unprecedented scale.
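
Editor's illustration: the idea of moving the computation to the data can be sketched in a few lines of Python. This is a toy model rather than Sentient's scheduler; the names `Dataset`, `Job`, `bytes_moved`, and `pick_region` are hypothetical, and it assumes the job's code is shipped from a coordinator outside every worker region, so the code always travels while the data travels only when the job runs somewhere else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    region: str        # region where the data is stored
    size_bytes: int


@dataclass(frozen=True)
class Job:
    name: str
    code_bytes: int    # size of the program shipped to the workers
    dataset: Dataset


def bytes_moved(job: Job, run_region: str) -> int:
    """Bytes that cross region boundaries if the job runs in run_region."""
    if run_region == job.dataset.region:
        # Only the code travels; the data is read locally.
        return job.code_bytes
    # The code and the whole dataset must travel to the remote region.
    return job.code_bytes + job.dataset.size_bytes


def pick_region(job: Job, regions: list[str]) -> str:
    """Choose the region that minimizes cross-region traffic."""
    return min(regions, key=lambda region: bytes_moved(job, region))


# Hypothetical example: a 50 GB dataset stored in "eu", a 2 MB program.
genomes = Dataset(name="genomes", region="eu", size_bytes=50_000_000_000)
job = Job(name="fitness-eval", code_bytes=2_000_000, dataset=genomes)
chosen = pick_region(job, ["us", "eu", "asia"])
```

In this toy case the job lands in the region that already holds the data, because shipping a few megabytes of code is far cheaper than shipping tens of gigabytes of data. A real system would also have to weigh available capacity, security, and data privacy, which are the other challenges named above.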

Q. You also architected distributed.net and the early Folding@home system. What has changed about this type of distributed computing since that time?
A. Twenty years ago, machines connected with modems and had a tiny fraction of the power of a cell phone today: Windows 95 PCs with Pentiums, and Macs with OS 7 and PowerPCs. Over time that evolved into faster machines, DSL connections in homes, PlayStations, and GPUs. Now almost every device has a multi-core processor and a GPU, and because high bandwidth is ubiquitous, location is no longer a limiting factor. The downside is that desktops are now an endangered species and most personal devices run on a battery, which limits which of them can be used for this type of work. Overall, building our distributed system is much easier today, because we can harness the abundance of powerful compute resources, wherever they are in the world, to run our AI projects.