[Hardware] Nvidia Blackwell Is One Hot Processor


Maheso

https://www.technewsworld.com/wp-content/uploads/sites/3/2024/03/nvidia-blackwell-architecture.jpg

Nvidia has faced scrutiny this month because some servers with a whopping 72 Blackwell processors were overheating. The issue arose because some initial OEM deployments were not properly water-cooled, a problem Lenovo aggressively identified and mitigated with its Neptune warm-water cooling solutions.

As AI advances, we’ll need more highly dense, incredibly powerful AI processors, which suggests that air cooling in server rooms may become obsolete.

Let’s talk about Blackwell, water cooling, and why Lenovo’s Neptune solution stands out at the moment. We’ll close with my Product of the Week: Microsoft’s Windows 365 Link, which could be the missing link between PCs and terminals that could forever change desktop computing.

Blackwell is Nvidia’s premier, AI-focused GPU. When it was announced, it was so far beyond what most would have thought practical that it seemed more like a pipe dream than a solution. But it works, and nothing else comes close to it in its class right now. However, it is massively dense in terms of technology and generates a lot of heat.

Some argue it is a potential ecological disaster. Don’t get me wrong, it does pull a lot of power and generate a tremendous amount of heat. But its performance is so high compared to the kind of load that you’d typically get with more conventional parts that it is relatively economical to run.

It’s like comparing a semi-truck with three trailers to a U-Haul van. Yes, the semi will get comparatively crappy gas mileage, but it will also hold more cargo than 10 U-Haul vans and use a lot less gas than those 10 vans, making it more ecologically friendly. The same is true of Blackwell. It is so far beyond its competition in terms of performance that its relatively high energy use is below what otherwise would be required for a competitive AI server.
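The arithmetic behind the analogy can be sketched in a few lines. The figures below are hypothetical and chosen only to mirror the comparison above (one semi carrying the cargo of 10 vans while burning less fuel than those 10 vans combined); they are not measurements of any vehicle or of Blackwell. The function name `fuel_per_cargo` is likewise just an illustration.

```python
def fuel_per_cargo(fuel_used: float, cargo_moved: float) -> float:
    """Fuel burned per unit of cargo delivered (lower is better)."""
    if cargo_moved <= 0:
        raise ValueError("cargo_moved must be positive")
    return fuel_used / cargo_moved


# Hypothetical units: one van moves 1 unit of cargo on 1 unit of fuel.
VAN_FUEL, VAN_CARGO = 1.0, 1.0
# Assumption: the semi burns 4x the fuel of one van but moves 10x the cargo.
SEMI_FUEL, SEMI_CARGO = 4.0, 10.0

van_cost = fuel_per_cargo(VAN_FUEL, VAN_CARGO)
semi_cost = fuel_per_cargo(SEMI_FUEL, SEMI_CARGO)

# Fuel needed to move the same 10 units of cargo with vans alone.
fleet_fuel = 10 * VAN_FUEL

print(f"van:  {van_cost:.2f} fuel per cargo unit")
print(f"semi: {semi_cost:.2f} fuel per cargo unit")
print(f"10 vans burn {fleet_fuel:.1f} units; one semi burns {SEMI_FUEL:.1f}")
```

The same ratio, energy consumed divided by work completed, is the figure of merit the argument relies on for AI servers: a part can draw more power in absolute terms and still use less energy per unit of work.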

But Blackwell chips do run hot, and most servers today are air-cooled. So, it shouldn’t be surprising that some Blackwell servers were configured with air cooling and that those with 72 or more Blackwell processors in a rack overheated. While 72 Blackwells in a rack is unusual today, it will become more common as AI advances, given that Nvidia is currently the king of AI.

You can only go so far with air-cooled technology in terms of performance before you have to move to liquid cooling. While Nvidia did respond to this issue with a water-cooled rack specification that Dell is now using, Lenovo was way ahead of the curve with its Neptune water-cooling solution.

 

[Lenovo Neptune]
 

Lenovo was the first to realize this, mainly because it is currently the market leader in its class in terms of water cooling, a technology it initially acquired from IBM, which has been doing water cooling for decades.

What is important with water cooling isn’t just the technology but the knowledge of how to deploy it safely. Mixing water and high-amperage electronics can be a disaster if you don’t know what you’re doing. As a result of the IBM server acquisition, Lenovo has decades of water cooling experience that it calls Neptune.

Given that Nvidia has specified a water-cooled rack, what makes Neptune better? The answer is experience. Most of those who will use the Nvidia-specified solution, including Nvidia itself, don’t often deploy water-cooled solutions. As a result, particularly with these high-end Blackwell implementations, they’ll essentially be learning on the job.

It can be really dangerous when you mix water with high-amperage electronics. Water and electricity don’t mix. Not only can a leak fry an expensive part or even an entire rack, but if a person is present, it can fry them, too, if the breakers don’t trip. In a raised-floor environment, unless it has been designed with leaks in mind, terrible things can happen.

 

https://www.technewsworld.com/story/nvidia-blackwell-is-one-hot-processor-179476.html
