Reen Singh is an engineer and a technologist with a diverse background spanning software, hardware, aerospace, defense, and cybersecurity.
As CTO at Uvation, he leverages his extensive experience to lead the company’s technological innovation and development.
The NVIDIA DGX H200 is a powerful, fully integrated AI supercomputer, described as a “data centre in a single box”. It is a factory-built system designed from the ground up to handle the most demanding artificial intelligence and high-performance computing tasks. Its significant power consumption is directly related to the high-performance components it contains.
The key components driving its power demand include:
GPUs: The most power-hungry parts are its eight NVIDIA H200 Tensor Core GPUs. Each of these complex processors is equipped with 141 GB of fast HBM3e memory and requires a tremendous amount of electrical energy to perform billions of calculations per second.
CPU: The system uses two Intel Xeon CPUs to manage operations and prepare data for the GPUs. These processors require a notable amount of power, especially when coordinating data flow across all eight GPUs.
Interconnects: A specialised internal network called NVLink and NVSwitch acts as a high-speed data highway, allowing the eight GPUs to share information directly. Keeping this network running at maximum speed to avoid bottlenecks consumes a considerable amount of power.
Supporting Hardware: Other parts such as system memory, solid-state drives (SSDs), high-speed network cards, and a sophisticated cooling system with multiple fans also contribute to the total power draw.
The significant power consumption of the NVIDIA DGX H200 is not a design flaw but rather a direct trade-off for its unmatched computational performance. The electricity it consumes is converted into raw processing power, which allows the system to train massive AI models and solve complex scientific problems in hours or days, instead of the weeks or months it might otherwise take. This makes the high power requirement a necessary investment for leading-edge research.
The primary drivers of this high power demand are its powerful internal components working in concert:
Eight NVIDIA H200 GPUs: These are the most power-hungry components, requiring a huge amount of energy to perform massive numbers of simultaneous calculations.
Two Intel Xeon CPUs: These powerful processors manage the entire system and use a notable amount of power to coordinate the high-speed flow of data to the GPUs.
NVLink and NVSwitch Interconnects: This internal high-speed data highway consumes considerable power to allow the GPUs to communicate with each other without bottlenecks.
Supporting Hardware: Memory, storage drives, network cards, and cooling fans all add to the cumulative power load.
The power consumption of an NVIDIA DGX H200 is not a single, constant number but a range that depends on its workload. According to official specifications, the system has a maximum Thermal Design Power (TDP) of 10.2 kW, or 10,200 watts. A system’s TDP rating is a measure of the maximum heat it is expected to generate, which closely corresponds to its maximum power consumption.
Key details about its power usage include:
Maximum Draw: The 10.2 kW figure is the absolute peak power draw. The system reaches this level only during the most demanding phases of AI model training or complex scientific simulations.
Operational Range: During less intensive tasks or idle periods, the DGX H200 will use significantly less power than its maximum rating.
Comparative Context: This power footprint is similar to its predecessor, the DGX H100, which demonstrates that NVIDIA has increased performance within a similar power envelope. For perspective, a single DGX H200 can consume as much power as an entire rack filled with conventional enterprise servers.
The DGX H200’s high power consumption of 10.2 kW has major implications for data centre infrastructure and requires careful planning. A standard wall outlet cannot be used. The system requires high-voltage, dedicated power circuits, typically 200–240 volts, delivered using a three-phase power configuration, which is standard in data centres for efficient and safe power delivery.
Specific electrical requirements include:
Current Draw: Based on a standard 208V three-phase system, the DGX H200 will draw approximately 28.3 Amps per phase at its maximum load.
Circuit Sizing: This current draw is about 57% of the nominal rating of a standard 50A/208V three-phase Power Distribution Unit (PDU) circuit. Note that electrical codes typically limit continuous loads to 80% of a breaker's rating, so against the 40A of usable continuous capacity the draw is closer to 71%.
Connectors: The system must be connected to a specialised PDU designed for three-phase power, using high-output, heavy-duty connectors, most commonly C19 outlets and corresponding cables capable of safely handling the high current.
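The current figures above follow from the standard formula for a balanced three-phase load, I = P / (√3 × V × PF). A minimal sketch of the arithmetic (assuming a power factor of roughly 1.0, which is typical for modern server power supplies but not stated in the source):

```python
import math

def three_phase_current(power_w: float, line_voltage_v: float,
                        power_factor: float = 1.0) -> float:
    """Line current in amps per phase for a balanced three-phase load:
    I = P / (sqrt(3) * V_line-to-line * PF)."""
    return power_w / (math.sqrt(3) * line_voltage_v * power_factor)

# DGX H200 at its 10.2 kW maximum TDP on a 208 V three-phase feed
amps = three_phase_current(10_200, 208)
utilisation = amps / 50  # share of a 50 A PDU circuit's nominal rating

print(f"{amps:.1f} A per phase, {utilisation:.0%} of a 50 A circuit")
# → 28.3 A per phase, 57% of a 50 A circuit
```

Swapping in a different line voltage (e.g. 240 V) or a sub-unity power factor shows how quickly circuit headroom changes, which is why PDU sizing is done against the maximum TDP rather than typical draw.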
Experts in data centre infrastructure note that supporting such high-density hardware is one of the biggest challenges facing modern data centres.
Managing the heat generated by a DGX H200 is as critical as supplying it with power. The system’s power consumption of 10.2 kW is directly converted into 10.2 kW of heat that must be continuously removed to prevent overheating and system failure.
Key cooling considerations are:
Air Cooling Challenges: Traditional air cooling is challenged by this level of heat density. A standard server rack with just a few DGX H200 systems could generate over 40 kW of heat, which is far beyond what typical room air conditioning can manage. If air-cooled, the system requires a massive and constant flow of cool air in a hot aisle/cold aisle containment setup.
Liquid Cooling Recommendation: Liquid cooling is the strongly recommended modern solution for efficiently cooling the DGX H200.
Liquid Cooling Methods:
Direct-to-Chip (D2C): This highly effective method involves attaching cold plates directly to the hottest components (GPUs and CPUs), where a liquid coolant absorbs heat far more efficiently than air.
Rear Door Heat Exchangers (RDHx): This involves a radiator unit attached to the back of the server rack that uses coolant-filled coils to capture heat from the server exhaust before it enters the data room.
Adopting liquid cooling is a major infrastructure decision that requires a significant investment in specialised plumbing, pumps, and external chillers. Data centre experts see this shift as essential for supporting the next generation of high-performance hardware.
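To put the liquid-cooling requirement in concrete terms, the basic heat-transfer relation Q = ṁ × cp × ΔT gives the coolant flow needed to carry away the system's full heat load. A rough sketch, assuming water as the coolant and a 10 °C temperature rise across the loop (both illustrative values, not figures from the source):

```python
def coolant_flow_lpm(heat_w: float, delta_t_c: float,
                     cp_j_per_kg_k: float = 4186.0,
                     density_kg_per_l: float = 1.0) -> float:
    """Volumetric coolant flow (litres/min) needed to remove heat_w watts
    at a given coolant temperature rise, from Q = m_dot * cp * delta_T."""
    mass_flow_kg_s = heat_w / (cp_j_per_kg_k * delta_t_c)
    return mass_flow_kg_s / density_kg_per_l * 60

# Full 10.2 kW heat load, 10 C rise across a direct-to-chip loop
print(f"{coolant_flow_lpm(10_200, 10):.1f} L/min")
# → 14.6 L/min
```

A flow on the order of 15 L/min per system is modest for purpose-built plumbing but far beyond what any air-based system can match in heat capacity per unit volume, which is the core argument for direct-to-chip cooling at this density.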
While a single DGX H200 is powerful, its true potential is realised when multiple systems are clustered together, as in NVIDIA’s DGX SuperPOD architecture, which integrates many units into a single massive computing cluster. Scaling up to a full rack creates significant infrastructure challenges that require holistic planning.
The key considerations for a full rack include:
Immense Power and Heat Density: A single rack containing eight DGX H200 units would have a combined potential power draw and heat output of approximately 82 kW. This extreme density creates a major challenge for power distribution and cooling systems, which must be specifically designed to handle such a concentrated load.
Holistic Planning: Data centre planners must account for more than just the DGX H200 systems themselves. The infrastructure plan must also include the power and cooling needs of all supporting components, such as high-speed networking switches (e.g., NVIDIA Quantum-2 InfiniBand), storage nodes, and control plane servers that are essential parts of a complete AI supercomputing pod.
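The rack-level budgeting described above is simple aggregation, but writing it out makes the planning exercise explicit. A sketch, where the ~6 kW allowance for switches, storage, and control-plane servers is purely illustrative (the source gives no overhead figure):

```python
def rack_load_kw(systems: int, per_system_kw: float = 10.2,
                 overhead_kw: float = 0.0) -> float:
    """Total rack power budget in kW: IT load plus networking/storage
    overhead. At steady state, heat output to remove equals power drawn."""
    return systems * per_system_kw + overhead_kw

# Eight DGX H200 units alone:
print(f"{rack_load_kw(8):.1f} kW")                    # → 81.6 kW

# With a hypothetical ~6 kW for InfiniBand switches and storage nodes:
print(f"{rack_load_kw(8, overhead_kw=6.0):.1f} kW")   # → 87.6 kW
```

Because every kilowatt drawn must also be removed as heat, the same number sizes both the power distribution and the cooling plant for the rack.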
While the NVIDIA DGX H200 has a high absolute power draw, it is engineered for exceptional efficiency, which is measured in “performance per watt”. This means it completes a far greater amount of AI computational work for every kilowatt-hour of electricity consumed compared to older or less specialised systems, offering a much higher return on energy investment.
The efficiency is demonstrated by:
Theoretical Peak Efficiency: At its peak, the system can deliver approximately 31.6 petaFLOPS of FP8 performance while drawing 10.2 kW, yielding a theoretical peak efficiency of around 3.1 TFLOPS per watt.
Real-World Efficiency Gains: In industry-standard AI training benchmarks, the DGX H200 demonstrates a roughly 2x generational efficiency gain over its predecessor, the DGX H100: it completes the same training tasks in less time and with less total energy, delivering about twice the AI computational work per watt.
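The performance-per-watt figure is a straightforward ratio of peak throughput to peak power draw, with unit conversion from petaFLOPS and kilowatts:

```python
def perf_per_watt(pflops: float, power_kw: float) -> float:
    """Efficiency in TFLOPS per watt:
    (PFLOPS * 1000 TFLOPS/PFLOPS) / (kW * 1000 W/kW)."""
    return (pflops * 1000) / (power_kw * 1000)

# DGX H200 peak FP8 throughput against its maximum TDP
print(f"{perf_per_watt(31.6, 10.2):.1f} TFLOPS/W")
# → 3.1 TFLOPS/W
```

This is a theoretical ceiling; sustained real-world efficiency depends on how well a given workload keeps the GPUs saturated.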
For its target audience, the high power consumption of the DGX H200 is considered a necessary and worthwhile investment when evaluated against the immense value it delivers. The justification lies in its efficiency and the overall Total Cost of Ownership (TCO), which includes not only the hardware and electricity costs but also the value of accelerated performance.
Here is a breakdown of why it is considered worth it:
Accelerated Time-to-Result: The system’s unparalleled speed drastically reduces the time needed to train AI models from weeks or months down to days or hours. This acceleration saves on labour costs and shortens the time-to-market for new products, which can justify the power expenditure.
Total Cost of Ownership (TCO): An estimated TCO breakdown includes:
Initial Hardware Cost: Approximately $200,000 – $250,000+.
Estimated Annual Operational Cost: Around $17,400 – $20,100, which includes costs for power and cooling.
Target Audience: The DGX H200 is a specialised tool for large enterprises and premier research institutions focused on building foundational AI models or tackling major scientific challenges. For these users, achieving results faster than anyone else is the ultimate priority, making the power cost a justifiable part of the investment.