Lin Hsin Hsin Artificial Intelligence Center





Energy Consumption based on
№ of Parameters in AI Models

by Lin Hsin Hsin















Energy Consumption During Training

Training a large language model (LLM) involves two major energy-consuming phases:

▶ Forward propagation
◀ Backpropagation




Factors to Consider:

📍 Number of Parameters: 1 Trillion (1 x 10^12)
📍 Operations per Parameter: 2 FLOPs per parameter
📍 Training Steps: 1 Million steps
📍 Batch Size: 2048
📍 Hardware: Nvidia A100 (300 W)





Calculate the Total Operations

Each parameter is assumed to require 2 FLOPs per training example, covering both the forward and backward passes.

FLOPs per Training Step:

📍 FLOPs per example: 2 x 10^12 FLOPs

📍 FLOPs per batch: 2 x 10^12 x 2048 = 4.096 x 10^15 FLOPs

📍 Total FLOPs: 4.096 x 10^15 x 10^6 = 4.096 x 10^21 FLOPs
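
The arithmetic above can be reproduced with a short Python sketch. The variable names are illustrative and not part of the original slides; the values are the ones listed under Factors to Consider, including the assumed 2 FLOPs per parameter.

```python
# Illustrative names; the values are the assumptions stated in the slides.
num_parameters = 10**12      # 1 trillion parameters
flops_per_parameter = 2      # assumed FLOPs per parameter per training example
batch_size = 2048            # examples per training step
training_steps = 10**6       # 1 million steps

# FLOPs for one example, one batch (one step), and the whole training run.
flops_per_example = flops_per_parameter * num_parameters
flops_per_batch = flops_per_example * batch_size
total_flops = flops_per_batch * training_steps

print(f"FLOPs per example: {flops_per_example:.3e}")
print(f"FLOPs per batch:   {flops_per_batch:.3e}")
print(f"Total FLOPs:       {total_flops:.3e}")
```

Changing any one of the four inputs scales the total linearly, which makes the sketch convenient for trying other model sizes or batch sizes.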


Power Consumption During Training

The energy consumed during training depends on the total training time and the hardware's power draw.

Energy for Training:

📍 Power: 300 W (for Nvidia A100 GPU)

📍 Training Time: 1 month (720 hours)

📍 Energy = 300 W x 720 hours = 216,000 watt-hours = 216 kWh
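
As a minimal sketch, the training-energy estimate is power multiplied by time, then converted from watt-hours to kilowatt-hours. The variable names are illustrative; the 300 W and 720 hour figures come from the slide, and the calculation covers a single GPU only.

```python
# Illustrative names; figures are the slide's assumptions for one GPU.
gpu_power_watts = 300        # Nvidia A100 power figure used in the slides
training_hours = 720         # 1 month, taken as 30 days x 24 hours

energy_wh = gpu_power_watts * training_hours   # watt-hours
energy_kwh = energy_wh / 1000                  # kilowatt-hours

print(f"Training energy: {energy_wh:,} Wh = {energy_kwh:g} kWh")
```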


Energy Consumption During Inference

Inference uses far less energy per query than training, but it still requires substantial computation.

Energy for Inference:

📍 Power: 100 W (for Nvidia A100 GPU during inference)

📍 Energy per query: 100 W x 1 second = 100 J ≈ 0.0278 Wh


📍 Total energy for 10 million queries: 0.0278 Wh x 10^7 = 278 kWh
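
The inference figures can be checked the same way. This sketch assumes, as the slide does, 100 W of power draw and 1 second per query; the variable names are illustrative.

```python
# Illustrative names; figures are the slide's assumptions.
inference_power_watts = 100  # assumed A100 power draw during inference
seconds_per_query = 1        # assumed time to serve one query
num_queries = 10**7          # 10 million queries

# 1 Wh = 3600 J, so divide watt-seconds (joules) by 3600.
energy_per_query_wh = inference_power_watts * seconds_per_query / 3600
total_energy_kwh = energy_per_query_wh * num_queries / 1000

print(f"Energy per query: {energy_per_query_wh:.4f} Wh")
print(f"Total energy:     {total_energy_kwh:.0f} kWh")
```

The unrounded total is slightly below 278 kWh; the slide's figure reflects rounding the per-query energy to 0.0278 Wh.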


Total Energy Consumption

Total Energy Consumption:

📍 Training Energy: 216 kWh (1 A100 GPU for 1 month)
📍 Inference Energy: 278 kWh (for 10 million queries)
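
The two line items can be combined into a single figure. The combined total below is not stated in the slides; it is simply the sum of the two estimates above, and the variable names are illustrative.

```python
# Illustrative names; inputs are the two estimates from the slides.
training_kwh = 216           # 1 A100 GPU for 1 month
inference_kwh = 278          # 10 million queries

combined_kwh = training_kwh + inference_kwh
inference_share = inference_kwh / combined_kwh

print(f"Combined energy: {combined_kwh} kWh")
print(f"Inference share: {inference_share:.1%}")
```

Under these assumptions, serving 10 million queries accounts for slightly more than half of the combined energy, even though each individual query is cheap.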