Energy Consumption based on № of Parameters in AI Models by Lin Hsin Hsin -- Founder of FIRST VIRTUAL MUSEUM in the WORLD -- 31st Anniversary of LIN HSIN HSIN ART MUSEUM -- Digital Art Museum, First Virtual Museum in the World - 1994. Wikipedia, Digital Media Center: Technology, Digital Art, Digital Paintings, Digital Sculptures, Digital Music, Digital Musical Instruments, Sound, Animated Music, Web-enabled, Interactive, Digital Media Pioneer

Lin Hsin Hsin Artificial Intelligence Center

Energy Consumption based on
№ of Parameters in AI Models
by Lin Hsin Hsin

Energy Consumption During Training

Training a large language model (LLM) involves two major energy-consuming phases:

▶ Forward propagation
◀ Backpropagation

Factors to Consider:

📍 Number of Parameters: 1 Trillion (1 x 10¹²)
📍 Operations per Parameter: 2 FLOPs per parameter
📍 Training Steps: 1 Million steps
📍 Batch Size: 2048
📍 Hardware: Nvidia A100 (300 W)

Calculate the Total Operations

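Assume each parameter requires 2 FLOPs per training example, covering both the forward and backward passes. This is a deliberately simple figure: a widely used rule of thumb is about 6 FLOPs per parameter per token (2 for the forward pass, 4 for the backward pass), so the totals below are best read as a lower bound.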

FLOPs per Training Step:

📍 FLOPs per example: 2 x 10¹² FLOPs

📍 FLOPs per batch: 2 x 10¹² x 2048 = 4.096 x 10¹⁵ FLOPs

📍 Total FLOPs: 4.096 x 10¹⁵ x 10⁶ = 4.096 x 10²¹ FLOPs
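The arithmetic above can be reproduced with a short script. This is a minimal sketch using only the figures listed under "Factors to Consider"; the function name `training_flops` is hypothetical and chosen for illustration.

```python
def training_flops(num_params, flops_per_param, batch_size, steps):
    """Return (per_example, per_batch, total) FLOPs for the simple model in the text."""
    per_example = num_params * flops_per_param   # FLOPs for one training example
    per_batch = per_example * batch_size         # FLOPs for one training step
    total = per_batch * steps                    # FLOPs for the whole run
    return per_example, per_batch, total


# Figures from the text: 1 trillion parameters, 2 FLOPs per parameter,
# batch size 2048, 1 million training steps.
per_example, per_batch, total = training_flops(
    num_params=10**12, flops_per_param=2, batch_size=2048, steps=10**6
)

print(f"FLOPs per example: {per_example:.3e}")
print(f"FLOPs per batch:   {per_batch:.3e}")
print(f"Total FLOPs:       {total:.3e}")
```

Changing `flops_per_param` is the quickest way to see how sensitive the total is to the per-parameter assumption.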

Power Consumption During Training

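Energy is power multiplied by time, so the energy consumed during training depends on how long training runs and how much power the hardware draws. The figures below assume a single GPU running continuously at its rated power.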

Energy for Training:

📍 Power: 300 W (for Nvidia A100 GPU)

📍 Training Time: 1 month (720 hours)

📍 Energy = 300 W x 720 hours = 216,000 watt-hours = 216 kWh
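The energy figure follows directly from power multiplied by time, as the sketch below shows. As a cross-check, a second helper estimates how long the total FLOP count computed earlier (4.096 x 10²¹) would take at a given throughput. The throughput value is an assumption not found in the original: 312 x 10¹² FLOP/s is the A100's published peak dense BF16 rate, and sustained rates are lower in practice. The function names are hypothetical.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kWh for a constant power draw over a given time."""
    return power_watts * hours / 1000.0


def hours_for_flops(total_flops: float, flops_per_second: float) -> float:
    """Hours needed to execute total_flops at a sustained throughput."""
    return total_flops / flops_per_second / 3600.0


# Figures from the text: one 300 W GPU running for 720 hours.
training_kwh = energy_kwh(power_watts=300, hours=720)
print(f"Training energy: {training_kwh:.0f} kWh")

# ASSUMPTION (not in the original): peak dense BF16 throughput of one A100.
ASSUMED_FLOPS_PER_SECOND = 312e12
implied_hours = hours_for_flops(4.096e21, ASSUMED_FLOPS_PER_SECOND)
print(f"Implied single-GPU time: {implied_hours:.0f} hours")
print(f"Energy at that duration: {energy_kwh(300, implied_hours):.0f} kWh")
```

Under that assumed throughput the run would need roughly five months of single-GPU time, so the 720-hour figure is best read as a fixed assumption rather than a consequence of the FLOP count.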

Energy Consumption During Inference

A single inference query uses far less energy than a training run, but the cost accumulates with query volume and can exceed the training energy once enough queries are served.

Energy for Inference:

📍 Power: 100 W (for Nvidia A100 GPU during inference)

📍 Energy per query: 100 W x 1 second = 100 J ≈ 0.0278 Wh (assuming 1 second of GPU time per query)

📍 Total energy for 10 million queries: 0.0278 Wh x 10⁷ = 278,000 Wh = 278 kWh
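The inference estimate can be sketched the same way. The 100 W draw and the 1 second per query are the text's own assumptions; the function name `inference_energy` is hypothetical.

```python
def inference_energy(power_watts, seconds_per_query, num_queries):
    """Return (Wh per query, total kWh) for a constant power draw per query."""
    joules_per_query = power_watts * seconds_per_query   # 1 W for 1 s = 1 J
    wh_per_query = joules_per_query / 3600.0             # 3600 J = 1 Wh
    total_kwh = wh_per_query * num_queries / 1000.0
    return wh_per_query, total_kwh


# Figures from the text: 100 W, 1 second per query, 10 million queries.
wh_per_query, total_kwh = inference_energy(
    power_watts=100, seconds_per_query=1, num_queries=10**7
)

print(f"Energy per query: {wh_per_query:.4f} Wh")
print(f"Total inference energy: {total_kwh:.0f} kWh")
```

Because the total scales linearly with both query time and query count, halving the time per query halves the inference energy.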

Total Energy Consumption

Total Energy Consumption:

📍 Training Energy: 216 kWh (1 A100 GPU for 1 month)
📍 Inference Energy: 278 kWh (for 10 million queries)
📍 Total: 216 kWh + 278 kWh = 494 kWh
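To put the two figures side by side, the sketch below sums them and estimates the query count at which cumulative inference energy matches the training energy. It uses only the numbers from the text (216 kWh, 278 kWh, 100 W for 1 second per query); the helper `break_even_queries` is a hypothetical name introduced for illustration.

```python
TRAINING_KWH = 216.0             # from the text: 300 W x 720 hours
INFERENCE_KWH = 278.0            # from the text: 10 million queries
WH_PER_QUERY = 100 * 1 / 3600.0  # from the text: 100 W for 1 second


def break_even_queries(training_kwh, wh_per_query):
    """Number of queries whose combined energy equals the training energy."""
    return training_kwh * 1000.0 / wh_per_query


total_kwh = TRAINING_KWH + INFERENCE_KWH
queries = break_even_queries(TRAINING_KWH, WH_PER_QUERY)

print(f"Total energy: {total_kwh:.0f} kWh")
print(f"Break-even point: {queries:,.0f} queries")
```

With these figures, inference overtakes training after roughly 7.8 million queries, which is why the 10 million query total exceeds the training figure.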