Week 4 - Power Analysis Attacks

2026-03-01 | Lecture, Cryptography, Power Analysis, Side Channel

Download the lecture slides for this week here: COMP6420_2026T1_Week4_Power_Attacks.pdf

COMP6420 Week 4 – Power Analysis Attacks

1. Real circuits leak information

Logic gates are built from transistors, and real circuits are not “ideal.” Computation takes time (propagation delay) and consumes power, and those physical side effects can leak information about what the circuit is doing.

Side channels are observable by-products of an implementation, not the algorithm itself. Common side channels include power, timing, and electromagnetic (EM) emissions. The core question is whether those observations reveal anything about secret data (especially cryptographic keys).

2. Timing is a classic side channel, but symmetric crypto aims to be constant-time

A simple timing leak comes from early-exit comparisons (e.g., password checks that return as soon as the first mismatch occurs). An attacker who can measure runtime can infer how many characters were guessed correctly.
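As a minimal sketch of the early-exit leak described above (the function names and the sample secret are hypothetical, and the count of compared bytes stands in for measured runtime), compare a naive check with Python's standard-library constant-time comparison:

```python
import hmac

def leaky_equals(secret, guess):
    """Early-exit comparison. Also returns how many bytes were compared,
    which stands in for the runtime an attacker would measure."""
    compared = 0
    if len(secret) != len(guess):
        return False, compared
    for s, g in zip(secret, guess):
        compared += 1
        if s != g:
            return False, compared  # returns at the first mismatch
    return True, compared

def constant_time_equals(secret, guess):
    # hmac.compare_digest is designed not to short-circuit on the first mismatch.
    return hmac.compare_digest(secret, guess)

secret = b"hunter2"  # hypothetical secret for illustration
print(leaky_equals(secret, b"hxxxxxx"))  # (False, 2)
print(leaky_equals(secret, b"hunxxxx"))  # (False, 4)
print(constant_time_equals(secret, b"hunxxxx"))  # False
```

The growing count in the leaky version is what lets an attacker confirm one character at a time.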

For symmetric ciphers like AES/DES, straightforward timing attacks usually don’t apply in the same way because the algorithms perform the same sequence of operations regardless of the key, and implementations usually aim to run in constant time. That pushes attention to a different side channel: power.

3. Why power reveals data

IC power consumption has two major components:

Static power (always present), and
Dynamic/switching power (changes with activity).

Dynamic power is the useful leakage source: switching a signal (0→1 or 1→0) charges/discharges capacitances and draws current. As a result, power becomes a function of computation and data transitions.

In practice, power traces can show structure like algorithm “rounds” (e.g., repeated patterns for DES/AES), even when the key itself is not visually obvious.

4. Capturing power traces: the side-channel measurement setup

A typical measurement setup includes:

a target embedded device performing encryption with a fixed secret key,
chosen inputs (plaintexts or ciphertexts),
a current-sense method (e.g., shunt/current sensor),
and an oscilloscope/DAQ capturing a power trace over time.

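Each trace is paired with the corresponding known input/output data. Statistical attacks such as DPA and CPA need many traces, because the data-dependent signal only emerges from the noise once enough measurements are combined.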

5. Simple Power Analysis (SPA)

SPA attempts to interpret a single power trace directly, visually, by linking spikes or patterns to operations.

SPA can work when:

the device is simple,
the correlation between operations and power is strong,
and operations differ clearly (classic example: RSA “square” vs “multiply” having different power signatures).
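To illustrate the RSA example, here is a small sketch of left-to-right square-and-multiply that logs which operation runs at each step. The log is an idealised stand-in for what a clean power trace might show; the function names and the toy numbers are assumptions for illustration only.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation (exponent >= 1) that also logs
    the operation sequence: 'S' for square, 'M' for multiply."""
    result = 1
    ops = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        ops.append("S")
        if bit == "1":  # the multiply only happens for 1 bits of the secret
            result = (result * base) % modulus
            ops.append("M")
    return result, "".join(ops)

def exponent_from_ops(ops):
    """Read the exponent back from the log: 'SM' is a 1 bit, a lone 'S' is a 0 bit."""
    return int(ops.replace("SM", "1").replace("S", "0"), 2)

value, ops = square_and_multiply(7, 0b101101, 1009)
print(ops)                          # SMSSMSMSSM
print(bin(exponent_from_ops(ops)))  # 0b101101
```

If squares and multiplies are distinguishable in a single trace, the secret exponent can be read off directly, which is the situation SPA exploits.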

SPA often fails in realistic systems because:

modern chips have many components sharing the same power rails,
the signal gets noisy,
and simple countermeasures (e.g., dummy operations) can blur instruction-dependent patterns.

For AES, single-trace interpretation can show round boundaries, but pulling out a full 128-bit round key just by “looking” is not straightforward.

6. Differential Power Analysis (DPA): statistics over many traces

DPA is a much stronger approach that uses statistics across many traces to distinguish correct key guesses from incorrect ones. It relies on data dependency, not instruction dependency, which makes it harder to defeat with simple “add dummy instruction” tricks.

Core workflow

Collect many power traces with corresponding plaintexts or ciphertexts.
Make a key guess (usually for a small subkey, e.g., 6–8 bits or 1 byte).
Using the known text + key guess, compute a predicted intermediate value (commonly around S-boxes).
Partition traces into two groups based on one selected bit of that predicted intermediate value (the selection bit).
Compute the difference of means between the two group averages at each time sample.
A wrong guess produces near-zero differences; a correct guess produces a noticeable spike where leakage occurs.
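The workflow above can be sketched on simulated data. Everything in this example is an assumption made for illustration: the traces are synthetic (Gaussian noise plus the Hamming weight of an AES S-box output at one sample), and the key byte 0x3C, the trace length, the leak position and the helper names are all hypothetical. The S-box is generated rather than typed in to keep the block short.

```python
import random

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def build_aes_sbox():
    """Generate the AES S-box: GF(2^8) inverse followed by the affine map."""
    sbox = [0] * 256
    p = q = 1
    while True:
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF  # p <- p * 3
        q ^= (q << 1) & 0xFF                                   # q <- q / 3
        q ^= (q << 2) & 0xFF
        q ^= (q << 4) & 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63  # zero has no inverse and is handled separately
    return sbox

SBOX = build_aes_sbox()
TRACE_LEN = 4    # assumed (tiny) trace length
LEAK_SAMPLE = 2  # assumed position of the S-box leakage within each trace

def simulate_traces(key_byte, n, rng):
    """Synthetic traces: unit Gaussian noise plus a Hamming-weight leak."""
    plaintexts, traces = [], []
    for _ in range(n):
        p = rng.randrange(256)
        trace = [rng.gauss(0.0, 1.0) for _ in range(TRACE_LEN)]
        trace[LEAK_SAMPLE] += bin(SBOX[p ^ key_byte]).count("1")
        plaintexts.append(p)
        traces.append(trace)
    return plaintexts, traces

def dpa_peak(plaintexts, traces, guess, bit=0):
    """Largest absolute difference of means over all samples for one key guess."""
    sums = [[0.0] * TRACE_LEN, [0.0] * TRACE_LEN]
    counts = [0, 0]
    for p, trace in zip(plaintexts, traces):
        group = (SBOX[p ^ guess] >> bit) & 1  # selection bit of the predicted value
        counts[group] += 1
        for t, sample in enumerate(trace):
            sums[group][t] += sample
    means = [[s / max(counts[g], 1) for s in sums[g]] for g in (0, 1)]
    return max(abs(means[1][t] - means[0][t]) for t in range(TRACE_LEN))

rng = random.Random(1)
pts, trs = simulate_traces(0x3C, 2000, rng)
best_guess = max(range(256), key=lambda g: dpa_peak(pts, trs, g))
print(hex(best_guess))
```

Real traces are longer, noisier and need alignment first, so trace counts and peak heights from this toy model should not be read as realistic figures.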

Why S-boxes are targeted

S-boxes are large combinational blocks with strong switching activity, so their internal transitions often create relatively strong leakage. Leakage is also easier to align with register updates than with signals in the middle of combinational logic, so implementations that store intermediate values can unintentionally help the attacker.

Why this isn’t “just brute force”

DPA is a divide-and-conquer strategy: instead of brute forcing the entire key space, it recovers small subkeys independently.

For DES: instead of brute-forcing 2^56 keys, guess 6-bit subkeys (per S-box) and repeat across S-boxes.
For AES-128: instead of brute-forcing 2^128, recover 8-bit subkeys (bytes) and repeat 16 times.
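The size of the saving is easy to check with a few lines of arithmetic. This is only a count of key guesses; it ignores the number of traces needed, and for DES the eight 6-bit subkeys cover the 48-bit round key, so the remaining key bits still have to be found some other way.

```python
# Number of key guesses under divide-and-conquer versus exhaustive search.
des_guesses = 8 * 2**6   # eight S-boxes, one 6-bit subkey each
aes_guesses = 16 * 2**8  # sixteen key bytes, one 8-bit subkey each

print(des_guesses, "guesses instead of", 2**56)   # 512 guesses instead of 72057594037927936
print(aes_guesses, "guesses instead of", 2**128)
```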

7. Correlation Power Analysis (CPA): using a leakage model

Bit-based DPA can be weak because a single bit gives limited signal. CPA improves this by correlating measured power with a power model that uses multiple bits of predicted leakage. The distinguisher is commonly the Pearson correlation coefficient: the correct key guess yields the strongest correlation.

Power models: Hamming Weight (HW) and Hamming Distance (HD)

Two widely used models link data to switching power:

Hamming Weight (HW): approximate power ∝ number of 1 bits in a value.
Useful but often corresponds more to “state” than switching, so it may be weaker when dynamic power dominates.
Hamming Distance (HD): approximate switching power ∝ number of bits that change between an “old” value and a “new” value.
Example: HD(00110010, 00100011) = 2.

HD is dimensionless: it won’t predict an absolute power reading, but it can rank or compare predicted leakage across many traces, which is exactly what correlation-based attacks need.

To compute HD you need both the previous and the next value of a target (often a register). That’s why resets and well-defined register updates matter: e.g., a register that starts at all-zeros and then loads 0xC0DE has an HD equal to the number of 1s in the loaded value (8 in this case).
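Both models are one-liners in code. The sketch below (function names are illustrative) reproduces the two examples from this section:

```python
def hamming_weight(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")

def hamming_distance(old, new):
    """Number of bit positions that differ between two values."""
    return hamming_weight(old ^ new)

print(hamming_distance(0b00110010, 0b00100011))  # 2
print(hamming_distance(0x0000, 0xC0DE))          # 8, same as hamming_weight(0xC0DE)
```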

CPA on AES (typical target)

A common strategy is byte-wise recovery of a round key (often RK0 or a last-round key), by:

guessing one key byte at a time,
computing a predicted intermediate (e.g., an S-box output or S-box-adjacent value),
converting that prediction into a HW/HD leakage estimate for each trace,
correlating the predicted leakage vector against measured power samples,
selecting the key guess with highest correlation, and repeating for all bytes.
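The steps above can be sketched for a single key byte on simulated data. This is not the lab's reference code: the traces are synthetic (Gaussian noise plus the Hamming weight of the first-round S-box output, i.e. a register assumed to load from all-zeros), and the key byte 0xA7, the noise level, the trace shape and all helper names are assumptions for illustration.

```python
import math
import random

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def build_aes_sbox():
    """Generate the AES S-box: GF(2^8) inverse followed by the affine map."""
    sbox = [0] * 256
    p = q = 1
    while True:
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF  # p <- p * 3
        q ^= (q << 1) & 0xFF                                   # q <- q / 3
        q ^= (q << 2) & 0xFF
        q ^= (q << 4) & 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63  # zero has no inverse and is handled separately
    return sbox

SBOX = build_aes_sbox()

def hw(value):
    return bin(value).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def simulate(key_byte, n_traces, n_samples, leak_at, rng):
    """Synthetic traces: Gaussian noise plus a Hamming-weight leak at one sample."""
    plaintexts, traces = [], []
    for _ in range(n_traces):
        p = rng.randrange(256)
        trace = [rng.gauss(0.0, 1.5) for _ in range(n_samples)]
        trace[leak_at] += hw(SBOX[p ^ key_byte])
        plaintexts.append(p)
        traces.append(trace)
    return plaintexts, traces

def cpa_byte(plaintexts, traces):
    """Score every guess by its best absolute correlation over all time samples."""
    columns = list(zip(*traces))  # one tuple of measurements per time sample
    scores = []
    for guess in range(256):
        model = [hw(SBOX[p ^ guess]) for p in plaintexts]  # predicted leakage
        scores.append(max(abs(pearson(model, col)) for col in columns))
    best = max(range(256), key=scores.__getitem__)
    return best, scores

rng = random.Random(2026)
pts, trs = simulate(0xA7, 400, 4, 1, rng)
best, scores = cpa_byte(pts, trs)
print(hex(best), round(scores[best], 2))
```

Recovering a full AES-128 round key repeats `cpa_byte` once per key byte, with the model fed by the matching plaintext (or ciphertext) byte. On real hardware an HD model between the old and new register contents may fit better than HW.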

32-bit AES implementations can be especially convenient targets if they store intermediate values at clear boundaries, simplifying alignment and modelling.

8. Threat model for power attacks

A typical baseline assumes a passive, non-invasive attacker measuring power externally. Attacks are often grouped as:

Horizontal: single-trace style (SPA).
Vertical: multi-trace style (DPA/CPA).

Attackers may also be:

profiling (training on an identical device first), or
non-profiling (using generic models like HW/HD).

9. Defenses: broad categories and trade-offs

Power-analysis countermeasures fall into four broad families:

Detection: try to detect measurement attempts and respond (e.g., flush secrets).
Hiding: reduce or conceal data-dependent power variations.
Masking: randomise intermediate values so leakage is statistically independent of secrets.
Key management: change keys frequently so attackers can’t accumulate enough usable traces.

9.1 Detection

Power/voltage sensors or impedance monitoring can detect probing/measurement conditions (e.g., unusual analogue behaviour on the power delivery network). These approaches can be costly (area/power), complex, and probabilistic (detection is not guaranteed).

9.2 Hiding via balancing (power equalisation)

Dual-rail/precharge styles encode each bit with two wires (q and ¬q) and use a two-phase clock (precharge + evaluation) to reduce data-dependent variation. Downsides include major overheads: often ~2× area, ~2× power, and complex routing constraints to preserve balance and reduce glitches.
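At the level of the Hamming-weight model, the balancing idea can be checked in a few lines: if every bit is stored together with its complement, the total number of 1s no longer depends on the data. This sketch (with a hypothetical `dual_rail_encode` helper) only illustrates the encoding; it says nothing about routing balance or glitches, which are what make real dual-rail designs hard.

```python
def dual_rail_encode(value, width=8):
    """Return the (q, not-q) rail pair for a value of the given bit width."""
    mask = (1 << width) - 1
    return value & mask, ~value & mask

# Combined Hamming weight of both rails, collected over every 8-bit value.
rail_weights = {
    bin(q).count("1") + bin(nq).count("1")
    for q, nq in map(dual_rail_encode, range(256))
}
print(rail_weights)  # {8}
```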

9.3 Hiding via “electrickery”

Analogue techniques can smooth or compensate instantaneous current draws (e.g., adjustable current sources, shunt feedback loops, passive filtering with capacitors/inductors). On-chip deployment may be expensive or impractical, while PCB-level approaches can sometimes be more feasible.

9.4 Masking and randomisation techniques

Several common approaches aim to break correlation/alignment:

Noise injection: run extra circuitry (e.g., ring oscillators, hash engines) to add (ideally data-independent) switching activity.
Clock/voltage randomisation: vary the clock (skip pulses, multi-phase), use DVFS-like randomisation, and disrupt trace alignment.
Shuffling / time hiding: permute independent operations or add dummy operations/no-ops to misalign traces (works best when true data-independent reordering exists).
Random data manipulation: keep only masked intermediates internally so observed leakage corresponds to randomised values rather than true sensitive values.
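A minimal sketch of the last idea, first-order Boolean masking, is shown below for a linear step (XOR with a key byte). The function names are hypothetical. Non-linear steps such as S-boxes need mask-aware tables or masked gadgets, which this example does not cover.

```python
import secrets

def masked_add_round_key(plaintext_byte, key_byte):
    """XOR a key byte into a masked input; the unmasked value is never formed."""
    mask = secrets.randbelow(256)           # fresh random mask for each execution
    masked_input = plaintext_byte ^ mask    # the device only ever holds this share
    masked_state = masked_input ^ key_byte  # XOR is linear, so the mask carries through
    return masked_state, mask

def unmask(masked_state, mask):
    return masked_state ^ mask

state, mask = masked_add_round_key(0x42, 0xA7)
print(hex(unmask(state, mask)))  # 0xe5, i.e. 0x42 ^ 0xA7

# For a fixed sensitive value, the masked value covers all 256 byte values as the
# mask varies, so a first-order HW/HD prediction made from the true value no
# longer matches what the register actually holds.
masked_values = {0xE5 ^ m for m in range(256)}
print(len(masked_values))  # 256
```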

All defenses have costs (area, power, performance, design complexity), and combining defenses is common for high-value secrets.

10. Week 4 assessment context: Lab 2 (CPA on AES)

Lab 2 focuses on performing CPA against a 32-bit AES implementation. Intermediate value storage can be “safe” from a functional perspective, but it creates an opening for power analysis because it improves alignment/modelling.

Two generic reference implementations are provided (an SPI-based version and a no-IO version). Each student receives their own binaries, and both variants of a student's binaries share the same unique key. The no-IO version can generate large numbers of traces quickly (inputs driven internally, e.g., via an LFSR), and the SPI version can be used to validate recovered key bytes.

A typical workflow is:

Understand the implementations,
collect many power traces,
run CPA byte-by-byte using a HW/HD power model,
recover the key,
optionally validate via the SPI version.