  EPFL > IC > LASEC > publications INDEX

CONTACT

EPFL - I&C - ISC - LASEC
Station 14 - Building INF
CH-1015 Lausanne
Switzerland

Tel. +41 21 693 7603
Fax. +41 21 693 7689

Introduction to Decorrelation Theory
On-Line Manual
In order to make the decorrelation theory more easily accessable, here is an easy introduction. This "decorrelation manual" will be subject to frequent updates. (It is mainly under construction at this time.) For any feedback, please email to
Serge Vaudenay < >

Last update: July 15th, 2002.

# 1. Goals

The goals of decorrelation theory is
1. to make the theory on k-wise independence usable in the block cipher theory
2. to provide handy tools for dealing with undistinguishability
3. to make formal security results feasible on block ciphers
4. to provide design guidelines for new algorithms.

For this we consider block ciphers as random permutations in a mathematical sense: the random choice of the secret key K induce a distribution on the permutation CK. In the following we will omit the reference to the key.

The goal of the designer of the block cipher is to make it "look like" a truly random permutation, i.e. like a random permutation C* with a uniform distribution among the set of permutations. For this we need tools to compare C and C*.

Intuitively, we can compare C and C* at different level of depth. Namely, if for any fixed plaintext x the corresponding ciphertext C(x) is uniformly distributed, we say that C and C* are similar "to the order 1". Similarly, if for any two different plaintexts x1 and x2 the corresponding ciphertext pair (C(x1),C(x2)) has a uniform distribution among all the pairs of different ciphertexts, we say that C and C* are similar "to the order 2". And so on. This intuitively defined the notion of order of decorrelation.

# 2. d-wise Distribution Matrices

For any random function F (a block cipher is a random function with the special property that it is always a permutation) and any integer d we associate a huge matrix [F]d with real number entries which is called the "d-wise distribution matrix".

Beginners do not really need to understand where this matrix come from. For the others, here is the formal definition.

Definition (d-wise distribution matrix). Let F be a random function and d be en integer. Any d inputs x1, ..., xd define a d-multi-input
x = (x1,...,xd).
Any d outputs y1, ..., yd define a d-multi-output
y = (y1,...,yd).
To any d-multi-input x and any d-multi-output y we associate the probability
[F]d(x,y) = Pr[ C(x1=y1), ..., C(xd=yd) ].
This way we define the d-wise distribution matrix [F]d in which each row corresponds to a d-multi-input and each colomn corresponds to a d-multi-output. In particular, the row of [F]d corresponding to the d-multi-input (x1,...,xd) defines the distribution of the random d-multi-output (C(x1),...,C(xd)).

The most important property we need to know is the following.

Interestingly, if C* denotes the ideal random permutation (which means it has a uniform distribution), we have

[C]d x [C*]d = [C* o C]d = [C*]d = [C o C*]d = [C*]d x [C]d
(Note that this does not hold for functions. Indeed if F is the constant function which is zero on all points, we have [F o F*]d = [F]d and [F* o F]d = [G]d where G is a random function such that G(x) = G(y) for any x and y and G(x) is uniformly distributed.) This implies the following property.
(To prove this, we just develop the right hand term and use Proposition 1.)

In the introduction was addressed the problem of comparing a cipher C to the ideal cipher C* "to a given order". Now we can be a little more precise. We will compare [C]d and [C*]d.

# 3. Decorrelation Distance and Decorrelation Bias

Assuming we are given a norm ||.|| on the matrices set, a straightforward way to compare the two matrices [F]d [G]d is to compute the distance
|| [F]d - [G]d ||
This is called the d-wise decorrelation distance between F and G according to the norm ||.||.

Random functions F must be compared to an ideal random function F* which has a uniform distribution. For this we define

DecFd||.||(F) = || [F]d - [F*]d ||
the d-wise decorrelation bias of function F.

Similarly, ciphers C must be compared to the ideal cipher C*, and we define

DecPd||.||(C) = || [C]d - [C*]d ||
the d-wise decorrelation bias of permutation C.

The question on the choice of the norm remains.
What we can already notice is that a decorrelation distance of zero do not depend on this choice, but holds for any norm. Thus we say that F and G have the same decorrelation of order d if their d-wise decorrelation distance is zero. Similarly, we say that a function F is perfectly decorrelated to the order d if its d-wise decorrelation bias is zero. We say that a cipher C is perfectly decorrelated to the order d if its d-wise decorrelation bias is zero.

We have an equivalent to the Shannon Theorem.

Hence we need a lot of random bits in order to construct perfectly decorrelated functions.

# 4. Matrix Norms and Decorrelation

A mapping from the set of matrices to the set of real numbers is a norm if
1. ||A|| = 0 holds if and only if A=0
2. ||u.A|| = |u|.||A|| for any real number u (|u| is the absolute value of u here)
3. ||A+B|| <= ||A|| + ||B||
In addition "matrix norms" are norms such that
1. ||AxB|| <= ||A||.||B||

If we measure decorrelation biases with matrix norms, from the property of matrix norms, the definitions and Proposition 2, we have the interesting property that

DecPd(C2oC1) <= DecPd(C1).DecPd(C2).
This means that the decorrelation bias of an iterated cipher decreases exponentially with the number of rounds (as long as the decorrelation of a few rounds is less than one), which is quite a satisfactory property. Note that this property does not hold for functions (despite what was claimed in [Sac99]).

# 5. The Four Problems of Decorrelation Theory

Decorrelation theory faces four complementary problems.
1. How to choose the matrix norm?
2. Does the decorrelation of a cipher provide protection?
3. How to build simple decorrelated primitives (decorrelation modules)?
4. How to use them in order to build decorrelated ciphers?

For the fourth problem, the strategy consists of considering any regular conservative designs, and plugging into it a few decorrelated primitives which are called decorrelation modules. If we choose the right notion of decorrelation (the right norm), the rights decorrelation modules and plug them in the right positions, we obtain a decorrelated cipher which may be protected against the right attacks.

# 6. On the Choice of the Matrix Norm

The Euclidean norm (the L2 norm) has first been considered. For any matrix A, its Euclidean norm ||A||2 is the square root of the sum of the square of all entries. It is not obvious to see it is a matrix norm, but it is indeed. One problem is that it leads with hard to manipulate quantities. (See Ref. [Sac98] for more details.)

A much more convenient norm is the |||.|||oo norm. For any matrix A, the norm |||A|||oo is the maximum over all rows of the sum of the absolute values of all entries:

|||A|||oo = maxx sumy |A(x,y)|.
The notation of this norm comes from the fact that it is precisely the matrix norm associated to the Loo vector-norm. More precisely we define
||V||oo = maxy |Vx|
for a vector and
|||A|||oo = maxV ||A.V||oo / ||V||oo.
This is the regular way for constructing a matrix norm from a vector norm.

Other interesting matrix norms are the ||.||a norm and the ||.||s norm defined by

||A||a = maxx1 sumy1 maxx2 sumy2 ... |A((x1,x2,...), (y1,y2,...))|
||A||s = maxb1,b2,... maxx1,0 sumx1,1 maxx2,0 sumx2,1 ... |A((x1,b1,x2,b2,...), (x1,1-b1,x2,1-b2,...))|
where b1, b2, ... are 0 or 1.

# 7. Norms and Undistinguishability

In the theory of randomness, we measure how much a function is pseudorandom by measuring the advantage of Turing tests. Actually, given two random functions F and G, we consider an oracle Turing machine AO which outputs 0 or 1 after playing with the oracle O. We measure the advantage for distinguishing F from G by
AdvA(F,G) = | Pr[AF=1] - Pr[AG=1] |.
Considering a class Cl of oracle Turing machines which output 0 or 1, we consider the best advantage

Here are important results which motivate the choice of the |||.|||oo, ||.||a and ||.||s matrix norms.

(See Ref. [Sac99] for more details.)

This formalism turns out to make trivial (from the matrix norm properties) the following intuitive results. If we consider r independent instances C1, ..., Cr of the same cipher C, we consider the iterated cipher Cro...oC1. We have

AdvCldna (Cro...oC1,C*) <= 2r-1. ( AdvCldna(C,C*) )r
AdvClda (Cro...oC1,C*) <= 2r-1. ( AdvClda(C,C*) )r
AdvClds (Cro...oC1,C*) <= 2r-1. ( AdvClds(C,C*) )r
The best advantage for distinguishing an iterated cipher from the ideal cipher C* decreases exponentially with the number of rounds r.

# 8. Decorrelation Modules

Inspired by what has been done based on the Carter-Wegman universal hashing theory (see Ref. [CW79], Ref. [WC81] and Ref. [CG89]), it is quite straightforward to propose decorrelation modules.
Here are a few decorrelation modules for the corresponding norms.

# 9. A few Decorrelated Cipher Constructions

Once we have a decorrelation module, we can plug it in regular conservative ciphers. In order to deduce an upper bound on the decorrelation bias of the cipher from an upper bound on the decorrelation bias of these modules, we use the following "meta-theorem".
Theorem 4 (Decorrelation modules to decorrelated ciphers). Let
C = Omega(F1,...,Fr,C1,...,Cs)
be a computation network which defines a cipher from r independent random functions F1,...,Fr and s random permutations C1,...,Cs. We consider an ideal version of this construction
C' = Omega(F*1,...,F*r, C*1,...,C*s)
in which the F*i and C*j are uniformly distributed functions or permutations. We assume that a computation of C requires ai computations of Fi for any i, and bj computations of Cj for any j. For any d we have
DecPd(C) <= DecPd(C') + DecFda1(F1) + ... + DecFdar(Fr) + DecPdb1(C1) + ... + DecPdbs(Cs)
for the |||.|||oo and ||.||a norms. Similarly, assuming that a computation of C or C-1 requires a'i computations of Fi for any i, and b'j computations of Cj or C-1j for any j, we have
DecPd||.||s(C) <= DecPd||.||s(C') + DecFda'1||.||a(F1) + ... + DecFda'r||.||a(Fr) + DecPdb'1||.||s(C1) + ... + DecPdb's||.||s(Cs).
(See
Ref. [Sac99] for more details.)
This theorem reduce the problem of estimating the decorrelation bias of C to the problem of estimating the decorrelation bias of C'.

It shall be outlined that it makes trivial the extension of Luby-Rackoff Theorem to pseudorandom functions instead of random ones: assuming that the advantage of any distinguisher which is limited to d chosen plaintext is at most epsilon against any round function Fi of a Feistel construction with at least three rounds, we obtain that any distinguisher which is limited to d chosen plaintext against the Feistel cipher is at most

3.epsilon + d2.2-m/2.
This result extends to chosen plaintext and ciphertext with 4 instead of 3.

We can use the following constructions.

1. (Coconut design)
We let C1 and C3 be any ciphers, and C2 be a decorrelation invertible module. We can use the
C = C3 o C2 o C1
construction. For the ideal version with C2 = C*, we obtain that DecPd(C') = 0. Therefore we have
DecPd(C) <= DecPd(C2)
for the |||.|||oo, ||.||a and ||.||s norms.

In Ref. [Stacs98] was propose the Coconut98 cipher with a perfect pairwise decorrelation with

C2(x) = ax + b in GF(2m)
shown as the NUT-II decorrelation module.

2. (Peanut design)
In Peanut ciphers, we use a regular r-round Feistel cipher with decorrelation modules F1, ..., Fr as round functions. With the standard notations we have
C = Psi(F1,...,Fr).
We recall that the Feistel scheme Psi is defined by
Psi(F1)(x,y) = ( x + F1(y) , y )
Psi(F1,...,Fr)(x,y) = Psi(F2,...,Fr)( y , x + F1(y) ).
The decorrelation bias of the corresponding ideal function C' is given by Luby-Rackoff Theorem (see Ref. [LR88]):
DecPd||.||a(C') <= 2.d2.2-m/2 for r >= 3
DecPd||.||s(C') <= 2.d2.2-m/2 for r >= 4.
Thus if we let epsilon denote the maximal decorrelation bias of the decorrelation modules we have
DecPd||.||a(C) <= ( 3.epsilon||.||a + 2.d2.2-m/2 )floor(r/3) for r >= 3
DecPd||.||s(C) <= ( 4.epsilon||.||s + 2.d2.2-m/2 )floor(r/4) for r >= 4.
(See Ref. [Sac99] for more details.)
In Ref. [Stacs98] was propose the Peanut98 cipher with pairwise decorrelation with
Fi(x) = g( ax + b mod 232+15 mod 232 ).

In Ref. [Aes98] (see also Ref. [Aes99]) was propose the DFC cipher with pairwise decorrelation with

Fi(x) = g( ax + b mod 264+13 mod 264 ).

These are the NUT-IV decorrelation module. In Ref. [Sac98] this construction was adapted to the L2 norm in order to propose the Peanut97 cipher with pairwise decorrelation with

Fi(x) = g( ax + b mod 232-5 )
which is the NUT-III decorrelation module.

3. (Walnut design)
We can also use the Lai-Massey Scheme which has been used in the IDEA cipher (see Ref. [Lai92] and Ref. [LM90]). More precisely we define
Lambdasigma(F1)(x,y) = ( x + F1(x-y) , y + F1(x-y) )
Lambdasigma(F1,...,Fr)(x,y) = Lambdasigma(F2,...,Fr)( sigma( x + F1(x-y) ) , y + F1(x-y) )
where sigma is a group alpha-orthomorphism. Similar bounds than for the Luby-Rackoff construction can be obtained. We need 3 rounds for pseudorandomness, and 4 rounds for super-pseudorandomness. In the Walnut ciphers, F1, ..., Fr are decorrelation modules.
(See Ref. [Asiacrypt99] for more details.)

Other constructions can be adapted. In the literature we can find many theorems of constructions in the ideal cases. For instance Ref. [ZMI89] investigates generalized Feistel schemes with more than two branches. Ref. [NR99] improves the Luby-Rackoff construction by using the same round functions several times. Ref. [PRS99] investigates a new construction, ...

Note on the "nut" names: the first aim of this name was to claim that decorrelation modules are cheap and basically cost nuts (see [Liens-97-3]). Here NUT stands for "N-Universal Transformation" (which reminds the Carter-Wegman paradigm).
COCONUT stands for "Cipher Organized with Cute Operations and NUT".
PEANUT stands for "Pretty Encryption Algorithm and NUT".
WALNUT stands for "Wonderful Algorithm with Light NUT".
All other nuts are welcome!

# 10. Differential and Linear Cryptanalysis

Differential cryptanalysis against r-round block ciphers usually starts from a simple distinguisher between a (r-i)-round block cipher and the perfect cipher with a small i (typically i = 1, 2, or 3). This distinguisher uses a fixed differential characteristic. It can be formalized as follows.
Differential distinguisher
1. iterate n times the following process
1. pick a random X
2. query C(X) and C(X XOR a)
3. if C(X XOR a) = C(X) XOR b then output 1, exit the loop, and stop
2. output 0 and stop
The capability of a distinguisher A to distinguish two block ciphers is measured by the advantage
Adv( C1 , C2 ) = Pr[ AC = 1 , C <- C1 ] - Pr[ AC = 1 , C <- C2 ]
where the probabilities hold over the X and C samples spaces.

Given a function f we define

DPf(a,b) = PrX[ f(X XOR a) = f(X) XOR b ]
and given a block cipher C (defined by a random key) we define
EDPC(a,b) = PrC,X[ C(X XOR a) = C(X) XOR b ]
The link between the advantage of the differential distinguisher and this quantity is given by the following result proven in
[Stacs98].
There is also a link between EDP and the decorrelation to the order 2. This leads to the following theorem. Other approach to guaranty low differential probabilities in block ciphers hold for a target group law like XOR. We can notice that the result here extends to differential attacks using any group law which is somewhat stronger.

There is a similar treatment for linear distinguishers. They are formalized by n,a,b and an acceptance set B:

Linear distinguisher
1. initialize a counter u to 0
2. iterate n times the following process
1. pick a random X
2. query C(X)
3. if a.X = b.C(X) then increment counter u
3. If u/n is in B then output 1 otherwise output 0
We define
LPf(a,b) = ( 2.PrX[ a.X = b.f(X) ] - 1 )2
ELPC(a,b) = EC( LPC(a,b) )
and we have the following results.

This treatment is quite useful when desiging block ciphers: if we manage to construct one with low decorrelation of order 2, we can guaranty for free that no differential or linear characteristic has a useful bias on average over the distribution of the key.

# 11. Iterated Attacks

[to be completed]

# 12. References

 [CW79] Universal Classes of Hash Functions. By Carter and Wegman. In Journal of Computer and System Sciences vol. 18, pp. 143-154, 1979. [WC81] New Hash Functions and their Use in Authentication and Set Equality. By Wegman and Carter. In Journal of Computer and System Sciences vol. 22, pp. 265-279, 1981. [LR88] How to Construct Pseudorandom Permutations from Pseudorandom Functions. By Luby and Rackoff. In SIAM Journal on Computing vol. 17, pp. 373-386, 1988. [CG89] On the Power of Two-Point Based Sampling. By Chor and Goldreich. In Jounal of Complexity vol. 5, pp. 96-106, 1989. [ZMI89] Impossibility and Optimality Results on Constructing Pseudorandom Permutations. By Zheng, Matsumoto and Imai. In EUROCRYPT' 89, pp. 412-422, LNCS 434, Springer-Verlag, 1990. [LM90] A Proposal for a New Block Encryption Standard. By Lai and Massey. In EUROCRYPT' 90, pp. 389-404, LNCS 473, Springer-Verlag, 1991. [Lai92] On the Design and Security of Block Ciphers. By Lai. In ETH Series in Information Processing vol. 1, Hartung-Gorre Verlag Konstanz, 1992. [Liens-97-3] A Cheap Paradigm for Block Cipher Security Strengthening. Technical report LIENS-97-3, Ecole Normale Supérieure, 1997. [Stacs98] Provable Security for Block Ciphers by Decorrelation. In STACS' 98, pp. 249-275, LNCS 1373, Springer-Verlag, 1998. [Sac98] Feistel Ciphers with L2 Decorrelation. In SAC' 98, pp. 1-14, LNCS 1556, Springer-Verlag, 1999. [Aes98] Decorrelated Fast Cipher: an AES Candidate. By Gilbert, Girault, Hoogvoorst, Noilhan, Pornin, Poupard, Stern and Vaudenay. In the Proceedings from the First Advanced Encryption Standard Candidate Conference, National Institute of Standards and Technology (NIST), 1998. [NR99] On the Construction of Pseudorandom Permutations: Luby-Rackoff Revisited. By Naor and Reingold. In Journal of Cryptology vol. 12, pp. 29-66, 1999. [Aes99] DFC Update. By Baudron, Gilbert, Granboulan, Handschuh, Harley, Joux, Nguyen, Noilhan, Pointcheval, Pornin, Poupard, Stern and Vaudenay. In the Proceedings from the Second Advanced Encryption Standard Candidate Conference, National Institute of Standards and Technology (NIST), 1999. [PRS99] Towards Making Luby-Rackoff Ciphers Optimal and Practical. By Patel, Ramzan, Sundaram. In Fast Software Encryption' 99, pp. 171-185, LNCS 1636, Springer-Verlag, 1999. [Eurocrypt99] Resistance Against General Iterated Attacks. In EUROCRYPT' 99, pp. 255-271, LNCS 1592, Springer-Verlag, 1999. [Sac99] Adaptive-Attack Norm for Decorrelation and Super-Pseudorandomness. Technical report LIENS-99-2, Ecole Normale Supérieure, 1999. To appear in SAC'99. [Asiacrypt99] On the Lai-Massey Scheme. Technical report LIENS-99-3, Ecole Normale Supérieure, 1999. To appear in ASIACRYPT'99.