RFC Errata
RFC 4009, "The SEED Encryption Algorithm", February 2005
Note: This RFC has been obsoleted by RFC 4269
Source of RFC: IETF - NON WORKING GROUP
Errata ID: 752
Status: Held for Document Update
Type: Technical
Publication Format(s) : TEXT
Reported By: Alfred Hoenes
Date Reported: 2005-02-28
Held for Document Update by: Russ Housley
(C) First, a formal issue with the ASN.1 given in the RFC text: In the ASN.1 definitions in section 2.5., it would perhaps be more natural and consistent with the requirements from the text (and the ASN.1 comment already present there!) to give an explicit SIZE restriction in the definition of the syntax of the initialization vector for SEED in CBC mode (which indeed *MUST* be 16 octets long). To this end, I recommend to replace the 7th text line of section 2.5., "seedCDCParameter ::= OCTET STRING -- 128-bit Initialization Vector" by: | "seedCDCParameter ::= OCTET STRING (SIZE(16)) | -- 128-bit Initialization Vector" (D) The text in section 2.2. talks about two basic 8x8 S-boxes, named "S1" and "S2". Contrary to that, Appendix A.1. (on page 7) gives tables for "S-Box S0" and "S-Box S1". It would be easier to change the headlines in Appendix A.1., but, given the numbering style of all other formula elements, it is certainly be more appropriate and consistent to modify the equations in section 2.2., replacing "S1" by "S0", and "S2" by "S1". I'll give a full replacement text for section 2.2. below, covering this issue as well. [ Now, returning to the open issue ... ] (B) In short, sorry, I do NOT agree with your definition of the SS-boxes. It does not conform with the primary definition of the function G. Below, I'll give you a detailed step-by-step explanation, using a concrete example, for the reasoning already presented in my first mail. Note: For brevity and due to the email line length restriction, in the subsequent reasoning, I will omit the braces '{ ... }' around the ANDed terms, making use of the usual precedence rule for multiplication over addition and the facts that - the '^' operation is the normal addition in the 8 / 32 dimensional vector spaces over GF(2) we are working on, - '&' representing the 8 / 32 fold tensor product of the scalar multiplication over GF(2) . I repeat and extend the numeration of the formulas from my first email, already applying item (D). X = X0 || X1 || X2 || X3 (2) ... the 32 bit wide input to the function G; Z = Z0 || Z1 || Z2 || Z3 (1) ... the 32 bit wide output of the function G applied to X; with the 8-bit wide components computed by the application of the two S-Boxes S0 and S1 via the equations: Z0 = S0(X0) & m0 ^ S1(X1) & m1 ^ S0(X2) & m2 ^ S1(X3) & m3 (1.0) Z1 = S0(X0) & m1 ^ S1(X1) & m2 ^ S0(X2) & m3 ^ S1(X3) & m0 (1.1) Z2 = S0(X0) & m2 ^ S1(X1) & m3 ^ S0(X2) & m0 ^ S1(X3) & m1 (1.2) Z3 = S0(X0) & m3 ^ S1(X1) & m0 ^ S0(X2) & m1 ^ S1(X3) & m2 (1.3) with m0 = 0xFC , m1 = 0xF3 , m2 = 0xCF , and m3 = 0x3F (1.4) The alternate description of the Function G is to be Z = SS0(X0) ^ SS1(X1) ^ SS2(X2) ^ SS3(X3) (3) where, below, I'll try "my" version of the SS-Box definition ... SS0(x) = S0(x) & m0 || S0(x) & m1 || S0(x) & m2 || S0(x) & m3 (4.0) SS1(x) = S1(x) & m1 || S1(x) & m2 || S1(x) & m3 || S1(x) & m0 (4.1) SS2(x) = S0(x) & m2 || S0(x) & m3 || S0(x) & m0 || S0(x) & m1 (4.2) SS3(x) = S1(x) & m3 || S1(x) & m0 || S1(x) & m1 || S1(x) & m2 (4.3) ... and the version from the text (with the formal changes already discussed properly applied) ... SS0(x) = S0(x) & m3 || S0(x) & m2 || S0(x) & m1 || S0(x) & m0 (5.0) SS1(x) = S1(x) & m0 || S1(x) & m3 || S1(x) & m2 || S1(x) & m1 (5.1) SS2(x) = S0(x) & m1 || S0(x) & m0 || S0(x) & m3 || S0(x) & m2 (5.2) SS3(x) = S1(x) & m2 || S1(x) & m1 || S1(x) & m0 || S1(x) & m3 (5.3) With these notations, let's challenge the example octet sequence X0 = 0x09 , X1 = 0x11 , X2 = 0xE1 , X3 = 0xF9 (6a) i.e., by (2) : X = (0x) 09 11 E1 F9 (6b) Applying the tables from Appendix A.1., we obtain (*) : S0(X0) = 0x43 , S1(X1) = 0x62 , S0(X2) = 0xB9 , S1(X3) = 0x4C . To first compute Z with the original definition of G, substituting (*) and (1.4) into (1.0) .. (1.3), we obtain, step by step: Z0 = S0(X0) & m0 ^ S1(X1) & m1 ^ S0(X2) & m2 ^ S1(X3) & m3 (1.0) = 0x43 & 0xFC ^ 0x62 & 0xF3 ^ 0xB9 & 0xCF ^ 0x4C & 0x3F = 0x40 ^ 0x62 ^ 0x89 ^ 0x0C ==> Z0 = 0xA7 (7.0) Z1 = S0(X0) & m1 ^ S1(X1) & m2 ^ S0(X2) & m3 ^ S1(X3) & m0 (1.1) = 0x43 & 0xF3 ^ 0x62 & 0xCF ^ 0xB9 & 0x3F ^ 0x4C & 0xFC = 0x43 ^ 0x42 ^ 0x39 ^ 0x4C ==> Z1 = 0x74 (7.1) Z2 = S0(X0) & m2 ^ S1(X1) & m3 ^ S0(X2) & m0 ^ S1(X3) & m1 (1.2) = 0x43 & 0xCF ^ 0x62 & 0x3F ^ 0xB9 & 0xFC ^ 0x4C & 0xF3 = 0x43 ^ 0x22 ^ 0xB8 ^ 0x40 ==> Z2 = 0x99 (7.2) Z3 = S0(X0) & m3 ^ S1(X1) & m0 ^ S0(X2) & m1 ^ S1(X3) & m2 (1.3) = 0x43 & 0x3F ^ 0x62 & 0xFC ^ 0xB9 & 0xF3 ^ 0x4C & 0xCF = 0x03 ^ 0x60 ^ 0xB1 ^ 0x4C ==> Z3 = 0x9E (7.3) Putting (7.0) .. (7.3) together using (1), we get the byte sequence Z = Z0 || Z1 || Z2 || Z3 (1) ==> Z = (0x) A7 74 99 9E (7) Now let's see what happens when we apply "my" definition of the SS-Boxes, substituting (6a), (*), and (1.4) into (4.0) .. (4.3) : SS0(X0) = S0(X0) & m0 || S0(X0) & m1 || S0(X0) & m2 || S0(X0) & m3 = 0x43 & 0xFC || 0x43 & 0xF3 || 0x43 & 0xCF || 0x43 & 0x3F = 0x40 || 0x43 || 0x43 || 0x03 ==> SS0(X0) = (0x) 40 43 43 03 (8.0) SS1(X1) = S1(X1) & m1 || S1(X1) & m2 || S1(X1) & m3 || S1(X1) & m0 = 0x62 & 0xF3 || 0x62 & 0xCF || 0x62 & 0x3F || 0x62 & 0xFC = 0x62 || 0x42 || 0x22 || 0x60 ==> SS1(X1) = (0x) 62 42 22 60 (8.1) SS2(X2) = S0(X2) & m2 || S0(X2) & m3 || S0(X2) & m0 || S0(X2) & m1 = 0xB9 & 0xCF || 0xB9 & 0x3F || 0xB9 & 0xFC || 0xB9 & 0xF3 = 0x89 || 0x39 || 0xB8 || 0xB1 ==> SS2(X2) = (0x) 89 39 B8 B1 (8.2) SS3(X3) = S1(X3) & m3 || S1(X3) & m0 || S1(X3) & m1 || S1(X3) & m2 = 0x4C & 0x3F || 0x4C & 0xFC || 0x4C & 0xF3 || 0x4C & 0xCF = 0x0C || 0x4C || 0x40 || 0x4C ==> SS3(X3) = (0x) 0C 4C 40 4C (8.3) Note: Please observe that the terms in (8.0) .. (8.3) are indeed the same terms as in (7.0) .. (7.3), but in 4x4 matrix transposed order -- as stated abstractly in my first email. Summing up (8.0) .. (8.3) according to (3), we get what should be the alternate definition of G(X) : Z = SS0(X0) ^ SS1(X1) ^ SS2(X2) ^ SS3(X3) (3) = (0x) 40 43 43 03 ^ -- from (8.0) (0x) 62 42 22 60 ^ -- from (8.1) (0x) 89 39 B8 B1 ^ -- from (8.2) (0x) 0C 4C 40 4C -- from (8.3) ==> ----------- Z = (0x) A7 74 99 9E (8) Obviously, (8) indeed is identical to (7). Finally, let's look for what we get with your definition of the SS-Boxes, substituting (6a), (*), and (1.4) into (5.0) .. (5.3) : SS0(X0) = S0(X0) & m3 || S0(X0) & m2 || S0(X0) & m1 || S0(X0) & m0 = 0x43 & 0x3F || 0x43 & 0xCF || 0x43 & 0xF3 || 0x43 & 0xFC = 0x03 || 0x43 || 0x43 || 0x40 ==> SS0(X0) = (0x) 03 43 43 40 (9.0) SS1(X1) = S1(X1) & m0 || S1(X1) & m3 || S1(X1) & m2 || S1(X1) & m1 = 0x62 & 0xFC || 0x62 & 0x3F || 0x62 & 0xCF || 0x62 & 0xF3 = 0x60 || 0x22 || 0x42 || 0x62 ==> SS1(X1) = (0x) 60 22 42 62 (9.1) SS2(X2) = S0(X2) & m1 || S0(X2) & m0 || S0(X2) & m3 || S0(X2) & m2 = 0xB9 & 0xF3 || 0xB9 & 0xFC || 0xB9 & 0x3F || 0xB9 & 0xCF = 0xB1 || 0xB8 || 0x39 || 0x89 ==> SS2(X2) = (0x) B1 B8 39 89 (9.2) SS3(X3) = S1(X3) & m2 || S1(X3) & m1 || S1(X3) & m0 || S1(X3) & m3 = 0x4C & 0xCF || 0x4C & 0xF3 || 0x4C & 0xFC || 0x4C & 0x3F = 0x4C || 0x40 || 0x4C || 0x0C ==> SS3(X3) = (0x) 4C 40 4C 0C (8.3) Summing up (9.0) .. (9.3) according to (3), we get Z = SS0(X0) ^ SS1(X1) ^ SS2(X2) ^ SS3(X3) (3) = (0x) 03 43 43 40 ^ -- from (9.0) (0x) 60 22 42 62 ^ -- from (9.1) (0x) B1 B8 39 89 ^ -- from (9.2) (0x) 4C 40 4C 0C -- from (9.3) ==> ----------- Z = (0x) 9E 99 74 A7 (9) This is NOT the same as (7). In fact, it is the byte-reversed sequence from (7) . -- Bingo! Taking a closer look at (5.0) .. (5.3), I now see that indeed the terms in these equations are in the reverse concatenation sequence compared to (4.0) .. (4.3) ! After a detailed inspection of the tables given in Appendix A.2. of RFC 4009, it now becomes clear that -- disregarding the IETF conventions to represent all binary data in network byte order, and without any further notice of this fact -- these tables do NOT represent octect sequences (as expected from the presentation of above formula using octet concatenation, not arithmetic left-shift or multiply-by-256 operations), BUT the hexadecimal representation of the proper 4-octet sequences improperly cast into 32 bit numbers on a 'little endian' processor, i.e., without using the ntohl() intrinsic! I suspect that your definition of the SS-boxes (5.0) .. (5.3) has been inadvertently retrofitted to match the values given in the tables in Appendix A.2., again re-formulating the printed byte order (which is NOT the storage byte order [= octet concatenation order] in a little endian processor) using the octet concatenation notation / formalism. Because definition of the function G according to section 2.2. does not make use of 32-bit integer arithmetic operations, I recommend to clarify the situation by o giving formula (4.0) .. (4.3) for the SS-Boxes, consistent with the other formulas, and o stating explicitely in Appendix A.2. that these tables give byte reversed presentations of the octet sequences to be used as the output of the SS-boxes. To catch up, here's my proposed replacement text for the whole section 2.2. of RFC 4009, covering the issues (A), (B), and (D) mentioned so far: ---------------- cut here ------------------------------- 2.2. The Function G The function G has two layers: a layer of two 8x8 S-boxes, S0 and S1, and a layer of block permutation of sixteen 8-bit sub-blocks. The output octet sequence Z (= Z0 || Z1 || Z2 || Z3) of the function G with the four-octet input X (= X0 || X1 || X2 || X3) is as follows: Z0 = {S0(X0) & m0} ^ {S1(X1) & m1} ^ {S0(X2) & m2} ^ {S1(X3) & m3} Z1 = {S0(X0) & m1} ^ {S1(X1) & m2} ^ {S0(X2) & m3} ^ {S1(X3) & m0} Z2 = {S0(X0) & m2} ^ {S1(X1) & m3} ^ {S0(X2) & m0} ^ {S1(X3) & m1} Z3 = {S0(X0) & m3} ^ {S1(X1) & m0} ^ {S0(X2) & m1} ^ {S1(X3) & m2} where m0 = 0xFC , m1 = 0xF3 , m2 = 0xCF , and m3 = 0x3F . To increase the efficiency of the calculation of the function G, four extended S-boxes with 4-octet output ('SS-boxes', see Appendix A.2.) are defined as follows: SS0(x) = {S0(x) & m0} || {S0(x) & m1} || {S0(x) & m2} || {S0(x) & m3} SS1(x) = {S1(x) & m1} || {S1(x) & m2} || {S1(x) & m3} || {S1(x) & m0} SS2(x) = {S0(x) & m2} || {S0(x) & m3} || {S0(x) & m0} || {S0(x) & m1} SS3(x) = {S1(x) & m3} || {S1(x) & m0} || {S1(x) & m1} || {S1(x) & m2} Hereby, the function G can be defined as follows: Z = G(X) = SS0(X0) ^ SS1(X1) ^ SS2(X2) ^ SS3(X3) . This alternate definition of the function G is faster than the original definition but it takes 16 times the memory to store the four SS-boxes compared to the two original S-boxes. ---------------- cut here ------------------------------- Immediately below the headline of Appendix A.2., on page 8, I propose to insert the following text (or similar): "The following tables specify the byte-reversed SS-box values, i.e., the first octet to be returned from the function G (named Z0 in section 2.2.), is obtained by XORing together the rightmost bytes from the appropriate entries looked up in the following four tables, etc." (E) The above discussion, and the observed deviation from the IETF standard ('network') byte ordering convention, immediately raises additional byte ordering concerns for - the definition of the round function F in section 2.1., and - the Key Schedule definition given in section 2.3. The function G -- as defined in section 2.2. -- operates on a four- octet sequence and returns a four-octet sequence, according to the formulae labelled (1) and (2) above. The definition of the round function F and the key schedule make use of several multi-byte INTEGER arithmetic operations, in particular 32-bit (mod 2^32) addition and subtraction, on input and output values of the function G, and the key schedule uses circular shift operations on concatenated (64 bit) intermediate key blocks. With regard to the observations made above, I now suspect that some of the implicit 'cast' operations (between octet sequences and multi- byte integers) involved in the round functions and the key schedule might also be formulated in a little-endian centric way, omitting the necessary application of the htonl() and ntohl() intrinsics when producing the test cases in Appendix B. (I did not have the time to verify that.) For the sake of interoperability, it SHOULD be clarified whether the formulae given for R0' and R1' in section 2.1., and for Ki0 and Ki1 in section 2.3., as well as the constant's values tabulated in the latter section, are specified according to IEFT standard byte ordering rules, or with implicit little-endian byte order in mind.
It should say:
[see above]
Notes:
from pending