classpathx-crypto

Re: [Classpathx-crypto] [patch] Inlined Serpent


From: Casey Marshall
Subject: Re: [Classpathx-crypto] [patch] Inlined Serpent
Date: Fri, 06 Sep 2002 18:05:56 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Raif S. Naffah wrote:
| hello Casey and Dag,
|
| Casey Marshall wrote:
| | (I'm CC'ing this to the crypto list, too)
| |
| | Attached is a patch to modify Serpent to use Dag Arne Osvik's inlined
| | versions of the encryption and decryption methods. The key setup is
| | still the same as before.
|
| this version:
|
| a. is _not_ thread-safe,

The attached version should be; it still declares the five register
variables (x0..x4) as instance fields of the class, but declares the
encrypt and decrypt methods to be explicitly synchronized. This doesn't
appear to affect speed, unless you're encrypting and decrypting in
parallel on the same instance.
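
To make the trade-off concrete, here is a minimal sketch, not the patch itself. The class RegisterStateDemo and its methods are hypothetical names invented for illustration. It contrasts scratch registers kept in instance fields (guarded by synchronized, as in the attached version) with the same computation done in locals, which needs no lock.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RegisterStateDemo {
    // Shared scratch registers, analogous to x0..x4 in the patch (hypothetical example).
    private int x0, x1;

    /** Uses the instance fields as scratch space; synchronized so calls cannot interleave. */
    synchronized int mixShared(int a, int b) {
        x0 = a;
        x1 = b;
        x0 ^= x1;
        x1 = (x1 << 3) | (x1 >>> 29);   // rotate left by 3
        x0 += x1;
        return x0;
    }

    /** Same computation with local variables; each call has its own state. */
    static int mixLocal(int a, int b) {
        int r0 = a, r1 = b;
        r0 ^= r1;
        r1 = (r1 << 3) | (r1 >>> 29);
        r0 += r1;
        return r0;
    }

    /** Hammers one shared instance from several threads and counts wrong results. */
    static int countMismatches(int threads, final int iterations) {
        final RegisterStateDemo shared = new RegisterStateDemo();
        final AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int seed = t * 1000003;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    int a = seed + i, b = seed ^ i;
                    if (shared.mixShared(a, b) != mixLocal(a, b)) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 100000));
    }
}
```

With the synchronized keyword in place the shared path should always agree with the local one; removing it would allow two threads to overwrite each other's x0 and x1 mid-computation.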

| b. generates a 54K class file (compared to 14K for the current impl.)
| --with Sun's javac compiler.

Bigger classes will always result from such extensive unrolling; this
version does a bit better (each S-Box is in its own method, and I
removed the static sboxI* and transform methods, since they're no longer
needed). Once this version has an inlined key setup the static methods
can disappear completely.

| c. runs almost 20 times slower than the current one!
|

Because of the JIT. Sun's HotSpot can handle this one better.

I was aiming for the following with this version:

1) Keep constant array indices.
2) Keep methods short, so the JIT can handle them.
3) Don't repeat too much code (e.g. there is one function per S-Box).
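
As an illustration of goals 1 and 2, here is a small sketch under stated assumptions. ConstantIndexDemo, mixLoop and mixRound1 are hypothetical names, and the round-key layout of four words per round simply mirrors the key[4*r .. 4*r+3] pattern visible in the attached encrypt method. It shows the same key-mixing step written once with indices computed at run time and once with literal indices in a short method.

```java
import java.util.Arrays;

final class ConstantIndexDemo {
    /** Loop form: the key index is computed at run time from the round number. */
    static int[] mixLoop(int[] state, int[] key, int round) {
        int[] out = state.clone();
        for (int j = 0; j < 4; j++) {
            out[j] ^= key[4 * round + j];
        }
        return out;
    }

    /** Unrolled form for round 1: every array index is a literal constant. */
    static int[] mixRound1(int[] state, int[] key) {
        int[] out = state.clone();
        out[0] ^= key[4];
        out[1] ^= key[5];
        out[2] ^= key[6];
        out[3] ^= key[7];
        return out;
    }

    public static void main(String[] args) {
        int[] state = {1, 2, 3, 4};
        int[] key = {10, 20, 30, 40, 50, 60, 70, 80};
        // Both forms compute the same result; only the shape of the code differs.
        System.out.println(Arrays.equals(mixLoop(state, key, 1), mixRound1(state, key)));
    }
}
```

Whether the unrolled form is actually faster depends on the VM, which is the point of the measurements that follow.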

The results are a good compromise:

operation    Sun 1.4.0         GCJ (native)       Kaffe 1.0.6
encrypt   4584.8003 KB/s     6262.5254 KB/s    4283.1689 KB/s
decrypt   4861.5435 KB/s     6230.0640 KB/s    4662.7871 KB/s

(all on an AMD MP 1200). Much better for Sun's VM, a little slower for
GCJ and Kaffe, and better than the current version. By the way, with
this version GNU's Serpent will be faster than BouncyCastle's.
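
For anyone wanting to reproduce figures in the same unit, here is a rough sketch of how a KB/s number can be derived. It is not the harness used for the table above. ThroughputDemo and its methods are hypothetical, 1 KB is assumed to mean 1024 bytes, and the workload in main is a stand-in XOR loop, not Serpent.

```java
final class ThroughputDemo {
    /** Converts a byte count and an elapsed time into KB/s (assuming 1 KB = 1024 bytes). */
    static double kbPerSecond(long bytes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsed time must be positive");
        }
        return (bytes / 1024.0) / (elapsedNanos / 1e9);
    }

    /** Times the given number of calls of a block operation, after one warm-up pass. */
    static double measure(Runnable blockOp, int blockSize, int calls) {
        for (int i = 0; i < calls; i++) {
            blockOp.run();                       // warm-up, so a JIT has a chance to compile
        }
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            blockOp.run();
        }
        long elapsed = Math.max(1L, System.nanoTime() - start);
        return kbPerSecond((long) blockSize * calls, elapsed);
    }

    public static void main(String[] args) {
        final byte[] block = new byte[16];
        // Stand-in workload (NOT Serpent): XOR every byte of a 16-byte block.
        double rate = measure(() -> {
            for (int i = 0; i < block.length; i++) {
                block[i] ^= 0x5a;
            }
        }, block.length, 1000000);
        System.out.printf("%.4f KB/s%n", rate);
    }
}
```

The warm-up pass matters for the comparison discussed here, since a JIT that gives up on a long method and one that compiles it will differ most after the first few thousand calls.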

| i see some advantages in keeping the current implementation _and_ adding
| the new in-lined one as an alternative; say SerpentInLined? (naming
| suggestions are welcome)
|

I wonder if a naming scheme following "SerpentSBoxImpl" or something
similar is more appropriate, since what we're talking about here is a
particular implementation of "bit-sliced" S-Boxes.
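
For readers new to the term, a short sketch of what "bit-sliced" means here. The boolean sequence is copied from the static sbox0 helper in the attached Serpent.java; the lookup table S0 is the first S-box as I recall it from the Serpent specification, and the BitSliceDemo class and its other methods are hypothetical. Each of the four words carries one bit of the 4-bit S-box input, so one pass of boolean operations substitutes 32 nibbles at once.

```java
final class BitSliceDemo {
    /** Serpent S-box 0 as a plain lookup table (assumed from the specification). */
    static final int[] S0 = {3, 8, 15, 1, 10, 6, 5, 11, 14, 13, 4, 2, 7, 0, 9, 12};

    /** S-box 0 as boolean operations; returns the four output words. */
    static int[] sbox0(int r0, int r1, int r2, int r3) {
        int r4 = r1 ^ r2;
        r3 ^= r0;
        r1 = r1 & r3 ^ r0;
        r0 = (r0 | r3) ^ r4;
        r4 ^= r3;
        r3 ^= r2;
        r2 = (r2 | r1) ^ r4;
        r4 = ~r4 | r1;
        r1 ^= r3 ^ r4;
        r3 |= r0;
        return new int[] { r1 ^ r3, r4 ^ r3, r2, r0 };
    }

    /** Substitutes a single nibble using only bit position 0 of each word. */
    static int substitute(int nibble) {
        int[] w = sbox0(nibble & 1, (nibble >>> 1) & 1,
                        (nibble >>> 2) & 1, (nibble >>> 3) & 1);
        return (w[0] & 1) | (w[1] & 1) << 1 | (w[2] & 1) << 2 | (w[3] & 1) << 3;
    }

    /** Substitutes 32 nibbles at once: bit position "lane" of word k holds bit k of nibble "lane". */
    static int[] substitute32(int[] nibbles) {
        int[] r = new int[4];
        for (int lane = 0; lane < 32; lane++) {
            for (int k = 0; k < 4; k++) {
                r[k] |= ((nibbles[lane] >>> k) & 1) << lane;
            }
        }
        int[] w = sbox0(r[0], r[1], r[2], r[3]);
        int[] out = new int[32];
        for (int lane = 0; lane < 32; lane++) {
            for (int k = 0; k < 4; k++) {
                out[lane] |= ((w[k] >>> lane) & 1) << k;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (int v = 0; v < 16; v++) {
            System.out.println(v + " -> " + substitute(v) + " (table: " + S0[v] + ")");
        }
    }
}
```

The packing and unpacking loops in substitute32 exist only for the demonstration; in the cipher itself the 128-bit block is already held as four words, so the boolean sequence is all that runs.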

|
| i'll start a new thread about discussing this alternative in general;
| i.e. how can we [re-]write the Factory methods to provide the _optimal_
| implementation.
|
| i'll also start another thread about collecting and publishing
| performance figures which, ultimately may allow us to tailor the
| heuristic for selecting one implementation among many, depending on the
| host environment.
|
|
| | If we can come up with a way to make this inlined version more
| | digestible for certain JITs, I think this would be a good implementation
| | for future releases.
|
| pointers to finding out more about JIT nuts and bolts, anybody?
|

There's this: <http://java.sun.com/docs/hotspot/VMOptions.html> which
says that HotSpot JITs can take the option -Xmaxjitcodesize<size>.
Generated code size is obviously not the problem, since trying the fully
inlined version with outrageously huge compiled code sizes results in
equivalent performance. I can't find anything relating to time spent in
the compiler.

Sun has published some books on their JITs, none of which I've seen,
though.

Cheers,

- --
Casey Marshall < address@hidden > http://metastatic.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Using GnuPG with Netscape - http://enigmail.mozdev.org

iD8DBQE9eVD0gAuWMgRGsWsRAvxyAJ9TgIz0bAJSVyfK/vbJlv2DByfgzgCffhAQ
UlJLYpmYWhvLjRnX4FrKPMw=
=hgJf
-----END PGP SIGNATURE-----
package gnu.crypto.cipher;

// ----------------------------------------------------------------------------
// $Id: Serpent.java,v 1.3 2002/09/04 09:56:39 raif Exp $
//
// Copyright (C) 2002, Free Software Foundation, Inc.
//
// This program is free software; you can redistribute it and/or modify it
// under the terms of the GNU General Public License as published by the Free
// Software Foundation; either version 2 of the License or (at your option) any
// later version.
//
// This program is distributed in the hope that it will be useful, but WITHOUT
// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
// more details.
//
// You should have received a copy of the GNU General Public License along with
// this program; see the file COPYING.  If not, write to the
//
//    Free Software Foundation Inc.,
//    59 Temple Place - Suite 330,
//    Boston, MA 02111-1307
//    USA
//
// As a special exception, if you link this library with other files to produce
// an executable, this library does not by itself cause the resulting
// executable to be covered by the GNU General Public License.  This exception
// does not however invalidate any other reasons why the executable file might
// be covered by the GNU General Public License.
// ----------------------------------------------------------------------------

import gnu.crypto.Registry;
import gnu.crypto.util.Util;

import java.security.InvalidKeyException;
import java.util.Collections;
import java.util.Iterator;

/**
 * <p>Serpent is a 32-round substitution-permutation network block cipher,
 * operating on 128-bit blocks and accepting keys of 128, 192, and 256 bits in
 * length. At each round the plaintext is XORed with a 128-bit portion of the
 * session key -- a 4224-bit key computed from the input key -- then one of
 * eight S-boxes is applied, and finally a simple linear transformation is
 * done. Decryption does the exact same thing in reverse order, and using the
 * eight inverses of the S-boxes.</p>
 *
 * <p>Serpent was designed by Ross Anderson, Eli Biham, and Lars Knudsen as a
 * proposed cipher for the Advanced Encryption Standard.</p>
 *
 * <p>Serpent can be sped up greatly by replacing S-box substitution with a
 * sequence of binary operations, and the optimal implementation depends
 * upon finding the fastest sequence of binary operations that reproduce this
 * substitution. This implementation uses the S-boxes discovered by
 * <a href="http://www.ii.uib.no/~osvik/">Dag Arne Osvik</a>, which are
 * optimized for the Pentium family of processors.</p>
 *
 * <p>References:</p>
 *
 * <ol>
 *    <li><a href="http://www.cl.cam.ac.uk/~rja14/serpent.html">Serpent: A
 *    Candidate Block Cipher for the Advanced Encryption Standard.</a></li>
 * </ol>
 *
 * @version $Revision: 1.3 $
 */
public class Serpent extends BaseCipher {

   // Constants and variables
   // -------------------------------------------------------------------------

   private static final String NAME = "serpent";

   private static final int DEFAULT_KEY_SIZE = 16;
   private static final int DEFAULT_BLOCK_SIZE = 16;
   private static final int ROUNDS = 32;

   /**
    * The fractional part of the golden ratio, (sqrt(5)+1)/2, scaled by 2^32
    * and stored as a 32-bit integer.
    */
   private static final int PHI = 0x9e3779b9;

   /**
    * KAT vector (from ecb_vk):
    * I=9
    * KEY=008000000000000000000000000000000000000000000000
    * CT=5587B5BCB9EE5A28BA2BACC418005240
    */
   private static final byte[] KAT_KEY =
         Util.toBytesFromString("008000000000000000000000000000000000000000000000");
   private static final byte[] KAT_CT =
         Util.toBytesFromString("5587B5BCB9EE5A28BA2BACC418005240");

   /** caches the result of the correctness test, once executed. */
   private static Boolean valid;

   private int x0, x1, x2, x3, x4;

   // Constructor(s)
   // -------------------------------------------------------------------------

   /** Trivial zero-argument constructor. */
   public Serpent() {
      super(Registry.SERPENT_CIPHER, DEFAULT_BLOCK_SIZE, DEFAULT_KEY_SIZE);
   }

   // Class methods
   // -------------------------------------------------------------------------

   // Bit-flip madness methods
   //
   // The following S-Box functions were developed by Dag Arne Osvik, and are
   // described in his paper, "Speeding up Serpent". They are optimized to
   // perform on the Pentium chips, but work well here too.
   //
   // The methods below are Copyright (C) 2000 Dag Arne Osvik.

   /** S-Box 0. */
   private static final void
         sbox0(int r0, int r1, int r2, int r3, int[] w, int off) {
      int r4 = r1 ^ r2;
      r3 ^= r0;
      r1 = r1 & r3 ^ r0;
      r0 = (r0 | r3) ^ r4;
      r4 ^= r3;
      r3 ^= r2;
      r2 = (r2 | r1) ^ r4;
      r4 = ~r4 | r1;
      r1 ^= r3 ^ r4;
      r3 |= r0;
      w[off] = r1 ^ r3;
      w[off + 1] = r4 ^ r3;
      w[off + 2] = r2;
      w[off + 3] = r0;
   }

   /** S-Box 1. */
   private static final void
         sbox1(int r0, int r1, int r2, int r3, int[] w, int off) {
      r0 = ~r0;
      int r4 = r0;
      r2 = ~r2;
      r0 &= r1;
      r2 ^= r0;
      r0 |= r3;
      r3 ^= r2;
      r1 ^= r0;
      r0 ^= r4;
      r4 |= r1;
      r1 ^= r3;
      r2 = (r2 | r0) & r4;
      r0 ^= r1;
      w[off] = r2;
      w[off + 1] = r0 & r2 ^ r4;
      w[off + 2] = r3;
      w[off + 3] = r1 & r2 ^ r0;
   }

   /** S-Box 2. */
   private static final void
         sbox2(int r0, int r1, int r2, int r3, int[] w, int off) {
      int r4 = r0;
      r0 = r0 & r2 ^ r3;
      r2 = r2 ^ r1 ^ r0;
      r3 = (r3 | r4) ^ r1;
      r4 ^= r2;
      r1 = r3;
      r3 = (r3 | r4) ^ r0;
      r0 &= r1;
      r4 ^= r0;
      w[off] = r2;
      w[off + 1] = r3;
      w[off + 2] = r1 ^ r3 ^ r4;
      w[off + 3] = ~r4;
   }

   /** S-Box 3. */
   private static final void
         sbox3(int r0, int r1, int r2, int r3, int[] w, int off) {
      int r4 = r0;
      r0 |= r3;
      r3 ^= r1;
      r1 &= r4;
      r4 = r4 ^ r2 | r1;
      r2 ^= r3;
      r3 = r3 & r0 ^ r4;
      r0 ^= r1;
      r4 = r4 & r0 ^ r2;
      r1 = (r1 ^ r3 | r0) ^ r2;
      r0 ^= r3;
      w[off] = (r1 | r3) ^ r0;
      w[off + 1] = r1;
      w[off + 2] = r3;
      w[off + 3] = r4;
   }

   /** S-Box 4. */
   private static final void
         sbox4(int r0, int r1, int r2, int r3, int[] w, int off) {
      r1 ^= r3;
      int r4 = r1;
      r3 = ~r3;
      r2 ^= r3;
      r3 ^= r0;
      r1 = r1 & r3 ^ r2;
      r4 ^= r3;
      r0 ^= r4;
      r2 = r2 & r4 ^ r0;
      r0 &= r1;
      r3 ^= r0;
      r4 = (r4 | r1) ^ r0;
      w[off] = r1;
      w[off + 1] = r4 ^ (r2 & r3);
      w[off + 2] = ~((r0 | r3) ^ r2);
      w[off + 3] = r3;
   }

   /** S-Box 5. */
   private static final void
         sbox5(int r0, int r1, int r2, int r3, int[] w, int off) {
      r0 ^= r1;
      r1 ^= r3;
      int r4 = r1;
      r3 = ~r3;
      r1 &= r0;
      r2 ^= r3;
      r1 ^= r2;
      r2 |= r4;
      r4 ^= r3;
      r3 = r3 & r1 ^ r0;
      r4 = r4 ^ r1 ^ r2;
      w[off] = r1;
      w[off + 1] = r3;
      w[off + 2] = r0 & r3 ^ r4;
      w[off + 3] = ~(r2 ^ r0) ^ (r4 | r3);
   }

   /** S-Box 6. */
   private static final void
         sbox6(int r0, int r1, int r2, int r3, int[] w, int off) {
      int r4 = r3;
      r2 = ~r2;
      r3 = r3 & r0 ^ r2;
      r0 ^= r4;
      r2 = (r2 | r4) ^ r0;
      r1 ^= r3;
      r0 |= r1;
      r2 ^= r1;
      r4 ^= r0;
      r0 = (r0 | r3) ^ r2;
      r4 = r4 ^ r3 ^ r0;
      w[off] = r0;
      w[off + 1] = r1;
      w[off + 2] = r4;
      w[off + 3] = r2 & r4 ^ ~r3;
   }

   /** S-Box 7. */
   private static final void
         sbox7(int r0, int r1, int r2, int r3, int[] w, int off) {
      int r4 = r1;
      r1 = (r1 | r2) ^ r3;
      r4 ^= r2;
      r2 ^= r1;
      r3 = (r3 | r4) & r0;
      r4 ^= r2;
      r3 ^= r1;
      r1 = (r1 | r4) ^ r0;
      r0 = (r0 | r4) ^ r2;
      r1 ^= r4;
      r2 ^= r1;
      w[off] = r4 ^ (~r2 | r0);
      w[off + 1] = r3;
      w[off + 2] = r1 & r0 ^ r4;
      w[off + 3] = r0;
   }

   // Instance methods
   // -------------------------------------------------------------------------

   // java.lang.Cloneable interface implementation ----------------------------

   public Object clone() {
      return new Serpent();
   }

   // IBlockCipherSpi interface implementation --------------------------------

   public Iterator blockSizes() {
      return Collections.singleton(new Integer(DEFAULT_BLOCK_SIZE)).iterator();
   }

   public Iterator keySizes() {
      return new Iterator() {
         int i = 0;
         // Support 128, 192, and 256 bit keys.
         Integer[] keySizes = {
            new Integer(16), new Integer(24), new Integer(32)
         };

         public boolean hasNext() {
            return i < keySizes.length;
         }

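         public Object next() {
            if (hasNext()) {
               return keySizes[i++];
            }
            // Iterator contract: signal exhaustion instead of returning null.
            throw new java.util.NoSuchElementException();
         }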

         public void remove() {
            throw new UnsupportedOperationException();
         }
      };
   }

   public Object makeKey(byte[] key, int blockSize) throws InvalidKeyException {
      // Not strictly true, but here to conform with the AES proposal.
      // This restriction can be removed if deemed necessary.
      if (key.length != 16 && key.length != 24 && key.length != 32) {
         throw new InvalidKeyException("Key length is not 16, 24, or 32 bytes");
      }

      // Here w is our "pre-key".
      int[] w = new int[4 * (ROUNDS + 1)];
      int i, j;
      for (i = 0, j = key.length - 4; i < 8 && j >= 0; i++) {
         w[i] = (key[j] & 0xff) << 24 | (key[j + 1] & 0xff) << 16 |
               (key[j + 2] & 0xff) << 8 | (key[j + 3] & 0xff);
         j -= 4;
      }
      // Pad key if < 256 bits.
      if (i != 8) {
         w[i] = 1;
      }
      // Transform using w_i-8 ... w_i-1
      for (i = 8; i < 16; i++) {
         int t = w[i - 8] ^ w[i - 5] ^ w[i - 3] ^ w[i - 1] ^ PHI ^ (i - 8);
         w[i] = t << 11 | t >>> 21;
      }
      // Translate by 8.
      for (i = 0; i < 8; i++) {
         w[i] = w[i + 8];
      }
      // Transform the rest of the key.
      for (i = 8; i < w.length; i++) {
         int t = w[i - 8] ^ w[i - 5] ^ w[i - 3] ^ w[i - 1] ^ PHI ^ i;
         w[i] = t << 11 | t >>> 21;
      }

      // After these s-boxes the pre-key (w, above) will become the
      // session key (w, below).
      sbox3(w[0], w[1], w[2], w[3], w, 0);
      sbox2(w[4], w[5], w[6], w[7], w, 4);
      sbox1(w[8], w[9], w[10], w[11], w, 8);
      sbox0(w[12], w[13], w[14], w[15], w, 12);
      sbox7(w[16], w[17], w[18], w[19], w, 16);
      sbox6(w[20], w[21], w[22], w[23], w, 20);
      sbox5(w[24], w[25], w[26], w[27], w, 24);
      sbox4(w[28], w[29], w[30], w[31], w, 28);
      sbox3(w[32], w[33], w[34], w[35], w, 32);
      sbox2(w[36], w[37], w[38], w[39], w, 36);
      sbox1(w[40], w[41], w[42], w[43], w, 40);
      sbox0(w[44], w[45], w[46], w[47], w, 44);
      sbox7(w[48], w[49], w[50], w[51], w, 48);
      sbox6(w[52], w[53], w[54], w[55], w, 52);
      sbox5(w[56], w[57], w[58], w[59], w, 56);
      sbox4(w[60], w[61], w[62], w[63], w, 60);
      sbox3(w[64], w[65], w[66], w[67], w, 64);
      sbox2(w[68], w[69], w[70], w[71], w, 68);
      sbox1(w[72], w[73], w[74], w[75], w, 72);
      sbox0(w[76], w[77], w[78], w[79], w, 76);
      sbox7(w[80], w[81], w[82], w[83], w, 80);
      sbox6(w[84], w[85], w[86], w[87], w, 84);
      sbox5(w[88], w[89], w[90], w[91], w, 88);
      sbox4(w[92], w[93], w[94], w[95], w, 92);
      sbox3(w[96], w[97], w[98], w[99], w, 96);
      sbox2(w[100], w[101], w[102], w[103], w, 100);
      sbox1(w[104], w[105], w[106], w[107], w, 104);
      sbox0(w[108], w[109], w[110], w[111], w, 108);
      sbox7(w[112], w[113], w[114], w[115], w, 112);
      sbox6(w[116], w[117], w[118], w[119], w, 116);
      sbox5(w[120], w[121], w[122], w[123], w, 120);
      sbox4(w[124], w[125], w[126], w[127], w, 124);
      sbox3(w[128], w[129], w[130], w[131], w, 128);

      return w;
   }

   public synchronized void
   encrypt(byte[] in, int i, byte[] out, int o, Object K, int bs) {
      final int[] key = (int[]) K;

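      // x0..x4 are instance fields rather than locals; this method is
      // synchronized so that concurrent calls cannot clobber them.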
      x3 = (in[i   ] & 0xff) << 24 | (in[i+ 1] & 0xff) << 16 |
           (in[i+ 2] & 0xff) <<  8 | (in[i+ 3] & 0xff);
      x2 = (in[i+ 4] & 0xff) << 24 | (in[i+ 5] & 0xff) << 16 |
           (in[i+ 6] & 0xff) <<  8 | (in[i+ 7] & 0xff);
      x1 = (in[i+ 8] & 0xff) << 24 | (in[i+ 9] & 0xff) << 16 |
           (in[i+10] & 0xff) <<  8 | (in[i+11] & 0xff);
      x0 = (in[i+12] & 0xff) << 24 | (in[i+13] & 0xff) << 16 |
           (in[i+14] & 0xff) <<  8 | (in[i+15] & 0xff);

      x0 ^= key[  0]; x1 ^= key[  1]; x2 ^= key[  2]; x3 ^= key[  3]; sbox0();
      x1 ^= key[  4]; x4 ^= key[  5]; x2 ^= key[  6]; x0 ^= key[  7]; sbox1();
      x0 ^= key[  8]; x4 ^= key[  9]; x2 ^= key[ 10]; x1 ^= key[ 11]; sbox2();
      x2 ^= key[ 12]; x1 ^= key[ 13]; x4 ^= key[ 14]; x3 ^= key[ 15]; sbox3();
      x1 ^= key[ 16]; x4 ^= key[ 17]; x3 ^= key[ 18]; x0 ^= key[ 19]; sbox4();
      x4 ^= key[ 20]; x2 ^= key[ 21]; x1 ^= key[ 22]; x0 ^= key[ 23]; sbox5();
      x2 ^= key[ 24]; x0 ^= key[ 25]; x4 ^= key[ 26]; x1 ^= key[ 27]; sbox6();
      x2 ^= key[ 28]; x0 ^= key[ 29]; x3 ^= key[ 30]; x4 ^= key[ 31]; sbox7();
      x0 = x3; x3 = x2; x2 = x4;

      x0 ^= key[ 32]; x1 ^= key[ 33]; x2 ^= key[ 34]; x3 ^= key[ 35]; sbox0();
      x1 ^= key[ 36]; x4 ^= key[ 37]; x2 ^= key[ 38]; x0 ^= key[ 39]; sbox1();
      x0 ^= key[ 40]; x4 ^= key[ 41]; x2 ^= key[ 42]; x1 ^= key[ 43]; sbox2();
      x2 ^= key[ 44]; x1 ^= key[ 45]; x4 ^= key[ 46]; x3 ^= key[ 47]; sbox3();
      x1 ^= key[ 48]; x4 ^= key[ 49]; x3 ^= key[ 50]; x0 ^= key[ 51]; sbox4();
      x4 ^= key[ 52]; x2 ^= key[ 53]; x1 ^= key[ 54]; x0 ^= key[ 55]; sbox5();
      x2 ^= key[ 56]; x0 ^= key[ 57]; x4 ^= key[ 58]; x1 ^= key[ 59]; sbox6();
      x2 ^= key[ 60]; x0 ^= key[ 61]; x3 ^= key[ 62]; x4 ^= key[ 63]; sbox7();
      x0 = x3; x3 = x2; x2 = x4;

      x0 ^= key[ 64]; x1 ^= key[ 65]; x2 ^= key[ 66]; x3 ^= key[ 67]; sbox0();
      x1 ^= key[ 68]; x4 ^= key[ 69]; x2 ^= key[ 70]; x0 ^= key[ 71]; sbox1();
      x0 ^= key[ 72]; x4 ^= key[ 73]; x2 ^= key[ 74]; x1 ^= key[ 75]; sbox2();
      x2 ^= key[ 76]; x1 ^= key[ 77]; x4 ^= key[ 78]; x3 ^= key[ 79]; sbox3();
      x1 ^= key[ 80]; x4 ^= key[ 81]; x3 ^= key[ 82]; x0 ^= key[ 83]; sbox4();
      x4 ^= key[ 84]; x2 ^= key[ 85]; x1 ^= key[ 86]; x0 ^= key[ 87]; sbox5();
      x2 ^= key[ 88]; x0 ^= key[ 89]; x4 ^= key[ 90]; x1 ^= key[ 91]; sbox6();
      x2 ^= key[ 92]; x0 ^= key[ 93]; x3 ^= key[ 94]; x4 ^= key[ 95]; sbox7();
      x0 = x3; x3 = x2; x2 = x4;

      x0 ^= key[ 96]; x1 ^= key[ 97]; x2 ^= key[ 98]; x3 ^= key[ 99]; sbox0();
      x1 ^= key[100]; x4 ^= key[101]; x2 ^= key[102]; x0 ^= key[103]; sbox1();
      x0 ^= key[104]; x4 ^= key[105]; x2 ^= key[106]; x1 ^= key[107]; sbox2();
      x2 ^= key[108]; x1 ^= key[109]; x4 ^= key[110]; x3 ^= key[111]; sbox3();
      x1 ^= key[112]; x4 ^= key[113]; x3 ^= key[114]; x0 ^= key[115]; sbox4();
      x4 ^= key[116]; x2 ^= key[117]; x1 ^= key[118]; x0 ^= key[119]; sbox5();
      x2 ^= key[120]; x0 ^= key[121]; x4 ^= key[122]; x1 ^= key[123]; sbox6();
      x2 ^= key[124]; x0 ^= key[125]; x3 ^= key[126]; x4 ^= key[127];
      sbox7noLT();
      x0 = x3; x3 = x2; x2 = x4;
      x0 ^= key[128]; x1 ^= key[129]; x2 ^= key[130]; x3 ^= key[131];

      out[o   ] = (byte)(x3 >>> 24);
      out[o+ 1] = (byte)(x3 >>> 16);
      out[o+ 2] = (byte)(x3 >>> 8);
      out[o+ 3] = (byte) x3;
      out[o+ 4] = (byte)(x2 >>> 24);
      out[o+ 5] = (byte)(x2 >>> 16);
      out[o+ 6] = (byte)(x2 >>> 8);
      out[o+ 7] = (byte) x2;
      out[o+ 8] = (byte)(x1 >>> 24);
      out[o+ 9] = (byte)(x1 >>> 16);
      out[o+10] = (byte)(x1 >>> 8);
      out[o+11] = (byte) x1;
      out[o+12] = (byte)(x0 >>> 24);
      out[o+13] = (byte)(x0 >>> 16);
      out[o+14] = (byte)(x0 >>> 8);
      out[o+15] = (byte) x0;
   }

   public synchronized void
   decrypt(byte[] in, int i, byte[] out, int o, Object K, int bs) {
      final int[] key = (int[]) K;

      x3 = (in[i   ] & 0xff) << 24 | (in[i+ 1] & 0xff) << 16 |
           (in[i+ 2] & 0xff) <<  8 | (in[i+ 3] & 0xff);
      x2 = (in[i+ 4] & 0xff) << 24 | (in[i+ 5] & 0xff) << 16 |
           (in[i+ 6] & 0xff) <<  8 | (in[i+ 7] & 0xff);
      x1 = (in[i+ 8] & 0xff) << 24 | (in[i+ 9] & 0xff) << 16 |
           (in[i+10] & 0xff) <<  8 | (in[i+11] & 0xff);
      x0 = (in[i+12] & 0xff) << 24 | (in[i+13] & 0xff) << 16 |
           (in[i+14] & 0xff) <<  8 | (in[i+15] & 0xff);

      x0 ^= key[128]; x1 ^= key[129]; x2 ^= key[130]; x3 ^= key[131];
      sboxI7noLT();
      x3 ^= key[124]; x0 ^= key[125]; x1 ^= key[126]; x4 ^= key[127]; sboxI6();
      x0 ^= key[120]; x1 ^= key[121]; x2 ^= key[122]; x4 ^= key[123]; sboxI5();
      x1 ^= key[116]; x3 ^= key[117]; x4 ^= key[118]; x2 ^= key[119]; sboxI4();
      x1 ^= key[112]; x2 ^= key[113]; x4 ^= key[114]; x0 ^= key[115]; sboxI3();
      x0 ^= key[108]; x1 ^= key[109]; x4 ^= key[110]; x2 ^= key[111]; sboxI2();
      x1 ^= key[104]; x3 ^= key[105]; x4 ^= key[106]; x2 ^= key[107]; sboxI1();
      x0 ^= key[100]; x1 ^= key[101]; x2 ^= key[102]; x4 ^= key[103]; sboxI0();
      x0 ^= key[ 96]; x3 ^= key[ 97]; x1 ^= key[ 98]; x4 ^= key[ 99]; sboxI7();
      x1 = x3; x3 = x4; x4 = x2;

      x3 ^= key[ 92]; x0 ^= key[ 93]; x1 ^= key[ 94]; x4 ^= key[ 95]; sboxI6();
      x0 ^= key[ 88]; x1 ^= key[ 89]; x2 ^= key[ 90]; x4 ^= key[ 91]; sboxI5();
      x1 ^= key[ 84]; x3 ^= key[ 85]; x4 ^= key[ 86]; x2 ^= key[ 87]; sboxI4();
      x1 ^= key[ 80]; x2 ^= key[ 81]; x4 ^= key[ 82]; x0 ^= key[ 83]; sboxI3();
      x0 ^= key[ 76]; x1 ^= key[ 77]; x4 ^= key[ 78]; x2 ^= key[ 79]; sboxI2();
      x1 ^= key[ 72]; x3 ^= key[ 73]; x4 ^= key[ 74]; x2 ^= key[ 75]; sboxI1();
      x0 ^= key[ 68]; x1 ^= key[ 69]; x2 ^= key[ 70]; x4 ^= key[ 71]; sboxI0();
      x0 ^= key[ 64]; x3 ^= key[ 65]; x1 ^= key[ 66]; x4 ^= key[ 67]; sboxI7();
      x1 = x3; x3 = x4; x4 = x2;

      x3 ^= key[ 60]; x0 ^= key[ 61]; x1 ^= key[ 62]; x4 ^= key[ 63]; sboxI6();
      x0 ^= key[ 56]; x1 ^= key[ 57]; x2 ^= key[ 58]; x4 ^= key[ 59]; sboxI5();
      x1 ^= key[ 52]; x3 ^= key[ 53]; x4 ^= key[ 54]; x2 ^= key[ 55]; sboxI4();
      x1 ^= key[ 48]; x2 ^= key[ 49]; x4 ^= key[ 50]; x0 ^= key[ 51]; sboxI3();
      x0 ^= key[ 44]; x1 ^= key[ 45]; x4 ^= key[ 46]; x2 ^= key[ 47]; sboxI2();
      x1 ^= key[ 40]; x3 ^= key[ 41]; x4 ^= key[ 42]; x2 ^= key[ 43]; sboxI1();
      x0 ^= key[ 36]; x1 ^= key[ 37]; x2 ^= key[ 38]; x4 ^= key[ 39]; sboxI0();
      x0 ^= key[ 32]; x3 ^= key[ 33]; x1 ^= key[ 34]; x4 ^= key[ 35]; sboxI7();
      x1 = x3; x3 = x4; x4 = x2;

      x3 ^= key[ 28]; x0 ^= key[ 29]; x1 ^= key[ 30]; x4 ^= key[ 31]; sboxI6();
      x0 ^= key[ 24]; x1 ^= key[ 25]; x2 ^= key[ 26]; x4 ^= key[ 27]; sboxI5();
      x1 ^= key[ 20]; x3 ^= key[ 21]; x4 ^= key[ 22]; x2 ^= key[ 23]; sboxI4();
      x1 ^= key[ 16]; x2 ^= key[ 17]; x4 ^= key[ 18]; x0 ^= key[ 19]; sboxI3();
      x0 ^= key[ 12]; x1 ^= key[ 13]; x4 ^= key[ 14]; x2 ^= key[ 15]; sboxI2();
      x1 ^= key[  8]; x3 ^= key[  9]; x4 ^= key[ 10]; x2 ^= key[ 11]; sboxI1();
      x0 ^= key[  4]; x1 ^= key[  5]; x2 ^= key[  6]; x4 ^= key[  7]; sboxI0();
      x2 = x1; x1 = x3; x3 = x4;

      x0 ^= key[  0]; x1 ^= key[  1]; x2 ^= key[  2]; x3 ^= key[  3];

      out[o   ] = (byte)(x3 >>> 24);
      out[o+ 1] = (byte)(x3 >>> 16);
      out[o+ 2] = (byte)(x3 >>> 8);
      out[o+ 3] = (byte) x3;
      out[o+ 4] = (byte)(x2 >>> 24);
      out[o+ 5] = (byte)(x2 >>> 16);
      out[o+ 6] = (byte)(x2 >>> 8);
      out[o+ 7] = (byte) x2;
      out[o+ 8] = (byte)(x1 >>> 24);
      out[o+ 9] = (byte)(x1 >>> 16);
      out[o+10] = (byte)(x1 >>> 8);
      out[o+11] = (byte) x1;
      out[o+12] = (byte)(x0 >>> 24);
      out[o+13] = (byte)(x0 >>> 16);
      out[o+14] = (byte)(x0 >>> 8);
      out[o+15] = (byte) x0;
   }

   public boolean selfTest() {
      if (valid == null) {
         boolean result = super.selfTest(); // do symmetry tests
         if (result) {
            result = testKat(KAT_KEY, KAT_CT);
         }
         valid = new Boolean(result);
      }
      return valid.booleanValue();
   }

   // Own methods. ----------------------------------------------------------

   private void sbox0() {
      x3 ^= x0;
      x4 = x1;
      x1 &= x3;
      x4 ^= x2;
      x1 ^= x0;
      x0 |= x3;
      x0 ^= x4;
      x4 ^= x3;
      x3 ^= x2;
      x2 |= x1;
      x2 ^= x4;
      x4 ^= -1;
      x4 |= x1;
      x1 ^= x3;
      x1 ^= x4;
      x3 |= x0;
      x1 ^= x3;
      x4 ^= x3;

      x1 = (x1 << 13) | (x1 >>> 19);
      x4 ^= x1;
      x3 = x1 << 3;
      x2 = (x2 <<  3) | (x2 >>> 29);
      x4 ^= x2;
      x0 ^= x2;
      x4 = (x4 <<  1) | (x4 >>> 31);
      x0 ^= x3;
      x0 = (x0 <<  7) | (x0 >>> 25);
      x3 = x4;
      x1 ^= x4;
      x3 <<= 7;
      x1 ^= x0;
      x2 ^= x0;
      x2 ^= x3;
      x1 = (x1 <<  5) | (x1 >>> 27);
      x2 = (x2 << 22) | (x2 >>> 10);
   }

   private void sbox1() {
      x4 = ~x4;
      x3 = x1;
      x1 ^= x4;
      x3 |= x4;
      x3 ^= x0;
      x0 &= x1;
      x2 ^= x3;
      x0 ^= x4;
      x0 |= x2;
      x1 ^= x3;
      x0 ^= x1;
      x4 &= x2;
      x1 |= x4;
      x4 ^= x3;
      x1 ^= x2;
      x3 |= x0;
      x1 ^= x3;
      x3 = ~x3;
      x4 ^= x0;
      x3 &= x2;
      x4 = ~x4;
      x3 ^= x1;
      x4 ^= x3;

      x0 = (x0 << 13) | (x0 >>> 19);
      x4 ^= x0;
      x3 = x0 << 3;
      x2 = (x2 <<  3) | (x2 >>> 29);
      x4 ^= x2;
      x1 ^= x2;
      x4 = (x4 <<  1) | (x4 >>> 31);
      x1 ^= x3;
      x1 = (x1 <<  7) | (x1 >>> 25);
      x3 = x4;
      x0 ^= x4;
      x3 <<= 7;
      x0 ^= x1;
      x2 ^= x1;
      x2 ^= x3;
      x0 = (x0 <<  5) | (x0 >>> 27);
      x2 = (x2 << 22) | (x2 >>> 10);
   }

   private void sbox2() {
      x3 = x0;
      x0 = x0 & x2;
      x0 = x0 ^ x1;
      x2 = x2 ^ x4;
      x2 = x2 ^ x0;
      x1 = x1 | x3;
      x1 = x1 ^ x4;
      x3 = x3 ^ x2;
      x4 = x1;
      x1 = x1 | x3;
      x1 = x1 ^ x0;
      x0 = x0 & x4;
      x3 = x3 ^ x0;
      x4 = x4 ^ x1;
      x4 = x4 ^ x3;
      x3 = ~x3;

      x2 = (x2 << 13) | (x2 >>> 19);
      x1 ^= x2;
      x0 = x2 << 3;
      x4 = (x4 <<  3) | (x4 >>> 29);
      x1 ^= x4;
      x3 ^= x4;
      x1 = (x1 <<  1) | (x1 >>> 31);
      x3 ^= x0;
      x3 = (x3 <<  7) | (x3 >>> 25);
      x0 = x1;
      x2 ^= x1;
      x0 <<= 7;
      x2 ^= x3;
      x4 ^= x3;
      x4 ^= x0;
      x2 = (x2 <<  5) | (x2 >>> 27);
      x4 = (x4 << 22) | (x4 >>> 10);
   }

   private void sbox3() {
      x0 = x2;
      x2 = x2 | x3;
      x3 = x3 ^ x1;
      x1 = x1 & x0;
      x0 = x0 ^ x4;
      x4 = x4 ^ x3;
      x3 = x3 & x2;
      x0 = x0 | x1;
      x3 = x3 ^ x0;
      x2 = x2 ^ x1;
      x0 = x0 & x2;
      x1 = x1 ^ x3;
      x0 = x0 ^ x4;
      x1 = x1 | x2;
      x1 = x1 ^ x4;
      x2 = x2 ^ x3;
      x4 = x1;
      x1 = x1 | x3;
      x1 = x1 ^ x2;

      x1 = (x1 << 13) | (x1 >>> 19);
      x4 ^= x1;
      x2 = x1 << 3;
      x3 = (x3 <<  3) | (x3 >>> 29);
      x4 ^= x3;
      x0 ^= x3;
      x4 = (x4 <<  1) | (x4 >>> 31);
      x0 ^= x2;
      x0 = (x0 <<  7) | (x0 >>> 25);
      x2 = x4;
      x1 ^= x4;
      x2 <<= 7;
      x1 ^= x0;
      x3 ^= x0;
      x3 ^= x2;
      x1 = (x1 <<  5) | (x1 >>> 27);
      x3 = (x3 << 22) | (x3 >>> 10);
   }

   private void sbox4() {
      x4 = x4 ^ x0;
      x0 = ~x0;
      x3 = x3 ^ x0;
      x0 = x0 ^ x1;
      x2 = x4;
      x4 = x4 & x0;
      x4 = x4 ^ x3;
      x2 = x2 ^ x0;
      x1 = x1 ^ x2;
      x3 = x3 & x2;
      x3 = x3 ^ x1;
      x1 = x1 & x4;
      x0 = x0 ^ x1;
      x2 = x2 | x4;
      x2 = x2 ^ x1;
      x1 = x1 | x0;
      x1 = x1 ^ x3;
      x3 = x3 & x0;
      x1 = ~x1;
      x2 = x2 ^ x3;

      x4 = (x4 << 13) | (x4 >>> 19);
      x2 ^= x4;
      x3 = x4 << 3;
      x1 = (x1 <<  3) | (x1 >>> 29);
      x2 ^= x1;
      x0 ^= x1;
      x2 = (x2 <<  1) | (x2 >>> 31);
      x0 ^= x3;
      x0 = (x0 <<  7) | (x0 >>> 25);
      x3 = x2;
      x4 ^= x2;
      x3 <<= 7;
      x4 ^= x0;
      x1 ^= x0;
      x1 ^= x3;
      x4 = (x4 <<  5) | (x4 >>> 27);
      x1 = (x1 << 22) | (x1 >>> 10);
   }

   private void sbox5() {
      x4 = x4 ^ x2;
      x2 = x2 ^ x0;
      x0 = ~x0;
      x3 = x2;
      x2 = x2 & x4;
      x1 = x1 ^ x0;
      x2 = x2 ^ x1;
      x1 = x1 | x3;
      x3 = x3 ^ x0;
      x0 = x0 & x2;
      x0 = x0 ^ x4;
      x3 = x3 ^ x2;
      x3 = x3 ^ x1;
      x1 = x1 ^ x4;
      x4 = x4 & x0;
      x1 = ~x1;
      x4 = x4 ^ x3;
      x3 = x3 | x0;
      x1 = x1 ^ x3;

      x2 = (x2 << 13) | (x2 >>> 19);
      x0 ^= x2;
      x3 = x2 << 3;
      x4 = (x4 <<  3) | (x4 >>> 29);
      x0 ^= x4;
      x1 ^= x4;
      x0 = (x0 <<  1) | (x0 >>> 31);
      x1 ^= x3;
      x1 = (x1 <<  7) | (x1 >>> 25);
      x3 = x0;
      x2 ^= x0;
      x3 <<= 7;
      x2 ^= x1;
      x4 ^= x1;
      x4 ^= x3;
      x2 = (x2 <<  5) | (x2 >>> 27);
      x4 = (x4 << 22) | (x4 >>> 10);
   }

   private void sbox6() {
      x4 = ~x4;
      x3 = x1;
      x1 = x1 & x2;
      x2 = x2 ^ x3;
      x1 = x1 ^ x4;
      x4 = x4 | x3;
      x0 = x0 ^ x1;
      x4 = x4 ^ x2;
      x2 = x2 | x0;
      x4 = x4 ^ x0;
      x3 = x3 ^ x2;
      x2 = x2 | x1;
      x2 = x2 ^ x4;
      x3 = x3 ^ x1;
      x3 = x3 ^ x2;
      x1 = ~x1;
      x4 = x4 & x3;
      x4 = x4 ^ x1;
      x2 = (x2 << 13) | (x2 >>> 19);
      x0 ^= x2;
      x1 = x2 << 3;
      x3 = (x3 <<  3) | (x3 >>> 29);
      x0 ^= x3;
      x4 ^= x3;
      x0 = (x0 <<  1) | (x0 >>> 31);
      x4 ^= x1;
      x4 = (x4 <<  7) | (x4 >>> 25);
      x1 = x0;
      x2 ^= x0;
      x1 <<= 7;
      x2 ^= x4;
      x3 ^= x4;
      x3 ^= x1;
      x2 = (x2 <<  5) | (x2 >>> 27);
      x3 = (x3 << 22) | (x3 >>> 10);
   }

   private void sbox7() {
      x1 = x3;
      x3 = x3 & x0;
      x3 = x3 ^ x4;
      x4 = x4 & x0;
      x1 = x1 ^ x3;
      x3 = x3 ^ x0;
      x0 = x0 ^ x2;
      x2 = x2 | x1;
      x2 = x2 ^ x3;
      x4 = x4 ^ x0;
      x3 = x3 ^ x4;
      x4 = x4 & x2;
      x4 = x4 ^ x1;
      x1 = x1 ^ x3;
      x3 = x3 & x2;
      x1 = ~x1;
      x3 = x3 ^ x1;
      x1 = x1 & x2;
      x0 = x0 ^ x4;
      x1 = x1 ^ x0;
      x3 = (x3 << 13) | (x3 >>> 19);
      x1 ^= x3;
      x0 = x3 << 3;
      x4 = (x4 <<  3) | (x4 >>> 29);
      x1 ^= x4;
      x2 ^= x4;
      x1 = (x1 <<  1) | (x1 >>> 31);
      x2 ^= x0;
      x2 = (x2 <<  7) | (x2 >>> 25);
      x0 = x1;
      x3 ^= x1;
      x0 <<= 7;
      x3 ^= x2;
      x4 ^= x2;
      x4 ^= x0;
      x3 = (x3 <<  5) | (x3 >>> 27);
      x4 = (x4 << 22) | (x4 >>> 10);
   }

   /** The final S-box, with no transform. */
   private void sbox7noLT() {
      x1 = x3;
      x3 = x3 & x0;
      x3 = x3 ^ x4;
      x4 = x4 & x0;
      x1 = x1 ^ x3;
      x3 = x3 ^ x0;
      x0 = x0 ^ x2;
      x2 = x2 | x1;
      x2 = x2 ^ x3;
      x4 = x4 ^ x0;
      x3 = x3 ^ x4;
      x4 = x4 & x2;
      x4 = x4 ^ x1;
      x1 = x1 ^ x3;
      x3 = x3 & x2;
      x1 = ~x1;
      x3 = x3 ^ x1;
      x1 = x1 & x2;
      x0 = x0 ^ x4;
      x1 = x1 ^ x0;
   }

   private void sboxI7noLT() {
      x4 = x2;
      x2 ^= x0;
      x0 &= x3;
      x2 = ~x2;
      x4 |= x3;
      x3 ^= x1;
      x1 |= x0;
      x0 ^= x2;
      x2 &= x4;
      x1 ^= x2;
      x2 ^= x0;
      x0 |= x2;
      x3 &= x4;
      x0 ^= x3;
      x4 ^= x1;
      x3 ^= x4;
      x4 |= x0;
      x3 ^= x2;
      x4 ^= x2;
   }

   private void sboxI6() {
      x1 = (x1 >>> 22) | (x1 << 10);
      x3 = (x3 >>>  5) | (x3 << 27);
      x2 = x0;
      x1 ^= x4;
      x2 <<= 7;
      x3 ^= x4;
      x1 ^= x2;
      x3 ^= x0;
      x4 = (x4 >>>  7) | (x4 << 25);
      x0 = (x0 >>>  1) | (x0 << 31);
      x0 ^= x3;
      x2 = x3 << 3;
      x4 ^= x2;
      x3 = (x3 >>> 13) | (x3 << 19);
      x0 ^= x1;
      x4 ^= x1;
      x1 = (x1 >>>  3) | (x1 << 29);
      x3 ^= x1;
      x2 = x1;
      x1 &= x3;
      x2 ^= x4;
      x1 = ~x1;
      x4 ^= x0;
      x1 ^= x4;
      x2 |= x3;
      x3 ^= x1;
      x4 ^= x2;
      x2 ^= x0;
      x0 &= x4;
      x0 ^= x3;
      x3 ^= x4;
      x3 |= x1;
      x4 ^= x0;
      x2 ^= x3;
   }

   /** Inverse linear transformation, then inverse S-box 5. */
   private void sboxI5() {
      x2 = (x2 >>> 22) | (x2 << 10);
      x0 = (x0 >>>  5) | (x0 << 27);
      x3 = x1;
      x2 ^= x4;
      x3 <<= 7;
      x0 ^= x4;
      x2 ^= x3;
      x0 ^= x1;
      x4 = (x4 >>>  7) | (x4 << 25);
      x1 = (x1 >>>  1) | (x1 << 31);
      x1 ^= x0;
      x3 = x0 << 3;
      x4 ^= x3;
      x0 = (x0 >>> 13) | (x0 << 19);
      x1 ^= x2;
      x4 ^= x2;
      x2 = (x2 >>>  3) | (x2 << 29);
      x1 = ~x1;
      x3 = x4;
      x2 ^= x1;
      x4 |= x0;
      x4 ^= x2;
      x2 |= x1;
      x2 &= x0;
      x3 ^= x4;
      x2 ^= x3;
      x3 |= x0;
      x3 ^= x1;
      x1 &= x2;
      x1 ^= x4;
      x3 ^= x2;
      x4 &= x3;
      x3 ^= x1;
      x4 ^= x0;
      x4 ^= x3;
      x3 = ~x3;
   }

   /** Inverse linear transformation, then inverse S-box 4. */
   private void sboxI4() {
      x4 = (x4 >>> 22) | (x4 << 10);
      x1 = (x1 >>>  5) | (x1 << 27);
      x0 = x3;
      x4 ^= x2;
      x0 <<= 7;
      x1 ^= x2;
      x4 ^= x0;
      x1 ^= x3;
      x2 = (x2 >>>  7) | (x2 << 25);
      x3 = (x3 >>>  1) | (x3 << 31);
      x3 ^= x1;
      x0 = x1 << 3;
      x2 ^= x0;
      x1 = (x1 >>> 13) | (x1 << 19);
      x3 ^= x4;
      x2 ^= x4;
      x4 = (x4 >>>  3) | (x4 << 29);
      x0 = x4;
      x4 &= x2;
      x4 ^= x3;
      x3 |= x2;
      x3 &= x1;
      x0 ^= x4;
      x0 ^= x3;
      x3 &= x4;
      x1 = ~x1;
      x2 ^= x0;
      x3 ^= x2;
      x2 &= x1;
      x2 ^= x4;
      x1 ^= x3;
      x4 &= x1;
      x2 ^= x1;
      x4 ^= x0;
      x4 |= x2;
      x2 ^= x1;
      x4 ^= x3;
   }

   /** Inverse linear transformation, then inverse S-box 3. */
   private void sboxI3() {
      x4 = (x4 >>> 22) | (x4 << 10);
      x1 = (x1 >>>  5) | (x1 << 27);
      x3 = x2;
      x4 ^= x0;
      x3 <<= 7;
      x1 ^= x0;
      x4 ^= x3;
      x1 ^= x2;
      x0 = (x0 >>>  7) | (x0 << 25);
      x2 = (x2 >>>  1) | (x2 << 31);
      x2 ^= x1;
      x3 = x1 << 3;
      x0 ^= x3;
      x1 = (x1 >>> 13) | (x1 << 19);
      x2 ^= x4;
      x0 ^= x4;
      x4 = (x4 >>>  3) | (x4 << 29);
      x3 = x4;
      x4 ^= x2;
      x2 &= x4;
      x2 ^= x1;
      x1 &= x3;
      x3 ^= x0;
      x0 |= x2;
      x0 ^= x4;
      x1 ^= x3;
      x4 ^= x1;
      x1 |= x0;
      x1 ^= x2;
      x3 ^= x4;
      x4 &= x0;
      x2 |= x0;
      x2 ^= x4;
      x3 ^= x1;
      x4 ^= x3;
   }

   /** Inverse linear transformation, then inverse S-box 2. */
   private void sboxI2() {
      x4 = (x4 >>> 22) | (x4 << 10);
      x0 = (x0 >>>  5) | (x0 << 27);
      x3 = x1;
      x4 ^= x2;
      x3 <<= 7;
      x0 ^= x2;
      x4 ^= x3;
      x0 ^= x1;
      x2 = (x2 >>>  7) | (x2 << 25);
      x1 = (x1 >>>  1) | (x1 << 31);
      x1 ^= x0;
      x3 = x0 << 3;
      x2 ^= x3;
      x0 = (x0 >>> 13) | (x0 << 19);
      x1 ^= x4;
      x2 ^= x4;
      x4 = (x4 >>>  3) | (x4 << 29);
      x4 ^= x2;
      x2 ^= x0;
      x3 = x2;
      x2 &= x4;
      x2 ^= x1;
      x1 |= x4;
      x1 ^= x3;
      x3 &= x2;
      x4 ^= x2;
      x3 &= x0;
      x3 ^= x4;
      x4 &= x1;
      x4 |= x0;
      x2 = ~x2;
      x4 ^= x2;
      x0 ^= x2;
      x0 &= x1;
      x2 ^= x3;
      x2 ^= x0;
   }

   /** Inverse linear transformation, then inverse S-box 1. */
   private void sboxI1() {
      x4 = (x4 >>> 22) | (x4 << 10);
      x1 = (x1 >>>  5) | (x1 << 27);
      x0 = x3;
      x4 ^= x2;
      x0 <<= 7;
      x1 ^= x2;
      x4 ^= x0;
      x1 ^= x3;
      x2 = (x2 >>>  7) | (x2 << 25);
      x3 = (x3 >>>  1) | (x3 << 31);
      x3 ^= x1;
      x0 = x1 << 3;
      x2 ^= x0;
      x1 = (x1 >>> 13) | (x1 << 19);
      x3 ^= x4;
      x2 ^= x4;
      x4 = (x4 >>>  3) | (x4 << 29);
      x0 = x3;
      x3 ^= x2;
      x2 &= x3;
      x0 ^= x4;
      x2 ^= x1;
      x1 |= x3;
      x4 ^= x2;
      x1 ^= x0;
      x1 |= x4;
      x3 ^= x2;
      x1 ^= x3;
      x3 |= x2;
      x3 ^= x1;
      x0 = ~x0;
      x0 ^= x3;
      x3 |= x1;
      x3 ^= x1;
      x3 |= x0;
      x2 ^= x3;
   }

   /** Inverse linear transformation, then inverse S-box 0. */
   private void sboxI0() {
      x2 = (x2 >>> 22) | (x2 << 10);
      x0 = (x0 >>>  5) | (x0 << 27);
      x3 = x1;
      x2 ^= x4;
      x3 <<= 7;
      x0 ^= x4;
      x2 ^= x3;
      x0 ^= x1;
      x4 = (x4 >>>  7) | (x4 << 25);
      x1 = (x1 >>>  1) | (x1 << 31);
      x1 ^= x0;
      x3 = x0 << 3;
      x4 ^= x3;
      x0 = (x0 >>> 13) | (x0 << 19);
      x1 ^= x2;
      x4 ^= x2;
      x2 = (x2 >>>  3) | (x2 << 29);
      x2 = ~x2;
      x3 = x1;
      x1 |= x0;
      x3 = ~x3;
      x1 ^= x2;
      x2 |= x3;
      x1 ^= x4;
      x0 ^= x3;
      x2 ^= x0;
      x0 &= x4;
      x3 ^= x0;
      x0 |= x1;
      x0 ^= x2;
      x4 ^= x3;
      x2 ^= x1;
      x4 ^= x0;
      x4 ^= x1;
      x2 &= x4;
      x3 ^= x2;
   }

   /** Inverse linear transformation, then inverse S-box 7. */
   private void sboxI7() {
      x1 = (x1 >>> 22) | (x1 << 10);
      x0 = (x0 >>>  5) | (x0 << 27);
      x2 = x3;
      x1 ^= x4;
      x2 <<= 7;
      x0 ^= x4;
      x1 ^= x2;
      x0 ^= x3;
      x4 = (x4 >>>  7) | (x4 << 25);
      x3 = (x3 >>>  1) | (x3 << 31);
      x3 ^= x0;
      x2 = x0 << 3;
      x4 ^= x2;
      x0 = (x0 >>> 13) | (x0 << 19);
      x3 ^= x1;
      x4 ^= x1;
      x1 = (x1 >>>  3) | (x1 << 29);
      x2 = x1;
      x1 ^= x0;
      x0 &= x4;
      x1 = ~x1;
      x2 |= x4;
      x4 ^= x3;
      x3 |= x0;
      x0 ^= x1;
      x1 &= x2;
      x3 ^= x1;
      x1 ^= x0;
      x0 |= x1;
      x4 &= x2;
      x0 ^= x4;
      x2 ^= x3;
      x4 ^= x2;
      x2 |= x0;
      x4 ^= x1;
      x2 ^= x1;
   }
}
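
A note for anyone checking the inverse methods by hand: each sboxI* method
other than sboxI7noLT opens with the same pattern of right-rotations and
XORs, which undoes the rotate-left/XOR sequence that closes the forward
S-box methods. Below is a minimal sketch of that relationship. It is not
part of the patch: it uses the plain word order X0..X3 instead of the
permuted x0..x4 registers, the class and method names are made up for
illustration, and it assumes Integer.rotateLeft/rotateRight (Java 5 or
later) in place of the explicit shift pairs used above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the patch: the linear transformation
// and its inverse written out on four words in natural order.
final class SerpentLtRoundTrip {

    // Forward linear transformation on the words {X0, X1, X2, X3}.
    static int[] lt(int x0, int x1, int x2, int x3) {
        x0 = Integer.rotateLeft(x0, 13);
        x2 = Integer.rotateLeft(x2, 3);
        x1 ^= x0 ^ x2;
        x3 ^= x2 ^ (x0 << 3);
        x1 = Integer.rotateLeft(x1, 1);
        x3 = Integer.rotateLeft(x3, 7);
        x0 ^= x1 ^ x3;
        x2 ^= x3 ^ (x1 << 7);
        x0 = Integer.rotateLeft(x0, 5);
        x2 = Integer.rotateLeft(x2, 22);
        return new int[] { x0, x1, x2, x3 };
    }

    // Inverse: the same steps undone in reverse order.
    static int[] ltInverse(int x0, int x1, int x2, int x3) {
        x2 = Integer.rotateRight(x2, 22);
        x0 = Integer.rotateRight(x0, 5);
        x2 ^= x3 ^ (x1 << 7);
        x0 ^= x1 ^ x3;
        x3 = Integer.rotateRight(x3, 7);
        x1 = Integer.rotateRight(x1, 1);
        x3 ^= x2 ^ (x0 << 3);
        x1 ^= x0 ^ x2;
        x2 = Integer.rotateRight(x2, 3);
        x0 = Integer.rotateRight(x0, 13);
        return new int[] { x0, x1, x2, x3 };
    }

    public static void main(String[] args) {
        // Arbitrary test words; any four ints should round-trip.
        int[] in = { 0x01234567, 0x89abcdef, 0xfedcba98, 0x76543210 };
        int[] mid = lt(in[0], in[1], in[2], in[3]);
        int[] out = ltInverse(mid[0], mid[1], mid[2], mid[3]);
        if (!Arrays.equals(in, out)) {
            throw new AssertionError("inverse transform did not undo the transform");
        }
        System.out.println("round trip ok");
    }
}
```

The shifts by 3 and 7 are plain left shifts, not rotations, in both
directions, which matches the `x3 << 3` and `<<= 7` lines in the methods
above.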

