linux-kernel.vger.kernel.org archive mirror
* [PATCH] crypto: vmx: Remove dubiously licensed crypto code
@ 2017-03-29 12:56 Michal Suchanek
  2017-03-29 14:51 ` Greg Kroah-Hartman
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Michal Suchanek @ 2017-03-29 12:56 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Greg Kroah-Hartman,
	Geert Uytterhoeven, Mauro Carvalho Chehab, linux-kernel,
	linux-crypto, linuxppc-dev
  Cc: Michal Suchanek


While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
subroutines for XTS"), which adds the OpenSSL license header to
drivers/crypto/vmx/aesp8-ppc.pl, the licensing of this driver came into
question. The full license header reads:

 # Licensed under the OpenSSL license (the "License").  You may not use
 # this file except in compliance with the License.  You can obtain a copy
 # in the file LICENSE in the source distribution or at
 # https://www.openssl.org/source/license.html

 #
 # ====================================================================
 # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
 # project. The module is, however, dual licensed under OpenSSL and
 # CRYPTOGAMS licenses depending on where you obtain it. For further
 # details see http://www.openssl.org/~appro/cryptogams/.
 # ====================================================================

After seeking legal advice it is still not clear that this driver can be
legally used in Linux. In particular, the "depending on where you obtain
it" clause does not make clear when the GPL applies and when the
OpenSSL license does.

I tried contacting the author of the code for clarification but did not
hear back. In the absence of clear licensing, the only solution I see is
to remove this code.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
---
 MAINTAINERS                       |   12 -
 drivers/crypto/Kconfig            |    8 -
 drivers/crypto/Makefile           |    1 -
 drivers/crypto/vmx/.gitignore     |    2 -
 drivers/crypto/vmx/Kconfig        |    9 -
 drivers/crypto/vmx/Makefile       |   21 -
 drivers/crypto/vmx/aes.c          |  150 --
 drivers/crypto/vmx/aes_cbc.c      |  202 --
 drivers/crypto/vmx/aes_ctr.c      |  191 --
 drivers/crypto/vmx/aes_xts.c      |  190 --
 drivers/crypto/vmx/aesp8-ppc.h    |   25 -
 drivers/crypto/vmx/aesp8-ppc.pl   | 3789 -------------------------------------
 drivers/crypto/vmx/ghash.c        |  227 ---
 drivers/crypto/vmx/ghashp8-ppc.pl |  234 ---
 drivers/crypto/vmx/ppc-xlate.pl   |  228 ---
 drivers/crypto/vmx/vmx.c          |   88 -
 16 files changed, 5377 deletions(-)
 delete mode 100644 drivers/crypto/vmx/.gitignore
 delete mode 100644 drivers/crypto/vmx/Kconfig
 delete mode 100644 drivers/crypto/vmx/Makefile
 delete mode 100644 drivers/crypto/vmx/aes.c
 delete mode 100644 drivers/crypto/vmx/aes_cbc.c
 delete mode 100644 drivers/crypto/vmx/aes_ctr.c
 delete mode 100644 drivers/crypto/vmx/aes_xts.c
 delete mode 100644 drivers/crypto/vmx/aesp8-ppc.h
 delete mode 100644 drivers/crypto/vmx/aesp8-ppc.pl
 delete mode 100644 drivers/crypto/vmx/ghash.c
 delete mode 100644 drivers/crypto/vmx/ghashp8-ppc.pl
 delete mode 100644 drivers/crypto/vmx/ppc-xlate.pl
 delete mode 100644 drivers/crypto/vmx/vmx.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1b0a87ffffab..fd4cbf046ab4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6190,18 +6190,6 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
 S:	Maintained
 F:	arch/ia64/
 
-IBM Power VMX Cryptographic instructions
-M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
-M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
-L:	linux-crypto@vger.kernel.org
-S:	Supported
-F:	drivers/crypto/vmx/Makefile
-F:	drivers/crypto/vmx/Kconfig
-F:	drivers/crypto/vmx/vmx.c
-F:	drivers/crypto/vmx/aes*
-F:	drivers/crypto/vmx/ghash*
-F:	drivers/crypto/vmx/ppc-xlate.pl
-
 IBM Power in-Nest Crypto Acceleration
 M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
 M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 473d31288ad8..9fcd3af1f2f1 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -530,14 +530,6 @@ config CRYPTO_DEV_QCE
 	  hardware. To compile this driver as a module, choose M here. The
 	  module will be called qcrypto.
 
-config CRYPTO_DEV_VMX
-	bool "Support for VMX cryptographic acceleration instructions"
-	depends on PPC64 && VSX
-	help
-	  Support for VMX cryptographic acceleration instructions.
-
-source "drivers/crypto/vmx/Kconfig"
-
 config CRYPTO_DEV_IMGTEC_HASH
 	tristate "Imagination Technologies hardware hash accelerator"
 	depends on MIPS || COMPILE_TEST
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 739609471169..486e57e10e7a 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -34,5 +34,4 @@ obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
 obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
 obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
 obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/
-obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
 obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
diff --git a/drivers/crypto/vmx/.gitignore b/drivers/crypto/vmx/.gitignore
deleted file mode 100644
index af4a7ce4738d..000000000000
--- a/drivers/crypto/vmx/.gitignore
+++ /dev/null
@@ -1,2 +0,0 @@
-aesp8-ppc.S
-ghashp8-ppc.S
diff --git a/drivers/crypto/vmx/Kconfig b/drivers/crypto/vmx/Kconfig
deleted file mode 100644
index c3d524ea6998..000000000000
--- a/drivers/crypto/vmx/Kconfig
+++ /dev/null
@@ -1,9 +0,0 @@
-config CRYPTO_DEV_VMX_ENCRYPT
-	tristate "Encryption acceleration support on P8 CPU"
-	depends on CRYPTO_DEV_VMX
-	select CRYPTO_GHASH
-	default m
-	help
-	  Support for VMX cryptographic acceleration instructions on Power8 CPU.
-	  This module supports acceleration for AES and GHASH in hardware. If you
-	  choose 'M' here, this module will be called vmx-crypto.
diff --git a/drivers/crypto/vmx/Makefile b/drivers/crypto/vmx/Makefile
deleted file mode 100644
index 55f7c392582f..000000000000
--- a/drivers/crypto/vmx/Makefile
+++ /dev/null
@@ -1,21 +0,0 @@
-obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) += vmx-crypto.o
-vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
-
-ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
-TARGET := linux-ppc64le
-else
-TARGET := linux-ppc64
-endif
-
-quiet_cmd_perl = PERL $@
-      cmd_perl = $(PERL) $(<) $(TARGET) > $(@)
-
-targets += aesp8-ppc.S ghashp8-ppc.S
-
-$(obj)/aesp8-ppc.S: $(src)/aesp8-ppc.pl FORCE
-	$(call if_changed,perl)
-  
-$(obj)/ghashp8-ppc.S: $(src)/ghashp8-ppc.pl FORCE
-	$(call if_changed,perl)
-
-clean-files := aesp8-ppc.S ghashp8-ppc.S
diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
deleted file mode 100644
index 022c7ab7351a..000000000000
--- a/drivers/crypto/vmx/aes.c
+++ /dev/null
@@ -1,150 +0,0 @@
-/**
- * AES routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
-#include <linux/hardirq.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-
-#include "aesp8-ppc.h"
-
-struct p8_aes_ctx {
-	struct crypto_cipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
-};
-
-static int p8_aes_init(struct crypto_tfm *tfm)
-{
-	const char *alg;
-	struct crypto_cipher *fallback;
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!(alg = crypto_tfm_alg_name(tfm))) {
-		printk(KERN_ERR "Failed to get algorithm name.\n");
-		return -ENOENT;
-	}
-
-	fallback = crypto_alloc_cipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-		       "Failed to allocate transformation for '%s': %ld\n",
-		       alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-	       crypto_tfm_alg_driver_name((struct crypto_tfm *) fallback));
-
-	crypto_cipher_set_flags(fallback,
-				crypto_cipher_get_flags((struct
-							 crypto_cipher *)
-							tfm));
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_aes_exit(struct crypto_tfm *tfm)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_cipher(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
-			 unsigned int keylen)
-{
-	int ret;
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
-	ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	ret += crypto_cipher_setkey(ctx->fallback, key, keylen);
-	return ret;
-}
-
-static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (in_interrupt()) {
-		crypto_cipher_encrypt_one(ctx->fallback, dst, src);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-		aes_p8_encrypt(src, dst, &ctx->enc_key);
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-}
-
-static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (in_interrupt()) {
-		crypto_cipher_decrypt_one(ctx->fallback, dst, src);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-		aes_p8_decrypt(src, dst, &ctx->dec_key);
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-}
-
-struct crypto_alg p8_aes_alg = {
-	.cra_name = "aes",
-	.cra_driver_name = "p8_aes",
-	.cra_module = THIS_MODULE,
-	.cra_priority = 1000,
-	.cra_type = NULL,
-	.cra_flags = CRYPTO_ALG_TYPE_CIPHER | CRYPTO_ALG_NEED_FALLBACK,
-	.cra_alignmask = 0,
-	.cra_blocksize = AES_BLOCK_SIZE,
-	.cra_ctxsize = sizeof(struct p8_aes_ctx),
-	.cra_init = p8_aes_init,
-	.cra_exit = p8_aes_exit,
-	.cra_cipher = {
-		       .cia_min_keysize = AES_MIN_KEY_SIZE,
-		       .cia_max_keysize = AES_MAX_KEY_SIZE,
-		       .cia_setkey = p8_aes_setkey,
-		       .cia_encrypt = p8_aes_encrypt,
-		       .cia_decrypt = p8_aes_decrypt,
-	},
-};
diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
deleted file mode 100644
index 72a26eb4e954..000000000000
--- a/drivers/crypto/vmx/aes_cbc.c
+++ /dev/null
@@ -1,202 +0,0 @@
-/**
- * AES CBC routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
-#include <linux/hardirq.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-#include <crypto/scatterwalk.h>
-#include <crypto/skcipher.h>
-
-#include "aesp8-ppc.h"
-
-struct p8_aes_cbc_ctx {
-	struct crypto_skcipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
-};
-
-static int p8_aes_cbc_init(struct crypto_tfm *tfm)
-{
-	const char *alg;
-	struct crypto_skcipher *fallback;
-	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!(alg = crypto_tfm_alg_name(tfm))) {
-		printk(KERN_ERR "Failed to get algorithm name.\n");
-		return -ENOENT;
-	}
-
-	fallback = crypto_alloc_skcipher(alg, 0,
-			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
-
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-		       "Failed to allocate transformation for '%s': %ld\n",
-		       alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-		crypto_skcipher_driver_name(fallback));
-
-
-	crypto_skcipher_set_flags(
-		fallback,
-		crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_aes_cbc_exit(struct crypto_tfm *tfm)
-{
-	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_skcipher(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
-			     unsigned int keylen)
-{
-	int ret;
-	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
-	ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	ret += crypto_skcipher_setkey(ctx->fallback, key, keylen);
-	return ret;
-}
-
-static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
-			      struct scatterlist *dst,
-			      struct scatterlist *src, unsigned int nbytes)
-{
-	int ret;
-	struct blkcipher_walk walk;
-	struct p8_aes_cbc_ctx *ctx =
-		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-
-	if (in_interrupt()) {
-		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
-		skcipher_request_set_tfm(req, ctx->fallback);
-		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
-		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
-		ret = crypto_skcipher_encrypt(req);
-		skcipher_request_zero(req);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-
-		blkcipher_walk_init(&walk, dst, src, nbytes);
-		ret = blkcipher_walk_virt(desc, &walk);
-		while ((nbytes = walk.nbytes)) {
-			aes_p8_cbc_encrypt(walk.src.virt.addr,
-					   walk.dst.virt.addr,
-					   nbytes & AES_BLOCK_MASK,
-					   &ctx->enc_key, walk.iv, 1);
-			nbytes &= AES_BLOCK_SIZE - 1;
-			ret = blkcipher_walk_done(desc, &walk, nbytes);
-		}
-
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-
-	return ret;
-}
-
-static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
-			      struct scatterlist *dst,
-			      struct scatterlist *src, unsigned int nbytes)
-{
-	int ret;
-	struct blkcipher_walk walk;
-	struct p8_aes_cbc_ctx *ctx =
-		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-
-	if (in_interrupt()) {
-		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
-		skcipher_request_set_tfm(req, ctx->fallback);
-		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
-		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
-		ret = crypto_skcipher_decrypt(req);
-		skcipher_request_zero(req);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-
-		blkcipher_walk_init(&walk, dst, src, nbytes);
-		ret = blkcipher_walk_virt(desc, &walk);
-		while ((nbytes = walk.nbytes)) {
-			aes_p8_cbc_encrypt(walk.src.virt.addr,
-					   walk.dst.virt.addr,
-					   nbytes & AES_BLOCK_MASK,
-					   &ctx->dec_key, walk.iv, 0);
-			nbytes &= AES_BLOCK_SIZE - 1;
-			ret = blkcipher_walk_done(desc, &walk, nbytes);
-		}
-
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-
-	return ret;
-}
-
-
-struct crypto_alg p8_aes_cbc_alg = {
-	.cra_name = "cbc(aes)",
-	.cra_driver_name = "p8_aes_cbc",
-	.cra_module = THIS_MODULE,
-	.cra_priority = 2000,
-	.cra_type = &crypto_blkcipher_type,
-	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
-	.cra_alignmask = 0,
-	.cra_blocksize = AES_BLOCK_SIZE,
-	.cra_ctxsize = sizeof(struct p8_aes_cbc_ctx),
-	.cra_init = p8_aes_cbc_init,
-	.cra_exit = p8_aes_cbc_exit,
-	.cra_blkcipher = {
-			  .ivsize = AES_BLOCK_SIZE,
-			  .min_keysize = AES_MIN_KEY_SIZE,
-			  .max_keysize = AES_MAX_KEY_SIZE,
-			  .setkey = p8_aes_cbc_setkey,
-			  .encrypt = p8_aes_cbc_encrypt,
-			  .decrypt = p8_aes_cbc_decrypt,
-	},
-};
diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
deleted file mode 100644
index 7cf6d31c1123..000000000000
--- a/drivers/crypto/vmx/aes_ctr.c
+++ /dev/null
@@ -1,191 +0,0 @@
-/**
- * AES CTR routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
-#include <linux/hardirq.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-#include <crypto/scatterwalk.h>
-#include "aesp8-ppc.h"
-
-struct p8_aes_ctr_ctx {
-	struct crypto_blkcipher *fallback;
-	struct aes_key enc_key;
-};
-
-static int p8_aes_ctr_init(struct crypto_tfm *tfm)
-{
-	const char *alg;
-	struct crypto_blkcipher *fallback;
-	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!(alg = crypto_tfm_alg_name(tfm))) {
-		printk(KERN_ERR "Failed to get algorithm name.\n");
-		return -ENOENT;
-	}
-
-	fallback =
-	    crypto_alloc_blkcipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-		       "Failed to allocate transformation for '%s': %ld\n",
-		       alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-	       crypto_tfm_alg_driver_name((struct crypto_tfm *) fallback));
-
-	crypto_blkcipher_set_flags(
-		fallback,
-		crypto_blkcipher_get_flags((struct crypto_blkcipher *)tfm));
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_aes_ctr_exit(struct crypto_tfm *tfm)
-{
-	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_blkcipher(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_aes_ctr_setkey(struct crypto_tfm *tfm, const u8 *key,
-			     unsigned int keylen)
-{
-	int ret;
-	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	ret += crypto_blkcipher_setkey(ctx->fallback, key, keylen);
-	return ret;
-}
-
-static void p8_aes_ctr_final(struct p8_aes_ctr_ctx *ctx,
-			     struct blkcipher_walk *walk)
-{
-	u8 *ctrblk = walk->iv;
-	u8 keystream[AES_BLOCK_SIZE];
-	u8 *src = walk->src.virt.addr;
-	u8 *dst = walk->dst.virt.addr;
-	unsigned int nbytes = walk->nbytes;
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	aes_p8_encrypt(ctrblk, keystream, &ctx->enc_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	crypto_xor(keystream, src, nbytes);
-	memcpy(dst, keystream, nbytes);
-	crypto_inc(ctrblk, AES_BLOCK_SIZE);
-}
-
-static int p8_aes_ctr_crypt(struct blkcipher_desc *desc,
-			    struct scatterlist *dst,
-			    struct scatterlist *src, unsigned int nbytes)
-{
-	int ret;
-	u64 inc;
-	struct blkcipher_walk walk;
-	struct p8_aes_ctr_ctx *ctx =
-		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-	struct blkcipher_desc fallback_desc = {
-		.tfm = ctx->fallback,
-		.info = desc->info,
-		.flags = desc->flags
-	};
-
-	if (in_interrupt()) {
-		ret = crypto_blkcipher_encrypt(&fallback_desc, dst, src,
-					       nbytes);
-	} else {
-		blkcipher_walk_init(&walk, dst, src, nbytes);
-		ret = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
-		while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
-			preempt_disable();
-			pagefault_disable();
-			enable_kernel_vsx();
-			aes_p8_ctr32_encrypt_blocks(walk.src.virt.addr,
-						    walk.dst.virt.addr,
-						    (nbytes &
-						     AES_BLOCK_MASK) /
-						    AES_BLOCK_SIZE,
-						    &ctx->enc_key,
-						    walk.iv);
-			disable_kernel_vsx();
-			pagefault_enable();
-			preempt_enable();
-
-			/* We need to update IV mostly for last bytes/round */
-			inc = (nbytes & AES_BLOCK_MASK) / AES_BLOCK_SIZE;
-			if (inc > 0)
-				while (inc--)
-					crypto_inc(walk.iv, AES_BLOCK_SIZE);
-
-			nbytes &= AES_BLOCK_SIZE - 1;
-			ret = blkcipher_walk_done(desc, &walk, nbytes);
-		}
-		if (walk.nbytes) {
-			p8_aes_ctr_final(ctx, &walk);
-			ret = blkcipher_walk_done(desc, &walk, 0);
-		}
-	}
-
-	return ret;
-}
-
-struct crypto_alg p8_aes_ctr_alg = {
-	.cra_name = "ctr(aes)",
-	.cra_driver_name = "p8_aes_ctr",
-	.cra_module = THIS_MODULE,
-	.cra_priority = 2000,
-	.cra_type = &crypto_blkcipher_type,
-	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
-	.cra_alignmask = 0,
-	.cra_blocksize = 1,
-	.cra_ctxsize = sizeof(struct p8_aes_ctr_ctx),
-	.cra_init = p8_aes_ctr_init,
-	.cra_exit = p8_aes_ctr_exit,
-	.cra_blkcipher = {
-			  .ivsize = AES_BLOCK_SIZE,
-			  .min_keysize = AES_MIN_KEY_SIZE,
-			  .max_keysize = AES_MAX_KEY_SIZE,
-			  .setkey = p8_aes_ctr_setkey,
-			  .encrypt = p8_aes_ctr_crypt,
-			  .decrypt = p8_aes_ctr_crypt,
-	},
-};
diff --git a/drivers/crypto/vmx/aes_xts.c b/drivers/crypto/vmx/aes_xts.c
deleted file mode 100644
index 6adc9290557a..000000000000
--- a/drivers/crypto/vmx/aes_xts.c
+++ /dev/null
@@ -1,190 +0,0 @@
-/**
- * AES XTS routines supporting VMX In-core instructions on Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundations; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY of FITNESS FOR A PARTICUPAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
- */
-
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
-#include <linux/hardirq.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-#include <crypto/scatterwalk.h>
-#include <crypto/xts.h>
-#include <crypto/skcipher.h>
-
-#include "aesp8-ppc.h"
-
-struct p8_aes_xts_ctx {
-	struct crypto_skcipher *fallback;
-	struct aes_key enc_key;
-	struct aes_key dec_key;
-	struct aes_key tweak_key;
-};
-
-static int p8_aes_xts_init(struct crypto_tfm *tfm)
-{
-	const char *alg;
-	struct crypto_skcipher *fallback;
-	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (!(alg = crypto_tfm_alg_name(tfm))) {
-		printk(KERN_ERR "Failed to get algorithm name.\n");
-		return -ENOENT;
-	}
-
-	fallback = crypto_alloc_skcipher(alg, 0,
-			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-			"Failed to allocate transformation for '%s': %ld\n",
-			alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-		crypto_skcipher_driver_name(fallback));
-
-	crypto_skcipher_set_flags(
-		fallback,
-		crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_aes_xts_exit(struct crypto_tfm *tfm)
-{
-	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_skcipher(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_aes_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
-			     unsigned int keylen)
-{
-	int ret;
-	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	ret = xts_check_key(tfm, key, keylen);
-	if (ret)
-		return ret;
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	ret = aes_p8_set_encrypt_key(key + keylen/2, (keylen/2) * 8, &ctx->tweak_key);
-	ret += aes_p8_set_encrypt_key(key, (keylen/2) * 8, &ctx->enc_key);
-	ret += aes_p8_set_decrypt_key(key, (keylen/2) * 8, &ctx->dec_key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-
-	ret += crypto_skcipher_setkey(ctx->fallback, key, keylen);
-	return ret;
-}
-
-static int p8_aes_xts_crypt(struct blkcipher_desc *desc,
-			    struct scatterlist *dst,
-			    struct scatterlist *src,
-			    unsigned int nbytes, int enc)
-{
-	int ret;
-	u8 tweak[AES_BLOCK_SIZE];
-	u8 *iv;
-	struct blkcipher_walk walk;
-	struct p8_aes_xts_ctx *ctx =
-		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-
-	if (in_interrupt()) {
-		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
-		skcipher_request_set_tfm(req, ctx->fallback);
-		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
-		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
-		ret = enc? crypto_skcipher_encrypt(req) : crypto_skcipher_decrypt(req);
-		skcipher_request_zero(req);
-	} else {
-		preempt_disable();
-		pagefault_disable();
-		enable_kernel_vsx();
-
-		blkcipher_walk_init(&walk, dst, src, nbytes);
-
-		ret = blkcipher_walk_virt(desc, &walk);
-		iv = walk.iv;
-		memset(tweak, 0, AES_BLOCK_SIZE);
-		aes_p8_encrypt(iv, tweak, &ctx->tweak_key);
-
-		while ((nbytes = walk.nbytes)) {
-			if (enc)
-				aes_p8_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
-						nbytes & AES_BLOCK_MASK, &ctx->enc_key, NULL, tweak);
-			else
-				aes_p8_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
-						nbytes & AES_BLOCK_MASK, &ctx->dec_key, NULL, tweak);
-
-			nbytes &= AES_BLOCK_SIZE - 1;
-			ret = blkcipher_walk_done(desc, &walk, nbytes);
-		}
-
-		disable_kernel_vsx();
-		pagefault_enable();
-		preempt_enable();
-	}
-	return ret;
-}
-
-static int p8_aes_xts_encrypt(struct blkcipher_desc *desc,
-			      struct scatterlist *dst,
-			      struct scatterlist *src, unsigned int nbytes)
-{
-	return p8_aes_xts_crypt(desc, dst, src, nbytes, 1);
-}
-
-static int p8_aes_xts_decrypt(struct blkcipher_desc *desc,
-			      struct scatterlist *dst,
-			      struct scatterlist *src, unsigned int nbytes)
-{
-	return p8_aes_xts_crypt(desc, dst, src, nbytes, 0);
-}
-
-struct crypto_alg p8_aes_xts_alg = {
-	.cra_name = "xts(aes)",
-	.cra_driver_name = "p8_aes_xts",
-	.cra_module = THIS_MODULE,
-	.cra_priority = 2000,
-	.cra_type = &crypto_blkcipher_type,
-	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
-	.cra_alignmask = 0,
-	.cra_blocksize = AES_BLOCK_SIZE,
-	.cra_ctxsize = sizeof(struct p8_aes_xts_ctx),
-	.cra_init = p8_aes_xts_init,
-	.cra_exit = p8_aes_xts_exit,
-	.cra_blkcipher = {
-			.ivsize = AES_BLOCK_SIZE,
-			.min_keysize = 2 * AES_MIN_KEY_SIZE,
-			.max_keysize = 2 * AES_MAX_KEY_SIZE,
-			.setkey	 = p8_aes_xts_setkey,
-			.encrypt = p8_aes_xts_encrypt,
-			.decrypt = p8_aes_xts_decrypt,
-	}
-};
diff --git a/drivers/crypto/vmx/aesp8-ppc.h b/drivers/crypto/vmx/aesp8-ppc.h
deleted file mode 100644
index 01972e16a6c0..000000000000
--- a/drivers/crypto/vmx/aesp8-ppc.h
+++ /dev/null
@@ -1,25 +0,0 @@
-#include <linux/types.h>
-#include <crypto/aes.h>
-
-#define AES_BLOCK_MASK  (~(AES_BLOCK_SIZE-1))
-
-struct aes_key {
-	u8 key[AES_MAX_KEYLENGTH];
-	int rounds;
-};
-
-int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
-			   struct aes_key *key);
-int aes_p8_set_decrypt_key(const u8 *userKey, const int bits,
-			   struct aes_key *key);
-void aes_p8_encrypt(const u8 *in, u8 *out, const struct aes_key *key);
-void aes_p8_decrypt(const u8 *in, u8 *out, const struct aes_key *key);
-void aes_p8_cbc_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key, u8 *iv, const int enc);
-void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out,
-				 size_t len, const struct aes_key *key,
-				 const u8 *iv);
-void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
-void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
-			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
deleted file mode 100644
index 0b4a293b8a1e..000000000000
--- a/drivers/crypto/vmx/aesp8-ppc.pl
+++ /dev/null
@@ -1,3789 +0,0 @@
-#! /usr/bin/env perl
-# Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
-#
-# Licensed under the OpenSSL license (the "License").  You may not use
-# this file except in compliance with the License.  You can obtain a copy
-# in the file LICENSE in the source distribution or at
-# https://www.openssl.org/source/license.html
-
-#
-# ====================================================================
-# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
-# project. The module is, however, dual licensed under OpenSSL and
-# CRYPTOGAMS licenses depending on where you obtain it. For further
-# details see http://www.openssl.org/~appro/cryptogams/.
-# ====================================================================
-#
-# This module implements support for AES instructions as per PowerISA
-# specification version 2.07, first implemented by POWER8 processor.
-# The module is endian-agnostic in sense that it supports both big-
-# and little-endian cases. Data alignment in parallelizable modes is
-# handled with VSX loads and stores, which implies MSR.VSX flag being
-# set. It should also be noted that ISA specification doesn't prohibit
-# alignment exceptions for these instructions on page boundaries.
-# Initially alignment was handled in pure AltiVec/VMX way [when data
-# is aligned programmatically, which in turn guarantees exception-
-# free execution], but it turned to hamper performance when vcipher
-# instructions are interleaved. It's reckoned that eventual
-# misalignment penalties at page boundaries are in average lower
-# than additional overhead in pure AltiVec approach.
-#
-# May 2016
-#
-# Add XTS subroutine, 9x on little- and 12x improvement on big-endian
-# systems were measured.
-#
-######################################################################
-# Current large-block performance in cycles per byte processed with
-# 128-bit key (less is better).
-#
-#		CBC en-/decrypt	CTR	XTS
-# POWER8[le]	3.96/0.72	0.74	1.1
-# POWER8[be]	3.75/0.65	0.66	1.0
-
-$flavour = shift;
-
-if ($flavour =~ /64/) {
-	$SIZE_T	=8;
-	$LRSAVE	=2*$SIZE_T;
-	$STU	="stdu";
-	$POP	="ld";
-	$PUSH	="std";
-	$UCMP	="cmpld";
-	$SHL	="sldi";
-} elsif ($flavour =~ /32/) {
-	$SIZE_T	=4;
-	$LRSAVE	=$SIZE_T;
-	$STU	="stwu";
-	$POP	="lwz";
-	$PUSH	="stw";
-	$UCMP	="cmplw";
-	$SHL	="slwi";
-} else { die "nonsense $flavour"; }
-
-$LITTLE_ENDIAN = ($flavour=~/le$/) ? $SIZE_T : 0;
-
-$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
-( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
-( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
-die "can't locate ppc-xlate.pl";
-
-open STDOUT,"| $^X $xlate $flavour ".shift || die "can't call $xlate: $!";
-
-$FRAME=8*$SIZE_T;
-$prefix="aes_p8";
-
-$sp="r1";
-$vrsave="r12";
-
-#########################################################################
-{{{	# Key setup procedures						#
-my ($inp,$bits,$out,$ptr,$cnt,$rounds)=map("r$_",(3..8));
-my ($zero,$in0,$in1,$key,$rcon,$mask,$tmp)=map("v$_",(0..6));
-my ($stage,$outperm,$outmask,$outhead,$outtail)=map("v$_",(7..11));
-
-$code.=<<___;
-.machine	"any"
-
-.text
-
-.align	7
-rcon:
-.long	0x01000000, 0x01000000, 0x01000000, 0x01000000	?rev
-.long	0x1b000000, 0x1b000000, 0x1b000000, 0x1b000000	?rev
-.long	0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c	?rev
-.long	0,0,0,0						?asis
-Lconsts:
-	mflr	r0
-	bcl	20,31,\$+4
-	mflr	$ptr	 #vvvvv "distance between . and rcon
-	addi	$ptr,$ptr,-0x48
-	mtlr	r0
-	blr
-	.long	0
-	.byte	0,12,0x14,0,0,0,0,0
-.asciz	"AES for PowerISA 2.07, CRYPTOGAMS by <appro\@openssl.org>"
-
-.globl	.${prefix}_set_encrypt_key
-Lset_encrypt_key:
-	mflr		r11
-	$PUSH		r11,$LRSAVE($sp)
-
-	li		$ptr,-1
-	${UCMP}i	$inp,0
-	beq-		Lenc_key_abort		# if ($inp==0) return -1;
-	${UCMP}i	$out,0
-	beq-		Lenc_key_abort		# if ($out==0) return -1;
-	li		$ptr,-2
-	cmpwi		$bits,128
-	blt-		Lenc_key_abort
-	cmpwi		$bits,256
-	bgt-		Lenc_key_abort
-	andi.		r0,$bits,0x3f
-	bne-		Lenc_key_abort
-
-	lis		r0,0xfff0
-	mfspr		$vrsave,256
-	mtspr		256,r0
-
-	bl		Lconsts
-	mtlr		r11
-
-	neg		r9,$inp
-	lvx		$in0,0,$inp
-	addi		$inp,$inp,15		# 15 is not typo
-	lvsr		$key,0,r9		# borrow $key
-	li		r8,0x20
-	cmpwi		$bits,192
-	lvx		$in1,0,$inp
-	le?vspltisb	$mask,0x0f		# borrow $mask
-	lvx		$rcon,0,$ptr
-	le?vxor		$key,$key,$mask		# adjust for byte swap
-	lvx		$mask,r8,$ptr
-	addi		$ptr,$ptr,0x10
-	vperm		$in0,$in0,$in1,$key	# align [and byte swap in LE]
-	li		$cnt,8
-	vxor		$zero,$zero,$zero
-	mtctr		$cnt
-
-	?lvsr		$outperm,0,$out
-	vspltisb	$outmask,-1
-	lvx		$outhead,0,$out
-	?vperm		$outmask,$zero,$outmask,$outperm
-
-	blt		Loop128
-	addi		$inp,$inp,8
-	beq		L192
-	addi		$inp,$inp,8
-	b		L256
-
-.align	4
-Loop128:
-	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vcipherlast	$key,$key,$rcon
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	 vadduwm	$rcon,$rcon,$rcon
-	vxor		$in0,$in0,$key
-	bdnz		Loop128
-
-	lvx		$rcon,0,$ptr		# last two round keys
-
-	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vcipherlast	$key,$key,$rcon
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	 vadduwm	$rcon,$rcon,$rcon
-	vxor		$in0,$in0,$key
-
-	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vcipherlast	$key,$key,$rcon
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vxor		$in0,$in0,$key
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	 stvx		$stage,0,$out
-
-	addi		$inp,$out,15		# 15 is not typo
-	addi		$out,$out,0x50
-
-	li		$rounds,10
-	b		Ldone
-
-.align	4
-L192:
-	lvx		$tmp,0,$inp
-	li		$cnt,4
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-	vperm		$in1,$in1,$tmp,$key	# align [and byte swap in LE]
-	vspltisb	$key,8			# borrow $key
-	mtctr		$cnt
-	vsububm		$mask,$mask,$key	# adjust the mask
-
-Loop192:
-	vperm		$key,$in1,$in1,$mask	# roate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	vcipherlast	$key,$key,$rcon
-
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-
-	 vsldoi		$stage,$zero,$in1,8
-	vspltw		$tmp,$in0,3
-	vxor		$tmp,$tmp,$in1
-	vsldoi		$in1,$zero,$in1,12	# >>32
-	 vadduwm	$rcon,$rcon,$rcon
-	vxor		$in1,$in1,$tmp
-	vxor		$in0,$in0,$key
-	vxor		$in1,$in1,$key
-	 vsldoi		$stage,$stage,$in0,8
-
-	vperm		$key,$in1,$in1,$mask	# rotate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	 vperm		$outtail,$stage,$stage,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vcipherlast	$key,$key,$rcon
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	 vsldoi		$stage,$in0,$in1,8
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	 vperm		$outtail,$stage,$stage,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	vspltw		$tmp,$in0,3
-	vxor		$tmp,$tmp,$in1
-	vsldoi		$in1,$zero,$in1,12	# >>32
-	 vadduwm	$rcon,$rcon,$rcon
-	vxor		$in1,$in1,$tmp
-	vxor		$in0,$in0,$key
-	vxor		$in1,$in1,$key
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	 stvx		$stage,0,$out
-	 addi		$inp,$out,15		# 15 is not typo
-	 addi		$out,$out,16
-	bdnz		Loop192
-
-	li		$rounds,12
-	addi		$out,$out,0x20
-	b		Ldone
-
-.align	4
-L256:
-	lvx		$tmp,0,$inp
-	li		$cnt,7
-	li		$rounds,14
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-	vperm		$in1,$in1,$tmp,$key	# align [and byte swap in LE]
-	mtctr		$cnt
-
-Loop256:
-	vperm		$key,$in1,$in1,$mask	# rotate-n-splat
-	vsldoi		$tmp,$zero,$in0,12	# >>32
-	 vperm		$outtail,$in1,$in1,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	vcipherlast	$key,$key,$rcon
-	 stvx		$stage,0,$out
-	 addi		$out,$out,16
-
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in0,$in0,$tmp
-	 vadduwm	$rcon,$rcon,$rcon
-	vxor		$in0,$in0,$key
-	 vperm		$outtail,$in0,$in0,$outperm	# rotate
-	 vsel		$stage,$outhead,$outtail,$outmask
-	 vmr		$outhead,$outtail
-	 stvx		$stage,0,$out
-	 addi		$inp,$out,15		# 15 is not typo
-	 addi		$out,$out,16
-	bdz		Ldone
-
-	vspltw		$key,$in0,3		# just splat
-	vsldoi		$tmp,$zero,$in1,12	# >>32
-	vsbox		$key,$key
-
-	vxor		$in1,$in1,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in1,$in1,$tmp
-	vsldoi		$tmp,$zero,$tmp,12	# >>32
-	vxor		$in1,$in1,$tmp
-
-	vxor		$in1,$in1,$key
-	b		Loop256
-
-.align	4
-Ldone:
-	lvx		$in1,0,$inp		# redundant in aligned case
-	vsel		$in1,$outhead,$in1,$outmask
-	stvx		$in1,0,$inp
-	li		$ptr,0
-	mtspr		256,$vrsave
-	stw		$rounds,0($out)
-
-Lenc_key_abort:
-	mr		r3,$ptr
-	blr
-	.long		0
-	.byte		0,12,0x14,1,0,0,3,0
-	.long		0
-.size	.${prefix}_set_encrypt_key,.-.${prefix}_set_encrypt_key
-
-.globl	.${prefix}_set_decrypt_key
-	$STU		$sp,-$FRAME($sp)
-	mflr		r10
-	$PUSH		r10,$FRAME+$LRSAVE($sp)
-	bl		Lset_encrypt_key
-	mtlr		r10
-
-	cmpwi		r3,0
-	bne-		Ldec_key_abort
-
-	slwi		$cnt,$rounds,4
-	subi		$inp,$out,240		# first round key
-	srwi		$rounds,$rounds,1
-	add		$out,$inp,$cnt		# last round key
-	mtctr		$rounds
-
-Ldeckey:
-	lwz		r0, 0($inp)
-	lwz		r6, 4($inp)
-	lwz		r7, 8($inp)
-	lwz		r8, 12($inp)
-	addi		$inp,$inp,16
-	lwz		r9, 0($out)
-	lwz		r10,4($out)
-	lwz		r11,8($out)
-	lwz		r12,12($out)
-	stw		r0, 0($out)
-	stw		r6, 4($out)
-	stw		r7, 8($out)
-	stw		r8, 12($out)
-	subi		$out,$out,16
-	stw		r9, -16($inp)
-	stw		r10,-12($inp)
-	stw		r11,-8($inp)
-	stw		r12,-4($inp)
-	bdnz		Ldeckey
-
-	xor		r3,r3,r3		# return value
-Ldec_key_abort:
-	addi		$sp,$sp,$FRAME
-	blr
-	.long		0
-	.byte		0,12,4,1,0x80,0,3,0
-	.long		0
-.size	.${prefix}_set_decrypt_key,.-.${prefix}_set_decrypt_key
-___
-}}}
-#########################################################################
-{{{	# Single block en- and decrypt procedures			#
-sub gen_block () {
-my $dir = shift;
-my $n   = $dir eq "de" ? "n" : "";
-my ($inp,$out,$key,$rounds,$idx)=map("r$_",(3..7));
-
-$code.=<<___;
-.globl	.${prefix}_${dir}crypt
-	lwz		$rounds,240($key)
-	lis		r0,0xfc00
-	mfspr		$vrsave,256
-	li		$idx,15			# 15 is not typo
-	mtspr		256,r0
-
-	lvx		v0,0,$inp
-	neg		r11,$out
-	lvx		v1,$idx,$inp
-	lvsl		v2,0,$inp		# inpperm
-	le?vspltisb	v4,0x0f
-	?lvsl		v3,0,r11		# outperm
-	le?vxor		v2,v2,v4
-	li		$idx,16
-	vperm		v0,v0,v1,v2		# align [and byte swap in LE]
-	lvx		v1,0,$key
-	?lvsl		v5,0,$key		# keyperm
-	srwi		$rounds,$rounds,1
-	lvx		v2,$idx,$key
-	addi		$idx,$idx,16
-	subi		$rounds,$rounds,1
-	?vperm		v1,v1,v2,v5		# align round key
-
-	vxor		v0,v0,v1
-	lvx		v1,$idx,$key
-	addi		$idx,$idx,16
-	mtctr		$rounds
-
-Loop_${dir}c:
-	?vperm		v2,v2,v1,v5
-	v${n}cipher	v0,v0,v2
-	lvx		v2,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		v1,v1,v2,v5
-	v${n}cipher	v0,v0,v1
-	lvx		v1,$idx,$key
-	addi		$idx,$idx,16
-	bdnz		Loop_${dir}c
-
-	?vperm		v2,v2,v1,v5
-	v${n}cipher	v0,v0,v2
-	lvx		v2,$idx,$key
-	?vperm		v1,v1,v2,v5
-	v${n}cipherlast	v0,v0,v1
-
-	vspltisb	v2,-1
-	vxor		v1,v1,v1
-	li		$idx,15			# 15 is not typo
-	?vperm		v2,v1,v2,v3		# outmask
-	le?vxor		v3,v3,v4
-	lvx		v1,0,$out		# outhead
-	vperm		v0,v0,v0,v3		# rotate [and byte swap in LE]
-	vsel		v1,v1,v0,v2
-	lvx		v4,$idx,$out
-	stvx		v1,0,$out
-	vsel		v0,v0,v4,v2
-	stvx		v0,$idx,$out
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,3,0
-	.long		0
-.size	.${prefix}_${dir}crypt,.-.${prefix}_${dir}crypt
-___
-}
-&gen_block("en");
-&gen_block("de");
-}}}
-#########################################################################
-{{{	# CBC en- and decrypt procedures				#
-my ($inp,$out,$len,$key,$ivp,$enc,$rounds,$idx)=map("r$_",(3..10));
-my ($rndkey0,$rndkey1,$inout,$tmp)=		map("v$_",(0..3));
-my ($ivec,$inptail,$inpperm,$outhead,$outperm,$outmask,$keyperm)=
-						map("v$_",(4..10));
-$code.=<<___;
-.globl	.${prefix}_cbc_encrypt
-	${UCMP}i	$len,16
-	bltlr-
-
-	cmpwi		$enc,0			# test direction
-	lis		r0,0xffe0
-	mfspr		$vrsave,256
-	mtspr		256,r0
-
-	li		$idx,15
-	vxor		$rndkey0,$rndkey0,$rndkey0
-	le?vspltisb	$tmp,0x0f
-
-	lvx		$ivec,0,$ivp		# load [unaligned] iv
-	lvsl		$inpperm,0,$ivp
-	lvx		$inptail,$idx,$ivp
-	le?vxor		$inpperm,$inpperm,$tmp
-	vperm		$ivec,$ivec,$inptail,$inpperm
-
-	neg		r11,$inp
-	?lvsl		$keyperm,0,$key		# prepare for unaligned key
-	lwz		$rounds,240($key)
-
-	lvsr		$inpperm,0,r11		# prepare for unaligned load
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,15		# 15 is not typo
-	le?vxor		$inpperm,$inpperm,$tmp
-
-	?lvsr		$outperm,0,$out		# prepare for unaligned store
-	vspltisb	$outmask,-1
-	lvx		$outhead,0,$out
-	?vperm		$outmask,$rndkey0,$outmask,$outperm
-	le?vxor		$outperm,$outperm,$tmp
-
-	srwi		$rounds,$rounds,1
-	li		$idx,16
-	subi		$rounds,$rounds,1
-	beq		Lcbc_dec
-
-Lcbc_enc:
-	vmr		$inout,$inptail
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-	mtctr		$rounds
-	subi		$len,$len,16		# len-=16
-
-	lvx		$rndkey0,0,$key
-	 vperm		$inout,$inout,$inptail,$inpperm
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-	vxor		$inout,$inout,$ivec
-
-Loop_cbc_enc:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipher		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-	bdnz		Loop_cbc_enc
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	li		$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipherlast	$ivec,$inout,$rndkey0
-	${UCMP}i	$len,16
-
-	vperm		$tmp,$ivec,$ivec,$outperm
-	vsel		$inout,$outhead,$tmp,$outmask
-	vmr		$outhead,$tmp
-	stvx		$inout,0,$out
-	addi		$out,$out,16
-	bge		Lcbc_enc
-
-	b		Lcbc_done
-
-.align	4
-Lcbc_dec:
-	${UCMP}i	$len,128
-	bge		_aesp8_cbc_decrypt8x
-	vmr		$tmp,$inptail
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-	mtctr		$rounds
-	subi		$len,$len,16		# len-=16
-
-	lvx		$rndkey0,0,$key
-	 vperm		$tmp,$tmp,$inptail,$inpperm
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$tmp,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-
-Loop_cbc_dec:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vncipher	$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-	bdnz		Loop_cbc_dec
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	li		$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vncipherlast	$inout,$inout,$rndkey0
-	${UCMP}i	$len,16
-
-	vxor		$inout,$inout,$ivec
-	vmr		$ivec,$tmp
-	vperm		$tmp,$inout,$inout,$outperm
-	vsel		$inout,$outhead,$tmp,$outmask
-	vmr		$outhead,$tmp
-	stvx		$inout,0,$out
-	addi		$out,$out,16
-	bge		Lcbc_dec
-
-Lcbc_done:
-	addi		$out,$out,-1
-	lvx		$inout,0,$out		# redundant in aligned case
-	vsel		$inout,$outhead,$inout,$outmask
-	stvx		$inout,0,$out
-
-	neg		$enc,$ivp		# write [unaligned] iv
-	li		$idx,15			# 15 is not typo
-	vxor		$rndkey0,$rndkey0,$rndkey0
-	vspltisb	$outmask,-1
-	le?vspltisb	$tmp,0x0f
-	?lvsl		$outperm,0,$enc
-	?vperm		$outmask,$rndkey0,$outmask,$outperm
-	le?vxor		$outperm,$outperm,$tmp
-	lvx		$outhead,0,$ivp
-	vperm		$ivec,$ivec,$ivec,$outperm
-	vsel		$inout,$outhead,$ivec,$outmask
-	lvx		$inptail,$idx,$ivp
-	stvx		$inout,0,$ivp
-	vsel		$inout,$ivec,$inptail,$outmask
-	stvx		$inout,$idx,$ivp
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,6,0
-	.long		0
-___
-#########################################################################
-{{	# Optimized CBC decrypt procedure				#
-my $key_="r11";
-my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,8,26..31));
-my ($in0, $in1, $in2, $in3, $in4, $in5, $in6, $in7 )=map("v$_",(0..3,10..13));
-my ($out0,$out1,$out2,$out3,$out4,$out5,$out6,$out7)=map("v$_",(14..21));
-my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
-			# v26-v31 last 6 round keys
-my ($tmp,$keyperm)=($in3,$in4);	# aliases with "caller", redundant assignment
-
-$code.=<<___;
-.align	5
-_aesp8_cbc_decrypt8x:
-	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
-	li		r10,`$FRAME+8*16+15`
-	li		r11,`$FRAME+8*16+31`
-	stvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	stvx		v21,r11,$sp
-	addi		r11,r11,32
-	stvx		v22,r10,$sp
-	addi		r10,r10,32
-	stvx		v23,r11,$sp
-	addi		r11,r11,32
-	stvx		v24,r10,$sp
-	addi		r10,r10,32
-	stvx		v25,r11,$sp
-	addi		r11,r11,32
-	stvx		v26,r10,$sp
-	addi		r10,r10,32
-	stvx		v27,r11,$sp
-	addi		r11,r11,32
-	stvx		v28,r10,$sp
-	addi		r10,r10,32
-	stvx		v29,r11,$sp
-	addi		r11,r11,32
-	stvx		v30,r10,$sp
-	stvx		v31,r11,$sp
-	li		r0,-1
-	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
-	li		$x10,0x10
-	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	li		$x20,0x20
-	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	li		$x30,0x30
-	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	li		$x40,0x40
-	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	li		$x50,0x50
-	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	li		$x60,0x60
-	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	li		$x70,0x70
-	mtspr		256,r0
-
-	subi		$rounds,$rounds,3	# -4 in total
-	subi		$len,$len,128		# bias
-
-	lvx		$rndkey0,$x00,$key	# load key schedule
-	lvx		v30,$x10,$key
-	addi		$key,$key,0x20
-	lvx		v31,$x00,$key
-	?vperm		$rndkey0,$rndkey0,v30,$keyperm
-	addi		$key_,$sp,$FRAME+15
-	mtctr		$rounds
-
-Load_cbc_dec_key:
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v30,$x10,$key
-	addi		$key,$key,0x20
-	stvx		v24,$x00,$key_		# off-load round[1]
-	?vperm		v25,v31,v30,$keyperm
-	lvx		v31,$x00,$key
-	stvx		v25,$x10,$key_		# off-load round[2]
-	addi		$key_,$key_,0x20
-	bdnz		Load_cbc_dec_key
-
-	lvx		v26,$x10,$key
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v27,$x20,$key
-	stvx		v24,$x00,$key_		# off-load round[3]
-	?vperm		v25,v31,v26,$keyperm
-	lvx		v28,$x30,$key
-	stvx		v25,$x10,$key_		# off-load round[4]
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	?vperm		v26,v26,v27,$keyperm
-	lvx		v29,$x40,$key
-	?vperm		v27,v27,v28,$keyperm
-	lvx		v30,$x50,$key
-	?vperm		v28,v28,v29,$keyperm
-	lvx		v31,$x60,$key
-	?vperm		v29,v29,v30,$keyperm
-	lvx		$out0,$x70,$key		# borrow $out0
-	?vperm		v30,v30,v31,$keyperm
-	lvx		v24,$x00,$key_		# pre-load round[1]
-	?vperm		v31,v31,$out0,$keyperm
-	lvx		v25,$x10,$key_		# pre-load round[2]
-
-	#lvx		$inptail,0,$inp		# "caller" already did this
-	#addi		$inp,$inp,15		# 15 is not typo
-	subi		$inp,$inp,15		# undo "caller"
-
-	 le?li		$idx,8
-	lvx_u		$in0,$x00,$inp		# load first 8 "words"
-	 le?lvsl	$inpperm,0,$idx
-	 le?vspltisb	$tmp,0x0f
-	lvx_u		$in1,$x10,$inp
-	 le?vxor	$inpperm,$inpperm,$tmp	# transform for lvx_u/stvx_u
-	lvx_u		$in2,$x20,$inp
-	 le?vperm	$in0,$in0,$in0,$inpperm
-	lvx_u		$in3,$x30,$inp
-	 le?vperm	$in1,$in1,$in1,$inpperm
-	lvx_u		$in4,$x40,$inp
-	 le?vperm	$in2,$in2,$in2,$inpperm
-	vxor		$out0,$in0,$rndkey0
-	lvx_u		$in5,$x50,$inp
-	 le?vperm	$in3,$in3,$in3,$inpperm
-	vxor		$out1,$in1,$rndkey0
-	lvx_u		$in6,$x60,$inp
-	 le?vperm	$in4,$in4,$in4,$inpperm
-	vxor		$out2,$in2,$rndkey0
-	lvx_u		$in7,$x70,$inp
-	addi		$inp,$inp,0x80
-	 le?vperm	$in5,$in5,$in5,$inpperm
-	vxor		$out3,$in3,$rndkey0
-	 le?vperm	$in6,$in6,$in6,$inpperm
-	vxor		$out4,$in4,$rndkey0
-	 le?vperm	$in7,$in7,$in7,$inpperm
-	vxor		$out5,$in5,$rndkey0
-	vxor		$out6,$in6,$rndkey0
-	vxor		$out7,$in7,$rndkey0
-
-	mtctr		$rounds
-	b		Loop_cbc_dec8x
-.align	5
-Loop_cbc_dec8x:
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-	vncipher	$out6,$out6,v24
-	vncipher	$out7,$out7,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-	vncipher	$out6,$out6,v25
-	vncipher	$out7,$out7,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_cbc_dec8x
-
-	subic		$len,$len,128		# $len-=128
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-	vncipher	$out6,$out6,v24
-	vncipher	$out7,$out7,v24
-
-	subfe.		r0,r0,r0		# borrow?-1:0
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-	vncipher	$out6,$out6,v25
-	vncipher	$out7,$out7,v25
-
-	and		r0,r0,$len
-	vncipher	$out0,$out0,v26
-	vncipher	$out1,$out1,v26
-	vncipher	$out2,$out2,v26
-	vncipher	$out3,$out3,v26
-	vncipher	$out4,$out4,v26
-	vncipher	$out5,$out5,v26
-	vncipher	$out6,$out6,v26
-	vncipher	$out7,$out7,v26
-
-	add		$inp,$inp,r0		# $inp is adjusted in such
-						# way that at exit from the
-						# loop inX-in7 are loaded
-						# with last "words"
-	vncipher	$out0,$out0,v27
-	vncipher	$out1,$out1,v27
-	vncipher	$out2,$out2,v27
-	vncipher	$out3,$out3,v27
-	vncipher	$out4,$out4,v27
-	vncipher	$out5,$out5,v27
-	vncipher	$out6,$out6,v27
-	vncipher	$out7,$out7,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vncipher	$out0,$out0,v28
-	vncipher	$out1,$out1,v28
-	vncipher	$out2,$out2,v28
-	vncipher	$out3,$out3,v28
-	vncipher	$out4,$out4,v28
-	vncipher	$out5,$out5,v28
-	vncipher	$out6,$out6,v28
-	vncipher	$out7,$out7,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-
-	vncipher	$out0,$out0,v29
-	vncipher	$out1,$out1,v29
-	vncipher	$out2,$out2,v29
-	vncipher	$out3,$out3,v29
-	vncipher	$out4,$out4,v29
-	vncipher	$out5,$out5,v29
-	vncipher	$out6,$out6,v29
-	vncipher	$out7,$out7,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-
-	vncipher	$out0,$out0,v30
-	 vxor		$ivec,$ivec,v31		# xor with last round key
-	vncipher	$out1,$out1,v30
-	 vxor		$in0,$in0,v31
-	vncipher	$out2,$out2,v30
-	 vxor		$in1,$in1,v31
-	vncipher	$out3,$out3,v30
-	 vxor		$in2,$in2,v31
-	vncipher	$out4,$out4,v30
-	 vxor		$in3,$in3,v31
-	vncipher	$out5,$out5,v30
-	 vxor		$in4,$in4,v31
-	vncipher	$out6,$out6,v30
-	 vxor		$in5,$in5,v31
-	vncipher	$out7,$out7,v30
-	 vxor		$in6,$in6,v31
-
-	vncipherlast	$out0,$out0,$ivec
-	vncipherlast	$out1,$out1,$in0
-	 lvx_u		$in0,$x00,$inp		# load next input block
-	vncipherlast	$out2,$out2,$in1
-	 lvx_u		$in1,$x10,$inp
-	vncipherlast	$out3,$out3,$in2
-	 le?vperm	$in0,$in0,$in0,$inpperm
-	 lvx_u		$in2,$x20,$inp
-	vncipherlast	$out4,$out4,$in3
-	 le?vperm	$in1,$in1,$in1,$inpperm
-	 lvx_u		$in3,$x30,$inp
-	vncipherlast	$out5,$out5,$in4
-	 le?vperm	$in2,$in2,$in2,$inpperm
-	 lvx_u		$in4,$x40,$inp
-	vncipherlast	$out6,$out6,$in5
-	 le?vperm	$in3,$in3,$in3,$inpperm
-	 lvx_u		$in5,$x50,$inp
-	vncipherlast	$out7,$out7,$in6
-	 le?vperm	$in4,$in4,$in4,$inpperm
-	 lvx_u		$in6,$x60,$inp
-	vmr		$ivec,$in7
-	 le?vperm	$in5,$in5,$in5,$inpperm
-	 lvx_u		$in7,$x70,$inp
-	 addi		$inp,$inp,0x80
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	 le?vperm	$in6,$in6,$in6,$inpperm
-	 vxor		$out0,$in0,$rndkey0
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	 le?vperm	$in7,$in7,$in7,$inpperm
-	 vxor		$out1,$in1,$rndkey0
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	 vxor		$out2,$in2,$rndkey0
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x30,$out
-	 vxor		$out3,$in3,$rndkey0
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x40,$out
-	 vxor		$out4,$in4,$rndkey0
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x50,$out
-	 vxor		$out5,$in5,$rndkey0
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x60,$out
-	 vxor		$out6,$in6,$rndkey0
-	stvx_u		$out7,$x70,$out
-	addi		$out,$out,0x80
-	 vxor		$out7,$in7,$rndkey0
-
-	mtctr		$rounds
-	beq		Loop_cbc_dec8x		# did $len-=128 borrow?
-
-	addic.		$len,$len,128
-	beq		Lcbc_dec8x_done
-	nop
-	nop
-
-Loop_cbc_dec8x_tail:				# up to 7 "words" tail...
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-	vncipher	$out6,$out6,v24
-	vncipher	$out7,$out7,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-	vncipher	$out6,$out6,v25
-	vncipher	$out7,$out7,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_cbc_dec8x_tail
-
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-	vncipher	$out6,$out6,v24
-	vncipher	$out7,$out7,v24
-
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-	vncipher	$out6,$out6,v25
-	vncipher	$out7,$out7,v25
-
-	vncipher	$out1,$out1,v26
-	vncipher	$out2,$out2,v26
-	vncipher	$out3,$out3,v26
-	vncipher	$out4,$out4,v26
-	vncipher	$out5,$out5,v26
-	vncipher	$out6,$out6,v26
-	vncipher	$out7,$out7,v26
-
-	vncipher	$out1,$out1,v27
-	vncipher	$out2,$out2,v27
-	vncipher	$out3,$out3,v27
-	vncipher	$out4,$out4,v27
-	vncipher	$out5,$out5,v27
-	vncipher	$out6,$out6,v27
-	vncipher	$out7,$out7,v27
-
-	vncipher	$out1,$out1,v28
-	vncipher	$out2,$out2,v28
-	vncipher	$out3,$out3,v28
-	vncipher	$out4,$out4,v28
-	vncipher	$out5,$out5,v28
-	vncipher	$out6,$out6,v28
-	vncipher	$out7,$out7,v28
-
-	vncipher	$out1,$out1,v29
-	vncipher	$out2,$out2,v29
-	vncipher	$out3,$out3,v29
-	vncipher	$out4,$out4,v29
-	vncipher	$out5,$out5,v29
-	vncipher	$out6,$out6,v29
-	vncipher	$out7,$out7,v29
-
-	vncipher	$out1,$out1,v30
-	 vxor		$ivec,$ivec,v31		# last round key
-	vncipher	$out2,$out2,v30
-	 vxor		$in1,$in1,v31
-	vncipher	$out3,$out3,v30
-	 vxor		$in2,$in2,v31
-	vncipher	$out4,$out4,v30
-	 vxor		$in3,$in3,v31
-	vncipher	$out5,$out5,v30
-	 vxor		$in4,$in4,v31
-	vncipher	$out6,$out6,v30
-	 vxor		$in5,$in5,v31
-	vncipher	$out7,$out7,v30
-	 vxor		$in6,$in6,v31
-
-	cmplwi		$len,32			# switch($len)
-	blt		Lcbc_dec8x_one
-	nop
-	beq		Lcbc_dec8x_two
-	cmplwi		$len,64
-	blt		Lcbc_dec8x_three
-	nop
-	beq		Lcbc_dec8x_four
-	cmplwi		$len,96
-	blt		Lcbc_dec8x_five
-	nop
-	beq		Lcbc_dec8x_six
-
-Lcbc_dec8x_seven:
-	vncipherlast	$out1,$out1,$ivec
-	vncipherlast	$out2,$out2,$in1
-	vncipherlast	$out3,$out3,$in2
-	vncipherlast	$out4,$out4,$in3
-	vncipherlast	$out5,$out5,$in4
-	vncipherlast	$out6,$out6,$in5
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out1,$out1,$out1,$inpperm
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x00,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x10,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x20,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x30,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x40,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x50,$out
-	stvx_u		$out7,$x60,$out
-	addi		$out,$out,0x70
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_six:
-	vncipherlast	$out2,$out2,$ivec
-	vncipherlast	$out3,$out3,$in2
-	vncipherlast	$out4,$out4,$in3
-	vncipherlast	$out5,$out5,$in4
-	vncipherlast	$out6,$out6,$in5
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out2,$out2,$out2,$inpperm
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x00,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x10,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x20,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x30,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x40,$out
-	stvx_u		$out7,$x50,$out
-	addi		$out,$out,0x60
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_five:
-	vncipherlast	$out3,$out3,$ivec
-	vncipherlast	$out4,$out4,$in3
-	vncipherlast	$out5,$out5,$in4
-	vncipherlast	$out6,$out6,$in5
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out3,$out3,$out3,$inpperm
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x00,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x10,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x20,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x30,$out
-	stvx_u		$out7,$x40,$out
-	addi		$out,$out,0x50
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_four:
-	vncipherlast	$out4,$out4,$ivec
-	vncipherlast	$out5,$out5,$in4
-	vncipherlast	$out6,$out6,$in5
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out4,$out4,$out4,$inpperm
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x00,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x10,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x20,$out
-	stvx_u		$out7,$x30,$out
-	addi		$out,$out,0x40
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_three:
-	vncipherlast	$out5,$out5,$ivec
-	vncipherlast	$out6,$out6,$in5
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out5,$out5,$out5,$inpperm
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x00,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x10,$out
-	stvx_u		$out7,$x20,$out
-	addi		$out,$out,0x30
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_two:
-	vncipherlast	$out6,$out6,$ivec
-	vncipherlast	$out7,$out7,$in6
-	vmr		$ivec,$in7
-
-	le?vperm	$out6,$out6,$out6,$inpperm
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x00,$out
-	stvx_u		$out7,$x10,$out
-	addi		$out,$out,0x20
-	b		Lcbc_dec8x_done
-
-.align	5
-Lcbc_dec8x_one:
-	vncipherlast	$out7,$out7,$ivec
-	vmr		$ivec,$in7
-
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out7,0,$out
-	addi		$out,$out,0x10
-
-Lcbc_dec8x_done:
-	le?vperm	$ivec,$ivec,$ivec,$inpperm
-	stvx_u		$ivec,0,$ivp		# write [unaligned] iv
-
-	li		r10,`$FRAME+15`
-	li		r11,`$FRAME+31`
-	stvx		$inpperm,r10,$sp	# wipe copies of round keys
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-
-	mtspr		256,$vrsave
-	lvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	lvx		v21,r11,$sp
-	addi		r11,r11,32
-	lvx		v22,r10,$sp
-	addi		r10,r10,32
-	lvx		v23,r11,$sp
-	addi		r11,r11,32
-	lvx		v24,r10,$sp
-	addi		r10,r10,32
-	lvx		v25,r11,$sp
-	addi		r11,r11,32
-	lvx		v26,r10,$sp
-	addi		r10,r10,32
-	lvx		v27,r11,$sp
-	addi		r11,r11,32
-	lvx		v28,r10,$sp
-	addi		r10,r10,32
-	lvx		v29,r11,$sp
-	addi		r11,r11,32
-	lvx		v30,r10,$sp
-	lvx		v31,r11,$sp
-	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0x80,6,6,0
-	.long		0
-.size	.${prefix}_cbc_encrypt,.-.${prefix}_cbc_encrypt
-___
-}}	}}}
-
-#########################################################################
-{{{	# CTR procedure[s]						#
-my ($inp,$out,$len,$key,$ivp,$x10,$rounds,$idx)=map("r$_",(3..10));
-my ($rndkey0,$rndkey1,$inout,$tmp)=		map("v$_",(0..3));
-my ($ivec,$inptail,$inpperm,$outhead,$outperm,$outmask,$keyperm,$one)=
-						map("v$_",(4..11));
-my $dat=$tmp;
-
-$code.=<<___;
-.globl	.${prefix}_ctr32_encrypt_blocks
-	${UCMP}i	$len,1
-	bltlr-
-
-	lis		r0,0xfff0
-	mfspr		$vrsave,256
-	mtspr		256,r0
-
-	li		$idx,15
-	vxor		$rndkey0,$rndkey0,$rndkey0
-	le?vspltisb	$tmp,0x0f
-
-	lvx		$ivec,0,$ivp		# load [unaligned] iv
-	lvsl		$inpperm,0,$ivp
-	lvx		$inptail,$idx,$ivp
-	 vspltisb	$one,1
-	le?vxor		$inpperm,$inpperm,$tmp
-	vperm		$ivec,$ivec,$inptail,$inpperm
-	 vsldoi		$one,$rndkey0,$one,1
-
-	neg		r11,$inp
-	?lvsl		$keyperm,0,$key		# prepare for unaligned key
-	lwz		$rounds,240($key)
-
-	lvsr		$inpperm,0,r11		# prepare for unaligned load
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,15		# 15 is not typo
-	le?vxor		$inpperm,$inpperm,$tmp
-
-	srwi		$rounds,$rounds,1
-	li		$idx,16
-	subi		$rounds,$rounds,1
-
-	${UCMP}i	$len,8
-	bge		_aesp8_ctr32_encrypt8x
-
-	?lvsr		$outperm,0,$out		# prepare for unaligned store
-	vspltisb	$outmask,-1
-	lvx		$outhead,0,$out
-	?vperm		$outmask,$rndkey0,$outmask,$outperm
-	le?vxor		$outperm,$outperm,$tmp
-
-	lvx		$rndkey0,0,$key
-	mtctr		$rounds
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$ivec,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-	b		Loop_ctr32_enc
-
-.align	5
-Loop_ctr32_enc:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipher		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key
-	addi		$idx,$idx,16
-	bdnz		Loop_ctr32_enc
-
-	vadduwm		$ivec,$ivec,$one
-	 vmr		$dat,$inptail
-	 lvx		$inptail,0,$inp
-	 addi		$inp,$inp,16
-	 subic.		$len,$len,1		# blocks--
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key
-	 vperm		$dat,$dat,$inptail,$inpperm
-	 li		$idx,16
-	?vperm		$rndkey1,$rndkey0,$rndkey1,$keyperm
-	 lvx		$rndkey0,0,$key
-	vxor		$dat,$dat,$rndkey1	# last round key
-	vcipherlast	$inout,$inout,$dat
-
-	 lvx		$rndkey1,$idx,$key
-	 addi		$idx,$idx,16
-	vperm		$inout,$inout,$inout,$outperm
-	vsel		$dat,$outhead,$inout,$outmask
-	 mtctr		$rounds
-	 ?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vmr		$outhead,$inout
-	 vxor		$inout,$ivec,$rndkey0
-	 lvx		$rndkey0,$idx,$key
-	 addi		$idx,$idx,16
-	stvx		$dat,0,$out
-	addi		$out,$out,16
-	bne		Loop_ctr32_enc
-
-	addi		$out,$out,-1
-	lvx		$inout,0,$out		# redundant in aligned case
-	vsel		$inout,$outhead,$inout,$outmask
-	stvx		$inout,0,$out
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,6,0
-	.long		0
-___
-#########################################################################
-{{	# Optimized CTR procedure					#
-my $key_="r11";
-my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,8,26..31));
-my ($in0, $in1, $in2, $in3, $in4, $in5, $in6, $in7 )=map("v$_",(0..3,10,12..14));
-my ($out0,$out1,$out2,$out3,$out4,$out5,$out6,$out7)=map("v$_",(15..22));
-my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
-			# v26-v31 last 6 round keys
-my ($tmp,$keyperm)=($in3,$in4);	# aliases with "caller", redundant assignment
-my ($two,$three,$four)=($outhead,$outperm,$outmask);
-
-$code.=<<___;
-.align	5
-_aesp8_ctr32_encrypt8x:
-	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
-	li		r10,`$FRAME+8*16+15`
-	li		r11,`$FRAME+8*16+31`
-	stvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	stvx		v21,r11,$sp
-	addi		r11,r11,32
-	stvx		v22,r10,$sp
-	addi		r10,r10,32
-	stvx		v23,r11,$sp
-	addi		r11,r11,32
-	stvx		v24,r10,$sp
-	addi		r10,r10,32
-	stvx		v25,r11,$sp
-	addi		r11,r11,32
-	stvx		v26,r10,$sp
-	addi		r10,r10,32
-	stvx		v27,r11,$sp
-	addi		r11,r11,32
-	stvx		v28,r10,$sp
-	addi		r10,r10,32
-	stvx		v29,r11,$sp
-	addi		r11,r11,32
-	stvx		v30,r10,$sp
-	stvx		v31,r11,$sp
-	li		r0,-1
-	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
-	li		$x10,0x10
-	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	li		$x20,0x20
-	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	li		$x30,0x30
-	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	li		$x40,0x40
-	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	li		$x50,0x50
-	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	li		$x60,0x60
-	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	li		$x70,0x70
-	mtspr		256,r0
-
-	subi		$rounds,$rounds,3	# -4 in total
-
-	lvx		$rndkey0,$x00,$key	# load key schedule
-	lvx		v30,$x10,$key
-	addi		$key,$key,0x20
-	lvx		v31,$x00,$key
-	?vperm		$rndkey0,$rndkey0,v30,$keyperm
-	addi		$key_,$sp,$FRAME+15
-	mtctr		$rounds
-
-Load_ctr32_enc_key:
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v30,$x10,$key
-	addi		$key,$key,0x20
-	stvx		v24,$x00,$key_		# off-load round[1]
-	?vperm		v25,v31,v30,$keyperm
-	lvx		v31,$x00,$key
-	stvx		v25,$x10,$key_		# off-load round[2]
-	addi		$key_,$key_,0x20
-	bdnz		Load_ctr32_enc_key
-
-	lvx		v26,$x10,$key
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v27,$x20,$key
-	stvx		v24,$x00,$key_		# off-load round[3]
-	?vperm		v25,v31,v26,$keyperm
-	lvx		v28,$x30,$key
-	stvx		v25,$x10,$key_		# off-load round[4]
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	?vperm		v26,v26,v27,$keyperm
-	lvx		v29,$x40,$key
-	?vperm		v27,v27,v28,$keyperm
-	lvx		v30,$x50,$key
-	?vperm		v28,v28,v29,$keyperm
-	lvx		v31,$x60,$key
-	?vperm		v29,v29,v30,$keyperm
-	lvx		$out0,$x70,$key		# borrow $out0
-	?vperm		v30,v30,v31,$keyperm
-	lvx		v24,$x00,$key_		# pre-load round[1]
-	?vperm		v31,v31,$out0,$keyperm
-	lvx		v25,$x10,$key_		# pre-load round[2]
-
-	vadduqm		$two,$one,$one
-	subi		$inp,$inp,15		# undo "caller"
-	$SHL		$len,$len,4
-
-	vadduqm		$out1,$ivec,$one	# counter values ...
-	vadduqm		$out2,$ivec,$two
-	vxor		$out0,$ivec,$rndkey0	# ... xored with rndkey[0]
-	 le?li		$idx,8
-	vadduqm		$out3,$out1,$two
-	vxor		$out1,$out1,$rndkey0
-	 le?lvsl	$inpperm,0,$idx
-	vadduqm		$out4,$out2,$two
-	vxor		$out2,$out2,$rndkey0
-	 le?vspltisb	$tmp,0x0f
-	vadduqm		$out5,$out3,$two
-	vxor		$out3,$out3,$rndkey0
-	 le?vxor	$inpperm,$inpperm,$tmp	# transform for lvx_u/stvx_u
-	vadduqm		$out6,$out4,$two
-	vxor		$out4,$out4,$rndkey0
-	vadduqm		$out7,$out5,$two
-	vxor		$out5,$out5,$rndkey0
-	vadduqm		$ivec,$out6,$two	# next counter value
-	vxor		$out6,$out6,$rndkey0
-	vxor		$out7,$out7,$rndkey0
-
-	mtctr		$rounds
-	b		Loop_ctr32_enc8x
-.align	5
-Loop_ctr32_enc8x:
-	vcipher 	$out0,$out0,v24
-	vcipher 	$out1,$out1,v24
-	vcipher 	$out2,$out2,v24
-	vcipher 	$out3,$out3,v24
-	vcipher 	$out4,$out4,v24
-	vcipher 	$out5,$out5,v24
-	vcipher 	$out6,$out6,v24
-	vcipher 	$out7,$out7,v24
-Loop_ctr32_enc8x_middle:
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vcipher 	$out0,$out0,v25
-	vcipher 	$out1,$out1,v25
-	vcipher 	$out2,$out2,v25
-	vcipher 	$out3,$out3,v25
-	vcipher 	$out4,$out4,v25
-	vcipher 	$out5,$out5,v25
-	vcipher 	$out6,$out6,v25
-	vcipher 	$out7,$out7,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_ctr32_enc8x
-
-	subic		r11,$len,256		# $len-256, borrow $key_
-	vcipher 	$out0,$out0,v24
-	vcipher 	$out1,$out1,v24
-	vcipher 	$out2,$out2,v24
-	vcipher 	$out3,$out3,v24
-	vcipher 	$out4,$out4,v24
-	vcipher 	$out5,$out5,v24
-	vcipher 	$out6,$out6,v24
-	vcipher 	$out7,$out7,v24
-
-	subfe		r0,r0,r0		# borrow?-1:0
-	vcipher 	$out0,$out0,v25
-	vcipher 	$out1,$out1,v25
-	vcipher 	$out2,$out2,v25
-	vcipher 	$out3,$out3,v25
-	vcipher 	$out4,$out4,v25
-	vcipher		$out5,$out5,v25
-	vcipher		$out6,$out6,v25
-	vcipher		$out7,$out7,v25
-
-	and		r0,r0,r11
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vcipher		$out0,$out0,v26
-	vcipher		$out1,$out1,v26
-	vcipher		$out2,$out2,v26
-	vcipher		$out3,$out3,v26
-	vcipher		$out4,$out4,v26
-	vcipher		$out5,$out5,v26
-	vcipher		$out6,$out6,v26
-	vcipher		$out7,$out7,v26
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-
-	subic		$len,$len,129		# $len-=129
-	vcipher		$out0,$out0,v27
-	addi		$len,$len,1		# $len-=128 really
-	vcipher		$out1,$out1,v27
-	vcipher		$out2,$out2,v27
-	vcipher		$out3,$out3,v27
-	vcipher		$out4,$out4,v27
-	vcipher		$out5,$out5,v27
-	vcipher		$out6,$out6,v27
-	vcipher		$out7,$out7,v27
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-
-	vcipher		$out0,$out0,v28
-	 lvx_u		$in0,$x00,$inp		# load input
-	vcipher		$out1,$out1,v28
-	 lvx_u		$in1,$x10,$inp
-	vcipher		$out2,$out2,v28
-	 lvx_u		$in2,$x20,$inp
-	vcipher		$out3,$out3,v28
-	 lvx_u		$in3,$x30,$inp
-	vcipher		$out4,$out4,v28
-	 lvx_u		$in4,$x40,$inp
-	vcipher		$out5,$out5,v28
-	 lvx_u		$in5,$x50,$inp
-	vcipher		$out6,$out6,v28
-	 lvx_u		$in6,$x60,$inp
-	vcipher		$out7,$out7,v28
-	 lvx_u		$in7,$x70,$inp
-	 addi		$inp,$inp,0x80
-
-	vcipher		$out0,$out0,v29
-	 le?vperm	$in0,$in0,$in0,$inpperm
-	vcipher		$out1,$out1,v29
-	 le?vperm	$in1,$in1,$in1,$inpperm
-	vcipher		$out2,$out2,v29
-	 le?vperm	$in2,$in2,$in2,$inpperm
-	vcipher		$out3,$out3,v29
-	 le?vperm	$in3,$in3,$in3,$inpperm
-	vcipher		$out4,$out4,v29
-	 le?vperm	$in4,$in4,$in4,$inpperm
-	vcipher		$out5,$out5,v29
-	 le?vperm	$in5,$in5,$in5,$inpperm
-	vcipher		$out6,$out6,v29
-	 le?vperm	$in6,$in6,$in6,$inpperm
-	vcipher		$out7,$out7,v29
-	 le?vperm	$in7,$in7,$in7,$inpperm
-
-	add		$inp,$inp,r0		# $inp is adjusted in such
-						# way that at exit from the
-						# loop inX-in7 are loaded
-						# with last "words"
-	subfe.		r0,r0,r0		# borrow?-1:0
-	vcipher		$out0,$out0,v30
-	 vxor		$in0,$in0,v31		# xor with last round key
-	vcipher		$out1,$out1,v30
-	 vxor		$in1,$in1,v31
-	vcipher		$out2,$out2,v30
-	 vxor		$in2,$in2,v31
-	vcipher		$out3,$out3,v30
-	 vxor		$in3,$in3,v31
-	vcipher		$out4,$out4,v30
-	 vxor		$in4,$in4,v31
-	vcipher		$out5,$out5,v30
-	 vxor		$in5,$in5,v31
-	vcipher		$out6,$out6,v30
-	 vxor		$in6,$in6,v31
-	vcipher		$out7,$out7,v30
-	 vxor		$in7,$in7,v31
-
-	bne		Lctr32_enc8x_break	# did $len-129 borrow?
-
-	vcipherlast	$in0,$out0,$in0
-	vcipherlast	$in1,$out1,$in1
-	 vadduqm	$out1,$ivec,$one	# counter values ...
-	vcipherlast	$in2,$out2,$in2
-	 vadduqm	$out2,$ivec,$two
-	 vxor		$out0,$ivec,$rndkey0	# ... xored with rndkey[0]
-	vcipherlast	$in3,$out3,$in3
-	 vadduqm	$out3,$out1,$two
-	 vxor		$out1,$out1,$rndkey0
-	vcipherlast	$in4,$out4,$in4
-	 vadduqm	$out4,$out2,$two
-	 vxor		$out2,$out2,$rndkey0
-	vcipherlast	$in5,$out5,$in5
-	 vadduqm	$out5,$out3,$two
-	 vxor		$out3,$out3,$rndkey0
-	vcipherlast	$in6,$out6,$in6
-	 vadduqm	$out6,$out4,$two
-	 vxor		$out4,$out4,$rndkey0
-	vcipherlast	$in7,$out7,$in7
-	 vadduqm	$out7,$out5,$two
-	 vxor		$out5,$out5,$rndkey0
-	le?vperm	$in0,$in0,$in0,$inpperm
-	 vadduqm	$ivec,$out6,$two	# next counter value
-	 vxor		$out6,$out6,$rndkey0
-	le?vperm	$in1,$in1,$in1,$inpperm
-	 vxor		$out7,$out7,$rndkey0
-	mtctr		$rounds
-
-	 vcipher	$out0,$out0,v24
-	stvx_u		$in0,$x00,$out
-	le?vperm	$in2,$in2,$in2,$inpperm
-	 vcipher	$out1,$out1,v24
-	stvx_u		$in1,$x10,$out
-	le?vperm	$in3,$in3,$in3,$inpperm
-	 vcipher	$out2,$out2,v24
-	stvx_u		$in2,$x20,$out
-	le?vperm	$in4,$in4,$in4,$inpperm
-	 vcipher	$out3,$out3,v24
-	stvx_u		$in3,$x30,$out
-	le?vperm	$in5,$in5,$in5,$inpperm
-	 vcipher	$out4,$out4,v24
-	stvx_u		$in4,$x40,$out
-	le?vperm	$in6,$in6,$in6,$inpperm
-	 vcipher	$out5,$out5,v24
-	stvx_u		$in5,$x50,$out
-	le?vperm	$in7,$in7,$in7,$inpperm
-	 vcipher	$out6,$out6,v24
-	stvx_u		$in6,$x60,$out
-	 vcipher	$out7,$out7,v24
-	stvx_u		$in7,$x70,$out
-	addi		$out,$out,0x80
-
-	b		Loop_ctr32_enc8x_middle
-
-.align	5
-Lctr32_enc8x_break:
-	cmpwi		$len,-0x60
-	blt		Lctr32_enc8x_one
-	nop
-	beq		Lctr32_enc8x_two
-	cmpwi		$len,-0x40
-	blt		Lctr32_enc8x_three
-	nop
-	beq		Lctr32_enc8x_four
-	cmpwi		$len,-0x20
-	blt		Lctr32_enc8x_five
-	nop
-	beq		Lctr32_enc8x_six
-	cmpwi		$len,0x00
-	blt		Lctr32_enc8x_seven
-
-Lctr32_enc8x_eight:
-	vcipherlast	$out0,$out0,$in0
-	vcipherlast	$out1,$out1,$in1
-	vcipherlast	$out2,$out2,$in2
-	vcipherlast	$out3,$out3,$in3
-	vcipherlast	$out4,$out4,$in4
-	vcipherlast	$out5,$out5,$in5
-	vcipherlast	$out6,$out6,$in6
-	vcipherlast	$out7,$out7,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x30,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x40,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x50,$out
-	le?vperm	$out7,$out7,$out7,$inpperm
-	stvx_u		$out6,$x60,$out
-	stvx_u		$out7,$x70,$out
-	addi		$out,$out,0x80
-	b		Lctr32_enc8x_done
-
-.align	5
-Lctr32_enc8x_seven:
-	vcipherlast	$out0,$out0,$in1
-	vcipherlast	$out1,$out1,$in2
-	vcipherlast	$out2,$out2,$in3
-	vcipherlast	$out3,$out3,$in4
-	vcipherlast	$out4,$out4,$in5
-	vcipherlast	$out5,$out5,$in6
-	vcipherlast	$out6,$out6,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x30,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x40,$out
-	le?vperm	$out6,$out6,$out6,$inpperm
-	stvx_u		$out5,$x50,$out
-	stvx_u		$out6,$x60,$out
-	addi		$out,$out,0x70
-	b		Lctr32_enc8x_done
-
-.align	5
-Lctr32_enc8x_six:
-	vcipherlast	$out0,$out0,$in2
-	vcipherlast	$out1,$out1,$in3
-	vcipherlast	$out2,$out2,$in4
-	vcipherlast	$out3,$out3,$in5
-	vcipherlast	$out4,$out4,$in6
-	vcipherlast	$out5,$out5,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x30,$out
-	le?vperm	$out5,$out5,$out5,$inpperm
-	stvx_u		$out4,$x40,$out
-	stvx_u		$out5,$x50,$out
-	addi		$out,$out,0x60
-	b		Lctr32_enc8x_done
-
-.align	5
-Lctr32_enc8x_five:
-	vcipherlast	$out0,$out0,$in3
-	vcipherlast	$out1,$out1,$in4
-	vcipherlast	$out2,$out2,$in5
-	vcipherlast	$out3,$out3,$in6
-	vcipherlast	$out4,$out4,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	le?vperm	$out4,$out4,$out4,$inpperm
-	stvx_u		$out3,$x30,$out
-	stvx_u		$out4,$x40,$out
-	addi		$out,$out,0x50
-	b		Lctr32_enc8x_done
-
-.align	5
-Lctr32_enc8x_four:
-	vcipherlast	$out0,$out0,$in4
-	vcipherlast	$out1,$out1,$in5
-	vcipherlast	$out2,$out2,$in6
-	vcipherlast	$out3,$out3,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$inpperm
-	stvx_u		$out2,$x20,$out
-	stvx_u		$out3,$x30,$out
-	addi		$out,$out,0x40
-	b		Lctr32_enc8x_done
-
-.align	5
-Lctr32_enc8x_three:
-	vcipherlast	$out0,$out0,$in5
-	vcipherlast	$out1,$out1,$in6
-	vcipherlast	$out2,$out2,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	le?vperm	$out2,$out2,$out2,$inpperm
-	stvx_u		$out1,$x10,$out
-	stvx_u		$out2,$x20,$out
-	addi		$out,$out,0x30
-	b		Lcbc_dec8x_done
-
-.align	5
-Lctr32_enc8x_two:
-	vcipherlast	$out0,$out0,$in6
-	vcipherlast	$out1,$out1,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	le?vperm	$out1,$out1,$out1,$inpperm
-	stvx_u		$out0,$x00,$out
-	stvx_u		$out1,$x10,$out
-	addi		$out,$out,0x20
-	b		Lcbc_dec8x_done
-
-.align	5
-Lctr32_enc8x_one:
-	vcipherlast	$out0,$out0,$in7
-
-	le?vperm	$out0,$out0,$out0,$inpperm
-	stvx_u		$out0,0,$out
-	addi		$out,$out,0x10
-
-Lctr32_enc8x_done:
-	li		r10,`$FRAME+15`
-	li		r11,`$FRAME+31`
-	stvx		$inpperm,r10,$sp	# wipe copies of round keys
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-	stvx		$inpperm,r10,$sp
-	addi		r10,r10,32
-	stvx		$inpperm,r11,$sp
-	addi		r11,r11,32
-
-	mtspr		256,$vrsave
-	lvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	lvx		v21,r11,$sp
-	addi		r11,r11,32
-	lvx		v22,r10,$sp
-	addi		r10,r10,32
-	lvx		v23,r11,$sp
-	addi		r11,r11,32
-	lvx		v24,r10,$sp
-	addi		r10,r10,32
-	lvx		v25,r11,$sp
-	addi		r11,r11,32
-	lvx		v26,r10,$sp
-	addi		r10,r10,32
-	lvx		v27,r11,$sp
-	addi		r11,r11,32
-	lvx		v28,r10,$sp
-	addi		r10,r10,32
-	lvx		v29,r11,$sp
-	addi		r11,r11,32
-	lvx		v30,r10,$sp
-	lvx		v31,r11,$sp
-	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0x80,6,6,0
-	.long		0
-.size	.${prefix}_ctr32_encrypt_blocks,.-.${prefix}_ctr32_encrypt_blocks
-___
-}}	}}}
-
-#########################################################################
-{{{	# XTS procedures						#
-# int aes_p8_xts_[en|de]crypt(const char *inp, char *out, size_t len,	#
-#                             const AES_KEY *key1, const AES_KEY *key2,	#
-#                             [const] unsigned char iv[16]);		#
-# If $key2 is NULL, then a "tweak chaining" mode is engaged, in which	#
-# input tweak value is assumed to be encrypted already, and last tweak	#
-# value, one suitable for consecutive call on same chunk of data, is	#
-# written back to original buffer. In addition, in "tweak chaining"	#
-# mode only complete input blocks are processed.			#
-
-my ($inp,$out,$len,$key1,$key2,$ivp,$rounds,$idx) =	map("r$_",(3..10));
-my ($rndkey0,$rndkey1,$inout) =				map("v$_",(0..2));
-my ($output,$inptail,$inpperm,$leperm,$keyperm) =	map("v$_",(3..7));
-my ($tweak,$seven,$eighty7,$tmp,$tweak1) =		map("v$_",(8..12));
-my $taillen = $key2;
-
-   ($inp,$idx) = ($idx,$inp);				# reassign
-
-$code.=<<___;
-.globl	.${prefix}_xts_encrypt
-	mr		$inp,r3				# reassign
-	li		r3,-1
-	${UCMP}i	$len,16
-	bltlr-
-
-	lis		r0,0xfff0
-	mfspr		r12,256				# save vrsave
-	li		r11,0
-	mtspr		256,r0
-
-	vspltisb	$seven,0x07			# 0x070707..07
-	le?lvsl		$leperm,r11,r11
-	le?vspltisb	$tmp,0x0f
-	le?vxor		$leperm,$leperm,$seven
-
-	li		$idx,15
-	lvx		$tweak,0,$ivp			# load [unaligned] iv
-	lvsl		$inpperm,0,$ivp
-	lvx		$inptail,$idx,$ivp
-	le?vxor		$inpperm,$inpperm,$tmp
-	vperm		$tweak,$tweak,$inptail,$inpperm
-
-	neg		r11,$inp
-	lvsr		$inpperm,0,r11			# prepare for unaligned load
-	lvx		$inout,0,$inp
-	addi		$inp,$inp,15			# 15 is not typo
-	le?vxor		$inpperm,$inpperm,$tmp
-
-	${UCMP}i	$key2,0				# key2==NULL?
-	beq		Lxts_enc_no_key2
-
-	?lvsl		$keyperm,0,$key2		# prepare for unaligned key
-	lwz		$rounds,240($key2)
-	srwi		$rounds,$rounds,1
-	subi		$rounds,$rounds,1
-	li		$idx,16
-
-	lvx		$rndkey0,0,$key2
-	lvx		$rndkey1,$idx,$key2
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$tweak,$tweak,$rndkey0
-	lvx		$rndkey0,$idx,$key2
-	addi		$idx,$idx,16
-	mtctr		$rounds
-
-Ltweak_xts_enc:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$tweak,$tweak,$rndkey1
-	lvx		$rndkey1,$idx,$key2
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipher		$tweak,$tweak,$rndkey0
-	lvx		$rndkey0,$idx,$key2
-	addi		$idx,$idx,16
-	bdnz		Ltweak_xts_enc
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$tweak,$tweak,$rndkey1
-	lvx		$rndkey1,$idx,$key2
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipherlast	$tweak,$tweak,$rndkey0
-
-	li		$ivp,0				# don't chain the tweak
-	b		Lxts_enc
-
-Lxts_enc_no_key2:
-	li		$idx,-16
-	and		$len,$len,$idx			# in "tweak chaining"
-							# mode only complete
-							# blocks are processed
-Lxts_enc:
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-
-	?lvsl		$keyperm,0,$key1		# prepare for unaligned key
-	lwz		$rounds,240($key1)
-	srwi		$rounds,$rounds,1
-	subi		$rounds,$rounds,1
-	li		$idx,16
-
-	vslb		$eighty7,$seven,$seven		# 0x808080..80
-	vor		$eighty7,$eighty7,$seven	# 0x878787..87
-	vspltisb	$tmp,1				# 0x010101..01
-	vsldoi		$eighty7,$eighty7,$tmp,15	# 0x870101..01
-
-	${UCMP}i	$len,96
-	bge		_aesp8_xts_encrypt6x
-
-	andi.		$taillen,$len,15
-	subic		r0,$len,32
-	subi		$taillen,$taillen,16
-	subfe		r0,r0,r0
-	and		r0,r0,$taillen
-	add		$inp,$inp,r0
-
-	lvx		$rndkey0,0,$key1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	vperm		$inout,$inout,$inptail,$inpperm
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$inout,$tweak
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-	mtctr		$rounds
-	b		Loop_xts_enc
-
-.align	5
-Loop_xts_enc:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipher		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-	bdnz		Loop_xts_enc
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	li		$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$rndkey0,$rndkey0,$tweak
-	vcipherlast	$output,$inout,$rndkey0
-
-	le?vperm	$tmp,$output,$output,$leperm
-	be?nop
-	le?stvx_u	$tmp,0,$out
-	be?stvx_u	$output,0,$out
-	addi		$out,$out,16
-
-	subic.		$len,$len,16
-	beq		Lxts_enc_done
-
-	vmr		$inout,$inptail
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-	lvx		$rndkey0,0,$key1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-
-	subic		r0,$len,32
-	subfe		r0,r0,r0
-	and		r0,r0,$taillen
-	add		$inp,$inp,r0
-
-	vsrab		$tmp,$tweak,$seven		# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	vxor		$tweak,$tweak,$tmp
-
-	vperm		$inout,$inout,$inptail,$inpperm
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$inout,$tweak
-	vxor		$output,$output,$rndkey0	# just in case $len<16
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-
-	mtctr		$rounds
-	${UCMP}i	$len,16
-	bge		Loop_xts_enc
-
-	vxor		$output,$output,$tweak
-	lvsr		$inpperm,0,$len			# $inpperm is no longer needed
-	vxor		$inptail,$inptail,$inptail	# $inptail is no longer needed
-	vspltisb	$tmp,-1
-	vperm		$inptail,$inptail,$tmp,$inpperm
-	vsel		$inout,$inout,$output,$inptail
-
-	subi		r11,$out,17
-	subi		$out,$out,16
-	mtctr		$len
-	li		$len,16
-Loop_xts_enc_steal:
-	lbzu		r0,1(r11)
-	stb		r0,16(r11)
-	bdnz		Loop_xts_enc_steal
-
-	mtctr		$rounds
-	b		Loop_xts_enc			# one more time...
-
-Lxts_enc_done:
-	${UCMP}i	$ivp,0
-	beq		Lxts_enc_ret
-
-	vsrab		$tmp,$tweak,$seven		# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	vxor		$tweak,$tweak,$tmp
-
-	le?vperm	$tweak,$tweak,$tweak,$leperm
-	stvx_u		$tweak,0,$ivp
-
-Lxts_enc_ret:
-	mtspr		256,r12				# restore vrsave
-	li		r3,0
-	blr
-	.long		0
-	.byte		0,12,0x04,0,0x80,6,6,0
-	.long		0
-.size	.${prefix}_xts_encrypt,.-.${prefix}_xts_encrypt
-
-.globl	.${prefix}_xts_decrypt
-	mr		$inp,r3				# reassign
-	li		r3,-1
-	${UCMP}i	$len,16
-	bltlr-
-
-	lis		r0,0xfff8
-	mfspr		r12,256				# save vrsave
-	li		r11,0
-	mtspr		256,r0
-
-	andi.		r0,$len,15
-	neg		r0,r0
-	andi.		r0,r0,16
-	sub		$len,$len,r0
-
-	vspltisb	$seven,0x07			# 0x070707..07
-	le?lvsl		$leperm,r11,r11
-	le?vspltisb	$tmp,0x0f
-	le?vxor		$leperm,$leperm,$seven
-
-	li		$idx,15
-	lvx		$tweak,0,$ivp			# load [unaligned] iv
-	lvsl		$inpperm,0,$ivp
-	lvx		$inptail,$idx,$ivp
-	le?vxor		$inpperm,$inpperm,$tmp
-	vperm		$tweak,$tweak,$inptail,$inpperm
-
-	neg		r11,$inp
-	lvsr		$inpperm,0,r11			# prepare for unaligned load
-	lvx		$inout,0,$inp
-	addi		$inp,$inp,15			# 15 is not typo
-	le?vxor		$inpperm,$inpperm,$tmp
-
-	${UCMP}i	$key2,0				# key2==NULL?
-	beq		Lxts_dec_no_key2
-
-	?lvsl		$keyperm,0,$key2		# prepare for unaligned key
-	lwz		$rounds,240($key2)
-	srwi		$rounds,$rounds,1
-	subi		$rounds,$rounds,1
-	li		$idx,16
-
-	lvx		$rndkey0,0,$key2
-	lvx		$rndkey1,$idx,$key2
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$tweak,$tweak,$rndkey0
-	lvx		$rndkey0,$idx,$key2
-	addi		$idx,$idx,16
-	mtctr		$rounds
-
-Ltweak_xts_dec:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$tweak,$tweak,$rndkey1
-	lvx		$rndkey1,$idx,$key2
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipher		$tweak,$tweak,$rndkey0
-	lvx		$rndkey0,$idx,$key2
-	addi		$idx,$idx,16
-	bdnz		Ltweak_xts_dec
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vcipher		$tweak,$tweak,$rndkey1
-	lvx		$rndkey1,$idx,$key2
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vcipherlast	$tweak,$tweak,$rndkey0
-
-	li		$ivp,0				# don't chain the tweak
-	b		Lxts_dec
-
-Lxts_dec_no_key2:
-	neg		$idx,$len
-	andi.		$idx,$idx,15
-	add		$len,$len,$idx			# in "tweak chaining"
-							# mode only complete
-							# blocks are processed
-Lxts_dec:
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-
-	?lvsl		$keyperm,0,$key1		# prepare for unaligned key
-	lwz		$rounds,240($key1)
-	srwi		$rounds,$rounds,1
-	subi		$rounds,$rounds,1
-	li		$idx,16
-
-	vslb		$eighty7,$seven,$seven		# 0x808080..80
-	vor		$eighty7,$eighty7,$seven	# 0x878787..87
-	vspltisb	$tmp,1				# 0x010101..01
-	vsldoi		$eighty7,$eighty7,$tmp,15	# 0x870101..01
-
-	${UCMP}i	$len,96
-	bge		_aesp8_xts_decrypt6x
-
-	lvx		$rndkey0,0,$key1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	vperm		$inout,$inout,$inptail,$inpperm
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$inout,$tweak
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-	mtctr		$rounds
-
-	${UCMP}i	$len,16
-	blt		Ltail_xts_dec
-	be?b		Loop_xts_dec
-
-.align	5
-Loop_xts_dec:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vncipher	$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-	bdnz		Loop_xts_dec
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	li		$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$rndkey0,$rndkey0,$tweak
-	vncipherlast	$output,$inout,$rndkey0
-
-	le?vperm	$tmp,$output,$output,$leperm
-	be?nop
-	le?stvx_u	$tmp,0,$out
-	be?stvx_u	$output,0,$out
-	addi		$out,$out,16
-
-	subic.		$len,$len,16
-	beq		Lxts_dec_done
-
-	vmr		$inout,$inptail
-	lvx		$inptail,0,$inp
-	addi		$inp,$inp,16
-	lvx		$rndkey0,0,$key1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-
-	vsrab		$tmp,$tweak,$seven		# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	vxor		$tweak,$tweak,$tmp
-
-	vperm		$inout,$inout,$inptail,$inpperm
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$inout,$inout,$tweak
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-
-	mtctr		$rounds
-	${UCMP}i	$len,16
-	bge		Loop_xts_dec
-
-Ltail_xts_dec:
-	vsrab		$tmp,$tweak,$seven		# next tweak value
-	vaddubm		$tweak1,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	vxor		$tweak1,$tweak1,$tmp
-
-	subi		$inp,$inp,16
-	add		$inp,$inp,$len
-
-	vxor		$inout,$inout,$tweak		# :-(
-	vxor		$inout,$inout,$tweak1		# :-)
-
-Loop_xts_dec_short:
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vncipher	$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-	bdnz		Loop_xts_dec_short
-
-	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
-	vncipher	$inout,$inout,$rndkey1
-	lvx		$rndkey1,$idx,$key1
-	li		$idx,16
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-	vxor		$rndkey0,$rndkey0,$tweak1
-	vncipherlast	$output,$inout,$rndkey0
-
-	le?vperm	$tmp,$output,$output,$leperm
-	be?nop
-	le?stvx_u	$tmp,0,$out
-	be?stvx_u	$output,0,$out
-
-	vmr		$inout,$inptail
-	lvx		$inptail,0,$inp
-	#addi		$inp,$inp,16
-	lvx		$rndkey0,0,$key1
-	lvx		$rndkey1,$idx,$key1
-	addi		$idx,$idx,16
-	vperm		$inout,$inout,$inptail,$inpperm
-	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
-
-	lvsr		$inpperm,0,$len			# $inpperm is no longer needed
-	vxor		$inptail,$inptail,$inptail	# $inptail is no longer needed
-	vspltisb	$tmp,-1
-	vperm		$inptail,$inptail,$tmp,$inpperm
-	vsel		$inout,$inout,$output,$inptail
-
-	vxor		$rndkey0,$rndkey0,$tweak
-	vxor		$inout,$inout,$rndkey0
-	lvx		$rndkey0,$idx,$key1
-	addi		$idx,$idx,16
-
-	subi		r11,$out,1
-	mtctr		$len
-	li		$len,16
-Loop_xts_dec_steal:
-	lbzu		r0,1(r11)
-	stb		r0,16(r11)
-	bdnz		Loop_xts_dec_steal
-
-	mtctr		$rounds
-	b		Loop_xts_dec			# one more time...
-
-Lxts_dec_done:
-	${UCMP}i	$ivp,0
-	beq		Lxts_dec_ret
-
-	vsrab		$tmp,$tweak,$seven		# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	vxor		$tweak,$tweak,$tmp
-
-	le?vperm	$tweak,$tweak,$tweak,$leperm
-	stvx_u		$tweak,0,$ivp
-
-Lxts_dec_ret:
-	mtspr		256,r12				# restore vrsave
-	li		r3,0
-	blr
-	.long		0
-	.byte		0,12,0x04,0,0x80,6,6,0
-	.long		0
-.size	.${prefix}_xts_decrypt,.-.${prefix}_xts_decrypt
-___
-#########################################################################
-{{	# Optimized XTS procedures					#
-my $key_=$key2;
-my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,3,26..31));
-    $x00=0 if ($flavour =~ /osx/);
-my ($in0,  $in1,  $in2,  $in3,  $in4,  $in5 )=map("v$_",(0..5));
-my ($out0, $out1, $out2, $out3, $out4, $out5)=map("v$_",(7,12..16));
-my ($twk0, $twk1, $twk2, $twk3, $twk4, $twk5)=map("v$_",(17..22));
-my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
-			# v26-v31 last 6 round keys
-my ($keyperm)=($out0);	# aliases with "caller", redundant assignment
-my $taillen=$x70;
-
-$code.=<<___;
-.align	5
-_aesp8_xts_encrypt6x:
-	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
-	mflr		r11
-	li		r7,`$FRAME+8*16+15`
-	li		r3,`$FRAME+8*16+31`
-	$PUSH		r11,`$FRAME+21*16+6*$SIZE_T+$LRSAVE`($sp)
-	stvx		v20,r7,$sp		# ABI says so
-	addi		r7,r7,32
-	stvx		v21,r3,$sp
-	addi		r3,r3,32
-	stvx		v22,r7,$sp
-	addi		r7,r7,32
-	stvx		v23,r3,$sp
-	addi		r3,r3,32
-	stvx		v24,r7,$sp
-	addi		r7,r7,32
-	stvx		v25,r3,$sp
-	addi		r3,r3,32
-	stvx		v26,r7,$sp
-	addi		r7,r7,32
-	stvx		v27,r3,$sp
-	addi		r3,r3,32
-	stvx		v28,r7,$sp
-	addi		r7,r7,32
-	stvx		v29,r3,$sp
-	addi		r3,r3,32
-	stvx		v30,r7,$sp
-	stvx		v31,r3,$sp
-	li		r0,-1
-	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
-	li		$x10,0x10
-	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	li		$x20,0x20
-	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	li		$x30,0x30
-	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	li		$x40,0x40
-	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	li		$x50,0x50
-	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	li		$x60,0x60
-	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	li		$x70,0x70
-	mtspr		256,r0
-
-	subi		$rounds,$rounds,3	# -4 in total
-
-	lvx		$rndkey0,$x00,$key1	# load key schedule
-	lvx		v30,$x10,$key1
-	addi		$key1,$key1,0x20
-	lvx		v31,$x00,$key1
-	?vperm		$rndkey0,$rndkey0,v30,$keyperm
-	addi		$key_,$sp,$FRAME+15
-	mtctr		$rounds
-
-Load_xts_enc_key:
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v30,$x10,$key1
-	addi		$key1,$key1,0x20
-	stvx		v24,$x00,$key_		# off-load round[1]
-	?vperm		v25,v31,v30,$keyperm
-	lvx		v31,$x00,$key1
-	stvx		v25,$x10,$key_		# off-load round[2]
-	addi		$key_,$key_,0x20
-	bdnz		Load_xts_enc_key
-
-	lvx		v26,$x10,$key1
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v27,$x20,$key1
-	stvx		v24,$x00,$key_		# off-load round[3]
-	?vperm		v25,v31,v26,$keyperm
-	lvx		v28,$x30,$key1
-	stvx		v25,$x10,$key_		# off-load round[4]
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	?vperm		v26,v26,v27,$keyperm
-	lvx		v29,$x40,$key1
-	?vperm		v27,v27,v28,$keyperm
-	lvx		v30,$x50,$key1
-	?vperm		v28,v28,v29,$keyperm
-	lvx		v31,$x60,$key1
-	?vperm		v29,v29,v30,$keyperm
-	lvx		$twk5,$x70,$key1	# borrow $twk5
-	?vperm		v30,v30,v31,$keyperm
-	lvx		v24,$x00,$key_		# pre-load round[1]
-	?vperm		v31,v31,$twk5,$keyperm
-	lvx		v25,$x10,$key_		# pre-load round[2]
-
-	 vperm		$in0,$inout,$inptail,$inpperm
-	 subi		$inp,$inp,31		# undo "caller"
-	vxor		$twk0,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out0,$in0,$twk0
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in1,$x10,$inp
-	vxor		$twk1,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in1,$in1,$in1,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out1,$in1,$twk1
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in2,$x20,$inp
-	 andi.		$taillen,$len,15
-	vxor		$twk2,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in2,$in2,$in2,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out2,$in2,$twk2
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in3,$x30,$inp
-	 sub		$len,$len,$taillen
-	vxor		$twk3,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in3,$in3,$in3,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out3,$in3,$twk3
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in4,$x40,$inp
-	 subi		$len,$len,0x60
-	vxor		$twk4,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in4,$in4,$in4,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out4,$in4,$twk4
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in5,$x50,$inp
-	 addi		$inp,$inp,0x60
-	vxor		$twk5,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in5,$in5,$in5,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out5,$in5,$twk5
-	vxor		$tweak,$tweak,$tmp
-
-	vxor		v31,v31,$rndkey0
-	mtctr		$rounds
-	b		Loop_xts_enc6x
-
-.align	5
-Loop_xts_enc6x:
-	vcipher		$out0,$out0,v24
-	vcipher		$out1,$out1,v24
-	vcipher		$out2,$out2,v24
-	vcipher		$out3,$out3,v24
-	vcipher		$out4,$out4,v24
-	vcipher		$out5,$out5,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vcipher		$out0,$out0,v25
-	vcipher		$out1,$out1,v25
-	vcipher		$out2,$out2,v25
-	vcipher		$out3,$out3,v25
-	vcipher		$out4,$out4,v25
-	vcipher		$out5,$out5,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_xts_enc6x
-
-	subic		$len,$len,96		# $len-=96
-	 vxor		$in0,$twk0,v31		# xor with last round key
-	vcipher		$out0,$out0,v24
-	vcipher		$out1,$out1,v24
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk0,$tweak,$rndkey0
-	 vaddubm	$tweak,$tweak,$tweak
-	vcipher		$out2,$out2,v24
-	vcipher		$out3,$out3,v24
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vcipher		$out4,$out4,v24
-	vcipher		$out5,$out5,v24
-
-	subfe.		r0,r0,r0		# borrow?-1:0
-	 vand		$tmp,$tmp,$eighty7
-	vcipher		$out0,$out0,v25
-	vcipher		$out1,$out1,v25
-	 vxor		$tweak,$tweak,$tmp
-	vcipher		$out2,$out2,v25
-	vcipher		$out3,$out3,v25
-	 vxor		$in1,$twk1,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk1,$tweak,$rndkey0
-	vcipher		$out4,$out4,v25
-	vcipher		$out5,$out5,v25
-
-	and		r0,r0,$len
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vcipher		$out0,$out0,v26
-	vcipher		$out1,$out1,v26
-	 vand		$tmp,$tmp,$eighty7
-	vcipher		$out2,$out2,v26
-	vcipher		$out3,$out3,v26
-	 vxor		$tweak,$tweak,$tmp
-	vcipher		$out4,$out4,v26
-	vcipher		$out5,$out5,v26
-
-	add		$inp,$inp,r0		# $inp is adjusted in such
-						# way that at exit from the
-						# loop inX-in5 are loaded
-						# with last "words"
-	 vxor		$in2,$twk2,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk2,$tweak,$rndkey0
-	 vaddubm	$tweak,$tweak,$tweak
-	vcipher		$out0,$out0,v27
-	vcipher		$out1,$out1,v27
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vcipher		$out2,$out2,v27
-	vcipher		$out3,$out3,v27
-	 vand		$tmp,$tmp,$eighty7
-	vcipher		$out4,$out4,v27
-	vcipher		$out5,$out5,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	 vxor		$tweak,$tweak,$tmp
-	vcipher		$out0,$out0,v28
-	vcipher		$out1,$out1,v28
-	 vxor		$in3,$twk3,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk3,$tweak,$rndkey0
-	vcipher		$out2,$out2,v28
-	vcipher		$out3,$out3,v28
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vcipher		$out4,$out4,v28
-	vcipher		$out5,$out5,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-	 vand		$tmp,$tmp,$eighty7
-
-	vcipher		$out0,$out0,v29
-	vcipher		$out1,$out1,v29
-	 vxor		$tweak,$tweak,$tmp
-	vcipher		$out2,$out2,v29
-	vcipher		$out3,$out3,v29
-	 vxor		$in4,$twk4,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk4,$tweak,$rndkey0
-	vcipher		$out4,$out4,v29
-	vcipher		$out5,$out5,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-
-	vcipher		$out0,$out0,v30
-	vcipher		$out1,$out1,v30
-	 vand		$tmp,$tmp,$eighty7
-	vcipher		$out2,$out2,v30
-	vcipher		$out3,$out3,v30
-	 vxor		$tweak,$tweak,$tmp
-	vcipher		$out4,$out4,v30
-	vcipher		$out5,$out5,v30
-	 vxor		$in5,$twk5,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk5,$tweak,$rndkey0
-
-	vcipherlast	$out0,$out0,$in0
-	 lvx_u		$in0,$x00,$inp		# load next input block
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vcipherlast	$out1,$out1,$in1
-	 lvx_u		$in1,$x10,$inp
-	vcipherlast	$out2,$out2,$in2
-	 le?vperm	$in0,$in0,$in0,$leperm
-	 lvx_u		$in2,$x20,$inp
-	 vand		$tmp,$tmp,$eighty7
-	vcipherlast	$out3,$out3,$in3
-	 le?vperm	$in1,$in1,$in1,$leperm
-	 lvx_u		$in3,$x30,$inp
-	vcipherlast	$out4,$out4,$in4
-	 le?vperm	$in2,$in2,$in2,$leperm
-	 lvx_u		$in4,$x40,$inp
-	 vxor		$tweak,$tweak,$tmp
-	vcipherlast	$tmp,$out5,$in5		# last block might be needed
-						# in stealing mode
-	 le?vperm	$in3,$in3,$in3,$leperm
-	 lvx_u		$in5,$x50,$inp
-	 addi		$inp,$inp,0x60
-	 le?vperm	$in4,$in4,$in4,$leperm
-	 le?vperm	$in5,$in5,$in5,$leperm
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	 vxor		$out0,$in0,$twk0
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	 vxor		$out1,$in1,$twk1
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	 vxor		$out2,$in2,$twk2
-	le?vperm	$out4,$out4,$out4,$leperm
-	stvx_u		$out3,$x30,$out
-	 vxor		$out3,$in3,$twk3
-	le?vperm	$out5,$tmp,$tmp,$leperm
-	stvx_u		$out4,$x40,$out
-	 vxor		$out4,$in4,$twk4
-	le?stvx_u	$out5,$x50,$out
-	be?stvx_u	$tmp, $x50,$out
-	 vxor		$out5,$in5,$twk5
-	addi		$out,$out,0x60
-
-	mtctr		$rounds
-	beq		Loop_xts_enc6x		# did $len-=96 borrow?
-
-	addic.		$len,$len,0x60
-	beq		Lxts_enc6x_zero
-	cmpwi		$len,0x20
-	blt		Lxts_enc6x_one
-	nop
-	beq		Lxts_enc6x_two
-	cmpwi		$len,0x40
-	blt		Lxts_enc6x_three
-	nop
-	beq		Lxts_enc6x_four
-
-Lxts_enc6x_five:
-	vxor		$out0,$in1,$twk0
-	vxor		$out1,$in2,$twk1
-	vxor		$out2,$in3,$twk2
-	vxor		$out3,$in4,$twk3
-	vxor		$out4,$in5,$twk4
-
-	bl		_aesp8_xts_enc5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk5		# unused tweak
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	vxor		$tmp,$out4,$twk5	# last block prep for stealing
-	le?vperm	$out4,$out4,$out4,$leperm
-	stvx_u		$out3,$x30,$out
-	stvx_u		$out4,$x40,$out
-	addi		$out,$out,0x50
-	bne		Lxts_enc6x_steal
-	b		Lxts_enc6x_done
-
-.align	4
-Lxts_enc6x_four:
-	vxor		$out0,$in2,$twk0
-	vxor		$out1,$in3,$twk1
-	vxor		$out2,$in4,$twk2
-	vxor		$out3,$in5,$twk3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_enc5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk4		# unused tweak
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	vxor		$tmp,$out3,$twk4	# last block prep for stealing
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	stvx_u		$out3,$x30,$out
-	addi		$out,$out,0x40
-	bne		Lxts_enc6x_steal
-	b		Lxts_enc6x_done
-
-.align	4
-Lxts_enc6x_three:
-	vxor		$out0,$in3,$twk0
-	vxor		$out1,$in4,$twk1
-	vxor		$out2,$in5,$twk2
-	vxor		$out3,$out3,$out3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_enc5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk3		# unused tweak
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	vxor		$tmp,$out2,$twk3	# last block prep for stealing
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	stvx_u		$out2,$x20,$out
-	addi		$out,$out,0x30
-	bne		Lxts_enc6x_steal
-	b		Lxts_enc6x_done
-
-.align	4
-Lxts_enc6x_two:
-	vxor		$out0,$in4,$twk0
-	vxor		$out1,$in5,$twk1
-	vxor		$out2,$out2,$out2
-	vxor		$out3,$out3,$out3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_enc5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk2		# unused tweak
-	vxor		$tmp,$out1,$twk2	# last block prep for stealing
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	stvx_u		$out1,$x10,$out
-	addi		$out,$out,0x20
-	bne		Lxts_enc6x_steal
-	b		Lxts_enc6x_done
-
-.align	4
-Lxts_enc6x_one:
-	vxor		$out0,$in5,$twk0
-	nop
-Loop_xts_enc1x:
-	vcipher		$out0,$out0,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vcipher		$out0,$out0,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_xts_enc1x
-
-	add		$inp,$inp,$taillen
-	cmpwi		$taillen,0
-	vcipher		$out0,$out0,v24
-
-	subi		$inp,$inp,16
-	vcipher		$out0,$out0,v25
-
-	lvsr		$inpperm,0,$taillen
-	vcipher		$out0,$out0,v26
-
-	lvx_u		$in0,0,$inp
-	vcipher		$out0,$out0,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vcipher		$out0,$out0,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-
-	vcipher		$out0,$out0,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vxor		$twk0,$twk0,v31
-
-	le?vperm	$in0,$in0,$in0,$leperm
-	vcipher		$out0,$out0,v30
-
-	vperm		$in0,$in0,$in0,$inpperm
-	vcipherlast	$out0,$out0,$twk0
-
-	vmr		$twk0,$twk1		# unused tweak
-	vxor		$tmp,$out0,$twk1	# last block prep for stealing
-	le?vperm	$out0,$out0,$out0,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	addi		$out,$out,0x10
-	bne		Lxts_enc6x_steal
-	b		Lxts_enc6x_done
-
-.align	4
-Lxts_enc6x_zero:
-	cmpwi		$taillen,0
-	beq		Lxts_enc6x_done
-
-	add		$inp,$inp,$taillen
-	subi		$inp,$inp,16
-	lvx_u		$in0,0,$inp
-	lvsr		$inpperm,0,$taillen	# $in5 is no more
-	le?vperm	$in0,$in0,$in0,$leperm
-	vperm		$in0,$in0,$in0,$inpperm
-	vxor		$tmp,$tmp,$twk0
-Lxts_enc6x_steal:
-	vxor		$in0,$in0,$twk0
-	vxor		$out0,$out0,$out0
-	vspltisb	$out1,-1
-	vperm		$out0,$out0,$out1,$inpperm
-	vsel		$out0,$in0,$tmp,$out0	# $tmp is last block, remember?
-
-	subi		r30,$out,17
-	subi		$out,$out,16
-	mtctr		$taillen
-Loop_xts_enc6x_steal:
-	lbzu		r0,1(r30)
-	stb		r0,16(r30)
-	bdnz		Loop_xts_enc6x_steal
-
-	li		$taillen,0
-	mtctr		$rounds
-	b		Loop_xts_enc1x		# one more time...
-
-.align	4
-Lxts_enc6x_done:
-	${UCMP}i	$ivp,0
-	beq		Lxts_enc6x_ret
-
-	vxor		$tweak,$twk0,$rndkey0
-	le?vperm	$tweak,$tweak,$tweak,$leperm
-	stvx_u		$tweak,0,$ivp
-
-Lxts_enc6x_ret:
-	mtlr		r11
-	li		r10,`$FRAME+15`
-	li		r11,`$FRAME+31`
-	stvx		$seven,r10,$sp		# wipe copies of round keys
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-
-	mtspr		256,$vrsave
-	lvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	lvx		v21,r11,$sp
-	addi		r11,r11,32
-	lvx		v22,r10,$sp
-	addi		r10,r10,32
-	lvx		v23,r11,$sp
-	addi		r11,r11,32
-	lvx		v24,r10,$sp
-	addi		r10,r10,32
-	lvx		v25,r11,$sp
-	addi		r11,r11,32
-	lvx		v26,r10,$sp
-	addi		r10,r10,32
-	lvx		v27,r11,$sp
-	addi		r11,r11,32
-	lvx		v28,r10,$sp
-	addi		r10,r10,32
-	lvx		v29,r11,$sp
-	addi		r11,r11,32
-	lvx		v30,r10,$sp
-	lvx		v31,r11,$sp
-	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
-	blr
-	.long		0
-	.byte		0,12,0x04,1,0x80,6,6,0
-	.long		0
-
-.align	5
-_aesp8_xts_enc5x:
-	vcipher		$out0,$out0,v24
-	vcipher		$out1,$out1,v24
-	vcipher		$out2,$out2,v24
-	vcipher		$out3,$out3,v24
-	vcipher		$out4,$out4,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vcipher		$out0,$out0,v25
-	vcipher		$out1,$out1,v25
-	vcipher		$out2,$out2,v25
-	vcipher		$out3,$out3,v25
-	vcipher		$out4,$out4,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		_aesp8_xts_enc5x
-
-	add		$inp,$inp,$taillen
-	cmpwi		$taillen,0
-	vcipher		$out0,$out0,v24
-	vcipher		$out1,$out1,v24
-	vcipher		$out2,$out2,v24
-	vcipher		$out3,$out3,v24
-	vcipher		$out4,$out4,v24
-
-	subi		$inp,$inp,16
-	vcipher		$out0,$out0,v25
-	vcipher		$out1,$out1,v25
-	vcipher		$out2,$out2,v25
-	vcipher		$out3,$out3,v25
-	vcipher		$out4,$out4,v25
-	 vxor		$twk0,$twk0,v31
-
-	vcipher		$out0,$out0,v26
-	lvsr		$inpperm,r0,$taillen	# $in5 is no more
-	vcipher		$out1,$out1,v26
-	vcipher		$out2,$out2,v26
-	vcipher		$out3,$out3,v26
-	vcipher		$out4,$out4,v26
-	 vxor		$in1,$twk1,v31
-
-	vcipher		$out0,$out0,v27
-	lvx_u		$in0,0,$inp
-	vcipher		$out1,$out1,v27
-	vcipher		$out2,$out2,v27
-	vcipher		$out3,$out3,v27
-	vcipher		$out4,$out4,v27
-	 vxor		$in2,$twk2,v31
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vcipher		$out0,$out0,v28
-	vcipher		$out1,$out1,v28
-	vcipher		$out2,$out2,v28
-	vcipher		$out3,$out3,v28
-	vcipher		$out4,$out4,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-	 vxor		$in3,$twk3,v31
-
-	vcipher		$out0,$out0,v29
-	le?vperm	$in0,$in0,$in0,$leperm
-	vcipher		$out1,$out1,v29
-	vcipher		$out2,$out2,v29
-	vcipher		$out3,$out3,v29
-	vcipher		$out4,$out4,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vxor		$in4,$twk4,v31
-
-	vcipher		$out0,$out0,v30
-	vperm		$in0,$in0,$in0,$inpperm
-	vcipher		$out1,$out1,v30
-	vcipher		$out2,$out2,v30
-	vcipher		$out3,$out3,v30
-	vcipher		$out4,$out4,v30
-
-	vcipherlast	$out0,$out0,$twk0
-	vcipherlast	$out1,$out1,$in1
-	vcipherlast	$out2,$out2,$in2
-	vcipherlast	$out3,$out3,$in3
-	vcipherlast	$out4,$out4,$in4
-	blr
-        .long   	0
-        .byte   	0,12,0x14,0,0,0,0,0
-
-.align	5
-_aesp8_xts_decrypt6x:
-	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
-	mflr		r11
-	li		r7,`$FRAME+8*16+15`
-	li		r3,`$FRAME+8*16+31`
-	$PUSH		r11,`$FRAME+21*16+6*$SIZE_T+$LRSAVE`($sp)
-	stvx		v20,r7,$sp		# ABI says so
-	addi		r7,r7,32
-	stvx		v21,r3,$sp
-	addi		r3,r3,32
-	stvx		v22,r7,$sp
-	addi		r7,r7,32
-	stvx		v23,r3,$sp
-	addi		r3,r3,32
-	stvx		v24,r7,$sp
-	addi		r7,r7,32
-	stvx		v25,r3,$sp
-	addi		r3,r3,32
-	stvx		v26,r7,$sp
-	addi		r7,r7,32
-	stvx		v27,r3,$sp
-	addi		r3,r3,32
-	stvx		v28,r7,$sp
-	addi		r7,r7,32
-	stvx		v29,r3,$sp
-	addi		r3,r3,32
-	stvx		v30,r7,$sp
-	stvx		v31,r3,$sp
-	li		r0,-1
-	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
-	li		$x10,0x10
-	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	li		$x20,0x20
-	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	li		$x30,0x30
-	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	li		$x40,0x40
-	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	li		$x50,0x50
-	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	li		$x60,0x60
-	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	li		$x70,0x70
-	mtspr		256,r0
-
-	subi		$rounds,$rounds,3	# -4 in total
-
-	lvx		$rndkey0,$x00,$key1	# load key schedule
-	lvx		v30,$x10,$key1
-	addi		$key1,$key1,0x20
-	lvx		v31,$x00,$key1
-	?vperm		$rndkey0,$rndkey0,v30,$keyperm
-	addi		$key_,$sp,$FRAME+15
-	mtctr		$rounds
-
-Load_xts_dec_key:
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v30,$x10,$key1
-	addi		$key1,$key1,0x20
-	stvx		v24,$x00,$key_		# off-load round[1]
-	?vperm		v25,v31,v30,$keyperm
-	lvx		v31,$x00,$key1
-	stvx		v25,$x10,$key_		# off-load round[2]
-	addi		$key_,$key_,0x20
-	bdnz		Load_xts_dec_key
-
-	lvx		v26,$x10,$key1
-	?vperm		v24,v30,v31,$keyperm
-	lvx		v27,$x20,$key1
-	stvx		v24,$x00,$key_		# off-load round[3]
-	?vperm		v25,v31,v26,$keyperm
-	lvx		v28,$x30,$key1
-	stvx		v25,$x10,$key_		# off-load round[4]
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	?vperm		v26,v26,v27,$keyperm
-	lvx		v29,$x40,$key1
-	?vperm		v27,v27,v28,$keyperm
-	lvx		v30,$x50,$key1
-	?vperm		v28,v28,v29,$keyperm
-	lvx		v31,$x60,$key1
-	?vperm		v29,v29,v30,$keyperm
-	lvx		$twk5,$x70,$key1	# borrow $twk5
-	?vperm		v30,v30,v31,$keyperm
-	lvx		v24,$x00,$key_		# pre-load round[1]
-	?vperm		v31,v31,$twk5,$keyperm
-	lvx		v25,$x10,$key_		# pre-load round[2]
-
-	 vperm		$in0,$inout,$inptail,$inpperm
-	 subi		$inp,$inp,31		# undo "caller"
-	vxor		$twk0,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out0,$in0,$twk0
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in1,$x10,$inp
-	vxor		$twk1,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in1,$in1,$in1,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out1,$in1,$twk1
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in2,$x20,$inp
-	 andi.		$taillen,$len,15
-	vxor		$twk2,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in2,$in2,$in2,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out2,$in2,$twk2
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in3,$x30,$inp
-	 sub		$len,$len,$taillen
-	vxor		$twk3,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in3,$in3,$in3,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out3,$in3,$twk3
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in4,$x40,$inp
-	 subi		$len,$len,0x60
-	vxor		$twk4,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in4,$in4,$in4,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out4,$in4,$twk4
-	vxor		$tweak,$tweak,$tmp
-
-	 lvx_u		$in5,$x50,$inp
-	 addi		$inp,$inp,0x60
-	vxor		$twk5,$tweak,$rndkey0
-	vsrab		$tmp,$tweak,$seven	# next tweak value
-	vaddubm		$tweak,$tweak,$tweak
-	vsldoi		$tmp,$tmp,$tmp,15
-	 le?vperm	$in5,$in5,$in5,$leperm
-	vand		$tmp,$tmp,$eighty7
-	 vxor		$out5,$in5,$twk5
-	vxor		$tweak,$tweak,$tmp
-
-	vxor		v31,v31,$rndkey0
-	mtctr		$rounds
-	b		Loop_xts_dec6x
-
-.align	5
-Loop_xts_dec6x:
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_xts_dec6x
-
-	subic		$len,$len,96		# $len-=96
-	 vxor		$in0,$twk0,v31		# xor with last round key
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk0,$tweak,$rndkey0
-	 vaddubm	$tweak,$tweak,$tweak
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vncipher	$out4,$out4,v24
-	vncipher	$out5,$out5,v24
-
-	subfe.		r0,r0,r0		# borrow?-1:0
-	 vand		$tmp,$tmp,$eighty7
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	 vxor		$tweak,$tweak,$tmp
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	 vxor		$in1,$twk1,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk1,$tweak,$rndkey0
-	vncipher	$out4,$out4,v25
-	vncipher	$out5,$out5,v25
-
-	and		r0,r0,$len
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vncipher	$out0,$out0,v26
-	vncipher	$out1,$out1,v26
-	 vand		$tmp,$tmp,$eighty7
-	vncipher	$out2,$out2,v26
-	vncipher	$out3,$out3,v26
-	 vxor		$tweak,$tweak,$tmp
-	vncipher	$out4,$out4,v26
-	vncipher	$out5,$out5,v26
-
-	add		$inp,$inp,r0		# $inp is adjusted in such
-						# way that at exit from the
-						# loop inX-in5 are loaded
-						# with last "words"
-	 vxor		$in2,$twk2,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk2,$tweak,$rndkey0
-	 vaddubm	$tweak,$tweak,$tweak
-	vncipher	$out0,$out0,v27
-	vncipher	$out1,$out1,v27
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vncipher	$out2,$out2,v27
-	vncipher	$out3,$out3,v27
-	 vand		$tmp,$tmp,$eighty7
-	vncipher	$out4,$out4,v27
-	vncipher	$out5,$out5,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	 vxor		$tweak,$tweak,$tmp
-	vncipher	$out0,$out0,v28
-	vncipher	$out1,$out1,v28
-	 vxor		$in3,$twk3,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk3,$tweak,$rndkey0
-	vncipher	$out2,$out2,v28
-	vncipher	$out3,$out3,v28
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vncipher	$out4,$out4,v28
-	vncipher	$out5,$out5,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-	 vand		$tmp,$tmp,$eighty7
-
-	vncipher	$out0,$out0,v29
-	vncipher	$out1,$out1,v29
-	 vxor		$tweak,$tweak,$tmp
-	vncipher	$out2,$out2,v29
-	vncipher	$out3,$out3,v29
-	 vxor		$in4,$twk4,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk4,$tweak,$rndkey0
-	vncipher	$out4,$out4,v29
-	vncipher	$out5,$out5,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-
-	vncipher	$out0,$out0,v30
-	vncipher	$out1,$out1,v30
-	 vand		$tmp,$tmp,$eighty7
-	vncipher	$out2,$out2,v30
-	vncipher	$out3,$out3,v30
-	 vxor		$tweak,$tweak,$tmp
-	vncipher	$out4,$out4,v30
-	vncipher	$out5,$out5,v30
-	 vxor		$in5,$twk5,v31
-	 vsrab		$tmp,$tweak,$seven	# next tweak value
-	 vxor		$twk5,$tweak,$rndkey0
-
-	vncipherlast	$out0,$out0,$in0
-	 lvx_u		$in0,$x00,$inp		# load next input block
-	 vaddubm	$tweak,$tweak,$tweak
-	 vsldoi		$tmp,$tmp,$tmp,15
-	vncipherlast	$out1,$out1,$in1
-	 lvx_u		$in1,$x10,$inp
-	vncipherlast	$out2,$out2,$in2
-	 le?vperm	$in0,$in0,$in0,$leperm
-	 lvx_u		$in2,$x20,$inp
-	 vand		$tmp,$tmp,$eighty7
-	vncipherlast	$out3,$out3,$in3
-	 le?vperm	$in1,$in1,$in1,$leperm
-	 lvx_u		$in3,$x30,$inp
-	vncipherlast	$out4,$out4,$in4
-	 le?vperm	$in2,$in2,$in2,$leperm
-	 lvx_u		$in4,$x40,$inp
-	 vxor		$tweak,$tweak,$tmp
-	vncipherlast	$out5,$out5,$in5
-	 le?vperm	$in3,$in3,$in3,$leperm
-	 lvx_u		$in5,$x50,$inp
-	 addi		$inp,$inp,0x60
-	 le?vperm	$in4,$in4,$in4,$leperm
-	 le?vperm	$in5,$in5,$in5,$leperm
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	 vxor		$out0,$in0,$twk0
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	 vxor		$out1,$in1,$twk1
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	 vxor		$out2,$in2,$twk2
-	le?vperm	$out4,$out4,$out4,$leperm
-	stvx_u		$out3,$x30,$out
-	 vxor		$out3,$in3,$twk3
-	le?vperm	$out5,$out5,$out5,$leperm
-	stvx_u		$out4,$x40,$out
-	 vxor		$out4,$in4,$twk4
-	stvx_u		$out5,$x50,$out
-	 vxor		$out5,$in5,$twk5
-	addi		$out,$out,0x60
-
-	mtctr		$rounds
-	beq		Loop_xts_dec6x		# did $len-=96 borrow?
-
-	addic.		$len,$len,0x60
-	beq		Lxts_dec6x_zero
-	cmpwi		$len,0x20
-	blt		Lxts_dec6x_one
-	nop
-	beq		Lxts_dec6x_two
-	cmpwi		$len,0x40
-	blt		Lxts_dec6x_three
-	nop
-	beq		Lxts_dec6x_four
-
-Lxts_dec6x_five:
-	vxor		$out0,$in1,$twk0
-	vxor		$out1,$in2,$twk1
-	vxor		$out2,$in3,$twk2
-	vxor		$out3,$in4,$twk3
-	vxor		$out4,$in5,$twk4
-
-	bl		_aesp8_xts_dec5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk5		# unused tweak
-	vxor		$twk1,$tweak,$rndkey0
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	vxor		$out0,$in0,$twk1
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	le?vperm	$out4,$out4,$out4,$leperm
-	stvx_u		$out3,$x30,$out
-	stvx_u		$out4,$x40,$out
-	addi		$out,$out,0x50
-	bne		Lxts_dec6x_steal
-	b		Lxts_dec6x_done
-
-.align	4
-Lxts_dec6x_four:
-	vxor		$out0,$in2,$twk0
-	vxor		$out1,$in3,$twk1
-	vxor		$out2,$in4,$twk2
-	vxor		$out3,$in5,$twk3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_dec5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk4		# unused tweak
-	vmr		$twk1,$twk5
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	vxor		$out0,$in0,$twk5
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	le?vperm	$out3,$out3,$out3,$leperm
-	stvx_u		$out2,$x20,$out
-	stvx_u		$out3,$x30,$out
-	addi		$out,$out,0x40
-	bne		Lxts_dec6x_steal
-	b		Lxts_dec6x_done
-
-.align	4
-Lxts_dec6x_three:
-	vxor		$out0,$in3,$twk0
-	vxor		$out1,$in4,$twk1
-	vxor		$out2,$in5,$twk2
-	vxor		$out3,$out3,$out3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_dec5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk3		# unused tweak
-	vmr		$twk1,$twk4
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	vxor		$out0,$in0,$twk4
-	le?vperm	$out2,$out2,$out2,$leperm
-	stvx_u		$out1,$x10,$out
-	stvx_u		$out2,$x20,$out
-	addi		$out,$out,0x30
-	bne		Lxts_dec6x_steal
-	b		Lxts_dec6x_done
-
-.align	4
-Lxts_dec6x_two:
-	vxor		$out0,$in4,$twk0
-	vxor		$out1,$in5,$twk1
-	vxor		$out2,$out2,$out2
-	vxor		$out3,$out3,$out3
-	vxor		$out4,$out4,$out4
-
-	bl		_aesp8_xts_dec5x
-
-	le?vperm	$out0,$out0,$out0,$leperm
-	vmr		$twk0,$twk2		# unused tweak
-	vmr		$twk1,$twk3
-	le?vperm	$out1,$out1,$out1,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	vxor		$out0,$in0,$twk3
-	stvx_u		$out1,$x10,$out
-	addi		$out,$out,0x20
-	bne		Lxts_dec6x_steal
-	b		Lxts_dec6x_done
-
-.align	4
-Lxts_dec6x_one:
-	vxor		$out0,$in5,$twk0
-	nop
-Loop_xts_dec1x:
-	vncipher	$out0,$out0,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out0,$out0,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Loop_xts_dec1x
-
-	subi		r0,$taillen,1
-	vncipher	$out0,$out0,v24
-
-	andi.		r0,r0,16
-	cmpwi		$taillen,0
-	vncipher	$out0,$out0,v25
-
-	sub		$inp,$inp,r0
-	vncipher	$out0,$out0,v26
-
-	lvx_u		$in0,0,$inp
-	vncipher	$out0,$out0,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vncipher	$out0,$out0,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-
-	vncipher	$out0,$out0,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vxor		$twk0,$twk0,v31
-
-	le?vperm	$in0,$in0,$in0,$leperm
-	vncipher	$out0,$out0,v30
-
-	mtctr		$rounds
-	vncipherlast	$out0,$out0,$twk0
-
-	vmr		$twk0,$twk1		# unused tweak
-	vmr		$twk1,$twk2
-	le?vperm	$out0,$out0,$out0,$leperm
-	stvx_u		$out0,$x00,$out		# store output
-	addi		$out,$out,0x10
-	vxor		$out0,$in0,$twk2
-	bne		Lxts_dec6x_steal
-	b		Lxts_dec6x_done
-
-.align	4
-Lxts_dec6x_zero:
-	cmpwi		$taillen,0
-	beq		Lxts_dec6x_done
-
-	lvx_u		$in0,0,$inp
-	le?vperm	$in0,$in0,$in0,$leperm
-	vxor		$out0,$in0,$twk1
-Lxts_dec6x_steal:
-	vncipher	$out0,$out0,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out0,$out0,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		Lxts_dec6x_steal
-
-	add		$inp,$inp,$taillen
-	vncipher	$out0,$out0,v24
-
-	cmpwi		$taillen,0
-	vncipher	$out0,$out0,v25
-
-	lvx_u		$in0,0,$inp
-	vncipher	$out0,$out0,v26
-
-	lvsr		$inpperm,0,$taillen	# $in5 is no more
-	vncipher	$out0,$out0,v27
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vncipher	$out0,$out0,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-
-	vncipher	$out0,$out0,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vxor		$twk1,$twk1,v31
-
-	le?vperm	$in0,$in0,$in0,$leperm
-	vncipher	$out0,$out0,v30
-
-	vperm		$in0,$in0,$in0,$inpperm
-	vncipherlast	$tmp,$out0,$twk1
-
-	le?vperm	$out0,$tmp,$tmp,$leperm
-	le?stvx_u	$out0,0,$out
-	be?stvx_u	$tmp,0,$out
-
-	vxor		$out0,$out0,$out0
-	vspltisb	$out1,-1
-	vperm		$out0,$out0,$out1,$inpperm
-	vsel		$out0,$in0,$tmp,$out0
-	vxor		$out0,$out0,$twk0
-
-	subi		r30,$out,1
-	mtctr		$taillen
-Loop_xts_dec6x_steal:
-	lbzu		r0,1(r30)
-	stb		r0,16(r30)
-	bdnz		Loop_xts_dec6x_steal
-
-	li		$taillen,0
-	mtctr		$rounds
-	b		Loop_xts_dec1x		# one more time...
-
-.align	4
-Lxts_dec6x_done:
-	${UCMP}i	$ivp,0
-	beq		Lxts_dec6x_ret
-
-	vxor		$tweak,$twk0,$rndkey0
-	le?vperm	$tweak,$tweak,$tweak,$leperm
-	stvx_u		$tweak,0,$ivp
-
-Lxts_dec6x_ret:
-	mtlr		r11
-	li		r10,`$FRAME+15`
-	li		r11,`$FRAME+31`
-	stvx		$seven,r10,$sp		# wipe copies of round keys
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-	stvx		$seven,r10,$sp
-	addi		r10,r10,32
-	stvx		$seven,r11,$sp
-	addi		r11,r11,32
-
-	mtspr		256,$vrsave
-	lvx		v20,r10,$sp		# ABI says so
-	addi		r10,r10,32
-	lvx		v21,r11,$sp
-	addi		r11,r11,32
-	lvx		v22,r10,$sp
-	addi		r10,r10,32
-	lvx		v23,r11,$sp
-	addi		r11,r11,32
-	lvx		v24,r10,$sp
-	addi		r10,r10,32
-	lvx		v25,r11,$sp
-	addi		r11,r11,32
-	lvx		v26,r10,$sp
-	addi		r10,r10,32
-	lvx		v27,r11,$sp
-	addi		r11,r11,32
-	lvx		v28,r10,$sp
-	addi		r10,r10,32
-	lvx		v29,r11,$sp
-	addi		r11,r11,32
-	lvx		v30,r10,$sp
-	lvx		v31,r11,$sp
-	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
-	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
-	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
-	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
-	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
-	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
-	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
-	blr
-	.long		0
-	.byte		0,12,0x04,1,0x80,6,6,0
-	.long		0
-
-.align	5
-_aesp8_xts_dec5x:
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-	lvx		v24,$x20,$key_		# round[3]
-	addi		$key_,$key_,0x20
-
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	lvx		v25,$x10,$key_		# round[4]
-	bdnz		_aesp8_xts_dec5x
-
-	subi		r0,$taillen,1
-	vncipher	$out0,$out0,v24
-	vncipher	$out1,$out1,v24
-	vncipher	$out2,$out2,v24
-	vncipher	$out3,$out3,v24
-	vncipher	$out4,$out4,v24
-
-	andi.		r0,r0,16
-	cmpwi		$taillen,0
-	vncipher	$out0,$out0,v25
-	vncipher	$out1,$out1,v25
-	vncipher	$out2,$out2,v25
-	vncipher	$out3,$out3,v25
-	vncipher	$out4,$out4,v25
-	 vxor		$twk0,$twk0,v31
-
-	sub		$inp,$inp,r0
-	vncipher	$out0,$out0,v26
-	vncipher	$out1,$out1,v26
-	vncipher	$out2,$out2,v26
-	vncipher	$out3,$out3,v26
-	vncipher	$out4,$out4,v26
-	 vxor		$in1,$twk1,v31
-
-	vncipher	$out0,$out0,v27
-	lvx_u		$in0,0,$inp
-	vncipher	$out1,$out1,v27
-	vncipher	$out2,$out2,v27
-	vncipher	$out3,$out3,v27
-	vncipher	$out4,$out4,v27
-	 vxor		$in2,$twk2,v31
-
-	addi		$key_,$sp,$FRAME+15	# rewind $key_
-	vncipher	$out0,$out0,v28
-	vncipher	$out1,$out1,v28
-	vncipher	$out2,$out2,v28
-	vncipher	$out3,$out3,v28
-	vncipher	$out4,$out4,v28
-	lvx		v24,$x00,$key_		# re-pre-load round[1]
-	 vxor		$in3,$twk3,v31
-
-	vncipher	$out0,$out0,v29
-	le?vperm	$in0,$in0,$in0,$leperm
-	vncipher	$out1,$out1,v29
-	vncipher	$out2,$out2,v29
-	vncipher	$out3,$out3,v29
-	vncipher	$out4,$out4,v29
-	lvx		v25,$x10,$key_		# re-pre-load round[2]
-	 vxor		$in4,$twk4,v31
-
-	vncipher	$out0,$out0,v30
-	vncipher	$out1,$out1,v30
-	vncipher	$out2,$out2,v30
-	vncipher	$out3,$out3,v30
-	vncipher	$out4,$out4,v30
-
-	vncipherlast	$out0,$out0,$twk0
-	vncipherlast	$out1,$out1,$in1
-	vncipherlast	$out2,$out2,$in2
-	vncipherlast	$out3,$out3,$in3
-	vncipherlast	$out4,$out4,$in4
-	mtctr		$rounds
-	blr
-        .long   	0
-        .byte   	0,12,0x14,0,0,0,0,0
-___
-}}	}}}
-
-my $consts=1;
-foreach(split("\n",$code)) {
-        s/\`([^\`]*)\`/eval($1)/geo;
-
-	# constants table endian-specific conversion
-	if ($consts && m/\.(long|byte)\s+(.+)\s+(\?[a-z]*)$/o) {
-	    my $conv=$3;
-	    my @bytes=();
-
-	    # convert to endian-agnostic format
-	    if ($1 eq "long") {
-	      foreach (split(/,\s*/,$2)) {
-		my $l = /^0/?oct:int;
-		push @bytes,($l>>24)&0xff,($l>>16)&0xff,($l>>8)&0xff,$l&0xff;
-	      }
-	    } else {
-		@bytes = map(/^0/?oct:int,split(/,\s*/,$2));
-	    }
-
-	    # little-endian conversion
-	    if ($flavour =~ /le$/o) {
-		SWITCH: for($conv)  {
-		    /\?inv/ && do   { @bytes=map($_^0xf,@bytes); last; };
-		    /\?rev/ && do   { @bytes=reverse(@bytes);    last; };
-		}
-	    }
-
-	    #emit
-	    print ".byte\t",join(',',map (sprintf("0x%02x",$_),@bytes)),"\n";
-	    next;
-	}
-	$consts=0 if (m/Lconsts:/o);	# end of table
-
-	# instructions prefixed with '?' are endian-specific and need
-	# to be adjusted accordingly...
-	if ($flavour =~ /le$/o) {	# little-endian
-	    s/le\?//o		or
-	    s/be\?/#be#/o	or
-	    s/\?lvsr/lvsl/o	or
-	    s/\?lvsl/lvsr/o	or
-	    s/\?(vperm\s+v[0-9]+,\s*)(v[0-9]+,\s*)(v[0-9]+,\s*)(v[0-9]+)/$1$3$2$4/o or
-	    s/\?(vsldoi\s+v[0-9]+,\s*)(v[0-9]+,)\s*(v[0-9]+,\s*)([0-9]+)/$1$3$2 16-$4/o or
-	    s/\?(vspltw\s+v[0-9]+,\s*)(v[0-9]+,)\s*([0-9])/$1$2 3-$3/o;
-	} else {			# big-endian
-	    s/le\?/#le#/o	or
-	    s/be\?//o		or
-	    s/\?([a-z]+)/$1/o;
-	}
-
-        print $_,"\n";
-}
-
-close STDOUT;
diff --git a/drivers/crypto/vmx/ghash.c b/drivers/crypto/vmx/ghash.c
deleted file mode 100644
index 27a94a119009..000000000000
--- a/drivers/crypto/vmx/ghash.c
+++ /dev/null
@@ -1,227 +0,0 @@
-/**
- * GHASH routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
-#include <linux/hardirq.h>
-#include <asm/switch_to.h>
-#include <crypto/aes.h>
-#include <crypto/ghash.h>
-#include <crypto/scatterwalk.h>
-#include <crypto/internal/hash.h>
-#include <crypto/b128ops.h>
-
-#define IN_INTERRUPT in_interrupt()
-
-void gcm_init_p8(u128 htable[16], const u64 Xi[2]);
-void gcm_gmult_p8(u64 Xi[2], const u128 htable[16]);
-void gcm_ghash_p8(u64 Xi[2], const u128 htable[16],
-		  const u8 *in, size_t len);
-
-struct p8_ghash_ctx {
-	u128 htable[16];
-	struct crypto_shash *fallback;
-};
-
-struct p8_ghash_desc_ctx {
-	u64 shash[2];
-	u8 buffer[GHASH_DIGEST_SIZE];
-	int bytes;
-	struct shash_desc fallback_desc;
-};
-
-static int p8_ghash_init_tfm(struct crypto_tfm *tfm)
-{
-	const char *alg = "ghash-generic";
-	struct crypto_shash *fallback;
-	struct crypto_shash *shash_tfm = __crypto_shash_cast(tfm);
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	fallback = crypto_alloc_shash(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
-	if (IS_ERR(fallback)) {
-		printk(KERN_ERR
-		       "Failed to allocate transformation for '%s': %ld\n",
-		       alg, PTR_ERR(fallback));
-		return PTR_ERR(fallback);
-	}
-	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-	       crypto_tfm_alg_driver_name(crypto_shash_tfm(fallback)));
-
-	crypto_shash_set_flags(fallback,
-			       crypto_shash_get_flags((struct crypto_shash
-						       *) tfm));
-
-	/* Check if the descsize defined in the algorithm is still enough. */
-	if (shash_tfm->descsize < sizeof(struct p8_ghash_desc_ctx)
-	    + crypto_shash_descsize(fallback)) {
-		printk(KERN_ERR
-		       "Desc size of the fallback implementation (%s) does not match the expected value: %lu vs %u\n",
-		       alg,
-		       shash_tfm->descsize - sizeof(struct p8_ghash_desc_ctx),
-		       crypto_shash_descsize(fallback));
-		return -EINVAL;
-	}
-	ctx->fallback = fallback;
-
-	return 0;
-}
-
-static void p8_ghash_exit_tfm(struct crypto_tfm *tfm)
-{
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm);
-
-	if (ctx->fallback) {
-		crypto_free_shash(ctx->fallback);
-		ctx->fallback = NULL;
-	}
-}
-
-static int p8_ghash_init(struct shash_desc *desc)
-{
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
-	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
-
-	dctx->bytes = 0;
-	memset(dctx->shash, 0, GHASH_DIGEST_SIZE);
-	dctx->fallback_desc.tfm = ctx->fallback;
-	dctx->fallback_desc.flags = desc->flags;
-	return crypto_shash_init(&dctx->fallback_desc);
-}
-
-static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key,
-			   unsigned int keylen)
-{
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(tfm));
-
-	if (keylen != GHASH_BLOCK_SIZE)
-		return -EINVAL;
-
-	preempt_disable();
-	pagefault_disable();
-	enable_kernel_vsx();
-	gcm_init_p8(ctx->htable, (const u64 *) key);
-	disable_kernel_vsx();
-	pagefault_enable();
-	preempt_enable();
-	return crypto_shash_setkey(ctx->fallback, key, keylen);
-}
-
-static int p8_ghash_update(struct shash_desc *desc,
-			   const u8 *src, unsigned int srclen)
-{
-	unsigned int len;
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
-	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
-
-	if (IN_INTERRUPT) {
-		return crypto_shash_update(&dctx->fallback_desc, src,
-					   srclen);
-	} else {
-		if (dctx->bytes) {
-			if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) {
-				memcpy(dctx->buffer + dctx->bytes, src,
-				       srclen);
-				dctx->bytes += srclen;
-				return 0;
-			}
-			memcpy(dctx->buffer + dctx->bytes, src,
-			       GHASH_DIGEST_SIZE - dctx->bytes);
-			preempt_disable();
-			pagefault_disable();
-			enable_kernel_vsx();
-			gcm_ghash_p8(dctx->shash, ctx->htable,
-				     dctx->buffer, GHASH_DIGEST_SIZE);
-			disable_kernel_vsx();
-			pagefault_enable();
-			preempt_enable();
-			src += GHASH_DIGEST_SIZE - dctx->bytes;
-			srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
-			dctx->bytes = 0;
-		}
-		len = srclen & ~(GHASH_DIGEST_SIZE - 1);
-		if (len) {
-			preempt_disable();
-			pagefault_disable();
-			enable_kernel_vsx();
-			gcm_ghash_p8(dctx->shash, ctx->htable, src, len);
-			disable_kernel_vsx();
-			pagefault_enable();
-			preempt_enable();
-			src += len;
-			srclen -= len;
-		}
-		if (srclen) {
-			memcpy(dctx->buffer, src, srclen);
-			dctx->bytes = srclen;
-		}
-		return 0;
-	}
-}
-
-static int p8_ghash_final(struct shash_desc *desc, u8 *out)
-{
-	int i;
-	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
-	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
-
-	if (IN_INTERRUPT) {
-		return crypto_shash_final(&dctx->fallback_desc, out);
-	} else {
-		if (dctx->bytes) {
-			for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
-				dctx->buffer[i] = 0;
-			preempt_disable();
-			pagefault_disable();
-			enable_kernel_vsx();
-			gcm_ghash_p8(dctx->shash, ctx->htable,
-				     dctx->buffer, GHASH_DIGEST_SIZE);
-			disable_kernel_vsx();
-			pagefault_enable();
-			preempt_enable();
-			dctx->bytes = 0;
-		}
-		memcpy(out, dctx->shash, GHASH_DIGEST_SIZE);
-		return 0;
-	}
-}
-
-struct shash_alg p8_ghash_alg = {
-	.digestsize = GHASH_DIGEST_SIZE,
-	.init = p8_ghash_init,
-	.update = p8_ghash_update,
-	.final = p8_ghash_final,
-	.setkey = p8_ghash_setkey,
-	.descsize = sizeof(struct p8_ghash_desc_ctx)
-		+ sizeof(struct ghash_desc_ctx),
-	.base = {
-		 .cra_name = "ghash",
-		 .cra_driver_name = "p8_ghash",
-		 .cra_priority = 1000,
-		 .cra_flags = CRYPTO_ALG_TYPE_SHASH | CRYPTO_ALG_NEED_FALLBACK,
-		 .cra_blocksize = GHASH_BLOCK_SIZE,
-		 .cra_ctxsize = sizeof(struct p8_ghash_ctx),
-		 .cra_module = THIS_MODULE,
-		 .cra_init = p8_ghash_init_tfm,
-		 .cra_exit = p8_ghash_exit_tfm,
-	},
-};
diff --git a/drivers/crypto/vmx/ghashp8-ppc.pl b/drivers/crypto/vmx/ghashp8-ppc.pl
deleted file mode 100644
index d8429cb71f02..000000000000
--- a/drivers/crypto/vmx/ghashp8-ppc.pl
+++ /dev/null
@@ -1,234 +0,0 @@
-#!/usr/bin/env perl
-#
-# ====================================================================
-# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
-# project. The module is, however, dual licensed under OpenSSL and
-# CRYPTOGAMS licenses depending on where you obtain it. For further
-# details see http://www.openssl.org/~appro/cryptogams/.
-# ====================================================================
-#
-# GHASH for for PowerISA v2.07.
-#
-# July 2014
-#
-# Accurate performance measurements are problematic, because it's
-# always virtualized setup with possibly throttled processor.
-# Relative comparison is therefore more informative. This initial
-# version is ~2.1x slower than hardware-assisted AES-128-CTR, ~12x
-# faster than "4-bit" integer-only compiler-generated 64-bit code.
-# "Initial version" means that there is room for futher improvement.
-
-$flavour=shift;
-$output =shift;
-
-if ($flavour =~ /64/) {
-	$SIZE_T=8;
-	$LRSAVE=2*$SIZE_T;
-	$STU="stdu";
-	$POP="ld";
-	$PUSH="std";
-} elsif ($flavour =~ /32/) {
-	$SIZE_T=4;
-	$LRSAVE=$SIZE_T;
-	$STU="stwu";
-	$POP="lwz";
-	$PUSH="stw";
-} else { die "nonsense $flavour"; }
-
-$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
-( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
-( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
-die "can't locate ppc-xlate.pl";
-
-open STDOUT,"| $^X $xlate $flavour $output" || die "can't call $xlate: $!";
-
-my ($Xip,$Htbl,$inp,$len)=map("r$_",(3..6));	# argument block
-
-my ($Xl,$Xm,$Xh,$IN)=map("v$_",(0..3));
-my ($zero,$t0,$t1,$t2,$xC2,$H,$Hh,$Hl,$lemask)=map("v$_",(4..12));
-my $vrsave="r12";
-
-$code=<<___;
-.machine	"any"
-
-.text
-
-.globl	.gcm_init_p8
-	lis		r0,0xfff0
-	li		r8,0x10
-	mfspr		$vrsave,256
-	li		r9,0x20
-	mtspr		256,r0
-	li		r10,0x30
-	lvx_u		$H,0,r4			# load H
-	le?xor		r7,r7,r7
-	le?addi		r7,r7,0x8		# need a vperm start with 08
-	le?lvsr		5,0,r7
-	le?vspltisb	6,0x0f
-	le?vxor		5,5,6			# set a b-endian mask
-	le?vperm	$H,$H,$H,5
-
-	vspltisb	$xC2,-16		# 0xf0
-	vspltisb	$t0,1			# one
-	vaddubm		$xC2,$xC2,$xC2		# 0xe0
-	vxor		$zero,$zero,$zero
-	vor		$xC2,$xC2,$t0		# 0xe1
-	vsldoi		$xC2,$xC2,$zero,15	# 0xe1...
-	vsldoi		$t1,$zero,$t0,1		# ...1
-	vaddubm		$xC2,$xC2,$xC2		# 0xc2...
-	vspltisb	$t2,7
-	vor		$xC2,$xC2,$t1		# 0xc2....01
-	vspltb		$t1,$H,0		# most significant byte
-	vsl		$H,$H,$t0		# H<<=1
-	vsrab		$t1,$t1,$t2		# broadcast carry bit
-	vand		$t1,$t1,$xC2
-	vxor		$H,$H,$t1		# twisted H
-
-	vsldoi		$H,$H,$H,8		# twist even more ...
-	vsldoi		$xC2,$zero,$xC2,8	# 0xc2.0
-	vsldoi		$Hl,$zero,$H,8		# ... and split
-	vsldoi		$Hh,$H,$zero,8
-
-	stvx_u		$xC2,0,r3		# save pre-computed table
-	stvx_u		$Hl,r8,r3
-	stvx_u		$H, r9,r3
-	stvx_u		$Hh,r10,r3
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,2,0
-	.long		0
-.size	.gcm_init_p8,.-.gcm_init_p8
-
-.globl	.gcm_gmult_p8
-	lis		r0,0xfff8
-	li		r8,0x10
-	mfspr		$vrsave,256
-	li		r9,0x20
-	mtspr		256,r0
-	li		r10,0x30
-	lvx_u		$IN,0,$Xip		# load Xi
-
-	lvx_u		$Hl,r8,$Htbl		# load pre-computed table
-	 le?lvsl	$lemask,r0,r0
-	lvx_u		$H, r9,$Htbl
-	 le?vspltisb	$t0,0x07
-	lvx_u		$Hh,r10,$Htbl
-	 le?vxor	$lemask,$lemask,$t0
-	lvx_u		$xC2,0,$Htbl
-	 le?vperm	$IN,$IN,$IN,$lemask
-	vxor		$zero,$zero,$zero
-
-	vpmsumd		$Xl,$IN,$Hl		# H.lo·Xi.lo
-	vpmsumd		$Xm,$IN,$H		# H.hi·Xi.lo+H.lo·Xi.hi
-	vpmsumd		$Xh,$IN,$Hh		# H.hi·Xi.hi
-
-	vpmsumd		$t2,$Xl,$xC2		# 1st phase
-
-	vsldoi		$t0,$Xm,$zero,8
-	vsldoi		$t1,$zero,$Xm,8
-	vxor		$Xl,$Xl,$t0
-	vxor		$Xh,$Xh,$t1
-
-	vsldoi		$Xl,$Xl,$Xl,8
-	vxor		$Xl,$Xl,$t2
-
-	vsldoi		$t1,$Xl,$Xl,8		# 2nd phase
-	vpmsumd		$Xl,$Xl,$xC2
-	vxor		$t1,$t1,$Xh
-	vxor		$Xl,$Xl,$t1
-
-	le?vperm	$Xl,$Xl,$Xl,$lemask
-	stvx_u		$Xl,0,$Xip		# write out Xi
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,2,0
-	.long		0
-.size	.gcm_gmult_p8,.-.gcm_gmult_p8
-
-.globl	.gcm_ghash_p8
-	lis		r0,0xfff8
-	li		r8,0x10
-	mfspr		$vrsave,256
-	li		r9,0x20
-	mtspr		256,r0
-	li		r10,0x30
-	lvx_u		$Xl,0,$Xip		# load Xi
-
-	lvx_u		$Hl,r8,$Htbl		# load pre-computed table
-	 le?lvsl	$lemask,r0,r0
-	lvx_u		$H, r9,$Htbl
-	 le?vspltisb	$t0,0x07
-	lvx_u		$Hh,r10,$Htbl
-	 le?vxor	$lemask,$lemask,$t0
-	lvx_u		$xC2,0,$Htbl
-	 le?vperm	$Xl,$Xl,$Xl,$lemask
-	vxor		$zero,$zero,$zero
-
-	lvx_u		$IN,0,$inp
-	addi		$inp,$inp,16
-	subi		$len,$len,16
-	 le?vperm	$IN,$IN,$IN,$lemask
-	vxor		$IN,$IN,$Xl
-	b		Loop
-
-.align	5
-Loop:
-	 subic		$len,$len,16
-	vpmsumd		$Xl,$IN,$Hl		# H.lo·Xi.lo
-	 subfe.		r0,r0,r0		# borrow?-1:0
-	vpmsumd		$Xm,$IN,$H		# H.hi·Xi.lo+H.lo·Xi.hi
-	 and		r0,r0,$len
-	vpmsumd		$Xh,$IN,$Hh		# H.hi·Xi.hi
-	 add		$inp,$inp,r0
-
-	vpmsumd		$t2,$Xl,$xC2		# 1st phase
-
-	vsldoi		$t0,$Xm,$zero,8
-	vsldoi		$t1,$zero,$Xm,8
-	vxor		$Xl,$Xl,$t0
-	vxor		$Xh,$Xh,$t1
-
-	vsldoi		$Xl,$Xl,$Xl,8
-	vxor		$Xl,$Xl,$t2
-	 lvx_u		$IN,0,$inp
-	 addi		$inp,$inp,16
-
-	vsldoi		$t1,$Xl,$Xl,8		# 2nd phase
-	vpmsumd		$Xl,$Xl,$xC2
-	 le?vperm	$IN,$IN,$IN,$lemask
-	vxor		$t1,$t1,$Xh
-	vxor		$IN,$IN,$t1
-	vxor		$IN,$IN,$Xl
-	beq		Loop			# did $len-=16 borrow?
-
-	vxor		$Xl,$Xl,$t1
-	le?vperm	$Xl,$Xl,$Xl,$lemask
-	stvx_u		$Xl,0,$Xip		# write out Xi
-
-	mtspr		256,$vrsave
-	blr
-	.long		0
-	.byte		0,12,0x14,0,0,0,4,0
-	.long		0
-.size	.gcm_ghash_p8,.-.gcm_ghash_p8
-
-.asciz  "GHASH for PowerISA 2.07, CRYPTOGAMS by <appro\@openssl.org>"
-.align  2
-___
-
-foreach (split("\n",$code)) {
-	if ($flavour =~ /le$/o) {	# little-endian
-	    s/le\?//o		or
-	    s/be\?/#be#/o;
-	} else {
-	    s/le\?/#le#/o	or
-	    s/be\?//o;
-	}
-	print $_,"\n";
-}
-
-close STDOUT; # enforce flush
diff --git a/drivers/crypto/vmx/ppc-xlate.pl b/drivers/crypto/vmx/ppc-xlate.pl
deleted file mode 100644
index b18e67d0e065..000000000000
--- a/drivers/crypto/vmx/ppc-xlate.pl
+++ /dev/null
@@ -1,228 +0,0 @@
-#!/usr/bin/env perl
-
-# PowerPC assembler distiller by <appro>.
-
-my $flavour = shift;
-my $output = shift;
-open STDOUT,">$output" || die "can't open $output: $!";
-
-my %GLOBALS;
-my $dotinlocallabels=($flavour=~/linux/)?1:0;
-
-################################################################
-# directives which need special treatment on different platforms
-################################################################
-my $globl = sub {
-    my $junk = shift;
-    my $name = shift;
-    my $global = \$GLOBALS{$name};
-    my $ret;
-
-    $name =~ s|^[\.\_]||;
- 
-    SWITCH: for ($flavour) {
-	/aix/		&& do { $name = ".$name";
-				last;
-			      };
-	/osx/		&& do { $name = "_$name";
-				last;
-			      };
-	/linux/
-			&& do {	$ret = "_GLOBAL($name)";
-				last;
-			      };
-    }
-
-    $ret = ".globl	$name\nalign 5\n$name:" if (!$ret);
-    $$global = $name;
-    $ret;
-};
-my $text = sub {
-    my $ret = ($flavour =~ /aix/) ? ".csect\t.text[PR],7" : ".text";
-    $ret = ".abiversion	2\n".$ret	if ($flavour =~ /linux.*64le/);
-    $ret;
-};
-my $machine = sub {
-    my $junk = shift;
-    my $arch = shift;
-    if ($flavour =~ /osx/)
-    {	$arch =~ s/\"//g;
-	$arch = ($flavour=~/64/) ? "ppc970-64" : "ppc970" if ($arch eq "any");
-    }
-    ".machine	$arch";
-};
-my $size = sub {
-    if ($flavour =~ /linux/)
-    {	shift;
-	my $name = shift; $name =~ s|^[\.\_]||;
-	my $ret  = ".size	$name,.-".($flavour=~/64$/?".":"").$name;
-	$ret .= "\n.size	.$name,.-.$name" if ($flavour=~/64$/);
-	$ret;
-    }
-    else
-    {	"";	}
-};
-my $asciz = sub {
-    shift;
-    my $line = join(",",@_);
-    if ($line =~ /^"(.*)"$/)
-    {	".byte	" . join(",",unpack("C*",$1),0) . "\n.align	2";	}
-    else
-    {	"";	}
-};
-my $quad = sub {
-    shift;
-    my @ret;
-    my ($hi,$lo);
-    for (@_) {
-	if (/^0x([0-9a-f]*?)([0-9a-f]{1,8})$/io)
-	{  $hi=$1?"0x$1":"0"; $lo="0x$2";  }
-	elsif (/^([0-9]+)$/o)
-	{  $hi=$1>>32; $lo=$1&0xffffffff;  } # error-prone with 32-bit perl
-	else
-	{  $hi=undef; $lo=$_; }
-
-	if (defined($hi))
-	{  push(@ret,$flavour=~/le$/o?".long\t$lo,$hi":".long\t$hi,$lo");  }
-	else
-	{  push(@ret,".quad	$lo");  }
-    }
-    join("\n",@ret);
-};
-
-################################################################
-# simplified mnemonics not handled by at least one assembler
-################################################################
-my $cmplw = sub {
-    my $f = shift;
-    my $cr = 0; $cr = shift if ($#_>1);
-    # Some out-of-date 32-bit GNU assembler just can't handle cmplw...
-    ($flavour =~ /linux.*32/) ?
-	"	.long	".sprintf "0x%x",31<<26|$cr<<23|$_[0]<<16|$_[1]<<11|64 :
-	"	cmplw	".join(',',$cr,@_);
-};
-my $bdnz = sub {
-    my $f = shift;
-    my $bo = $f=~/[\+\-]/ ? 16+9 : 16;	# optional "to be taken" hint
-    "	bc	$bo,0,".shift;
-} if ($flavour!~/linux/);
-my $bltlr = sub {
-    my $f = shift;
-    my $bo = $f=~/\-/ ? 12+2 : 12;	# optional "not to be taken" hint
-    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
-	"	.long	".sprintf "0x%x",19<<26|$bo<<21|16<<1 :
-	"	bclr	$bo,0";
-};
-my $bnelr = sub {
-    my $f = shift;
-    my $bo = $f=~/\-/ ? 4+2 : 4;	# optional "not to be taken" hint
-    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
-	"	.long	".sprintf "0x%x",19<<26|$bo<<21|2<<16|16<<1 :
-	"	bclr	$bo,2";
-};
-my $beqlr = sub {
-    my $f = shift;
-    my $bo = $f=~/-/ ? 12+2 : 12;	# optional "not to be taken" hint
-    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
-	"	.long	".sprintf "0x%X",19<<26|$bo<<21|2<<16|16<<1 :
-	"	bclr	$bo,2";
-};
-# GNU assembler can't handle extrdi rA,rS,16,48, or when sum of last two
-# arguments is 64, with "operand out of range" error.
-my $extrdi = sub {
-    my ($f,$ra,$rs,$n,$b) = @_;
-    $b = ($b+$n)&63; $n = 64-$n;
-    "	rldicl	$ra,$rs,$b,$n";
-};
-my $vmr = sub {
-    my ($f,$vx,$vy) = @_;
-    "	vor	$vx,$vy,$vy";
-};
-
-# Some ABIs specify vrsave, special-purpose register #256, as reserved
-# for system use.
-my $no_vrsave = ($flavour =~ /linux-ppc64le/);
-my $mtspr = sub {
-    my ($f,$idx,$ra) = @_;
-    if ($idx == 256 && $no_vrsave) {
-	"	or	$ra,$ra,$ra";
-    } else {
-	"	mtspr	$idx,$ra";
-    }
-};
-my $mfspr = sub {
-    my ($f,$rd,$idx) = @_;
-    if ($idx == 256 && $no_vrsave) {
-	"	li	$rd,-1";
-    } else {
-	"	mfspr	$rd,$idx";
-    }
-};
-
-# PowerISA 2.06 stuff
-sub vsxmem_op {
-    my ($f, $vrt, $ra, $rb, $op) = @_;
-    "	.long	".sprintf "0x%X",(31<<26)|($vrt<<21)|($ra<<16)|($rb<<11)|($op*2+1);
-}
-# made-up unaligned memory reference AltiVec/VMX instructions
-my $lvx_u	= sub {	vsxmem_op(@_, 844); };	# lxvd2x
-my $stvx_u	= sub {	vsxmem_op(@_, 972); };	# stxvd2x
-my $lvdx_u	= sub {	vsxmem_op(@_, 588); };	# lxsdx
-my $stvdx_u	= sub {	vsxmem_op(@_, 716); };	# stxsdx
-my $lvx_4w	= sub { vsxmem_op(@_, 780); };	# lxvw4x
-my $stvx_4w	= sub { vsxmem_op(@_, 908); };	# stxvw4x
-
-# PowerISA 2.07 stuff
-sub vcrypto_op {
-    my ($f, $vrt, $vra, $vrb, $op) = @_;
-    "	.long	".sprintf "0x%X",(4<<26)|($vrt<<21)|($vra<<16)|($vrb<<11)|$op;
-}
-my $vcipher	= sub { vcrypto_op(@_, 1288); };
-my $vcipherlast	= sub { vcrypto_op(@_, 1289); };
-my $vncipher	= sub { vcrypto_op(@_, 1352); };
-my $vncipherlast= sub { vcrypto_op(@_, 1353); };
-my $vsbox	= sub { vcrypto_op(@_, 0, 1480); };
-my $vshasigmad	= sub { my ($st,$six)=splice(@_,-2); vcrypto_op(@_, $st<<4|$six, 1730); };
-my $vshasigmaw	= sub { my ($st,$six)=splice(@_,-2); vcrypto_op(@_, $st<<4|$six, 1666); };
-my $vpmsumb	= sub { vcrypto_op(@_, 1032); };
-my $vpmsumd	= sub { vcrypto_op(@_, 1224); };
-my $vpmsubh	= sub { vcrypto_op(@_, 1096); };
-my $vpmsumw	= sub { vcrypto_op(@_, 1160); };
-my $vaddudm	= sub { vcrypto_op(@_, 192);  };
-my $vadduqm	= sub { vcrypto_op(@_, 256);  };
-
-my $mtsle	= sub {
-    my ($f, $arg) = @_;
-    "	.long	".sprintf "0x%X",(31<<26)|($arg<<21)|(147*2);
-};
-
-print "#include <asm/ppc_asm.h>\n" if $flavour =~ /linux/;
-
-while($line=<>) {
-
-    $line =~ s|[#!;].*$||;	# get rid of asm-style comments...
-    $line =~ s|/\*.*\*/||;	# ... and C-style comments...
-    $line =~ s|^\s+||;		# ... and skip white spaces in beginning...
-    $line =~ s|\s+$||;		# ... and at the end
-
-    {
-	$line =~ s|\b\.L(\w+)|L$1|g;	# common denominator for Locallabel
-	$line =~ s|\bL(\w+)|\.L$1|g	if ($dotinlocallabels);
-    }
-
-    {
-	$line =~ s|^\s*(\.?)(\w+)([\.\+\-]?)\s*||;
-	my $c = $1; $c = "\t" if ($c eq "");
-	my $mnemonic = $2;
-	my $f = $3;
-	my $opcode = eval("\$$mnemonic");
-	$line =~ s/\b(c?[rf]|v|vs)([0-9]+)\b/$2/g if ($c ne "." and $flavour !~ /osx/);
-	if (ref($opcode) eq 'CODE') { $line = &$opcode($f,split(',',$line)); }
-	elsif ($mnemonic)           { $line = $c.$mnemonic.$f."\t".$line; }
-    }
-
-    print $line if ($line);
-    print "\n";
-}
-
-close STDOUT;
diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
deleted file mode 100644
index 31a98dc6f849..000000000000
--- a/drivers/crypto/vmx/vmx.c
+++ /dev/null
@@ -1,88 +0,0 @@
-/**
- * Routines supporting VMX instructions on the Power 8
- *
- * Copyright (C) 2015 International Business Machines Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
- */
-
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/cpufeature.h>
-#include <linux/crypto.h>
-#include <asm/cputable.h>
-#include <crypto/internal/hash.h>
-
-extern struct shash_alg p8_ghash_alg;
-extern struct crypto_alg p8_aes_alg;
-extern struct crypto_alg p8_aes_cbc_alg;
-extern struct crypto_alg p8_aes_ctr_alg;
-extern struct crypto_alg p8_aes_xts_alg;
-static struct crypto_alg *algs[] = {
-	&p8_aes_alg,
-	&p8_aes_cbc_alg,
-	&p8_aes_ctr_alg,
-	&p8_aes_xts_alg,
-	NULL,
-};
-
-int __init p8_init(void)
-{
-	int ret = 0;
-	struct crypto_alg **alg_it;
-
-	for (alg_it = algs; *alg_it; alg_it++) {
-		ret = crypto_register_alg(*alg_it);
-		printk(KERN_INFO "crypto_register_alg '%s' = %d\n",
-		       (*alg_it)->cra_name, ret);
-		if (ret) {
-			for (alg_it--; alg_it >= algs; alg_it--)
-				crypto_unregister_alg(*alg_it);
-			break;
-		}
-	}
-	if (ret)
-		return ret;
-
-	ret = crypto_register_shash(&p8_ghash_alg);
-	if (ret) {
-		for (alg_it = algs; *alg_it; alg_it++)
-			crypto_unregister_alg(*alg_it);
-	}
-	return ret;
-}
-
-void __exit p8_exit(void)
-{
-	struct crypto_alg **alg_it;
-
-	for (alg_it = algs; *alg_it; alg_it++) {
-		printk(KERN_INFO "Removing '%s'\n", (*alg_it)->cra_name);
-		crypto_unregister_alg(*alg_it);
-	}
-	crypto_unregister_shash(&p8_ghash_alg);
-}
-
-module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init);
-module_exit(p8_exit);
-
-MODULE_AUTHOR("Marcelo Cerri<mhcerri@br.ibm.com>");
-MODULE_DESCRIPTION("IBM VMX cryptographic acceleration instructions "
-		   "support on Power 8");
-MODULE_LICENSE("GPL");
-MODULE_VERSION("1.0.0");
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-29 12:56 [PATCH] crypto: vmx: Remove dubiously licensed crypto code Michal Suchanek
@ 2017-03-29 14:51 ` Greg Kroah-Hartman
  2017-03-29 15:13   ` Michal Suchánek
  2017-03-29 23:29 ` Tyrel Datwyler
       [not found] ` <87r31flisi.fsf@concordia.ellerman.id.au>
  2 siblings, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-29 14:51 UTC (permalink / raw)
  To: Michal Suchanek, Leonidas S. Barbosa, Paulo Flabiano Smorigo
  Cc: Herbert Xu, David S. Miller, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Geert Uytterhoeven,
	Mauro Carvalho Chehab, linux-kernel, linux-crypto, linuxppc-dev

On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:
> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
> subroutines for XTS") which adds the OpenSSL license header to
> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came into
> question. The whole license reads:
> 
>  # Licensed under the OpenSSL license (the "License").  You may not use
>  # this file except in compliance with the License.  You can obtain a
>  # copy
>  # in the file LICENSE in the source distribution or at
>  # https://www.openssl.org/source/license.html
> 
>  #
>  # ====================================================================
>  # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
>  # project. The module is, however, dual licensed under OpenSSL and
>  # CRYPTOGAMS licenses depending on where you obtain it. For further
>  # details see http://www.openssl.org/~appro/cryptogams/.
>  # ====================================================================
> 
> After seeking legal advice it is still not clear that this driver can be
> legally used in Linux. In particular the "depending on where you obtain
> it" part does not make it clear when you can apply the GPL and when the
> OpenSSL license.
> 
> I tried contacting the author of the code for clarification but did not
> hear back. In absence of clear licensing the only solution I see is
> removing this code.
> 
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
> ---
>  MAINTAINERS                       |   12 -
>  drivers/crypto/Kconfig            |    8 -
>  drivers/crypto/Makefile           |    1 -
>  drivers/crypto/vmx/.gitignore     |    2 -
>  drivers/crypto/vmx/Kconfig        |    9 -
>  drivers/crypto/vmx/Makefile       |   21 -
>  drivers/crypto/vmx/aes.c          |  150 --
>  drivers/crypto/vmx/aes_cbc.c      |  202 --
>  drivers/crypto/vmx/aes_ctr.c      |  191 --
>  drivers/crypto/vmx/aes_xts.c      |  190 --
>  drivers/crypto/vmx/aesp8-ppc.h    |   25 -
>  drivers/crypto/vmx/aesp8-ppc.pl   | 3789 -------------------------------------
>  drivers/crypto/vmx/ghash.c        |  227 ---
>  drivers/crypto/vmx/ghashp8-ppc.pl |  234 ---
>  drivers/crypto/vmx/ppc-xlate.pl   |  228 ---
>  drivers/crypto/vmx/vmx.c          |   88 -
>  16 files changed, 5377 deletions(-)
>  delete mode 100644 drivers/crypto/vmx/.gitignore
>  delete mode 100644 drivers/crypto/vmx/Kconfig
>  delete mode 100644 drivers/crypto/vmx/Makefile
>  delete mode 100644 drivers/crypto/vmx/aes.c
>  delete mode 100644 drivers/crypto/vmx/aes_cbc.c
>  delete mode 100644 drivers/crypto/vmx/aes_ctr.c
>  delete mode 100644 drivers/crypto/vmx/aes_xts.c
>  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.h
>  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.pl
>  delete mode 100644 drivers/crypto/vmx/ghash.c
>  delete mode 100644 drivers/crypto/vmx/ghashp8-ppc.pl
>  delete mode 100644 drivers/crypto/vmx/ppc-xlate.pl
>  delete mode 100644 drivers/crypto/vmx/vmx.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b0a87ffffab..fd4cbf046ab4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6190,18 +6190,6 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
>  S:	Maintained
>  F:	arch/ia64/
>  
> -IBM Power VMX Cryptographic instructions
> -M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
> -M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>

Any reason why you didn't cc: these maintainers on your patch?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-29 14:51 ` Greg Kroah-Hartman
@ 2017-03-29 15:13   ` Michal Suchánek
  2017-03-29 23:08     ` Tyrel Datwyler
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2017-03-29 15:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Leonidas S. Barbosa, Paulo Flabiano Smorigo, Herbert Xu,
	David S. Miller, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Geert Uytterhoeven, Mauro Carvalho Chehab,
	linux-kernel, linux-crypto, linuxppc-dev

On Wed, 29 Mar 2017 16:51:35 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:
> > While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
> > subroutines for XTS") which adds the OpenSSL license header to
> > drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came into
> > question. The whole license reads:
> > 
> >  # Licensed under the OpenSSL license (the "License").  You may not
> > use # this file except in compliance with the License.  You can
> > obtain a # copy
> >  # in the file LICENSE in the source distribution or at
> >  # https://www.openssl.org/source/license.html
> > 
> >  #
> >  #
> > ====================================================================
> > # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL #
> > project. The module is, however, dual licensed under OpenSSL and #
> > CRYPTOGAMS licenses depending on where you obtain it. For further #
> > details see http://www.openssl.org/~appro/cryptogams/. #
> > ====================================================================
> > 
> > After seeking legal advice it is still not clear that this driver
> > can be legally used in Linux. In particular the "depending on where
> > you obtain it" part does not make it clear when you can apply the
> > GPL and when the OpenSSL license.
> > 
> > I tried contacting the author of the code for clarification but did
> > not hear back. In absence of clear licensing the only solution I
> > see is removing this code.
> > 
> > Signed-off-by: Michal Suchanek <msuchanek@suse.de>
> > ---
> >  MAINTAINERS                       |   12 -
> >  drivers/crypto/Kconfig            |    8 -
> >  drivers/crypto/Makefile           |    1 -
> >  drivers/crypto/vmx/.gitignore     |    2 -
> >  drivers/crypto/vmx/Kconfig        |    9 -
> >  drivers/crypto/vmx/Makefile       |   21 -
> >  drivers/crypto/vmx/aes.c          |  150 --
> >  drivers/crypto/vmx/aes_cbc.c      |  202 --
> >  drivers/crypto/vmx/aes_ctr.c      |  191 --
> >  drivers/crypto/vmx/aes_xts.c      |  190 --
> >  drivers/crypto/vmx/aesp8-ppc.h    |   25 -
> >  drivers/crypto/vmx/aesp8-ppc.pl   | 3789
> > -------------------------------------
> > drivers/crypto/vmx/ghash.c        |  227 ---
> > drivers/crypto/vmx/ghashp8-ppc.pl |  234 ---
> > drivers/crypto/vmx/ppc-xlate.pl   |  228 ---
> > drivers/crypto/vmx/vmx.c          |   88 - 16 files changed, 5377
> > deletions(-) delete mode 100644 drivers/crypto/vmx/.gitignore
> >  delete mode 100644 drivers/crypto/vmx/Kconfig
> >  delete mode 100644 drivers/crypto/vmx/Makefile
> >  delete mode 100644 drivers/crypto/vmx/aes.c
> >  delete mode 100644 drivers/crypto/vmx/aes_cbc.c
> >  delete mode 100644 drivers/crypto/vmx/aes_ctr.c
> >  delete mode 100644 drivers/crypto/vmx/aes_xts.c
> >  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.h
> >  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.pl
> >  delete mode 100644 drivers/crypto/vmx/ghash.c
> >  delete mode 100644 drivers/crypto/vmx/ghashp8-ppc.pl
> >  delete mode 100644 drivers/crypto/vmx/ppc-xlate.pl
> >  delete mode 100644 drivers/crypto/vmx/vmx.c
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 1b0a87ffffab..fd4cbf046ab4 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6190,18 +6190,6 @@ T:	git
> > git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
> > S:	Maintained F:	arch/ia64/
> >  
> > -IBM Power VMX Cryptographic instructions
> > -M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
> > -M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>  
> 
> Any reason why you didn't cc: these maintainers on your patch?
> 

I used scripts/get_maintainer.pl with a filter that turns its output
into a valid e-mail list, and did not check the result particularly
thoroughly. Removing the maintainers' entry from MAINTAINERS in the
patch itself is probably what caused the omission.

Thanks

Michal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-29 15:13   ` Michal Suchánek
@ 2017-03-29 23:08     ` Tyrel Datwyler
  2017-03-30 16:30       ` Paulo Flabiano Smorigo
  0 siblings, 1 reply; 10+ messages in thread
From: Tyrel Datwyler @ 2017-03-29 23:08 UTC (permalink / raw)
  To: Michal Suchánek, Greg Kroah-Hartman
  Cc: Leonidas S. Barbosa, Herbert Xu, Geert Uytterhoeven,
	linux-kernel, Paul Mackerras, linux-crypto,
	Paulo Flabiano Smorigo, Mauro Carvalho Chehab, linuxppc-dev,
	David S. Miller

On 03/29/2017 08:13 AM, Michal Suchánek wrote:
> On Wed, 29 Mar 2017 16:51:35 +0200
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
>> On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:
>>> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
>>> subroutines for XTS") which adds the OpenSSL license header to
>>> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came into
>>> question. The whole license reads:
>>>
>>>  # Licensed under the OpenSSL license (the "License").  You may not
>>> use # this file except in compliance with the License.  You can
>>> obtain a # copy
>>>  # in the file LICENSE in the source distribution or at
>>>  # https://www.openssl.org/source/license.html
>>>
>>>  #
>>>  #
>>> ====================================================================
>>> # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL #
>>> project. The module is, however, dual licensed under OpenSSL and #
>>> CRYPTOGAMS licenses depending on where you obtain it. For further #
>>> details see http://www.openssl.org/~appro/cryptogams/. #
>>> ====================================================================
>>>
>>> After seeking legal advice it is still not clear that this driver
>>> can be legally used in Linux. In particular the "depending on where
>>> you obtain it" part does not make it clear when you can apply the
>>> GPL and when the OpenSSL license.
>>>
>>> I tried contacting the author of the code for clarification but did
>>> not hear back. In absence of clear licensing the only solution I
>>> see is removing this code.

A quick 'git grep OpenSSL' of the Linux tree returns several other
crypto files under the ARM architecture that are similarly licensed. Namely:

arch/arm/crypto/sha1-armv4-large.S
arch/arm/crypto/sha256-armv4.pl
arch/arm/crypto/sha256-core.S_shipped
arch/arm/crypto/sha512-armv4.pl
arch/arm/crypto/sha512-core.S_shipped
arch/arm64/crypto/sha256-core.S_shipped
arch/arm64/crypto/sha512-armv8.pl
arch/arm64/crypto/sha512-core.S_shipped

On closer inspection, some of those files carry the addendum that
"Permission to use under GPL terms is granted", but not all of them do.
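As a sketch of that audit (the file names and header contents below are stand-ins created in a temp directory, not the real kernel sources), it amounts to two greps: find files that mention the OpenSSL license, then flag those lacking the explicit GPL grant:

```shell
set -eu
tmp=$(mktemp -d)
# One file with the GPL addendum, one without (hypothetical names):
printf '%s\n' '# Licensed under the OpenSSL license' \
  '# Permission to use under GPL terms is granted' > "$tmp/sha256-armv4.pl"
printf '%s\n' '# Licensed under the OpenSSL license' > "$tmp/aesp8-ppc.pl"

# Collect files mentioning the OpenSSL license but lacking the GPL grant.
unclear=""
for f in $(grep -rl 'OpenSSL license' "$tmp"); do
    grep -q 'Permission to use under GPL' "$f" || unclear="$unclear $(basename "$f")"
done
echo "unclear licensing:$unclear"
rm -rf "$tmp"
```

In a real tree the same pair of greps would run as `git grep -l OpenSSL -- 'arch/arm*/crypto/' 'drivers/crypto/'`, followed by the per-file check for the GPL addendum.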

-Tyrel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-29 12:56 [PATCH] crypto: vmx: Remove dubiously licensed crypto code Michal Suchanek
  2017-03-29 14:51 ` Greg Kroah-Hartman
@ 2017-03-29 23:29 ` Tyrel Datwyler
       [not found] ` <87r31flisi.fsf@concordia.ellerman.id.au>
  2 siblings, 0 replies; 10+ messages in thread
From: Tyrel Datwyler @ 2017-03-29 23:29 UTC (permalink / raw)
  To: Michal Suchanek, Herbert Xu, David S. Miller,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Greg Kroah-Hartman, Geert Uytterhoeven, Mauro Carvalho Chehab,
	linux-kernel, linux-crypto, linuxppc-dev, appro

Adding Andy to the discussion, as he may be able to shed light on the
licensing issue of the crypto perl scripts in question.

-Tyrel

On 03/29/2017 05:56 AM, Michal Suchanek wrote:
> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
> subroutines for XTS") which adds the OpenSSL license header to
> drivers/crypto/vmx/aesp8-ppc.pl, the licensing of this driver came into
> question. The whole license reads:
> 
>  # Licensed under the OpenSSL license (the "License").  You may not use
>  # this file except in compliance with the License.  You can obtain a
>  # copy
>  # in the file LICENSE in the source distribution or at
>  # https://www.openssl.org/source/license.html
> 
>  #
>  # ====================================================================
>  # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
>  # project. The module is, however, dual licensed under OpenSSL and
>  # CRYPTOGAMS licenses depending on where you obtain it. For further
>  # details see http://www.openssl.org/~appro/cryptogams/.
>  # ====================================================================
> 
> After seeking legal advice it is still not clear that this driver can be
> legally used in Linux. In particular the "depending on where you obtain
> it" part does not make it clear when you can apply the GPL and when the
> OpenSSL license.
> 
> I tried contacting the author of the code for clarification but did not
> hear back. In absence of clear licensing the only solution I see is
> removing this code.
> 
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
> ---
>  MAINTAINERS                       |   12 -
>  drivers/crypto/Kconfig            |    8 -
>  drivers/crypto/Makefile           |    1 -
>  drivers/crypto/vmx/.gitignore     |    2 -
>  drivers/crypto/vmx/Kconfig        |    9 -
>  drivers/crypto/vmx/Makefile       |   21 -
>  drivers/crypto/vmx/aes.c          |  150 --
>  drivers/crypto/vmx/aes_cbc.c      |  202 --
>  drivers/crypto/vmx/aes_ctr.c      |  191 --
>  drivers/crypto/vmx/aes_xts.c      |  190 --
>  drivers/crypto/vmx/aesp8-ppc.h    |   25 -
>  drivers/crypto/vmx/aesp8-ppc.pl   | 3789 -------------------------------------
>  drivers/crypto/vmx/ghash.c        |  227 ---
>  drivers/crypto/vmx/ghashp8-ppc.pl |  234 ---
>  drivers/crypto/vmx/ppc-xlate.pl   |  228 ---
>  drivers/crypto/vmx/vmx.c          |   88 -
>  16 files changed, 5377 deletions(-)
>  delete mode 100644 drivers/crypto/vmx/.gitignore
>  delete mode 100644 drivers/crypto/vmx/Kconfig
>  delete mode 100644 drivers/crypto/vmx/Makefile
>  delete mode 100644 drivers/crypto/vmx/aes.c
>  delete mode 100644 drivers/crypto/vmx/aes_cbc.c
>  delete mode 100644 drivers/crypto/vmx/aes_ctr.c
>  delete mode 100644 drivers/crypto/vmx/aes_xts.c
>  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.h
>  delete mode 100644 drivers/crypto/vmx/aesp8-ppc.pl
>  delete mode 100644 drivers/crypto/vmx/ghash.c
>  delete mode 100644 drivers/crypto/vmx/ghashp8-ppc.pl
>  delete mode 100644 drivers/crypto/vmx/ppc-xlate.pl
>  delete mode 100644 drivers/crypto/vmx/vmx.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1b0a87ffffab..fd4cbf046ab4 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6190,18 +6190,6 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
>  S:	Maintained
>  F:	arch/ia64/
> 
> -IBM Power VMX Cryptographic instructions
> -M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
> -M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
> -L:	linux-crypto@vger.kernel.org
> -S:	Supported
> -F:	drivers/crypto/vmx/Makefile
> -F:	drivers/crypto/vmx/Kconfig
> -F:	drivers/crypto/vmx/vmx.c
> -F:	drivers/crypto/vmx/aes*
> -F:	drivers/crypto/vmx/ghash*
> -F:	drivers/crypto/vmx/ppc-xlate.pl
> -
>  IBM Power in-Nest Crypto Acceleration
>  M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
>  M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 473d31288ad8..9fcd3af1f2f1 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -530,14 +530,6 @@ config CRYPTO_DEV_QCE
>  	  hardware. To compile this driver as a module, choose M here. The
>  	  module will be called qcrypto.
> 
> -config CRYPTO_DEV_VMX
> -	bool "Support for VMX cryptographic acceleration instructions"
> -	depends on PPC64 && VSX
> -	help
> -	  Support for VMX cryptographic acceleration instructions.
> -
> -source "drivers/crypto/vmx/Kconfig"
> -
>  config CRYPTO_DEV_IMGTEC_HASH
>  	tristate "Imagination Technologies hardware hash accelerator"
>  	depends on MIPS || COMPILE_TEST
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index 739609471169..486e57e10e7a 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -34,5 +34,4 @@ obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
>  obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
>  obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
>  obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/
> -obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
>  obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
> diff --git a/drivers/crypto/vmx/.gitignore b/drivers/crypto/vmx/.gitignore
> deleted file mode 100644
> index af4a7ce4738d..000000000000
> --- a/drivers/crypto/vmx/.gitignore
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -aesp8-ppc.S
> -ghashp8-ppc.S
> diff --git a/drivers/crypto/vmx/Kconfig b/drivers/crypto/vmx/Kconfig
> deleted file mode 100644
> index c3d524ea6998..000000000000
> --- a/drivers/crypto/vmx/Kconfig
> +++ /dev/null
> @@ -1,9 +0,0 @@
> -config CRYPTO_DEV_VMX_ENCRYPT
> -	tristate "Encryption acceleration support on P8 CPU"
> -	depends on CRYPTO_DEV_VMX
> -	select CRYPTO_GHASH
> -	default m
> -	help
> -	  Support for VMX cryptographic acceleration instructions on Power8 CPU.
> -	  This module supports acceleration for AES and GHASH in hardware. If you
> -	  choose 'M' here, this module will be called vmx-crypto.
> diff --git a/drivers/crypto/vmx/Makefile b/drivers/crypto/vmx/Makefile
> deleted file mode 100644
> index 55f7c392582f..000000000000
> --- a/drivers/crypto/vmx/Makefile
> +++ /dev/null
> @@ -1,21 +0,0 @@
> -obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) += vmx-crypto.o
> -vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
> -
> -ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
> -TARGET := linux-ppc64le
> -else
> -TARGET := linux-ppc64
> -endif
> -
> -quiet_cmd_perl = PERL $@
> -      cmd_perl = $(PERL) $(<) $(TARGET) > $(@)
> -
> -targets += aesp8-ppc.S ghashp8-ppc.S
> -
> -$(obj)/aesp8-ppc.S: $(src)/aesp8-ppc.pl FORCE
> -	$(call if_changed,perl)
> -  
> -$(obj)/ghashp8-ppc.S: $(src)/ghashp8-ppc.pl FORCE
> -	$(call if_changed,perl)
> -
> -clean-files := aesp8-ppc.S ghashp8-ppc.S
> diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
> deleted file mode 100644
> index 022c7ab7351a..000000000000
> --- a/drivers/crypto/vmx/aes.c
> +++ /dev/null
> @@ -1,150 +0,0 @@
> -/**
> - * AES routines supporting VMX instructions on the Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
> - */
> -
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/crypto.h>
> -#include <linux/delay.h>
> -#include <linux/hardirq.h>
> -#include <asm/switch_to.h>
> -#include <crypto/aes.h>
> -
> -#include "aesp8-ppc.h"
> -
> -struct p8_aes_ctx {
> -	struct crypto_cipher *fallback;
> -	struct aes_key enc_key;
> -	struct aes_key dec_key;
> -};
> -
> -static int p8_aes_init(struct crypto_tfm *tfm)
> -{
> -	const char *alg;
> -	struct crypto_cipher *fallback;
> -	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (!(alg = crypto_tfm_alg_name(tfm))) {
> -		printk(KERN_ERR "Failed to get algorithm name.\n");
> -		return -ENOENT;
> -	}
> -
> -	fallback = crypto_alloc_cipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
> -	if (IS_ERR(fallback)) {
> -		printk(KERN_ERR
> -		       "Failed to allocate transformation for '%s': %ld\n",
> -		       alg, PTR_ERR(fallback));
> -		return PTR_ERR(fallback);
> -	}
> -	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
> -	       crypto_tfm_alg_driver_name((struct crypto_tfm *) fallback));
> -
> -	crypto_cipher_set_flags(fallback,
> -				crypto_cipher_get_flags((struct
> -							 crypto_cipher *)
> -							tfm));
> -	ctx->fallback = fallback;
> -
> -	return 0;
> -}
> -
> -static void p8_aes_exit(struct crypto_tfm *tfm)
> -{
> -	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (ctx->fallback) {
> -		crypto_free_cipher(ctx->fallback);
> -		ctx->fallback = NULL;
> -	}
> -}
> -
> -static int p8_aes_setkey(struct crypto_tfm *tfm, const u8 *key,
> -			 unsigned int keylen)
> -{
> -	int ret;
> -	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
> -	ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -
> -	ret += crypto_cipher_setkey(ctx->fallback, key, keylen);
> -	return ret;
> -}
> -
> -static void p8_aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
> -{
> -	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (in_interrupt()) {
> -		crypto_cipher_encrypt_one(ctx->fallback, dst, src);
> -	} else {
> -		preempt_disable();
> -		pagefault_disable();
> -		enable_kernel_vsx();
> -		aes_p8_encrypt(src, dst, &ctx->enc_key);
> -		disable_kernel_vsx();
> -		pagefault_enable();
> -		preempt_enable();
> -	}
> -}
> -
> -static void p8_aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
> -{
> -	struct p8_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (in_interrupt()) {
> -		crypto_cipher_decrypt_one(ctx->fallback, dst, src);
> -	} else {
> -		preempt_disable();
> -		pagefault_disable();
> -		enable_kernel_vsx();
> -		aes_p8_decrypt(src, dst, &ctx->dec_key);
> -		disable_kernel_vsx();
> -		pagefault_enable();
> -		preempt_enable();
> -	}
> -}
> -
> -struct crypto_alg p8_aes_alg = {
> -	.cra_name = "aes",
> -	.cra_driver_name = "p8_aes",
> -	.cra_module = THIS_MODULE,
> -	.cra_priority = 1000,
> -	.cra_type = NULL,
> -	.cra_flags = CRYPTO_ALG_TYPE_CIPHER | CRYPTO_ALG_NEED_FALLBACK,
> -	.cra_alignmask = 0,
> -	.cra_blocksize = AES_BLOCK_SIZE,
> -	.cra_ctxsize = sizeof(struct p8_aes_ctx),
> -	.cra_init = p8_aes_init,
> -	.cra_exit = p8_aes_exit,
> -	.cra_cipher = {
> -		       .cia_min_keysize = AES_MIN_KEY_SIZE,
> -		       .cia_max_keysize = AES_MAX_KEY_SIZE,
> -		       .cia_setkey = p8_aes_setkey,
> -		       .cia_encrypt = p8_aes_encrypt,
> -		       .cia_decrypt = p8_aes_decrypt,
> -	},
> -};
> diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
> deleted file mode 100644
> index 72a26eb4e954..000000000000
> --- a/drivers/crypto/vmx/aes_cbc.c
> +++ /dev/null
> @@ -1,202 +0,0 @@
> -/**
> - * AES CBC routines supporting VMX instructions on the Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
> - */
> -
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/crypto.h>
> -#include <linux/delay.h>
> -#include <linux/hardirq.h>
> -#include <asm/switch_to.h>
> -#include <crypto/aes.h>
> -#include <crypto/scatterwalk.h>
> -#include <crypto/skcipher.h>
> -
> -#include "aesp8-ppc.h"
> -
> -struct p8_aes_cbc_ctx {
> -	struct crypto_skcipher *fallback;
> -	struct aes_key enc_key;
> -	struct aes_key dec_key;
> -};
> -
> -static int p8_aes_cbc_init(struct crypto_tfm *tfm)
> -{
> -	const char *alg;
> -	struct crypto_skcipher *fallback;
> -	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (!(alg = crypto_tfm_alg_name(tfm))) {
> -		printk(KERN_ERR "Failed to get algorithm name.\n");
> -		return -ENOENT;
> -	}
> -
> -	fallback = crypto_alloc_skcipher(alg, 0,
> -			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
> -
> -	if (IS_ERR(fallback)) {
> -		printk(KERN_ERR
> -		       "Failed to allocate transformation for '%s': %ld\n",
> -		       alg, PTR_ERR(fallback));
> -		return PTR_ERR(fallback);
> -	}
> -	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
> -		crypto_skcipher_driver_name(fallback));
> -
> -
> -	crypto_skcipher_set_flags(
> -		fallback,
> -		crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
> -	ctx->fallback = fallback;
> -
> -	return 0;
> -}
> -
> -static void p8_aes_cbc_exit(struct crypto_tfm *tfm)
> -{
> -	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (ctx->fallback) {
> -		crypto_free_skcipher(ctx->fallback);
> -		ctx->fallback = NULL;
> -	}
> -}
> -
> -static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
> -			     unsigned int keylen)
> -{
> -	int ret;
> -	struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
> -	ret += aes_p8_set_decrypt_key(key, keylen * 8, &ctx->dec_key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -
> -	ret += crypto_skcipher_setkey(ctx->fallback, key, keylen);
> -	return ret;
> -}
> -
> -static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
> -			      struct scatterlist *dst,
> -			      struct scatterlist *src, unsigned int nbytes)
> -{
> -	int ret;
> -	struct blkcipher_walk walk;
> -	struct p8_aes_cbc_ctx *ctx =
> -		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
> -
> -	if (in_interrupt()) {
> -		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
> -		skcipher_request_set_tfm(req, ctx->fallback);
> -		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
> -		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
> -		ret = crypto_skcipher_encrypt(req);
> -		skcipher_request_zero(req);
> -	} else {
> -		preempt_disable();
> -		pagefault_disable();
> -		enable_kernel_vsx();
> -
> -		blkcipher_walk_init(&walk, dst, src, nbytes);
> -		ret = blkcipher_walk_virt(desc, &walk);
> -		while ((nbytes = walk.nbytes)) {
> -			aes_p8_cbc_encrypt(walk.src.virt.addr,
> -					   walk.dst.virt.addr,
> -					   nbytes & AES_BLOCK_MASK,
> -					   &ctx->enc_key, walk.iv, 1);
> -			nbytes &= AES_BLOCK_SIZE - 1;
> -			ret = blkcipher_walk_done(desc, &walk, nbytes);
> -		}
> -
> -		disable_kernel_vsx();
> -		pagefault_enable();
> -		preempt_enable();
> -	}
> -
> -	return ret;
> -}
> -
> -static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
> -			      struct scatterlist *dst,
> -			      struct scatterlist *src, unsigned int nbytes)
> -{
> -	int ret;
> -	struct blkcipher_walk walk;
> -	struct p8_aes_cbc_ctx *ctx =
> -		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
> -
> -	if (in_interrupt()) {
> -		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
> -		skcipher_request_set_tfm(req, ctx->fallback);
> -		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
> -		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
> -		ret = crypto_skcipher_decrypt(req);
> -		skcipher_request_zero(req);
> -	} else {
> -		preempt_disable();
> -		pagefault_disable();
> -		enable_kernel_vsx();
> -
> -		blkcipher_walk_init(&walk, dst, src, nbytes);
> -		ret = blkcipher_walk_virt(desc, &walk);
> -		while ((nbytes = walk.nbytes)) {
> -			aes_p8_cbc_encrypt(walk.src.virt.addr,
> -					   walk.dst.virt.addr,
> -					   nbytes & AES_BLOCK_MASK,
> -					   &ctx->dec_key, walk.iv, 0);
> -			nbytes &= AES_BLOCK_SIZE - 1;
> -			ret = blkcipher_walk_done(desc, &walk, nbytes);
> -		}
> -
> -		disable_kernel_vsx();
> -		pagefault_enable();
> -		preempt_enable();
> -	}
> -
> -	return ret;
> -}
> -
> -
> -struct crypto_alg p8_aes_cbc_alg = {
> -	.cra_name = "cbc(aes)",
> -	.cra_driver_name = "p8_aes_cbc",
> -	.cra_module = THIS_MODULE,
> -	.cra_priority = 2000,
> -	.cra_type = &crypto_blkcipher_type,
> -	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
> -	.cra_alignmask = 0,
> -	.cra_blocksize = AES_BLOCK_SIZE,
> -	.cra_ctxsize = sizeof(struct p8_aes_cbc_ctx),
> -	.cra_init = p8_aes_cbc_init,
> -	.cra_exit = p8_aes_cbc_exit,
> -	.cra_blkcipher = {
> -			  .ivsize = AES_BLOCK_SIZE,
> -			  .min_keysize = AES_MIN_KEY_SIZE,
> -			  .max_keysize = AES_MAX_KEY_SIZE,
> -			  .setkey = p8_aes_cbc_setkey,
> -			  .encrypt = p8_aes_cbc_encrypt,
> -			  .decrypt = p8_aes_cbc_decrypt,
> -	},
> -};
> diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
> deleted file mode 100644
> index 7cf6d31c1123..000000000000
> --- a/drivers/crypto/vmx/aes_ctr.c
> +++ /dev/null
> @@ -1,191 +0,0 @@
> -/**
> - * AES CTR routines supporting VMX instructions on the Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
> - */
> -
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/crypto.h>
> -#include <linux/delay.h>
> -#include <linux/hardirq.h>
> -#include <asm/switch_to.h>
> -#include <crypto/aes.h>
> -#include <crypto/scatterwalk.h>
> -#include "aesp8-ppc.h"
> -
> -struct p8_aes_ctr_ctx {
> -	struct crypto_blkcipher *fallback;
> -	struct aes_key enc_key;
> -};
> -
> -static int p8_aes_ctr_init(struct crypto_tfm *tfm)
> -{
> -	const char *alg;
> -	struct crypto_blkcipher *fallback;
> -	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (!(alg = crypto_tfm_alg_name(tfm))) {
> -		printk(KERN_ERR "Failed to get algorithm name.\n");
> -		return -ENOENT;
> -	}
> -
> -	fallback =
> -	    crypto_alloc_blkcipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
> -	if (IS_ERR(fallback)) {
> -		printk(KERN_ERR
> -		       "Failed to allocate transformation for '%s': %ld\n",
> -		       alg, PTR_ERR(fallback));
> -		return PTR_ERR(fallback);
> -	}
> -	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
> -	       crypto_tfm_alg_driver_name((struct crypto_tfm *) fallback));
> -
> -	crypto_blkcipher_set_flags(
> -		fallback,
> -		crypto_blkcipher_get_flags((struct crypto_blkcipher *)tfm));
> -	ctx->fallback = fallback;
> -
> -	return 0;
> -}
> -
> -static void p8_aes_ctr_exit(struct crypto_tfm *tfm)
> -{
> -	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (ctx->fallback) {
> -		crypto_free_blkcipher(ctx->fallback);
> -		ctx->fallback = NULL;
> -	}
> -}
> -
> -static int p8_aes_ctr_setkey(struct crypto_tfm *tfm, const u8 *key,
> -			     unsigned int keylen)
> -{
> -	int ret;
> -	struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	ret = aes_p8_set_encrypt_key(key, keylen * 8, &ctx->enc_key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -
> -	ret += crypto_blkcipher_setkey(ctx->fallback, key, keylen);
> -	return ret;
> -}
> -
> -static void p8_aes_ctr_final(struct p8_aes_ctr_ctx *ctx,
> -			     struct blkcipher_walk *walk)
> -{
> -	u8 *ctrblk = walk->iv;
> -	u8 keystream[AES_BLOCK_SIZE];
> -	u8 *src = walk->src.virt.addr;
> -	u8 *dst = walk->dst.virt.addr;
> -	unsigned int nbytes = walk->nbytes;
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	aes_p8_encrypt(ctrblk, keystream, &ctx->enc_key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -
> -	crypto_xor(keystream, src, nbytes);
> -	memcpy(dst, keystream, nbytes);
> -	crypto_inc(ctrblk, AES_BLOCK_SIZE);
> -}
> -
> -static int p8_aes_ctr_crypt(struct blkcipher_desc *desc,
> -			    struct scatterlist *dst,
> -			    struct scatterlist *src, unsigned int nbytes)
> -{
> -	int ret;
> -	u64 inc;
> -	struct blkcipher_walk walk;
> -	struct p8_aes_ctr_ctx *ctx =
> -		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
> -	struct blkcipher_desc fallback_desc = {
> -		.tfm = ctx->fallback,
> -		.info = desc->info,
> -		.flags = desc->flags
> -	};
> -
> -	if (in_interrupt()) {
> -		ret = crypto_blkcipher_encrypt(&fallback_desc, dst, src,
> -					       nbytes);
> -	} else {
> -		blkcipher_walk_init(&walk, dst, src, nbytes);
> -		ret = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
> -		while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
> -			preempt_disable();
> -			pagefault_disable();
> -			enable_kernel_vsx();
> -			aes_p8_ctr32_encrypt_blocks(walk.src.virt.addr,
> -						    walk.dst.virt.addr,
> -						    (nbytes &
> -						     AES_BLOCK_MASK) /
> -						    AES_BLOCK_SIZE,
> -						    &ctx->enc_key,
> -						    walk.iv);
> -			disable_kernel_vsx();
> -			pagefault_enable();
> -			preempt_enable();
> -
> -			/* We need to update IV mostly for last bytes/round */
> -			inc = (nbytes & AES_BLOCK_MASK) / AES_BLOCK_SIZE;
> -			if (inc > 0)
> -				while (inc--)
> -					crypto_inc(walk.iv, AES_BLOCK_SIZE);
> -
> -			nbytes &= AES_BLOCK_SIZE - 1;
> -			ret = blkcipher_walk_done(desc, &walk, nbytes);
> -		}
> -		if (walk.nbytes) {
> -			p8_aes_ctr_final(ctx, &walk);
> -			ret = blkcipher_walk_done(desc, &walk, 0);
> -		}
> -	}
> -
> -	return ret;
> -}
> -
> -struct crypto_alg p8_aes_ctr_alg = {
> -	.cra_name = "ctr(aes)",
> -	.cra_driver_name = "p8_aes_ctr",
> -	.cra_module = THIS_MODULE,
> -	.cra_priority = 2000,
> -	.cra_type = &crypto_blkcipher_type,
> -	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
> -	.cra_alignmask = 0,
> -	.cra_blocksize = 1,
> -	.cra_ctxsize = sizeof(struct p8_aes_ctr_ctx),
> -	.cra_init = p8_aes_ctr_init,
> -	.cra_exit = p8_aes_ctr_exit,
> -	.cra_blkcipher = {
> -			  .ivsize = AES_BLOCK_SIZE,
> -			  .min_keysize = AES_MIN_KEY_SIZE,
> -			  .max_keysize = AES_MAX_KEY_SIZE,
> -			  .setkey = p8_aes_ctr_setkey,
> -			  .encrypt = p8_aes_ctr_crypt,
> -			  .decrypt = p8_aes_ctr_crypt,
> -	},
> -};
> diff --git a/drivers/crypto/vmx/aes_xts.c b/drivers/crypto/vmx/aes_xts.c
> deleted file mode 100644
> index 6adc9290557a..000000000000
> --- a/drivers/crypto/vmx/aes_xts.c
> +++ /dev/null
> @@ -1,190 +0,0 @@
> -/**
> - * AES XTS routines supporting VMX In-core instructions on Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundations; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY of FITNESS FOR A PARTICUPAR PURPOSE. See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
> - */
> -
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/crypto.h>
> -#include <linux/delay.h>
> -#include <linux/hardirq.h>
> -#include <asm/switch_to.h>
> -#include <crypto/aes.h>
> -#include <crypto/scatterwalk.h>
> -#include <crypto/xts.h>
> -#include <crypto/skcipher.h>
> -
> -#include "aesp8-ppc.h"
> -
> -struct p8_aes_xts_ctx {
> -	struct crypto_skcipher *fallback;
> -	struct aes_key enc_key;
> -	struct aes_key dec_key;
> -	struct aes_key tweak_key;
> -};
> -
> -static int p8_aes_xts_init(struct crypto_tfm *tfm)
> -{
> -	const char *alg;
> -	struct crypto_skcipher *fallback;
> -	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (!(alg = crypto_tfm_alg_name(tfm))) {
> -		printk(KERN_ERR "Failed to get algorithm name.\n");
> -		return -ENOENT;
> -	}
> -
> -	fallback = crypto_alloc_skcipher(alg, 0,
> -			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
> -	if (IS_ERR(fallback)) {
> -		printk(KERN_ERR
> -			"Failed to allocate transformation for '%s': %ld\n",
> -			alg, PTR_ERR(fallback));
> -		return PTR_ERR(fallback);
> -	}
> -	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
> -		crypto_skcipher_driver_name(fallback));
> -
> -	crypto_skcipher_set_flags(
> -		fallback,
> -		crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
> -	ctx->fallback = fallback;
> -
> -	return 0;
> -}
> -
> -static void p8_aes_xts_exit(struct crypto_tfm *tfm)
> -{
> -	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (ctx->fallback) {
> -		crypto_free_skcipher(ctx->fallback);
> -		ctx->fallback = NULL;
> -	}
> -}
> -
> -static int p8_aes_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
> -			     unsigned int keylen)
> -{
> -	int ret;
> -	struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	ret = xts_check_key(tfm, key, keylen);
> -	if (ret)
> -		return ret;
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	ret = aes_p8_set_encrypt_key(key + keylen/2, (keylen/2) * 8, &ctx->tweak_key);
> -	ret += aes_p8_set_encrypt_key(key, (keylen/2) * 8, &ctx->enc_key);
> -	ret += aes_p8_set_decrypt_key(key, (keylen/2) * 8, &ctx->dec_key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -
> -	ret += crypto_skcipher_setkey(ctx->fallback, key, keylen);
> -	return ret;
> -}
> -
> -static int p8_aes_xts_crypt(struct blkcipher_desc *desc,
> -			    struct scatterlist *dst,
> -			    struct scatterlist *src,
> -			    unsigned int nbytes, int enc)
> -{
> -	int ret;
> -	u8 tweak[AES_BLOCK_SIZE];
> -	u8 *iv;
> -	struct blkcipher_walk walk;
> -	struct p8_aes_xts_ctx *ctx =
> -		crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
> -
> -	if (in_interrupt()) {
> -		SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
> -		skcipher_request_set_tfm(req, ctx->fallback);
> -		skcipher_request_set_callback(req, desc->flags, NULL, NULL);
> -		skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
> -		ret = enc? crypto_skcipher_encrypt(req) : crypto_skcipher_decrypt(req);
> -		skcipher_request_zero(req);
> -	} else {
> -		preempt_disable();
> -		pagefault_disable();
> -		enable_kernel_vsx();
> -
> -		blkcipher_walk_init(&walk, dst, src, nbytes);
> -
> -		ret = blkcipher_walk_virt(desc, &walk);
> -		iv = walk.iv;
> -		memset(tweak, 0, AES_BLOCK_SIZE);
> -		aes_p8_encrypt(iv, tweak, &ctx->tweak_key);
> -
> -		while ((nbytes = walk.nbytes)) {
> -			if (enc)
> -				aes_p8_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
> -						nbytes & AES_BLOCK_MASK, &ctx->enc_key, NULL, tweak);
> -			else
> -				aes_p8_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
> -						nbytes & AES_BLOCK_MASK, &ctx->dec_key, NULL, tweak);
> -
> -			nbytes &= AES_BLOCK_SIZE - 1;
> -			ret = blkcipher_walk_done(desc, &walk, nbytes);
> -		}
> -
> -		disable_kernel_vsx();
> -		pagefault_enable();
> -		preempt_enable();
> -	}
> -	return ret;
> -}
> -
> -static int p8_aes_xts_encrypt(struct blkcipher_desc *desc,
> -			      struct scatterlist *dst,
> -			      struct scatterlist *src, unsigned int nbytes)
> -{
> -	return p8_aes_xts_crypt(desc, dst, src, nbytes, 1);
> -}
> -
> -static int p8_aes_xts_decrypt(struct blkcipher_desc *desc,
> -			      struct scatterlist *dst,
> -			      struct scatterlist *src, unsigned int nbytes)
> -{
> -	return p8_aes_xts_crypt(desc, dst, src, nbytes, 0);
> -}
> -
> -struct crypto_alg p8_aes_xts_alg = {
> -	.cra_name = "xts(aes)",
> -	.cra_driver_name = "p8_aes_xts",
> -	.cra_module = THIS_MODULE,
> -	.cra_priority = 2000,
> -	.cra_type = &crypto_blkcipher_type,
> -	.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
> -	.cra_alignmask = 0,
> -	.cra_blocksize = AES_BLOCK_SIZE,
> -	.cra_ctxsize = sizeof(struct p8_aes_xts_ctx),
> -	.cra_init = p8_aes_xts_init,
> -	.cra_exit = p8_aes_xts_exit,
> -	.cra_blkcipher = {
> -			.ivsize = AES_BLOCK_SIZE,
> -			.min_keysize = 2 * AES_MIN_KEY_SIZE,
> -			.max_keysize = 2 * AES_MAX_KEY_SIZE,
> -			.setkey	 = p8_aes_xts_setkey,
> -			.encrypt = p8_aes_xts_encrypt,
> -			.decrypt = p8_aes_xts_decrypt,
> -	}
> -};
> diff --git a/drivers/crypto/vmx/aesp8-ppc.h b/drivers/crypto/vmx/aesp8-ppc.h
> deleted file mode 100644
> index 01972e16a6c0..000000000000
> --- a/drivers/crypto/vmx/aesp8-ppc.h
> +++ /dev/null
> @@ -1,25 +0,0 @@
> -#include <linux/types.h>
> -#include <crypto/aes.h>
> -
> -#define AES_BLOCK_MASK  (~(AES_BLOCK_SIZE-1))
> -
> -struct aes_key {
> -	u8 key[AES_MAX_KEYLENGTH];
> -	int rounds;
> -};
> -
> -int aes_p8_set_encrypt_key(const u8 *userKey, const int bits,
> -			   struct aes_key *key);
> -int aes_p8_set_decrypt_key(const u8 *userKey, const int bits,
> -			   struct aes_key *key);
> -void aes_p8_encrypt(const u8 *in, u8 *out, const struct aes_key *key);
> -void aes_p8_decrypt(const u8 *in, u8 *out, const struct aes_key *key);
> -void aes_p8_cbc_encrypt(const u8 *in, u8 *out, size_t len,
> -			const struct aes_key *key, u8 *iv, const int enc);
> -void aes_p8_ctr32_encrypt_blocks(const u8 *in, u8 *out,
> -				 size_t len, const struct aes_key *key,
> -				 const u8 *iv);
> -void aes_p8_xts_encrypt(const u8 *in, u8 *out, size_t len,
> -			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
> -void aes_p8_xts_decrypt(const u8 *in, u8 *out, size_t len,
> -			const struct aes_key *key1, const struct aes_key *key2, u8 *iv);
> diff --git a/drivers/crypto/vmx/aesp8-ppc.pl b/drivers/crypto/vmx/aesp8-ppc.pl
> deleted file mode 100644
> index 0b4a293b8a1e..000000000000
> --- a/drivers/crypto/vmx/aesp8-ppc.pl
> +++ /dev/null
> @@ -1,3789 +0,0 @@
> -#! /usr/bin/env perl
> -# Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
> -#
> -# Licensed under the OpenSSL license (the "License").  You may not use
> -# this file except in compliance with the License.  You can obtain a copy
> -# in the file LICENSE in the source distribution or at
> -# https://www.openssl.org/source/license.html
> -
> -#
> -# ====================================================================
> -# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
> -# project. The module is, however, dual licensed under OpenSSL and
> -# CRYPTOGAMS licenses depending on where you obtain it. For further
> -# details see http://www.openssl.org/~appro/cryptogams/.
> -# ====================================================================
> -#
> -# This module implements support for AES instructions as per PowerISA
> -# specification version 2.07, first implemented by POWER8 processor.
> -# The module is endian-agnostic in sense that it supports both big-
> -# and little-endian cases. Data alignment in parallelizable modes is
> -# handled with VSX loads and stores, which implies MSR.VSX flag being
> -# set. It should also be noted that ISA specification doesn't prohibit
> -# alignment exceptions for these instructions on page boundaries.
> -# Initially alignment was handled in pure AltiVec/VMX way [when data
> -# is aligned programmatically, which in turn guarantees exception-
> -# free execution], but it turned to hamper performance when vcipher
> -# instructions are interleaved. It's reckoned that eventual
> -# misalignment penalties at page boundaries are in average lower
> -# than additional overhead in pure AltiVec approach.
> -#
> -# May 2016
> -#
> -# Add XTS subroutine, 9x on little- and 12x improvement on big-endian
> -# systems were measured.
> -#
> -######################################################################
> -# Current large-block performance in cycles per byte processed with
> -# 128-bit key (less is better).
> -#
> -#		CBC en-/decrypt	CTR	XTS
> -# POWER8[le]	3.96/0.72	0.74	1.1
> -# POWER8[be]	3.75/0.65	0.66	1.0
> -
> -$flavour = shift;
> -
> -if ($flavour =~ /64/) {
> -	$SIZE_T	=8;
> -	$LRSAVE	=2*$SIZE_T;
> -	$STU	="stdu";
> -	$POP	="ld";
> -	$PUSH	="std";
> -	$UCMP	="cmpld";
> -	$SHL	="sldi";
> -} elsif ($flavour =~ /32/) {
> -	$SIZE_T	=4;
> -	$LRSAVE	=$SIZE_T;
> -	$STU	="stwu";
> -	$POP	="lwz";
> -	$PUSH	="stw";
> -	$UCMP	="cmplw";
> -	$SHL	="slwi";
> -} else { die "nonsense $flavour"; }
> -
> -$LITTLE_ENDIAN = ($flavour=~/le$/) ? $SIZE_T : 0;
> -
> -$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
> -( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
> -( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
> -die "can't locate ppc-xlate.pl";
> -
> -open STDOUT,"| $^X $xlate $flavour ".shift || die "can't call $xlate: $!";
> -
> -$FRAME=8*$SIZE_T;
> -$prefix="aes_p8";
> -
> -$sp="r1";
> -$vrsave="r12";
> -
> -#########################################################################
> -{{{	# Key setup procedures						#
> -my ($inp,$bits,$out,$ptr,$cnt,$rounds)=map("r$_",(3..8));
> -my ($zero,$in0,$in1,$key,$rcon,$mask,$tmp)=map("v$_",(0..6));
> -my ($stage,$outperm,$outmask,$outhead,$outtail)=map("v$_",(7..11));
> -
> -$code.=<<___;
> -.machine	"any"
> -
> -.text
> -
> -.align	7
> -rcon:
> -.long	0x01000000, 0x01000000, 0x01000000, 0x01000000	?rev
> -.long	0x1b000000, 0x1b000000, 0x1b000000, 0x1b000000	?rev
> -.long	0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c, 0x0d0e0f0c	?rev
> -.long	0,0,0,0						?asis
> -Lconsts:
> -	mflr	r0
> -	bcl	20,31,\$+4
> -	mflr	$ptr	 #vvvvv "distance between . and rcon
> -	addi	$ptr,$ptr,-0x48
> -	mtlr	r0
> -	blr
> -	.long	0
> -	.byte	0,12,0x14,0,0,0,0,0
> -.asciz	"AES for PowerISA 2.07, CRYPTOGAMS by <appro\@openssl.org>"
> -
> -.globl	.${prefix}_set_encrypt_key
> -Lset_encrypt_key:
> -	mflr		r11
> -	$PUSH		r11,$LRSAVE($sp)
> -
> -	li		$ptr,-1
> -	${UCMP}i	$inp,0
> -	beq-		Lenc_key_abort		# if ($inp==0) return -1;
> -	${UCMP}i	$out,0
> -	beq-		Lenc_key_abort		# if ($out==0) return -1;
> -	li		$ptr,-2
> -	cmpwi		$bits,128
> -	blt-		Lenc_key_abort
> -	cmpwi		$bits,256
> -	bgt-		Lenc_key_abort
> -	andi.		r0,$bits,0x3f
> -	bne-		Lenc_key_abort
> -
> -	lis		r0,0xfff0
> -	mfspr		$vrsave,256
> -	mtspr		256,r0
> -
> -	bl		Lconsts
> -	mtlr		r11
> -
> -	neg		r9,$inp
> -	lvx		$in0,0,$inp
> -	addi		$inp,$inp,15		# 15 is not typo
> -	lvsr		$key,0,r9		# borrow $key
> -	li		r8,0x20
> -	cmpwi		$bits,192
> -	lvx		$in1,0,$inp
> -	le?vspltisb	$mask,0x0f		# borrow $mask
> -	lvx		$rcon,0,$ptr
> -	le?vxor		$key,$key,$mask		# adjust for byte swap
> -	lvx		$mask,r8,$ptr
> -	addi		$ptr,$ptr,0x10
> -	vperm		$in0,$in0,$in1,$key	# align [and byte swap in LE]
> -	li		$cnt,8
> -	vxor		$zero,$zero,$zero
> -	mtctr		$cnt
> -
> -	?lvsr		$outperm,0,$out
> -	vspltisb	$outmask,-1
> -	lvx		$outhead,0,$out
> -	?vperm		$outmask,$zero,$outmask,$outperm
> -
> -	blt		Loop128
> -	addi		$inp,$inp,8
> -	beq		L192
> -	addi		$inp,$inp,8
> -	b		L256
> -
> -.align	4
> -Loop128:
> -	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vcipherlast	$key,$key,$rcon
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	 vadduwm	$rcon,$rcon,$rcon
> -	vxor		$in0,$in0,$key
> -	bdnz		Loop128
> -
> -	lvx		$rcon,0,$ptr		# last two round keys
> -
> -	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vcipherlast	$key,$key,$rcon
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	 vadduwm	$rcon,$rcon,$rcon
> -	vxor		$in0,$in0,$key
> -
> -	vperm		$key,$in0,$in0,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vcipherlast	$key,$key,$rcon
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vxor		$in0,$in0,$key
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	 stvx		$stage,0,$out
> -
> -	addi		$inp,$out,15		# 15 is not typo
> -	addi		$out,$out,0x50
> -
> -	li		$rounds,10
> -	b		Ldone
> -
> -.align	4
> -L192:
> -	lvx		$tmp,0,$inp
> -	li		$cnt,4
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -	vperm		$in1,$in1,$tmp,$key	# align [and byte swap in LE]
> -	vspltisb	$key,8			# borrow $key
> -	mtctr		$cnt
> -	vsububm		$mask,$mask,$key	# adjust the mask
> -
> -Loop192:
> -	vperm		$key,$in1,$in1,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	vcipherlast	$key,$key,$rcon
> -
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -
> -	 vsldoi		$stage,$zero,$in1,8
> -	vspltw		$tmp,$in0,3
> -	vxor		$tmp,$tmp,$in1
> -	vsldoi		$in1,$zero,$in1,12	# >>32
> -	 vadduwm	$rcon,$rcon,$rcon
> -	vxor		$in1,$in1,$tmp
> -	vxor		$in0,$in0,$key
> -	vxor		$in1,$in1,$key
> -	 vsldoi		$stage,$stage,$in0,8
> -
> -	vperm		$key,$in1,$in1,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	 vperm		$outtail,$stage,$stage,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vcipherlast	$key,$key,$rcon
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	 vsldoi		$stage,$in0,$in1,8
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	 vperm		$outtail,$stage,$stage,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	vspltw		$tmp,$in0,3
> -	vxor		$tmp,$tmp,$in1
> -	vsldoi		$in1,$zero,$in1,12	# >>32
> -	 vadduwm	$rcon,$rcon,$rcon
> -	vxor		$in1,$in1,$tmp
> -	vxor		$in0,$in0,$key
> -	vxor		$in1,$in1,$key
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	 stvx		$stage,0,$out
> -	 addi		$inp,$out,15		# 15 is not typo
> -	 addi		$out,$out,16
> -	bdnz		Loop192
> -
> -	li		$rounds,12
> -	addi		$out,$out,0x20
> -	b		Ldone
> -
> -.align	4
> -L256:
> -	lvx		$tmp,0,$inp
> -	li		$cnt,7
> -	li		$rounds,14
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -	vperm		$in1,$in1,$tmp,$key	# align [and byte swap in LE]
> -	mtctr		$cnt
> -
> -Loop256:
> -	vperm		$key,$in1,$in1,$mask	# rotate-n-splat
> -	vsldoi		$tmp,$zero,$in0,12	# >>32
> -	 vperm		$outtail,$in1,$in1,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	vcipherlast	$key,$key,$rcon
> -	 stvx		$stage,0,$out
> -	 addi		$out,$out,16
> -
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in0,$in0,$tmp
> -	 vadduwm	$rcon,$rcon,$rcon
> -	vxor		$in0,$in0,$key
> -	 vperm		$outtail,$in0,$in0,$outperm	# rotate
> -	 vsel		$stage,$outhead,$outtail,$outmask
> -	 vmr		$outhead,$outtail
> -	 stvx		$stage,0,$out
> -	 addi		$inp,$out,15		# 15 is not typo
> -	 addi		$out,$out,16
> -	bdz		Ldone
> -
> -	vspltw		$key,$in0,3		# just splat
> -	vsldoi		$tmp,$zero,$in1,12	# >>32
> -	vsbox		$key,$key
> -
> -	vxor		$in1,$in1,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in1,$in1,$tmp
> -	vsldoi		$tmp,$zero,$tmp,12	# >>32
> -	vxor		$in1,$in1,$tmp
> -
> -	vxor		$in1,$in1,$key
> -	b		Loop256
> -
> -.align	4
> -Ldone:
> -	lvx		$in1,0,$inp		# redundant in aligned case
> -	vsel		$in1,$outhead,$in1,$outmask
> -	stvx		$in1,0,$inp
> -	li		$ptr,0
> -	mtspr		256,$vrsave
> -	stw		$rounds,0($out)
> -
> -Lenc_key_abort:
> -	mr		r3,$ptr
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,1,0,0,3,0
> -	.long		0
> -.size	.${prefix}_set_encrypt_key,.-.${prefix}_set_encrypt_key
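For readers without the PowerISA manual at hand: the routine above is a vectorized form of the standard FIPS-197 key expansion (RotWord, SubWord via vcipherlast tricks, rcon doubling). A plain-C sketch of the same schedule for the 128-bit case follows — illustrative only, with hypothetical helper names, not code from this driver; the S-box is derived from first principles rather than tabulated:

```c
#include <stdint.h>

/* GF(2^8) multiply modulo the AES polynomial x^8+x^4+x^3+x+1 */
static uint8_t gmul(uint8_t a, uint8_t b)
{
	uint8_t p = 0;
	while (b) {
		if (b & 1)
			p ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
		b >>= 1;
	}
	return p;
}

/* AES S-box: multiplicative inverse in GF(2^8), then the affine map */
static uint8_t sbox(uint8_t x)
{
	uint8_t inv = 0, s;
	int i;

	for (i = 1; i < 256; i++)		/* brute-force inverse; inv(0)=0 */
		if (gmul((uint8_t)i, x) == 1) {
			inv = (uint8_t)i;
			break;
		}
	s = inv;
	for (i = 1; i <= 4; i++)		/* xor of four byte-rotations */
		s ^= (uint8_t)((inv << i) | (inv >> (8 - i)));
	return s ^ 0x63;
}

static uint32_t subword(uint32_t w)
{
	return ((uint32_t)sbox(w >> 24) << 24) |
	       ((uint32_t)sbox((w >> 16) & 0xff) << 16) |
	       ((uint32_t)sbox((w >> 8) & 0xff) << 8) |
	       sbox(w & 0xff);
}

/* FIPS-197 AES-128 key expansion: 16-byte key -> 44 round-key words */
void aes128_expand_key(const uint8_t key[16], uint32_t w[44])
{
	uint8_t rcon = 0x01;
	int i;

	for (i = 0; i < 4; i++)
		w[i] = ((uint32_t)key[4*i] << 24) | ((uint32_t)key[4*i+1] << 16) |
		       ((uint32_t)key[4*i+2] << 8) | key[4*i+3];
	for (i = 4; i < 44; i++) {
		uint32_t t = w[i - 1];
		if (i % 4 == 0) {		/* RotWord, SubWord, rcon */
			t = subword((t << 8) | (t >> 24)) ^ ((uint32_t)rcon << 24);
			rcon = gmul(rcon, 2);	/* matches vadduwm rcon doubling */
		}
		w[i] = w[i - 4] ^ t;
	}
}
```

The asm's `vadduwm $rcon,$rcon,$rcon` corresponds to the `gmul(rcon, 2)` step, and Loop128's vsldoi/vxor chain is the `w[i-4] ^ t` recurrence done four words at a time.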
> -
> -.globl	.${prefix}_set_decrypt_key
> -	$STU		$sp,-$FRAME($sp)
> -	mflr		r10
> -	$PUSH		r10,$FRAME+$LRSAVE($sp)
> -	bl		Lset_encrypt_key
> -	mtlr		r10
> -
> -	cmpwi		r3,0
> -	bne-		Ldec_key_abort
> -
> -	slwi		$cnt,$rounds,4
> -	subi		$inp,$out,240		# first round key
> -	srwi		$rounds,$rounds,1
> -	add		$out,$inp,$cnt		# last round key
> -	mtctr		$rounds
> -
> -Ldeckey:
> -	lwz		r0, 0($inp)
> -	lwz		r6, 4($inp)
> -	lwz		r7, 8($inp)
> -	lwz		r8, 12($inp)
> -	addi		$inp,$inp,16
> -	lwz		r9, 0($out)
> -	lwz		r10,4($out)
> -	lwz		r11,8($out)
> -	lwz		r12,12($out)
> -	stw		r0, 0($out)
> -	stw		r6, 4($out)
> -	stw		r7, 8($out)
> -	stw		r8, 12($out)
> -	subi		$out,$out,16
> -	stw		r9, -16($inp)
> -	stw		r10,-12($inp)
> -	stw		r11,-8($inp)
> -	stw		r12,-4($inp)
> -	bdnz		Ldeckey
> -
> -	xor		r3,r3,r3		# return value
> -Ldec_key_abort:
> -	addi		$sp,$sp,$FRAME
> -	blr
> -	.long		0
> -	.byte		0,12,4,1,0x80,0,3,0
> -	.long		0
> -.size	.${prefix}_set_decrypt_key,.-.${prefix}_set_decrypt_key
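Note that the Ldeckey loop above does no cryptographic work at all: it only swaps the (rounds+1) 16-byte round keys end-for-end, because the vncipher path consumes the encryption schedule back-to-front. A C sketch of that in-place reversal (function name is illustrative, not the driver's):

```c
#include <stdint.h>

/*
 * Swap the (rounds+1) 16-byte round keys end-for-end, in place,
 * mirroring the Ldeckey word-swap loop: pairs move inward from both
 * ends, and for an odd key count the middle round key stays put.
 * w[] holds the schedule as 4*(rounds+1) 32-bit words.
 */
void reverse_round_keys(uint32_t w[], int rounds)
{
	int i, j, k;

	for (i = 0, j = 4 * rounds; i < j; i += 4, j -= 4)
		for (k = 0; k < 4; k++) {
			uint32_t t = w[i + k];

			w[i + k] = w[j + k];
			w[j + k] = t;
		}
}
```

For AES-128 (rounds = 10) this performs the same five swaps as the asm's `mtctr` loop, leaving round key 5 in place.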
> -___
> -}}}
> -#########################################################################
> -{{{	# Single block en- and decrypt procedures			#
> -sub gen_block () {
> -my $dir = shift;
> -my $n   = $dir eq "de" ? "n" : "";
> -my ($inp,$out,$key,$rounds,$idx)=map("r$_",(3..7));
> -
> -$code.=<<___;
> -.globl	.${prefix}_${dir}crypt
> -	lwz		$rounds,240($key)
> -	lis		r0,0xfc00
> -	mfspr		$vrsave,256
> -	li		$idx,15			# 15 is not typo
> -	mtspr		256,r0
> -
> -	lvx		v0,0,$inp
> -	neg		r11,$out
> -	lvx		v1,$idx,$inp
> -	lvsl		v2,0,$inp		# inpperm
> -	le?vspltisb	v4,0x0f
> -	?lvsl		v3,0,r11		# outperm
> -	le?vxor		v2,v2,v4
> -	li		$idx,16
> -	vperm		v0,v0,v1,v2		# align [and byte swap in LE]
> -	lvx		v1,0,$key
> -	?lvsl		v5,0,$key		# keyperm
> -	srwi		$rounds,$rounds,1
> -	lvx		v2,$idx,$key
> -	addi		$idx,$idx,16
> -	subi		$rounds,$rounds,1
> -	?vperm		v1,v1,v2,v5		# align round key
> -
> -	vxor		v0,v0,v1
> -	lvx		v1,$idx,$key
> -	addi		$idx,$idx,16
> -	mtctr		$rounds
> -
> -Loop_${dir}c:
> -	?vperm		v2,v2,v1,v5
> -	v${n}cipher	v0,v0,v2
> -	lvx		v2,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		v1,v1,v2,v5
> -	v${n}cipher	v0,v0,v1
> -	lvx		v1,$idx,$key
> -	addi		$idx,$idx,16
> -	bdnz		Loop_${dir}c
> -
> -	?vperm		v2,v2,v1,v5
> -	v${n}cipher	v0,v0,v2
> -	lvx		v2,$idx,$key
> -	?vperm		v1,v1,v2,v5
> -	v${n}cipherlast	v0,v0,v1
> -
> -	vspltisb	v2,-1
> -	vxor		v1,v1,v1
> -	li		$idx,15			# 15 is not typo
> -	?vperm		v2,v1,v2,v3		# outmask
> -	le?vxor		v3,v3,v4
> -	lvx		v1,0,$out		# outhead
> -	vperm		v0,v0,v0,v3		# rotate [and byte swap in LE]
> -	vsel		v1,v1,v0,v2
> -	lvx		v4,$idx,$out
> -	stvx		v1,0,$out
> -	vsel		v0,v0,v4,v2
> -	stvx		v0,$idx,$out
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,3,0
> -	.long		0
> -.size	.${prefix}_${dir}crypt,.-.${prefix}_${dir}crypt
> -___
> -}
> -&gen_block("en");
> -&gen_block("de");
> -}}}
> -#########################################################################
> -{{{	# CBC en- and decrypt procedures				#
> -my ($inp,$out,$len,$key,$ivp,$enc,$rounds,$idx)=map("r$_",(3..10));
> -my ($rndkey0,$rndkey1,$inout,$tmp)=		map("v$_",(0..3));
> -my ($ivec,$inptail,$inpperm,$outhead,$outperm,$outmask,$keyperm)=
> -						map("v$_",(4..10));
> -$code.=<<___;
> -.globl	.${prefix}_cbc_encrypt
> -	${UCMP}i	$len,16
> -	bltlr-
> -
> -	cmpwi		$enc,0			# test direction
> -	lis		r0,0xffe0
> -	mfspr		$vrsave,256
> -	mtspr		256,r0
> -
> -	li		$idx,15
> -	vxor		$rndkey0,$rndkey0,$rndkey0
> -	le?vspltisb	$tmp,0x0f
> -
> -	lvx		$ivec,0,$ivp		# load [unaligned] iv
> -	lvsl		$inpperm,0,$ivp
> -	lvx		$inptail,$idx,$ivp
> -	le?vxor		$inpperm,$inpperm,$tmp
> -	vperm		$ivec,$ivec,$inptail,$inpperm
> -
> -	neg		r11,$inp
> -	?lvsl		$keyperm,0,$key		# prepare for unaligned key
> -	lwz		$rounds,240($key)
> -
> -	lvsr		$inpperm,0,r11		# prepare for unaligned load
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,15		# 15 is not typo
> -	le?vxor		$inpperm,$inpperm,$tmp
> -
> -	?lvsr		$outperm,0,$out		# prepare for unaligned store
> -	vspltisb	$outmask,-1
> -	lvx		$outhead,0,$out
> -	?vperm		$outmask,$rndkey0,$outmask,$outperm
> -	le?vxor		$outperm,$outperm,$tmp
> -
> -	srwi		$rounds,$rounds,1
> -	li		$idx,16
> -	subi		$rounds,$rounds,1
> -	beq		Lcbc_dec
> -
> -Lcbc_enc:
> -	vmr		$inout,$inptail
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -	mtctr		$rounds
> -	subi		$len,$len,16		# len-=16
> -
> -	lvx		$rndkey0,0,$key
> -	 vperm		$inout,$inout,$inptail,$inpperm
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -	vxor		$inout,$inout,$ivec
> -
> -Loop_cbc_enc:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipher		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -	bdnz		Loop_cbc_enc
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	li		$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipherlast	$ivec,$inout,$rndkey0
> -	${UCMP}i	$len,16
> -
> -	vperm		$tmp,$ivec,$ivec,$outperm
> -	vsel		$inout,$outhead,$tmp,$outmask
> -	vmr		$outhead,$tmp
> -	stvx		$inout,0,$out
> -	addi		$out,$out,16
> -	bge		Lcbc_enc
> -
> -	b		Lcbc_done
> -
> -.align	4
> -Lcbc_dec:
> -	${UCMP}i	$len,128
> -	bge		_aesp8_cbc_decrypt8x
> -	vmr		$tmp,$inptail
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -	mtctr		$rounds
> -	subi		$len,$len,16		# len-=16
> -
> -	lvx		$rndkey0,0,$key
> -	 vperm		$tmp,$tmp,$inptail,$inpperm
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$tmp,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -
> -Loop_cbc_dec:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vncipher	$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -	bdnz		Loop_cbc_dec
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	li		$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vncipherlast	$inout,$inout,$rndkey0
> -	${UCMP}i	$len,16
> -
> -	vxor		$inout,$inout,$ivec
> -	vmr		$ivec,$tmp
> -	vperm		$tmp,$inout,$inout,$outperm
> -	vsel		$inout,$outhead,$tmp,$outmask
> -	vmr		$outhead,$tmp
> -	stvx		$inout,0,$out
> -	addi		$out,$out,16
> -	bge		Lcbc_dec
> -
> -Lcbc_done:
> -	addi		$out,$out,-1
> -	lvx		$inout,0,$out		# redundant in aligned case
> -	vsel		$inout,$outhead,$inout,$outmask
> -	stvx		$inout,0,$out
> -
> -	neg		$enc,$ivp		# write [unaligned] iv
> -	li		$idx,15			# 15 is not typo
> -	vxor		$rndkey0,$rndkey0,$rndkey0
> -	vspltisb	$outmask,-1
> -	le?vspltisb	$tmp,0x0f
> -	?lvsl		$outperm,0,$enc
> -	?vperm		$outmask,$rndkey0,$outmask,$outperm
> -	le?vxor		$outperm,$outperm,$tmp
> -	lvx		$outhead,0,$ivp
> -	vperm		$ivec,$ivec,$ivec,$outperm
> -	vsel		$inout,$outhead,$ivec,$outmask
> -	lvx		$inptail,$idx,$ivp
> -	stvx		$inout,0,$ivp
> -	vsel		$inout,$ivec,$inptail,$outmask
> -	stvx		$inout,$idx,$ivp
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,6,0
> -	.long		0
> -___
> -#########################################################################
> -{{	# Optimized CBC decrypt procedure				#
> -my $key_="r11";
> -my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,8,26..31));
> -my ($in0, $in1, $in2, $in3, $in4, $in5, $in6, $in7 )=map("v$_",(0..3,10..13));
> -my ($out0,$out1,$out2,$out3,$out4,$out5,$out6,$out7)=map("v$_",(14..21));
> -my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
> -			# v26-v31 last 6 round keys
> -my ($tmp,$keyperm)=($in3,$in4);	# aliases with "caller", redundant assignment
> -
> -$code.=<<___;
> -.align	5
> -_aesp8_cbc_decrypt8x:
> -	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
> -	li		r10,`$FRAME+8*16+15`
> -	li		r11,`$FRAME+8*16+31`
> -	stvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	stvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v30,r10,$sp
> -	stvx		v31,r11,$sp
> -	li		r0,-1
> -	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
> -	li		$x10,0x10
> -	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	li		$x20,0x20
> -	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	li		$x30,0x30
> -	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	li		$x40,0x40
> -	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	li		$x50,0x50
> -	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	li		$x60,0x60
> -	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	li		$x70,0x70
> -	mtspr		256,r0
> -
> -	subi		$rounds,$rounds,3	# -4 in total
> -	subi		$len,$len,128		# bias
> -
> -	lvx		$rndkey0,$x00,$key	# load key schedule
> -	lvx		v30,$x10,$key
> -	addi		$key,$key,0x20
> -	lvx		v31,$x00,$key
> -	?vperm		$rndkey0,$rndkey0,v30,$keyperm
> -	addi		$key_,$sp,$FRAME+15
> -	mtctr		$rounds
> -
> -Load_cbc_dec_key:
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v30,$x10,$key
> -	addi		$key,$key,0x20
> -	stvx		v24,$x00,$key_		# off-load round[1]
> -	?vperm		v25,v31,v30,$keyperm
> -	lvx		v31,$x00,$key
> -	stvx		v25,$x10,$key_		# off-load round[2]
> -	addi		$key_,$key_,0x20
> -	bdnz		Load_cbc_dec_key
> -
> -	lvx		v26,$x10,$key
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v27,$x20,$key
> -	stvx		v24,$x00,$key_		# off-load round[3]
> -	?vperm		v25,v31,v26,$keyperm
> -	lvx		v28,$x30,$key
> -	stvx		v25,$x10,$key_		# off-load round[4]
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	?vperm		v26,v26,v27,$keyperm
> -	lvx		v29,$x40,$key
> -	?vperm		v27,v27,v28,$keyperm
> -	lvx		v30,$x50,$key
> -	?vperm		v28,v28,v29,$keyperm
> -	lvx		v31,$x60,$key
> -	?vperm		v29,v29,v30,$keyperm
> -	lvx		$out0,$x70,$key		# borrow $out0
> -	?vperm		v30,v30,v31,$keyperm
> -	lvx		v24,$x00,$key_		# pre-load round[1]
> -	?vperm		v31,v31,$out0,$keyperm
> -	lvx		v25,$x10,$key_		# pre-load round[2]
> -
> -	#lvx		$inptail,0,$inp		# "caller" already did this
> -	#addi		$inp,$inp,15		# 15 is not typo
> -	subi		$inp,$inp,15		# undo "caller"
> -
> -	 le?li		$idx,8
> -	lvx_u		$in0,$x00,$inp		# load first 8 "words"
> -	 le?lvsl	$inpperm,0,$idx
> -	 le?vspltisb	$tmp,0x0f
> -	lvx_u		$in1,$x10,$inp
> -	 le?vxor	$inpperm,$inpperm,$tmp	# transform for lvx_u/stvx_u
> -	lvx_u		$in2,$x20,$inp
> -	 le?vperm	$in0,$in0,$in0,$inpperm
> -	lvx_u		$in3,$x30,$inp
> -	 le?vperm	$in1,$in1,$in1,$inpperm
> -	lvx_u		$in4,$x40,$inp
> -	 le?vperm	$in2,$in2,$in2,$inpperm
> -	vxor		$out0,$in0,$rndkey0
> -	lvx_u		$in5,$x50,$inp
> -	 le?vperm	$in3,$in3,$in3,$inpperm
> -	vxor		$out1,$in1,$rndkey0
> -	lvx_u		$in6,$x60,$inp
> -	 le?vperm	$in4,$in4,$in4,$inpperm
> -	vxor		$out2,$in2,$rndkey0
> -	lvx_u		$in7,$x70,$inp
> -	addi		$inp,$inp,0x80
> -	 le?vperm	$in5,$in5,$in5,$inpperm
> -	vxor		$out3,$in3,$rndkey0
> -	 le?vperm	$in6,$in6,$in6,$inpperm
> -	vxor		$out4,$in4,$rndkey0
> -	 le?vperm	$in7,$in7,$in7,$inpperm
> -	vxor		$out5,$in5,$rndkey0
> -	vxor		$out6,$in6,$rndkey0
> -	vxor		$out7,$in7,$rndkey0
> -
> -	mtctr		$rounds
> -	b		Loop_cbc_dec8x
> -.align	5
> -Loop_cbc_dec8x:
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -	vncipher	$out6,$out6,v24
> -	vncipher	$out7,$out7,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -	vncipher	$out6,$out6,v25
> -	vncipher	$out7,$out7,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_cbc_dec8x
> -
> -	subic		$len,$len,128		# $len-=128
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -	vncipher	$out6,$out6,v24
> -	vncipher	$out7,$out7,v24
> -
> -	subfe.		r0,r0,r0		# borrow?-1:0
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -	vncipher	$out6,$out6,v25
> -	vncipher	$out7,$out7,v25
> -
> -	and		r0,r0,$len
> -	vncipher	$out0,$out0,v26
> -	vncipher	$out1,$out1,v26
> -	vncipher	$out2,$out2,v26
> -	vncipher	$out3,$out3,v26
> -	vncipher	$out4,$out4,v26
> -	vncipher	$out5,$out5,v26
> -	vncipher	$out6,$out6,v26
> -	vncipher	$out7,$out7,v26
> -
> -	add		$inp,$inp,r0		# $inp is adjusted in such
> -						# way that at exit from the
> -						# loop inX-in7 are loaded
> -						# with last "words"
> -	vncipher	$out0,$out0,v27
> -	vncipher	$out1,$out1,v27
> -	vncipher	$out2,$out2,v27
> -	vncipher	$out3,$out3,v27
> -	vncipher	$out4,$out4,v27
> -	vncipher	$out5,$out5,v27
> -	vncipher	$out6,$out6,v27
> -	vncipher	$out7,$out7,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vncipher	$out0,$out0,v28
> -	vncipher	$out1,$out1,v28
> -	vncipher	$out2,$out2,v28
> -	vncipher	$out3,$out3,v28
> -	vncipher	$out4,$out4,v28
> -	vncipher	$out5,$out5,v28
> -	vncipher	$out6,$out6,v28
> -	vncipher	$out7,$out7,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -
> -	vncipher	$out0,$out0,v29
> -	vncipher	$out1,$out1,v29
> -	vncipher	$out2,$out2,v29
> -	vncipher	$out3,$out3,v29
> -	vncipher	$out4,$out4,v29
> -	vncipher	$out5,$out5,v29
> -	vncipher	$out6,$out6,v29
> -	vncipher	$out7,$out7,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -
> -	vncipher	$out0,$out0,v30
> -	 vxor		$ivec,$ivec,v31		# xor with last round key
> -	vncipher	$out1,$out1,v30
> -	 vxor		$in0,$in0,v31
> -	vncipher	$out2,$out2,v30
> -	 vxor		$in1,$in1,v31
> -	vncipher	$out3,$out3,v30
> -	 vxor		$in2,$in2,v31
> -	vncipher	$out4,$out4,v30
> -	 vxor		$in3,$in3,v31
> -	vncipher	$out5,$out5,v30
> -	 vxor		$in4,$in4,v31
> -	vncipher	$out6,$out6,v30
> -	 vxor		$in5,$in5,v31
> -	vncipher	$out7,$out7,v30
> -	 vxor		$in6,$in6,v31
> -
> -	vncipherlast	$out0,$out0,$ivec
> -	vncipherlast	$out1,$out1,$in0
> -	 lvx_u		$in0,$x00,$inp		# load next input block
> -	vncipherlast	$out2,$out2,$in1
> -	 lvx_u		$in1,$x10,$inp
> -	vncipherlast	$out3,$out3,$in2
> -	 le?vperm	$in0,$in0,$in0,$inpperm
> -	 lvx_u		$in2,$x20,$inp
> -	vncipherlast	$out4,$out4,$in3
> -	 le?vperm	$in1,$in1,$in1,$inpperm
> -	 lvx_u		$in3,$x30,$inp
> -	vncipherlast	$out5,$out5,$in4
> -	 le?vperm	$in2,$in2,$in2,$inpperm
> -	 lvx_u		$in4,$x40,$inp
> -	vncipherlast	$out6,$out6,$in5
> -	 le?vperm	$in3,$in3,$in3,$inpperm
> -	 lvx_u		$in5,$x50,$inp
> -	vncipherlast	$out7,$out7,$in6
> -	 le?vperm	$in4,$in4,$in4,$inpperm
> -	 lvx_u		$in6,$x60,$inp
> -	vmr		$ivec,$in7
> -	 le?vperm	$in5,$in5,$in5,$inpperm
> -	 lvx_u		$in7,$x70,$inp
> -	 addi		$inp,$inp,0x80
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	 le?vperm	$in6,$in6,$in6,$inpperm
> -	 vxor		$out0,$in0,$rndkey0
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	 le?vperm	$in7,$in7,$in7,$inpperm
> -	 vxor		$out1,$in1,$rndkey0
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	 vxor		$out2,$in2,$rndkey0
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x30,$out
> -	 vxor		$out3,$in3,$rndkey0
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x40,$out
> -	 vxor		$out4,$in4,$rndkey0
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x50,$out
> -	 vxor		$out5,$in5,$rndkey0
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x60,$out
> -	 vxor		$out6,$in6,$rndkey0
> -	stvx_u		$out7,$x70,$out
> -	addi		$out,$out,0x80
> -	 vxor		$out7,$in7,$rndkey0
> -
> -	mtctr		$rounds
> -	beq		Loop_cbc_dec8x		# did $len-=128 borrow?
> -
> -	addic.		$len,$len,128
> -	beq		Lcbc_dec8x_done
> -	nop
> -	nop
> -
> -Loop_cbc_dec8x_tail:				# up to 7 "words" tail...
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -	vncipher	$out6,$out6,v24
> -	vncipher	$out7,$out7,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -	vncipher	$out6,$out6,v25
> -	vncipher	$out7,$out7,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_cbc_dec8x_tail
> -
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -	vncipher	$out6,$out6,v24
> -	vncipher	$out7,$out7,v24
> -
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -	vncipher	$out6,$out6,v25
> -	vncipher	$out7,$out7,v25
> -
> -	vncipher	$out1,$out1,v26
> -	vncipher	$out2,$out2,v26
> -	vncipher	$out3,$out3,v26
> -	vncipher	$out4,$out4,v26
> -	vncipher	$out5,$out5,v26
> -	vncipher	$out6,$out6,v26
> -	vncipher	$out7,$out7,v26
> -
> -	vncipher	$out1,$out1,v27
> -	vncipher	$out2,$out2,v27
> -	vncipher	$out3,$out3,v27
> -	vncipher	$out4,$out4,v27
> -	vncipher	$out5,$out5,v27
> -	vncipher	$out6,$out6,v27
> -	vncipher	$out7,$out7,v27
> -
> -	vncipher	$out1,$out1,v28
> -	vncipher	$out2,$out2,v28
> -	vncipher	$out3,$out3,v28
> -	vncipher	$out4,$out4,v28
> -	vncipher	$out5,$out5,v28
> -	vncipher	$out6,$out6,v28
> -	vncipher	$out7,$out7,v28
> -
> -	vncipher	$out1,$out1,v29
> -	vncipher	$out2,$out2,v29
> -	vncipher	$out3,$out3,v29
> -	vncipher	$out4,$out4,v29
> -	vncipher	$out5,$out5,v29
> -	vncipher	$out6,$out6,v29
> -	vncipher	$out7,$out7,v29
> -
> -	vncipher	$out1,$out1,v30
> -	 vxor		$ivec,$ivec,v31		# last round key
> -	vncipher	$out2,$out2,v30
> -	 vxor		$in1,$in1,v31
> -	vncipher	$out3,$out3,v30
> -	 vxor		$in2,$in2,v31
> -	vncipher	$out4,$out4,v30
> -	 vxor		$in3,$in3,v31
> -	vncipher	$out5,$out5,v30
> -	 vxor		$in4,$in4,v31
> -	vncipher	$out6,$out6,v30
> -	 vxor		$in5,$in5,v31
> -	vncipher	$out7,$out7,v30
> -	 vxor		$in6,$in6,v31
> -
> -	cmplwi		$len,32			# switch($len)
> -	blt		Lcbc_dec8x_one
> -	nop
> -	beq		Lcbc_dec8x_two
> -	cmplwi		$len,64
> -	blt		Lcbc_dec8x_three
> -	nop
> -	beq		Lcbc_dec8x_four
> -	cmplwi		$len,96
> -	blt		Lcbc_dec8x_five
> -	nop
> -	beq		Lcbc_dec8x_six
> -
> -Lcbc_dec8x_seven:
> -	vncipherlast	$out1,$out1,$ivec
> -	vncipherlast	$out2,$out2,$in1
> -	vncipherlast	$out3,$out3,$in2
> -	vncipherlast	$out4,$out4,$in3
> -	vncipherlast	$out5,$out5,$in4
> -	vncipherlast	$out6,$out6,$in5
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x00,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x10,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x20,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x30,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x40,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x50,$out
> -	stvx_u		$out7,$x60,$out
> -	addi		$out,$out,0x70
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_six:
> -	vncipherlast	$out2,$out2,$ivec
> -	vncipherlast	$out3,$out3,$in2
> -	vncipherlast	$out4,$out4,$in3
> -	vncipherlast	$out5,$out5,$in4
> -	vncipherlast	$out6,$out6,$in5
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x00,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x10,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x20,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x30,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x40,$out
> -	stvx_u		$out7,$x50,$out
> -	addi		$out,$out,0x60
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_five:
> -	vncipherlast	$out3,$out3,$ivec
> -	vncipherlast	$out4,$out4,$in3
> -	vncipherlast	$out5,$out5,$in4
> -	vncipherlast	$out6,$out6,$in5
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x00,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x10,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x20,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x30,$out
> -	stvx_u		$out7,$x40,$out
> -	addi		$out,$out,0x50
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_four:
> -	vncipherlast	$out4,$out4,$ivec
> -	vncipherlast	$out5,$out5,$in4
> -	vncipherlast	$out6,$out6,$in5
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x00,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x10,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x20,$out
> -	stvx_u		$out7,$x30,$out
> -	addi		$out,$out,0x40
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_three:
> -	vncipherlast	$out5,$out5,$ivec
> -	vncipherlast	$out6,$out6,$in5
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x00,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x10,$out
> -	stvx_u		$out7,$x20,$out
> -	addi		$out,$out,0x30
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_two:
> -	vncipherlast	$out6,$out6,$ivec
> -	vncipherlast	$out7,$out7,$in6
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x00,$out
> -	stvx_u		$out7,$x10,$out
> -	addi		$out,$out,0x20
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lcbc_dec8x_one:
> -	vncipherlast	$out7,$out7,$ivec
> -	vmr		$ivec,$in7
> -
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out7,0,$out
> -	addi		$out,$out,0x10
> -
> -Lcbc_dec8x_done:
> -	le?vperm	$ivec,$ivec,$ivec,$inpperm
> -	stvx_u		$ivec,0,$ivp		# write [unaligned] iv
> -
> -	li		r10,`$FRAME+15`
> -	li		r11,`$FRAME+31`
> -	stvx		$inpperm,r10,$sp	# wipe copies of round keys
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -
> -	mtspr		256,$vrsave
> -	lvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	lvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v30,r10,$sp
> -	lvx		v31,r11,$sp
> -	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0x80,6,6,0
> -	.long		0
> -.size	.${prefix}_cbc_encrypt,.-.${prefix}_cbc_encrypt
> -___
> -}}	}}}
> -
> -#########################################################################
> -{{{	# CTR procedure[s]						#
> -my ($inp,$out,$len,$key,$ivp,$x10,$rounds,$idx)=map("r$_",(3..10));
> -my ($rndkey0,$rndkey1,$inout,$tmp)=		map("v$_",(0..3));
> -my ($ivec,$inptail,$inpperm,$outhead,$outperm,$outmask,$keyperm,$one)=
> -						map("v$_",(4..11));
> -my $dat=$tmp;
> -
> -$code.=<<___;
> -.globl	.${prefix}_ctr32_encrypt_blocks
> -	${UCMP}i	$len,1
> -	bltlr-
> -
> -	lis		r0,0xfff0
> -	mfspr		$vrsave,256
> -	mtspr		256,r0
> -
> -	li		$idx,15
> -	vxor		$rndkey0,$rndkey0,$rndkey0
> -	le?vspltisb	$tmp,0x0f
> -
> -	lvx		$ivec,0,$ivp		# load [unaligned] iv
> -	lvsl		$inpperm,0,$ivp
> -	lvx		$inptail,$idx,$ivp
> -	 vspltisb	$one,1
> -	le?vxor		$inpperm,$inpperm,$tmp
> -	vperm		$ivec,$ivec,$inptail,$inpperm
> -	 vsldoi		$one,$rndkey0,$one,1
> -
> -	neg		r11,$inp
> -	?lvsl		$keyperm,0,$key		# prepare for unaligned key
> -	lwz		$rounds,240($key)
> -
> -	lvsr		$inpperm,0,r11		# prepare for unaligned load
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,15		# 15 is not typo
> -	le?vxor		$inpperm,$inpperm,$tmp
> -
> -	srwi		$rounds,$rounds,1
> -	li		$idx,16
> -	subi		$rounds,$rounds,1
> -
> -	${UCMP}i	$len,8
> -	bge		_aesp8_ctr32_encrypt8x
> -
> -	?lvsr		$outperm,0,$out		# prepare for unaligned store
> -	vspltisb	$outmask,-1
> -	lvx		$outhead,0,$out
> -	?vperm		$outmask,$rndkey0,$outmask,$outperm
> -	le?vxor		$outperm,$outperm,$tmp
> -
> -	lvx		$rndkey0,0,$key
> -	mtctr		$rounds
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$ivec,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -	b		Loop_ctr32_enc
> -
> -.align	5
> -Loop_ctr32_enc:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipher		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key
> -	addi		$idx,$idx,16
> -	bdnz		Loop_ctr32_enc
> -
> -	vadduwm		$ivec,$ivec,$one
> -	 vmr		$dat,$inptail
> -	 lvx		$inptail,0,$inp
> -	 addi		$inp,$inp,16
> -	 subic.		$len,$len,1		# blocks--
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key
> -	 vperm		$dat,$dat,$inptail,$inpperm
> -	 li		$idx,16
> -	?vperm		$rndkey1,$rndkey0,$rndkey1,$keyperm
> -	 lvx		$rndkey0,0,$key
> -	vxor		$dat,$dat,$rndkey1	# last round key
> -	vcipherlast	$inout,$inout,$dat
> -
> -	 lvx		$rndkey1,$idx,$key
> -	 addi		$idx,$idx,16
> -	vperm		$inout,$inout,$inout,$outperm
> -	vsel		$dat,$outhead,$inout,$outmask
> -	 mtctr		$rounds
> -	 ?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vmr		$outhead,$inout
> -	 vxor		$inout,$ivec,$rndkey0
> -	 lvx		$rndkey0,$idx,$key
> -	 addi		$idx,$idx,16
> -	stvx		$dat,0,$out
> -	addi		$out,$out,16
> -	bne		Loop_ctr32_enc
> -
> -	addi		$out,$out,-1
> -	lvx		$inout,0,$out		# redundant in aligned case
> -	vsel		$inout,$outhead,$inout,$outmask
> -	stvx		$inout,0,$out
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,6,0
> -	.long		0
> -___
> -#########################################################################
> -{{	# Optimized CTR procedure					#
> -my $key_="r11";
> -my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,8,26..31));
> -my ($in0, $in1, $in2, $in3, $in4, $in5, $in6, $in7 )=map("v$_",(0..3,10,12..14));
> -my ($out0,$out1,$out2,$out3,$out4,$out5,$out6,$out7)=map("v$_",(15..22));
> -my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
> -			# v26-v31 last 6 round keys
> -my ($tmp,$keyperm)=($in3,$in4);	# aliases with "caller", redundant assignment
> -my ($two,$three,$four)=($outhead,$outperm,$outmask);
> -
> -$code.=<<___;
> -.align	5
> -_aesp8_ctr32_encrypt8x:
> -	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
> -	li		r10,`$FRAME+8*16+15`
> -	li		r11,`$FRAME+8*16+31`
> -	stvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	stvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	stvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	stvx		v30,r10,$sp
> -	stvx		v31,r11,$sp
> -	li		r0,-1
> -	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
> -	li		$x10,0x10
> -	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	li		$x20,0x20
> -	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	li		$x30,0x30
> -	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	li		$x40,0x40
> -	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	li		$x50,0x50
> -	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	li		$x60,0x60
> -	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	li		$x70,0x70
> -	mtspr		256,r0
> -
> -	subi		$rounds,$rounds,3	# -4 in total
> -
> -	lvx		$rndkey0,$x00,$key	# load key schedule
> -	lvx		v30,$x10,$key
> -	addi		$key,$key,0x20
> -	lvx		v31,$x00,$key
> -	?vperm		$rndkey0,$rndkey0,v30,$keyperm
> -	addi		$key_,$sp,$FRAME+15
> -	mtctr		$rounds
> -
> -Load_ctr32_enc_key:
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v30,$x10,$key
> -	addi		$key,$key,0x20
> -	stvx		v24,$x00,$key_		# off-load round[1]
> -	?vperm		v25,v31,v30,$keyperm
> -	lvx		v31,$x00,$key
> -	stvx		v25,$x10,$key_		# off-load round[2]
> -	addi		$key_,$key_,0x20
> -	bdnz		Load_ctr32_enc_key
> -
> -	lvx		v26,$x10,$key
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v27,$x20,$key
> -	stvx		v24,$x00,$key_		# off-load round[3]
> -	?vperm		v25,v31,v26,$keyperm
> -	lvx		v28,$x30,$key
> -	stvx		v25,$x10,$key_		# off-load round[4]
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	?vperm		v26,v26,v27,$keyperm
> -	lvx		v29,$x40,$key
> -	?vperm		v27,v27,v28,$keyperm
> -	lvx		v30,$x50,$key
> -	?vperm		v28,v28,v29,$keyperm
> -	lvx		v31,$x60,$key
> -	?vperm		v29,v29,v30,$keyperm
> -	lvx		$out0,$x70,$key		# borrow $out0
> -	?vperm		v30,v30,v31,$keyperm
> -	lvx		v24,$x00,$key_		# pre-load round[1]
> -	?vperm		v31,v31,$out0,$keyperm
> -	lvx		v25,$x10,$key_		# pre-load round[2]
> -
> -	vadduqm		$two,$one,$one
> -	subi		$inp,$inp,15		# undo "caller"
> -	$SHL		$len,$len,4
> -
> -	vadduqm		$out1,$ivec,$one	# counter values ...
> -	vadduqm		$out2,$ivec,$two
> -	vxor		$out0,$ivec,$rndkey0	# ... xored with rndkey[0]
> -	 le?li		$idx,8
> -	vadduqm		$out3,$out1,$two
> -	vxor		$out1,$out1,$rndkey0
> -	 le?lvsl	$inpperm,0,$idx
> -	vadduqm		$out4,$out2,$two
> -	vxor		$out2,$out2,$rndkey0
> -	 le?vspltisb	$tmp,0x0f
> -	vadduqm		$out5,$out3,$two
> -	vxor		$out3,$out3,$rndkey0
> -	 le?vxor	$inpperm,$inpperm,$tmp	# transform for lvx_u/stvx_u
> -	vadduqm		$out6,$out4,$two
> -	vxor		$out4,$out4,$rndkey0
> -	vadduqm		$out7,$out5,$two
> -	vxor		$out5,$out5,$rndkey0
> -	vadduqm		$ivec,$out6,$two	# next counter value
> -	vxor		$out6,$out6,$rndkey0
> -	vxor		$out7,$out7,$rndkey0
> -
> -	mtctr		$rounds
> -	b		Loop_ctr32_enc8x
> -.align	5
> -Loop_ctr32_enc8x:
> -	vcipher 	$out0,$out0,v24
> -	vcipher 	$out1,$out1,v24
> -	vcipher 	$out2,$out2,v24
> -	vcipher 	$out3,$out3,v24
> -	vcipher 	$out4,$out4,v24
> -	vcipher 	$out5,$out5,v24
> -	vcipher 	$out6,$out6,v24
> -	vcipher 	$out7,$out7,v24
> -Loop_ctr32_enc8x_middle:
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vcipher 	$out0,$out0,v25
> -	vcipher 	$out1,$out1,v25
> -	vcipher 	$out2,$out2,v25
> -	vcipher 	$out3,$out3,v25
> -	vcipher 	$out4,$out4,v25
> -	vcipher 	$out5,$out5,v25
> -	vcipher 	$out6,$out6,v25
> -	vcipher 	$out7,$out7,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_ctr32_enc8x
> -
> -	subic		r11,$len,256		# $len-256, borrow $key_
> -	vcipher 	$out0,$out0,v24
> -	vcipher 	$out1,$out1,v24
> -	vcipher 	$out2,$out2,v24
> -	vcipher 	$out3,$out3,v24
> -	vcipher 	$out4,$out4,v24
> -	vcipher 	$out5,$out5,v24
> -	vcipher 	$out6,$out6,v24
> -	vcipher 	$out7,$out7,v24
> -
> -	subfe		r0,r0,r0		# borrow?-1:0
> -	vcipher 	$out0,$out0,v25
> -	vcipher 	$out1,$out1,v25
> -	vcipher 	$out2,$out2,v25
> -	vcipher 	$out3,$out3,v25
> -	vcipher 	$out4,$out4,v25
> -	vcipher		$out5,$out5,v25
> -	vcipher		$out6,$out6,v25
> -	vcipher		$out7,$out7,v25
> -
> -	and		r0,r0,r11
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vcipher		$out0,$out0,v26
> -	vcipher		$out1,$out1,v26
> -	vcipher		$out2,$out2,v26
> -	vcipher		$out3,$out3,v26
> -	vcipher		$out4,$out4,v26
> -	vcipher		$out5,$out5,v26
> -	vcipher		$out6,$out6,v26
> -	vcipher		$out7,$out7,v26
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -
> -	subic		$len,$len,129		# $len-=129
> -	vcipher		$out0,$out0,v27
> -	addi		$len,$len,1		# $len-=128 really
> -	vcipher		$out1,$out1,v27
> -	vcipher		$out2,$out2,v27
> -	vcipher		$out3,$out3,v27
> -	vcipher		$out4,$out4,v27
> -	vcipher		$out5,$out5,v27
> -	vcipher		$out6,$out6,v27
> -	vcipher		$out7,$out7,v27
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -
> -	vcipher		$out0,$out0,v28
> -	 lvx_u		$in0,$x00,$inp		# load input
> -	vcipher		$out1,$out1,v28
> -	 lvx_u		$in1,$x10,$inp
> -	vcipher		$out2,$out2,v28
> -	 lvx_u		$in2,$x20,$inp
> -	vcipher		$out3,$out3,v28
> -	 lvx_u		$in3,$x30,$inp
> -	vcipher		$out4,$out4,v28
> -	 lvx_u		$in4,$x40,$inp
> -	vcipher		$out5,$out5,v28
> -	 lvx_u		$in5,$x50,$inp
> -	vcipher		$out6,$out6,v28
> -	 lvx_u		$in6,$x60,$inp
> -	vcipher		$out7,$out7,v28
> -	 lvx_u		$in7,$x70,$inp
> -	 addi		$inp,$inp,0x80
> -
> -	vcipher		$out0,$out0,v29
> -	 le?vperm	$in0,$in0,$in0,$inpperm
> -	vcipher		$out1,$out1,v29
> -	 le?vperm	$in1,$in1,$in1,$inpperm
> -	vcipher		$out2,$out2,v29
> -	 le?vperm	$in2,$in2,$in2,$inpperm
> -	vcipher		$out3,$out3,v29
> -	 le?vperm	$in3,$in3,$in3,$inpperm
> -	vcipher		$out4,$out4,v29
> -	 le?vperm	$in4,$in4,$in4,$inpperm
> -	vcipher		$out5,$out5,v29
> -	 le?vperm	$in5,$in5,$in5,$inpperm
> -	vcipher		$out6,$out6,v29
> -	 le?vperm	$in6,$in6,$in6,$inpperm
> -	vcipher		$out7,$out7,v29
> -	 le?vperm	$in7,$in7,$in7,$inpperm
> -
> -	add		$inp,$inp,r0		# $inp is adjusted in such
> -						# way that at exit from the
> -						# loop inX-in7 are loaded
> -						# with last "words"
> -	subfe.		r0,r0,r0		# borrow?-1:0
> -	vcipher		$out0,$out0,v30
> -	 vxor		$in0,$in0,v31		# xor with last round key
> -	vcipher		$out1,$out1,v30
> -	 vxor		$in1,$in1,v31
> -	vcipher		$out2,$out2,v30
> -	 vxor		$in2,$in2,v31
> -	vcipher		$out3,$out3,v30
> -	 vxor		$in3,$in3,v31
> -	vcipher		$out4,$out4,v30
> -	 vxor		$in4,$in4,v31
> -	vcipher		$out5,$out5,v30
> -	 vxor		$in5,$in5,v31
> -	vcipher		$out6,$out6,v30
> -	 vxor		$in6,$in6,v31
> -	vcipher		$out7,$out7,v30
> -	 vxor		$in7,$in7,v31
> -
> -	bne		Lctr32_enc8x_break	# did $len-129 borrow?
> -
> -	vcipherlast	$in0,$out0,$in0
> -	vcipherlast	$in1,$out1,$in1
> -	 vadduqm	$out1,$ivec,$one	# counter values ...
> -	vcipherlast	$in2,$out2,$in2
> -	 vadduqm	$out2,$ivec,$two
> -	 vxor		$out0,$ivec,$rndkey0	# ... xored with rndkey[0]
> -	vcipherlast	$in3,$out3,$in3
> -	 vadduqm	$out3,$out1,$two
> -	 vxor		$out1,$out1,$rndkey0
> -	vcipherlast	$in4,$out4,$in4
> -	 vadduqm	$out4,$out2,$two
> -	 vxor		$out2,$out2,$rndkey0
> -	vcipherlast	$in5,$out5,$in5
> -	 vadduqm	$out5,$out3,$two
> -	 vxor		$out3,$out3,$rndkey0
> -	vcipherlast	$in6,$out6,$in6
> -	 vadduqm	$out6,$out4,$two
> -	 vxor		$out4,$out4,$rndkey0
> -	vcipherlast	$in7,$out7,$in7
> -	 vadduqm	$out7,$out5,$two
> -	 vxor		$out5,$out5,$rndkey0
> -	le?vperm	$in0,$in0,$in0,$inpperm
> -	 vadduqm	$ivec,$out6,$two	# next counter value
> -	 vxor		$out6,$out6,$rndkey0
> -	le?vperm	$in1,$in1,$in1,$inpperm
> -	 vxor		$out7,$out7,$rndkey0
> -	mtctr		$rounds
> -
> -	 vcipher	$out0,$out0,v24
> -	stvx_u		$in0,$x00,$out
> -	le?vperm	$in2,$in2,$in2,$inpperm
> -	 vcipher	$out1,$out1,v24
> -	stvx_u		$in1,$x10,$out
> -	le?vperm	$in3,$in3,$in3,$inpperm
> -	 vcipher	$out2,$out2,v24
> -	stvx_u		$in2,$x20,$out
> -	le?vperm	$in4,$in4,$in4,$inpperm
> -	 vcipher	$out3,$out3,v24
> -	stvx_u		$in3,$x30,$out
> -	le?vperm	$in5,$in5,$in5,$inpperm
> -	 vcipher	$out4,$out4,v24
> -	stvx_u		$in4,$x40,$out
> -	le?vperm	$in6,$in6,$in6,$inpperm
> -	 vcipher	$out5,$out5,v24
> -	stvx_u		$in5,$x50,$out
> -	le?vperm	$in7,$in7,$in7,$inpperm
> -	 vcipher	$out6,$out6,v24
> -	stvx_u		$in6,$x60,$out
> -	 vcipher	$out7,$out7,v24
> -	stvx_u		$in7,$x70,$out
> -	addi		$out,$out,0x80
> -
> -	b		Loop_ctr32_enc8x_middle
> -
> -.align	5
> -Lctr32_enc8x_break:
> -	cmpwi		$len,-0x60
> -	blt		Lctr32_enc8x_one
> -	nop
> -	beq		Lctr32_enc8x_two
> -	cmpwi		$len,-0x40
> -	blt		Lctr32_enc8x_three
> -	nop
> -	beq		Lctr32_enc8x_four
> -	cmpwi		$len,-0x20
> -	blt		Lctr32_enc8x_five
> -	nop
> -	beq		Lctr32_enc8x_six
> -	cmpwi		$len,0x00
> -	blt		Lctr32_enc8x_seven
> -
> -Lctr32_enc8x_eight:
> -	vcipherlast	$out0,$out0,$in0
> -	vcipherlast	$out1,$out1,$in1
> -	vcipherlast	$out2,$out2,$in2
> -	vcipherlast	$out3,$out3,$in3
> -	vcipherlast	$out4,$out4,$in4
> -	vcipherlast	$out5,$out5,$in5
> -	vcipherlast	$out6,$out6,$in6
> -	vcipherlast	$out7,$out7,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x30,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x40,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x50,$out
> -	le?vperm	$out7,$out7,$out7,$inpperm
> -	stvx_u		$out6,$x60,$out
> -	stvx_u		$out7,$x70,$out
> -	addi		$out,$out,0x80
> -	b		Lctr32_enc8x_done
> -
> -.align	5
> -Lctr32_enc8x_seven:
> -	vcipherlast	$out0,$out0,$in1
> -	vcipherlast	$out1,$out1,$in2
> -	vcipherlast	$out2,$out2,$in3
> -	vcipherlast	$out3,$out3,$in4
> -	vcipherlast	$out4,$out4,$in5
> -	vcipherlast	$out5,$out5,$in6
> -	vcipherlast	$out6,$out6,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x30,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x40,$out
> -	le?vperm	$out6,$out6,$out6,$inpperm
> -	stvx_u		$out5,$x50,$out
> -	stvx_u		$out6,$x60,$out
> -	addi		$out,$out,0x70
> -	b		Lctr32_enc8x_done
> -
> -.align	5
> -Lctr32_enc8x_six:
> -	vcipherlast	$out0,$out0,$in2
> -	vcipherlast	$out1,$out1,$in3
> -	vcipherlast	$out2,$out2,$in4
> -	vcipherlast	$out3,$out3,$in5
> -	vcipherlast	$out4,$out4,$in6
> -	vcipherlast	$out5,$out5,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x30,$out
> -	le?vperm	$out5,$out5,$out5,$inpperm
> -	stvx_u		$out4,$x40,$out
> -	stvx_u		$out5,$x50,$out
> -	addi		$out,$out,0x60
> -	b		Lctr32_enc8x_done
> -
> -.align	5
> -Lctr32_enc8x_five:
> -	vcipherlast	$out0,$out0,$in3
> -	vcipherlast	$out1,$out1,$in4
> -	vcipherlast	$out2,$out2,$in5
> -	vcipherlast	$out3,$out3,$in6
> -	vcipherlast	$out4,$out4,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	le?vperm	$out4,$out4,$out4,$inpperm
> -	stvx_u		$out3,$x30,$out
> -	stvx_u		$out4,$x40,$out
> -	addi		$out,$out,0x50
> -	b		Lctr32_enc8x_done
> -
> -.align	5
> -Lctr32_enc8x_four:
> -	vcipherlast	$out0,$out0,$in4
> -	vcipherlast	$out1,$out1,$in5
> -	vcipherlast	$out2,$out2,$in6
> -	vcipherlast	$out3,$out3,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$inpperm
> -	stvx_u		$out2,$x20,$out
> -	stvx_u		$out3,$x30,$out
> -	addi		$out,$out,0x40
> -	b		Lctr32_enc8x_done
> -
> -.align	5
> -Lctr32_enc8x_three:
> -	vcipherlast	$out0,$out0,$in5
> -	vcipherlast	$out1,$out1,$in6
> -	vcipherlast	$out2,$out2,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	le?vperm	$out2,$out2,$out2,$inpperm
> -	stvx_u		$out1,$x10,$out
> -	stvx_u		$out2,$x20,$out
> -	addi		$out,$out,0x30
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lctr32_enc8x_two:
> -	vcipherlast	$out0,$out0,$in6
> -	vcipherlast	$out1,$out1,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	le?vperm	$out1,$out1,$out1,$inpperm
> -	stvx_u		$out0,$x00,$out
> -	stvx_u		$out1,$x10,$out
> -	addi		$out,$out,0x20
> -	b		Lcbc_dec8x_done
> -
> -.align	5
> -Lctr32_enc8x_one:
> -	vcipherlast	$out0,$out0,$in7
> -
> -	le?vperm	$out0,$out0,$out0,$inpperm
> -	stvx_u		$out0,0,$out
> -	addi		$out,$out,0x10
> -
> -Lctr32_enc8x_done:
> -	li		r10,`$FRAME+15`
> -	li		r11,`$FRAME+31`
> -	stvx		$inpperm,r10,$sp	# wipe copies of round keys
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$inpperm,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$inpperm,r11,$sp
> -	addi		r11,r11,32
> -
> -	mtspr		256,$vrsave
> -	lvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	lvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v30,r10,$sp
> -	lvx		v31,r11,$sp
> -	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0x80,6,6,0
> -	.long		0
> -.size	.${prefix}_ctr32_encrypt_blocks,.-.${prefix}_ctr32_encrypt_blocks
> -___
> -}}	}}}
> -
> -#########################################################################
> -{{{	# XTS procedures						#
> -# int aes_p8_xts_[en|de]crypt(const char *inp, char *out, size_t len,	#
> -#                             const AES_KEY *key1, const AES_KEY *key2,	#
> -#                             [const] unsigned char iv[16]);		#
> -# If $key2 is NULL, then a "tweak chaining" mode is engaged, in which	#
> -# input tweak value is assumed to be encrypted already, and last tweak	#
> -# value, one suitable for consecutive call on same chunk of data, is	#
> -# written back to original buffer. In addition, in "tweak chaining"	#
> -# mode only complete input blocks are processed.			#
> -
> -my ($inp,$out,$len,$key1,$key2,$ivp,$rounds,$idx) =	map("r$_",(3..10));
> -my ($rndkey0,$rndkey1,$inout) =				map("v$_",(0..2));
> -my ($output,$inptail,$inpperm,$leperm,$keyperm) =	map("v$_",(3..7));
> -my ($tweak,$seven,$eighty7,$tmp,$tweak1) =		map("v$_",(8..12));
> -my $taillen = $key2;
> -
> -   ($inp,$idx) = ($idx,$inp);				# reassign
> -
> -$code.=<<___;
> -.globl	.${prefix}_xts_encrypt
> -	mr		$inp,r3				# reassign
> -	li		r3,-1
> -	${UCMP}i	$len,16
> -	bltlr-
> -
> -	lis		r0,0xfff0
> -	mfspr		r12,256				# save vrsave
> -	li		r11,0
> -	mtspr		256,r0
> -
> -	vspltisb	$seven,0x07			# 0x070707..07
> -	le?lvsl		$leperm,r11,r11
> -	le?vspltisb	$tmp,0x0f
> -	le?vxor		$leperm,$leperm,$seven
> -
> -	li		$idx,15
> -	lvx		$tweak,0,$ivp			# load [unaligned] iv
> -	lvsl		$inpperm,0,$ivp
> -	lvx		$inptail,$idx,$ivp
> -	le?vxor		$inpperm,$inpperm,$tmp
> -	vperm		$tweak,$tweak,$inptail,$inpperm
> -
> -	neg		r11,$inp
> -	lvsr		$inpperm,0,r11			# prepare for unaligned load
> -	lvx		$inout,0,$inp
> -	addi		$inp,$inp,15			# 15 is not typo
> -	le?vxor		$inpperm,$inpperm,$tmp
> -
> -	${UCMP}i	$key2,0				# key2==NULL?
> -	beq		Lxts_enc_no_key2
> -
> -	?lvsl		$keyperm,0,$key2		# prepare for unaligned key
> -	lwz		$rounds,240($key2)
> -	srwi		$rounds,$rounds,1
> -	subi		$rounds,$rounds,1
> -	li		$idx,16
> -
> -	lvx		$rndkey0,0,$key2
> -	lvx		$rndkey1,$idx,$key2
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$tweak,$tweak,$rndkey0
> -	lvx		$rndkey0,$idx,$key2
> -	addi		$idx,$idx,16
> -	mtctr		$rounds
> -
> -Ltweak_xts_enc:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey1
> -	lvx		$rndkey1,$idx,$key2
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey0
> -	lvx		$rndkey0,$idx,$key2
> -	addi		$idx,$idx,16
> -	bdnz		Ltweak_xts_enc
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey1
> -	lvx		$rndkey1,$idx,$key2
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipherlast	$tweak,$tweak,$rndkey0
> -
> -	li		$ivp,0				# don't chain the tweak
> -	b		Lxts_enc
> -
> -Lxts_enc_no_key2:
> -	li		$idx,-16
> -	and		$len,$len,$idx			# in "tweak chaining"
> -							# mode only complete
> -							# blocks are processed
> -Lxts_enc:
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -
> -	?lvsl		$keyperm,0,$key1		# prepare for unaligned key
> -	lwz		$rounds,240($key1)
> -	srwi		$rounds,$rounds,1
> -	subi		$rounds,$rounds,1
> -	li		$idx,16
> -
> -	vslb		$eighty7,$seven,$seven		# 0x808080..80
> -	vor		$eighty7,$eighty7,$seven	# 0x878787..87
> -	vspltisb	$tmp,1				# 0x010101..01
> -	vsldoi		$eighty7,$eighty7,$tmp,15	# 0x870101..01
> -
> -	${UCMP}i	$len,96
> -	bge		_aesp8_xts_encrypt6x
> -
> -	andi.		$taillen,$len,15
> -	subic		r0,$len,32
> -	subi		$taillen,$taillen,16
> -	subfe		r0,r0,r0
> -	and		r0,r0,$taillen
> -	add		$inp,$inp,r0
> -
> -	lvx		$rndkey0,0,$key1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	vperm		$inout,$inout,$inptail,$inpperm
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$inout,$tweak
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -	mtctr		$rounds
> -	b		Loop_xts_enc
> -
> -.align	5
> -Loop_xts_enc:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipher		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -	bdnz		Loop_xts_enc
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	li		$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$rndkey0,$rndkey0,$tweak
> -	vcipherlast	$output,$inout,$rndkey0
> -
> -	le?vperm	$tmp,$output,$output,$leperm
> -	be?nop
> -	le?stvx_u	$tmp,0,$out
> -	be?stvx_u	$output,0,$out
> -	addi		$out,$out,16
> -
> -	subic.		$len,$len,16
> -	beq		Lxts_enc_done
> -
> -	vmr		$inout,$inptail
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -	lvx		$rndkey0,0,$key1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -
> -	subic		r0,$len,32
> -	subfe		r0,r0,r0
> -	and		r0,r0,$taillen
> -	add		$inp,$inp,r0
> -
> -	vsrab		$tmp,$tweak,$seven		# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	vxor		$tweak,$tweak,$tmp
> -
> -	vperm		$inout,$inout,$inptail,$inpperm
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$inout,$tweak
> -	vxor		$output,$output,$rndkey0	# just in case $len<16
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -
> -	mtctr		$rounds
> -	${UCMP}i	$len,16
> -	bge		Loop_xts_enc
> -
> -	vxor		$output,$output,$tweak
> -	lvsr		$inpperm,0,$len			# $inpperm is no longer needed
> -	vxor		$inptail,$inptail,$inptail	# $inptail is no longer needed
> -	vspltisb	$tmp,-1
> -	vperm		$inptail,$inptail,$tmp,$inpperm
> -	vsel		$inout,$inout,$output,$inptail
> -
> -	subi		r11,$out,17
> -	subi		$out,$out,16
> -	mtctr		$len
> -	li		$len,16
> -Loop_xts_enc_steal:
> -	lbzu		r0,1(r11)
> -	stb		r0,16(r11)
> -	bdnz		Loop_xts_enc_steal
> -
> -	mtctr		$rounds
> -	b		Loop_xts_enc			# one more time...
> -
> -Lxts_enc_done:
> -	${UCMP}i	$ivp,0
> -	beq		Lxts_enc_ret
> -
> -	vsrab		$tmp,$tweak,$seven		# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	vxor		$tweak,$tweak,$tmp
> -
> -	le?vperm	$tweak,$tweak,$tweak,$leperm
> -	stvx_u		$tweak,0,$ivp
> -
> -Lxts_enc_ret:
> -	mtspr		256,r12				# restore vrsave
> -	li		r3,0
> -	blr
> -	.long		0
> -	.byte		0,12,0x04,0,0x80,6,6,0
> -	.long		0
> -.size	.${prefix}_xts_encrypt,.-.${prefix}_xts_encrypt
> -
> -.globl	.${prefix}_xts_decrypt
> -	mr		$inp,r3				# reassign
> -	li		r3,-1
> -	${UCMP}i	$len,16
> -	bltlr-
> -
> -	lis		r0,0xfff8
> -	mfspr		r12,256				# save vrsave
> -	li		r11,0
> -	mtspr		256,r0
> -
> -	andi.		r0,$len,15
> -	neg		r0,r0
> -	andi.		r0,r0,16
> -	sub		$len,$len,r0
> -
> -	vspltisb	$seven,0x07			# 0x070707..07
> -	le?lvsl		$leperm,r11,r11
> -	le?vspltisb	$tmp,0x0f
> -	le?vxor		$leperm,$leperm,$seven
> -
> -	li		$idx,15
> -	lvx		$tweak,0,$ivp			# load [unaligned] iv
> -	lvsl		$inpperm,0,$ivp
> -	lvx		$inptail,$idx,$ivp
> -	le?vxor		$inpperm,$inpperm,$tmp
> -	vperm		$tweak,$tweak,$inptail,$inpperm
> -
> -	neg		r11,$inp
> -	lvsr		$inpperm,0,r11			# prepare for unaligned load
> -	lvx		$inout,0,$inp
> -	addi		$inp,$inp,15			# 15 is not typo
> -	le?vxor		$inpperm,$inpperm,$tmp
> -
> -	${UCMP}i	$key2,0				# key2==NULL?
> -	beq		Lxts_dec_no_key2
> -
> -	?lvsl		$keyperm,0,$key2		# prepare for unaligned key
> -	lwz		$rounds,240($key2)
> -	srwi		$rounds,$rounds,1
> -	subi		$rounds,$rounds,1
> -	li		$idx,16
> -
> -	lvx		$rndkey0,0,$key2
> -	lvx		$rndkey1,$idx,$key2
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$tweak,$tweak,$rndkey0
> -	lvx		$rndkey0,$idx,$key2
> -	addi		$idx,$idx,16
> -	mtctr		$rounds
> -
> -Ltweak_xts_dec:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey1
> -	lvx		$rndkey1,$idx,$key2
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey0
> -	lvx		$rndkey0,$idx,$key2
> -	addi		$idx,$idx,16
> -	bdnz		Ltweak_xts_dec
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vcipher		$tweak,$tweak,$rndkey1
> -	lvx		$rndkey1,$idx,$key2
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vcipherlast	$tweak,$tweak,$rndkey0
> -
> -	li		$ivp,0				# don't chain the tweak
> -	b		Lxts_dec
> -
> -Lxts_dec_no_key2:
> -	neg		$idx,$len
> -	andi.		$idx,$idx,15
> -	add		$len,$len,$idx			# in "tweak chaining"
> -							# mode only complete
> -							# blocks are processed
> -Lxts_dec:
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -
> -	?lvsl		$keyperm,0,$key1		# prepare for unaligned key
> -	lwz		$rounds,240($key1)
> -	srwi		$rounds,$rounds,1
> -	subi		$rounds,$rounds,1
> -	li		$idx,16
> -
> -	vslb		$eighty7,$seven,$seven		# 0x808080..80
> -	vor		$eighty7,$eighty7,$seven	# 0x878787..87
> -	vspltisb	$tmp,1				# 0x010101..01
> -	vsldoi		$eighty7,$eighty7,$tmp,15	# 0x870101..01
> -
> -	${UCMP}i	$len,96
> -	bge		_aesp8_xts_decrypt6x
> -
> -	lvx		$rndkey0,0,$key1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	vperm		$inout,$inout,$inptail,$inpperm
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$inout,$tweak
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -	mtctr		$rounds
> -
> -	${UCMP}i	$len,16
> -	blt		Ltail_xts_dec
> -	be?b		Loop_xts_dec
> -
> -.align	5
> -Loop_xts_dec:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vncipher	$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -	bdnz		Loop_xts_dec
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	li		$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$rndkey0,$rndkey0,$tweak
> -	vncipherlast	$output,$inout,$rndkey0
> -
> -	le?vperm	$tmp,$output,$output,$leperm
> -	be?nop
> -	le?stvx_u	$tmp,0,$out
> -	be?stvx_u	$output,0,$out
> -	addi		$out,$out,16
> -
> -	subic.		$len,$len,16
> -	beq		Lxts_dec_done
> -
> -	vmr		$inout,$inptail
> -	lvx		$inptail,0,$inp
> -	addi		$inp,$inp,16
> -	lvx		$rndkey0,0,$key1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -
> -	vsrab		$tmp,$tweak,$seven		# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	vxor		$tweak,$tweak,$tmp
> -
> -	vperm		$inout,$inout,$inptail,$inpperm
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$inout,$inout,$tweak
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -
> -	mtctr		$rounds
> -	${UCMP}i	$len,16
> -	bge		Loop_xts_dec
> -
> -Ltail_xts_dec:
> -	vsrab		$tmp,$tweak,$seven		# next tweak value
> -	vaddubm		$tweak1,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	vxor		$tweak1,$tweak1,$tmp
> -
> -	subi		$inp,$inp,16
> -	add		$inp,$inp,$len
> -
> -	vxor		$inout,$inout,$tweak		# :-(
> -	vxor		$inout,$inout,$tweak1		# :-)
> -
> -Loop_xts_dec_short:
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vncipher	$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -	bdnz		Loop_xts_dec_short
> -
> -	?vperm		$rndkey1,$rndkey1,$rndkey0,$keyperm
> -	vncipher	$inout,$inout,$rndkey1
> -	lvx		$rndkey1,$idx,$key1
> -	li		$idx,16
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -	vxor		$rndkey0,$rndkey0,$tweak1
> -	vncipherlast	$output,$inout,$rndkey0
> -
> -	le?vperm	$tmp,$output,$output,$leperm
> -	be?nop
> -	le?stvx_u	$tmp,0,$out
> -	be?stvx_u	$output,0,$out
> -
> -	vmr		$inout,$inptail
> -	lvx		$inptail,0,$inp
> -	#addi		$inp,$inp,16
> -	lvx		$rndkey0,0,$key1
> -	lvx		$rndkey1,$idx,$key1
> -	addi		$idx,$idx,16
> -	vperm		$inout,$inout,$inptail,$inpperm
> -	?vperm		$rndkey0,$rndkey0,$rndkey1,$keyperm
> -
> -	lvsr		$inpperm,0,$len			# $inpperm is no longer needed
> -	vxor		$inptail,$inptail,$inptail	# $inptail is no longer needed
> -	vspltisb	$tmp,-1
> -	vperm		$inptail,$inptail,$tmp,$inpperm
> -	vsel		$inout,$inout,$output,$inptail
> -
> -	vxor		$rndkey0,$rndkey0,$tweak
> -	vxor		$inout,$inout,$rndkey0
> -	lvx		$rndkey0,$idx,$key1
> -	addi		$idx,$idx,16
> -
> -	subi		r11,$out,1
> -	mtctr		$len
> -	li		$len,16
> -Loop_xts_dec_steal:
> -	lbzu		r0,1(r11)
> -	stb		r0,16(r11)
> -	bdnz		Loop_xts_dec_steal
> -
> -	mtctr		$rounds
> -	b		Loop_xts_dec			# one more time...
> -
> -Lxts_dec_done:
> -	${UCMP}i	$ivp,0
> -	beq		Lxts_dec_ret
> -
> -	vsrab		$tmp,$tweak,$seven		# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	vxor		$tweak,$tweak,$tmp
> -
> -	le?vperm	$tweak,$tweak,$tweak,$leperm
> -	stvx_u		$tweak,0,$ivp
> -
> -Lxts_dec_ret:
> -	mtspr		256,r12				# restore vrsave
> -	li		r3,0
> -	blr
> -	.long		0
> -	.byte		0,12,0x04,0,0x80,6,6,0
> -	.long		0
> -.size	.${prefix}_xts_decrypt,.-.${prefix}_xts_decrypt
> -___
> -#########################################################################
> -{{	# Optimized XTS procedures					#
> -my $key_=$key2;
> -my ($x00,$x10,$x20,$x30,$x40,$x50,$x60,$x70)=map("r$_",(0,3,26..31));
> -    $x00=0 if ($flavour =~ /osx/);
> -my ($in0,  $in1,  $in2,  $in3,  $in4,  $in5 )=map("v$_",(0..5));
> -my ($out0, $out1, $out2, $out3, $out4, $out5)=map("v$_",(7,12..16));
> -my ($twk0, $twk1, $twk2, $twk3, $twk4, $twk5)=map("v$_",(17..22));
> -my $rndkey0="v23";	# v24-v25 rotating buffer for first found keys
> -			# v26-v31 last 6 round keys
> -my ($keyperm)=($out0);	# aliases with "caller", redundant assignment
> -my $taillen=$x70;
> -
> -$code.=<<___;
> -.align	5
> -_aesp8_xts_encrypt6x:
> -	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
> -	mflr		r11
> -	li		r7,`$FRAME+8*16+15`
> -	li		r3,`$FRAME+8*16+31`
> -	$PUSH		r11,`$FRAME+21*16+6*$SIZE_T+$LRSAVE`($sp)
> -	stvx		v20,r7,$sp		# ABI says so
> -	addi		r7,r7,32
> -	stvx		v21,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v22,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v23,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v24,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v25,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v26,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v27,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v28,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v29,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v30,r7,$sp
> -	stvx		v31,r3,$sp
> -	li		r0,-1
> -	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
> -	li		$x10,0x10
> -	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	li		$x20,0x20
> -	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	li		$x30,0x30
> -	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	li		$x40,0x40
> -	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	li		$x50,0x50
> -	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	li		$x60,0x60
> -	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	li		$x70,0x70
> -	mtspr		256,r0
> -
> -	subi		$rounds,$rounds,3	# -4 in total
> -
> -	lvx		$rndkey0,$x00,$key1	# load key schedule
> -	lvx		v30,$x10,$key1
> -	addi		$key1,$key1,0x20
> -	lvx		v31,$x00,$key1
> -	?vperm		$rndkey0,$rndkey0,v30,$keyperm
> -	addi		$key_,$sp,$FRAME+15
> -	mtctr		$rounds
> -
> -Load_xts_enc_key:
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v30,$x10,$key1
> -	addi		$key1,$key1,0x20
> -	stvx		v24,$x00,$key_		# off-load round[1]
> -	?vperm		v25,v31,v30,$keyperm
> -	lvx		v31,$x00,$key1
> -	stvx		v25,$x10,$key_		# off-load round[2]
> -	addi		$key_,$key_,0x20
> -	bdnz		Load_xts_enc_key
> -
> -	lvx		v26,$x10,$key1
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v27,$x20,$key1
> -	stvx		v24,$x00,$key_		# off-load round[3]
> -	?vperm		v25,v31,v26,$keyperm
> -	lvx		v28,$x30,$key1
> -	stvx		v25,$x10,$key_		# off-load round[4]
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	?vperm		v26,v26,v27,$keyperm
> -	lvx		v29,$x40,$key1
> -	?vperm		v27,v27,v28,$keyperm
> -	lvx		v30,$x50,$key1
> -	?vperm		v28,v28,v29,$keyperm
> -	lvx		v31,$x60,$key1
> -	?vperm		v29,v29,v30,$keyperm
> -	lvx		$twk5,$x70,$key1	# borrow $twk5
> -	?vperm		v30,v30,v31,$keyperm
> -	lvx		v24,$x00,$key_		# pre-load round[1]
> -	?vperm		v31,v31,$twk5,$keyperm
> -	lvx		v25,$x10,$key_		# pre-load round[2]
> -
> -	 vperm		$in0,$inout,$inptail,$inpperm
> -	 subi		$inp,$inp,31		# undo "caller"
> -	vxor		$twk0,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out0,$in0,$twk0
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in1,$x10,$inp
> -	vxor		$twk1,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in1,$in1,$in1,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out1,$in1,$twk1
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in2,$x20,$inp
> -	 andi.		$taillen,$len,15
> -	vxor		$twk2,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in2,$in2,$in2,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out2,$in2,$twk2
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in3,$x30,$inp
> -	 sub		$len,$len,$taillen
> -	vxor		$twk3,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in3,$in3,$in3,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out3,$in3,$twk3
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in4,$x40,$inp
> -	 subi		$len,$len,0x60
> -	vxor		$twk4,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in4,$in4,$in4,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out4,$in4,$twk4
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in5,$x50,$inp
> -	 addi		$inp,$inp,0x60
> -	vxor		$twk5,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in5,$in5,$in5,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out5,$in5,$twk5
> -	vxor		$tweak,$tweak,$tmp
> -
> -	vxor		v31,v31,$rndkey0
> -	mtctr		$rounds
> -	b		Loop_xts_enc6x
> -
> -.align	5
> -Loop_xts_enc6x:
> -	vcipher		$out0,$out0,v24
> -	vcipher		$out1,$out1,v24
> -	vcipher		$out2,$out2,v24
> -	vcipher		$out3,$out3,v24
> -	vcipher		$out4,$out4,v24
> -	vcipher		$out5,$out5,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vcipher		$out0,$out0,v25
> -	vcipher		$out1,$out1,v25
> -	vcipher		$out2,$out2,v25
> -	vcipher		$out3,$out3,v25
> -	vcipher		$out4,$out4,v25
> -	vcipher		$out5,$out5,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_xts_enc6x
> -
> -	subic		$len,$len,96		# $len-=96
> -	 vxor		$in0,$twk0,v31		# xor with last round key
> -	vcipher		$out0,$out0,v24
> -	vcipher		$out1,$out1,v24
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk0,$tweak,$rndkey0
> -	 vaddubm	$tweak,$tweak,$tweak
> -	vcipher		$out2,$out2,v24
> -	vcipher		$out3,$out3,v24
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vcipher		$out4,$out4,v24
> -	vcipher		$out5,$out5,v24
> -
> -	subfe.		r0,r0,r0		# borrow?-1:0
> -	 vand		$tmp,$tmp,$eighty7
> -	vcipher		$out0,$out0,v25
> -	vcipher		$out1,$out1,v25
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipher		$out2,$out2,v25
> -	vcipher		$out3,$out3,v25
> -	 vxor		$in1,$twk1,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk1,$tweak,$rndkey0
> -	vcipher		$out4,$out4,v25
> -	vcipher		$out5,$out5,v25
> -
> -	and		r0,r0,$len
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vcipher		$out0,$out0,v26
> -	vcipher		$out1,$out1,v26
> -	 vand		$tmp,$tmp,$eighty7
> -	vcipher		$out2,$out2,v26
> -	vcipher		$out3,$out3,v26
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipher		$out4,$out4,v26
> -	vcipher		$out5,$out5,v26
> -
> -	add		$inp,$inp,r0		# $inp is adjusted in such
> -						# way that at exit from the
> -						# loop inX-in5 are loaded
> -						# with last "words"
> -	 vxor		$in2,$twk2,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk2,$tweak,$rndkey0
> -	 vaddubm	$tweak,$tweak,$tweak
> -	vcipher		$out0,$out0,v27
> -	vcipher		$out1,$out1,v27
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vcipher		$out2,$out2,v27
> -	vcipher		$out3,$out3,v27
> -	 vand		$tmp,$tmp,$eighty7
> -	vcipher		$out4,$out4,v27
> -	vcipher		$out5,$out5,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipher		$out0,$out0,v28
> -	vcipher		$out1,$out1,v28
> -	 vxor		$in3,$twk3,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk3,$tweak,$rndkey0
> -	vcipher		$out2,$out2,v28
> -	vcipher		$out3,$out3,v28
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vcipher		$out4,$out4,v28
> -	vcipher		$out5,$out5,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -	 vand		$tmp,$tmp,$eighty7
> -
> -	vcipher		$out0,$out0,v29
> -	vcipher		$out1,$out1,v29
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipher		$out2,$out2,v29
> -	vcipher		$out3,$out3,v29
> -	 vxor		$in4,$twk4,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk4,$tweak,$rndkey0
> -	vcipher		$out4,$out4,v29
> -	vcipher		$out5,$out5,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -
> -	vcipher		$out0,$out0,v30
> -	vcipher		$out1,$out1,v30
> -	 vand		$tmp,$tmp,$eighty7
> -	vcipher		$out2,$out2,v30
> -	vcipher		$out3,$out3,v30
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipher		$out4,$out4,v30
> -	vcipher		$out5,$out5,v30
> -	 vxor		$in5,$twk5,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk5,$tweak,$rndkey0
> -
> -	vcipherlast	$out0,$out0,$in0
> -	 lvx_u		$in0,$x00,$inp		# load next input block
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vcipherlast	$out1,$out1,$in1
> -	 lvx_u		$in1,$x10,$inp
> -	vcipherlast	$out2,$out2,$in2
> -	 le?vperm	$in0,$in0,$in0,$leperm
> -	 lvx_u		$in2,$x20,$inp
> -	 vand		$tmp,$tmp,$eighty7
> -	vcipherlast	$out3,$out3,$in3
> -	 le?vperm	$in1,$in1,$in1,$leperm
> -	 lvx_u		$in3,$x30,$inp
> -	vcipherlast	$out4,$out4,$in4
> -	 le?vperm	$in2,$in2,$in2,$leperm
> -	 lvx_u		$in4,$x40,$inp
> -	 vxor		$tweak,$tweak,$tmp
> -	vcipherlast	$tmp,$out5,$in5		# last block might be needed
> -						# in stealing mode
> -	 le?vperm	$in3,$in3,$in3,$leperm
> -	 lvx_u		$in5,$x50,$inp
> -	 addi		$inp,$inp,0x60
> -	 le?vperm	$in4,$in4,$in4,$leperm
> -	 le?vperm	$in5,$in5,$in5,$leperm
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	 vxor		$out0,$in0,$twk0
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	 vxor		$out1,$in1,$twk1
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	 vxor		$out2,$in2,$twk2
> -	le?vperm	$out4,$out4,$out4,$leperm
> -	stvx_u		$out3,$x30,$out
> -	 vxor		$out3,$in3,$twk3
> -	le?vperm	$out5,$tmp,$tmp,$leperm
> -	stvx_u		$out4,$x40,$out
> -	 vxor		$out4,$in4,$twk4
> -	le?stvx_u	$out5,$x50,$out
> -	be?stvx_u	$tmp, $x50,$out
> -	 vxor		$out5,$in5,$twk5
> -	addi		$out,$out,0x60
> -
> -	mtctr		$rounds
> -	beq		Loop_xts_enc6x		# did $len-=96 borrow?
> -
> -	addic.		$len,$len,0x60
> -	beq		Lxts_enc6x_zero
> -	cmpwi		$len,0x20
> -	blt		Lxts_enc6x_one
> -	nop
> -	beq		Lxts_enc6x_two
> -	cmpwi		$len,0x40
> -	blt		Lxts_enc6x_three
> -	nop
> -	beq		Lxts_enc6x_four
> -
> -Lxts_enc6x_five:
> -	vxor		$out0,$in1,$twk0
> -	vxor		$out1,$in2,$twk1
> -	vxor		$out2,$in3,$twk2
> -	vxor		$out3,$in4,$twk3
> -	vxor		$out4,$in5,$twk4
> -
> -	bl		_aesp8_xts_enc5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk5		# unused tweak
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	vxor		$tmp,$out4,$twk5	# last block prep for stealing
> -	le?vperm	$out4,$out4,$out4,$leperm
> -	stvx_u		$out3,$x30,$out
> -	stvx_u		$out4,$x40,$out
> -	addi		$out,$out,0x50
> -	bne		Lxts_enc6x_steal
> -	b		Lxts_enc6x_done
> -
> -.align	4
> -Lxts_enc6x_four:
> -	vxor		$out0,$in2,$twk0
> -	vxor		$out1,$in3,$twk1
> -	vxor		$out2,$in4,$twk2
> -	vxor		$out3,$in5,$twk3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_enc5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk4		# unused tweak
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	vxor		$tmp,$out3,$twk4	# last block prep for stealing
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	stvx_u		$out3,$x30,$out
> -	addi		$out,$out,0x40
> -	bne		Lxts_enc6x_steal
> -	b		Lxts_enc6x_done
> -
> -.align	4
> -Lxts_enc6x_three:
> -	vxor		$out0,$in3,$twk0
> -	vxor		$out1,$in4,$twk1
> -	vxor		$out2,$in5,$twk2
> -	vxor		$out3,$out3,$out3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_enc5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk3		# unused tweak
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	vxor		$tmp,$out2,$twk3	# last block prep for stealing
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	stvx_u		$out2,$x20,$out
> -	addi		$out,$out,0x30
> -	bne		Lxts_enc6x_steal
> -	b		Lxts_enc6x_done
> -
> -.align	4
> -Lxts_enc6x_two:
> -	vxor		$out0,$in4,$twk0
> -	vxor		$out1,$in5,$twk1
> -	vxor		$out2,$out2,$out2
> -	vxor		$out3,$out3,$out3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_enc5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk2		# unused tweak
> -	vxor		$tmp,$out1,$twk2	# last block prep for stealing
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	stvx_u		$out1,$x10,$out
> -	addi		$out,$out,0x20
> -	bne		Lxts_enc6x_steal
> -	b		Lxts_enc6x_done
> -
> -.align	4
> -Lxts_enc6x_one:
> -	vxor		$out0,$in5,$twk0
> -	nop
> -Loop_xts_enc1x:
> -	vcipher		$out0,$out0,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vcipher		$out0,$out0,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_xts_enc1x
> -
> -	add		$inp,$inp,$taillen
> -	cmpwi		$taillen,0
> -	vcipher		$out0,$out0,v24
> -
> -	subi		$inp,$inp,16
> -	vcipher		$out0,$out0,v25
> -
> -	lvsr		$inpperm,0,$taillen
> -	vcipher		$out0,$out0,v26
> -
> -	lvx_u		$in0,0,$inp
> -	vcipher		$out0,$out0,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vcipher		$out0,$out0,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -
> -	vcipher		$out0,$out0,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vxor		$twk0,$twk0,v31
> -
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vcipher		$out0,$out0,v30
> -
> -	vperm		$in0,$in0,$in0,$inpperm
> -	vcipherlast	$out0,$out0,$twk0
> -
> -	vmr		$twk0,$twk1		# unused tweak
> -	vxor		$tmp,$out0,$twk1	# last block prep for stealing
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	addi		$out,$out,0x10
> -	bne		Lxts_enc6x_steal
> -	b		Lxts_enc6x_done
> -
> -.align	4
> -Lxts_enc6x_zero:
> -	cmpwi		$taillen,0
> -	beq		Lxts_enc6x_done
> -
> -	add		$inp,$inp,$taillen
> -	subi		$inp,$inp,16
> -	lvx_u		$in0,0,$inp
> -	lvsr		$inpperm,0,$taillen	# $in5 is no more
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vperm		$in0,$in0,$in0,$inpperm
> -	vxor		$tmp,$tmp,$twk0
> -Lxts_enc6x_steal:
> -	vxor		$in0,$in0,$twk0
> -	vxor		$out0,$out0,$out0
> -	vspltisb	$out1,-1
> -	vperm		$out0,$out0,$out1,$inpperm
> -	vsel		$out0,$in0,$tmp,$out0	# $tmp is last block, remember?
> -
> -	subi		r30,$out,17
> -	subi		$out,$out,16
> -	mtctr		$taillen
> -Loop_xts_enc6x_steal:
> -	lbzu		r0,1(r30)
> -	stb		r0,16(r30)
> -	bdnz		Loop_xts_enc6x_steal
> -
> -	li		$taillen,0
> -	mtctr		$rounds
> -	b		Loop_xts_enc1x		# one more time...
> -
> -.align	4
> -Lxts_enc6x_done:
> -	${UCMP}i	$ivp,0
> -	beq		Lxts_enc6x_ret
> -
> -	vxor		$tweak,$twk0,$rndkey0
> -	le?vperm	$tweak,$tweak,$tweak,$leperm
> -	stvx_u		$tweak,0,$ivp
> -
> -Lxts_enc6x_ret:
> -	mtlr		r11
> -	li		r10,`$FRAME+15`
> -	li		r11,`$FRAME+31`
> -	stvx		$seven,r10,$sp		# wipe copies of round keys
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -
> -	mtspr		256,$vrsave
> -	lvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	lvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v30,r10,$sp
> -	lvx		v31,r11,$sp
> -	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
> -	blr
> -	.long		0
> -	.byte		0,12,0x04,1,0x80,6,6,0
> -	.long		0
> -
> -.align	5
> -_aesp8_xts_enc5x:
> -	vcipher		$out0,$out0,v24
> -	vcipher		$out1,$out1,v24
> -	vcipher		$out2,$out2,v24
> -	vcipher		$out3,$out3,v24
> -	vcipher		$out4,$out4,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vcipher		$out0,$out0,v25
> -	vcipher		$out1,$out1,v25
> -	vcipher		$out2,$out2,v25
> -	vcipher		$out3,$out3,v25
> -	vcipher		$out4,$out4,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		_aesp8_xts_enc5x
> -
> -	add		$inp,$inp,$taillen
> -	cmpwi		$taillen,0
> -	vcipher		$out0,$out0,v24
> -	vcipher		$out1,$out1,v24
> -	vcipher		$out2,$out2,v24
> -	vcipher		$out3,$out3,v24
> -	vcipher		$out4,$out4,v24
> -
> -	subi		$inp,$inp,16
> -	vcipher		$out0,$out0,v25
> -	vcipher		$out1,$out1,v25
> -	vcipher		$out2,$out2,v25
> -	vcipher		$out3,$out3,v25
> -	vcipher		$out4,$out4,v25
> -	 vxor		$twk0,$twk0,v31
> -
> -	vcipher		$out0,$out0,v26
> -	lvsr		$inpperm,r0,$taillen	# $in5 is no more
> -	vcipher		$out1,$out1,v26
> -	vcipher		$out2,$out2,v26
> -	vcipher		$out3,$out3,v26
> -	vcipher		$out4,$out4,v26
> -	 vxor		$in1,$twk1,v31
> -
> -	vcipher		$out0,$out0,v27
> -	lvx_u		$in0,0,$inp
> -	vcipher		$out1,$out1,v27
> -	vcipher		$out2,$out2,v27
> -	vcipher		$out3,$out3,v27
> -	vcipher		$out4,$out4,v27
> -	 vxor		$in2,$twk2,v31
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vcipher		$out0,$out0,v28
> -	vcipher		$out1,$out1,v28
> -	vcipher		$out2,$out2,v28
> -	vcipher		$out3,$out3,v28
> -	vcipher		$out4,$out4,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -	 vxor		$in3,$twk3,v31
> -
> -	vcipher		$out0,$out0,v29
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vcipher		$out1,$out1,v29
> -	vcipher		$out2,$out2,v29
> -	vcipher		$out3,$out3,v29
> -	vcipher		$out4,$out4,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vxor		$in4,$twk4,v31
> -
> -	vcipher		$out0,$out0,v30
> -	vperm		$in0,$in0,$in0,$inpperm
> -	vcipher		$out1,$out1,v30
> -	vcipher		$out2,$out2,v30
> -	vcipher		$out3,$out3,v30
> -	vcipher		$out4,$out4,v30
> -
> -	vcipherlast	$out0,$out0,$twk0
> -	vcipherlast	$out1,$out1,$in1
> -	vcipherlast	$out2,$out2,$in2
> -	vcipherlast	$out3,$out3,$in3
> -	vcipherlast	$out4,$out4,$in4
> -	blr
> -        .long   	0
> -        .byte   	0,12,0x14,0,0,0,0,0
> -
> -.align	5
> -_aesp8_xts_decrypt6x:
> -	$STU		$sp,-`($FRAME+21*16+6*$SIZE_T)`($sp)
> -	mflr		r11
> -	li		r7,`$FRAME+8*16+15`
> -	li		r3,`$FRAME+8*16+31`
> -	$PUSH		r11,`$FRAME+21*16+6*$SIZE_T+$LRSAVE`($sp)
> -	stvx		v20,r7,$sp		# ABI says so
> -	addi		r7,r7,32
> -	stvx		v21,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v22,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v23,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v24,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v25,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v26,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v27,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v28,r7,$sp
> -	addi		r7,r7,32
> -	stvx		v29,r3,$sp
> -	addi		r3,r3,32
> -	stvx		v30,r7,$sp
> -	stvx		v31,r3,$sp
> -	li		r0,-1
> -	stw		$vrsave,`$FRAME+21*16-4`($sp)	# save vrsave
> -	li		$x10,0x10
> -	$PUSH		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	li		$x20,0x20
> -	$PUSH		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	li		$x30,0x30
> -	$PUSH		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	li		$x40,0x40
> -	$PUSH		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	li		$x50,0x50
> -	$PUSH		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	li		$x60,0x60
> -	$PUSH		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	li		$x70,0x70
> -	mtspr		256,r0
> -
> -	subi		$rounds,$rounds,3	# -4 in total
> -
> -	lvx		$rndkey0,$x00,$key1	# load key schedule
> -	lvx		v30,$x10,$key1
> -	addi		$key1,$key1,0x20
> -	lvx		v31,$x00,$key1
> -	?vperm		$rndkey0,$rndkey0,v30,$keyperm
> -	addi		$key_,$sp,$FRAME+15
> -	mtctr		$rounds
> -
> -Load_xts_dec_key:
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v30,$x10,$key1
> -	addi		$key1,$key1,0x20
> -	stvx		v24,$x00,$key_		# off-load round[1]
> -	?vperm		v25,v31,v30,$keyperm
> -	lvx		v31,$x00,$key1
> -	stvx		v25,$x10,$key_		# off-load round[2]
> -	addi		$key_,$key_,0x20
> -	bdnz		Load_xts_dec_key
> -
> -	lvx		v26,$x10,$key1
> -	?vperm		v24,v30,v31,$keyperm
> -	lvx		v27,$x20,$key1
> -	stvx		v24,$x00,$key_		# off-load round[3]
> -	?vperm		v25,v31,v26,$keyperm
> -	lvx		v28,$x30,$key1
> -	stvx		v25,$x10,$key_		# off-load round[4]
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	?vperm		v26,v26,v27,$keyperm
> -	lvx		v29,$x40,$key1
> -	?vperm		v27,v27,v28,$keyperm
> -	lvx		v30,$x50,$key1
> -	?vperm		v28,v28,v29,$keyperm
> -	lvx		v31,$x60,$key1
> -	?vperm		v29,v29,v30,$keyperm
> -	lvx		$twk5,$x70,$key1	# borrow $twk5
> -	?vperm		v30,v30,v31,$keyperm
> -	lvx		v24,$x00,$key_		# pre-load round[1]
> -	?vperm		v31,v31,$twk5,$keyperm
> -	lvx		v25,$x10,$key_		# pre-load round[2]
> -
> -	 vperm		$in0,$inout,$inptail,$inpperm
> -	 subi		$inp,$inp,31		# undo "caller"
> -	vxor		$twk0,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out0,$in0,$twk0
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in1,$x10,$inp
> -	vxor		$twk1,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in1,$in1,$in1,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out1,$in1,$twk1
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in2,$x20,$inp
> -	 andi.		$taillen,$len,15
> -	vxor		$twk2,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in2,$in2,$in2,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out2,$in2,$twk2
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in3,$x30,$inp
> -	 sub		$len,$len,$taillen
> -	vxor		$twk3,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in3,$in3,$in3,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out3,$in3,$twk3
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in4,$x40,$inp
> -	 subi		$len,$len,0x60
> -	vxor		$twk4,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in4,$in4,$in4,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out4,$in4,$twk4
> -	vxor		$tweak,$tweak,$tmp
> -
> -	 lvx_u		$in5,$x50,$inp
> -	 addi		$inp,$inp,0x60
> -	vxor		$twk5,$tweak,$rndkey0
> -	vsrab		$tmp,$tweak,$seven	# next tweak value
> -	vaddubm		$tweak,$tweak,$tweak
> -	vsldoi		$tmp,$tmp,$tmp,15
> -	 le?vperm	$in5,$in5,$in5,$leperm
> -	vand		$tmp,$tmp,$eighty7
> -	 vxor		$out5,$in5,$twk5
> -	vxor		$tweak,$tweak,$tmp
> -
> -	vxor		v31,v31,$rndkey0
> -	mtctr		$rounds
> -	b		Loop_xts_dec6x
> -
> -.align	5
> -Loop_xts_dec6x:
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_xts_dec6x
> -
> -	subic		$len,$len,96		# $len-=96
> -	 vxor		$in0,$twk0,v31		# xor with last round key
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk0,$tweak,$rndkey0
> -	 vaddubm	$tweak,$tweak,$tweak
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vncipher	$out4,$out4,v24
> -	vncipher	$out5,$out5,v24
> -
> -	subfe.		r0,r0,r0		# borrow?-1:0
> -	 vand		$tmp,$tmp,$eighty7
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	 vxor		$in1,$twk1,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk1,$tweak,$rndkey0
> -	vncipher	$out4,$out4,v25
> -	vncipher	$out5,$out5,v25
> -
> -	and		r0,r0,$len
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vncipher	$out0,$out0,v26
> -	vncipher	$out1,$out1,v26
> -	 vand		$tmp,$tmp,$eighty7
> -	vncipher	$out2,$out2,v26
> -	vncipher	$out3,$out3,v26
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipher	$out4,$out4,v26
> -	vncipher	$out5,$out5,v26
> -
> -	add		$inp,$inp,r0		# $inp is adjusted in such
> -						# way that at exit from the
> -						# loop inX-in5 are loaded
> -						# with last "words"
> -	 vxor		$in2,$twk2,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk2,$tweak,$rndkey0
> -	 vaddubm	$tweak,$tweak,$tweak
> -	vncipher	$out0,$out0,v27
> -	vncipher	$out1,$out1,v27
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vncipher	$out2,$out2,v27
> -	vncipher	$out3,$out3,v27
> -	 vand		$tmp,$tmp,$eighty7
> -	vncipher	$out4,$out4,v27
> -	vncipher	$out5,$out5,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipher	$out0,$out0,v28
> -	vncipher	$out1,$out1,v28
> -	 vxor		$in3,$twk3,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk3,$tweak,$rndkey0
> -	vncipher	$out2,$out2,v28
> -	vncipher	$out3,$out3,v28
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vncipher	$out4,$out4,v28
> -	vncipher	$out5,$out5,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -	 vand		$tmp,$tmp,$eighty7
> -
> -	vncipher	$out0,$out0,v29
> -	vncipher	$out1,$out1,v29
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipher	$out2,$out2,v29
> -	vncipher	$out3,$out3,v29
> -	 vxor		$in4,$twk4,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk4,$tweak,$rndkey0
> -	vncipher	$out4,$out4,v29
> -	vncipher	$out5,$out5,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -
> -	vncipher	$out0,$out0,v30
> -	vncipher	$out1,$out1,v30
> -	 vand		$tmp,$tmp,$eighty7
> -	vncipher	$out2,$out2,v30
> -	vncipher	$out3,$out3,v30
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipher	$out4,$out4,v30
> -	vncipher	$out5,$out5,v30
> -	 vxor		$in5,$twk5,v31
> -	 vsrab		$tmp,$tweak,$seven	# next tweak value
> -	 vxor		$twk5,$tweak,$rndkey0
> -
> -	vncipherlast	$out0,$out0,$in0
> -	 lvx_u		$in0,$x00,$inp		# load next input block
> -	 vaddubm	$tweak,$tweak,$tweak
> -	 vsldoi		$tmp,$tmp,$tmp,15
> -	vncipherlast	$out1,$out1,$in1
> -	 lvx_u		$in1,$x10,$inp
> -	vncipherlast	$out2,$out2,$in2
> -	 le?vperm	$in0,$in0,$in0,$leperm
> -	 lvx_u		$in2,$x20,$inp
> -	 vand		$tmp,$tmp,$eighty7
> -	vncipherlast	$out3,$out3,$in3
> -	 le?vperm	$in1,$in1,$in1,$leperm
> -	 lvx_u		$in3,$x30,$inp
> -	vncipherlast	$out4,$out4,$in4
> -	 le?vperm	$in2,$in2,$in2,$leperm
> -	 lvx_u		$in4,$x40,$inp
> -	 vxor		$tweak,$tweak,$tmp
> -	vncipherlast	$out5,$out5,$in5
> -	 le?vperm	$in3,$in3,$in3,$leperm
> -	 lvx_u		$in5,$x50,$inp
> -	 addi		$inp,$inp,0x60
> -	 le?vperm	$in4,$in4,$in4,$leperm
> -	 le?vperm	$in5,$in5,$in5,$leperm
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	 vxor		$out0,$in0,$twk0
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	 vxor		$out1,$in1,$twk1
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	 vxor		$out2,$in2,$twk2
> -	le?vperm	$out4,$out4,$out4,$leperm
> -	stvx_u		$out3,$x30,$out
> -	 vxor		$out3,$in3,$twk3
> -	le?vperm	$out5,$out5,$out5,$leperm
> -	stvx_u		$out4,$x40,$out
> -	 vxor		$out4,$in4,$twk4
> -	stvx_u		$out5,$x50,$out
> -	 vxor		$out5,$in5,$twk5
> -	addi		$out,$out,0x60
> -
> -	mtctr		$rounds
> -	beq		Loop_xts_dec6x		# did $len-=96 borrow?
> -
> -	addic.		$len,$len,0x60
> -	beq		Lxts_dec6x_zero
> -	cmpwi		$len,0x20
> -	blt		Lxts_dec6x_one
> -	nop
> -	beq		Lxts_dec6x_two
> -	cmpwi		$len,0x40
> -	blt		Lxts_dec6x_three
> -	nop
> -	beq		Lxts_dec6x_four
> -
> -Lxts_dec6x_five:
> -	vxor		$out0,$in1,$twk0
> -	vxor		$out1,$in2,$twk1
> -	vxor		$out2,$in3,$twk2
> -	vxor		$out3,$in4,$twk3
> -	vxor		$out4,$in5,$twk4
> -
> -	bl		_aesp8_xts_dec5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk5		# unused tweak
> -	vxor		$twk1,$tweak,$rndkey0
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	vxor		$out0,$in0,$twk1
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	le?vperm	$out4,$out4,$out4,$leperm
> -	stvx_u		$out3,$x30,$out
> -	stvx_u		$out4,$x40,$out
> -	addi		$out,$out,0x50
> -	bne		Lxts_dec6x_steal
> -	b		Lxts_dec6x_done
> -
> -.align	4
> -Lxts_dec6x_four:
> -	vxor		$out0,$in2,$twk0
> -	vxor		$out1,$in3,$twk1
> -	vxor		$out2,$in4,$twk2
> -	vxor		$out3,$in5,$twk3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_dec5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk4		# unused tweak
> -	vmr		$twk1,$twk5
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	vxor		$out0,$in0,$twk5
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	le?vperm	$out3,$out3,$out3,$leperm
> -	stvx_u		$out2,$x20,$out
> -	stvx_u		$out3,$x30,$out
> -	addi		$out,$out,0x40
> -	bne		Lxts_dec6x_steal
> -	b		Lxts_dec6x_done
> -
> -.align	4
> -Lxts_dec6x_three:
> -	vxor		$out0,$in3,$twk0
> -	vxor		$out1,$in4,$twk1
> -	vxor		$out2,$in5,$twk2
> -	vxor		$out3,$out3,$out3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_dec5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk3		# unused tweak
> -	vmr		$twk1,$twk4
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	vxor		$out0,$in0,$twk4
> -	le?vperm	$out2,$out2,$out2,$leperm
> -	stvx_u		$out1,$x10,$out
> -	stvx_u		$out2,$x20,$out
> -	addi		$out,$out,0x30
> -	bne		Lxts_dec6x_steal
> -	b		Lxts_dec6x_done
> -
> -.align	4
> -Lxts_dec6x_two:
> -	vxor		$out0,$in4,$twk0
> -	vxor		$out1,$in5,$twk1
> -	vxor		$out2,$out2,$out2
> -	vxor		$out3,$out3,$out3
> -	vxor		$out4,$out4,$out4
> -
> -	bl		_aesp8_xts_dec5x
> -
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	vmr		$twk0,$twk2		# unused tweak
> -	vmr		$twk1,$twk3
> -	le?vperm	$out1,$out1,$out1,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	vxor		$out0,$in0,$twk3
> -	stvx_u		$out1,$x10,$out
> -	addi		$out,$out,0x20
> -	bne		Lxts_dec6x_steal
> -	b		Lxts_dec6x_done
> -
> -.align	4
> -Lxts_dec6x_one:
> -	vxor		$out0,$in5,$twk0
> -	nop
> -Loop_xts_dec1x:
> -	vncipher	$out0,$out0,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out0,$out0,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Loop_xts_dec1x
> -
> -	subi		r0,$taillen,1
> -	vncipher	$out0,$out0,v24
> -
> -	andi.		r0,r0,16
> -	cmpwi		$taillen,0
> -	vncipher	$out0,$out0,v25
> -
> -	sub		$inp,$inp,r0
> -	vncipher	$out0,$out0,v26
> -
> -	lvx_u		$in0,0,$inp
> -	vncipher	$out0,$out0,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vncipher	$out0,$out0,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -
> -	vncipher	$out0,$out0,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vxor		$twk0,$twk0,v31
> -
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vncipher	$out0,$out0,v30
> -
> -	mtctr		$rounds
> -	vncipherlast	$out0,$out0,$twk0
> -
> -	vmr		$twk0,$twk1		# unused tweak
> -	vmr		$twk1,$twk2
> -	le?vperm	$out0,$out0,$out0,$leperm
> -	stvx_u		$out0,$x00,$out		# store output
> -	addi		$out,$out,0x10
> -	vxor		$out0,$in0,$twk2
> -	bne		Lxts_dec6x_steal
> -	b		Lxts_dec6x_done
> -
> -.align	4
> -Lxts_dec6x_zero:
> -	cmpwi		$taillen,0
> -	beq		Lxts_dec6x_done
> -
> -	lvx_u		$in0,0,$inp
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vxor		$out0,$in0,$twk1
> -Lxts_dec6x_steal:
> -	vncipher	$out0,$out0,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out0,$out0,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		Lxts_dec6x_steal
> -
> -	add		$inp,$inp,$taillen
> -	vncipher	$out0,$out0,v24
> -
> -	cmpwi		$taillen,0
> -	vncipher	$out0,$out0,v25
> -
> -	lvx_u		$in0,0,$inp
> -	vncipher	$out0,$out0,v26
> -
> -	lvsr		$inpperm,0,$taillen	# $in5 is no more
> -	vncipher	$out0,$out0,v27
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vncipher	$out0,$out0,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -
> -	vncipher	$out0,$out0,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vxor		$twk1,$twk1,v31
> -
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vncipher	$out0,$out0,v30
> -
> -	vperm		$in0,$in0,$in0,$inpperm
> -	vncipherlast	$tmp,$out0,$twk1
> -
> -	le?vperm	$out0,$tmp,$tmp,$leperm
> -	le?stvx_u	$out0,0,$out
> -	be?stvx_u	$tmp,0,$out
> -
> -	vxor		$out0,$out0,$out0
> -	vspltisb	$out1,-1
> -	vperm		$out0,$out0,$out1,$inpperm
> -	vsel		$out0,$in0,$tmp,$out0
> -	vxor		$out0,$out0,$twk0
> -
> -	subi		r30,$out,1
> -	mtctr		$taillen
> -Loop_xts_dec6x_steal:
> -	lbzu		r0,1(r30)
> -	stb		r0,16(r30)
> -	bdnz		Loop_xts_dec6x_steal
> -
> -	li		$taillen,0
> -	mtctr		$rounds
> -	b		Loop_xts_dec1x		# one more time...
> -
> -.align	4
> -Lxts_dec6x_done:
> -	${UCMP}i	$ivp,0
> -	beq		Lxts_dec6x_ret
> -
> -	vxor		$tweak,$twk0,$rndkey0
> -	le?vperm	$tweak,$tweak,$tweak,$leperm
> -	stvx_u		$tweak,0,$ivp
> -
> -Lxts_dec6x_ret:
> -	mtlr		r11
> -	li		r10,`$FRAME+15`
> -	li		r11,`$FRAME+31`
> -	stvx		$seven,r10,$sp		# wipe copies of round keys
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -	stvx		$seven,r10,$sp
> -	addi		r10,r10,32
> -	stvx		$seven,r11,$sp
> -	addi		r11,r11,32
> -
> -	mtspr		256,$vrsave
> -	lvx		v20,r10,$sp		# ABI says so
> -	addi		r10,r10,32
> -	lvx		v21,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v22,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v23,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v24,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v25,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v26,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v27,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v28,r10,$sp
> -	addi		r10,r10,32
> -	lvx		v29,r11,$sp
> -	addi		r11,r11,32
> -	lvx		v30,r10,$sp
> -	lvx		v31,r11,$sp
> -	$POP		r26,`$FRAME+21*16+0*$SIZE_T`($sp)
> -	$POP		r27,`$FRAME+21*16+1*$SIZE_T`($sp)
> -	$POP		r28,`$FRAME+21*16+2*$SIZE_T`($sp)
> -	$POP		r29,`$FRAME+21*16+3*$SIZE_T`($sp)
> -	$POP		r30,`$FRAME+21*16+4*$SIZE_T`($sp)
> -	$POP		r31,`$FRAME+21*16+5*$SIZE_T`($sp)
> -	addi		$sp,$sp,`$FRAME+21*16+6*$SIZE_T`
> -	blr
> -	.long		0
> -	.byte		0,12,0x04,1,0x80,6,6,0
> -	.long		0
> -
> -.align	5
> -_aesp8_xts_dec5x:
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -	lvx		v24,$x20,$key_		# round[3]
> -	addi		$key_,$key_,0x20
> -
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	lvx		v25,$x10,$key_		# round[4]
> -	bdnz		_aesp8_xts_dec5x
> -
> -	subi		r0,$taillen,1
> -	vncipher	$out0,$out0,v24
> -	vncipher	$out1,$out1,v24
> -	vncipher	$out2,$out2,v24
> -	vncipher	$out3,$out3,v24
> -	vncipher	$out4,$out4,v24
> -
> -	andi.		r0,r0,16
> -	cmpwi		$taillen,0
> -	vncipher	$out0,$out0,v25
> -	vncipher	$out1,$out1,v25
> -	vncipher	$out2,$out2,v25
> -	vncipher	$out3,$out3,v25
> -	vncipher	$out4,$out4,v25
> -	 vxor		$twk0,$twk0,v31
> -
> -	sub		$inp,$inp,r0
> -	vncipher	$out0,$out0,v26
> -	vncipher	$out1,$out1,v26
> -	vncipher	$out2,$out2,v26
> -	vncipher	$out3,$out3,v26
> -	vncipher	$out4,$out4,v26
> -	 vxor		$in1,$twk1,v31
> -
> -	vncipher	$out0,$out0,v27
> -	lvx_u		$in0,0,$inp
> -	vncipher	$out1,$out1,v27
> -	vncipher	$out2,$out2,v27
> -	vncipher	$out3,$out3,v27
> -	vncipher	$out4,$out4,v27
> -	 vxor		$in2,$twk2,v31
> -
> -	addi		$key_,$sp,$FRAME+15	# rewind $key_
> -	vncipher	$out0,$out0,v28
> -	vncipher	$out1,$out1,v28
> -	vncipher	$out2,$out2,v28
> -	vncipher	$out3,$out3,v28
> -	vncipher	$out4,$out4,v28
> -	lvx		v24,$x00,$key_		# re-pre-load round[1]
> -	 vxor		$in3,$twk3,v31
> -
> -	vncipher	$out0,$out0,v29
> -	le?vperm	$in0,$in0,$in0,$leperm
> -	vncipher	$out1,$out1,v29
> -	vncipher	$out2,$out2,v29
> -	vncipher	$out3,$out3,v29
> -	vncipher	$out4,$out4,v29
> -	lvx		v25,$x10,$key_		# re-pre-load round[2]
> -	 vxor		$in4,$twk4,v31
> -
> -	vncipher	$out0,$out0,v30
> -	vncipher	$out1,$out1,v30
> -	vncipher	$out2,$out2,v30
> -	vncipher	$out3,$out3,v30
> -	vncipher	$out4,$out4,v30
> -
> -	vncipherlast	$out0,$out0,$twk0
> -	vncipherlast	$out1,$out1,$in1
> -	vncipherlast	$out2,$out2,$in2
> -	vncipherlast	$out3,$out3,$in3
> -	vncipherlast	$out4,$out4,$in4
> -	mtctr		$rounds
> -	blr
> -        .long   	0
> -        .byte   	0,12,0x14,0,0,0,0,0
> -___
> -}}	}}}
> -
> -my $consts=1;
> -foreach(split("\n",$code)) {
> -        s/\`([^\`]*)\`/eval($1)/geo;
> -
> -	# constants table endian-specific conversion
> -	if ($consts && m/\.(long|byte)\s+(.+)\s+(\?[a-z]*)$/o) {
> -	    my $conv=$3;
> -	    my @bytes=();
> -
> -	    # convert to endian-agnostic format
> -	    if ($1 eq "long") {
> -	      foreach (split(/,\s*/,$2)) {
> -		my $l = /^0/?oct:int;
> -		push @bytes,($l>>24)&0xff,($l>>16)&0xff,($l>>8)&0xff,$l&0xff;
> -	      }
> -	    } else {
> -		@bytes = map(/^0/?oct:int,split(/,\s*/,$2));
> -	    }
> -
> -	    # little-endian conversion
> -	    if ($flavour =~ /le$/o) {
> -		SWITCH: for($conv)  {
> -		    /\?inv/ && do   { @bytes=map($_^0xf,@bytes); last; };
> -		    /\?rev/ && do   { @bytes=reverse(@bytes);    last; };
> -		}
> -	    }
> -
> -	    #emit
> -	    print ".byte\t",join(',',map (sprintf("0x%02x",$_),@bytes)),"\n";
> -	    next;
> -	}
> -	$consts=0 if (m/Lconsts:/o);	# end of table
> -
> -	# instructions prefixed with '?' are endian-specific and need
> -	# to be adjusted accordingly...
> -	if ($flavour =~ /le$/o) {	# little-endian
> -	    s/le\?//o		or
> -	    s/be\?/#be#/o	or
> -	    s/\?lvsr/lvsl/o	or
> -	    s/\?lvsl/lvsr/o	or
> -	    s/\?(vperm\s+v[0-9]+,\s*)(v[0-9]+,\s*)(v[0-9]+,\s*)(v[0-9]+)/$1$3$2$4/o or
> -	    s/\?(vsldoi\s+v[0-9]+,\s*)(v[0-9]+,)\s*(v[0-9]+,\s*)([0-9]+)/$1$3$2 16-$4/o or
> -	    s/\?(vspltw\s+v[0-9]+,\s*)(v[0-9]+,)\s*([0-9])/$1$2 3-$3/o;
> -	} else {			# big-endian
> -	    s/le\?/#le#/o	or
> -	    s/be\?//o		or
> -	    s/\?([a-z]+)/$1/o;
> -	}
> -
> -        print $_,"\n";
> -}
> -
> -close STDOUT;
> diff --git a/drivers/crypto/vmx/ghash.c b/drivers/crypto/vmx/ghash.c
> deleted file mode 100644
> index 27a94a119009..000000000000
> --- a/drivers/crypto/vmx/ghash.c
> +++ /dev/null
> @@ -1,227 +0,0 @@
> -/**
> - * GHASH routines supporting VMX instructions on the Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
> - */
> -
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/crypto.h>
> -#include <linux/delay.h>
> -#include <linux/hardirq.h>
> -#include <asm/switch_to.h>
> -#include <crypto/aes.h>
> -#include <crypto/ghash.h>
> -#include <crypto/scatterwalk.h>
> -#include <crypto/internal/hash.h>
> -#include <crypto/b128ops.h>
> -
> -#define IN_INTERRUPT in_interrupt()
> -
> -void gcm_init_p8(u128 htable[16], const u64 Xi[2]);
> -void gcm_gmult_p8(u64 Xi[2], const u128 htable[16]);
> -void gcm_ghash_p8(u64 Xi[2], const u128 htable[16],
> -		  const u8 *in, size_t len);
> -
> -struct p8_ghash_ctx {
> -	u128 htable[16];
> -	struct crypto_shash *fallback;
> -};
> -
> -struct p8_ghash_desc_ctx {
> -	u64 shash[2];
> -	u8 buffer[GHASH_DIGEST_SIZE];
> -	int bytes;
> -	struct shash_desc fallback_desc;
> -};
> -
> -static int p8_ghash_init_tfm(struct crypto_tfm *tfm)
> -{
> -	const char *alg = "ghash-generic";
> -	struct crypto_shash *fallback;
> -	struct crypto_shash *shash_tfm = __crypto_shash_cast(tfm);
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	fallback = crypto_alloc_shash(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
> -	if (IS_ERR(fallback)) {
> -		printk(KERN_ERR
> -		       "Failed to allocate transformation for '%s': %ld\n",
> -		       alg, PTR_ERR(fallback));
> -		return PTR_ERR(fallback);
> -	}
> -	printk(KERN_INFO "Using '%s' as fallback implementation.\n",
> -	       crypto_tfm_alg_driver_name(crypto_shash_tfm(fallback)));
> -
> -	crypto_shash_set_flags(fallback,
> -			       crypto_shash_get_flags((struct crypto_shash
> -						       *) tfm));
> -
> -	/* Check if the descsize defined in the algorithm is still enough. */
> -	if (shash_tfm->descsize < sizeof(struct p8_ghash_desc_ctx)
> -	    + crypto_shash_descsize(fallback)) {
> -		printk(KERN_ERR
> -		       "Desc size of the fallback implementation (%s) does not match the expected value: %lu vs %u\n",
> -		       alg,
> -		       shash_tfm->descsize - sizeof(struct p8_ghash_desc_ctx),
> -		       crypto_shash_descsize(fallback));
> -		return -EINVAL;
> -	}
> -	ctx->fallback = fallback;
> -
> -	return 0;
> -}
> -
> -static void p8_ghash_exit_tfm(struct crypto_tfm *tfm)
> -{
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(tfm);
> -
> -	if (ctx->fallback) {
> -		crypto_free_shash(ctx->fallback);
> -		ctx->fallback = NULL;
> -	}
> -}
> -
> -static int p8_ghash_init(struct shash_desc *desc)
> -{
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> -	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> -
> -	dctx->bytes = 0;
> -	memset(dctx->shash, 0, GHASH_DIGEST_SIZE);
> -	dctx->fallback_desc.tfm = ctx->fallback;
> -	dctx->fallback_desc.flags = desc->flags;
> -	return crypto_shash_init(&dctx->fallback_desc);
> -}
> -
> -static int p8_ghash_setkey(struct crypto_shash *tfm, const u8 *key,
> -			   unsigned int keylen)
> -{
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(tfm));
> -
> -	if (keylen != GHASH_BLOCK_SIZE)
> -		return -EINVAL;
> -
> -	preempt_disable();
> -	pagefault_disable();
> -	enable_kernel_vsx();
> -	gcm_init_p8(ctx->htable, (const u64 *) key);
> -	disable_kernel_vsx();
> -	pagefault_enable();
> -	preempt_enable();
> -	return crypto_shash_setkey(ctx->fallback, key, keylen);
> -}
> -
> -static int p8_ghash_update(struct shash_desc *desc,
> -			   const u8 *src, unsigned int srclen)
> -{
> -	unsigned int len;
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> -	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> -
> -	if (IN_INTERRUPT) {
> -		return crypto_shash_update(&dctx->fallback_desc, src,
> -					   srclen);
> -	} else {
> -		if (dctx->bytes) {
> -			if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) {
> -				memcpy(dctx->buffer + dctx->bytes, src,
> -				       srclen);
> -				dctx->bytes += srclen;
> -				return 0;
> -			}
> -			memcpy(dctx->buffer + dctx->bytes, src,
> -			       GHASH_DIGEST_SIZE - dctx->bytes);
> -			preempt_disable();
> -			pagefault_disable();
> -			enable_kernel_vsx();
> -			gcm_ghash_p8(dctx->shash, ctx->htable,
> -				     dctx->buffer, GHASH_DIGEST_SIZE);
> -			disable_kernel_vsx();
> -			pagefault_enable();
> -			preempt_enable();
> -			src += GHASH_DIGEST_SIZE - dctx->bytes;
> -			srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
> -			dctx->bytes = 0;
> -		}
> -		len = srclen & ~(GHASH_DIGEST_SIZE - 1);
> -		if (len) {
> -			preempt_disable();
> -			pagefault_disable();
> -			enable_kernel_vsx();
> -			gcm_ghash_p8(dctx->shash, ctx->htable, src, len);
> -			disable_kernel_vsx();
> -			pagefault_enable();
> -			preempt_enable();
> -			src += len;
> -			srclen -= len;
> -		}
> -		if (srclen) {
> -			memcpy(dctx->buffer, src, srclen);
> -			dctx->bytes = srclen;
> -		}
> -		return 0;
> -	}
> -}
> -
> -static int p8_ghash_final(struct shash_desc *desc, u8 *out)
> -{
> -	int i;
> -	struct p8_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> -	struct p8_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> -
> -	if (IN_INTERRUPT) {
> -		return crypto_shash_final(&dctx->fallback_desc, out);
> -	} else {
> -		if (dctx->bytes) {
> -			for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
> -				dctx->buffer[i] = 0;
> -			preempt_disable();
> -			pagefault_disable();
> -			enable_kernel_vsx();
> -			gcm_ghash_p8(dctx->shash, ctx->htable,
> -				     dctx->buffer, GHASH_DIGEST_SIZE);
> -			disable_kernel_vsx();
> -			pagefault_enable();
> -			preempt_enable();
> -			dctx->bytes = 0;
> -		}
> -		memcpy(out, dctx->shash, GHASH_DIGEST_SIZE);
> -		return 0;
> -	}
> -}
> -
> -struct shash_alg p8_ghash_alg = {
> -	.digestsize = GHASH_DIGEST_SIZE,
> -	.init = p8_ghash_init,
> -	.update = p8_ghash_update,
> -	.final = p8_ghash_final,
> -	.setkey = p8_ghash_setkey,
> -	.descsize = sizeof(struct p8_ghash_desc_ctx)
> -		+ sizeof(struct ghash_desc_ctx),
> -	.base = {
> -		 .cra_name = "ghash",
> -		 .cra_driver_name = "p8_ghash",
> -		 .cra_priority = 1000,
> -		 .cra_flags = CRYPTO_ALG_TYPE_SHASH | CRYPTO_ALG_NEED_FALLBACK,
> -		 .cra_blocksize = GHASH_BLOCK_SIZE,
> -		 .cra_ctxsize = sizeof(struct p8_ghash_ctx),
> -		 .cra_module = THIS_MODULE,
> -		 .cra_init = p8_ghash_init_tfm,
> -		 .cra_exit = p8_ghash_exit_tfm,
> -	},
> -};
> diff --git a/drivers/crypto/vmx/ghashp8-ppc.pl b/drivers/crypto/vmx/ghashp8-ppc.pl
> deleted file mode 100644
> index d8429cb71f02..000000000000
> --- a/drivers/crypto/vmx/ghashp8-ppc.pl
> +++ /dev/null
> @@ -1,234 +0,0 @@
> -#!/usr/bin/env perl
> -#
> -# ====================================================================
> -# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
> -# project. The module is, however, dual licensed under OpenSSL and
> -# CRYPTOGAMS licenses depending on where you obtain it. For further
> -# details see http://www.openssl.org/~appro/cryptogams/.
> -# ====================================================================
> -#
> -# GHASH for for PowerISA v2.07.
> -#
> -# July 2014
> -#
> -# Accurate performance measurements are problematic, because it's
> -# always virtualized setup with possibly throttled processor.
> -# Relative comparison is therefore more informative. This initial
> -# version is ~2.1x slower than hardware-assisted AES-128-CTR, ~12x
> -# faster than "4-bit" integer-only compiler-generated 64-bit code.
> -# "Initial version" means that there is room for futher improvement.
> -
> -$flavour=shift;
> -$output =shift;
> -
> -if ($flavour =~ /64/) {
> -	$SIZE_T=8;
> -	$LRSAVE=2*$SIZE_T;
> -	$STU="stdu";
> -	$POP="ld";
> -	$PUSH="std";
> -} elsif ($flavour =~ /32/) {
> -	$SIZE_T=4;
> -	$LRSAVE=$SIZE_T;
> -	$STU="stwu";
> -	$POP="lwz";
> -	$PUSH="stw";
> -} else { die "nonsense $flavour"; }
> -
> -$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
> -( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
> -( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
> -die "can't locate ppc-xlate.pl";
> -
> -open STDOUT,"| $^X $xlate $flavour $output" || die "can't call $xlate: $!";
> -
> -my ($Xip,$Htbl,$inp,$len)=map("r$_",(3..6));	# argument block
> -
> -my ($Xl,$Xm,$Xh,$IN)=map("v$_",(0..3));
> -my ($zero,$t0,$t1,$t2,$xC2,$H,$Hh,$Hl,$lemask)=map("v$_",(4..12));
> -my $vrsave="r12";
> -
> -$code=<<___;
> -.machine	"any"
> -
> -.text
> -
> -.globl	.gcm_init_p8
> -	lis		r0,0xfff0
> -	li		r8,0x10
> -	mfspr		$vrsave,256
> -	li		r9,0x20
> -	mtspr		256,r0
> -	li		r10,0x30
> -	lvx_u		$H,0,r4			# load H
> -	le?xor		r7,r7,r7
> -	le?addi		r7,r7,0x8		# need a vperm start with 08
> -	le?lvsr		5,0,r7
> -	le?vspltisb	6,0x0f
> -	le?vxor		5,5,6			# set a b-endian mask
> -	le?vperm	$H,$H,$H,5
> -
> -	vspltisb	$xC2,-16		# 0xf0
> -	vspltisb	$t0,1			# one
> -	vaddubm		$xC2,$xC2,$xC2		# 0xe0
> -	vxor		$zero,$zero,$zero
> -	vor		$xC2,$xC2,$t0		# 0xe1
> -	vsldoi		$xC2,$xC2,$zero,15	# 0xe1...
> -	vsldoi		$t1,$zero,$t0,1		# ...1
> -	vaddubm		$xC2,$xC2,$xC2		# 0xc2...
> -	vspltisb	$t2,7
> -	vor		$xC2,$xC2,$t1		# 0xc2....01
> -	vspltb		$t1,$H,0		# most significant byte
> -	vsl		$H,$H,$t0		# H<<=1
> -	vsrab		$t1,$t1,$t2		# broadcast carry bit
> -	vand		$t1,$t1,$xC2
> -	vxor		$H,$H,$t1		# twisted H
> -
> -	vsldoi		$H,$H,$H,8		# twist even more ...
> -	vsldoi		$xC2,$zero,$xC2,8	# 0xc2.0
> -	vsldoi		$Hl,$zero,$H,8		# ... and split
> -	vsldoi		$Hh,$H,$zero,8
> -
> -	stvx_u		$xC2,0,r3		# save pre-computed table
> -	stvx_u		$Hl,r8,r3
> -	stvx_u		$H, r9,r3
> -	stvx_u		$Hh,r10,r3
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,2,0
> -	.long		0
> -.size	.gcm_init_p8,.-.gcm_init_p8
> -
> -.globl	.gcm_gmult_p8
> -	lis		r0,0xfff8
> -	li		r8,0x10
> -	mfspr		$vrsave,256
> -	li		r9,0x20
> -	mtspr		256,r0
> -	li		r10,0x30
> -	lvx_u		$IN,0,$Xip		# load Xi
> -
> -	lvx_u		$Hl,r8,$Htbl		# load pre-computed table
> -	 le?lvsl	$lemask,r0,r0
> -	lvx_u		$H, r9,$Htbl
> -	 le?vspltisb	$t0,0x07
> -	lvx_u		$Hh,r10,$Htbl
> -	 le?vxor	$lemask,$lemask,$t0
> -	lvx_u		$xC2,0,$Htbl
> -	 le?vperm	$IN,$IN,$IN,$lemask
> -	vxor		$zero,$zero,$zero
> -
> -	vpmsumd		$Xl,$IN,$Hl		# H.lo·Xi.lo
> -	vpmsumd		$Xm,$IN,$H		# H.hi·Xi.lo+H.lo·Xi.hi
> -	vpmsumd		$Xh,$IN,$Hh		# H.hi·Xi.hi
> -
> -	vpmsumd		$t2,$Xl,$xC2		# 1st phase
> -
> -	vsldoi		$t0,$Xm,$zero,8
> -	vsldoi		$t1,$zero,$Xm,8
> -	vxor		$Xl,$Xl,$t0
> -	vxor		$Xh,$Xh,$t1
> -
> -	vsldoi		$Xl,$Xl,$Xl,8
> -	vxor		$Xl,$Xl,$t2
> -
> -	vsldoi		$t1,$Xl,$Xl,8		# 2nd phase
> -	vpmsumd		$Xl,$Xl,$xC2
> -	vxor		$t1,$t1,$Xh
> -	vxor		$Xl,$Xl,$t1
> -
> -	le?vperm	$Xl,$Xl,$Xl,$lemask
> -	stvx_u		$Xl,0,$Xip		# write out Xi
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,2,0
> -	.long		0
> -.size	.gcm_gmult_p8,.-.gcm_gmult_p8
> -
> -.globl	.gcm_ghash_p8
> -	lis		r0,0xfff8
> -	li		r8,0x10
> -	mfspr		$vrsave,256
> -	li		r9,0x20
> -	mtspr		256,r0
> -	li		r10,0x30
> -	lvx_u		$Xl,0,$Xip		# load Xi
> -
> -	lvx_u		$Hl,r8,$Htbl		# load pre-computed table
> -	 le?lvsl	$lemask,r0,r0
> -	lvx_u		$H, r9,$Htbl
> -	 le?vspltisb	$t0,0x07
> -	lvx_u		$Hh,r10,$Htbl
> -	 le?vxor	$lemask,$lemask,$t0
> -	lvx_u		$xC2,0,$Htbl
> -	 le?vperm	$Xl,$Xl,$Xl,$lemask
> -	vxor		$zero,$zero,$zero
> -
> -	lvx_u		$IN,0,$inp
> -	addi		$inp,$inp,16
> -	subi		$len,$len,16
> -	 le?vperm	$IN,$IN,$IN,$lemask
> -	vxor		$IN,$IN,$Xl
> -	b		Loop
> -
> -.align	5
> -Loop:
> -	 subic		$len,$len,16
> -	vpmsumd		$Xl,$IN,$Hl		# H.lo·Xi.lo
> -	 subfe.		r0,r0,r0		# borrow?-1:0
> -	vpmsumd		$Xm,$IN,$H		# H.hi·Xi.lo+H.lo·Xi.hi
> -	 and		r0,r0,$len
> -	vpmsumd		$Xh,$IN,$Hh		# H.hi·Xi.hi
> -	 add		$inp,$inp,r0
> -
> -	vpmsumd		$t2,$Xl,$xC2		# 1st phase
> -
> -	vsldoi		$t0,$Xm,$zero,8
> -	vsldoi		$t1,$zero,$Xm,8
> -	vxor		$Xl,$Xl,$t0
> -	vxor		$Xh,$Xh,$t1
> -
> -	vsldoi		$Xl,$Xl,$Xl,8
> -	vxor		$Xl,$Xl,$t2
> -	 lvx_u		$IN,0,$inp
> -	 addi		$inp,$inp,16
> -
> -	vsldoi		$t1,$Xl,$Xl,8		# 2nd phase
> -	vpmsumd		$Xl,$Xl,$xC2
> -	 le?vperm	$IN,$IN,$IN,$lemask
> -	vxor		$t1,$t1,$Xh
> -	vxor		$IN,$IN,$t1
> -	vxor		$IN,$IN,$Xl
> -	beq		Loop			# did $len-=16 borrow?
> -
> -	vxor		$Xl,$Xl,$t1
> -	le?vperm	$Xl,$Xl,$Xl,$lemask
> -	stvx_u		$Xl,0,$Xip		# write out Xi
> -
> -	mtspr		256,$vrsave
> -	blr
> -	.long		0
> -	.byte		0,12,0x14,0,0,0,4,0
> -	.long		0
> -.size	.gcm_ghash_p8,.-.gcm_ghash_p8
> -
> -.asciz  "GHASH for PowerISA 2.07, CRYPTOGAMS by <appro\@openssl.org>"
> -.align  2
> -___
> -
> -foreach (split("\n",$code)) {
> -	if ($flavour =~ /le$/o) {	# little-endian
> -	    s/le\?//o		or
> -	    s/be\?/#be#/o;
> -	} else {
> -	    s/le\?/#le#/o	or
> -	    s/be\?//o;
> -	}
> -	print $_,"\n";
> -}
> -
> -close STDOUT; # enforce flush
> diff --git a/drivers/crypto/vmx/ppc-xlate.pl b/drivers/crypto/vmx/ppc-xlate.pl
> deleted file mode 100644
> index b18e67d0e065..000000000000
> --- a/drivers/crypto/vmx/ppc-xlate.pl
> +++ /dev/null
> @@ -1,228 +0,0 @@
> -#!/usr/bin/env perl
> -
> -# PowerPC assembler distiller by <appro>.
> -
> -my $flavour = shift;
> -my $output = shift;
> -open STDOUT,">$output" || die "can't open $output: $!";
> -
> -my %GLOBALS;
> -my $dotinlocallabels=($flavour=~/linux/)?1:0;
> -
> -################################################################
> -# directives which need special treatment on different platforms
> -################################################################
> -my $globl = sub {
> -    my $junk = shift;
> -    my $name = shift;
> -    my $global = \$GLOBALS{$name};
> -    my $ret;
> -
> -    $name =~ s|^[\.\_]||;
> - 
> -    SWITCH: for ($flavour) {
> -	/aix/		&& do { $name = ".$name";
> -				last;
> -			      };
> -	/osx/		&& do { $name = "_$name";
> -				last;
> -			      };
> -	/linux/
> -			&& do {	$ret = "_GLOBAL($name)";
> -				last;
> -			      };
> -    }
> -
> -    $ret = ".globl	$name\nalign 5\n$name:" if (!$ret);
> -    $$global = $name;
> -    $ret;
> -};
> -my $text = sub {
> -    my $ret = ($flavour =~ /aix/) ? ".csect\t.text[PR],7" : ".text";
> -    $ret = ".abiversion	2\n".$ret	if ($flavour =~ /linux.*64le/);
> -    $ret;
> -};
> -my $machine = sub {
> -    my $junk = shift;
> -    my $arch = shift;
> -    if ($flavour =~ /osx/)
> -    {	$arch =~ s/\"//g;
> -	$arch = ($flavour=~/64/) ? "ppc970-64" : "ppc970" if ($arch eq "any");
> -    }
> -    ".machine	$arch";
> -};
> -my $size = sub {
> -    if ($flavour =~ /linux/)
> -    {	shift;
> -	my $name = shift; $name =~ s|^[\.\_]||;
> -	my $ret  = ".size	$name,.-".($flavour=~/64$/?".":"").$name;
> -	$ret .= "\n.size	.$name,.-.$name" if ($flavour=~/64$/);
> -	$ret;
> -    }
> -    else
> -    {	"";	}
> -};
> -my $asciz = sub {
> -    shift;
> -    my $line = join(",",@_);
> -    if ($line =~ /^"(.*)"$/)
> -    {	".byte	" . join(",",unpack("C*",$1),0) . "\n.align	2";	}
> -    else
> -    {	"";	}
> -};
> -my $quad = sub {
> -    shift;
> -    my @ret;
> -    my ($hi,$lo);
> -    for (@_) {
> -	if (/^0x([0-9a-f]*?)([0-9a-f]{1,8})$/io)
> -	{  $hi=$1?"0x$1":"0"; $lo="0x$2";  }
> -	elsif (/^([0-9]+)$/o)
> -	{  $hi=$1>>32; $lo=$1&0xffffffff;  } # error-prone with 32-bit perl
> -	else
> -	{  $hi=undef; $lo=$_; }
> -
> -	if (defined($hi))
> -	{  push(@ret,$flavour=~/le$/o?".long\t$lo,$hi":".long\t$hi,$lo");  }
> -	else
> -	{  push(@ret,".quad	$lo");  }
> -    }
> -    join("\n",@ret);
> -};
> -
> -################################################################
> -# simplified mnemonics not handled by at least one assembler
> -################################################################
> -my $cmplw = sub {
> -    my $f = shift;
> -    my $cr = 0; $cr = shift if ($#_>1);
> -    # Some out-of-date 32-bit GNU assembler just can't handle cmplw...
> -    ($flavour =~ /linux.*32/) ?
> -	"	.long	".sprintf "0x%x",31<<26|$cr<<23|$_[0]<<16|$_[1]<<11|64 :
> -	"	cmplw	".join(',',$cr,@_);
> -};
> -my $bdnz = sub {
> -    my $f = shift;
> -    my $bo = $f=~/[\+\-]/ ? 16+9 : 16;	# optional "to be taken" hint
> -    "	bc	$bo,0,".shift;
> -} if ($flavour!~/linux/);
> -my $bltlr = sub {
> -    my $f = shift;
> -    my $bo = $f=~/\-/ ? 12+2 : 12;	# optional "not to be taken" hint
> -    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
> -	"	.long	".sprintf "0x%x",19<<26|$bo<<21|16<<1 :
> -	"	bclr	$bo,0";
> -};
> -my $bnelr = sub {
> -    my $f = shift;
> -    my $bo = $f=~/\-/ ? 4+2 : 4;	# optional "not to be taken" hint
> -    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
> -	"	.long	".sprintf "0x%x",19<<26|$bo<<21|2<<16|16<<1 :
> -	"	bclr	$bo,2";
> -};
> -my $beqlr = sub {
> -    my $f = shift;
> -    my $bo = $f=~/-/ ? 12+2 : 12;	# optional "not to be taken" hint
> -    ($flavour =~ /linux/) ?		# GNU as doesn't allow most recent hints
> -	"	.long	".sprintf "0x%X",19<<26|$bo<<21|2<<16|16<<1 :
> -	"	bclr	$bo,2";
> -};
> -# GNU assembler can't handle extrdi rA,rS,16,48, or when sum of last two
> -# arguments is 64, with "operand out of range" error.
> -my $extrdi = sub {
> -    my ($f,$ra,$rs,$n,$b) = @_;
> -    $b = ($b+$n)&63; $n = 64-$n;
> -    "	rldicl	$ra,$rs,$b,$n";
> -};
> -my $vmr = sub {
> -    my ($f,$vx,$vy) = @_;
> -    "	vor	$vx,$vy,$vy";
> -};
> -
> -# Some ABIs specify vrsave, special-purpose register #256, as reserved
> -# for system use.
> -my $no_vrsave = ($flavour =~ /linux-ppc64le/);
> -my $mtspr = sub {
> -    my ($f,$idx,$ra) = @_;
> -    if ($idx == 256 && $no_vrsave) {
> -	"	or	$ra,$ra,$ra";
> -    } else {
> -	"	mtspr	$idx,$ra";
> -    }
> -};
> -my $mfspr = sub {
> -    my ($f,$rd,$idx) = @_;
> -    if ($idx == 256 && $no_vrsave) {
> -	"	li	$rd,-1";
> -    } else {
> -	"	mfspr	$rd,$idx";
> -    }
> -};
> -
> -# PowerISA 2.06 stuff
> -sub vsxmem_op {
> -    my ($f, $vrt, $ra, $rb, $op) = @_;
> -    "	.long	".sprintf "0x%X",(31<<26)|($vrt<<21)|($ra<<16)|($rb<<11)|($op*2+1);
> -}
> -# made-up unaligned memory reference AltiVec/VMX instructions
> -my $lvx_u	= sub {	vsxmem_op(@_, 844); };	# lxvd2x
> -my $stvx_u	= sub {	vsxmem_op(@_, 972); };	# stxvd2x
> -my $lvdx_u	= sub {	vsxmem_op(@_, 588); };	# lxsdx
> -my $stvdx_u	= sub {	vsxmem_op(@_, 716); };	# stxsdx
> -my $lvx_4w	= sub { vsxmem_op(@_, 780); };	# lxvw4x
> -my $stvx_4w	= sub { vsxmem_op(@_, 908); };	# stxvw4x
> -
> -# PowerISA 2.07 stuff
> -sub vcrypto_op {
> -    my ($f, $vrt, $vra, $vrb, $op) = @_;
> -    "	.long	".sprintf "0x%X",(4<<26)|($vrt<<21)|($vra<<16)|($vrb<<11)|$op;
> -}
> -my $vcipher	= sub { vcrypto_op(@_, 1288); };
> -my $vcipherlast	= sub { vcrypto_op(@_, 1289); };
> -my $vncipher	= sub { vcrypto_op(@_, 1352); };
> -my $vncipherlast= sub { vcrypto_op(@_, 1353); };
> -my $vsbox	= sub { vcrypto_op(@_, 0, 1480); };
> -my $vshasigmad	= sub { my ($st,$six)=splice(@_,-2); vcrypto_op(@_, $st<<4|$six, 1730); };
> -my $vshasigmaw	= sub { my ($st,$six)=splice(@_,-2); vcrypto_op(@_, $st<<4|$six, 1666); };
> -my $vpmsumb	= sub { vcrypto_op(@_, 1032); };
> -my $vpmsumd	= sub { vcrypto_op(@_, 1224); };
> -my $vpmsubh	= sub { vcrypto_op(@_, 1096); };
> -my $vpmsumw	= sub { vcrypto_op(@_, 1160); };
> -my $vaddudm	= sub { vcrypto_op(@_, 192);  };
> -my $vadduqm	= sub { vcrypto_op(@_, 256);  };
> -
> -my $mtsle	= sub {
> -    my ($f, $arg) = @_;
> -    "	.long	".sprintf "0x%X",(31<<26)|($arg<<21)|(147*2);
> -};
> -
> -print "#include <asm/ppc_asm.h>\n" if $flavour =~ /linux/;
> -
> -while($line=<>) {
> -
> -    $line =~ s|[#!;].*$||;	# get rid of asm-style comments...
> -    $line =~ s|/\*.*\*/||;	# ... and C-style comments...
> -    $line =~ s|^\s+||;		# ... and skip white spaces in beginning...
> -    $line =~ s|\s+$||;		# ... and at the end
> -
> -    {
> -	$line =~ s|\b\.L(\w+)|L$1|g;	# common denominator for Locallabel
> -	$line =~ s|\bL(\w+)|\.L$1|g	if ($dotinlocallabels);
> -    }
> -
> -    {
> -	$line =~ s|^\s*(\.?)(\w+)([\.\+\-]?)\s*||;
> -	my $c = $1; $c = "\t" if ($c eq "");
> -	my $mnemonic = $2;
> -	my $f = $3;
> -	my $opcode = eval("\$$mnemonic");
> -	$line =~ s/\b(c?[rf]|v|vs)([0-9]+)\b/$2/g if ($c ne "." and $flavour !~ /osx/);
> -	if (ref($opcode) eq 'CODE') { $line = &$opcode($f,split(',',$line)); }
> -	elsif ($mnemonic)           { $line = $c.$mnemonic.$f."\t".$line; }
> -    }
> -
> -    print $line if ($line);
> -    print "\n";
> -}
> -
> -close STDOUT;
> diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
> deleted file mode 100644
> index 31a98dc6f849..000000000000
> --- a/drivers/crypto/vmx/vmx.c
> +++ /dev/null
> @@ -1,88 +0,0 @@
> -/**
> - * Routines supporting VMX instructions on the Power 8
> - *
> - * Copyright (C) 2015 International Business Machines Inc.
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; version 2 only.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write to the Free Software
> - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> - *
> - * Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
> - */
> -
> -#include <linux/module.h>
> -#include <linux/moduleparam.h>
> -#include <linux/types.h>
> -#include <linux/err.h>
> -#include <linux/cpufeature.h>
> -#include <linux/crypto.h>
> -#include <asm/cputable.h>
> -#include <crypto/internal/hash.h>
> -
> -extern struct shash_alg p8_ghash_alg;
> -extern struct crypto_alg p8_aes_alg;
> -extern struct crypto_alg p8_aes_cbc_alg;
> -extern struct crypto_alg p8_aes_ctr_alg;
> -extern struct crypto_alg p8_aes_xts_alg;
> -static struct crypto_alg *algs[] = {
> -	&p8_aes_alg,
> -	&p8_aes_cbc_alg,
> -	&p8_aes_ctr_alg,
> -	&p8_aes_xts_alg,
> -	NULL,
> -};
> -
> -int __init p8_init(void)
> -{
> -	int ret = 0;
> -	struct crypto_alg **alg_it;
> -
> -	for (alg_it = algs; *alg_it; alg_it++) {
> -		ret = crypto_register_alg(*alg_it);
> -		printk(KERN_INFO "crypto_register_alg '%s' = %d\n",
> -		       (*alg_it)->cra_name, ret);
> -		if (ret) {
> -			for (alg_it--; alg_it >= algs; alg_it--)
> -				crypto_unregister_alg(*alg_it);
> -			break;
> -		}
> -	}
> -	if (ret)
> -		return ret;
> -
> -	ret = crypto_register_shash(&p8_ghash_alg);
> -	if (ret) {
> -		for (alg_it = algs; *alg_it; alg_it++)
> -			crypto_unregister_alg(*alg_it);
> -	}
> -	return ret;
> -}
> -
> -void __exit p8_exit(void)
> -{
> -	struct crypto_alg **alg_it;
> -
> -	for (alg_it = algs; *alg_it; alg_it++) {
> -		printk(KERN_INFO "Removing '%s'\n", (*alg_it)->cra_name);
> -		crypto_unregister_alg(*alg_it);
> -	}
> -	crypto_unregister_shash(&p8_ghash_alg);
> -}
> -
> -module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init);
> -module_exit(p8_exit);
> -
> -MODULE_AUTHOR("Marcelo Cerri<mhcerri@br.ibm.com>");
> -MODULE_DESCRIPTION("IBM VMX cryptographic acceleration instructions "
> -		   "support on Power 8");
> -MODULE_LICENSE("GPL");
> -MODULE_VERSION("1.0.0");
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
       [not found] ` <87r31flisi.fsf@concordia.ellerman.id.au>
@ 2017-03-30  0:49   ` Tyrel Datwyler
  0 siblings, 0 replies; 10+ messages in thread
From: Tyrel Datwyler @ 2017-03-30  0:49 UTC (permalink / raw)
  To: Michael Ellerman, Michal Suchanek, Herbert Xu, David S. Miller,
	Benjamin Herrenschmidt, Paul Mackerras, Greg Kroah-Hartman,
	Geert Uytterhoeven, Mauro Carvalho Chehab, linux-kernel,
	linux-crypto, linuxppc-dev, paulmck, appro

On 03/29/2017 05:17 PM, Michael Ellerman wrote:
> Michal Suchanek <msuchanek@suse.de> writes:
> 
>> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
>> subroutines for XTS") which adds the OpenSSL license header to
>> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came into
>> question. The whole license reads:
>>
>>  # Licensed under the OpenSSL license (the "License").  You may not use
>>  # this file except in compliance with the License.  You can obtain a
>>  # copy
>>  # in the file LICENSE in the source distribution or at
>>  # https://www.openssl.org/source/license.html
>>
>>  #
>>  # ====================================================================
>>  # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
>>  # project. The module is, however, dual licensed under OpenSSL and
>>  # CRYPTOGAMS licenses depending on where you obtain it. For further
>>  # details see http://www.openssl.org/~appro/cryptogams/.
>>  # ====================================================================
>>
>> After seeking legal advice it is still not clear that this driver can be
>> legally used in Linux. In particular the "depending on where you obtain
>> it" part does not make it clear when you can apply the GPL and when the
>> OpenSSL license.
> 
> It seems pretty clear to me that the intention is that the CRYPTOGAMS
> license applies.
> 
> If you visit its URL it includes:
> 
>   ALTERNATIVELY, provided that this notice is retained in full, this
>   product may be distributed under the terms of the GNU General Public
>   License (GPL), in which case the provisions of the GPL apply INSTEAD OF
>   those given above.
> 
> 
> I agree that the text in the file is not sufficiently clear about what
> license applies, but I'm unconvinced that there is any code here that is
> actually being distributed incorrectly.

The original commit message also notes that the authors collaborated
directly with Andy.

commit 5c380d623ed30b71a2441fb4f2e053a4e1a50794
Author: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Date:   Fri Feb 6 14:59:35 2015 -0200

    crypto: vmx - Add support for VMS instructions by ASM

    OpenSSL implements optimized ASM algorithms which support
    VMX instructions on Power 8 CPU.

    These scripts generate an endian-agnostic ASM implementation
    in order to support both big and little-endian.
        - aesp8-ppc.pl: implements support for AES instructions
        implemented by POWER8 processor.
        - ghashp8-ppc.pl: implements support for GHASH for Power8.
        - ppc-xlate.pl:  ppc assembler distiller.

    This code has been adopted from the OpenSSL project in collaboration
    with the original author (Andy Polyakov <appro@openssl.org>).

    Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

-Tyrel

> 
> Getting the text in the header changed to be clearer seems like the
> obvious solution.
> 
>> I tried contacting the author of the code for clarification but did not
>> hear back. In absence of clear licensing the only solution I see is
>> removing this code.
> 
> Did you try contacting anyone else? Like perhaps the powerpc or crypto
> maintainers, or anyone else who's worked on the driver?
> 
> Sending a patch to delete all the code clearly works to get people's
> attention, I'll give you that.
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-29 23:08     ` Tyrel Datwyler
@ 2017-03-30 16:30       ` Paulo Flabiano Smorigo
  2017-04-13 13:30         ` Michal Suchánek
  2017-05-05 13:52         ` Michal Suchánek
  0 siblings, 2 replies; 10+ messages in thread
From: Paulo Flabiano Smorigo @ 2017-03-30 16:30 UTC (permalink / raw)
  To: Tyrel Datwyler
  Cc: Michal Suchánek, Greg Kroah-Hartman, Leonidas S. Barbosa,
	Herbert Xu, Geert Uytterhoeven, linux-kernel, Paul Mackerras,
	linux-crypto, Mauro Carvalho Chehab, linuxppc-dev,
	David S. Miller

On 2017-03-29 20:08, Tyrel Datwyler wrote:
> On 03/29/2017 08:13 AM, Michal Suchánek wrote:
>> On Wed, 29 Mar 2017 16:51:35 +0200
>> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
>> 
>>> On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:
>>>> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
>>>> subroutines for XTS") which adds the OpenSSL license header to
>>>> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came into
>>>> question. The whole license reads:
>>>> 
>>>>  # Licensed under the OpenSSL license (the "License").  You may not
>>>> use # this file except in compliance with the License.  You can
>>>> obtain a # copy
>>>>  # in the file LICENSE in the source distribution or at
>>>>  # https://www.openssl.org/source/license.html
>>>> 
>>>>  #
>>>>  #
>>>> ====================================================================
>>>> # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL #
>>>> project. The module is, however, dual licensed under OpenSSL and #
>>>> CRYPTOGAMS licenses depending on where you obtain it. For further #
>>>> details see http://www.openssl.org/~appro/cryptogams/. #
>>>> ====================================================================
>>>> 
>>>> After seeking legal advice it is still not clear that this driver
>>>> can be legally used in Linux. In particular the "depending on where
>>>> you obtain it" part does not make it clear when you can apply the
>>>> GPL and when the OpenSSL license.
>>>> 
>>>> I tried contacting the author of the code for clarification but did
>>>> not hear back. In absence of clear licensing the only solution I
>>>> see is removing this code.
> 
> A quick 'git grep OpenSSL' of the Linux tree returns several other
> crypto files under the ARM architecture that are similarly licensed. 
> Namely:
> 
> arch/arm/crypto/sha1-armv4-large.S
> arch/arm/crypto/sha256-armv4.pl
> arch/arm/crypto/sha256-core.S_shipped
> arch/arm/crypto/sha512-armv4.pl
> arch/arm/crypto/sha512-core.S_shipped
> arch/arm64/crypto/sha256-core.S_shipped
> arch/arm64/crypto/sha512-armv8.pl
> arch/arm64/crypto/sha512-core.S_shipped
> 
> On closer inspection, some of those files have the addendum that
> "Permission to use under GPL terms is granted", but not all of them.
> 
> -Tyrel

In 2015, Andy Polyakov, the author, replied in this mailing list [1]:

"I have no problems with reusing assembly modules in kernel context. The
whole idea behind cryptogams initiative was exactly to reuse code in
different contexts."

[1] https://patchwork.kernel.org/patch/6027481/

-- 
Paulo Flabiano Smorigo
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-30 16:30       ` Paulo Flabiano Smorigo
@ 2017-04-13 13:30         ` Michal Suchánek
  2017-05-05 13:52         ` Michal Suchánek
  1 sibling, 0 replies; 10+ messages in thread
From: Michal Suchánek @ 2017-04-13 13:30 UTC (permalink / raw)
  To: Paulo Flabiano Smorigo
  Cc: Tyrel Datwyler, Leonidas S. Barbosa, Mauro Carvalho Chehab,
	Herbert Xu, Geert Uytterhoeven, Greg Kroah-Hartman, linux-kernel,
	Paul Mackerras, linux-crypto, linuxppc-dev, David S. Miller

On Thu, 30 Mar 2017 13:30:17 -0300
Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com> wrote:

> On 2017-03-29 20:08, Tyrel Datwyler wrote:
> > On 03/29/2017 08:13 AM, Michal Suchánek wrote:  
> >> On Wed, 29 Mar 2017 16:51:35 +0200
> >> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> >>   
> >>> On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:  
> >>>> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
> >>>> subroutines for XTS") which adds the OpenSSL license header to
> >>>> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came
> >>>> into question. The whole license reads:
> >>>> 
> >>>>  # Licensed under the OpenSSL license (the "License").  You may
> >>>> not use # this file except in compliance with the License.  You
> >>>> can obtain a # copy
> >>>>  # in the file LICENSE in the source distribution or at
> >>>>  # https://www.openssl.org/source/license.html
> >>>> 
> >>>>  #
> >>>>  #
> >>>> ====================================================================
> >>>> # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL #
> >>>> project. The module is, however, dual licensed under OpenSSL and
> >>>> # CRYPTOGAMS licenses depending on where you obtain it. For
> >>>> further # details see http://www.openssl.org/~appro/cryptogams/.
> >>>> #
> >>>> ====================================================================
> >>>> 
> >>>> After seeking legal advice it is still not clear that this driver
> >>>> can be legally used in Linux. In particular the "depending on
> >>>> where you obtain it" part does not make it clear when you can
> >>>> apply the GPL and when the OpenSSL license.
> >>>> 
> >>>> I tried contacting the author of the code for clarification but
> >>>> did not hear back. In absence of clear licensing the only
> >>>> solution I see is removing this code.  
> > 
> > A quick 'git grep OpenSSL' of the Linux tree returns several other
> > crypto files under the ARM architecture that are similarly
> > licensed. Namely:
> > 
> > arch/arm/crypto/sha1-armv4-large.S
> > arch/arm/crypto/sha256-armv4.pl
> > arch/arm/crypto/sha256-core.S_shipped
> > arch/arm/crypto/sha512-armv4.pl
> > arch/arm/crypto/sha512-core.S_shipped
> > arch/arm64/crypto/sha256-core.S_shipped
> > arch/arm64/crypto/sha512-armv8.pl
> > arch/arm64/crypto/sha512-core.S_shipped
> > 
> > On closer inspection, some of those files have the addendum that
> > "Permission to use under GPL terms is granted", but not all of them.
> > 
> > -Tyrel  
> 
> In 2015, Andy Polyakov, the author, replied in this mailing list [1]:
> 
> "I have no problems with reusing assembly modules in kernel context.
> The whole idea behind cryptogams initiative was exactly to reuse code
> in different contexts."
> 
> [1] https://patchwork.kernel.org/patch/6027481/
> 

So what? Did you also get a statement from whoever is relevant at
OpenSSL, from where this code was obviously merged? And even if you did,
does an e-mail discussion have any legal value whatsoever?

Neither is a replacement for including a proper license statement with
the code; a reference to an e-mail discussion or a web site is not enough.

Thanks

Michal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-03-30 16:30       ` Paulo Flabiano Smorigo
  2017-04-13 13:30         ` Michal Suchánek
@ 2017-05-05 13:52         ` Michal Suchánek
  2017-05-05 18:11           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2017-05-05 13:52 UTC (permalink / raw)
  To: Paulo Flabiano Smorigo
  Cc: Tyrel Datwyler, Leonidas S. Barbosa, Mauro Carvalho Chehab,
	Herbert Xu, Geert Uytterhoeven, Greg Kroah-Hartman, linux-kernel,
	Paul Mackerras, linux-crypto, linuxppc-dev, David S. Miller,
	appro

Hello,

On Thu, 30 Mar 2017 13:30:17 -0300
Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com> wrote:

> On 2017-03-29 20:08, Tyrel Datwyler wrote:
> > On 03/29/2017 08:13 AM, Michal Suchánek wrote:  
> >> On Wed, 29 Mar 2017 16:51:35 +0200
> >> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> >>   
> >>> On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:  
> >>>> While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
> >>>> subroutines for XTS") which adds the OpenSSL license header to
> >>>> drivers/crypto/vmx/aesp8-ppc.pl licensing of this driver came
> >>>> into question. The whole license reads:
> >>>> 
> >>>>  # Licensed under the OpenSSL license (the "License").  You may
> >>>> not use # this file except in compliance with the License.  You
> >>>> can obtain a # copy
> >>>>  # in the file LICENSE in the source distribution or at
> >>>>  # https://www.openssl.org/source/license.html
> >>>> 
> >>>>  #
> >>>>  #
> >>>> ====================================================================
> >>>> # Written by Andy Polyakov <appro@openssl.org> for the OpenSSL #
> >>>> project. The module is, however, dual licensed under OpenSSL and
> >>>> # CRYPTOGAMS licenses depending on where you obtain it. For
> >>>> further # details see http://www.openssl.org/~appro/cryptogams/.
> >>>> #
> >>>> ====================================================================
> >>>> 
> >>>> After seeking legal advice it is still not clear that this driver
> >>>> can be legally used in Linux. In particular the "depending on
> >>>> where you obtain it" part does not make it clear when you can
> >>>> apply the GPL and when the OpenSSL license.
> >>>> 
> >>>> I tried contacting the author of the code for clarification but
> >>>> did not hear back. In absence of clear licensing the only
> >>>> solution I see is removing this code.  
> > 
> > A quick 'git grep OpenSSL' of the Linux tree returns several other
> > crypto files under the ARM architecture that are similarly
> > licensed. Namely:
> > 
> > arch/arm/crypto/sha1-armv4-large.S
> > arch/arm/crypto/sha256-armv4.pl
> > arch/arm/crypto/sha256-core.S_shipped
> > arch/arm/crypto/sha512-armv4.pl
> > arch/arm/crypto/sha512-core.S_shipped
> > arch/arm64/crypto/sha256-core.S_shipped
> > arch/arm64/crypto/sha512-armv8.pl
> > arch/arm64/crypto/sha512-core.S_shipped
> > 
> > On closer inspection, some of those files have the addendum that
> > "Permission to use under GPL terms is granted", but not all of them.
> > 
> > -Tyrel  
> 
> In 2015, Andy Polyakov, the author, replied in this mailing list [1]:
> 
> "I have no problems with reusing assembly modules in kernel context.
> The whole idea behind cryptogams initiative was exactly to reuse code
> in different contexts."
> 
> [1] https://patchwork.kernel.org/patch/6027481/
> 

So you have an e-mail message from one of the authors of the code.
Andy Polyakov wrote most of the code but there are probably other
contributors who never gave explicit consent for using their code
outside of OpenSSL. By stamping the file with the OpenSSL license,
which is incompatible with GPLv2, the OpenSSL maintainers made it
explicitly clear that they are not OK with hosting development of
Linux kernel code.

The CRYPTOGAMS project does not seem to have gone anywhere, so there is
no source for the code other than the OpenSSL tree. Merging code from
OpenSSL into Linux does not look legally feasible.

Andy Polyakov is unresponsive in discussions concerning his awesome
licensing terms.

The MAINTAINERS file has
IBM Power VMX Cryptographic instructions
M:	Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
M:	Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
L:	linux-crypto@vger.kernel.org
S:	Supported

So presumably the maintainers have access to necessary legal advice to
determine what steps are necessary to make this driver maintainable
legally.

I do not expect this will be resolved overnight. However, there is no
progress on this issue whatsoever so I suggest removal of the driver.

Thanks

Michal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code
  2017-05-05 13:52         ` Michal Suchánek
@ 2017-05-05 18:11           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2017-05-05 18:11 UTC (permalink / raw)
  To: Michal Suchánek, Paulo Flabiano Smorigo
  Cc: Leonidas S. Barbosa, Herbert Xu, Geert Uytterhoeven,
	Greg Kroah-Hartman, linux-kernel, Paul Mackerras, Tyrel Datwyler,
	appro, Mauro Carvalho Chehab, linuxppc-dev, David S. Miller,
	linux-crypto

On Fri, 2017-05-05 at 15:52 +0200, Michal Suchánek wrote:
> 
> So you have an e-mail message from one of the authors of the code.
> Andy Polyakov wrote most of the code but there are probably other
> contributors who never gave explicit consent for using their code
> outside of OpenSSL.

The only contributions to that file in OpenSSL that aren't from Andy
are a whitespace fix and the addition of the licence :-)

If that makes you feel better we could undo the whitespace fix :-)

I've checked the other PowerPC related files in there and the situation
is identical (aes-ppc.pl also has a spelling fix in comments).

At this point, it's pretty clear that this code is authored solely by
Andy who gave permission to use it.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-05-05 18:12 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-29 12:56 [PATCH] crypto: vmx: Remove dubiously licensed crypto code Michal Suchanek
2017-03-29 14:51 ` Greg Kroah-Hartman
2017-03-29 15:13   ` Michal Suchánek
2017-03-29 23:08     ` Tyrel Datwyler
2017-03-30 16:30       ` Paulo Flabiano Smorigo
2017-04-13 13:30         ` Michal Suchánek
2017-05-05 13:52         ` Michal Suchánek
2017-05-05 18:11           ` Benjamin Herrenschmidt
2017-03-29 23:29 ` Tyrel Datwyler
     [not found] ` <87r31flisi.fsf@concordia.ellerman.id.au>
2017-03-30  0:49   ` Tyrel Datwyler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).