* [PATCH RT 0/7] 4.4.126-rt141-rc1
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Daniel Wagner

Dear RT Folks,

This is the RT stable review cycle of patch 4.4.126-rt141-rc1.

The series contains a few patches which never made it into the v4.4-rt
tree. I identified the patches using Steven's workflow. The next thing
I will try is to use Julia's script to identify missing patches. In
theory there should be none.

Please scream at me if I messed something up. Please test the patches too.

The pre-releases will not be pushed to the git repository, only the
final release will be.

If all goes well, this release candidate will be converted to the next
main release on 16.04.2018.

Enjoy,
-- Daniel

ps: Bear with me, I didn't have time to get all the magic tooling
working for releasing an -rc series as Steven has done in the past.
That's why this mail looks slightly different from what you are used to.

Changes from 4.4.126-rt141:

Daniel Wagner (1):
  Linux 4.4.126-rt141-rc1

Sebastian Andrzej Siewior (4):
  locking: add types.h
  mm/slub: close possible memory-leak in kmem_cache_alloc_bulk()
  arm*: disable NEON in kernel mode
  crypto: limit more FPU-enabled sections

Steven Rostedt (VMware) (1):
  Revert "memcontrol: Prevent scheduling while atomic in cgroup code"

Thomas Gleixner (1):
  timer: Invoke timer_start_debug() where it makes sense

 arch/arm/Kconfig                           |  2 +-
 arch/arm64/crypto/Kconfig                  | 14 +++++++-------
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 20 ++++++++++++++++++++
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 19 +++++++++++++++++++
 arch/x86/crypto/cast6_avx_glue.c           | 24 +++++++++++++++++++-----
 arch/x86/crypto/chacha20_glue.c            |  8 +++++---
 arch/x86/crypto/serpent_avx2_glue.c        | 19 +++++++++++++++++++
 arch/x86/crypto/serpent_avx_glue.c         | 23 +++++++++++++++++++----
 arch/x86/crypto/serpent_sse2_glue.c        | 23 +++++++++++++++++++----
 arch/x86/crypto/twofish_avx_glue.c         | 27 +++++++++++++++++++++++++--
 arch/x86/include/asm/fpu/api.h             |  1 +
 arch/x86/kernel/fpu/core.c                 | 12 ++++++++++++
 include/linux/spinlock_types_raw.h         |  2 ++
 kernel/time/timer.c                        |  4 ++--
 localversion-rt                            |  2 +-
 mm/memcontrol.c                            |  7 ++-----
 mm/slub.c                                  |  1 +
 17 files changed, 174 insertions(+), 34 deletions(-)

-- 
2.14.3


* [PATCH RT 1/7] locking: add types.h
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Sebastian Andrzej Siewior, stable-rt

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

During the stable update the ARM architecture no longer compiled due
to the missing definitions of u16/u32.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/spinlock_types_raw.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/spinlock_types_raw.h b/include/linux/spinlock_types_raw.h
index edffc4d53fc9..03235b475b77 100644
--- a/include/linux/spinlock_types_raw.h
+++ b/include/linux/spinlock_types_raw.h
@@ -1,6 +1,8 @@
 #ifndef __LINUX_SPINLOCK_TYPES_RAW_H
 #define __LINUX_SPINLOCK_TYPES_RAW_H
 
+#include <linux/types.h>
+
 #if defined(CONFIG_SMP)
 # include <asm/spinlock_types.h>
 #else
-- 
2.14.3

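For context, a rough sketch of why the include is needed: on ARM,
asm/spinlock_types.h (pulled in just below the new include) defines the
ticket spinlock in terms of the fixed-width kernel types, approximately
as follows (the big-endian variant swaps the two u16 fields):

    typedef struct {
            union {
                    u32 slock;
                    struct __raw_tickets {
                            u16 owner;
                            u16 next;
                    } tickets;
            };
    } arch_spinlock_t;

If nothing has provided u16/u32 by the time this header is parsed, the
compiler reports unknown type names and the build fails; including
<linux/types.h> first makes spinlock_types_raw.h self-contained.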

* [PATCH RT 2/7] timer: Invoke timer_start_debug() where it makes sense
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, stable, rt, Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.

Call the debug function after setting the new base and flags.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/time/timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index a8246d79cb5a..6b322aea1c46 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -838,8 +838,6 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 	if (!ret && pending_only)
 		goto out_unlock;
 
-	debug_activate(timer, expires);
-
 	new_base = get_target_base(base, pinned);
 
 	if (base != new_base) {
@@ -854,6 +852,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 			base = switch_timer_base(timer, base, new_base);
 	}
 
+	debug_activate(timer, expires);
+
 	timer->expires = expires;
 	internal_add_timer(base, timer);
 
-- 
2.14.3

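A rough sketch of what goes stale (names as in kernels with the
non-cascading wheel): debug_activate() feeds the timer_start tracepoint
with timer->flags, which encodes the CPU of the timer base, and
switch_timer_base() rewrites those flags:

    static inline void debug_activate(struct timer_list *timer,
                                      unsigned long expires)
    {
            debug_timer_activate(timer);
            /* reads timer->flags; before switch_timer_base() this is
             * still the old base's CPU */
            trace_timer_start(timer, expires, timer->flags);
    }

Tracing after the base switch means the tracepoint records the final
CPU and flags.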

* [PATCH RT 3/7] mm/slub: close possible memory-leak in kmem_cache_alloc_bulk()
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Sebastian Andrzej Siewior, stable-rt

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Under certain circumstances we could leak elements which were moved to
the local "to_free" list on the error path. The damage is limited since
I can't find any users of this path.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/slub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/slub.c b/mm/slub.c
index b183c5271607..b3626ab324fe 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3026,6 +3026,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	return i;
 error:
 	local_irq_enable();
+	free_delayed(&to_free);
 	slab_post_alloc_hook(s, flags, i, p);
 	__kmem_cache_free_bulk(s, i, p);
 	return 0;
-- 
2.14.3

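The shape of the leak, sketched around the hunk above (the real
function is much larger): on RT, objects to be freed are parked on a
local to_free list and flushed once interrupts are enabled again; the
error path re-enabled interrupts but skipped the flush:

    LIST_HEAD(to_free);

    local_irq_disable();
    /* ... fill p[0..size-1], possibly queueing frees on to_free ... */
    local_irq_enable();
    free_delayed(&to_free);         /* success path flushes the list */
    return i;
error:
    local_irq_enable();
    free_delayed(&to_free);         /* the fix: flush here as well */
    slab_post_alloc_hook(s, flags, i, p);
    __kmem_cache_free_bulk(s, i, p);
    return 0;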

* [PATCH RT 4/7] arm*: disable NEON in kernel mode
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Sebastian Andrzej Siewior, stable-rt

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

NEON in kernel mode is used by the crypto algorithms and the raid6
code. While the raid6 code looks okay, the crypto algorithms do not:
NEON is enabled on the first invocation and memory may be
allocated/freed/mapped before NEON mode is disabled again. This needs
to change before it can be re-enabled.
On ARM, NEON in kernel mode can simply be disabled. On ARM64 it needs
to stay on due to possible EFI callbacks, so there each algorithm is
disabled individually.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm/Kconfig          |  2 +-
 arch/arm64/crypto/Kconfig | 14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 79c4603e9453..5e054a0c4b25 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2119,7 +2119,7 @@ config NEON
 
 config KERNEL_MODE_NEON
 	bool "Support for NEON in kernel mode"
-	depends on NEON && AEABI
+	depends on NEON && AEABI && !PREEMPT_RT_BASE
 	help
 	  Say Y to include support for NEON in kernel mode.
 
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9887e1..cd71b3432720 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -10,41 +10,41 @@ if ARM64_CRYPTO
 
 config CRYPTO_SHA1_ARM64_CE
 	tristate "SHA-1 digest algorithm (ARMv8 Crypto Extensions)"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_HASH
 
 config CRYPTO_SHA2_ARM64_CE
 	tristate "SHA-224/SHA-256 digest algorithm (ARMv8 Crypto Extensions)"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_HASH
 
 config CRYPTO_GHASH_ARM64_CE
 	tristate "GHASH (for GCM chaining mode) using ARMv8 Crypto Extensions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_HASH
 
 config CRYPTO_AES_ARM64_CE
 	tristate "AES core cipher using ARMv8 Crypto Extensions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_ALGAPI
 
 config CRYPTO_AES_ARM64_CE_CCM
 	tristate "AES in CCM mode using ARMv8 Crypto Extensions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_ALGAPI
 	select CRYPTO_AES_ARM64_CE
 	select CRYPTO_AEAD
 
 config CRYPTO_AES_ARM64_CE_BLK
 	tristate "AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AES_ARM64_CE
 	select CRYPTO_ABLK_HELPER
 
 config CRYPTO_AES_ARM64_NEON_BLK
 	tristate "AES in ECB/CBC/CTR/XTS modes using NEON instructions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on ARM64 && KERNEL_MODE_NEON && !PREEMPT_RT_BASE
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AES
 	select CRYPTO_ABLK_HELPER
-- 
2.14.3

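A sketch of the pattern that breaks on RT (a composite example, not
lifted from one specific driver): kernel_neon_begin() disables
preemption, while the cipher-walk machinery invoked inside the section
may allocate, free, or map memory, all of which may sleep on RT:

    kernel_neon_begin();                    /* preemption disabled */
    err = blkcipher_walk_virt(desc, &walk); /* may allocate memory */
    while (walk.nbytes) {
            /* ... NEON-accelerated processing ... */
            err = blkcipher_walk_done(desc, &walk, 0); /* may free */
    }
    kernel_neon_end();

Sleeping with preemption disabled is a bug on RT, hence the big hammer
on ARM and the per-algorithm dependencies on ARM64.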

* [PATCH RT 5/7] crypto: limit more FPU-enabled sections
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Sebastian Andrzej Siewior, stable-rt

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Those crypto drivers use SSE/AVX/… for their crypto work, and in order
to do so in the kernel they need to enable the "FPU" in kernel mode,
which disables preemption.
There are two problems with the way they are used:
- the while loop which processes X bytes may create latency spikes and
  should be avoided or limited.
- the cipher-walk-next part may allocate/free memory and may use
  kmap_atomic().

The whole kernel_fpu_begin()/end() processing probably isn't that
cheap. It most likely makes sense to process as much of those as
possible in one go. The new *_fpu_sched_rt() schedules only if an RT
task is pending.

We should probably measure the performance of those ciphers in pure SW
mode and with these optimisations to see whether it makes sense to
keep them for RT.

This kernel_fpu_resched() makes the code more preemptible, which might
hurt performance.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 20 ++++++++++++++++++++
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 19 +++++++++++++++++++
 arch/x86/crypto/cast6_avx_glue.c           | 24 +++++++++++++++++++-----
 arch/x86/crypto/chacha20_glue.c            |  8 +++++---
 arch/x86/crypto/serpent_avx2_glue.c        | 19 +++++++++++++++++++
 arch/x86/crypto/serpent_avx_glue.c         | 23 +++++++++++++++++++----
 arch/x86/crypto/serpent_sse2_glue.c        | 23 +++++++++++++++++++----
 arch/x86/crypto/twofish_avx_glue.c         | 27 +++++++++++++++++++++++++--
 arch/x86/include/asm/fpu/api.h             |  1 +
 arch/x86/kernel/fpu/core.c                 | 12 ++++++++++++
 10 files changed, 158 insertions(+), 18 deletions(-)

diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index d84456924563..c54536d9932c 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -206,6 +206,20 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void camellia_fpu_end_rt(struct crypt_priv *ctx)
+{
+       bool fpu_enabled = ctx->fpu_enabled;
+
+       if (!fpu_enabled)
+               return;
+       camellia_fpu_end(fpu_enabled);
+       ctx->fpu_enabled = false;
+}
+#else
+static void camellia_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAMELLIA_BLOCK_SIZE;
@@ -221,16 +235,19 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	if (nbytes >= CAMELLIA_AESNI_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_ecb_enc_16way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_enc_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_enc_blk(ctx->ctx, srcdst, srcdst);
@@ -251,16 +268,19 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	if (nbytes >= CAMELLIA_AESNI_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_ecb_dec_16way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_dec_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_dec_blk(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c
index 93d8f295784e..a1666a58eee5 100644
--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -210,6 +210,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void camellia_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	camellia_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void camellia_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAMELLIA_BLOCK_SIZE;
@@ -225,10 +240,12 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_enc_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_enc_blk(ctx->ctx, srcdst, srcdst);
@@ -249,10 +266,12 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_dec_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_dec_blk(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index fca459578c35..9eb469b9836a 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -205,19 +205,33 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void cast6_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	cast6_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void cast6_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAST6_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * CAST6_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
 		cast6_ecb_enc_8way(ctx->ctx, srcdst, srcdst);
+		cast6_fpu_end_rt(ctx);
 		return;
 	}
-
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__cast6_encrypt(ctx->ctx, srcdst, srcdst);
 }
@@ -228,10 +242,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * CAST6_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
 		cast6_ecb_dec_8way(ctx->ctx, srcdst, srcdst);
+		cast6_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..0e3eb53a87cd 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -80,22 +80,24 @@ static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
 
 	crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
 
-	kernel_fpu_begin();
-
 	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
+		kernel_fpu_begin();
+
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
+		kernel_fpu_end();
 		err = blkcipher_walk_done(desc, &walk,
 					  walk.nbytes % CHACHA20_BLOCK_SIZE);
 	}
 
 	if (walk.nbytes) {
+		kernel_fpu_begin();
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				walk.nbytes);
+		kernel_fpu_end();
 		err = blkcipher_walk_done(desc, &walk, 0);
 	}
 
-	kernel_fpu_end();
 
 	return err;
 }
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 6d198342e2de..0954bae9a995 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -184,6 +184,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+       bool fpu_enabled = ctx->fpu_enabled;
+
+       if (!fpu_enabled)
+               return;
+       serpent_fpu_end(fpu_enabled);
+       ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
@@ -199,10 +214,12 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= SERPENT_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		serpent_ecb_enc_8way_avx(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * SERPENT_PARALLEL_BLOCKS;
 		nbytes -= bsize * SERPENT_PARALLEL_BLOCKS;
 	}
+	serpent_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__serpent_encrypt(ctx->ctx, srcdst, srcdst);
@@ -223,10 +240,12 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= SERPENT_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		serpent_ecb_dec_8way_avx(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * SERPENT_PARALLEL_BLOCKS;
 		nbytes -= bsize * SERPENT_PARALLEL_BLOCKS;
 	}
+	serpent_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__serpent_decrypt(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 5dc37026c7ce..8c812a05a084 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -218,16 +218,31 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	serpent_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_ecb_enc_8way_avx(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
@@ -241,10 +256,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_ecb_dec_8way_avx(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index 3643dd508f45..1ef09f8b5cc7 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -187,16 +187,31 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	serpent_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_enc_blk_xway(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
@@ -210,10 +225,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_dec_blk_xway(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index b7a3904b953c..de00fe24927e 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -218,6 +218,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void twofish_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	twofish_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void twofish_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = TF_BLOCK_SIZE;
@@ -228,12 +243,16 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 
 	if (nbytes == bsize * TWOFISH_PARALLEL_BLOCKS) {
 		twofish_ecb_enc_8way(ctx->ctx, srcdst, srcdst);
+		twofish_fpu_end_rt(ctx);
 		return;
 	}
 
-	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3)
+	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3) {
+		kernel_fpu_resched();
 		twofish_enc_blk_3way(ctx->ctx, srcdst, srcdst);
+	}
 
+	twofish_fpu_end_rt(ctx);
 	nbytes %= bsize * 3;
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
@@ -250,11 +269,15 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 
 	if (nbytes == bsize * TWOFISH_PARALLEL_BLOCKS) {
 		twofish_ecb_dec_8way(ctx->ctx, srcdst, srcdst);
+		twofish_fpu_end_rt(ctx);
 		return;
 	}
 
-	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3)
+	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3) {
+		kernel_fpu_resched();
 		twofish_dec_blk_3way(ctx->ctx, srcdst, srcdst);
+	}
+	twofish_fpu_end_rt(ctx);
 
 	nbytes %= bsize * 3;
 
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 1429a7c736db..85428df40a22 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -24,6 +24,7 @@ extern void __kernel_fpu_begin(void);
 extern void __kernel_fpu_end(void);
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
+extern void kernel_fpu_resched(void);
 extern bool irq_fpu_usable(void);
 
 /*
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d25097c3fc1d..67ae18aeaf8b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -149,6 +149,18 @@ void kernel_fpu_end(void)
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
+void kernel_fpu_resched(void)
+{
+	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
+
+	if (should_resched(PREEMPT_OFFSET)) {
+		kernel_fpu_end();
+		cond_resched();
+		kernel_fpu_begin();
+	}
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_resched);
+
 /*
  * CR0::TS save/restore functions:
  */
-- 
2.14.3

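The usage pattern, distilled from the hunks above (PARALLEL_BLOCKS,
process_8way and the cipher_fpu_* helpers stand in for the per-cipher
names): keep the FPU section open across the loop, but offer a
preemption point before each chunk:

    ctx->fpu_enabled = cipher_fpu_begin(ctx->fpu_enabled, nbytes);

    while (nbytes >= PARALLEL_BLOCKS * bsize) {
            kernel_fpu_resched();   /* preemption point, see below */
            process_8way(ctx->ctx, srcdst, srcdst);
            srcdst += bsize * PARALLEL_BLOCKS;
            nbytes -= bsize * PARALLEL_BLOCKS;
    }
    cipher_fpu_end_rt(ctx);         /* on RT: close the FPU section
                                       before the scalar tail loop */

kernel_fpu_resched() itself (added at the bottom of the patch) only
round-trips kernel_fpu_end()/kernel_fpu_begin() around cond_resched()
when a reschedule is actually pending, so the common case stays cheap.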

* [PATCH RT 6/7] Revert "memcontrol: Prevent scheduling while atomic in cgroup code"
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, stable, Sebastian Andrzej Siewior

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The commit "memcontrol: Prevent scheduling while atomic in cgroup code"
fixed this issue:

       refill_stock()
          get_cpu_var()
          drain_stock()
             res_counter_uncharge()
                res_counter_uncharge_until()
                   spin_lock() <== boom

But commit 3e32cb2e0a12b ("mm: memcontrol: lockless page counters")
replaced the calls to res_counter_uncharge() in drain_stock() with the
lockless function page_counter_uncharge(). There is no spin lock there
anymore and thus no more reason to have that local lock.

Cc: <stable@vger.kernel.org>
Reported-by: Haiyang HY1 Tan <tanhy1@lenovo.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
[bigeasy: That upstream commit appeared in v3.19 and the patch in
  question in v3.18.7-rt2; v3.18 still seems to be maintained. So I
  guess that v3.18 would need the locallocks that we are about to remove
  here. I am not sure if any earlier versions have the patch
  backported.
  The stable tag here is because Haiyang reported (and debugged) a crash
  in 4.4-RT with this patch applied (which has get_cpu_light() instead
  of the locallocks it gained in v4.9-RT).
  https://lkml.kernel.org/r/05AA4EC5C6EC1D48BE2CDCFF3AE0B8A637F78A15@CNMAILEX04.lenovo.com
]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/memcontrol.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 493b4986d5dc..56f67a15937b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1925,17 +1925,14 @@ static void drain_local_stock(struct work_struct *dummy)
  */
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
-	struct memcg_stock_pcp *stock;
-	int cpu = get_cpu_light();
-
-	stock = &per_cpu(memcg_stock, cpu);
+	struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
 
 	if (stock->cached != memcg) { /* reset if necessary */
 		drain_stock(stock);
 		stock->cached = memcg;
 	}
 	stock->nr_pages += nr_pages;
-	put_cpu_light();
+	put_cpu_var(memcg_stock);
 }
 
 /*
-- 
2.14.3

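The semantics at issue, briefly: get_cpu_light()/put_cpu_light() is the
RT-friendly pair that avoids disabling preemption and is needed when
the protected section may take a sleeping lock, while
get_cpu_var()/put_cpu_var() disables preemption and is fine once
nothing in the section can sleep. After 3e32cb2e0a12b the uncharge path
boils down to lockless atomic arithmetic, roughly:

    /* page_counter_uncharge(), sketched: no lock left to sleep on */
    for (c = counter; c; c = c->parent)
            atomic_long_sub(nr_pages, &c->count);

so drain_stock() cannot sleep anymore and the plain get_cpu_var() form
suffices.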

* [PATCH RT 7/7] Linux 4.4.126-rt141-rc1
From: Daniel Wagner @ 2018-04-04  7:16 UTC
  To: linux-kernel
  Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner,
	tom.zanussi, Daniel Wagner

Signed-off-by: Daniel Wagner <wagi@monom.org>
---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 09a4d4e4f010..564532c76548 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt141
+-rt141-rc1
-- 
2.14.3

