linux-mm.kvack.org archive mirror
* [PATCH 4/8] mm/swap: Use local_lock for protection
       [not found] <20200519201912.1564477-1-bigeasy@linutronix.de>
@ 2020-05-19 20:19 ` Sebastian Andrzej Siewior
  2020-05-19 23:58   ` Andrew Morton
  2020-05-20 10:53   ` Peter Zijlstra
  2020-05-19 20:19 ` [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data Sebastian Andrzej Siewior
  1 sibling, 2 replies; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-05-19 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Will Deacon,
	Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Andrew Morton, linux-mm, Sebastian Andrzej Siewior

From: Ingo Molnar <mingo@kernel.org>

The various struct pagevec per CPU variables are protected by disabling
either preemption or interrupts across the critical sections. Inside
these sections spinlocks have to be acquired.

These spinlocks are regular spinlock_t types which are converted to
"sleeping" spinlocks on PREEMPT_RT enabled kernels. Obviously sleeping
locks cannot be acquired in preemption or interrupt disabled sections.

local locks provide a trivial way to substitute preempt and interrupt
disable instances. On a non PREEMPT_RT enabled kernel local_lock() maps
to preempt_disable() and local_lock_irq() to local_irq_disable().
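
As a rough illustration of the pattern this patch applies (a sketch
only, using the names from the hunks below), a section that today
relies on implicit preemption disabling:

	pvec = &get_cpu_var(lru_add_pvec);	/* preempt_disable() */
	/* ... operate on this CPU's pagevec ... */
	put_cpu_var(lru_add_pvec);		/* preempt_enable()  */

becomes a named, lockdep-covered scope:

	local_lock(swapvec_lock);	/* !PREEMPT_RT: preempt_disable();
					 * PREEMPT_RT: per-CPU lock, section
					 * stays preemptible */
	pvec = this_cpu_ptr(&lru_add_pvec);
	/* ... operate on this CPU's pagevec ... */
	local_unlock(swapvec_lock);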

Add swapvec_lock to protect lru_add_pvec, lru_lazyfree_pvecs and the
other per-CPU pagevec variables, and rotate_lock to protect the per-CPU
lru_rotate_pvecs variable.

Change the relevant call sites to acquire these locks instead of using
preempt_disable() / get_cpu() / get_cpu_var() and local_irq_disable() /
local_irq_save().

There is neither a functional change nor a change in the generated
binary code for non PREEMPT_RT enabled non-debug kernels.

When lockdep is enabled local locks have lockdep maps embedded. These
allow lockdep to validate the protections, i.e. inappropriate usage of a
preemption-only protected section results in a lockdep warning, while
the same problem would go unnoticed with a plain preempt_disable() based
protection.

local locks also improve readability as they provide a named scope for
the protections, while preempt/interrupt disable sections are opaque and
scopeless.

Finally local locks allow PREEMPT_RT to substitute them with real
locking primitives to ensure the correctness of operation in a fully
preemptible kernel.
No functional change.

[ bigeasy: Adapted to use local_lock ]

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/swap.h |  2 ++
 mm/compaction.c      |  7 +++---
 mm/swap.c            | 59 ++++++++++++++++++++++++++++++--------------
 3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index e1bbf7a16b276..540b52c71bc95 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -12,6 +12,7 @@
 #include <linux/fs.h>
 #include <linux/atomic.h>
 #include <linux/page-flags.h>
+#include <linux/locallock.h>
 #include <asm/page.h>
 
 struct notifier_block;
@@ -328,6 +329,7 @@ extern unsigned long nr_free_pagecache_pages(void);
 
 
 /* linux/mm/swap.c */
+DECLARE_LOCAL_LOCK(swapvec_lock);
 extern void lru_cache_add(struct page *);
 extern void lru_cache_add_anon(struct page *page);
 extern void lru_cache_add_file(struct page *page);
diff --git a/mm/compaction.c b/mm/compaction.c
index 46f0fcc93081e..77972c8d4dead 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2243,15 +2243,14 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 		 * would succeed.
 		 */
 		if (cc->order > 0 && last_migrated_pfn) {
-			int cpu;
 			unsigned long current_block_start =
 				block_start_pfn(cc->migrate_pfn, cc->order);
 
 			if (last_migrated_pfn < current_block_start) {
-				cpu = get_cpu();
-				lru_add_drain_cpu(cpu);
+				local_lock(swapvec_lock);
+				lru_add_drain_cpu(smp_processor_id());
 				drain_local_pages(cc->zone);
-				put_cpu();
+				local_unlock(swapvec_lock);
 				/* No more flushing until we migrate again */
 				last_migrated_pfn = 0;
 			}
diff --git a/mm/swap.c b/mm/swap.c
index bf9a79fed62d7..03c97d15fcd69 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -44,8 +44,14 @@
 /* How many pages do we try to swap or page in/out together? */
 int page_cluster;
 
-static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
+
+/* Protecting lru_rotate_pvecs */
+static DEFINE_LOCAL_LOCK(rotate_lock);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
+
+/* Protecting the following struct pagevec */
+DEFINE_LOCAL_LOCK(swapvec_lock);
+static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_lazyfree_pvecs);
@@ -254,11 +260,11 @@ void rotate_reclaimable_page(struct page *page)
 		unsigned long flags;
 
 		get_page(page);
-		local_irq_save(flags);
+		local_lock_irqsave(rotate_lock, flags);
 		pvec = this_cpu_ptr(&lru_rotate_pvecs);
 		if (!pagevec_add(pvec, page) || PageCompound(page))
 			pagevec_move_tail(pvec);
-		local_irq_restore(flags);
+		local_unlock_irqrestore(rotate_lock, flags);
 	}
 }
 
@@ -308,12 +314,14 @@ void activate_page(struct page *page)
 {
 	page = compound_head(page);
 	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
-		struct pagevec *pvec = &get_cpu_var(activate_page_pvecs);
+		struct pagevec *pvec;
 
+		local_lock(swapvec_lock);
+		pvec = this_cpu_ptr(&activate_page_pvecs);
 		get_page(page);
 		if (!pagevec_add(pvec, page) || PageCompound(page))
 			pagevec_lru_move_fn(pvec, __activate_page, NULL);
-		put_cpu_var(activate_page_pvecs);
+		local_unlock(swapvec_lock);
 	}
 }
 
@@ -335,9 +343,12 @@ void activate_page(struct page *page)
 
 static void __lru_cache_activate_page(struct page *page)
 {
-	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
+	struct pagevec *pvec;
 	int i;
 
+	local_lock(swapvec_lock);
+	pvec = this_cpu_ptr(&lru_add_pvec);
+
 	/*
 	 * Search backwards on the optimistic assumption that the page being
 	 * activated has just been added to this pagevec. Note that only
@@ -357,7 +368,7 @@ static void __lru_cache_activate_page(struct page *page)
 		}
 	}
 
-	put_cpu_var(lru_add_pvec);
+	local_unlock(swapvec_lock);
 }
 
 /*
@@ -404,12 +415,14 @@ EXPORT_SYMBOL(mark_page_accessed);
 
 static void __lru_cache_add(struct page *page)
 {
-	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
+	struct pagevec *pvec;
 
+	local_lock(swapvec_lock);
+	pvec = this_cpu_ptr(&lru_add_pvec);
 	get_page(page);
 	if (!pagevec_add(pvec, page) || PageCompound(page))
 		__pagevec_lru_add(pvec);
-	put_cpu_var(lru_add_pvec);
+	local_unlock(swapvec_lock);
 }
 
 /**
@@ -603,9 +616,9 @@ void lru_add_drain_cpu(int cpu)
 		unsigned long flags;
 
 		/* No harm done if a racing interrupt already did this */
-		local_irq_save(flags);
+		local_lock_irqsave(rotate_lock, flags);
 		pagevec_move_tail(pvec);
-		local_irq_restore(flags);
+		local_unlock_irqrestore(rotate_lock, flags);
 	}
 
 	pvec = &per_cpu(lru_deactivate_file_pvecs, cpu);
@@ -641,11 +654,14 @@ void deactivate_file_page(struct page *page)
 		return;
 
 	if (likely(get_page_unless_zero(page))) {
-		struct pagevec *pvec = &get_cpu_var(lru_deactivate_file_pvecs);
+		struct pagevec *pvec;
+
+		local_lock(swapvec_lock);
+		pvec = this_cpu_ptr(&lru_deactivate_file_pvecs);
 
 		if (!pagevec_add(pvec, page) || PageCompound(page))
 			pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL);
-		put_cpu_var(lru_deactivate_file_pvecs);
+		local_unlock(swapvec_lock);
 	}
 }
 
@@ -660,12 +676,14 @@ void deactivate_file_page(struct page *page)
 void deactivate_page(struct page *page)
 {
 	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
-		struct pagevec *pvec = &get_cpu_var(lru_deactivate_pvecs);
+		struct pagevec *pvec;
 
+		local_lock(swapvec_lock);
+		pvec = this_cpu_ptr(&lru_deactivate_pvecs);
 		get_page(page);
 		if (!pagevec_add(pvec, page) || PageCompound(page))
 			pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
-		put_cpu_var(lru_deactivate_pvecs);
+		local_unlock(swapvec_lock);
 	}
 }
 
@@ -680,19 +698,22 @@ void mark_page_lazyfree(struct page *page)
 {
 	if (PageLRU(page) && PageAnon(page) && PageSwapBacked(page) &&
 	    !PageSwapCache(page) && !PageUnevictable(page)) {
-		struct pagevec *pvec = &get_cpu_var(lru_lazyfree_pvecs);
+		struct pagevec *pvec;
 
+		local_lock(swapvec_lock);
+		pvec = this_cpu_ptr(&lru_lazyfree_pvecs);
 		get_page(page);
 		if (!pagevec_add(pvec, page) || PageCompound(page))
 			pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL);
-		put_cpu_var(lru_lazyfree_pvecs);
+		local_unlock(swapvec_lock);
 	}
 }
 
 void lru_add_drain(void)
 {
-	lru_add_drain_cpu(get_cpu());
-	put_cpu();
+	local_lock(swapvec_lock);
+	lru_add_drain_cpu(smp_processor_id());
+	local_unlock(swapvec_lock);
 }
 
 #ifdef CONFIG_SMP
-- 
2.26.2




* [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
       [not found] <20200519201912.1564477-1-bigeasy@linutronix.de>
  2020-05-19 20:19 ` [PATCH 4/8] mm/swap: Use local_lock for protection Sebastian Andrzej Siewior
@ 2020-05-19 20:19 ` Sebastian Andrzej Siewior
  2020-05-19 21:46   ` Song Bao Hua
  1 sibling, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-05-19 20:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Will Deacon,
	Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Sebastian Andrzej Siewior

From: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>

zswap uses per-CPU compression. The per-CPU data pointer is acquired
with get_cpu_ptr() which implicitly disables preemption. It then
allocates memory inside the preempt-disabled region, which conflicts
with the PREEMPT_RT semantics.

Replace the implicit preemption control with an explicit local lock.
This allows RT kernels to substitute it with a real per CPU lock, which
serializes the access but keeps the code section preemptible. On non RT
kernels this maps to preempt_disable() as before, i.e. no functional
change.
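
The conflicting pattern looks roughly like this (a sketch, not the
exact zswap code; see the hunks below for the real call sites):

	dst = get_cpu_var(zswap_dstmem);	/* implicit preempt_disable() */
	tfm = *get_cpu_ptr(entry->pool->tfm);
	/* ... compress into dst ... */
	put_cpu_ptr(entry->pool->tfm);
	/* ... allocate a zpool handle and copy dst into it; on PREEMPT_RT
	 * the allocation path takes spinlock_t locks, which are sleeping
	 * locks there, so this must not happen while preemption is still
	 * disabled ... */
	put_cpu_var(zswap_dstmem);		/* preempt_enable() */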

[bigeasy: Use local_lock(), additional hunks, patch description]

Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/zswap.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index fbb782924ccc5..1db2ad941e501 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -18,6 +18,7 @@
 #include <linux/highmem.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/locallock.h>
 #include <linux/types.h>
 #include <linux/atomic.h>
 #include <linux/frontswap.h>
@@ -388,6 +389,8 @@ static struct zswap_entry *zswap_entry_find_get(struct rb_root *root,
 * per-cpu code
 **********************************/
 static DEFINE_PER_CPU(u8 *, zswap_dstmem);
+/* Used for zswap_dstmem and tfm */
+static DEFINE_LOCAL_LOCK(zswap_cpu_lock);
 
 static int zswap_dstmem_prepare(unsigned int cpu)
 {
@@ -919,10 +922,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		dlen = PAGE_SIZE;
 		src = (u8 *)zhdr + sizeof(struct zswap_header);
 		dst = kmap_atomic(page);
-		tfm = *get_cpu_ptr(entry->pool->tfm);
+		local_lock(zswap_cpu_lock);
+		tfm = *this_cpu_ptr(entry->pool->tfm);
 		ret = crypto_comp_decompress(tfm, src, entry->length,
 					     dst, &dlen);
-		put_cpu_ptr(entry->pool->tfm);
+		local_unlock(zswap_cpu_lock);
 		kunmap_atomic(dst);
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
@@ -1074,12 +1078,12 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	}
 
 	/* compress */
-	dst = get_cpu_var(zswap_dstmem);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
+	local_lock(zswap_cpu_lock);
+	dst = *this_cpu_ptr(&zswap_dstmem);
+	tfm = *this_cpu_ptr(entry->pool->tfm);
 	src = kmap_atomic(page);
 	ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
 	kunmap_atomic(src);
-	put_cpu_ptr(entry->pool->tfm);
 	if (ret) {
 		ret = -EINVAL;
 		goto put_dstmem;
@@ -1103,7 +1107,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	memcpy(buf, &zhdr, hlen);
 	memcpy(buf + hlen, dst, dlen);
 	zpool_unmap_handle(entry->pool->zpool, handle);
-	put_cpu_var(zswap_dstmem);
+	local_unlock(zswap_cpu_lock);
 
 	/* populate entry */
 	entry->offset = offset;
@@ -1131,7 +1135,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	return 0;
 
 put_dstmem:
-	put_cpu_var(zswap_dstmem);
+	local_unlock(zswap_cpu_lock);
 	zswap_pool_put(entry->pool);
 freepage:
 	zswap_entry_cache_free(entry);
@@ -1176,9 +1180,10 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
 	if (zpool_evictable(entry->pool->zpool))
 		src += sizeof(struct zswap_header);
 	dst = kmap_atomic(page);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
+	local_lock(zswap_cpu_lock);
+	tfm = *this_cpu_ptr(entry->pool->tfm);
 	ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
-	put_cpu_ptr(entry->pool->tfm);
+	local_unlock(zswap_cpu_lock);
 	kunmap_atomic(dst);
 	zpool_unmap_handle(entry->pool->zpool, entry->handle);
 	BUG_ON(ret);
-- 
2.26.2




* RE: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
  2020-05-19 20:19 ` [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data Sebastian Andrzej Siewior
@ 2020-05-19 21:46   ` Song Bao Hua
  2020-05-20 10:26     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Song Bao Hua @ 2020-05-19 21:46 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Will Deacon,
	Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Linuxarm


> From: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>

> zwap uses per-CPU compression. The per-CPU data pointer is acquired with
> get_cpu_ptr() which implicitly disables preemption. It allocates memory inside the preempt disabled region which conflicts with the PREEMPT_RT semantics.

> Replace the implicit preemption control with an explicit local lock.
> This allows RT kernels to substitute it with a real per CPU lock, which serializes the access but keeps the code section preemptible. On non RT kernels this maps to preempt_disable() as before, i.e. no functional change.

Hi Luis,
In the patch below, in order to use the acomp APIs and leverage the power of hardware compressors, I have moved to a mutex:
https://marc.info/?l=linux-crypto-vger&m=158941285830302&w=2
https://marc.info/?l=linux-crypto-vger&m=158941287930311&w=2

so once we get some progress on that one, I guess we don't need a special patch for RT any more.

> [bigeasy: Use local_lock(), additional hunks, patch description]

> Cc: Seth Jennings <sjenning@redhat.com>
> Cc: Dan Streetman <ddstreet@ieee.org>
> Cc: Vitaly Wool <vitaly.wool@konsulko.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  mm/zswap.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)

Thanks
Barry





* Re: [PATCH 4/8] mm/swap: Use local_lock for protection
  2020-05-19 20:19 ` [PATCH 4/8] mm/swap: Use local_lock for protection Sebastian Andrzej Siewior
@ 2020-05-19 23:58   ` Andrew Morton
  2020-05-20  2:17     ` Matthew Wilcox
  2020-05-20 10:53   ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2020-05-19 23:58 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Will Deacon, Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	linux-mm

On Tue, 19 May 2020 22:19:08 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> From: Ingo Molnar <mingo@kernel.org>
> 
> The various struct pagevec per CPU variables are protected by disabling
> either preemption or interrupts across the critical sections. Inside
> these sections spinlocks have to be acquired.
> 
> These spinlocks are regular spinlock_t types which are converted to
> "sleeping" spinlocks on PREEMPT_RT enabled kernels. Obviously sleeping
> locks cannot be acquired in preemption or interrupt disabled sections.
> 
> local locks provide a trivial way to substitute preempt and interrupt
> disable instances. On a non PREEMPT_RT enabled kernel local_lock() maps
> to preempt_disable() and local_lock_irq() to local_irq_disable().
> 
> Add swapvec_lock to protect the per-CPU lru_add_pvec and
> lru_lazyfree_pvecs variables and rotate_lock to protect the per-CPU
> lru_rotate_pvecs variable
> 
> Change the relevant call sites to acquire these locks instead of using
> preempt_disable() / get_cpu() / get_cpu_var() and local_irq_disable() /
> local_irq_save().
> 
> There is neither a functional change nor a change in the generated
> binary code for non PREEMPT_RT enabled non-debug kernels.
> 
> When lockdep is enabled local locks have lockdep maps embedded. These
> allow lockdep to validate the protections, i.e. inappropriate usage of a
> preemption only protected sections would result in a lockdep warning
> while the same problem would not be noticed with a plain
> preempt_disable() based protection.
> 
> local locks also improve readability as they provide a named scope for
> the protections while preempt/interrupt disable are opaque scopeless.
> 
> Finally local locks allow PREEMPT_RT to substitute them with real
> locking primitives to ensure the correctness of operation in a fully
> preemptible kernel.
> No functional change.
>
> ...
>
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -12,6 +12,7 @@
>  #include <linux/fs.h>
>  #include <linux/atomic.h>
>  #include <linux/page-flags.h>
> +#include <linux/locallock.h>

Could we please make these local_lock.h and local_lock_internal.h?  Making
the filenames different from everything else is just irritating!

> +				local_lock(swapvec_lock);

It's quite peculiar that these operations appear to be pass-by-value. 
All other locking operations are pass-by-reference - spin_lock(&lock),
not spin_lock(lock).  This is what the eye expects to see and it's
simply more logical - calling code shouldn't have to "know" that the
locking operations are implemented as cpp macros.  And we'd be in a
mess if someone tried to convert these to real C functions.

Which prompts the question: why were all these operations implemented
in the preprocessor anyway?  afaict they could have been written in C.
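
(Illustration of the two calling conventions being contrasted here; the
second form is hypothetical, the series as posted only provides the
first:

	local_lock(swapvec_lock);	/* current form, looks pass-by-value */
	local_lock(&swapvec_lock);	/* what the eye expects, analogous to
					 * spin_lock(&lock) */
)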





* Re: [PATCH 4/8] mm/swap: Use local_lock for protection
  2020-05-19 23:58   ` Andrew Morton
@ 2020-05-20  2:17     ` Matthew Wilcox
  0 siblings, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2020-05-20  2:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sebastian Andrzej Siewior, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Steven Rostedt, Will Deacon, Thomas Gleixner,
	Paul E . McKenney, Linus Torvalds, linux-mm

On Tue, May 19, 2020 at 04:58:37PM -0700, Andrew Morton wrote:
> On Tue, 19 May 2020 22:19:08 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > +				local_lock(swapvec_lock);
> 
> It's quite peculiar that these operations appear to be pass-by-value. 
> All other locking operations are pass-by-reference - spin_lock(&lock),
> not spin_lock(lock).  This is what the eye expects to see and it's
> simply more logical - calling code shouldn't have to "know" that the
> locking operations are implemented as cpp macros.  And we'd be in a
> mess if someone tried to convert these to real C functions.

The funny thing is that the documentation gets this right:

+The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a few
+implications. For example, on a non-PREEMPT_RT kernel the following code
+sequence works as expected::
+
+  local_lock_irq(&local_lock);
+  raw_spin_lock(&lock);

but apparently the implementation changed without the documentation matching.



* Re: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
  2020-05-19 21:46   ` Song Bao Hua
@ 2020-05-20 10:26     ` Sebastian Andrzej Siewior
  2020-05-20 11:13       ` Song Bao Hua
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-05-20 10:26 UTC (permalink / raw)
  To: Song Bao Hua
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Will Deacon, Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Linuxarm

On 2020-05-19 21:46:06 [+0000], Song Bao Hua wrote:
> Hi Luis,
> In the below patch, in order to use the acomp APIs to leverage the power of hardware compressors. I have moved to mutex:
> https://marc.info/?l=linux-crypto-vger&m=158941285830302&w=2
> https://marc.info/?l=linux-crypto-vger&m=158941287930311&w=2
> 
> so once we get some progress on that one, I guess we don't need a special patch for RT any more.

If you convert it this way, away from the current concept, then we
could drop it from the series.
The second patch shows the following hunk:

|@@ -1075,11 +1124,20 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
| 
| 	/* compress */
| 	dst = get_cpu_var(zswap_dstmem);
|	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
|	put_cpu_var(zswap_dstmem);

So here you get the per-CPU versions of `dst' and `acomp_ctx' and then
allow preemption again.

|	mutex_lock(&acomp_ctx->mutex);
|
|	src = kmap(page);
|	sg_init_one(&input, src, PAGE_SIZE);
|	/* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
|	sg_init_one(&output, dst, PAGE_SIZE * 2);

and here you use `dst' and `acomp_ctx' after the preempt_disable() has
been dropped so I don't understand why you used get_cpu_var(). It is
either protected by the mutex and doesn't require get_cpu_var() or it
isn't (and should have additional protection).

|	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
|	ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
|	dlen = acomp_ctx->req->dlen;
|	kunmap(page);
|
| 	if (ret) {
| 		ret = -EINVAL;
| 		goto put_dstmem;
|

Sebastian



* Re: [PATCH 4/8] mm/swap: Use local_lock for protection
  2020-05-19 20:19 ` [PATCH 4/8] mm/swap: Use local_lock for protection Sebastian Andrzej Siewior
  2020-05-19 23:58   ` Andrew Morton
@ 2020-05-20 10:53   ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2020-05-20 10:53 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Ingo Molnar, Steven Rostedt, Will Deacon,
	Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Andrew Morton, linux-mm

On Tue, May 19, 2020 at 10:19:08PM +0200, Sebastian Andrzej Siewior wrote:
> diff --git a/mm/swap.c b/mm/swap.c
> index bf9a79fed62d7..03c97d15fcd69 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -44,8 +44,14 @@
>  /* How many pages do we try to swap or page in/out together? */
>  int page_cluster;
>  
> -static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
> +
> +/* Protecting lru_rotate_pvecs */
> +static DEFINE_LOCAL_LOCK(rotate_lock);
>  static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> +
> +/* Protecting the following struct pagevec */
> +DEFINE_LOCAL_LOCK(swapvec_lock);
> +static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
>  static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
>  static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>  static DEFINE_PER_CPU(struct pagevec, lru_lazyfree_pvecs);

So personally I'd prefer to have this look like:

struct lru_vecs {
	struct local_lock lock;
	struct pagevec add;
	struct pagevec rotate;
	struct pagevec deact_file;
	struct pagevec deact;
	struct pagevec lazyfree;
#ifdef CONFIG_SMP
	struct pagevec active;
#endif
};

DEFINE_PER_CPU(struct lru_vecs, lru_pvec);

or something, but I realize that is a lot of churn (although highly
automated), so I'll leave that to the mm folks.
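
A call site would then presumably look something like this (a sketch;
it assumes a local_lock() variant that takes the embedded lock, which
the series as posted does not provide):

	struct lru_vecs *lru;

	local_lock(&lru_pvec.lock);		/* this CPU's embedded lock */
	lru = this_cpu_ptr(&lru_pvec);
	if (!pagevec_add(&lru->add, page) || PageCompound(page))
		__pagevec_lru_add(&lru->add);
	local_unlock(&lru_pvec.lock);

That way the lock and the pagevecs it protects are defined, and
documented, in one place.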



* RE: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
  2020-05-20 10:26     ` Sebastian Andrzej Siewior
@ 2020-05-20 11:13       ` Song Bao Hua
  2020-05-20 11:57         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 10+ messages in thread
From: Song Bao Hua @ 2020-05-20 11:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Will Deacon, Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Linuxarm

> On 2020-05-19 21:46:06 [+0000], Song Bao Hua wrote:
> > Hi Luis,
> > In the below patch, in order to use the acomp APIs to leverage the power of
> hardware compressors. I have moved to mutex:
> > https://marc.info/?l=linux-crypto-vger&m=158941285830302&w=2
> > https://marc.info/?l=linux-crypto-vger&m=158941287930311&w=2
> >
> > so once we get some progress on that one, I guess we don't need a special
> patch for RT any more.
> 
> If you convert this way from the current concept then we could drop it from
> the series.
> The second patch shows the following hunk:
> 
> |@@ -1075,11 +1124,20 @@ static int zswap_frontswap_store(unsigned
> type,
> |pgoff_t offset,
> |
> | 	/* compress */
> | 	dst = get_cpu_var(zswap_dstmem);
> |	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
> |	put_cpu_var(zswap_dstmem);
> 
> So here you get per-CPU version of `dst' and `acomp_ctx' and then allow
> preemption again.
> 
> |	mutex_lock(&acomp_ctx->mutex);
> |
> |	src = kmap(page);
> |	sg_init_one(&input, src, PAGE_SIZE);
> |	/* zswap_dstmem is of size (PAGE_SIZE * 2). Reflect same in sg_list */
> |	sg_init_one(&output, dst, PAGE_SIZE * 2);
> 
> and here you use `dst' and `acomp_ctx' after the preempt_disable() has been
> dropped so I don't understand why you used get_cpu_var(). It is either
> protected by the mutex and doesn't require get_cpu_var() or it isn't (and
> should have additional protection).

The old code worked like this:
For each CPU, there is one per-CPU comp and one per-CPU destination buffer for compression - zswap_dstmem.
For example, on cpu1, once you begin to compress, you hold the per-CPU comp and the per-CPU destination buffer, and preemption is disabled. So a decompression cannot run on the same core in parallel, and neither can a second compression. At the same time, the thread cannot migrate to another core. Other cores can still do compression/decompression in parallel.

The new code works like this:
For each CPU, there is still one per-CPU acomp-ctx and one per-CPU destination buffer. Here acomp replaces comp, and acomp may sleep during compression and decompression.
For example, on cpu1, once you begin to compress, you hold CPU1's per-CPU acomp-ctx and per-CPU destination buffer; the code below makes sure you get the acomp and the dstmem from the same core by briefly disabling preemption with get_cpu_var() and put_cpu_var():
dst = get_cpu_var(zswap_dstmem);
acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
put_cpu_var(zswap_dstmem);

then there are two cases:

1. After getting the dst and acomp_ctx of cpu1, you might stay on cpu1 the whole time. The mutex in the per-CPU acomp-ctx guarantees that two compressions won't run on the same core in parallel, and it also makes certain that a compression and a decompression won't run on the same core in parallel. Everything is like before.

2. After getting the dst and acomp_ctx of cpu1, you might migrate to cpu2, but even after moving to cpu2 you are still holding the mutex of cpu1's acomp-ctx.
If, at that time, cpu1 gets another compression request, it will be blocked on the mutex now held by the thread running on cpu2.
Likewise, if cpu1 wants to do a decompression, it will be blocked on that same mutex.

Everything is like before: no matter which core you have migrated to, once you hold the mutex of core N, any other compression/decompression that wants the mutex of core N will be blocked. So, broadly, if you have M cores, you can still run M compressions/decompressions in parallel, like before.

My basic idea is to keep the working model unchanged, like before.
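
In code, the model described above is roughly (a sketch matching the
hunk you quoted; error handling and the rest of the function omitted):

	dst = get_cpu_var(zswap_dstmem);	/* briefly pin CPU N */
	acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
	put_cpu_var(zswap_dstmem);		/* may migrate from here on */

	mutex_lock(&acomp_ctx->mutex);		/* CPU N's ctx, wherever we run now */
	/* ... set up the sg lists, compress into dst, may sleep ... */
	mutex_unlock(&acomp_ctx->mutex);

Anyone else wanting CPU N's ctx or dstmem has to take the same mutex, so
at most one compression/decompression is in flight per ctx, like before.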

> 
> |	acomp_request_set_params(acomp_ctx->req, &input, &output,
> PAGE_SIZE, dlen);
> |	ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req),
> &acomp_ctx->wait);
> |	dlen = acomp_ctx->req->dlen;
> |	kunmap(page);
> |
> | 	if (ret) {
> | 		ret = -EINVAL;
> | 		goto put_dstmem;
> |
> 
> Sebastian

Barry



* Re: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
  2020-05-20 11:13       ` Song Bao Hua
@ 2020-05-20 11:57         ` Sebastian Andrzej Siewior
  2020-05-20 12:01           ` Song Bao Hua
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-05-20 11:57 UTC (permalink / raw)
  To: Song Bao Hua
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Will Deacon, Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Linuxarm

On 2020-05-20 11:13:31 [+0000], Song Bao Hua wrote:
> For example, on cpu1, once you begin to compress, you hold the percpu acomp-ctx and percpu destination buffer of CPU1, the below code makes sure you get the acomp and dstmem from the same core by disabling preemption with get_cpu_var and put_cpu_var:
> dst = get_cpu_var(zswap_dstmem);
> acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
> put_cpu_var(zswap_dstmem);
> 
> then there are two cases:
> 
> 1. after getting dst and acomp_ctx of cpu1, you might always work in cpu1, the mutex in per-cpu acomp-ctx will guarantee two compressions won't do at the same core in parallel, and it also makes certain compression and decompression won't do at the same core in parallel. Everything is like before.

For readability I suggest not mixing per-CPU and per-CTX variables like
that. If zswap_dstmem is protected by the mutex, please make it part of
acomp_ctx.
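
i.e. something along these lines (a sketch only, the field names are
illustrative):

	struct crypto_acomp_ctx {
		struct crypto_acomp *acomp;
		struct acomp_req *req;
		struct crypto_wait wait;
		u8 *dstmem;		/* replaces the separate per-CPU zswap_dstmem */
		struct mutex mutex;	/* protects everything above */
	};

so that the mutex and everything it protects live in one place.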

Sebastian



* RE: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
  2020-05-20 11:57         ` Sebastian Andrzej Siewior
@ 2020-05-20 12:01           ` Song Bao Hua
  0 siblings, 0 replies; 10+ messages in thread
From: Song Bao Hua @ 2020-05-20 12:01 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Steven Rostedt,
	Will Deacon, Thomas Gleixner, Paul E . McKenney, Linus Torvalds,
	Luis Claudio R. Goncalves, Seth Jennings, Dan Streetman,
	Vitaly Wool, Andrew Morton, linux-mm, Linuxarm

> Subject: Re: [PATCH 8/8] mm/zswap: Use local lock to protect per-CPU data
> 
> On 2020-05-20 11:13:31 [+0000], Song Bao Hua wrote:
> > For example, on cpu1, once you begin to compress, you hold the percpu
> acomp-ctx and percpu destination buffer of CPU1, the below code makes sure
> you get the acomp and dstmem from the same core by disabling preemption
> with get_cpu_var and put_cpu_var:
> > dst = get_cpu_var(zswap_dstmem);
> > acomp_ctx = *this_cpu_ptr(entry->pool->acomp_ctx);
> > put_cpu_var(zswap_dstmem);
> >
> > then there are two cases:
> >
> > 1. after getting dst and acomp_ctx of cpu1, you might always work in cpu1,
> the mutex in per-cpu acomp-ctx will guarantee two compressions won't do at
> the same core in parallel, and it also makes certain compression and
> decompression won't do at the same core in parallel. Everything is like before.
> 
> For readability I suggest not to mix per-CPU and per-CTX variables like that. If
> zswap_dstmem is protected by the mutex, please make it part of acomp_ctx.
> 

Yep, it seems this will avoid further explanations for other people who read the code :-)

> Sebastian

Barry


