* [PATCH] mm: Disable zsmalloc on PREEMPT_RT
@ 2021-09-23 17:01 Sebastian Andrzej Siewior
  2021-09-23 23:06 ` Minchan Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-23 17:01 UTC (permalink / raw)
  To: linux-mm; +Cc: Andrew Morton, Thomas Gleixner, Sebastian Andrzej Siewior

For efficiency reasons, zsmalloc is using a slim `handle'. The value is
the address of a memory allocation of 4 or 8 bytes depending on the size
of the long data type. The lowest bit in that allocated memory is used
as a bit spin lock.
The usage of the bit spin lock is problematic because with the bit spin
lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
sleeping locks on PREEMPT_RT and therefore must not be acquired with
disabled preemption.
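
For reference, this is roughly what the current scheme looks like
(condensed from mm/zsmalloc.c for illustration, not the verbatim code):

  /* handle = address of a long-sized allocation from the handle cache */
  static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
  {
	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
  }

  /* the lowest bit of that allocation doubles as the per-handle lock */
  static void pin_tag(unsigned long handle)
  {
	/* bit_spin_lock() keeps preemption disabled while the bit is held */
	bit_spin_lock(HANDLE_PIN_BIT, (unsigned long *)handle);
  }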

There is a patch which extends the handle on PREEMPT_RT so that a full
spinlock_t fits (even with lockdep enabled) and then eliminates the bit
spin lock. I'm not sure how sensible zsmalloc on PREEMPT_RT is given
that it is used to store compressed user memory.

Disable ZSMALLOC on PREEMPT_RT. If there is need for it, we can try to
get it to work.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 mm/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 9f1e0098522c2..541371e64c477 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -640,6 +640,7 @@ config ZSWAP_ZPOOL_DEFAULT_Z3FOLD
 
 config ZSWAP_ZPOOL_DEFAULT_ZSMALLOC
 	bool "zsmalloc"
+	depends on !PREEMPT_RT
 	select ZSMALLOC
 	help
 	  Use the zsmalloc allocator as the default allocator.
@@ -690,7 +691,7 @@ config Z3FOLD
 
 config ZSMALLOC
 	tristate "Memory allocator for compressed pages"
-	depends on MMU
+	depends on MMU && !PREEMPT_RT
 	help
 	  zsmalloc is a slab-based memory allocator designed to store
 	  compressed RAM pages.  zsmalloc uses virtual memory mapping
-- 
2.33.0




* Re: [PATCH] mm: Disable zsmalloc on PREEMPT_RT
  2021-09-23 17:01 [PATCH] mm: Disable zsmalloc on PREEMPT_RT Sebastian Andrzej Siewior
@ 2021-09-23 23:06 ` Minchan Kim
  2021-09-24  7:08   ` Sebastian Andrzej Siewior
  2021-09-28  8:44   ` [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage Sebastian Andrzej Siewior
  0 siblings, 2 replies; 9+ messages in thread
From: Minchan Kim @ 2021-09-23 23:06 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-mm, Andrew Morton, Thomas Gleixner

On Thu, Sep 23, 2021 at 07:01:21PM +0200, Sebastian Andrzej Siewior wrote:
> For efficiency reasons, zsmalloc is using a slim `handle'. The value is
> the address of a memory allocation of 4 or 8 bytes depending on the size
> of the long data type. The lowest bit in that allocated memory is used
> as a bit spin lock.
> The usage of the bit spin lock is problematic because with the bit spin
> lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
> sleeping locks on PREEMPT_RT and therefore must not be acquired with
> disabled preemption.

I am not sure how long the preemption-disabled section lasts, since
preemption is mostly disabled just for a page copy.

> 
> There is a patch which extends the handle on PREEMPT_RT so that a full
> spinlock_t fits (even with lockdep enabled) and then eliminates the bit
> spin lock. I'm not sure how sensible zsmalloc on PREEMPT_RT is given
> that it is used to store compressed user memory.

I don't see the relation between PREEMPT_RT and compressed user memory
that leads you to that conclusion. Disabling zsmalloc also disables
zram. I think swapping to compressed memory, rather than to a
storage/block device, would sometimes be useful even for RT.

> 
> Disable ZSMALLOC on PREEMPT_RT. If there is need for it, we can try to
> get it to work.

Please send the patch which extends the handle with a spin_lock rather
than simply disabling it.

> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  mm/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 9f1e0098522c2..541371e64c477 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -640,6 +640,7 @@ config ZSWAP_ZPOOL_DEFAULT_Z3FOLD
>  
>  config ZSWAP_ZPOOL_DEFAULT_ZSMALLOC
>  	bool "zsmalloc"
> +	depends on !PREEMPT_RT
>  	select ZSMALLOC
>  	help
>  	  Use the zsmalloc allocator as the default allocator.
> @@ -690,7 +691,7 @@ config Z3FOLD
>  
>  config ZSMALLOC
>  	tristate "Memory allocator for compressed pages"
> -	depends on MMU
> +	depends on MMU && !PREEMPT_RT
>  	help
>  	  zsmalloc is a slab-based memory allocator designed to store
>  	  compressed RAM pages.  zsmalloc uses virtual memory mapping
> -- 
> 2.33.0
> 
> 



* Re: [PATCH] mm: Disable zsmalloc on PREEMPT_RT
  2021-09-23 23:06 ` Minchan Kim
@ 2021-09-24  7:08   ` Sebastian Andrzej Siewior
  2021-09-28  8:44   ` [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage Sebastian Andrzej Siewior
  1 sibling, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-24  7:08 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-mm, Andrew Morton, Thomas Gleixner

On 2021-09-23 16:06:58 [-0700], Minchan Kim wrote:
> On Thu, Sep 23, 2021 at 07:01:21PM +0200, Sebastian Andrzej Siewior wrote:
> > For efficiency reasons, zsmalloc is using a slim `handle'. The value is
> > the address of a memory allocation of 4 or 8 bytes depending on the size
> > of the long data type. The lowest bit in that allocated memory is used
> > as a bit spin lock.
> > The usage of the bit spin lock is problematic because with the bit spin
> > lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
> > sleeping locks on PREEMPT_RT and therefore must not be acquired with
> > disabled preemption.
> 
> I am not sure how long the preemption-disabled section lasts, since
> preemption is mostly disabled just for a page copy.

After the bit spin lock is acquired, sleeping locks are taken. These are
the problematic part.
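
The problematic nesting is roughly this (simplified, based on the
current mm/zsmalloc.c paths; not verbatim):

  zs_map_object():
	pin_tag(handle);            /* bit_spin_lock(), disables preemption */
	migrate_read_lock(zspage);  /* rwlock_t, a sleeping lock on RT      */

  zs_free():
	pin_tag(handle);            /* bit_spin_lock(), disables preemption */
	spin_lock(&class->lock);    /* spinlock_t, a sleeping lock on RT    */

Acquiring a sleeping lock with preemption disabled is invalid on
PREEMPT_RT no matter how short the critical section is.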

> > 
> > There is a patch which extends the handle on PREEMPT_RT so that a full
> > spinlock_t fits (even with lockdep enabled) and then eliminates the bit
> > spin lock. I'm not sure how sensible zsmalloc on PREEMPT_RT is given
> > that it is used to store compressed user memory.
> 
> I don't see the relation between PREEMPT_RT and compressed user memory
> that leads you to that conclusion. Disabling zsmalloc also disables
> zram. I think swapping to compressed memory, rather than to a
> storage/block device, would sometimes be useful even for RT.

The only user of zsmalloc I found was zswap, which compresses user pages.
Maybe I didn't connect all the dots.

> > Disable ZSMALLOC on PREEMPT_RT. If there is need for it, we can try to
> > get it to work.
> 
> Please send the patch which extends the handle with a spin_lock rather
> than simply disabling it.

Okay. I will clean it up a little and post it then.

Sebastian



* [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-23 23:06 ` Minchan Kim
  2021-09-24  7:08   ` Sebastian Andrzej Siewior
@ 2021-09-28  8:44   ` Sebastian Andrzej Siewior
  2021-09-28 22:47     ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-28  8:44 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, Andrew Morton, Thomas Gleixner, Mike Galbraith, Mike Galbraith

From: Mike Galbraith <umgwanakikbuti@gmail.com>

For efficiency reasons, zsmalloc is using a slim `handle'. The value is
the address of a memory allocation of 4 or 8 bytes depending on the size
of the long data type. The lowest bit in that allocated memory is used
as a bit spin lock.
The usage of the bit spin lock is problematic because with the bit spin
lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
sleeping locks on PREEMPT_RT and therefore must not be acquired with
disabled preemption.

Extend the handle to struct zsmalloc_handle which holds the old handle as
addr and a spinlock_t which replaces the bit spinlock. Replace all the
wrapper functions accordingly.

The usage of get_cpu_var() in zs_map_object() is problematic because
it disables preemption and makes it impossible to acquire any sleeping
lock on PREEMPT_RT such as a spinlock_t.
Replace the get_cpu_var() usage with a local_lock_t which is embedded in
struct mapping_area. It ensures that access to the struct is
synchronized against all users on the same CPU.
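
The zs_map_object() side of the conversion boils down to the usual
get_cpu_var() -> local_lock() pattern (condensed from the diff below):

	/* before: preemption disabled for the whole mapping operation */
	area = &get_cpu_var(zs_map_area);
	...
	put_cpu_var(zs_map_area);

	/* after: a local_lock_t, which is preempt_disable() on !RT and a
	 * per-CPU spinlock_t on PREEMPT_RT, so the section stays preemptible
	 */
	local_lock(&zs_map_area.lock);
	area = this_cpu_ptr(&zs_map_area);
	...
	local_unlock(&zs_map_area.lock);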

This survived LTP testing.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bigeasy: replaced the bit_spin_lock() with a mutex and the
 get_locked_var() usage, and updated the patch description. Mike then
 fixed the size magic and made the handle lock a spinlock_t.]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

Minchan, this is the replacement we have so far. There is something
similar for block/zram. By disabling zsmalloc we also have zram disabled,
so there is no fallout.
To the best of my knowledge, the only usage/testing is LTP based.

 mm/zsmalloc.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 78 insertions(+), 6 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 68e8831068f4b..22c18ac605b5f 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -57,6 +57,7 @@
 #include <linux/wait.h>
 #include <linux/pagemap.h>
 #include <linux/fs.h>
+#include <linux/local_lock.h>
 
 #define ZSPAGE_MAGIC	0x58
 
@@ -77,6 +78,20 @@
 
 #define ZS_HANDLE_SIZE (sizeof(unsigned long))
 
+#ifdef CONFIG_PREEMPT_RT
+
+struct zsmalloc_handle {
+	unsigned long addr;
+	spinlock_t lock;
+};
+
+#define ZS_HANDLE_ALLOC_SIZE (sizeof(struct zsmalloc_handle))
+
+#else
+
+#define ZS_HANDLE_ALLOC_SIZE (sizeof(unsigned long))
+#endif
+
 /*
  * Object location (<PFN>, <obj_idx>) is encoded as
  * a single (unsigned long) handle value.
@@ -293,6 +308,7 @@ struct zspage {
 };
 
 struct mapping_area {
+	local_lock_t lock;
 	char *vm_buf; /* copy buffer for objects that span pages */
 	char *vm_addr; /* address of kmap_atomic()'ed pages */
 	enum zs_mapmode vm_mm; /* mapping mode */
@@ -322,7 +338,7 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage) {}
 
 static int create_cache(struct zs_pool *pool)
 {
-	pool->handle_cachep = kmem_cache_create("zs_handle", ZS_HANDLE_SIZE,
+	pool->handle_cachep = kmem_cache_create("zs_handle", ZS_HANDLE_ALLOC_SIZE,
 					0, 0, NULL);
 	if (!pool->handle_cachep)
 		return 1;
@@ -346,10 +362,27 @@ static void destroy_cache(struct zs_pool *pool)
 
 static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
 {
-	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
-			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
+	void *p;
+
+	p = kmem_cache_alloc(pool->handle_cachep,
+			     gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
+#ifdef CONFIG_PREEMPT_RT
+	if (p) {
+		struct zsmalloc_handle *zh = p;
+
+		spin_lock_init(&zh->lock);
+	}
+#endif
+	return (unsigned long)p;
 }
 
+#ifdef CONFIG_PREEMPT_RT
+static struct zsmalloc_handle *zs_get_pure_handle(unsigned long handle)
+{
+	return (void *)(handle & ~((1 << OBJ_TAG_BITS) - 1));
+}
+#endif
+
 static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
 {
 	kmem_cache_free(pool->handle_cachep, (void *)handle);
@@ -368,12 +401,18 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 
 static void record_obj(unsigned long handle, unsigned long obj)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	WRITE_ONCE(zh->addr, obj);
+#else
 	/*
 	 * lsb of @obj represents handle lock while other bits
 	 * represent object value the handle is pointing so
 	 * updating shouldn't do store tearing.
 	 */
 	WRITE_ONCE(*(unsigned long *)handle, obj);
+#endif
 }
 
 /* zpool driver */
@@ -455,7 +494,9 @@ MODULE_ALIAS("zpool-zsmalloc");
 #endif /* CONFIG_ZPOOL */
 
 /* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
-static DEFINE_PER_CPU(struct mapping_area, zs_map_area);
+static DEFINE_PER_CPU(struct mapping_area, zs_map_area) = {
+	.lock	= INIT_LOCAL_LOCK(lock),
+};
 
 static bool is_zspage_isolated(struct zspage *zspage)
 {
@@ -862,7 +903,13 @@ static unsigned long location_to_obj(struct page *page, unsigned int obj_idx)
 
 static unsigned long handle_to_obj(unsigned long handle)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	return zh->addr;
+#else
 	return *(unsigned long *)handle;
+#endif
 }
 
 static unsigned long obj_to_head(struct page *page, void *obj)
@@ -876,22 +923,46 @@ static unsigned long obj_to_head(struct page *page, void *obj)
 
 static inline int testpin_tag(unsigned long handle)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	return spin_is_locked(&zh->lock);
+#else
 	return bit_spin_is_locked(HANDLE_PIN_BIT, (unsigned long *)handle);
+#endif
 }
 
 static inline int trypin_tag(unsigned long handle)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	return spin_trylock(&zh->lock);
+#else
 	return bit_spin_trylock(HANDLE_PIN_BIT, (unsigned long *)handle);
+#endif
 }
 
 static void pin_tag(unsigned long handle) __acquires(bitlock)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	return spin_lock(&zh->lock);
+#else
 	bit_spin_lock(HANDLE_PIN_BIT, (unsigned long *)handle);
+#endif
 }
 
 static void unpin_tag(unsigned long handle) __releases(bitlock)
 {
+#ifdef CONFIG_PREEMPT_RT
+	struct zsmalloc_handle *zh = zs_get_pure_handle(handle);
+
+	return spin_unlock(&zh->lock);
+#else
 	bit_spin_unlock(HANDLE_PIN_BIT, (unsigned long *)handle);
+#endif
 }
 
 static void reset_page(struct page *page)
@@ -1274,7 +1345,8 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 	class = pool->size_class[class_idx];
 	off = (class->size * obj_idx) & ~PAGE_MASK;
 
-	area = &get_cpu_var(zs_map_area);
+	local_lock(&zs_map_area.lock);
+	area = this_cpu_ptr(&zs_map_area);
 	area->vm_mm = mm;
 	if (off + class->size <= PAGE_SIZE) {
 		/* this object is contained entirely within a page */
@@ -1328,7 +1400,7 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 
 		__zs_unmap_object(area, pages, off, class->size);
 	}
-	put_cpu_var(zs_map_area);
+	local_unlock(&zs_map_area.lock);
 
 	migrate_read_unlock(zspage);
 	unpin_tag(handle);
-- 
2.33.0



* Re: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-28  8:44   ` [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage Sebastian Andrzej Siewior
@ 2021-09-28 22:47     ` Andrew Morton
  2021-09-29  2:11       ` Mike Galbraith
  2021-09-29  7:23       ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2021-09-28 22:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Minchan Kim, linux-mm, Thomas Gleixner, Mike Galbraith, Mike Galbraith

On Tue, 28 Sep 2021 10:44:19 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> From: Mike Galbraith <umgwanakikbuti@gmail.com>
> 
> For efficiency reasons, zsmalloc is using a slim `handle'. The value is
> the address of a memory allocation of 4 or 8 bytes depending on the size
> of the long data type. The lowest bit in that allocated memory is used
> as a bit spin lock.
> The usage of the bit spin lock is problematic because with the bit spin
> lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
> sleeping locks on PREEMPT_RT and therefore must not be acquired with
> disabled preemption.
> 
> Extend the handle to struct zsmalloc_handle which holds the old handle as
> addr and a spinlock_t which replaces the bit spinlock. Replace all the
> wrapper functions accordingly.
> 
> The usage of get_cpu_var() in zs_map_object() is problematic because
> it disables preemption and makes it impossible to acquire any sleeping
> lock on PREEMPT_RT such as a spinlock_t.
> Replace the get_cpu_var() usage with a local_lock_t which is embedded in
> struct mapping_area. It ensures that access to the struct is
> synchronized against all users on the same CPU.
> 
> This survived LTP testing.

Rather nasty with all the ifdefs and two different locking approaches
to be tested.  What would be the impact of simply switching to the new
scheme for all configs?

Which is identical to asking "what is the impact of switching to the new
scheme for PREEMPT_RT"!  Which is I think an important thing for the
changelog to address?




* Re: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-28 22:47     ` Andrew Morton
@ 2021-09-29  2:11       ` Mike Galbraith
  2021-09-29  7:23       ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2021-09-29  2:11 UTC (permalink / raw)
  To: Andrew Morton, Sebastian Andrzej Siewior
  Cc: Minchan Kim, linux-mm, Thomas Gleixner

On Tue, 2021-09-28 at 15:47 -0700, Andrew Morton wrote:
> On Tue, 28 Sep 2021 10:44:19 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
>
> > From: Mike Galbraith <umgwanakikbuti@gmail.com>
> >
> > For efficiency reasons, zsmalloc is using a slim `handle'. The value is
> > the address of a memory allocation of 4 or 8 bytes depending on the size
> > of the long data type. The lowest bit in that allocated memory is used
> > as a bit spin lock.
> > The usage of the bit spin lock is problematic because with the bit spin
> > lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
> > sleeping locks on PREEMPT_RT and therefore must not be acquired with
> > disabled preemption.
> >
> > Extend the handle to struct zsmalloc_handle which holds the old handle as
> > addr and a spinlock_t which replaces the bit spinlock. Replace all the
> > wrapper functions accordingly.
> >
> > The usage of get_cpu_var() in zs_map_object() is problematic because
> > it disables preemption and makes it impossible to acquire any sleeping
> > lock on PREEMPT_RT such as a spinlock_t.
> > Replace the get_cpu_var() usage with a local_lock_t which is embedded in
> > struct mapping_area. It ensures that access to the struct is
> > synchronized against all users on the same CPU.
> >
> > This survived LTP testing.
>
> Rather nasty with all the ifdefs and two different locking approaches
> to be tested.  What would be the impact of simply switching to the new
> scheme for all configs?
>
> Which is identical to asking "what is the impact of switching to the new
> scheme for PREEMPT_RT"!  Which is I think an important thing for the
> changelog to address?

Good questions both, for which I have no answers.  The problematic bit
spinlock going away would certainly be preferred if deemed acceptable.
I frankly doubt either it or zram would be missed were they to instead
be disabled for RT configs.  I certainly have no use for it; making it
functional was a simple case of boy meets bug and annoying it to death.

	-Mike



* Re: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-28 22:47     ` Andrew Morton
  2021-09-29  2:11       ` Mike Galbraith
@ 2021-09-29  7:23       ` Sebastian Andrzej Siewior
  2021-09-29 19:09         ` Minchan Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-29  7:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, linux-mm, Thomas Gleixner, Mike Galbraith, Mike Galbraith

On 2021-09-28 15:47:23 [-0700], Andrew Morton wrote:
> Rather nasty with all the ifdefs and two different locking approaches
> to be tested.  What would be the impact of simply switching to the new
> scheme for all configs?

The current scheme uses the lower bit (OBJ_ALLOCATED_TAG) as something
special which is guaranteed to be zero due to memory alignment
requirements. The content of the memory, that long, is then used as a bit
spinlock.

Moving to a spinlock_t would consume only 4 bytes of memory assuming
lockdep is off. That is 4 bytes less than a long on 64-bit archs.
So we could do this if nobody disagrees. The spinlock_t has clear
advantages over a bit spinlock, like the "order" from the qspinlock
implementation. But then I have no idea what the contention here is.
With lockdep enabled the struct gets a little bigger, which I assume was
to be avoided. But then only debug builds are affected so…
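
For concreteness, the sizes on a typical 64-bit configuration
(illustrative numbers; the exact padding depends on the architecture and
debug options):

	sizeof(unsigned long)          ==  8	/* current handle allocation */
	sizeof(spinlock_t)             ==  4	/* without lockdep */
	sizeof(struct zsmalloc_handle) == 16	/* addr + lock, padded to the
						   8-byte alignment of addr */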
 
> Which is identical to asking "what is the impact of switching to the new
> scheme for PREEMPT_RT"!  Which is I think an important thing for the
> changelog to address?

Well, PREEMPT_RT can't work with the bit spinlock in it. That is the
part from the changelog:
| The usage of the bit spin lock is problematic because with the bit spin
| lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
| sleeping locks on PREEMPT_RT and therefore must not be acquired with
| disabled preemption.


Sebastian



* Re: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-29  7:23       ` Sebastian Andrzej Siewior
@ 2021-09-29 19:09         ` Minchan Kim
  2021-09-30  6:42           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2021-09-29 19:09 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Andrew Morton, linux-mm, Thomas Gleixner, Mike Galbraith, Mike Galbraith

On Wed, Sep 29, 2021 at 09:23:59AM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-09-28 15:47:23 [-0700], Andrew Morton wrote:
> > Rather nasty with all the ifdefs and two different locking approaches
> > to be tested.  What would be the impact of simply switching to the new
> > scheme for all configs?
> 
> The current scheme uses the lower bit (OBJ_ALLOCATED_TAG) as something
> special which is guaranteed to be zero due to memory alignment
> requirements. The content of the memory, that long, is then used as a bit
> spinlock.
> 
> Moving to a spinlock_t would consume only 4 bytes of memory assuming
> lockdep is off. That is 4 bytes less than a long on 64-bit archs.
> So we could do this if nobody disagrees. The spinlock_t has clear
> advantages over a bit spinlock, like the "order" from the qspinlock
> implementation. But then I have no idea what the contention here is.
> With lockdep enabled the struct gets a little bigger, which I assume was
> to be avoided. But then only debug builds are affected so…

First of all, thanks for the patch, Sebastian.

zsmalloc is usually used with swap, and the swap size is normally several
GB or more. Thus, adding a per-page spinlock is rather expensive, so I'd
like to consider that approach as a last resort. As for lock contention,
it's rare, so a spinlock wouldn't help much.

Let me try changing the bit lock into a sleepable lock on PREEMPT_RT with
bigger granularity.
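
As a rough illustration of that concern (made-up example figures,
assuming one handle per stored object and page-sized objects before
compression):

	4 GiB of swapped pages / 4 KiB per page  ~= 1 million handles
	1M handles * 8 extra bytes per handle    ~= 8 MiB of additional memory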



* Re: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage.
  2021-09-29 19:09         ` Minchan Kim
@ 2021-09-30  6:42           ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30  6:42 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, Thomas Gleixner, Mike Galbraith, Mike Galbraith

On 2021-09-29 12:09:05 [-0700], Minchan Kim wrote:
> First of all, thanks for the patch, Sebastian.
> 
> zsmalloc is usually used with swap, and the swap size is normally several
> GB or more. Thus, adding a per-page spinlock is rather expensive, so I'd
> like to consider that approach as a last resort. As for lock contention,
> it's rare, so a spinlock wouldn't help much.
> 
> Let me try changing the bit lock into a sleepable lock on PREEMPT_RT with
> bigger granularity.

Okay, thank you. spinlock_t is always four bytes without lockdep, so you
could fit twice as many locks compared to a long on 64bit ;)

Sebastian


