linux-kernel.vger.kernel.org archive mirror
* reduce latency in __purge_vmap_area_lazy
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

Hi all,

this is my spin at sorting out the long lock hold times in
__purge_vmap_area_lazy.  It is based on the patch Joel sent this
week.  I don't have any good numbers for it, but it survived an
xfstests run on XFS, which is a significant vmalloc user.


* [PATCH 1/7] mm: remove free_unmap_vmap_area_noflush
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

Just inline it into the only caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f2481cb..45de736 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -711,22 +711,13 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 }
 
 /*
- * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
- * called for the correct range previously.
- */
-static void free_unmap_vmap_area_noflush(struct vmap_area *va)
-{
-	unmap_vmap_area(va);
-	free_vmap_area_noflush(va);
-}
-
-/*
  * Free and unmap a vmap area
  */
 static void free_unmap_vmap_area(struct vmap_area *va)
 {
 	flush_cache_vunmap(va->va_start, va->va_end);
-	free_unmap_vmap_area_noflush(va);
+	unmap_vmap_area(va);
+	free_vmap_area_noflush(va);
 }
 
 static struct vmap_area *find_vmap_area(unsigned long addr)
-- 
2.1.4


* [PATCH 2/7] mm: remove free_unmap_vmap_area_addr
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

Just inline it into the only caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 45de736..cf1a5ab 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -731,16 +731,6 @@ static struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
-static void free_unmap_vmap_area_addr(unsigned long addr)
-{
-	struct vmap_area *va;
-
-	va = find_vmap_area(addr);
-	BUG_ON(!va);
-	free_unmap_vmap_area(va);
-}
-
-
 /*** Per cpu kva allocator ***/
 
 /*
@@ -1098,6 +1088,7 @@ void vm_unmap_ram(const void *mem, unsigned int count)
 {
 	unsigned long size = (unsigned long)count << PAGE_SHIFT;
 	unsigned long addr = (unsigned long)mem;
+	struct vmap_area *va;
 
 	BUG_ON(!addr);
 	BUG_ON(addr < VMALLOC_START);
@@ -1107,10 +1098,14 @@ void vm_unmap_ram(const void *mem, unsigned int count)
 	debug_check_no_locks_freed(mem, size);
 	vmap_debug_free_range(addr, addr+size);
 
-	if (likely(count <= VMAP_MAX_ALLOC))
+	if (likely(count <= VMAP_MAX_ALLOC)) {
 		vb_free(mem, size);
-	else
-		free_unmap_vmap_area_addr(addr);
+		return;
+	}
+
+	va = find_vmap_area(addr);
+	BUG_ON(!va);
+	free_unmap_vmap_area(va);
 }
 EXPORT_SYMBOL(vm_unmap_ram);
 
-- 
2.1.4


* [PATCH 3/7] mm: refactor __purge_vmap_area_lazy
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

Move the purge_lock synchronization to the callers, move the call to
purge_fragmented_blocks_allcpus at the beginning of the function to
the callers that need it, move the force_flush behavior to the caller
that needs it, and pass start and end by value instead of by reference.

No change in behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 80 ++++++++++++++++++++++++++----------------------------------
 1 file changed, 35 insertions(+), 45 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cf1a5ab..a4e2cec 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -601,6 +601,13 @@ static unsigned long lazy_max_pages(void)
 
 static atomic_t vmap_lazy_nr = ATOMIC_INIT(0);
 
+/*
+ * Serialize vmap purging.  There is no actual critical section protected
+ * by this lock, but we want to avoid concurrent calls for performance
+ * reasons and to make pcpu_get_vm_areas() more deterministic.
+ */
+static DEFINE_SPINLOCK(vmap_purge_lock);
+
 /* for per-CPU blocks */
 static void purge_fragmented_blocks_allcpus(void);
 
@@ -615,59 +622,36 @@ void set_iounmap_nonlazy(void)
 
 /*
  * Purges all lazily-freed vmap areas.
- *
- * If sync is 0 then don't purge if there is already a purge in progress.
- * If force_flush is 1, then flush kernel TLBs between *start and *end even
- * if we found no lazy vmap areas to unmap (callers can use this to optimise
- * their own TLB flushing).
- * Returns with *start = min(*start, lowest purged address)
- *              *end = max(*end, highest purged address)
  */
-static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
-					int sync, int force_flush)
+static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 {
-	static DEFINE_SPINLOCK(purge_lock);
 	struct llist_node *valist;
 	struct vmap_area *va;
 	struct vmap_area *n_va;
 	int nr = 0;
 
-	/*
-	 * If sync is 0 but force_flush is 1, we'll go sync anyway but callers
-	 * should not expect such behaviour. This just simplifies locking for
-	 * the case that isn't actually used at the moment anyway.
-	 */
-	if (!sync && !force_flush) {
-		if (!spin_trylock(&purge_lock))
-			return;
-	} else
-		spin_lock(&purge_lock);
-
-	if (sync)
-		purge_fragmented_blocks_allcpus();
+	lockdep_assert_held(&vmap_purge_lock);
 
 	valist = llist_del_all(&vmap_purge_list);
 	llist_for_each_entry(va, valist, purge_list) {
-		if (va->va_start < *start)
-			*start = va->va_start;
-		if (va->va_end > *end)
-			*end = va->va_end;
+		if (va->va_start < start)
+			start = va->va_start;
+		if (va->va_end > end)
+			end = va->va_end;
 		nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
 	}
 
-	if (nr)
-		atomic_sub(nr, &vmap_lazy_nr);
+	if (!nr)
+		return false;
 
-	if (nr || force_flush)
-		flush_tlb_kernel_range(*start, *end);
+	atomic_sub(nr, &vmap_lazy_nr);
+	flush_tlb_kernel_range(start, end);
 
-	if (nr) {
-		spin_lock(&vmap_area_lock);
-		llist_for_each_entry_safe(va, n_va, valist, purge_list)
-			__free_vmap_area(va);
-		spin_unlock(&vmap_area_lock);
-	}
-	spin_unlock(&purge_lock);
+	spin_lock(&vmap_area_lock);
+	llist_for_each_entry_safe(va, n_va, valist, purge_list)
+		__free_vmap_area(va);
+	spin_unlock(&vmap_area_lock);
+	return true;
 }
 
 /*
@@ -676,9 +660,10 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
  */
 static void try_purge_vmap_area_lazy(void)
 {
-	unsigned long start = ULONG_MAX, end = 0;
-
-	__purge_vmap_area_lazy(&start, &end, 0, 0);
+	if (spin_trylock(&vmap_purge_lock)) {
+		__purge_vmap_area_lazy(ULONG_MAX, 0);
+		spin_unlock(&vmap_purge_lock);
+	}
 }
 
 /*
@@ -686,9 +671,10 @@ static void try_purge_vmap_area_lazy(void)
  */
 static void purge_vmap_area_lazy(void)
 {
-	unsigned long start = ULONG_MAX, end = 0;
-
-	__purge_vmap_area_lazy(&start, &end, 1, 0);
+	spin_lock(&vmap_purge_lock);
+	purge_fragmented_blocks_allcpus();
+	__purge_vmap_area_lazy(ULONG_MAX, 0);
+	spin_unlock(&vmap_purge_lock);
 }
 
 /*
@@ -1075,7 +1061,11 @@ void vm_unmap_aliases(void)
 		rcu_read_unlock();
 	}
 
-	__purge_vmap_area_lazy(&start, &end, 1, flush);
+	spin_lock(&vmap_purge_lock);
+	purge_fragmented_blocks_allcpus();
+	if (!__purge_vmap_area_lazy(start, end) && flush)
+		flush_tlb_kernel_range(start, end);
+	spin_unlock(&vmap_purge_lock);
 }
 EXPORT_SYMBOL_GPL(vm_unmap_aliases);
 
-- 
2.1.4


* [PATCH 4/7] mm: defer vmalloc from atomic context
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

We want to be able to use a sleeping lock for freeing vmap to keep
latency down.  For this we need to use the deferred vfree mechanism
not only from interrupt, but from any atomic context.
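
For reference, the deferred-free machinery this relies on looks roughly
like the sketch below (a simplified rendering of the pre-existing
mm/vmalloc.c code, not part of this patch).  Addresses queued on the
per-CPU llist are freed later from a workqueue, i.e. from process
context where sleeping is fine:

struct vfree_deferred {
	struct llist_head list;	/* pending addresses, lock-free */
	struct work_struct wq;	/* worker doing the real free */
};
static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);

static void free_work(struct work_struct *w)
{
	struct vfree_deferred *p = container_of(w, struct vfree_deferred, wq);
	struct llist_node *llnode = llist_del_all(&p->list);

	while (llnode) {
		void *addr = llnode;

		llnode = llist_next(llnode);
		__vunmap(addr, 1);	/* process context: may sleep */
	}
}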

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 mm/vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a4e2cec..bcc1a64 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1509,7 +1509,7 @@ void vfree(const void *addr)
 
 	if (!addr)
 		return;
-	if (unlikely(in_interrupt())) {
+	if (unlikely(in_atomic())) {
 		struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
 		if (llist_add((struct llist_node *)addr, &p->list))
 			schedule_work(&p->wq);
-- 
2.1.4


* [PATCH 5/7] mm: mark all calls into the vmalloc subsystem as potentially sleeping
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

We will take a sleeping lock later in this series, so this adds the
proper safeguards.
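
For background, might_sleep() is purely a debug assertion.  A conceptual
sketch of the check it performs (simplified; might_sleep_sketch() is not
the kernel's real definition, see include/linux/kernel.h):

/*
 * With CONFIG_DEBUG_ATOMIC_SLEEP, warn when a function that may sleep
 * is entered from a context that cannot, so broken callers show up
 * before these paths actually start sleeping later in the series.
 */
static inline void might_sleep_sketch(void)
{
	if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) &&
	    (in_atomic() || irqs_disabled()))
		WARN(1, "sleeping function called from atomic context\n");
}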

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index bcc1a64..0e7f523 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -365,7 +365,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	BUG_ON(offset_in_page(size));
 	BUG_ON(!is_power_of_2(align));
 
-	might_sleep_if(gfpflags_allow_blocking(gfp_mask));
+	might_sleep();
 
 	va = kmalloc_node(sizeof(struct vmap_area),
 			gfp_mask & GFP_RECLAIM_MASK, node);
@@ -1037,6 +1037,8 @@ void vm_unmap_aliases(void)
 	if (unlikely(!vmap_initialized))
 		return;
 
+	might_sleep();
+
 	for_each_possible_cpu(cpu) {
 		struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, cpu);
 		struct vmap_block *vb;
@@ -1080,6 +1082,7 @@ void vm_unmap_ram(const void *mem, unsigned int count)
 	unsigned long addr = (unsigned long)mem;
 	struct vmap_area *va;
 
+	might_sleep();
 	BUG_ON(!addr);
 	BUG_ON(addr < VMALLOC_START);
 	BUG_ON(addr > VMALLOC_END);
@@ -1431,6 +1434,8 @@ struct vm_struct *remove_vm_area(const void *addr)
 {
 	struct vmap_area *va;
 
+	might_sleep();
+
 	va = find_vmap_area((unsigned long)addr);
 	if (va && va->flags & VM_VM_AREA) {
 		struct vm_struct *vm = va->vm;
-- 
2.1.4


* [PATCH 6/7] mm: turn vmap_purge_lock into a mutex
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

The purge_lock spinlock causes high latencies on non-RT kernels.  This
has been reported multiple times on lkml [1] [2] and affects
latency-sensitive applications such as audio.

This patch replaces it with a mutex to allow preemption while holding
the lock.

Thanks to Joel Fernandes for the detailed report and analysis as well
as an earlier attempt at fixing this issue.

[1] http://lists.openwall.net/linux-kernel/2016/03/23/29
[2] https://lkml.org/lkml/2016/10/9/59
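
Sketched below for illustration only (do_purge() is a hypothetical
stand-in for the purge work, not a real function): while a spinlock is
held, preemption is disabled on that CPU for the entire purge, whereas a
mutex keeps the holder preemptible and puts waiters to sleep.  Note that
a mutex must not be taken from atomic context, which is why patch 4
routed atomic-context frees through the deferred path first.

static DEFINE_SPINLOCK(s_lock);
static DEFINE_MUTEX(m_lock);

static void purge_under_spinlock(void)
{
	spin_lock(&s_lock);	/* preemption off until unlock */
	do_purge();		/* long purge = latency spike */
	spin_unlock(&s_lock);
}

static void purge_under_mutex(void)
{
	mutex_lock(&m_lock);	/* waiters sleep instead of spinning */
	do_purge();		/* holder can be preempted */
	mutex_unlock(&m_lock);
}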

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0e7f523..23d6797 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -606,7 +606,7 @@ static atomic_t vmap_lazy_nr = ATOMIC_INIT(0);
  * by this lock, but we want to avoid concurrent calls for performance
  * reasons and to make pcpu_get_vm_areas() more deterministic.
  */
-static DEFINE_SPINLOCK(vmap_purge_lock);
+static DEFINE_MUTEX(vmap_purge_lock);
 
 /* for per-CPU blocks */
 static void purge_fragmented_blocks_allcpus(void);
@@ -660,9 +660,9 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
  */
 static void try_purge_vmap_area_lazy(void)
 {
-	if (spin_trylock(&vmap_purge_lock)) {
+	if (mutex_trylock(&vmap_purge_lock)) {
 		__purge_vmap_area_lazy(ULONG_MAX, 0);
-		spin_unlock(&vmap_purge_lock);
+		mutex_unlock(&vmap_purge_lock);
 	}
 }
 
@@ -671,10 +671,10 @@ static void try_purge_vmap_area_lazy(void)
  */
 static void purge_vmap_area_lazy(void)
 {
-	spin_lock(&vmap_purge_lock);
+	mutex_lock(&vmap_purge_lock);
 	purge_fragmented_blocks_allcpus();
 	__purge_vmap_area_lazy(ULONG_MAX, 0);
-	spin_unlock(&vmap_purge_lock);
+	mutex_unlock(&vmap_purge_lock);
 }
 
 /*
@@ -1063,11 +1063,11 @@ void vm_unmap_aliases(void)
 		rcu_read_unlock();
 	}
 
-	spin_lock(&vmap_purge_lock);
+	mutex_lock(&vmap_purge_lock);
 	purge_fragmented_blocks_allcpus();
 	if (!__purge_vmap_area_lazy(start, end) && flush)
 		flush_tlb_kernel_range(start, end);
-	spin_unlock(&vmap_purge_lock);
+	mutex_unlock(&vmap_purge_lock);
 }
 EXPORT_SYMBOL_GPL(vm_unmap_aliases);
 
-- 
2.1.4


* [PATCH 7/7] mm: add preempt points into __purge_vmap_area_lazy
From: Christoph Hellwig @ 2016-10-22 15:17 UTC
  To: akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel

From: Joel Fernandes <joelaf@google.com>

Use cond_resched_lock() to avoid holding the vmap_area_lock for a
potentially long time and thus creating bad latencies for various
workloads.
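
cond_resched_lock(&lock) drops the spinlock, schedules, and retakes it,
but only when a reschedule is due or the lock is contended.  Dropping
vmap_area_lock mid-walk is safe here because the list being walked was
detached with llist_del_all() and is private to this caller; moving the
vmap_lazy_nr accounting into the loop keeps the counter accurate across
the preemption points.  A sketch of the general pattern (struct item and
process_one() are hypothetical):

static void drain_list_preemptibly(struct list_head *head, spinlock_t *lock)
{
	struct item *it, *tmp;

	spin_lock(lock);
	list_for_each_entry_safe(it, tmp, head, node) {
		process_one(it);	   /* bounded work per element */
		cond_resched_lock(lock);   /* may drop lock, schedule, retake */
	}
	spin_unlock(lock);
}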

Signed-off-by: Joel Fernandes <joelaf@google.com>
[hch: split from a larger patch by Joel, wrote the crappy changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
 mm/vmalloc.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 23d6797..6c8b921 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -628,7 +628,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	struct llist_node *valist;
 	struct vmap_area *va;
 	struct vmap_area *n_va;
-	int nr = 0;
+	bool do_free = false;
 
 	lockdep_assert_held(&vmap_purge_lock);
 
@@ -638,18 +638,22 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 			start = va->va_start;
 		if (va->va_end > end)
 			end = va->va_end;
-		nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
+		do_free = true;
 	}
 
-	if (!nr)
+	if (!do_free)
 		return false;
 
-	atomic_sub(nr, &vmap_lazy_nr);
 	flush_tlb_kernel_range(start, end);
 
 	spin_lock(&vmap_area_lock);
-	llist_for_each_entry_safe(va, n_va, valist, purge_list)
+	llist_for_each_entry_safe(va, n_va, valist, purge_list) {
+		int nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
+
 		__free_vmap_area(va);
+		atomic_sub(nr, &vmap_lazy_nr);
+		cond_resched_lock(&vmap_area_lock);
+	}
 	spin_unlock(&vmap_area_lock);
 	return true;
 }
-- 
2.1.4


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Andrey Ryabinin @ 2016-10-24 15:44 UTC
  To: Christoph Hellwig, akpm
  Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, linux-kernel



On 10/22/2016 06:17 PM, Christoph Hellwig wrote:
> We want to be able to use a sleeping lock for freeing vmap to keep
> latency down.  For this we need to use the deferred vfree mechanism
> not only from interrupt, but from any atomic context.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  mm/vmalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a4e2cec..bcc1a64 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1509,7 +1509,7 @@ void vfree(const void *addr)
>  
>  	if (!addr)
>  		return;
> -	if (unlikely(in_interrupt())) {
> +	if (unlikely(in_atomic())) {

in_atomic() cannot always detect atomic context, thus it shouldn't be used here.
You can add something like vfree_in_atomic() and use it in atomic call sites.
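
For context, in_atomic() merely inspects the preempt counter, roughly
(a simplified sketch; in_atomic_sketch is not a real kernel macro):

#define in_atomic_sketch()	(preempt_count() != 0)
/*
 * With CONFIG_PREEMPT=n (and no debug options) CONFIG_PREEMPT_COUNT is
 * off, so spin_lock()/preempt_disable() never touch preempt_count():
 * code holding a spinlock still sees in_atomic() == 0, and the
 * deferral above would be skipped.
 */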

>  		struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
>  		if (llist_add((struct llist_node *)addr, &p->list))
>  			schedule_work(&p->wq);
> 


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Joel Fernandes @ 2016-11-05  3:43 UTC
  To: Andrey Ryabinin
  Cc: Christoph Hellwig, Andrew Morton, Jisheng Zhang, Chris Wilson,
	John Dias, open list:MEMORY MANAGEMENT, linux-rt-users, LKML

On Mon, Oct 24, 2016 at 8:44 AM, Andrey Ryabinin
<aryabinin@virtuozzo.com> wrote:
>
>
> On 10/22/2016 06:17 PM, Christoph Hellwig wrote:
>> We want to be able to use a sleeping lock for freeing vmap to keep
>> latency down.  For this we need to use the deferred vfree mechanism
>> not only from interrupt, but from any atomic context.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  mm/vmalloc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index a4e2cec..bcc1a64 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -1509,7 +1509,7 @@ void vfree(const void *addr)
>>
>>       if (!addr)
>>               return;
>> -     if (unlikely(in_interrupt())) {
>> +     if (unlikely(in_atomic())) {
>
> in_atomic() cannot always detect atomic context, thus it shouldn't be used here.
> You can add something like vfree_in_atomic() and use it in atomic call sites.
>

So because in_atomic doesn't work for !CONFIG_PREEMPT kernels, can we
always defer the work in these cases?

So for non-preemptible kernels, we always defer:

if (!IS_ENABLED(CONFIG_PREEMPT) || in_atomic()) {
  // defer
}

Is this fine? Or any other ideas?

Thanks,
Joel


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Andrey Ryabinin @ 2016-11-07 15:01 UTC
  To: Joel Fernandes
  Cc: Christoph Hellwig, Andrew Morton, Jisheng Zhang, Chris Wilson,
	John Dias, open list:MEMORY MANAGEMENT, linux-rt-users, LKML,
	Andy Lutomirski

On 11/05/2016 06:43 AM, Joel Fernandes wrote:
> On Mon, Oct 24, 2016 at 8:44 AM, Andrey Ryabinin
> <aryabinin@virtuozzo.com> wrote:
>>
>>
>> On 10/22/2016 06:17 PM, Christoph Hellwig wrote:
>>> We want to be able to use a sleeping lock for freeing vmap to keep
>>> latency down.  For this we need to use the deferred vfree mechanism
>>> not only from interrupt, but from any atomic context.
>>>
>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>> ---
>>>  mm/vmalloc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index a4e2cec..bcc1a64 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -1509,7 +1509,7 @@ void vfree(const void *addr)
>>>
>>>       if (!addr)
>>>               return;
>>> -     if (unlikely(in_interrupt())) {
>>> +     if (unlikely(in_atomic())) {
>>
>> in_atomic() cannot always detect atomic context, thus it shouldn't be used here.
>> You can add something like vfree_in_atomic() and use it in atomic call sites.
>>
> 
> So because in_atomic doesn't work for !CONFIG_PREEMPT kernels, can we
> always defer the work in these cases?
> 
> So for non-preemptible kernels, we always defer:
> 
> if (!IS_ENABLED(CONFIG_PREEMPT) || in_atomic()) {
>   // defer
> }
> 
> Is this fine? Or any other ideas?
> 

What's wrong with my idea?
We can add vfree_in_atomic() and use it to free vmapped stacks
and in any other places where vfree() is used in 'in_atomic() && !in_interrupt()' context.


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Christoph Hellwig @ 2016-11-07 15:09 UTC
  To: Andrey Ryabinin
  Cc: Joel Fernandes, Christoph Hellwig, Andrew Morton, Jisheng Zhang,
	Chris Wilson, John Dias, open list:MEMORY MANAGEMENT,
	linux-rt-users, LKML, Andy Lutomirski

On Mon, Nov 07, 2016 at 06:01:45PM +0300, Andrey Ryabinin wrote:
> > So because in_atomic doesn't work for !CONFIG_PREEMPT kernels, can we
> > always defer the work in these cases?
> > 
> > So for non-preemptible kernels, we always defer:
> > 
> > if (!IS_ENABLED(CONFIG_PREEMPT) || in_atomic()) {
> >   // defer
> > }
> > 
> > Is this fine? Or any other ideas?
> > 
> 
> What's wrong with my idea?
> We can add vfree_in_atomic() and use it to free vmapped stacks
> and in any other places where vfree() is used in 'in_atomic() && !in_interrupt()' context.

I somehow missed the mail, sorry.  That being said, always deferring is
going to suck badly in terms of performance, so I'm not sure it's all
that good an idea.

vfree_in_atomic sounds good, but I wonder if we'll need to annotate
more callers than just the stacks.  I'm fairly busy this week, do you
want to give that a spin?  Otherwise I'll give it a try towards the
end of this week or next week.


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Joel Fernandes @ 2016-11-07 16:50 UTC
  To: Andrey Ryabinin
  Cc: Christoph Hellwig, Andrew Morton, Jisheng Zhang, Chris Wilson,
	John Dias, open list:MEMORY MANAGEMENT, linux-rt-users, LKML,
	Andy Lutomirski

On Mon, Nov 7, 2016 at 7:01 AM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
> On 11/05/2016 06:43 AM, Joel Fernandes wrote:
>> On Mon, Oct 24, 2016 at 8:44 AM, Andrey Ryabinin
>> <aryabinin@virtuozzo.com> wrote:
>>>
>>>
>>> On 10/22/2016 06:17 PM, Christoph Hellwig wrote:
>>>> We want to be able to use a sleeping lock for freeing vmap to keep
>>>> latency down.  For this we need to use the deferred vfree mechanism
>>>> not only from interrupt, but from any atomic context.
>>>>
>>>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>>>> ---
>>>>  mm/vmalloc.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>> index a4e2cec..bcc1a64 100644
>>>> --- a/mm/vmalloc.c
>>>> +++ b/mm/vmalloc.c
>>>> @@ -1509,7 +1509,7 @@ void vfree(const void *addr)
>>>>
>>>>       if (!addr)
>>>>               return;
>>>> -     if (unlikely(in_interrupt())) {
>>>> +     if (unlikely(in_atomic())) {
>>>
>>> in_atomic() cannot always detect atomic context, thus it shouldn't be used here.
>>> You can add something like vfree_in_atomic() and use it in atomic call sites.
>>>
>>
>> So because in_atomic doesn't work for !CONFIG_PREEMPT kernels, can we
>> always defer the work in these cases?
>>
>> So for non-preemptible kernels, we always defer:
>>
>> if (!IS_ENABLED(CONFIG_PREEMPT) || in_atomic()) {
>>   // defer
>> }
>>
>> Is this fine? Or any other ideas?
>>
>
> What's wrong with my idea?
> We can add vfree_in_atomic() and use it to free vmapped stacks
> and in any other places where vfree() is used in 'in_atomic() && !in_interrupt()' context.

Yes, this sounds like a better idea as there may not be that many
callers and my idea may hurt perf.

Thanks,

Joel


* Re: [PATCH 4/7] mm: defer vmalloc from atomic context
From: Andrey Ryabinin @ 2016-11-08 15:03 UTC
  To: Christoph Hellwig
  Cc: Joel Fernandes, Andrew Morton, Jisheng Zhang, Chris Wilson,
	John Dias, open list:MEMORY MANAGEMENT, linux-rt-users, LKML,
	Andy Lutomirski



On 11/07/2016 06:09 PM, Christoph Hellwig wrote:
> On Mon, Nov 07, 2016 at 06:01:45PM +0300, Andrey Ryabinin wrote:
>>> So because in_atomic doesn't work for !CONFIG_PREEMPT kernels, can we
>>> always defer the work in these cases?
>>>
>>> So for non-preemptible kernels, we always defer:
>>>
>>> if (!IS_ENABLED(CONFIG_PREEMPT) || in_atomic()) {
>>>   // defer
>>> }
>>>
>>> Is this fine? Or any other ideas?
>>>
>>
>> What's wrong with my idea?
>> We can add vfree_in_atomic() and use it to free vmapped stacks
>> and in any other places where vfree() is used in 'in_atomic() && !in_interrupt()' context.
> 
> I somehow missed the mail, sorry.  That being said, always deferring is
> going to suck badly in terms of performance, so I'm not sure it's all
> that good an idea.
> 
> vfree_in_atomic sounds good, but I wonder if we'll need to annotate
> more callers than just the stacks.  I'm fairly busy this week, do you
> want to give that a spin?  Otherwise I'll give it a try towards the
> end of this week or next week.
> 

Yeah, it appears that we need more annotations. I've found another case in free_ldt_struct(),
and I bet it won't be the last.
I'll send patches.


* [PATCH 1/3] mm/vmalloc: add vfree_atomic()
From: Andrey Ryabinin @ 2016-11-08 15:05 UTC
  To: Christoph Hellwig
  Cc: Andrew Morton, linux-mm, linux-rt-users, linux-kernel,
	Andrey Ryabinin, Andy Lutomirski, Joel Fernandes, Jisheng Zhang,
	Chris Wilson, John Dias, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86

We are going to use a sleeping lock for freeing vmap.  However, some
vfree() users want to free memory from atomic (but not from interrupt)
context.  For this we add vfree_atomic(), a deferred variant of vfree()
which can be used in any atomic context (except NMIs).

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 include/linux/vmalloc.h |  1 +
 mm/vmalloc.c            | 36 ++++++++++++++++++++++++++++++------
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 3d9d786..d68edff 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			const void *caller);
 
 extern void vfree(const void *addr);
+extern void vfree_atomic(const void *addr);
 
 extern void *vmap(struct page **pages, unsigned int count,
 			unsigned long flags, pgprot_t prot);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 719ced3..b0edc67 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1471,7 +1471,33 @@ static void __vunmap(const void *addr, int deallocate_pages)
 	kfree(area);
 	return;
 }
- 
+
+static inline void __vfree_deferred(const void *addr)
+{
+	struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
+
+	if (llist_add((struct llist_node *)addr, &p->list))
+		schedule_work(&p->wq);
+}
+
+/**
+ *	vfree_atomic  -  release memory allocated by vmalloc()
+ *	@addr:		memory base address
+ *
+ *	This one is just like vfree() but can be called in any atomic context
+ *	except NMIs.
+ */
+void vfree_atomic(const void *addr)
+{
+	BUG_ON(in_nmi());
+
+	kmemleak_free(addr);
+
+	if (!addr)
+		return;
+	__vfree_deferred(addr);
+}
+
 /**
  *	vfree  -  release memory allocated by vmalloc()
  *	@addr:		memory base address
@@ -1494,11 +1520,9 @@ void vfree(const void *addr)
 
 	if (!addr)
 		return;
-	if (unlikely(in_interrupt())) {
-		struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
-		if (llist_add((struct llist_node *)addr, &p->list))
-			schedule_work(&p->wq);
-	} else
+	if (unlikely(in_interrupt()))
+		__vfree_deferred(addr);
+	else
 		__vunmap(addr, 1);
 }
 EXPORT_SYMBOL(vfree);
-- 
2.7.3


* [PATCH 2/3] kernel/fork: use vfree_atomic() to free thread stack
From: Andrey Ryabinin @ 2016-11-08 15:05 UTC
  To: Christoph Hellwig
  Cc: Andrew Morton, linux-mm, linux-rt-users, linux-kernel,
	Andrey Ryabinin, Andy Lutomirski, Joel Fernandes, Jisheng Zhang,
	Chris Wilson, John Dias, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86

vfree() is going to use a sleeping lock.  The thread stack can be freed
from atomic context, therefore we must use vfree_atomic() here.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index fd85c68..417e94f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -229,7 +229,7 @@ static inline void free_thread_stack(struct task_struct *tsk)
 		}
 		local_irq_restore(flags);
 
-		vfree(tsk->stack);
+		vfree_atomic(tsk->stack);
 		return;
 	}
 #endif
-- 
2.7.3


* [PATCH 3/3] x86/ldt: use vfree_atomic() to free ldt entries
From: Andrey Ryabinin @ 2016-11-08 15:05 UTC
  To: Christoph Hellwig
  Cc: Andrew Morton, linux-mm, linux-rt-users, linux-kernel,
	Andrey Ryabinin, Andy Lutomirski, Joel Fernandes, Jisheng Zhang,
	Chris Wilson, John Dias, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86

vfree() is going to use a sleeping lock.  free_ldt_struct()
may be called with preemption disabled, therefore we must
use vfree_atomic() here.

E.g. call trace:
	vfree()
	free_ldt_struct()
	destroy_context_ldt()
	__mmdrop()
	finish_task_switch()
	schedule_tail()
	ret_from_fork()

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
---
 arch/x86/kernel/ldt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 6707039..4d12cdf 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -93,7 +93,7 @@ static void free_ldt_struct(struct ldt_struct *ldt)
 
 	paravirt_free_ldt(ldt->entries, ldt->size);
 	if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
-		vfree(ldt->entries);
+		vfree_atomic(ldt->entries);
 	else
 		free_page((unsigned long)ldt->entries);
 	kfree(ldt);
-- 
2.7.3


Thread overview: 17+ messages
2016-10-22 15:17 reduce latency in __purge_vmap_area_lazy Christoph Hellwig
2016-10-22 15:17 ` [PATCH 1/7] mm: remove free_unmap_vmap_area_noflush Christoph Hellwig
2016-10-22 15:17 ` [PATCH 2/7] mm: remove free_unmap_vmap_area_addr Christoph Hellwig
2016-10-22 15:17 ` [PATCH 3/7] mm: refactor __purge_vmap_area_lazy Christoph Hellwig
2016-10-22 15:17 ` [PATCH 4/7] mm: defer vmalloc from atomic context Christoph Hellwig
2016-10-24 15:44   ` Andrey Ryabinin
2016-11-05  3:43     ` Joel Fernandes
2016-11-07 15:01       ` Andrey Ryabinin
2016-11-07 15:09         ` Christoph Hellwig
2016-11-08 15:03           ` Andrey Ryabinin
2016-11-08 15:05           ` [PATCH 1/3] mm/vmalloc: add vfree_atomic() Andrey Ryabinin
2016-11-08 15:05             ` [PATCH 2/3] kernel/fork: use vfree_atomic() to free thread stack Andrey Ryabinin
2016-11-08 15:05             ` [PATCH 3/3] x86/ldt: use vfree_atomic() to free ldt entries Andrey Ryabinin
2016-11-07 16:50         ` [PATCH 4/7] mm: defer vmalloc from atomic context Joel Fernandes
2016-10-22 15:17 ` [PATCH 5/7] mm: mark all calls into the vmalloc subsystem as potentially sleeping Christoph Hellwig
2016-10-22 15:17 ` [PATCH 6/7] mm: turn vmap_purge_lock into a mutex Christoph Hellwig
2016-10-22 15:17 ` [PATCH 7/7] mm: add preempt points into __purge_vmap_area_lazy Christoph Hellwig
