* reduce latency in __purge_vmap_area_lazy V2
@ 2016-11-18 13:03 Christoph Hellwig
2016-11-18 13:03 ` [PATCH 01/10] mm: remove free_unmap_vmap_area_noflush Christoph Hellwig
` (9 more replies)
0 siblings, 10 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
Hi all,
this is my spin at sorting out the long lock hold times in
__purge_vmap_area_lazy. It is based on a patch from Joel.
Changes since V1:
- add vfree_atomic, thanks to Andrey Ryabinin.
* [PATCH 01/10] mm: remove free_unmap_vmap_area_noflush
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 02/10] mm: remove free_unmap_vmap_area_addr Christoph Hellwig
` (8 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
Just inline it into the only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f2481cb..45de736 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -711,22 +711,13 @@ static void free_vmap_area_noflush(struct vmap_area *va)
}
/*
- * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
- * called for the correct range previously.
- */
-static void free_unmap_vmap_area_noflush(struct vmap_area *va)
-{
- unmap_vmap_area(va);
- free_vmap_area_noflush(va);
-}
-
-/*
* Free and unmap a vmap area
*/
static void free_unmap_vmap_area(struct vmap_area *va)
{
flush_cache_vunmap(va->va_start, va->va_end);
- free_unmap_vmap_area_noflush(va);
+ unmap_vmap_area(va);
+ free_vmap_area_noflush(va);
}
static struct vmap_area *find_vmap_area(unsigned long addr)
--
2.1.4
* [PATCH 02/10] mm: remove free_unmap_vmap_area_addr
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
2016-11-18 13:03 ` [PATCH 01/10] mm: remove free_unmap_vmap_area_noflush Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 03/10] mm: refactor __purge_vmap_area_lazy Christoph Hellwig
` (7 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
Just inline it into the only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 45de736..cf1a5ab 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -731,16 +731,6 @@ static struct vmap_area *find_vmap_area(unsigned long addr)
return va;
}
-static void free_unmap_vmap_area_addr(unsigned long addr)
-{
- struct vmap_area *va;
-
- va = find_vmap_area(addr);
- BUG_ON(!va);
- free_unmap_vmap_area(va);
-}
-
-
/*** Per cpu kva allocator ***/
/*
@@ -1098,6 +1088,7 @@ void vm_unmap_ram(const void *mem, unsigned int count)
{
unsigned long size = (unsigned long)count << PAGE_SHIFT;
unsigned long addr = (unsigned long)mem;
+ struct vmap_area *va;
BUG_ON(!addr);
BUG_ON(addr < VMALLOC_START);
@@ -1107,10 +1098,14 @@ void vm_unmap_ram(const void *mem, unsigned int count)
debug_check_no_locks_freed(mem, size);
vmap_debug_free_range(addr, addr+size);
- if (likely(count <= VMAP_MAX_ALLOC))
+ if (likely(count <= VMAP_MAX_ALLOC)) {
vb_free(mem, size);
- else
- free_unmap_vmap_area_addr(addr);
+ return;
+ }
+
+ va = find_vmap_area(addr);
+ BUG_ON(!va);
+ free_unmap_vmap_area(va);
}
EXPORT_SYMBOL(vm_unmap_ram);
--
2.1.4
* [PATCH 03/10] mm: refactor __purge_vmap_area_lazy
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
2016-11-18 13:03 ` [PATCH 01/10] mm: remove free_unmap_vmap_area_noflush Christoph Hellwig
2016-11-18 13:03 ` [PATCH 02/10] mm: remove free_unmap_vmap_area_addr Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 04/10] mm: add vfree_atomic() Christoph Hellwig
` (6 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
Move the purge_lock synchronization to the callers, move the call to
purge_fragmented_blocks_allcpus at the beginning of the function to
the callers that need it, move the force_flush behavior to the caller
that needs it, and pass start and end by value instead of by reference.
No change in behavior.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 80 ++++++++++++++++++++++++++----------------------------------
1 file changed, 35 insertions(+), 45 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cf1a5ab..a4e2cec 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -601,6 +601,13 @@ static unsigned long lazy_max_pages(void)
static atomic_t vmap_lazy_nr = ATOMIC_INIT(0);
+/*
+ * Serialize vmap purging. There is no actual critical section protected
+ * by this lock, but we want to avoid concurrent calls for performance
+ * reasons and to make pcpu_get_vm_areas more deterministic.
+ */
+static DEFINE_SPINLOCK(vmap_purge_lock);
+
/* for per-CPU blocks */
static void purge_fragmented_blocks_allcpus(void);
@@ -615,59 +622,36 @@ void set_iounmap_nonlazy(void)
/*
* Purges all lazily-freed vmap areas.
- *
- * If sync is 0 then don't purge if there is already a purge in progress.
- * If force_flush is 1, then flush kernel TLBs between *start and *end even
- * if we found no lazy vmap areas to unmap (callers can use this to optimise
- * their own TLB flushing).
- * Returns with *start = min(*start, lowest purged address)
- * *end = max(*end, highest purged address)
*/
-static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
- int sync, int force_flush)
+static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
{
- static DEFINE_SPINLOCK(purge_lock);
struct llist_node *valist;
struct vmap_area *va;
struct vmap_area *n_va;
int nr = 0;
- /*
- * If sync is 0 but force_flush is 1, we'll go sync anyway but callers
- * should not expect such behaviour. This just simplifies locking for
- * the case that isn't actually used at the moment anyway.
- */
- if (!sync && !force_flush) {
- if (!spin_trylock(&purge_lock))
- return;
- } else
- spin_lock(&purge_lock);
-
- if (sync)
- purge_fragmented_blocks_allcpus();
+ lockdep_assert_held(&vmap_purge_lock);
valist = llist_del_all(&vmap_purge_list);
llist_for_each_entry(va, valist, purge_list) {
- if (va->va_start < *start)
- *start = va->va_start;
- if (va->va_end > *end)
- *end = va->va_end;
+ if (va->va_start < start)
+ start = va->va_start;
+ if (va->va_end > end)
+ end = va->va_end;
nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
}
- if (nr)
- atomic_sub(nr, &vmap_lazy_nr);
+ if (!nr)
+ return false;
- if (nr || force_flush)
- flush_tlb_kernel_range(*start, *end);
+ atomic_sub(nr, &vmap_lazy_nr);
+ flush_tlb_kernel_range(start, end);
- if (nr) {
- spin_lock(&vmap_area_lock);
- llist_for_each_entry_safe(va, n_va, valist, purge_list)
- __free_vmap_area(va);
- spin_unlock(&vmap_area_lock);
- }
- spin_unlock(&purge_lock);
+ spin_lock(&vmap_area_lock);
+ llist_for_each_entry_safe(va, n_va, valist, purge_list)
+ __free_vmap_area(va);
+ spin_unlock(&vmap_area_lock);
+ return true;
}
/*
@@ -676,9 +660,10 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
*/
static void try_purge_vmap_area_lazy(void)
{
- unsigned long start = ULONG_MAX, end = 0;
-
- __purge_vmap_area_lazy(&start, &end, 0, 0);
+ if (spin_trylock(&vmap_purge_lock)) {
+ __purge_vmap_area_lazy(ULONG_MAX, 0);
+ spin_unlock(&vmap_purge_lock);
+ }
}
/*
@@ -686,9 +671,10 @@ static void try_purge_vmap_area_lazy(void)
*/
static void purge_vmap_area_lazy(void)
{
- unsigned long start = ULONG_MAX, end = 0;
-
- __purge_vmap_area_lazy(&start, &end, 1, 0);
+ spin_lock(&vmap_purge_lock);
+ purge_fragmented_blocks_allcpus();
+ __purge_vmap_area_lazy(ULONG_MAX, 0);
+ spin_unlock(&vmap_purge_lock);
}
/*
@@ -1075,7 +1061,11 @@ void vm_unmap_aliases(void)
rcu_read_unlock();
}
- __purge_vmap_area_lazy(&start, &end, 1, flush);
+ spin_lock(&vmap_purge_lock);
+ purge_fragmented_blocks_allcpus();
+ if (!__purge_vmap_area_lazy(start, end) && flush)
+ flush_tlb_kernel_range(start, end);
+ spin_unlock(&vmap_purge_lock);
}
EXPORT_SYMBOL_GPL(vm_unmap_aliases);
--
2.1.4
* [PATCH 04/10] mm: add vfree_atomic()
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (2 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 03/10] mm: refactor __purge_vmap_area_lazy Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 05/10] kernel/fork: use vfree_atomic() to free thread stack Christoph Hellwig
` (5 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
We are going to use a sleeping lock for freeing vmap areas. However, some
vfree() users want to free memory from atomic (but not from interrupt)
context. For this we add vfree_atomic(), a deferred variant of vfree()
which can be used in any atomic context (except NMIs).
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
include/linux/vmalloc.h | 1 +
mm/vmalloc.c | 36 ++++++++++++++++++++++++++++++------
2 files changed, 31 insertions(+), 6 deletions(-)
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 3d9d786..d68edff 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
const void *caller);
extern void vfree(const void *addr);
+extern void vfree_atomic(const void *addr);
extern void *vmap(struct page **pages, unsigned int count,
unsigned long flags, pgprot_t prot);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a4e2cec..80f3fae 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1486,7 +1486,33 @@ static void __vunmap(const void *addr, int deallocate_pages)
kfree(area);
return;
}
-
+
+static inline void __vfree_deferred(const void *addr)
+{
+ struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
+
+ if (llist_add((struct llist_node *)addr, &p->list))
+ schedule_work(&p->wq);
+}
+
+/**
+ * vfree_atomic - release memory allocated by vmalloc()
+ * @addr: memory base address
+ *
+ * This one is just like vfree() but can be called in any atomic context
+ * except NMIs.
+ */
+void vfree_atomic(const void *addr)
+{
+ BUG_ON(in_nmi());
+
+ kmemleak_free(addr);
+
+ if (!addr)
+ return;
+ __vfree_deferred(addr);
+}
+
/**
* vfree - release memory allocated by vmalloc()
* @addr: memory base address
@@ -1509,11 +1535,9 @@ void vfree(const void *addr)
if (!addr)
return;
- if (unlikely(in_interrupt())) {
- struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
- if (llist_add((struct llist_node *)addr, &p->list))
- schedule_work(&p->wq);
- } else
+ if (unlikely(in_interrupt()))
+ __vfree_deferred(addr);
+ else
__vunmap(addr, 1);
}
EXPORT_SYMBOL(vfree);
--
2.1.4
* [PATCH 05/10] kernel/fork: use vfree_atomic() to free thread stack
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (3 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 04/10] mm: add vfree_atomic() Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 06/10] x86/ldt: use vfree_atomic() to free ldt entries Christoph Hellwig
` (4 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
vfree() is going to use a sleeping lock. The thread stack is freed in
atomic context, therefore we must use vfree_atomic() here.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 997ac1d..cfee5ec 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -229,7 +229,7 @@ static inline void free_thread_stack(struct task_struct *tsk)
}
local_irq_restore(flags);
- vfree(tsk->stack);
+ vfree_atomic(tsk->stack);
return;
}
#endif
--
2.1.4
* [PATCH 06/10] x86/ldt: use vfree_atomic() to free ldt entries
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (4 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 05/10] kernel/fork: use vfree_atomic() to free thread stack Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 07/10] mm: warn about vfree from atomic context Christoph Hellwig
` (3 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
vfree() is going to use a sleeping lock. free_ldt_struct()
may be called with preemption disabled, therefore we must
use vfree_atomic() here.
E.g. call trace:
vfree()
free_ldt_struct()
destroy_context_ldt()
__mmdrop()
finish_task_switch()
schedule_tail()
ret_from_fork()
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
arch/x86/kernel/ldt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 6707039..4d12cdf 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -93,7 +93,7 @@ static void free_ldt_struct(struct ldt_struct *ldt)
paravirt_free_ldt(ldt->entries, ldt->size);
if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
- vfree(ldt->entries);
+ vfree_atomic(ldt->entries);
else
free_page((unsigned long)ldt->entries);
kfree(ldt);
--
2.1.4
* [PATCH 07/10] mm: warn about vfree from atomic context
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (5 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 06/10] x86/ldt: use vfree_atomic() to free ldt entries Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-22 16:35 ` Andrey Ryabinin
2016-11-18 13:03 ` [PATCH 08/10] mm: mark all calls into the vmalloc subsystem as potentially sleeping Christoph Hellwig
` (2 subsequent siblings)
9 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
We can't handle vfree itself from atomic context, but callers
can explicitly use vfree_atomic instead, which defers the actual
vfree to a workqueue. Unfortunately in_atomic does not work
on non-preemptible kernels, so we can't just do the right thing
by default.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
mm/vmalloc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 80f3fae..e2030b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1530,6 +1530,7 @@ void vfree_atomic(const void *addr)
void vfree(const void *addr)
{
BUG_ON(in_nmi());
+ WARN_ON_ONCE(in_atomic());
kmemleak_free(addr);
--
2.1.4
* [PATCH 08/10] mm: mark all calls into the vmalloc subsystem as potentially sleeping
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (6 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 07/10] mm: warn about vfree from atomic context Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 09/10] mm: turn vmap_purge_lock into a mutex Christoph Hellwig
2016-11-18 13:03 ` [PATCH 10/10] mm: add preempt points into __purge_vmap_area_lazy Christoph Hellwig
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
We will take a sleeping lock later in this series, so this adds the
proper safeguards.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e2030b4..25283af 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -365,7 +365,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
BUG_ON(offset_in_page(size));
BUG_ON(!is_power_of_2(align));
- might_sleep_if(gfpflags_allow_blocking(gfp_mask));
+ might_sleep();
va = kmalloc_node(sizeof(struct vmap_area),
gfp_mask & GFP_RECLAIM_MASK, node);
@@ -1037,6 +1037,8 @@ void vm_unmap_aliases(void)
if (unlikely(!vmap_initialized))
return;
+ might_sleep();
+
for_each_possible_cpu(cpu) {
struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, cpu);
struct vmap_block *vb;
@@ -1080,6 +1082,7 @@ void vm_unmap_ram(const void *mem, unsigned int count)
unsigned long addr = (unsigned long)mem;
struct vmap_area *va;
+ might_sleep();
BUG_ON(!addr);
BUG_ON(addr < VMALLOC_START);
BUG_ON(addr > VMALLOC_END);
@@ -1431,6 +1434,8 @@ struct vm_struct *remove_vm_area(const void *addr)
{
struct vmap_area *va;
+ might_sleep();
+
va = find_vmap_area((unsigned long)addr);
if (va && va->flags & VM_VM_AREA) {
struct vm_struct *vm = va->vm;
--
2.1.4
* [PATCH 09/10] mm: turn vmap_purge_lock into a mutex
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (7 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 08/10] mm: mark all calls into the vmalloc subsystem as potentially sleeping Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
2016-11-18 13:03 ` [PATCH 10/10] mm: add preempt points into __purge_vmap_area_lazy Christoph Hellwig
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
The purge_lock spinlock causes high latencies on non-RT kernels. This
has been reported multiple times on lkml [1] [2] and affects
applications like audio.
This patch replaces it with a mutex to allow preemption while holding
the lock.
Thanks to Joel Fernandes for the detailed report and analysis as well
as an earlier attempt at fixing this issue.
[1] http://lists.openwall.net/linux-kernel/2016/03/23/29
[2] https://lkml.org/lkml/2016/10/9/59
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 25283af..dccf242 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -606,7 +606,7 @@ static atomic_t vmap_lazy_nr = ATOMIC_INIT(0);
* by this lock, but we want to avoid concurrent calls for performance
* reasons and to make pcpu_get_vm_areas more deterministic.
*/
-static DEFINE_SPINLOCK(vmap_purge_lock);
+static DEFINE_MUTEX(vmap_purge_lock);
/* for per-CPU blocks */
static void purge_fragmented_blocks_allcpus(void);
@@ -660,9 +660,9 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
*/
static void try_purge_vmap_area_lazy(void)
{
- if (spin_trylock(&vmap_purge_lock)) {
+ if (mutex_trylock(&vmap_purge_lock)) {
__purge_vmap_area_lazy(ULONG_MAX, 0);
- spin_unlock(&vmap_purge_lock);
+ mutex_unlock(&vmap_purge_lock);
}
}
@@ -671,10 +671,10 @@ static void try_purge_vmap_area_lazy(void)
*/
static void purge_vmap_area_lazy(void)
{
- spin_lock(&vmap_purge_lock);
+ mutex_lock(&vmap_purge_lock);
purge_fragmented_blocks_allcpus();
__purge_vmap_area_lazy(ULONG_MAX, 0);
- spin_unlock(&vmap_purge_lock);
+ mutex_unlock(&vmap_purge_lock);
}
/*
@@ -1063,11 +1063,11 @@ void vm_unmap_aliases(void)
rcu_read_unlock();
}
- spin_lock(&vmap_purge_lock);
+ mutex_lock(&vmap_purge_lock);
purge_fragmented_blocks_allcpus();
if (!__purge_vmap_area_lazy(start, end) && flush)
flush_tlb_kernel_range(start, end);
- spin_unlock(&vmap_purge_lock);
+ mutex_unlock(&vmap_purge_lock);
}
EXPORT_SYMBOL_GPL(vm_unmap_aliases);
--
2.1.4
* [PATCH 10/10] mm: add preempt points into __purge_vmap_area_lazy
2016-11-18 13:03 reduce latency in __purge_vmap_area_lazy V2 Christoph Hellwig
` (8 preceding siblings ...)
2016-11-18 13:03 ` [PATCH 09/10] mm: turn vmap_purge_lock into a mutex Christoph Hellwig
@ 2016-11-18 13:03 ` Christoph Hellwig
9 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-18 13:03 UTC (permalink / raw)
To: akpm
Cc: aryabinin, joelaf, jszhang, chris, joaodias, linux-mm,
linux-rt-users, x86, linux-kernel
From: Joel Fernandes <joelaf@google.com>
Use cond_resched_lock to avoid holding the vmap_area_lock for a
potentially long time and thus creating bad latencies for various
workloads.
Signed-off-by: Joel Fernandes <joelaf@google.com>
[hch: split from a larger patch by Joel, wrote the crappy changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
---
mm/vmalloc.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index dccf242..976db21 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -628,7 +628,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
struct llist_node *valist;
struct vmap_area *va;
struct vmap_area *n_va;
- int nr = 0;
+ bool do_free = false;
lockdep_assert_held(&vmap_purge_lock);
@@ -638,18 +638,22 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
start = va->va_start;
if (va->va_end > end)
end = va->va_end;
- nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
+ do_free = true;
}
- if (!nr)
+ if (!do_free)
return false;
- atomic_sub(nr, &vmap_lazy_nr);
flush_tlb_kernel_range(start, end);
spin_lock(&vmap_area_lock);
- llist_for_each_entry_safe(va, n_va, valist, purge_list)
+ llist_for_each_entry_safe(va, n_va, valist, purge_list) {
+ int nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
+
__free_vmap_area(va);
+ atomic_sub(nr, &vmap_lazy_nr);
+ cond_resched_lock(&vmap_area_lock);
+ }
spin_unlock(&vmap_area_lock);
return true;
}
--
2.1.4
* Re: [PATCH 07/10] mm: warn about vfree from atomic context
2016-11-18 13:03 ` [PATCH 07/10] mm: warn about vfree from atomic context Christoph Hellwig
@ 2016-11-22 16:35 ` Andrey Ryabinin
2016-11-23 8:39 ` Christoph Hellwig
0 siblings, 1 reply; 13+ messages in thread
From: Andrey Ryabinin @ 2016-11-22 16:35 UTC (permalink / raw)
To: Christoph Hellwig, akpm
Cc: joelaf, jszhang, chris, joaodias, linux-mm, linux-rt-users, x86,
linux-kernel
On 11/18/2016 04:03 PM, Christoph Hellwig wrote:
> We can't handle vfree itself from atomic context, but callers
> can explicitly use vfree_atomic instead, which defers the actual
> vfree to a workqueue. Unfortunately in_atomic does not work
> on non-preemptible kernels, so we can't just do the right thing
> by default.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> mm/vmalloc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 80f3fae..e2030b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1530,6 +1530,7 @@ void vfree_atomic(const void *addr)
> void vfree(const void *addr)
> {
> BUG_ON(in_nmi());
> + WARN_ON_ONCE(in_atomic());
This one is wrong. We still can call vfree() from interrupt context.
So WARN_ON_ONCE(in_atomic() && !in_interrupt()) would be correct,
but also redundant. DEBUG_ATOMIC_SLEEP=y should catch illegal vfree() calls.
Let's just drop this patch, ok?
> kmemleak_free(addr);
>
>
* Re: [PATCH 07/10] mm: warn about vfree from atomic context
2016-11-22 16:35 ` Andrey Ryabinin
@ 2016-11-23 8:39 ` Christoph Hellwig
0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2016-11-23 8:39 UTC (permalink / raw)
To: Andrey Ryabinin
Cc: Christoph Hellwig, akpm, joelaf, jszhang, chris, joaodias,
linux-mm, linux-rt-users, x86, linux-kernel
On Tue, Nov 22, 2016 at 07:35:34PM +0300, Andrey Ryabinin wrote:
> This one is wrong. We still can call vfree() from interrupt context.
> So WARN_ON_ONCE(in_atomic() && !in_interrupt()) would be correct,
> but also redundant. DEBUG_ATOMIC_SLEEP=y should catch illegal vfree() calls.
> Let's just drop this patch, ok?
Ok, fine.