linux-kernel.vger.kernel.org archive mirror
* [RFC v1] mm: add the preempt check into alloc_vmap_area()
@ 2018-02-27 10:22 Uladzislau Rezki (Sony)
  2018-02-27 13:06 ` Matthew Wilcox
  0 siblings, 1 reply; 5+ messages in thread
From: Uladzislau Rezki (Sony) @ 2018-02-27 10:22 UTC
  To: linux-mm
  Cc: LKML, Ingo Molnar, Thomas Garnier, Oleksiy Avramchenko,
	Andrew Morton, Kirill A . Shutemov, Steven Rostedt,
	Thomas Gleixner, Uladzislau Rezki (Sony)

While searching for a suitable hole in the vmap_area_list, do an
explicit rescheduling check to reduce latency. This is needed
because some workloads are sensitive to long (more than
1 millisecond) preemption-off sections.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 57 +++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 673942094328..60a57752f8fc 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -325,6 +325,7 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 
 #define VM_LAZY_FREE	0x02
 #define VM_VM_AREA	0x04
+#define VM_LAZY_FREE_DEFER	0x08
 
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
@@ -491,6 +492,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 		if (addr + size < addr)
 			goto overflow;
 
+		/*
+		 * Put this VA on hold so that it cannot be removed
+		 * from the list while the vmap_area_lock is dropped.
+		 * That makes it safe to continue the search once the
+		 * lock is taken again, whether we were scheduled out
+		 * or the spinlock needed a break.
+		 */
+		if (gfpflags_allow_blocking(gfp_mask) &&
+				!(first->flags & VM_LAZY_FREE_DEFER)) {
+			first->flags |= VM_LAZY_FREE_DEFER;
+			cond_resched_lock(&vmap_area_lock);
+			first->flags &= ~VM_LAZY_FREE_DEFER;
+		}
+
 		if (list_is_last(&first->list, &vmap_area_list))
 			goto found;
 
@@ -586,16 +601,6 @@ static void __free_vmap_area(struct vmap_area *va)
 }
 
 /*
- * Free a region of KVA allocated by alloc_vmap_area
- */
-static void free_vmap_area(struct vmap_area *va)
-{
-	spin_lock(&vmap_area_lock);
-	__free_vmap_area(va);
-	spin_unlock(&vmap_area_lock);
-}
-
-/*
  * Clear the pagetable entries of a given vmap_area
  */
 static void unmap_vmap_area(struct vmap_area *va)
@@ -678,6 +683,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	struct vmap_area *va;
 	struct vmap_area *n_va;
 	bool do_free = false;
+	int va_nr_pages;
 
 	lockdep_assert_held(&vmap_purge_lock);
 
@@ -697,10 +703,19 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 
 	spin_lock(&vmap_area_lock);
 	llist_for_each_entry_safe(va, n_va, valist, purge_list) {
-		int nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
+		if (unlikely(va->flags & VM_LAZY_FREE_DEFER)) {
+			/*
+			 * Put the deferred VA back on the vmap_purge_list.
+			 * vmap_lazy_nr does not need to be modified since
+			 * the VA is not freed now.
+			 */
+			llist_add(&va->purge_list, &vmap_purge_list);
+			continue;
+		}
 
+		va_nr_pages = (va->va_end - va->va_start) >> PAGE_SHIFT;
 		__free_vmap_area(va);
-		atomic_sub(nr, &vmap_lazy_nr);
+		atomic_sub(va_nr_pages, &vmap_lazy_nr);
 		cond_resched_lock(&vmap_area_lock);
 	}
 	spin_unlock(&vmap_area_lock);
@@ -750,6 +765,24 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 }
 
 /*
+ * Free a region of KVA allocated by alloc_vmap_area
+ */
+static void free_vmap_area(struct vmap_area *va)
+{
+	bool do_lazy_free = false;
+
+	spin_lock(&vmap_area_lock);
+	if (unlikely(va->flags & VM_LAZY_FREE_DEFER))
+		do_lazy_free = true;
+	else
+		__free_vmap_area(va);
+	spin_unlock(&vmap_area_lock);
+
+	if (unlikely(do_lazy_free))
+		free_vmap_area_noflush(va);
+}
+
+/*
  * Free and unmap a vmap area
  */
 static void free_unmap_vmap_area(struct vmap_area *va)
-- 
2.11.0


* Re: [RFC v1] mm: add the preempt check into alloc_vmap_area()
  2018-02-27 10:22 [RFC v1] mm: add the preempt check into alloc_vmap_area() Uladzislau Rezki (Sony)
@ 2018-02-27 13:06 ` Matthew Wilcox
  2018-02-28 12:40   ` Uladzislau Rezki
  2018-03-02 23:34   ` Andrew Morton
  0 siblings, 2 replies; 5+ messages in thread
From: Matthew Wilcox @ 2018-02-27 13:06 UTC
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, LKML, Ingo Molnar, Thomas Garnier, Oleksiy Avramchenko,
	Andrew Morton, Kirill A . Shutemov, Steven Rostedt,
	Thomas Gleixner

On Tue, Feb 27, 2018 at 11:22:59AM +0100, Uladzislau Rezki (Sony) wrote:
> During finding a suitable hole in the vmap_area_list
> there is an explicit rescheduling check for latency reduction.
> We do it, since there are workloads which are sensitive for
> long (more than 1 millisecond) preemption off scenario.

I understand your problem, but this is a horrid solution.  If it takes
us a millisecond to find a suitable chunk of free address space, something
is terribly wrong.  On a 3GHz CPU, that's 3 million clock ticks!

I think our real problem is that we have no data structure that stores
free VA space.  We have the vmap_area which stores allocated space, but no
data structure to store free space.

My initial proposal would be to reuse the vmap_area structure and store
the freed ones in a second rb_tree sorted by the size (ie va_end - va_start).
When freeing, we might need to merge forwards and backwards.  Allocating
would be a matter of finding an area preferably of the exact right size;
otherwise split a larger free area into a free area and an allocated area
(there's a lot of literature on how exactly to choose which larger area
to split; memory allocators are pretty well-studied).
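
A minimal userspace sketch of the free-space idea described above, not kernel code. It assumes a small array sorted by start address where the proposal uses an rb_tree sorted by size, so the best-fit search here is a linear scan. The names `free_area`, `model_alloc` and `model_free` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one free range of virtual address space. */
struct free_area {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

#define MAX_AREAS 64

/* Free ranges, kept sorted by start address (stand-in for the rb_tree). */
static struct free_area areas[MAX_AREAS];
static size_t nr_areas;

static void remove_area(size_t i)
{
	memmove(&areas[i], &areas[i + 1],
		(nr_areas - i - 1) * sizeof(areas[0]));
	nr_areas--;
}

static void insert_area(size_t i, unsigned long start, unsigned long end)
{
	assert(nr_areas < MAX_AREAS);
	memmove(&areas[i + 1], &areas[i], (nr_areas - i) * sizeof(areas[0]));
	areas[i].start = start;
	areas[i].end = end;
	nr_areas++;
}

/* Best fit: an exact-size area if any, else split the smallest larger one. */
static int model_alloc(unsigned long size, unsigned long *addr)
{
	size_t i, best = nr_areas;

	for (i = 0; i < nr_areas; i++) {
		unsigned long len = areas[i].end - areas[i].start;

		if (len < size)
			continue;
		if (best == nr_areas ||
		    len < areas[best].end - areas[best].start)
			best = i;
	}
	if (best == nr_areas)
		return -1;		/* no free area is large enough */

	*addr = areas[best].start;
	areas[best].start += size;	/* split: the tail stays free */
	if (areas[best].start == areas[best].end)
		remove_area(best);	/* exact fit used the whole area */
	return 0;
}

/* Return [start, end) to the free set, merging with both neighbours. */
static void model_free(unsigned long start, unsigned long end)
{
	size_t i = 0;

	while (i < nr_areas && areas[i].start < start)
		i++;

	/* Merge backwards with the area that ends where we start. */
	if (i > 0 && areas[i - 1].end == start) {
		areas[i - 1].end = end;
		i--;
	} else {
		insert_area(i, start, end);
	}

	/* Merge forwards with the area that starts where we end. */
	if (i + 1 < nr_areas && areas[i].end == areas[i + 1].start) {
		areas[i].end = areas[i + 1].end;
		remove_area(i + 1);
	}
}
```

With a size-sorted tree the allocation lookup would be logarithmic instead of linear. Merging on free still needs the address-ordered neighbours, which is why this sketch keeps the ranges sorted by address.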


* Re: [RFC v1] mm: add the preempt check into alloc_vmap_area()
  2018-02-27 13:06 ` Matthew Wilcox
@ 2018-02-28 12:40   ` Uladzislau Rezki
  2018-03-02 23:34   ` Andrew Morton
  1 sibling, 0 replies; 5+ messages in thread
From: Uladzislau Rezki @ 2018-02-28 12:40 UTC
  To: Matthew Wilcox
  Cc: Uladzislau Rezki (Sony),
	linux-mm, LKML, Ingo Molnar, Thomas Garnier, Oleksiy Avramchenko,
	Andrew Morton, Kirill A . Shutemov, Steven Rostedt,
	Thomas Gleixner

On Tue, Feb 27, 2018 at 05:06:43AM -0800, Matthew Wilcox wrote:
> On Tue, Feb 27, 2018 at 11:22:59AM +0100, Uladzislau Rezki (Sony) wrote:
> > During finding a suitable hole in the vmap_area_list
> > there is an explicit rescheduling check for latency reduction.
> > We do it, since there are workloads which are sensitive for
> > long (more than 1 millisecond) preemption off scenario.
> 
> I understand your problem, but this is a horrid solution.  If it takes
> us a millisecond to find a suitable chunk of free address space, something
> is terribly wrong.  On a 3GHz CPU, that's 3 million clock ticks!
>
Some background: I spent some time analyzing audio drops/glitches that
occur while playing hi-res audio on our mobile device. It is an ARM A53
with 4 CPUs on one socket. In this test case the system is mostly idle,
so it runs at ~576 MHz.

I found that the cause is in vmalloc: finding a suitable chunk of memory
can take a long time, and the search is done in a non-preemptible
context. As a result, the audio thread does not get onto the CPU in time
even though need_resched is set.
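
For scale, a small sketch of the cycle arithmetic behind the two clock rates mentioned in this thread. The helper name `cycles_in_window` is hypothetical; the frequencies and the 1 millisecond window are the ones quoted above.

```c
#include <stdint.h>

/* CPU cycles that elapse during a preemption-off window. */
static uint64_t cycles_in_window(uint64_t freq_hz, uint64_t window_us)
{
	return freq_hz * window_us / 1000000;
}
```

At 3 GHz, `cycles_in_window(3000000000ULL, 1000)` gives 3,000,000 cycles. At ~576 MHz the same millisecond is 576,000 cycles, so the same number of list steps takes roughly five times longer in wall-clock terms.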

> 
> I think our real problem is that we have no data structure that stores
> free VA space.  We have the vmap_area which stores allocated space, but no
> data structure to store free space.
> 
> My initial proposal would be to reuse the vmap_area structure and store
> the freed ones in a second rb_tree sorted by the size (ie va_end - va_start).
> When freeing, we might need to merge forwards and backwards.  Allocating
> would be a matter of finding an area preferably of the exact right size;
> otherwise split a larger free area into a free area and an allocated area
> (there's a lot of literature on how exactly to choose which larger area
> to split; memory allocators are pretty well-studied).
> 
Thank you for your comments and proposal.

--
Vlad Rezki


* Re: [RFC v1] mm: add the preempt check into alloc_vmap_area()
  2018-02-27 13:06 ` Matthew Wilcox
  2018-02-28 12:40   ` Uladzislau Rezki
@ 2018-03-02 23:34   ` Andrew Morton
  2018-03-03 21:18     ` Uladzislau Rezki
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2018-03-02 23:34 UTC
  To: Matthew Wilcox
  Cc: Uladzislau Rezki (Sony),
	linux-mm, LKML, Ingo Molnar, Thomas Garnier, Oleksiy Avramchenko,
	Kirill A . Shutemov, Steven Rostedt, Thomas Gleixner

On Tue, 27 Feb 2018 05:06:43 -0800 Matthew Wilcox <willy@infradead.org> wrote:

> On Tue, Feb 27, 2018 at 11:22:59AM +0100, Uladzislau Rezki (Sony) wrote:
> > During finding a suitable hole in the vmap_area_list
> > there is an explicit rescheduling check for latency reduction.
> > We do it, since there are workloads which are sensitive for
> > long (more than 1 millisecond) preemption off scenario.
> 
> I understand your problem, but this is a horrid solution.  If it takes
> us a millisecond to find a suitable chunk of free address space, something
> is terribly wrong.  On a 3GHz CPU, that's 3 million clock ticks!

Yup.

> I think our real problem is that we have no data structure that stores
> free VA space.  We have the vmap_area which stores allocated space, but no
> data structure to store free space.

I wonder if we can reuse free_vmap_cache as a quick fix: if
need_resched(), point free_vmap_cache at the current rb_node, drop the
lock, cond_resched, goto retry?
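
The control flow suggested above could look roughly like the following userspace model. It is only a sketch under stated assumptions: `search_cache` stands in for free_vmap_cache, `resched_pending` for need_resched(), and `model_lock()`/`model_unlock()` for the vmap_area_lock, all hypothetical names. Nothing modifies the array while the lock is dropped here; a real implementation would have to make sure the cached node stays valid across that window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an allocated range, sorted by start address. */
struct busy_area {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* Stand-in for free_vmap_cache: index at which the next search resumes. */
static size_t search_cache;

/* Stand-in for need_resched(); a caller can set it to force a break. */
static bool resched_pending;

/* Stand-ins for spin_lock()/spin_unlock() on vmap_area_lock. */
static int lock_depth;

static void model_lock(void)
{
	lock_depth++;
}

static void model_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

/*
 * Return the lowest address at or above @vstart that is followed by at
 * least @size free bytes. @nr_breaks counts how often the lock was dropped.
 */
static unsigned long find_hole(const struct busy_area *busy, size_t nr,
			       unsigned long vstart, unsigned long size,
			       unsigned int *nr_breaks)
{
	unsigned long addr;
	size_t i;

retry:
	model_lock();
	i = search_cache;
	addr = i ? busy[i - 1].end : vstart;

	for (; i < nr; i++) {
		if (busy[i].start >= addr && busy[i].start - addr >= size)
			break;			/* hole in front of busy[i] */
		addr = busy[i].end;

		if (resched_pending) {
			search_cache = i + 1;	/* remember our position */
			model_unlock();
			resched_pending = false; /* cond_resched() goes here */
			(*nr_breaks)++;
			goto retry;
		}
	}
	model_unlock();
	return addr;
}
```

Because the retry resumes from the saved position rather than from the beginning, a break costs one lock round trip and the search result is the same as without the break.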


* Re: [RFC v1] mm: add the preempt check into alloc_vmap_area()
  2018-03-02 23:34   ` Andrew Morton
@ 2018-03-03 21:18     ` Uladzislau Rezki
  0 siblings, 0 replies; 5+ messages in thread
From: Uladzislau Rezki @ 2018-03-03 21:18 UTC
  To: Andrew Morton
  Cc: Matthew Wilcox, Uladzislau Rezki (Sony),
	linux-mm, LKML, Ingo Molnar, Thomas Garnier, Oleksiy Avramchenko,
	Kirill A . Shutemov, Steven Rostedt, Thomas Gleixner

On Fri, Mar 02, 2018 at 03:34:52PM -0800, Andrew Morton wrote:
> On Tue, 27 Feb 2018 05:06:43 -0800 Matthew Wilcox <willy@infradead.org> wrote:
> 
> > On Tue, Feb 27, 2018 at 11:22:59AM +0100, Uladzislau Rezki (Sony) wrote:
> > > During finding a suitable hole in the vmap_area_list
> > > there is an explicit rescheduling check for latency reduction.
> > > We do it, since there are workloads which are sensitive for
> > > long (more than 1 millisecond) preemption off scenario.
> > 
> > I understand your problem, but this is a horrid solution.  If it takes
> > us a millisecond to find a suitable chunk of free address space, something
> > is terribly wrong.  On a 3GHz CPU, that's 3 million clock ticks!
> 
> Yup.
> 
> > I think our real problem is that we have no data structure that stores
> > free VA space.  We have the vmap_area which stores allocated space, but no
> > data structure to store free space.
> 
> I wonder if we can reuse free_vmap_cache as a quick fix: if
> need_resched(), point free_vmap_cache at the current rb_node, drop the
> lock, cond_resched, goto retry?
> 
It sounds like we can. My concern is that it could degrade search
time, because it changes the starting point of our search.

--
Vlad Rezki

