linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Galbraith <efault@gmx.de>
To: Vlastimil Babka <vbabka@suse.cz>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>
Cc: linux-rt-users@vger.kernel.org,
	Mel Gorman <mgorman@techsingularity.net>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [rfc/patch] mm/slub: restore/expand unfreeze_partials() local exclusion scope
Date: Mon, 26 Jul 2021 19:00:46 +0200	[thread overview]
Message-ID: <e3470ab357b48bccfbd1f5133b982178a7d2befb.camel@gmx.de> (raw)
In-Reply-To: <b41fa4f2-8368-f33a-10c2-68554b16eb1e@suse.cz>

On Sun, 2021-07-25 at 21:34 +0200, Vlastimil Babka wrote:
> On 7/25/21 9:12 PM, Vlastimil Babka wrote:
> > +       /*
> > +        * On !RT we just want to disable preemption, on RT we need
> > the lock
> > +        * for real. This happens to match local_lock() semantics.
> > +        */
> > +       local_lock(&s->cpu_slab->lock);
>
> OK I realized (and tglx confirmed) that this will blow up on !RT +
> lockdep if interrupted by ___slab_alloc() that will do
> local_lock_irqsave(). So back to #ifdefs it is. But should work as-is
> for RT testing.

Speaking of that local_lock_irqsave(), and some unloved ifdefs..

Why not do something like the below?  When I look at new_slab:, I see
cpu_slab->partial assignment protected by IRQs being disabled, which
implies to me it should probably be so protected everywhere.  There
used to be another slub_set_percpu_partial() call in
unfreeze_partials(), which was indeed called with IRQs disabled, quite
sane looking to an mm outsider looking in. The odd man out ->partial
assignment was the preempt disabled put_cpu_partial() cmpxchg loop,
which contained an IRQ disabled region to accommodate the
aforementioned unfreeze_partials().

Is there real world benefit to the cmpxchg loops whacked below (ala
monkey see monkey do) over everyone just taking the straight shot you
laid down for RT?  It's easier on the eye (mine anyway), and neither
PREEMPT or PREEMPT_RT seem inclined to complain... tick... tock...

---
 mm/slub.c |   79 ++++++++++++++++++++++++++++++++------------------------------
 1 file changed, 41 insertions(+), 38 deletions(-)

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2437,13 +2437,12 @@ static void __unfreeze_partials(struct k
 static void unfreeze_partials(struct kmem_cache *s)
 {
 	struct page *partial_page;
+	unsigned long flags;

-	do {
-		partial_page = this_cpu_read(s->cpu_slab->partial);
-
-	} while (partial_page &&
-		 this_cpu_cmpxchg(s->cpu_slab->partial, partial_page, NULL)
-				  != partial_page);
+	local_lock_irqsave(&s->cpu_slab->lock, flags);
+	partial_page = this_cpu_read(s->cpu_slab->partial);
+	this_cpu_write(s->cpu_slab->partial, NULL);
+	local_unlock_irqrestore(&s->cpu_slab->lock, flags);

 	if (partial_page)
 		__unfreeze_partials(s, partial_page);
@@ -2480,41 +2479,45 @@ static void put_cpu_partial(struct kmem_
 {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 	struct page *oldpage;
-	int pages;
-	int pobjects;
-
-	slub_get_cpu_ptr(s->cpu_slab);
-	do {
-		pages = 0;
-		pobjects = 0;
-		oldpage = this_cpu_read(s->cpu_slab->partial);
-
-		if (oldpage) {
-			pobjects = oldpage->pobjects;
-			pages = oldpage->pages;
-			if (drain && pobjects > slub_cpu_partial(s)) {
-				/*
-				 * partial array is full. Move the existing
-				 * set to the per node partial list.
-				 */
-				unfreeze_partials(s);
-				oldpage = NULL;
-				pobjects = 0;
-				pages = 0;
-				stat(s, CPU_PARTIAL_DRAIN);
-			}
+	struct page *page_to_unfreeze = NULL;
+	unsigned long flags;
+	int pages = 0, pobjects = 0;
+
+	local_lock_irqsave(&s->cpu_slab->lock, flags);
+
+	if (oldpage = this_cpu_read(s->cpu_slab->partial)) {
+		pobjects = oldpage->pobjects;
+		pages = oldpage->pages;
+		if (drain && pobjects > slub_cpu_partial(s)) {
+			/*
+			 * partial array is full. Move the existing
+			 * set to the per node partial list.
+			 *
+			 * Postpone unfreezing until we drop the local
+			 * lock to avoid an RT unlock/relock requirement
+			 * due to MEMCG __slab_free() recursion.
+			 */
+			page_to_unfreeze = oldpage;
+
+			oldpage = NULL;
+			pobjects = 0;
+			pages = 0;
+			stat(s, CPU_PARTIAL_DRAIN);
 		}
+	}
+
+	pages++;
+	pobjects += page->objects - page->inuse;
+
+	page->pages = pages;
+	page->pobjects = pobjects;
+	page->next = oldpage;

-		pages++;
-		pobjects += page->objects - page->inuse;
+	this_cpu_write(s->cpu_slab->partial, page);
+	local_unlock_irqrestore(&s->cpu_slab->lock, flags);

-		page->pages = pages;
-		page->pobjects = pobjects;
-		page->next = oldpage;
-
-	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
-								!= oldpage);
-	slub_put_cpu_ptr(s->cpu_slab);
+	if (page_to_unfreeze)
+		__unfreeze_partials(s, page_to_unfreeze);
 #endif	/* CONFIG_SLUB_CPU_PARTIAL */
 }




  parent reply	other threads:[~2021-07-26 17:09 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-08  9:42 [ANNOUNCE] v5.13-rt1 Thomas Gleixner
2021-07-09  5:20 ` [patch] mm/slub: Fix kmem_cache_alloc_bulk() error path Mike Galbraith
2021-07-09  5:20 ` [patch] mm/slub: Replace do_slab_free() local_lock_irqsave/restore() calls in PREEMPT_RT scope Mike Galbraith
2021-07-09  5:21 ` [rfc/patch] mm/slub: restore/expand unfreeze_partials() local exclusion scope Mike Galbraith
2021-07-09  5:25   ` Mike Galbraith
2021-07-09 19:28   ` Thomas Gleixner
2021-07-10  1:12     ` Mike Galbraith
2021-07-15 16:34       ` Mike Galbraith
2021-07-17 14:58         ` [patch] v2 " Mike Galbraith
2021-07-18  7:58           ` Vlastimil Babka
2021-07-18  8:11             ` Mike Galbraith
2021-07-18 15:43           ` Mike Galbraith
2021-07-18 21:19           ` Vlastimil Babka
2021-07-19  4:01             ` Mike Galbraith
2021-07-19 13:15               ` Mike Galbraith
2021-07-20  2:46           ` kernel test robot
2021-07-20  8:56         ` [rfc/patch] " Vlastimil Babka
2021-07-20 11:26           ` Mike Galbraith
2021-07-21  4:56             ` Mike Galbraith
2021-07-21  8:44               ` Vlastimil Babka
2021-07-21  9:33                 ` Mike Galbraith
2021-07-23 22:39                   ` Vlastimil Babka
2021-07-24  2:25                     ` Mike Galbraith
2021-07-25 14:09                     ` Mike Galbraith
2021-07-25 14:16                       ` Vlastimil Babka
2021-07-25 15:02                         ` Mike Galbraith
2021-07-25 16:27                           ` Vlastimil Babka
2021-07-25 19:12                             ` Vlastimil Babka
2021-07-25 19:34                               ` Vlastimil Babka
2021-07-26 10:04                                 ` Mike Galbraith
2021-07-26 17:00                                 ` Mike Galbraith [this message]
2021-07-26 21:26                                   ` Vlastimil Babka
2021-07-27  4:09                                     ` Mike Galbraith
2021-07-28 16:59                                       ` Vlastimil Babka
2021-07-29  4:51                                         ` Mike Galbraith
2021-07-29  9:51                                           ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e3470ab357b48bccfbd1f5133b982178a7d2befb.camel@gmx.de \
    --to=efault@gmx.de \
    --cc=bigeasy@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).