From: Andrew Morton <akpm@linux-foundation.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Aaron Lu <aaron.lu@intel.com>,
	"ying.huang@intel.com" <ying.huang@intel.com>,
	Waiman Long <longman@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	kernel test robot <oliver.sang@intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, kernel test robot <lkp@intel.com>,
	Feng Tang <feng.tang@intel.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>,
	fengwei.yin@intel.com, linux-mm@kvack.org
Subject: Re: [mm/page_alloc] f26b3fa046: netperf.Throughput_Mbps -18.0% regression
Date: Thu, 12 May 2022 11:06:34 -0700
Message-ID: <20220512110634.712057e4663ecc5f697bf185@linux-foundation.org>
In-Reply-To: <CAHk-=whmeWNC-YH_cGRofdW3Spt8Y5nfWpoX=CipQ5pBYgnt2g@mail.gmail.com>

On Thu, 12 May 2022 10:42:09 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, May 12, 2022 at 5:46 AM Aaron Lu <aaron.lu@intel.com> wrote:
> >
> > When nr_process=16, zone lock contention increased by about 21
> > percentage points, from 6% to 27%; performance dropped 17.8%, and
> > overall lock contention increased 14.3%:
> 
> So the contention issue seems real and nasty, and while the queued
> locks may have helped a bit, I don't think they ended up making a
> *huge* change: the queued locks help make sure the lock itself doesn't
> bounce all over the place, but clearly if the lock holder writes close
> to the lock, it will still bounce with at least *one* lock waiter.
> 
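To make the bouncing concrete, a minimal sketch - the struct and field
names below are made up and this is not the real 'struct zone' layout.
The head qspinlock waiter spins on the lock word itself, so every store
the holder makes to a field on the same 64-byte line invalidates the
line the waiter is reading:

/*
 * Illustration only.  In 'hot_shared' the lock word and a field that
 * is written while the lock is held share one cache line, so the line
 * ping-pongs between the holder and the head waiter.  In 'hot_split'
 * the alignment annotations push the written field onto its own line.
 */
#include <linux/cache.h>
#include <linux/spinlock.h>

struct hot_shared {
	spinlock_t	lock;		/* head waiter spins here       */
	unsigned long	nr_free;	/* holder writes here: bouncing */
};

struct hot_split {
	spinlock_t	lock ____cacheline_aligned_in_smp;
	unsigned long	nr_free ____cacheline_aligned_in_smp;
};
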
> And having looked at the qspinlock code, I have to agree with Waiman
> and PeterZ that I don't think the locking code can reasonably be
> changed - I'm sure this particular case could be improved, but the
> downsides for other cases would be large enough to make that a bad
> idea.
> 
> So I think the issue is that
> 
>  (a) that zone lock is too hot.
> 
>  (b) given lock contention, the fields that get written to under the
> lock are too close to the lock
> 
> Now, the optimal fix would of course be to just fix the lock so that
> it isn't so hot. But assuming that's not possible, just looking at the
> definition of that 'struct zone', I do have to say that the
> ZONE_PADDING fields seem to have bit-rotted over the years.
> 
> The whole and only reason for them would be to avoid the cache
> bouncing, but commit 6168d0da2b47 ("mm/lru: replace pgdat lru_lock
> with lruvec lock") actively undid that for the 'lru_lock' case, and
> way back when commit a368ab67aa55 ("mm: move zone lock to a different
> cache line than order-0 free page lists") tried to make it true for
> the 'lock' vs free_area[] case. But it did so without actually using
> ZONE_PADDING: it just moved things around and made 'free_area[]'
> aligned, without really *guaranteeing* that 'lock' ended up on a
> different cacheline - so the lower-order 'free_area[]' entries no
> longer share a cache-line with 'lock', but the higher-order ones
> probably still do.
> 
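For reference, ZONE_PADDING at the time is roughly the zero-size,
internode-cacheline-aligned member below (from include/linux/mmzone.h);
the 'zone_sketch' struct after it is a made-up use of the pattern, not
the actual 'struct zone' field order, just to show what the pads are
meant to buy: the lock word gets a cache line that nothing written
under the lock shares.

#include <linux/cache.h>
#include <linux/spinlock.h>

/* Roughly how include/linux/mmzone.h defines the helper: a zero-size
 * member whose alignment forces the next field onto a new
 * (internode-sized) cache line on SMP builds.
 */
struct zone_padding {
	char x[0];
} ____cacheline_internodealigned_in_smp;
#define ZONE_PADDING(name)	struct zone_padding name;

/* Hypothetical layout, not the real 'struct zone': */
struct zone_sketch {
	/* read-mostly fields ... */

	ZONE_PADDING(_pad1_)
	spinlock_t	lock;		/* alone on its cache line */

	ZONE_PADDING(_pad2_)
	unsigned long	free_area[11];	/* placeholder for the free_area[]
					 * array: written under 'lock' but
					 * no longer on the lock's line */
	/* ... */
};
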
> So I get the feeling that those 'ZONE_PADDING' things are a bit random
> and not really effective.
> 
> In a perfect world, somebody would fix the locking to just not have as
> much contention. But assuming that isn't an option, maybe somebody
> should just look at that 'struct zone' layout a bit more.
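
If someone does want to stare at that layout, two things make it fairly
painless (sketched from memory, exact options may vary): pahole prints
the field offsets of 'struct zone' with cacheline boundaries marked,
given a vmlinux built with debug info, and perf c2c shows which cache
lines actually bounce under the netperf load and at which offsets they
get hit:

  # field offsets of 'struct zone', cacheline boundaries marked
  pahole -C zone vmlinux

  # sample cacheline contention system-wide for 30s, then report the
  # hottest shared lines and the offsets being read/written
  perf c2c record -a -- sleep 30
  perf c2c report --stdio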

(hopefully adds linux-mm to cc)
