From: Dave Hansen <dave.hansen@intel.com>
To: Nadav Amit <namit@vmware.com>
Cc: kernel test robot <oliver.sang@intel.com>,
	Ingo Molnar <mingo@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"lkp@lists.01.org" <lkp@lists.01.org>,
	"lkp@intel.com" <lkp@intel.com>,
	"ying.huang@intel.com" <ying.huang@intel.com>,
	"feng.tang@intel.com" <feng.tang@intel.com>,
	"zhengjun.xing@linux.intel.com" <zhengjun.xing@linux.intel.com>,
	"fengwei.yin@intel.com" <fengwei.yin@intel.com>,
	Andy Lutomirski <luto@kernel.org>
Subject: Re: [x86/mm/tlb] 6035152d8e: will-it-scale.per_thread_ops -13.2% regression
Date: Thu, 17 Mar 2022 17:45:41 -0700	[thread overview]
Message-ID: <d4f62008-faa7-2931-5690-f29f9544b81b@intel.com> (raw)
In-Reply-To: <A185DAD5-3AA7-445B-B57D-AFAF6B55D144@vmware.com>

On 3/17/22 17:20, Nadav Amit wrote:
> I don’t have other data right now. Let me run some measurements later
> tonight. I understand your explanation, but I still do not see how
> much “later” can the lazy check be that it really matters. Just
> strange.

These will-it-scale tests are really brutal.  They're usually sitting in
really tight kernel entry/exit loops.  Everything is pounding on kernel
locks and bouncing cachelines around like crazy.  It might only be a few
thousand cycles between two successive kernel entries.
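To picture it, here is a hypothetical sketch of the kind of per-thread loop
these tests run (the real testcases live in the will-it-scale repo; the names
here are made up): each iteration maps, touches, and unmaps one page, so
munmap() triggers a TLB shootdown and the thread is back in the kernel within
a handful of user instructions:

        #include <sys/mman.h>

        /*
         * Hypothetical will-it-scale-style worker: a tight
         * map/touch/unmap loop that re-enters the kernel on
         * every single iteration.
         */
        static void worker(volatile unsigned long *ops)
        {
                for (;;) {
                        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

                        p[0] = 1;        /* fault the page in */
                        munmap(p, 4096); /* kernel entry; IPIs sibling CPUs */
                        (*ops)++;        /* the benchmark counts iterations */
                }
        }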

Things like the call_single_queue cacheline have to be dragged from
other CPUs *and* there are locks that you can spin on.  While a thread
is doing all this spinning, it is forcing more and more threads into the
lazy TLB state.  The longer you spin, the more threads have entered the
kernel, contended on the mmap_lock and gone idle.
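For reference, the lazy check itself is just a per-CPU flag read; what matters
is *when* it gets sampled, because every CPU that drops into lazy mode while
the sender is still spinning flips its flag. Roughly (paraphrasing the x86
code; exact names vary by kernel version):

        /*
         * Paraphrased from arch/x86/mm/tlb.c: a CPU is skipped for
         * the flush IPI when its per-CPU lazy flag is set.
         */
        static bool tlb_is_not_lazy(int cpu)
        {
                return !per_cpu(cpu_tlbstate_shared.is_lazy, cpu);
        }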

Is it really surprising that a loop that can take hundreds of locks can
take a long time?

                for_each_cpu(cpu, cfd->cpumask) {
                        /* one call_single_data slot per destination CPU */
                        call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);

                        csd_lock(csd);  /* may spin on each iteration */
                        ...
                }
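
And each of those csd_lock() calls can itself spin: if the destination CPU
has not finished processing the previous IPI's csd yet, the sender waits for
the lock flag to clear first. A simplified sketch (paraphrased from
kernel/smp.c; details vary by version):

        /*
         * Simplified from kernel/smp.c: wait for any previous user of
         * this CPU's csd to finish, then claim it.
         */
        static void csd_lock(struct __call_single_data *csd)
        {
                /* spin until the previous IPI's handler drops the flag */
                smp_cond_load_acquire(&csd->node.u_flags,
                                      !(VAL & CSD_FLAG_LOCK));
                csd->node.u_flags |= CSD_FLAG_LOCK;

                /* order the flag store before later writes to the csd */
                smp_wmb();
        }

Multiply that by every CPU in cfd->cpumask and the total time spent in this
loop scales with how backed up the destination CPUs are.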

Thread overview: 11 messages
2022-03-17  9:04 [x86/mm/tlb] 6035152d8e: will-it-scale.per_thread_ops -13.2% regression kernel test robot
2022-03-17 18:38 ` Dave Hansen
2022-03-17 19:02   ` Nadav Amit
2022-03-17 19:11     ` Dave Hansen
2022-03-17 20:32       ` Nadav Amit
2022-03-17 20:49         ` Dave Hansen
2022-03-18  2:56           ` Oliver Sang
2022-03-18  0:16         ` Dave Hansen
2022-03-18  0:20           ` Nadav Amit
2022-03-18  0:45             ` Dave Hansen [this message]
2022-03-18  3:02               ` Nadav Amit
