linux-kernel.vger.kernel.org archive mirror
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Aaron Tomlin <atomlin@atomlin.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Russell King <linux@armlinux.org.uk>,
	Huacai Chen <chenhuacai@kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	x86@kernel.org, Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely
Date: Wed, 19 Apr 2023 10:48:03 -0300
Message-ID: <ZD/xE6kR4RSOvUlR@tpad>
In-Reply-To: <ZD/dYXJD2xcoWFoQ@localhost.localdomain>

On Wed, Apr 19, 2023 at 02:24:01PM +0200, Frederic Weisbecker wrote:
> On Wed, Apr 19, 2023 at 08:59:28AM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 19, 2023 at 08:29:47AM -0300, Marcelo Tosatti wrote:
> > > On Wed, Apr 19, 2023 at 08:14:09AM -0300, Marcelo Tosatti wrote:
> > > > This was tried before:
> > > > https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/
> > > > 
> > > > My conclusion from that discussion (and work) is that a special system
> > > > call:
> > > > 
> > > > 1) Does not allow the benefits to be widely applied (only modified
> > > > applications will benefit), and is not portable across different
> > > > operating systems.
> > > > 
> > > > Removing the vmstat_work interruption is a benefit for HPC workloads, 
> > > > for example (in fact, it is a benefit for any kind of application, 
> > > > since the interruption causes cache misses).
> > > > 
> > > > 2) Increases the system call cost for applications which would use
> > > > the interface.
> > > > 
> > > > So avoiding the vmstat_update interruption, without userspace
> > > > knowledge and modifications, is a better solution than a modified
> > > > userspace.
> > > 
> > > Another important point is this: if an application dirties
> > > its own per-CPU vmstat cache, while performing a system call,
> > 
> > Or while handling a VM-exit from a vCPU.
> > 
> > These are, in my mind, sufficient reasons to discard the "flush per-cpu
> > caches" idea. This is also why I chose to abandon the prctl interface
> > patchset.
> 
> If you're running your isolated workloads on guests, which sounds quite
> challenging but I guess you guys managed, I'd expect that VMEXITs are
> absolutely out of the question while the task runs critical code, so I'm not
> sure why you would care. I guess not only your guests but also your hosts
> run nohz_full, right?

The answer is: there are VM-exits, for example to write MSRs to program
the LAPIC timer.

Yes, both host and guest are nohz_full (but, for example, cyclictest
or a PLC program can call nanosleep in the guest, which translates to
MSR writes to program the LAPIC timer, each of which causes a VM-exit).
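
To make the nanosleep case concrete, here is a minimal sketch of the kind of
periodic loop a guest application might run. It is not cyclictest's actual
code; the function name periodic_wakeups() and its parameters are made up for
illustration. Each clock_nanosleep() call arms a timer for the next deadline,
which is the step that, in the guest, ends up as an MSR write to program the
LAPIC timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: sleep until `iterations` absolute deadlines spaced
 * `interval_ns` apart. Returns the number of wakeups completed, or -1 on
 * error.
 */
int periodic_wakeups(int iterations, long interval_ns)
{
	struct timespec next;
	int done = 0;

	if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
		return -1;

	for (int i = 0; i < iterations; i++) {
		/* Advance the absolute deadline by one period. */
		next.tv_nsec += interval_ns;
		while (next.tv_nsec >= 1000000000L) {
			next.tv_nsec -= 1000000000L;
			next.tv_sec++;
		}

		/* Each call arms a timer for the next deadline. */
		if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				    &next, NULL) != 0)
			return -1;
		done++;
	}
	return done;
}
```

Calling periodic_wakeups(1000, 1000000L) would wake up roughly once per
millisecond, so even a fully isolated vCPU re-enters the host on every period.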

> I can't tell if the prctl solution which quiesces everything is the solution
> for you, I don't know your workloads well enough, but I would expect that
> the pattern is as follows:
> 
> 1) Arrange for full isolation (no more interrupts/exceptions/VMEXITs)

Yes, this is the general scheme. Full isolation is automated by
tuned (realtime-virtual-host/realtime-virtual-guest profiles).

There are VM-exits in our use-case.
There might be use-cases where interrupts are desired.

For more details:
https://www.youtube.com/watch?v=SyhfctYqjc8

> 2) Run critical code
> 3) Optionally do something once you're done
> 
> If vmstat is going to be the only thing to wait for on 1), then the remote
> solution looks good enough (although I leave that to -mm guys as I'm too
> clueless about those matters), 

I am mostly clueless too, but I don't see a problem with the proposed
patch (and no one has pointed out any problem either).

> if there is more to be expected, I guess the
> quiescing prctl (or whatever syscall) is something to consider.
> 
> Thanks.

I don't know of anything else to consider ATM, and for all cases we have
analyzed so far there has always been the possibility to do the work remotely,
via RCU or some other locking scheme, rather than requiring the application
to be modified (which decreases the number of userspace applications that
can benefit).
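
As a rough userspace illustration of "doing the work remotely" (a sketch
using C11 atomics, not the code from the patchset; all names such as
counter_add(), fold_remote() and NR_CPUS_SKETCH are hypothetical): the owning
CPU only adds to its own pending delta, and any other CPU can drain that
delta with an atomic exchange, so the owner is never interrupted to flush it.

```c
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the example */

/* Per-CPU pending deltas plus one global counter (both hypothetical). */
static atomic_long percpu_delta[NR_CPUS_SKETCH];
static atomic_long global_count;

/* Local update: the owning CPU only touches its own delta. */
void counter_add(int cpu, long v)
{
	atomic_fetch_add(&percpu_delta[cpu], v);
}

/*
 * Remote fold: any CPU may drain another CPU's delta. The exchange takes
 * the pending value and resets it to zero in one atomic step, so a
 * concurrent local update is never lost. Returns the amount folded.
 */
long fold_remote(int cpu)
{
	long d = atomic_exchange(&percpu_delta[cpu], 0);

	if (d)
		atomic_fetch_add(&global_count, d);
	return d;
}

/* Read the global counter; deltas not yet folded are not included. */
long counter_read(void)
{
	return atomic_load(&global_count);
}
```

The cost of this approach is that the local update has to be an atomic
operation that is safe against remote access, in exchange for not needing
any cooperation from the task running on the isolated CPU.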




