From: "Paul E. McKenney" <paulmck@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, mingo@kernel.org, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com,
	oleg@redhat.com, linux-mm@kvack.org
Subject: Re: [PATCH tip/core/rcu 02/26] mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls
Date: Tue, 23 Jun 2020 13:55:08 -0700
Message-ID: <20200623205508.GS9247@paulmck-ThinkPad-P72>
In-Reply-To: <20200623193431.GA68372@google.com>

On Tue, Jun 23, 2020 at 03:34:31PM -0400, Joel Fernandes wrote:
> On Mon, Jun 22, 2020 at 05:21:23PM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> > 
> > A large process running on a heavily loaded system can encounter the
> > following RCU CPU stall warning:
> > 
> >   rcu: INFO: rcu_sched self-detected stall on CPU
> >   rcu: \x093-....: (20998 ticks this GP) idle=4ea/1/0x4000000000000002 softirq=556558/556558 fqs=5190
> >   \x09(t=21013 jiffies g=1005461 q=132576)
> >   NMI backtrace for cpu 3
> >   CPU: 3 PID: 501900 Comm: aio-free-ring-w Kdump: loaded Not tainted 5.2.9-108_fbk12_rc3_3858_gb83b75af7909 #1
> >   Hardware name: Wiwynn   HoneyBadger/PantherPlus, BIOS HBM6.71 02/03/2016
> >   Call Trace:
> >    <IRQ>
> >    dump_stack+0x46/0x60
> >    nmi_cpu_backtrace.cold.3+0x13/0x50
> >    ? lapic_can_unplug_cpu.cold.27+0x34/0x34
> >    nmi_trigger_cpumask_backtrace+0xba/0xca
> >    rcu_dump_cpu_stacks+0x99/0xc7
> >    rcu_sched_clock_irq.cold.87+0x1aa/0x397
> >    ? tick_sched_do_timer+0x60/0x60
> >    update_process_times+0x28/0x60
> >    tick_sched_timer+0x37/0x70
> >    __hrtimer_run_queues+0xfe/0x270
> >    hrtimer_interrupt+0xf4/0x210
> >    smp_apic_timer_interrupt+0x5e/0x120
> >    apic_timer_interrupt+0xf/0x20
> >    </IRQ>
> >   RIP: 0010:kmem_cache_free+0x223/0x300
> >   Code: 88 00 00 00 0f 85 ca 00 00 00 41 8b 55 18 31 f6 f7 da 41 f6 45 0a 02 40 0f 94 c6 83 c6 05 9c 41 5e fa e8 a0 a7 01 00 41 56 9d <49> 8b 47 08 a8 03 0f 85 87 00 00 00 65 48 ff 08 e9 3d fe ff ff 65
> >   RSP: 0018:ffffc9000e8e3da8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
> >   RAX: 0000000000020000 RBX: ffff88861b9de960 RCX: 0000000000000030
> >   RDX: fffffffffffe41e8 RSI: 000060777fe3a100 RDI: 000000000001be18
> >   RBP: ffffea00186e7780 R08: ffffffffffffffff R09: ffffffffffffffff
> >   R10: ffff88861b9dea28 R11: ffff88887ffde000 R12: ffffffff81230a1f
> >   R13: ffff888854684dc0 R14: 0000000000000206 R15: ffff8888547dbc00
> >    ? remove_vma+0x4f/0x60
> >    remove_vma+0x4f/0x60
> >    exit_mmap+0xd6/0x160
> >    mmput+0x4a/0x110
> >    do_exit+0x278/0xae0
> >    ? syscall_trace_enter+0x1d3/0x2b0
> >    ? handle_mm_fault+0xaa/0x1c0
> >    do_group_exit+0x3a/0xa0
> >    __x64_sys_exit_group+0x14/0x20
> >    do_syscall_64+0x42/0x100
> >    entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > And on a PREEMPT=n kernel, the "while (vma)" loop in exit_mmap() can run
> > for a very long time given a large process.  This commit therefore adds
> > a cond_resched() to this loop, providing RCU any needed quiescent states.
> > 
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: <linux-mm@kvack.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  mm/mmap.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 59a4682..972f839 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -3159,6 +3159,7 @@ void exit_mmap(struct mm_struct *mm)
> >  		if (vma->vm_flags & VM_ACCOUNT)
> >  			nr_accounted += vma_pages(vma);
> >  		vma = remove_vma(vma);
> > +		cond_resched();
> 
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Thank you!  I will apply this on my next rebase.

> Just for my understanding, cond_resched_tasks_rcu_qs() may not help here
> because preemption is not disabled, right? Still, I see no harm in using it
> here either, as it may give a slight speedup for tasks-RCU.

The RCU-tasks stall-warning interval is ten minutes, and I have not yet
seen evidence that we are getting close to that.  If we do, then yes,
a cond_resched_tasks_rcu_qs() might be in this code's future.  But it
does add overhead, so we need to see the evidence first.

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> >  	}
> >  	vm_unacct_memory(nr_accounted);
> >  }
> > -- 
> > 2.9.5
> > 
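
The shape of the patched loop can be imitated in user space. What follows is a minimal sketch under stated assumptions, not kernel code: struct fake_vma, fake_vma_list(), fake_remove_vma(), and fake_exit_mmap() are hypothetical names invented for illustration, and sched_yield() is only a stand-in for cond_resched(). The stand-in yields unconditionally, whereas cond_resched() reschedules only when a reschedule is actually pending.

```c
#include <assert.h>
#include <sched.h>	/* sched_yield(): user-space stand-in, NOT cond_resched() */
#include <stdlib.h>

/* Hypothetical stand-in for a VMA: a singly linked list node. */
struct fake_vma {
	unsigned long nr_pages;
	struct fake_vma *next;
};

/* Free one node and return its successor, mirroring the shape of remove_vma(). */
static struct fake_vma *fake_remove_vma(struct fake_vma *vma)
{
	struct fake_vma *next = vma->next;

	free(vma);
	return next;
}

/* Build a list of n nodes; returns NULL if n is zero or allocation fails. */
struct fake_vma *fake_vma_list(size_t n, unsigned long pages_each)
{
	struct fake_vma *head = NULL;

	for (size_t i = 0; i < n; i++) {
		struct fake_vma *vma = malloc(sizeof(*vma));

		if (!vma) {
			/* Undo the partial list before reporting failure. */
			while (head)
				head = fake_remove_vma(head);
			return NULL;
		}
		vma->nr_pages = pages_each;
		vma->next = head;
		head = vma;
	}
	return head;
}

/*
 * Tear down the whole list, offering to give up the CPU once per node.
 * Returns the total page count; *yields receives the number of yield points hit.
 */
unsigned long fake_exit_mmap(struct fake_vma *vma, unsigned long *yields)
{
	unsigned long nr_accounted = 0;

	assert(yields != NULL);
	*yields = 0;
	while (vma) {
		nr_accounted += vma->nr_pages;
		vma = fake_remove_vma(vma);
		sched_yield();	/* assumption: stands in for cond_resched() */
		(*yields)++;
	}
	return nr_accounted;
}
```

The point of the sketch is only that the yield point sits inside the loop body, so the number of opportunities to reschedule grows with the length of the list rather than staying fixed per call.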
