From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932131AbWAINex (ORCPT ); Mon, 9 Jan 2006 08:34:53 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751495AbWAINew (ORCPT ); Mon, 9 Jan 2006 08:34:52 -0500
Received: from mail.tv-sign.ru ([213.234.233.51]:9134 "EHLO several.ru")
	by vger.kernel.org with ESMTP id S1751456AbWAINew (ORCPT );
	Mon, 9 Jan 2006 08:34:52 -0500
Message-ID: <43C2785C.4E937748@tv-sign.ru>
Date: Mon, 09 Jan 2006 17:51:08 +0300
From: Oleg Nesterov
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.20 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: vatsa@in.ibm.com
Cc: linux-kernel@vger.kernel.org, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, "Paul E. McKenney", Andrew Morton
Subject: Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
References: <43C165BC.F7C6DCF5@tv-sign.ru> <20060109093141.GA10811@in.ibm.com>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Srivatsa Vaddagiri wrote:
>
> On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > ->donelist becomes != NULL only in rcu_process_callbacks().
> >
> > rcu_process_callbacks() always calls rcu_do_batch() when
> > ->donelist != NULL.
> >
> > rcu_do_batch() schedules rcu_process_callbacks() again if
> > ->donelist was not flushed entirely.
> >
> > So ->donelist != NULL means that rcu_tasklet is either
> > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > check it in __rcu_pending().
>
> Do I smell a bug wrt CPU Hotplug here? Basically, I see that we do
> a rcu_move_batch of ->curlist and ->nxtlist of the dead CPU. Why not
> ->donelist?

After a quick reading of the CPU hotplug code I think you are right,
there is a bug in rcu_offline_cpu(). It should also move ->donelist;
otherwise the callbacks on it are lost.

> If we have to do a rcu_move_batch of ->donelist also,
> then perhaps the ->donelist != NULL check is required in
> rcu_pending?

rcu_move_batch() always adds entries to the ->nxttail, so I think
this patch is correct.

> This is considering that the RCU tasklet of the dead
> CPU is killed (rather than moved over to a different CPU).

Yes, it is killed explicitly in rcu_offline_cpu() via
tasklet_kill_immediate().

Note that we can't remove this tasklet_kill_immediate() and
rely on takeover_tasklets(). rcu_process_callbacks() does
__get_cpu_var(), so it can't find orphaned rcu_data anyway
if the tasklet was moved to another cpu.

So, do we need something like this (untested, uncompiled) patch
or not?

--- 2.6.15/kernel/rcupdate.c~	2006-01-09 20:23:32.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c	2006-01-09 20:26:20.000000000 +0300
@@ -355,8 +355,9 @@ static void __rcu_offline_cpu(struct rcu
 	spin_unlock_bh(&rcp->lock);
 	rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
 	rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
-
+	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
 }
+
 static void rcu_offline_cpu(int cpu)
 {
 	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);

Oleg.
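
For illustration only, the splice behaviour discussed above can be
modelled in userspace. This is a minimal sketch, not the kernel code:
struct cb, struct model_rdp and every model_* function are hypothetical
stand-ins, locking and interrupt disabling are left out, and
model_move_batch() assumes only what is stated above, namely that moved
entries are always added at ->nxttail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued callback (not struct rcu_head). */
struct cb {
	struct cb *next;
};

/* Hypothetical stand-in for the three per-CPU lists named in the thread. */
struct model_rdp {
	struct cb *nxtlist, **nxttail;
	struct cb *curlist, **curtail;
	struct cb *donelist, **donetail;
};

static void model_init(struct model_rdp *rdp)
{
	rdp->nxtlist = NULL;
	rdp->nxttail = &rdp->nxtlist;
	rdp->curlist = NULL;
	rdp->curtail = &rdp->curlist;
	rdp->donelist = NULL;
	rdp->donetail = &rdp->donelist;
}

/* Append one callback to the list whose tail pointer is *tail. */
static void model_enqueue(struct cb ***tail, struct cb *cb)
{
	cb->next = NULL;
	**tail = cb;
	*tail = &cb->next;
}

/* Splice a whole list onto the end of this_rdp's nxtlist. */
static void model_move_batch(struct model_rdp *this_rdp,
			     struct cb *list, struct cb **tail)
{
	assert(this_rdp->nxttail != NULL);
	*this_rdp->nxttail = list;
	if (list)
		this_rdp->nxttail = tail;	/* an empty list leaves the tail alone */
}

static size_t model_count(const struct cb *list)
{
	size_t n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}

/*
 * Model of the offline path. move_done selects whether the third
 * move proposed in the patch is performed.
 */
static void model_offline_cpu(struct model_rdp *this_rdp,
			      struct model_rdp *rdp, int move_done)
{
	model_move_batch(this_rdp, rdp->curlist, rdp->curtail);
	model_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
	if (move_done)
		model_move_batch(this_rdp, rdp->donelist, rdp->donetail);
}
```

With one callback queued on each of the dead CPU's ->curlist, ->nxtlist
and ->donelist, model_offline_cpu(&surv, &dead, 0) leaves two entries on
surv.nxtlist and the ->donelist entry is never reachable from the
survivor. With move_done = 1 all three arrive on surv.nxtlist, and
surv.donelist stays NULL, so in this model the moved entries do not
create a ->donelist that __rcu_pending() would have to check.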