Date: Tue, 9 Jan 2007 21:29:08 +0530
From: Srivatsa Vaddagiri
To: Oleg Nesterov
Cc: Andrew Morton, David Howells, Christoph Hellwig, Ingo Molnar,
	Linus Torvalds, linux-kernel@vger.kernel.org, Gautham shenoy
Subject: Re: [PATCH] flush_cpu_workqueue: don't flush an empty ->worklist
Message-ID: <20070109155908.GD22080@in.ibm.com>
Reply-To: vatsa@in.ibm.com
In-Reply-To: <20070109150755.GB89@tv-sign.ru>

On Tue, Jan 09, 2007 at 06:07:55PM +0300, Oleg Nesterov wrote:
> but at some point we should thaw processes, including cwq->thread which
> should die.

I am presuming we will thaw processes after all CPU_DEAD handlers have run.

> So we are doing things like take_over_work() and this is the
> source of races, because the dead CPU is not on cpu_online_map.
>
> flush_workqueue() doesn't use any locks now. If we use freezer to implement
> cpu-hotplug nothing will change, we still have races.

We have races -if- CPU_DEAD handling can run concurrently with an ongoing
flush_workqueue. From my recent understanding of the process freezer, this
is not possible. In other words, flush_workqueue() can be its old
implementation as below w/o any races:

some_thread:

	for_each_online_cpu(i)
		flush_cpu_workqueue(i);

As long as this loop is running, cpu_down/up will not proceed. This means
cpu_online_map is stable even if flush_cpu_workqueue blocks ..

Once this loop is complete and all threads have called try_to_freeze,
cpu_down will proceed to change the bit map and run CPU_DEAD handlers of
everyone. I am presuming we will thaw processes only after all
CPU_DEAD/ONLINE handlers have run (don't know if that is a problem). In
that case do you still see races? Yes, this would require some changes in
worker_thread to check for kthread_should_stop() after try_to_freeze
returns ...

-- 
Regards,
vatsa