From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751711AbbAVLQH (ORCPT ); Thu, 22 Jan 2015 06:16:07 -0500
Received: from www.linutronix.de ([62.245.132.108]:59176 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751553AbbAVLP6 (ORCPT );
	Thu, 22 Jan 2015 06:15:58 -0500
Date: Thu, 22 Jan 2015 12:15:36 +0100 (CET)
From: Thomas Gleixner
To: Preeti U Murthy
cc: aik@ozlabs.ru, shreyas@linux.vnet.ibm.com, LKML,
	michael@ellerman.id.au, Anton Blanchard, svaidy@linux.vnet.ibm.com,
	linuxppc-dev@lists.ozlabs.org, Peter Zijlstra
Subject: Re: [PATCH V3] tick/broadcast: Make movement of broadcast hrtimer
 robust against hotplug
In-Reply-To: <54C09391.9080202@linux.vnet.ibm.com>
Message-ID:
References: <20150120103559.8430.50933.stgit@preeti.in.ibm.com> <54C09391.9080202@linux.vnet.ibm.com>
User-Agent: Alpine 2.11 (DEB 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Linutronix-Spam-Score: -1.0
X-Linutronix-Spam-Level: -
X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 22 Jan 2015, Preeti U Murthy wrote:
> On 01/21/2015 05:16 PM, Thomas Gleixner wrote:
> How about when the cpu that is going offline receives a timer interrupt
> just before setting its state to CPU_DEAD ? That is still possible right
> given that its clock devices may not have been shutdown and it is
> capable of receiving interrupts for a short duration. Even with the
> above patch, is the following scenario possible ?
>
>        CPU0                                    CPU1
> t0     Receives timer interrupt
>
> t1     Sees that there are hrtimers
>        to be serviced (hrtimers are not yet migrated)
>
> t2     calls hrtimer_interrupt()
>
> t3     tick_program_event()                    CPU_DEAD notifiers
>                                                CPU0's td->evtdev = NULL
>
> t4     clockevent_program_event()
>        references NULL tick device pointer
>
> So my concern is that since the CLOCK_EVT_NOTIFY_CPU_DEAD callback
> handles shutting down of devices besides moving tick related duties,
> its functions may race with the hotplug cpu still handling tick events.

__cpu_disable() is supposed to block interrupts on the dying cpu. But
I agree, we should make it more robust.

So we want an explicit call for disabling the cpu local stuff and an
explicit takeover of the broadcast duty. I'm anyway disentangling the
clockevents_notify() stuff, so it should be simple to do so.

Thanks,

	tglx
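
The ordering proposed above (an explicit shutdown of the cpu local tick
device on the dying cpu, followed by an explicit takeover of the
broadcast duty by a surviving cpu) can be sketched as a small userspace
C model. This is only an illustration under stated assumptions: every
name below (model_shutdown_local(), model_broadcast_takeover(),
struct tick_dev and so on) is hypothetical and is neither the kernel's
actual API nor the eventual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for the per-cpu clock event and tick devices. */
struct clock_dev { bool shutdown; };
struct tick_dev  { struct clock_dev *evtdev; };

struct clock_dev devs[NR_CPUS];
struct tick_dev  tick[NR_CPUS];
int broadcast_owner = -1;	/* cpu that currently carries the broadcast duty */

/* Give every cpu a live tick device and assign the broadcast duty. */
void model_init(int owner)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		devs[cpu].shutdown = false;
		tick[cpu].evtdev = &devs[cpu];
	}
	broadcast_owner = owner;
}

/* Models programming the next tick event: 0 on success, -1 if no usable device. */
int model_program_event(int cpu)
{
	struct clock_dev *dev = tick[cpu].evtdev;

	if (!dev || dev->shutdown)
		return -1;
	return 0;
}

/*
 * Step 1 (assumed to run on the dying cpu itself, interrupts disabled):
 * tear down the cpu local device, so no timer interrupt on that cpu can
 * observe a half-removed tick device.
 */
void model_shutdown_local(int cpu)
{
	devs[cpu].shutdown = true;
	tick[cpu].evtdev = NULL;
}

/*
 * Step 2 (assumed to run on a surviving cpu after the dying cpu is gone):
 * take over the broadcast duty. It no longer touches the dead cpu's
 * device, it only checks that step 1 already happened.
 */
void model_broadcast_takeover(int dead_cpu, int new_cpu)
{
	assert(tick[dead_cpu].evtdev == NULL);
	if (broadcast_owner == dead_cpu)
		broadcast_owner = new_cpu;
}
```

In this model the remote side never clears another cpu's device pointer,
which is the cross-cpu write that makes the t3/t4 window in the quoted
scenario possible. Calling model_shutdown_local(0) and then
model_broadcast_takeover(0, 1) moves the duty to cpu 1 only after cpu 0
has stopped its own device.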