* clockevents_shutdown vs pending interrupt
@ 2015-06-23  9:56 Andriy Gapon
  2015-06-23 10:15 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Andriy Gapon @ 2015-06-23  9:56 UTC (permalink / raw)
  To: linux-kernel


Pardon if I am asking something obvious or silly...

tick_check_new_device() has the following code:

        if (tick_is_broadcast_device(curdev)) {
                clockevents_shutdown(curdev);
                curdev = NULL;
        }

and

void clockevents_shutdown(struct clock_event_device *dev)
{
        clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
        dev->next_event.tv64 = KTIME_MAX;
}

This is all done while interrupts are disabled on the current CPU.
But what if there is already a pending interrupt from the current source?
Is it possible that the timer interrupt would be processed by the device that
was put in the shutdown mode?
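
The scenario in question can be sketched as a small user-space model. This
is only an illustration under stated assumptions, not kernel code: every
name below (model_dev, model_hw_fire, model_shutdown, model_irq_enable) is
hypothetical, and the model simply assumes that shutting the device down
does not clear an interrupt that was already latched. Whether real or
virtual hardware behaves that way is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_state { MODEL_PERIODIC, MODEL_SHUTDOWN };

struct model_dev {
	enum model_state state;
	bool irq_latched;	/* raised by "hardware", not yet delivered */
	int handled;		/* events the handler acted on */
	int spurious;		/* events dropped because the device was shut down */
};

/* Hardware side: the timer expires while the CPU has interrupts disabled. */
void model_hw_fire(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->state != MODEL_SHUTDOWN)
		dev->irq_latched = true;
}

/*
 * Software side: stop future events. ASSUMPTION of this model: an
 * already latched interrupt stays pending after the mode change.
 */
void model_shutdown(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->state = MODEL_SHUTDOWN;
}

/*
 * Interrupts are enabled again: deliver a latched interrupt, if any.
 * With 'guard' set, the handler first checks for the shutdown state and
 * counts the event as spurious. Returns true if an interrupt was delivered.
 */
bool model_irq_enable(struct model_dev *dev, bool guard)
{
	assert(dev != NULL);
	if (!dev->irq_latched)
		return false;
	dev->irq_latched = false;
	if (guard && dev->state == MODEL_SHUTDOWN)
		dev->spurious++;
	else
		dev->handled++;
	return true;
}
```

Calling model_hw_fire(), then model_shutdown(), then model_irq_enable()
without the guard makes the handler act on an event from a device that is
already shut down. With the guard, the same sequence only increments the
spurious counter.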

Some context: I am experiencing exactly the same symptoms as described here
http://thread.gmane.org/gmane.linux.kernel/1483297.  But I run a kernel where
that bug is fixed.  And my problem happens in a VM, so it's possible that there
are timing issues which are very unlikely on real hardware.


-- 
Andriy Gapon


* Re: clockevents_shutdown vs pending interrupt
  2015-06-23  9:56 clockevents_shutdown vs pending interrupt Andriy Gapon
@ 2015-06-23 10:15 ` Thomas Gleixner
  2015-07-01  9:45   ` Andriy Gapon
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2015-06-23 10:15 UTC (permalink / raw)
  To: Andriy Gapon; +Cc: linux-kernel

On Tue, 23 Jun 2015, Andriy Gapon wrote:
> Pardon if I am asking something obvious or silly...
> 
> tick_check_new_device() has the following code:
> 
>         if (tick_is_broadcast_device(curdev)) {
>                 clockevents_shutdown(curdev);
>                 curdev = NULL;
>         }
> 
> and
> 
> void clockevents_shutdown(struct clock_event_device *dev)
> {
>         clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>         dev->next_event.tv64 = KTIME_MAX;
> }
> 
> This is all done while interrupts are disabled on the current CPU.
> But what if there is already a pending interrupt from the current source?
> Is it possible that the timer interrupt would be processed by the device that
> was put in the shutdown mode?
> 
> Some context: I am experiencing exactly the same symptoms as described here
> http://thread.gmane.org/gmane.linux.kernel/1483297.  But I run a kernel where
> that bug is fixed.  And my problem happens in a VM, so it's possible that there
> are timing issues which are very unlikely on real hardware.

Can you provide a full dmesg please?

Thanks,

	tglx



* Re: clockevents_shutdown vs pending interrupt
  2015-06-23 10:15 ` Thomas Gleixner
@ 2015-07-01  9:45   ` Andriy Gapon
  2015-07-01 14:57     ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Andriy Gapon @ 2015-07-01  9:45 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel

On 23/06/2015 13:15, Thomas Gleixner wrote:
> Can you provide a full dmesg please?

Thomas,

I've caught a couple of boot logs with different stack traces from unsuccessful
boot attempts and one from a successful attempt with exactly the same VM
configuration.
The logs are here:
https://people.freebsd.org/~avg/linux-boot-hang/

-- 
Andriy Gapon


* Re: clockevents_shutdown vs pending interrupt
  2015-07-01  9:45   ` Andriy Gapon
@ 2015-07-01 14:57     ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2015-07-01 14:57 UTC (permalink / raw)
  To: Andriy Gapon; +Cc: linux-kernel

Andriy,

On Wed, 1 Jul 2015, Andriy Gapon wrote:
> I've caught a couple of boot logs with different stack traces from unsuccessful
> boot attempts and one from a successful attempt with exactly the same VM
> configuration.
> The logs are here:
> https://people.freebsd.org/~avg/linux-boot-hang/

I have to admit that I'm thoroughly confused by that broadcast
check in the install path. Can you apply the debug patch below and
provide the output?

Thanks,

	tglx
---
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index d39f32cdd1b5..f1f921a49da9 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -100,6 +100,7 @@ void tick_install_broadcast_device(struct clock_event_device *dev)
 	if (cur)
 		cur->event_handler = clockevents_handle_noop;
 	tick_broadcast_device.evtdev = dev;
+	pr_err("Install broadcast device %p %s\n", dev, dev->name);
 	if (!cpumask_empty(tick_broadcast_mask))
 		tick_broadcast_start_periodic(dev);
 	/*
@@ -301,6 +302,13 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 	bool bc_local;
 
 	raw_spin_lock(&tick_broadcast_lock);
+	/* Handle spurious interrupt */
+	if (clockevent_state_shutdown(dev)) {
+		pr_err("Spurious broadcast event %p %s\n", dev, dev->name);
+		raw_spin_unlock(&tick_broadcast_lock);
+		return;
+	}
+
 	bc_local = tick_do_periodic_broadcast();
 
 	if (clockevent_state_oneshot(dev)) {
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 76446cb5dfe1..ecd439b2de7e 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -321,6 +321,7 @@ void tick_check_new_device(struct clock_event_device *newdev)
 	if (!try_module_get(newdev->owner))
 		return;
 
+	pr_err("Install per cpu tick device %p %s\n", newdev, newdev->name);
 	/*
 	 * Replace the eventually existing device by the new
 	 * device. If the current device is the broadcast device, do



