From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: [PATCH] xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region.
Date: Tue, 1 Dec 2009 11:47:15 +0000
Message-ID: <1259668035-8552-2-git-send-email-ian.campbell@citrix.com>
References: <1259668035-8552-1-git-send-email-ian.campbell@citrix.com>
In-Reply-To: <1259668035-8552-1-git-send-email-ian.campbell@citrix.com>
In-Reply-To: <1259158328.7590.539.camel@zakaz.uk.xensource.com>
References: <1259158328.7590.539.camel@zakaz.uk.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
Cc: Jeremy Fitzhardinge, Ian Campbell
List-Id: xen-devel@lists.xenproject.org

I have observed cases where the implicit stop_machine_destroy() done by
stop_machine() hangs while destroying the workqueues, specifically in
kthread_stop(). This seems to be because timer ticks are not restarted
until after stop_machine() returns.

Fortunately stop_machine provides a facility to pre-create/post-destroy
the workqueues so use this to ensure that workqueues are only destroyed
after everything is really up and running again.

I only actually observed this failure with 2.6.30. It seems that newer
kernels are somehow more robust against doing kthread_stop() without timer
interrupts (I tried some backports of some likely looking candidates but
did not track down the commit which added this robustness). However this
change seems like a reasonable belt&braces thing to do.

Signed-off-by: Ian Campbell
Cc: Jeremy Fitzhardinge
---
 drivers/xen/manage.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 2fb7d39..c499793 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -79,6 +79,12 @@ static void do_suspend(void)
 
 	shutting_down = SHUTDOWN_SUSPEND;
 
+	err = stop_machine_create();
+	if (err) {
+		printk(KERN_ERR "xen suspend: failed to setup stop_machine %d\n", err);
+		goto out;
+	}
+
 #ifdef CONFIG_PREEMPT
 	/* If the kernel is preemptible, we need to freeze all the processes
 	   to prevent them from being in the middle of a pagetable update
@@ -86,7 +92,7 @@ static void do_suspend(void)
 	err = freeze_processes();
 	if (err) {
 		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
-		goto out;
+		goto out_destroy_sm;
 	}
 #endif
 
@@ -129,7 +135,11 @@ out_resume:
 out_thaw:
 #ifdef CONFIG_PREEMPT
 	thaw_processes();
+
+out_destroy_sm:
 #endif
+	stop_machine_destroy();
+
 out:
 	shutting_down = SHUTDOWN_INVALID;
 }
-- 
1.5.6.5
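
For readers following the reasoning in the commit message, here is a
rough user-space model of why pre-creating the workqueues moves the
teardown out of the window where timer ticks are stopped. It is a sketch
only, not kernel code. It assumes that stop_machine_create() and
stop_machine_destroy() are reference-counted, so the implicit pair inside
stop_machine() does no real work while an outer reference is held. Every
model_* name is hypothetical and exists only for this illustration.

```c
/* User-space model only; none of this is the kernel implementation. */
#include <assert.h>
#include <stdbool.h>

static int  model_refcount;             /* users of the stop_machine resources */
static bool model_wq_alive;             /* stands in for the workqueues */
static bool model_ticks_running = true; /* stands in for timer ticks */
static int  model_unsafe_destroys;      /* teardowns done with ticks stopped */

/* Assumed behaviour: only the first user actually creates the workqueues. */
static int model_stop_machine_create(void)
{
	if (model_refcount++ == 0)
		model_wq_alive = true;
	return 0;
}

/* Assumed behaviour: only the last user actually destroys the workqueues. */
static void model_stop_machine_destroy(void)
{
	assert(model_refcount > 0);
	if (--model_refcount == 0) {
		/* The report describes kthread_stop() hanging at this point
		 * when timer ticks have not been restarted yet. */
		if (!model_ticks_running)
			model_unsafe_destroys++;
		model_wq_alive = false;
	}
}

/* Models stop_machine(): implicit create/destroy, returning before the
 * timer ticks are restarted. */
static int model_stop_machine(void)
{
	int err = model_stop_machine_create();

	if (err)
		return err;
	model_ticks_running = false;  /* suspend path stops the ticks */
	model_stop_machine_destroy(); /* implicit destroy, ticks still off */
	return 0;
}

/* Returns how many real teardowns happened while ticks were stopped,
 * or -1 if the explicit create failed. */
static int model_do_suspend(bool precreate)
{
	model_refcount = 0;
	model_wq_alive = false;
	model_ticks_running = true;
	model_unsafe_destroys = 0;

	if (precreate && model_stop_machine_create())
		return -1;

	model_stop_machine();
	model_ticks_running = true;   /* ticks restart after stop_machine() */

	if (precreate)
		model_stop_machine_destroy(); /* everything is running again */

	return model_unsafe_destroys;
}
```

Calling model_do_suspend(false) models the old flow, where the only
teardown happens inside stop_machine(). Calling model_do_suspend(true)
models the flow in this patch, where the outer reference defers the
teardown until after resume.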