From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755456AbaHAMrj (ORCPT ); Fri, 1 Aug 2014 08:47:39 -0400
Received: from cantor2.suse.de ([195.135.220.15]:54017 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754323AbaHAMri (ORCPT ); Fri, 1 Aug 2014 08:47:38 -0400
Date: Fri, 1 Aug 2014 14:47:34 +0200
From: Jan Kara
To: NeilBrown
Cc: Ben Greear , Andrew Morton , "linux-nfs@vger.kernel.org" , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)
Message-ID: <20140801124734.GB5431@quack.suse.cz>
References: <53DA8443.407@candelatech.com> <20140801064217.01852788@notabene.brown> <53DAB307.2000206@candelatech.com> <20140801075053.2120cb33@notabene.brown>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20140801075053.2120cb33@notabene.brown>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 01-08-14 07:50:53, NeilBrown wrote:
> On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear wrote:
>
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > On 07/31/2014 01:42 PM, NeilBrown wrote:
> > > On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear wrote:
> > >
> > >> So, this has been asked all over the interweb for years and years, but the best answer I can find is to reboot the system or create a fake NFS server
> > >> somewhere with the same IP as the gone-away NFS server.
> > >>
> > >> The problem is:
> > >>
> > >> I have some mounts to an NFS server that no longer exists (crashed/powered down).
> > >>
> > >> I have some processes stuck trying to write to files open on these mounts.
> > >>
> > >> I want to kill the process and unmount.
> > >>
> > >> umount -l will make the mount go a way, sort of. But process is still hung. umount -f complains: umount2: Device or resource busy umount.nfs: /mnt/foo:
> > >> device is busy
> > >>
> > >> kill -9 does not work on process.
> > >
> > > Kill -1 should work (since about 2.6.25 or so).
> >
> > That is -[ONE], right? Assuming so, it did not work for me.
>
> No, it was "-9" .... sorry, I really shouldn't be let out without my proof
> reader.
>
> However the 'stack' is sufficient to see what is going on.
>
> The problem is that it is blocked inside the "VM" well away from NFS and
> there is no way for NFS to say "give up and go home".
>
> I'd suggest that is a bug. I cannot see any justification for fsync to not
> be killable.
> It wouldn't be too hard to create a patch to make it so.
> It would be a little harder to examine all call paths and create a
> convincing case that the patch was safe.
> It might be herculean task to convince others that it was the right thing
> to do.... so let's start with that one.
>
> Hi Linux-mm and fs-devel people. What do people think of making "fsync" and
> variants "KILLABLE" ??

Sounds useful to me and I don't see how it could break some application...

Honza
--
Jan Kara
SUSE Labs, CR