From: "Myklebust, Trond" <Trond.Myklebust@netapp.com> To: Nix <nix@esperi.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org>, "Ted Ts'o" <tytso@mit.edu>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "Schumaker, Bryan" <Bryan.Schumaker@netapp.com>, Peng Tao <bergwolf@gmail.com>, "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>, "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>, Stanislav Kinsbursky <skinsbursky@parallels.com> Subject: Re: Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression? (also an unrelated ext4 data loss bug) Date: Tue, 23 Oct 2012 18:23:48 +0000 [thread overview] Message-ID: <4FA345DA4F4AE44899BD2B03EEEC2FA90928CF49@SACEXCMBX04-PRD.hq.netapp.com> (raw) In-Reply-To: <1351015039.4622.23.camel@lade.trondhjem.org> [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3153 bytes --] On Tue, 2012-10-23 at 13:57 -0400, Trond Myklebust wrote: > On Tue, 2012-10-23 at 17:44 +0000, Myklebust, Trond wrote: > > You can't hold a spinlock while sleeping. Both mutex_lock() and nsm_create() can definitely sleep. > > > > The correct way to do this is to grab the spinlock and recheck the value of ln->nsm_users inside the 'if (!IS_ERR())' condition. If it is still zero, bump it and set ln->nsm_clnt, otherwise bump it, get the existing ln->nsm_clnt and call rpc_shutdown_clnt() on the redundant nsm client after dropping the spinlock. > > > > Cheers > > Trond > > Can you please check if the following patch fixes the issue? > > Cheers > Trond > Meh... This one gets rid of the 100% redundant mutex... 8<----------------------------------------------------------- >From 4187c816a15df12544ebcfa6b961fce96458e244 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@netapp.com> Date: Tue, 23 Oct 2012 13:51:58 -0400 Subject: [PATCH] LOCKD: fix races in nsm_client_get Commit e9406db20fecbfcab646bad157b4cfdc7cadddfb (lockd: per-net NSM client creation and destruction helpers introduced) contains a nasty race on initialisation of the per-net NSM client because it doesn't check whether or not the client is set after grabbing the nsm_create_mutex. Reported-by: Nix <nix@esperi.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org --- fs/lockd/mon.c | 43 ++++++++++++++++++++++++++----------------- 1 file changed, 26 insertions(+), 17 deletions(-) diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c index e4fb3ba..fe69560 100644 --- a/fs/lockd/mon.c +++ b/fs/lockd/mon.c @@ -85,29 +85,38 @@ static struct rpc_clnt *nsm_create(struct net *net) return rpc_create(&args); } +static struct rpc_clnt *nsm_client_set(struct lockd_net *ln, + struct rpc_clnt *clnt) +{ + spin_lock(&ln->nsm_clnt_lock); + if (ln->nsm_users == 0) { + if (clnt == NULL) + goto out; + ln->nsm_clnt = clnt; + } + clnt = ln->nsm_clnt; + ln->nsm_users++; +out: + spin_unlock(&ln->nsm_clnt_lock); + return clnt; +} + static struct rpc_clnt *nsm_client_get(struct net *net) { - static DEFINE_MUTEX(nsm_create_mutex); - struct rpc_clnt *clnt; + struct rpc_clnt *clnt, *new; struct lockd_net *ln = net_generic(net, lockd_net_id); - spin_lock(&ln->nsm_clnt_lock); - if (ln->nsm_users) { - ln->nsm_users++; - clnt = ln->nsm_clnt; - spin_unlock(&ln->nsm_clnt_lock); + clnt = nsm_client_set(ln, NULL); + if (clnt != NULL) goto out; - } - spin_unlock(&ln->nsm_clnt_lock); - mutex_lock(&nsm_create_mutex); - clnt = nsm_create(net); - if (!IS_ERR(clnt)) { - ln->nsm_clnt = clnt; - smp_wmb(); - ln->nsm_users = 1; - } - mutex_unlock(&nsm_create_mutex); + clnt = new = nsm_create(net); + if (IS_ERR(clnt)) + goto out; + + clnt = nsm_client_set(ln, new); + if (clnt != new) + rpc_shutdown_client(new); out: return clnt; } -- 1.7.11.7 -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
WARNING: multiple messages have this Message-ID (diff)
From: "Myklebust, Trond" <Trond.Myklebust@netapp.com> To: Nix <nix@esperi.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org>, "Ted Ts'o" <tytso@mit.edu>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "Schumaker, Bryan" <Bryan.Schumaker@netapp.com>, Peng Tao <bergwolf@gmail.com>, "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>, "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>, Stanislav Kinsbursky <skinsbursky@parallels.com> Subject: Re: Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression? (also an unrelated ext4 data loss bug) Date: Tue, 23 Oct 2012 18:23:48 +0000 [thread overview] Message-ID: <4FA345DA4F4AE44899BD2B03EEEC2FA90928CF49@SACEXCMBX04-PRD.hq.netapp.com> (raw) In-Reply-To: <1351015039.4622.23.camel@lade.trondhjem.org> T24gVHVlLCAyMDEyLTEwLTIzIGF0IDEzOjU3IC0wNDAwLCBUcm9uZCBNeWtsZWJ1c3Qgd3JvdGU6 DQo+IE9uIFR1ZSwgMjAxMi0xMC0yMyBhdCAxNzo0NCArMDAwMCwgTXlrbGVidXN0LCBUcm9uZCB3 cm90ZToNCj4gPiBZb3UgY2FuJ3QgaG9sZCBhIHNwaW5sb2NrIHdoaWxlIHNsZWVwaW5nLiBCb3Ro IG11dGV4X2xvY2soKSBhbmQgbnNtX2NyZWF0ZSgpIGNhbiBkZWZpbml0ZWx5IHNsZWVwLg0KPiA+ IA0KPiA+IFRoZSBjb3JyZWN0IHdheSB0byBkbyB0aGlzIGlzIHRvIGdyYWIgdGhlIHNwaW5sb2Nr IGFuZCByZWNoZWNrIHRoZSB2YWx1ZSBvZiBsbi0+bnNtX3VzZXJzIGluc2lkZSB0aGUgJ2lmICgh SVNfRVJSKCkpJyBjb25kaXRpb24uIElmIGl0IGlzIHN0aWxsIHplcm8sIGJ1bXAgaXQgYW5kIHNl dCBsbi0+bnNtX2NsbnQsIG90aGVyd2lzZSBidW1wIGl0LCBnZXQgdGhlIGV4aXN0aW5nIGxuLT5u c21fY2xudCBhbmQgY2FsbCBycGNfc2h1dGRvd25fY2xudCgpIG9uIHRoZSByZWR1bmRhbnQgbnNt IGNsaWVudCBhZnRlciBkcm9wcGluZyB0aGUgc3BpbmxvY2suDQo+ID4gDQo+ID4gQ2hlZXJzDQo+ ID4gICBUcm9uZA0KPiANCj4gQ2FuIHlvdSBwbGVhc2UgY2hlY2sgaWYgdGhlIGZvbGxvd2luZyBw YXRjaCBmaXhlcyB0aGUgaXNzdWU/DQo+IA0KPiBDaGVlcnMNCj4gICBUcm9uZA0KPiANCk1laC4u LiBUaGlzIG9uZSBnZXRzIHJpZCBvZiB0aGUgMTAwJSByZWR1bmRhbnQgbXV0ZXguLi4NCg0KODwt LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LQ0KRnJvbSA0MTg3YzgxNmExNWRmMTI1NDRlYmNmYTZiOTYxZmNlOTY0NThlMjQ0IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQ0KRnJvbTogVHJvbmQgTXlrbGVidXN0IDxUcm9uZC5NeWtsZWJ1c3RA bmV0YXBwLmNvbT4NCkRhdGU6IFR1ZSwgMjMgT2N0IDIwMTIgMTM6NTE6NTggLTA0MDANClN1Ympl Y3Q6IFtQQVRDSF0gTE9DS0Q6IGZpeCByYWNlcyBpbiBuc21fY2xpZW50X2dldA0KDQpDb21taXQg ZTk0MDZkYjIwZmVjYmZjYWI2NDZiYWQxNTdiNGNmZGM3Y2FkZGRmYiAobG9ja2Q6IHBlci1uZXQN Ck5TTSBjbGllbnQgY3JlYXRpb24gYW5kIGRlc3RydWN0aW9uIGhlbHBlcnMgaW50cm9kdWNlZCkg Y29udGFpbnMNCmEgbmFzdHkgcmFjZSBvbiBpbml0aWFsaXNhdGlvbiBvZiB0aGUgcGVyLW5ldCBO U00gY2xpZW50IGJlY2F1c2UNCml0IGRvZXNuJ3QgY2hlY2sgd2hldGhlciBvciBub3QgdGhlIGNs aWVudCBpcyBzZXQgYWZ0ZXIgZ3JhYmJpbmcNCnRoZSBuc21fY3JlYXRlX211dGV4Lg0KDQpSZXBv cnRlZC1ieTogTml4IDxuaXhAZXNwZXJpLm9yZy51az4NClNpZ25lZC1vZmYtYnk6IFRyb25kIE15 a2xlYnVzdCA8VHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20+DQpDYzogc3RhYmxlQHZnZXIua2Vy bmVsLm9yZw0KLS0tDQogZnMvbG9ja2QvbW9uLmMgfCA0MyArKysrKysrKysrKysrKysrKysrKysr KysrKy0tLS0tLS0tLS0tLS0tLS0tDQogMSBmaWxlIGNoYW5nZWQsIDI2IGluc2VydGlvbnMoKyks IDE3IGRlbGV0aW9ucygtKQ0KDQpkaWZmIC0tZ2l0IGEvZnMvbG9ja2QvbW9uLmMgYi9mcy9sb2Nr ZC9tb24uYw0KaW5kZXggZTRmYjNiYS4uZmU2OTU2MCAxMDA2NDQNCi0tLSBhL2ZzL2xvY2tkL21v bi5jDQorKysgYi9mcy9sb2NrZC9tb24uYw0KQEAgLTg1LDI5ICs4NSwzOCBAQCBzdGF0aWMgc3Ry dWN0IHJwY19jbG50ICpuc21fY3JlYXRlKHN0cnVjdCBuZXQgKm5ldCkNCiAJcmV0dXJuIHJwY19j cmVhdGUoJmFyZ3MpOw0KIH0NCiANCitzdGF0aWMgc3RydWN0IHJwY19jbG50ICpuc21fY2xpZW50 X3NldChzdHJ1Y3QgbG9ja2RfbmV0ICpsbiwNCisJCXN0cnVjdCBycGNfY2xudCAqY2xudCkNCit7 DQorCXNwaW5fbG9jaygmbG4tPm5zbV9jbG50X2xvY2spOw0KKwlpZiAobG4tPm5zbV91c2VycyA9 PSAwKSB7DQorCQlpZiAoY2xudCA9PSBOVUxMKQ0KKwkJCWdvdG8gb3V0Ow0KKwkJbG4tPm5zbV9j bG50ID0gY2xudDsNCisJfQ0KKwljbG50ID0gbG4tPm5zbV9jbG50Ow0KKwlsbi0+bnNtX3VzZXJz Kys7DQorb3V0Og0KKwlzcGluX3VubG9jaygmbG4tPm5zbV9jbG50X2xvY2spOw0KKwlyZXR1cm4g Y2xudDsNCit9DQorDQogc3RhdGljIHN0cnVjdCBycGNfY2xudCAqbnNtX2NsaWVudF9nZXQoc3Ry dWN0IG5ldCAqbmV0KQ0KIHsNCi0Jc3RhdGljIERFRklORV9NVVRFWChuc21fY3JlYXRlX211dGV4 KTsNCi0Jc3RydWN0IHJwY19jbG50CSpjbG50Ow0KKwlzdHJ1Y3QgcnBjX2NsbnQJKmNsbnQsICpu ZXc7DQogCXN0cnVjdCBsb2NrZF9uZXQgKmxuID0gbmV0X2dlbmVyaWMobmV0LCBsb2NrZF9uZXRf aWQpOw0KIA0KLQlzcGluX2xvY2soJmxuLT5uc21fY2xudF9sb2NrKTsNCi0JaWYgKGxuLT5uc21f dXNlcnMpIHsNCi0JCWxuLT5uc21fdXNlcnMrKzsNCi0JCWNsbnQgPSBsbi0+bnNtX2NsbnQ7DQot CQlzcGluX3VubG9jaygmbG4tPm5zbV9jbG50X2xvY2spOw0KKwljbG50ID0gbnNtX2NsaWVudF9z ZXQobG4sIE5VTEwpOw0KKwlpZiAoY2xudCAhPSBOVUxMKQ0KIAkJZ290byBvdXQ7DQotCX0NCi0J c3Bpbl91bmxvY2soJmxuLT5uc21fY2xudF9sb2NrKTsNCiANCi0JbXV0ZXhfbG9jaygmbnNtX2Ny ZWF0ZV9tdXRleCk7DQotCWNsbnQgPSBuc21fY3JlYXRlKG5ldCk7DQotCWlmICghSVNfRVJSKGNs bnQpKSB7DQotCQlsbi0+bnNtX2NsbnQgPSBjbG50Ow0KLQkJc21wX3dtYigpOw0KLQkJbG4tPm5z bV91c2VycyA9IDE7DQotCX0NCi0JbXV0ZXhfdW5sb2NrKCZuc21fY3JlYXRlX211dGV4KTsNCisJ Y2xudCA9IG5ldyA9IG5zbV9jcmVhdGUobmV0KTsNCisJaWYgKElTX0VSUihjbG50KSkNCisJCWdv dG8gb3V0Ow0KKw0KKwljbG50ID0gbnNtX2NsaWVudF9zZXQobG4sIG5ldyk7DQorCWlmIChjbG50 ICE9IG5ldykNCisJCXJwY19zaHV0ZG93bl9jbGllbnQobmV3KTsNCiBvdXQ6DQogCXJldHVybiBj bG50Ow0KIH0NCi0tIA0KMS43LjExLjcNCg0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXgg TkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0KVHJvbmQuTXlrbGVidXN0QG5ldGFwcC5j b20NCnd3dy5uZXRhcHAuY29tDQo=
next prev parent reply other threads:[~2012-10-23 18:23 UTC|newest] Thread overview: 105+ messages / expand[flat|nested] mbox.gz Atom feed top 2012-10-22 16:17 Heads-up: 3.6.2 / 3.6.3 NFS server panic: 3.6.2+ regression? Nix 2012-10-23 1:33 ` J. Bruce Fields 2012-10-23 14:07 ` Nix 2012-10-23 14:30 ` J. Bruce Fields 2012-10-23 16:32 ` Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression? (also an unrelated ext4 data loss bug) Nix 2012-10-23 16:46 ` J. Bruce Fields 2012-10-23 16:54 ` J. Bruce Fields 2012-10-23 16:56 ` Myklebust, Trond 2012-10-23 16:56 ` Myklebust, Trond 2012-10-23 17:05 ` Nix 2012-10-23 17:36 ` Nix 2012-10-23 17:43 ` J. Bruce Fields 2012-10-23 17:44 ` Myklebust, Trond 2012-10-23 17:57 ` Myklebust, Trond 2012-10-23 17:57 ` Myklebust, Trond [not found] ` <1351015039.4622.23.camel@lade.trondhjem.org> 2012-10-23 18:23 ` Myklebust, Trond [this message] 2012-10-23 18:23 ` Myklebust, Trond 2012-10-23 19:49 ` Nix 2012-10-24 10:18 ` [PATCH] lockd: fix races in per-net NSM client handling Stanislav Kinsbursky 2012-10-23 20:57 ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Nix 2012-10-23 20:57 ` Nix 2012-10-23 22:19 ` Theodore Ts'o 2012-10-23 22:47 ` Nix 2012-10-23 23:16 ` Theodore Ts'o 2012-10-23 23:06 ` Nix 2012-10-23 23:28 ` Theodore Ts'o 2012-10-23 23:34 ` Nix 2012-10-24 0:57 ` Eric Sandeen 2012-10-24 20:17 ` Jan Kara 2012-10-26 15:25 ` Eric Sandeen 2012-10-24 19:13 ` Jannis Achstetter 2012-10-24 19:13 ` Jannis Achstetter 2012-10-24 21:31 ` Theodore Ts'o 2012-10-24 22:05 ` Jannis Achstetter 2012-10-24 23:47 ` Nix 2012-10-25 17:02 ` Felipe Contreras 2012-10-24 21:04 ` Jannis Achstetter 2012-10-24 1:13 ` Eric Sandeen 2012-10-24 1:13 ` Eric Sandeen 2012-10-24 4:15 ` Nix 2012-10-24 4:27 ` Eric Sandeen 2012-10-24 5:23 ` Theodore Ts'o 2012-10-24 7:00 ` Hugh Dickins 2012-10-24 11:46 ` Nix 2012-10-24 11:45 ` Nix 2012-10-24 17:22 ` Eric Sandeen 2012-10-24 19:49 ` Nix 2012-10-24 19:54 ` Nix 2012-10-24 20:30 ` Eric Sandeen 2012-10-24 20:34 ` Nix 2012-10-24 20:45 ` Nix 2012-10-24 21:08 ` Theodore Ts'o 2012-10-24 23:27 ` Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) Nix 2012-10-24 23:42 ` Nix 2012-10-25 1:10 ` Theodore Ts'o 2012-10-25 1:45 ` Nix 2012-10-25 1:45 ` Nix 2012-10-25 14:12 ` Theodore Ts'o 2012-10-25 14:15 ` Nix 2012-10-25 17:39 ` Nix 2012-10-25 11:06 ` Nix 2012-10-26 0:22 ` Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) (possibly blockdev / arcmsr at fault??) Nix 2012-10-26 0:11 ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Ric Wheeler 2012-10-26 0:43 ` Theodore Ts'o 2012-10-26 12:12 ` Nix 2012-10-26 20:35 ` Eric Sandeen 2012-10-26 20:37 ` Nix 2012-10-26 20:56 ` Theodore Ts'o 2012-10-26 20:56 ` Theodore Ts'o 2012-10-26 20:59 ` Nix 2012-10-26 20:59 ` Nix 2012-10-26 21:15 ` Theodore Ts'o 2012-10-26 21:15 ` Theodore Ts'o 2012-10-26 21:19 ` Nix 2012-10-27 0:22 ` Theodore Ts'o 2012-10-27 0:22 ` Theodore Ts'o 2012-10-27 12:45 ` Nix 2012-10-27 17:55 ` Theodore Ts'o 2012-10-27 18:47 ` Nix 2012-10-27 21:19 ` Eric Sandeen 2012-10-27 21:21 ` Nix 2012-10-27 21:23 ` Eric Sandeen 2012-10-27 21:29 ` Nix 2012-10-27 21:34 ` Eric Sandeen 2012-10-27 21:40 ` Nix [not found] ` <09758CEA-74B5-48D0-8075-BB723A2CABBB@dilger.ca> 2012-10-29 2:09 ` Eric Sandeen 2012-10-27 22:42 ` Eric Sandeen 2012-10-29 1:00 ` Theodore Ts'o 2012-10-29 1:04 ` Nix 2012-10-29 2:24 ` Eric Sandeen 2012-10-29 2:34 ` Theodore Ts'o 2012-10-29 2:35 ` Eric Sandeen 2012-10-29 2:42 ` Theodore Ts'o 2012-10-27 18:30 ` Eric Sandeen 2012-10-27 3:11 ` Jim Rees 2012-10-27 3:11 ` Jim Rees 2012-10-27 8:01 ` Testing ext4's journal via simulating a reboot via KVM Theodore Ts'o 2012-10-28 4:23 ` [PATCH] ext4: fix unjournaled inode bitmap modification Eric Sandeen 2012-10-28 4:23 ` Eric Sandeen 2012-10-28 13:59 ` Nix 2012-10-29 2:30 ` [PATCH -v3] " Theodore Ts'o 2012-10-29 2:30 ` Theodore Ts'o 2012-10-29 3:24 ` Eric Sandeen 2012-10-29 5:07 ` Andreas Dilger 2012-10-29 17:08 ` Darrick J. Wong
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=4FA345DA4F4AE44899BD2B03EEEC2FA90928CF49@SACEXCMBX04-PRD.hq.netapp.com \ --to=trond.myklebust@netapp.com \ --cc=Bryan.Schumaker@netapp.com \ --cc=bergwolf@gmail.com \ --cc=bfields@fieldses.org \ --cc=gregkh@linuxfoundation.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-nfs@vger.kernel.org \ --cc=nix@esperi.org.uk \ --cc=skinsbursky@parallels.com \ --cc=tytso@mit.edu \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.