From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1161278AbcFHDz4 (ORCPT ); Tue, 7 Jun 2016 23:55:56 -0400 Received: from linuxhacker.ru ([217.76.32.60]:53982 "EHLO fiona.linuxhacker.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752150AbcFHDzu convert rfc822-to-8bit (ORCPT ); Tue, 7 Jun 2016 23:55:50 -0400 Subject: Re: Files leak from nfsd in 4.7.1-rc1 (and more?) Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=us-ascii From: Oleg Drokin In-Reply-To: Date: Tue, 7 Jun 2016 23:55:41 -0400 Cc: "J. Bruce Fields" , linux-nfs@vger.kernel.org, " Mailing List" Content-Transfer-Encoding: 8BIT Message-Id: <1D1008D4-9A28-48B2-9BA7-4069C2998D17@linuxhacker.ru> References: <4EDA6CFD-1FE8-4FCA-ACCF-84250BE342CB@linuxhacker.ru> <1465319435.3024.25.camel@poochiereds.net> <0F21EDD6-5CBB-4B5B-A1FF-E066011D18D6@linuxhacker.ru> <1465329897.3024.38.camel@poochiereds.net> <752F7196-1EE7-4FB3-8769-177131C8A793@linuxhacker.ru> <1465344205.3024.42.camel@poochiereds.net> To: Jeff Layton X-Mailer: Apple Mail (2.1283) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Jun 7, 2016, at 10:22 PM, Oleg Drokin wrote: > > On Jun 7, 2016, at 8:03 PM, Jeff Layton wrote: > >>>> That said, this code is quite subtle. I'd need to look over it in more >>>> detail before I offer up any fixes. I'd also appreciate it if anyone >>>> else wants to sanity check my analysis there. >>>> >> Yeah, I think you're right. It's fine since r/w opens have a distinct >> slot, even though the refcounting just tracks the number of read and >> write references. So yeah, the leak probably is in an error path >> someplace, or maybe a race someplace. > > So I noticed that set_access is always called locked, but clear_access is not, > this does not sound right. > > So I placed this strategic WARN_ON: > @@ -3991,6 +4030,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp, > goto out_put_access; > spin_lock(&fp->fi_lock); > if (!fp->fi_fds[oflag]) { > +WARN_ON(!test_access(open->op_share_access, stp)); > fp->fi_fds[oflag] = filp; > filp = NULL; > > This is right in the place where nfsd set the access flag already, discovered > that the file is not opened and went on to open it, yet some parallel thread > came in and cleared the flag by the time we got the file opened. > It did trigger (but there are 30 minutes left till test finish, so I don't > know yet if this will correspond to the problem at hand yet, so below is speculation). Duh, I looked for a warning, but did not cross reference, and it was not this one that hit yet. Though apparently I am hitting some of the "impossible" warnings, so you might want to look into that anyway. status = nfsd4_process_open2(rqstp, resfh, open); WARN(status && open->op_created, "nfsd4_process_open2 failed to open newly-created file! status=%u\n", be32_to_cpu(status)); and filp = find_readable_file(fp); if (!filp) { /* We should always have a readable file here */ WARN_ON_ONCE(1); locks_free_lock(fl); return -EBADF; }