From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from zeniv.linux.org.uk ([195.92.253.2]:35120 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965310AbcBQVR5 (ORCPT ); Wed, 17 Feb 2016 16:17:57 -0500 Date: Wed, 17 Feb 2016 21:17:54 +0000 From: Al Viro To: Mike Marshall Cc: Martin Brandenburg , Linus Torvalds , linux-fsdevel , Stephen Rothwell Subject: Re: Orangefs ABI documentation Message-ID: <20160217211754.GO17997@ZenIV.linux.org.uk> References: <20160214234312.GX17997@ZenIV.linux.org.uk> <20160215184554.GY17997@ZenIV.linux.org.uk> <20160215230434.GZ17997@ZenIV.linux.org.uk> <20160216233609.GE17997@ZenIV.linux.org.uk> <20160216235441.GF17997@ZenIV.linux.org.uk> <20160217201120.GN17997@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160217201120.GN17997@ZenIV.linux.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Wed, Feb 17, 2016 at 08:11:20PM +0000, Al Viro wrote: > With reinit_completion() added? > > > Maybe this is relevant: > > > > Alloced OP ffff880015698000 <- doomed op for orangefs_create MAILBOX2.CPT > > service_operation: orangefs_create op ffff880015698000 > > ffff880015698000 got past is_daemon_in_service > > > > ... lots of stuff ... > > > > w_f_m_d returned -11 for ffff880015698000 <- first op to get EAGAIN > > > > first client core is NOT in service > > second op to get EAGAIN > > ... > > last client core is NOT in service > > > > ... lots of stuff ... > > > > service_operation returns to orangef_create with handle 0 fsid 0 ret 0 > > for MAILBOX2.CPT > > > > I'm guessing you want me to wait to do the switching of my branch > > until we fix this (last?) thing, let me know... > > What I'd like to check is the value of op->waitq.done at retry_servicing. > If we get there with a non-zero value, we've a problem. BTW, do you > hit any of gossip_err() in orangefs_clean_up_interrupted_operation()? Am I right assuming that you are seeing zero from service_operation() for ORANGEFS_VFS_OP_CREATE with zeroed ->downcall.resp.create.refn? AFAICS, that can only happen with wait_for_matching_downcall() returning 0, which mean op_state_serviced(op). And prior to that we had set_op_state_waiting(op), which means set_op_state_serviced(op) done between those two... So unless you have something really nasty going on (buggered op refcounting, memory corruption, etc.), we had orangefs_devreq_write_iter() pick that op and copy the entire op->downcall from userland. With op->downcall.status not set to -EFAULT, or we would've... Oh, shit - it's fed through orangefs_normalize_to_errno(). OTOH, it would've hit gossip_err("orangefs: orangefs_normalize_to_errno: got error code which is not from ORANGEFS.\n") and presumably you would've noticed that. And it still would've returned that -EFAULT intact, actually. In any case, it's a bug - we need ret = -(ORANGEFS_ERROR_BIT|9); goto Broken; instead of those ret = -EFAULT; goto Broken; and ret = -(ORANGEFS_ERROR_BIT|8); goto Broken; instead of ret = -ENOMEM; goto Broken; in orangefs_devreq_write_iter(). Anyway, it looks like we didn't go through Broken: in there, so copy_from_user() must've been successful. Hmm... Could the daemon have really given that reply? Relevant test would be something like WARN_ON(op->upcall.type == ORANGEFS_VFS_OP_CREATE && !op->downcall.resp.create.refn.fsid); right after wakeup: /* * tell the vfs op waiting on a waitqueue * that this op is done */ spin_lock(&op->lock); if (unlikely(op_state_given_up(op))) { spin_unlock(&op->lock); goto out; } set_op_state_serviced(op); and plain WARN_ON(1) right after Broken: If that op goes through either of those set_op_state_serviced() with that zero refn.fsid, we'll see a WARN_ON triggered...