From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-fsdevel-owner@vger.kernel.org>
Received: from zeniv.linux.org.uk ([195.92.253.2]:51799 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753996AbcAVRnk (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Fri, 22 Jan 2016 12:43:40 -0500
Date: Fri, 22 Jan 2016 17:43:38 +0000
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Mike Marshall <hubcap@omnibond.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Orangefs ABI documentation
Message-ID: <20160122174338.GA17997@ZenIV.linux.org.uk>
References: <CAOg9mSS0Jg5JEjXxn0S5vTpV4kWR6qEyBiif7hu9EohNB0X9Bg@mail.gmail.com>
 <20160122071147.GY17997@ZenIV.linux.org.uk>
 <CAOg9mST6tDsY-cwvJsfYbcLt41Uufyy8Y-nXiCFsXwRJ0xanjQ@mail.gmail.com>
 <CAOg9mSQf4m-MM_kKd=UcGsCx_LZg=+nFESd8jhzDEbcM9DOGjg@mail.gmail.com>
 <20160122170838.GZ17997@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160122170838.GZ17997@ZenIV.linux.org.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Fri, Jan 22, 2016 at 05:08:38PM +0000, Al Viro wrote:

> BTW, what should happen to requests that got a buggered response in
> orangefs_devreq_write_iter()?  As it is, you just free them and to hell
> with whatever might've been waiting on them; that's easy to fix, but
> what about the submitter?  Should we let it time out/simulate an error
> properly returned by daemon/something else?

FWIW, here's what I have in mind for lifetime rules:
	1) op_alloc() creates them with refcount 1
	2) when we decide to pass the sucker to daemon, we bump the refcount
before taking the thing out of request list.  After successful copying
to userland, we check if the submitter has given up on it.  If it hasn't,
we move the sucker to hash and decrement the refcount (it can't be the
last reference there).  If it has, we do put_op().  After _failed_ copying
to userland, we also check if it's been given up.  If it hasn't, we move
it back to request list and decrement the refcount.  If it has, we do
put_op().
	3) when we get a response from daemon, we bump the refcount before
taking the thing out of hash.  Once we are done, we call put_op() - in *all*
cases.  Malformed response is treated like a well-formed error.
	4) if submitter decides to give up upon a request, it sets a flag in
->op_state.  From that point on nobody else is going to touch op->list.
	5) once submitter is done, it does put_op() - *NOT* op_release().
Normally it ends up with request freed, but if we'd raced with interaction
with daemon we'll get it freed by put_op() done in (2) or (3).  No
        /*
         * tell the device file owner waiting on I/O that this read has
         * completed and it can return now.  in this exact case, on
         * wakeup the daemon will free the op, so we *cannot* touch it
         * after this.
         */
        wake_up_daemon_for_return(new_op);
        new_op = NULL;
anymore - just wake it up and do put_op() regardless.  In that case daemon
has already done get_op() and will free the sucker once it does its put_op().

Objections, comments?

BTW, it really looks like we might be better off without atomic_t here -
op->lock is held for all non-trivial transitions and we might as well have
put_op() require op->lock held and do something like
	if (--op->ref_count) {
		spin_unlock(op->lock);
		return;
	}
	spin_unlock(op->lock);
	op_release(op);

Speaking of op->lock:
                        /*
                         * let process sleep for a few seconds so shared
                         * memory system can be initialized.
                         */
                        spin_lock_irqsave(&op->lock, irqflags);
                        prepare_to_wait(&orangefs_bufmap_init_waitq,
                                        &wait_entry,
                                        TASK_INTERRUPTIBLE);
                        spin_unlock_irqrestore(&op->lock, irqflags);
What do you need to block interrupts for?

Oh, and why do you call orangefs_op_initialize(orangefs_op) from op_release(),
when you immediately follow it with freeing orangefs_op and do it on
allocation anyway?