All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Marshall <hubcap@omnibond.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Martin Brandenburg <martin@omnibond.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Subject: Re: Orangefs ABI documentation
Date: Wed, 17 Feb 2016 17:24:56 -0500	[thread overview]
Message-ID: <CAOg9mSRtGvMtr2b8xjFUdgJDgS7hb+dYtjEBBMKkatZJsbLuJg@mail.gmail.com> (raw)
In-Reply-To: <20160217211754.GO17997@ZenIV.linux.org.uk>

> with reinit_completion() added?

In the right place, even...

> Am I right assuming that you are seeing zero from service_operation()
> for ORANGEFS_VFS_OP_CREATE with zeroed ->downcall.resp.create.refn?
> AFAICS, that can only happen with wait_for_matching_downcall() returning
> 0, which mean op_state_serviced(op).

Yes sir:

Feb 17 12:25:15 be1 kernel: [  857.901225] service_operation:
wait_for_matching_downcall returned 0 for ffff880015698000
Feb 17 12:25:15 be1 kernel: [  857.901228] orangefs: service_operation
orangefs_create returning: 0 for ffff880015698000.
Feb 17 12:25:15 be1 kernel: [  857.901230] orangefs_create:
MAILBOX2.CPT: handle:00000000-0000-0000-0000-000000000000: fsid:0:
new_op:ffff880015698000: ret:0:

I'll get those WARN_ONs in there and do some more
tests.

Martin thinks he might be on to a solution... he'll
probably send a response soon... I can hear him typing
at 500 characters a minute a few cubes away...

-Mike

On Wed, Feb 17, 2016 at 4:17 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Feb 17, 2016 at 08:11:20PM +0000, Al Viro wrote:
>> With reinit_completion() added?
>>
>> > Maybe this is relevant:
>> >
>> > Alloced OP ffff880015698000 <- doomed op for orangefs_create MAILBOX2.CPT
>> > service_operation: orangefs_create op ffff880015698000
>> > ffff880015698000 got past is_daemon_in_service
>> >
>> > ... lots of stuff ...
>> >
>> > w_f_m_d returned -11 for ffff880015698000 <- first op to get EAGAIN
>> >
>> > first client core is NOT in service
>> > second op to get EAGAIN
>> >           ...
>> > last client core is NOT in service
>> >
>> > ... lots of stuff ...
>> >
>> > service_operation returns to orangef_create with handle 0 fsid 0 ret 0
>> > for MAILBOX2.CPT
>> >
>> > I'm guessing you want me to wait to do the switching of my branch
>> > until we fix this (last?) thing, let me know...
>>
>> What I'd like to check is the value of op->waitq.done at retry_servicing.
>> If we get there with a non-zero value, we've a problem.  BTW, do you
>> hit any of gossip_err() in orangefs_clean_up_interrupted_operation()?
>
> Am I right assuming that you are seeing zero from service_operation()
> for ORANGEFS_VFS_OP_CREATE with zeroed ->downcall.resp.create.refn?
> AFAICS, that can only happen with wait_for_matching_downcall() returning
> 0, which mean op_state_serviced(op).  And prior to that we had
> set_op_state_waiting(op), which means set_op_state_serviced(op) done between
> those two...
>
> So unless you have something really nasty going on (buggered op refcounting,
> memory corruption, etc.), we had orangefs_devreq_write_iter() pick that
> op and copy the entire op->downcall from userland.  With op->downcall.status
> not set to -EFAULT, or we would've...  Oh, shit - it's fed through
> orangefs_normalize_to_errno().  OTOH, it would've hit
> gossip_err("orangefs: orangefs_normalize_to_errno: got error code which is not
> from ORANGEFS.\n") and presumably you would've noticed that.  And it still
> would've returned that -EFAULT intact, actually.  In any case, it's a bug -
> we need
>         ret = -(ORANGEFS_ERROR_BIT|9);
>         goto Broken;
> instead of those
>         ret = -EFAULT;
>         goto Broken;
> and
>         ret = -(ORANGEFS_ERROR_BIT|8);
>         goto Broken;
> instead of
>         ret = -ENOMEM;
>         goto Broken;
> in orangefs_devreq_write_iter().
>
> Anyway, it looks like we didn't go through Broken: in there, so copy_from_user()
> must've been successful.
>
> Hmm...  Could the daemon have really given that reply?  Relevant test would
> be something like
>         WARN_ON(op->upcall.type == ORANGEFS_VFS_OP_CREATE &&
>             !op->downcall.resp.create.refn.fsid);
> right after
> wakeup:
>         /*
>          * tell the vfs op waiting on a waitqueue
>          * that this op is done
>          */
>         spin_lock(&op->lock);
>         if (unlikely(op_state_given_up(op))) {
>                 spin_unlock(&op->lock);
>                 goto out;
>         }
>         set_op_state_serviced(op);
>
> and plain WARN_ON(1) right after Broken:
>
> If that op goes through either of those set_op_state_serviced() with that
> zero refn.fsid, we'll see a WARN_ON triggered...

  reply	other threads:[~2016-02-17 22:24 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-15 21:46 Orangefs ABI documentation Mike Marshall
2016-01-22  7:11 ` Al Viro
2016-01-22 11:09   ` Mike Marshall
2016-01-22 16:59     ` Mike Marshall
2016-01-22 17:08       ` Al Viro
2016-01-22 17:40         ` Mike Marshall
2016-01-22 17:43         ` Al Viro
2016-01-22 18:17           ` Mike Marshall
2016-01-22 18:37             ` Al Viro
2016-01-22 19:07               ` Mike Marshall
2016-01-22 19:21                 ` Mike Marshall
2016-01-22 20:04                   ` Al Viro
2016-01-22 20:30                     ` Mike Marshall
2016-01-23  0:12                       ` Al Viro
2016-01-23  1:28                         ` Al Viro
2016-01-23  2:54                           ` Mike Marshall
2016-01-23 19:10                             ` Al Viro
2016-01-23 19:24                               ` Mike Marshall
2016-01-23 21:35                                 ` Mike Marshall
2016-01-23 22:05                                   ` Al Viro
2016-01-23 21:40                                 ` Al Viro
2016-01-23 22:36                                   ` Mike Marshall
2016-01-24  0:16                                     ` Al Viro
2016-01-24  4:05                                       ` Al Viro
2016-01-24 22:12                                         ` Mike Marshall
2016-01-30 17:22                                           ` Al Viro
2016-01-26 19:52                                         ` Martin Brandenburg
2016-01-30 17:34                                           ` Al Viro
2016-01-30 18:27                                             ` Al Viro
2016-02-04 23:30                                               ` Mike Marshall
2016-02-06 19:42                                                 ` Al Viro
2016-02-07  1:38                                                   ` Al Viro
2016-02-07  3:53                                                     ` Al Viro
2016-02-07 20:01                                                       ` [RFC] bufmap-related wait logics (Re: Orangefs ABI documentation) Al Viro
2016-02-08 22:26                                                       ` Orangefs ABI documentation Mike Marshall
2016-02-08 23:35                                                         ` Al Viro
2016-02-09  3:32                                                           ` Al Viro
2016-02-09 14:34                                                             ` Mike Marshall
2016-02-09 17:40                                                               ` Al Viro
2016-02-09 21:06                                                                 ` Al Viro
2016-02-09 22:25                                                                   ` Mike Marshall
2016-02-11 23:36                                                                   ` Mike Marshall
2016-02-09 22:02                                                                 ` Mike Marshall
2016-02-09 22:16                                                                   ` Al Viro
2016-02-09 22:40                                                                     ` Al Viro
2016-02-09 23:13                                                                       ` Al Viro
2016-02-10 16:44                                                                         ` Al Viro
2016-02-10 21:26                                                                           ` Al Viro
2016-02-11 23:54                                                                           ` Mike Marshall
2016-02-12  0:55                                                                             ` Al Viro
2016-02-12 12:13                                                                               ` Mike Marshall
2016-02-11  0:44                                                                         ` Al Viro
2016-02-11  3:22                                                                           ` Mike Marshall
2016-02-12  4:27                                                                             ` Al Viro
2016-02-12 12:26                                                                               ` Mike Marshall
2016-02-12 18:00                                                                                 ` Martin Brandenburg
2016-02-13 17:18                                                                                   ` Mike Marshall
2016-02-13 17:47                                                                                     ` Al Viro
2016-02-14  2:56                                                                                       ` Al Viro
2016-02-14  3:46                                                                                         ` [RFC] slot allocator - waitqueue use review needed (Re: Orangefs ABI documentation) Al Viro
2016-02-14  4:06                                                                                           ` Al Viro
2016-02-16  2:12                                                                                           ` Al Viro
2016-02-16 19:28                                                                                             ` Al Viro
2016-02-14 22:31                                                                                         ` Orangefs ABI documentation Mike Marshall
2016-02-14 23:43                                                                                           ` Al Viro
2016-02-15 17:46                                                                                             ` Mike Marshall
2016-02-15 18:45                                                                                               ` Al Viro
2016-02-15 22:32                                                                                                 ` Martin Brandenburg
2016-02-15 23:04                                                                                                   ` Al Viro
2016-02-16 23:15                                                                                                     ` Mike Marshall
2016-02-16 23:36                                                                                                       ` Al Viro
2016-02-16 23:54                                                                                                         ` Al Viro
2016-02-17 19:24                                                                                                           ` Mike Marshall
2016-02-17 20:11                                                                                                             ` Al Viro
2016-02-17 21:17                                                                                                               ` Al Viro
2016-02-17 22:24                                                                                                                 ` Mike Marshall [this message]
2016-02-17 22:40                                                                                                             ` Martin Brandenburg
2016-02-17 23:09                                                                                                               ` Al Viro
2016-02-17 23:15                                                                                                                 ` Al Viro
2016-02-18  0:04                                                                                                                   ` Al Viro
2016-02-18 11:11                                                                                                                     ` Al Viro
2016-02-18 18:58                                                                                                                       ` Mike Marshall
2016-02-18 19:20                                                                                                                         ` Al Viro
2016-02-18 19:49                                                                                                                         ` Martin Brandenburg
2016-02-18 20:08                                                                                                                           ` Mike Marshall
2016-02-18 20:22                                                                                                                             ` Mike Marshall
2016-02-18 20:38                                                                                                                               ` Mike Marshall
2016-02-18 20:52                                                                                                                                 ` Al Viro
2016-02-18 21:50                                                                                                                                   ` Mike Marshall
2016-02-19  0:25                                                                                                                                     ` Al Viro
2016-02-19 22:11                                                                                                                                       ` Mike Marshall
2016-02-19 22:22                                                                                                                                         ` Al Viro
2016-02-20 12:14                                                                                                                                           ` Mike Marshall
2016-02-20 13:36                                                                                                                                             ` Al Viro
2016-02-22 16:20                                                                                                                                               ` Mike Marshall
2016-02-22 21:22                                                                                                                                                 ` Mike Marshall
2016-02-23 21:58                                                                                                                                                   ` Mike Marshall
2016-02-26 20:21                                                                                                                                                     ` Mike Marshall
2016-02-19 22:32                                                                                                                                         ` Al Viro
2016-02-19 22:45                                                                                                                                           ` Martin Brandenburg
2016-02-19 22:50                                                                                                                                           ` Martin Brandenburg
2016-02-18 20:49                                                                                                                               ` Al Viro
2016-02-15 22:47                                                                                                 ` Mike Marshall
2016-01-23 22:46                                   ` write() semantics (Re: Orangefs ABI documentation) Al Viro
2016-01-23 23:35                                     ` Linus Torvalds
2016-03-03 22:25                                       ` Mike Marshall
2016-03-04 20:55                                         ` Mike Marshall
2016-01-22 20:51                     ` Orangefs ABI documentation Mike Marshall
2016-01-22 23:53                       ` Mike Marshall
2016-01-22 19:54                 ` Al Viro
2016-01-22 19:50             ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOg9mSRtGvMtr2b8xjFUdgJDgS7hb+dYtjEBBMKkatZJsbLuJg@mail.gmail.com \
    --to=hubcap@omnibond.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=martin@omnibond.com \
    --cc=sfr@canb.auug.org.au \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.