From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
Date: Wed, 7 Nov 2018 17:25:52 +0300
Message-ID: <6f27b5a5-0092-b23f-b28e-341ae093a241@virtuozzo.com>
In-Reply-To: <CAJfpeguDTsG7vEAhH=CHp43vJak70VzR8YH8K6=vZAUXCZZeEQ@mail.gmail.com>

On 07.11.2018 16:55, Miklos Szeredi wrote:
> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>> When queue_interrupt() is called from fuse_dev_do_write(),
>> the request id comes directly from userspace. Userspace may pass
>> any request id, even that of a request we have not interrupted
>> (or even of a background request). This patch adds a sanity
>> check to make the kernel safe against that.
> 
> Okay, I understand this far.
> 
>> The problem is that a real interrupt may be queued and requeued
>> by two tasks on two CPUs. In this case, the requeuer has no
>> guarantee that it sees the FR_INTERRUPTED bit on its CPU (since
>> we know nothing about the way userspace manages requests
>> between its threads and whether it uses SMP barriers).
> 
> This sounds like BS. There's an explicit smp_mb__after_atomic()
> after the set_bit(FR_INTERRUPTED,...).  Which means FR_INTERRUPTED is
> going to be visible on all CPUs after this, no need to fool around
> with setting FR_INTERRUPTED again, etc...

Hm, but how does it make the bit visible on all CPUs?

The problem is that an smp_mb_xxx() barrier on one CPU only makes
sense when paired with an appropriate barrier on the other CPU.
There is no visibility guarantee if the second CPU does not
have a barrier:

  CPU 1                  CPU2                        CPU3                       CPU4                        CPU5

  set FR_INTERRUPTED     set FR_SENT                                            
  <smp mb>               <smp mb>                          
  test FR_SENT (== 0)    test FR_INTERRUPTED (==1)
                         list_add[&req->intr_entry]  read[req by intr_entry]
                                                     <place to insert a barrier>
                                                     goto userspace
                                                     write in userspace buffer
                                                                                read from userspace buffer  
                                                                                write to userspace buffer
                                                                                                             read from userspace buffer
                                                                                                             enter kernel
                                                                                                             <place to insert a barrier>
                                                                                                             test FR_INTERRUPTED <- Not visible

The sequence:

set_bit(FR_INTERRUPTED, ...)
smp_mb__after_atomic();
test_bit(FR_SENT, &req->flags)

only guarantees the expected ordering with respect to CPU2, which also
uses <smp mb>. CPU5 does not have any such guarantee.
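
As a rough userspace illustration of what "paired" means here, below is a
minimal sketch using C11 atomics. It is not kernel code: struct flag_pair
and both function names are made up for this example, the two flags are
kept in separate variables for simplicity, and atomic_thread_fence() is
only an approximate analogue of smp_mb__after_atomic(). The at-least-one-
side-sees-the-other guarantee exists only because both sides issue a full
fence between their store and their load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two request flags (not struct fuse_req). */
struct flag_pair {
	atomic_bool interrupted;	/* plays the role of FR_INTERRUPTED */
	atomic_bool sent;		/* plays the role of FR_SENT */
};

static void flag_pair_init(struct flag_pair *fp)
{
	assert(fp != NULL);
	atomic_init(&fp->interrupted, false);
	atomic_init(&fp->sent, false);
}

/* CPU1 side: set "interrupted", full barrier, then test "sent". */
static bool interrupter_sees_sent(struct flag_pair *fp)
{
	atomic_store_explicit(&fp->interrupted, true, memory_order_relaxed);
	/* Full fence; assumed here as a rough analogue of smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&fp->sent, memory_order_relaxed);
}

/* CPU2 side: set "sent", full barrier, then test "interrupted". */
static bool sender_sees_interrupted(struct flag_pair *fp)
{
	atomic_store_explicit(&fp->sent, true, memory_order_relaxed);
	/* The matching fence; without it the pairing above gives no guarantee. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&fp->interrupted, memory_order_relaxed);
}
```

If the two functions run concurrently on two threads, at least one of them
returns true. A third thread that only does a relaxed load of "interrupted",
with no fence or lock of its own, is not covered by this pairing, which is
the position CPU5 is in.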


This five-CPU picture is a scheme illustrating one possible way userspace
may manage interrupts. The <place to insert a barrier> tags mark places where
we do not have barriers yet, but where we could introduce them. But even if
we introduce them, there is no way such barriers can help against CPU4.
So this is the reason we have to set the FR_INTERRUPTED bit again under the
spinlock: it makes the bit visible on CPU5, which checks it under the same lock.
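
To make the locking argument concrete, here is a minimal userspace sketch of
the check-and-rewrite step, again with C11 atomics. Everything in it is a
simplification assumed for illustration: struct demo_req, struct demo_iqueue
and the demo_* helpers are hypothetical, a test-and-set spinlock stands in
for fiq->waitq.lock, a counter stands in for the fiq->interrupts list, and
the FR_FINISHED handling is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a request; not the kernel's struct fuse_req. */
struct demo_req {
	atomic_bool interrupted;	/* role of FR_INTERRUPTED */
	bool on_intr_list;		/* role of !list_empty(&req->intr_entry) */
};

/* Hypothetical miniature of the input queue. */
struct demo_iqueue {
	atomic_flag lock;		/* role of fiq->waitq.lock */
	int nr_interrupts;		/* role of the fiq->interrupts list */
};

static void demo_iqueue_init(struct demo_iqueue *q)
{
	atomic_flag_clear(&q->lock);
	q->nr_interrupts = 0;
}

static void demo_lock(struct demo_iqueue *q)
{
	/* Acquire: everything written before the previous unlock is visible. */
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin */
}

static void demo_unlock(struct demo_iqueue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Returns -EINVAL if no interrupt was sent, 1 if newly queued, 0 otherwise. */
static int demo_queue_interrupt(struct demo_iqueue *q, struct demo_req *req)
{
	int queued = 0;

	assert(q != NULL && req != NULL);
	demo_lock(q);
	if (!atomic_load_explicit(&req->interrupted, memory_order_relaxed)) {
		demo_unlock(q);
		return -EINVAL;
	}
	/*
	 * Write the flag once more inside the critical section, so that
	 * whoever takes the lock next observes it via release/acquire.
	 */
	atomic_store_explicit(&req->interrupted, true, memory_order_relaxed);
	if (!req->on_intr_list) {
		req->on_intr_list = true;
		q->nr_interrupts++;
		queued = 1;
	}
	demo_unlock(q);
	return queued;
}
```

The point of the sketch is that the requeuer no longer depends on any barrier
in userspace: the unlock by the queuer and the later lock by the requeuer
form the pair.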

Thanks,
Kirill

>>
>> To eliminate this problem, the queuer writes the FR_INTERRUPTED
>> bit again under fiq->waitq.lock, and this guarantees that the
>> requeuer will see the bit when it checks it.
>>
>> I tried to introduce a solution that does not affect
>> performance and does not force us to take more
>> locks. This is the reason the solution below is worse:
>>
>>    request_wait_answer()
>>    {
>>      ...
>>   +  spin_lock(&fiq->waitq.lock);
>>      set_bit(FR_INTERRUPTED, &req->flags);
>>   +  spin_unlock(&fiq->waitq.lock);
>>      ...
>>    }
>>
>> Also, it does not look like a better idea to extend fuse_dev_do_read()
>> with the fix, since it's already a big function:
>>
>>    fuse_dev_do_read()
>>    {
>>      ...
>>      if (test_bit(FR_INTERRUPTED, &req->flags)) {
>>   +      /* Comment */
>>   +      barrier();
>>   +      set_bit(FR_INTERRUPTED, &req->flags);
>>          queue_interrupt(fiq, req);
>>      }
>>      ...
>>    }
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  fs/fuse/dev.c |   26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 315d395d5c02..3bfc5ed61c9a 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>>         fuse_put_request(fc, req);
>>  }
>>
>> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>  {
>>         bool kill = false;
>>
>>         if (test_bit(FR_FINISHED, &req->flags))
>> -               return;
>> +               return 0;
>>         spin_lock(&fiq->waitq.lock);
>> +       /* Check for we've sent request to interrupt this req */
>> +       if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
>> +               spin_unlock(&fiq->waitq.lock);
>> +               return -EINVAL;
>> +       }
>> +       /*
>> +        * Interrupt may be queued from fuse_dev_do_read(), and
>> +        * later requeued on other cpu by fuse_dev_do_write().
>> +        * To make FR_INTERRUPTED bit visible for the requeuer
>> +        * (under fiq->waitq.lock) we write it once again.
>> +        */
>> +       barrier();
>> +       __set_bit(FR_INTERRUPTED, &req->flags);
>> +
>>         if (list_empty(&req->intr_entry)) {
>>                 list_add_tail(&req->intr_entry, &fiq->interrupts);
>>                 /*
>> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>                 if (test_bit(FR_FINISHED, &req->flags)) {
>>                         list_del_init(&req->intr_entry);
>>                         spin_unlock(&fiq->waitq.lock);
>> -                       return;
>> +                       return 0;
>>                 }
>>                 wake_up_locked(&fiq->waitq);
>>                 kill = true;
>> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>         spin_unlock(&fiq->waitq.lock);
>>         if (kill)
>>                 kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
>> +       return (int)kill;
>>  }
>>
>>  static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
>> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
>>                         nbytes = -EINVAL;
>>                 else if (oh.error == -ENOSYS)
>>                         fc->no_interrupt = 1;
>> -               else if (oh.error == -EAGAIN)
>> -                       queue_interrupt(&fc->iq, req);
>> +               else if (oh.error == -EAGAIN &&
>> +                        queue_interrupt(&fc->iq, req) < 0)
>> +                       nbytes = -EINVAL;
>>
>>                 fuse_put_request(fc, req);
>>                 fuse_copy_finish(cs);
>>
