linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] fuse: Interrupt-related optimizations and improvements
@ 2018-11-06  9:30 Kirill Tkhai
  2018-11-06  9:30 ` [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt() Kirill Tkhai
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

This patchset consists of several parts. Patches [1-2] optimize
the likely case of request_end() and make the function take
fiq->waitq.lock only when it is really needed. This speeds up
completion of the function.

Patch 3 makes request_end() call wake_up() only for non-background
requests (background requests never wait for an answer), which
optimizes the function a bit further.

Patches [4-6] make the code verify that userspace asks to requeue
the correct interrupt id, i.e. that we have really sent this interrupt.

---

Kirill Tkhai (6):
      fuse: Kill fasync only if interrupt is queued in queue_interrupt()
      fuse: Optimize request_end() by not taking fiq->waitq.lock
      fuse: Wake up req->waitq of only not background requests in request_end()
      fuse: Check for FR_SENT bit in fuse_dev_do_write()
      fuse: Do some refactoring in fuse_dev_do_write()
      fuse: Verify userspace asks to requeue interrupt that we really sent


 fs/fuse/dev.c |   73 +++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 53 insertions(+), 20 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt()
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
@ 2018-11-06  9:30 ` Kirill Tkhai
  2018-11-07 12:45   ` Miklos Szeredi
  2018-11-06  9:30 ` [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock Kirill Tkhai
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

We should send the signal only when the interrupt is really queued.
Not a real problem, but this makes the code clearer and more intuitive.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index fb2530ed84b3..7705f75c77a3 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -468,6 +468,8 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 
 static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 {
+	bool kill = false;
+
 	spin_lock(&fiq->waitq.lock);
 	if (test_bit(FR_FINISHED, &req->flags)) {
 		spin_unlock(&fiq->waitq.lock);
@@ -476,9 +478,11 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 	if (list_empty(&req->intr_entry)) {
 		list_add_tail(&req->intr_entry, &fiq->interrupts);
 		wake_up_locked(&fiq->waitq);
+		kill = true;
 	}
 	spin_unlock(&fiq->waitq.lock);
-	kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
+	if (kill)
+		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 }
 
 static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)



* [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
  2018-11-06  9:30 ` [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt() Kirill Tkhai
@ 2018-11-06  9:30 ` Kirill Tkhai
  2018-11-07 13:09   ` Miklos Szeredi
  2018-11-06  9:30 ` [PATCH 3/6] fuse: Wake up req->waitq of only not background requests in request_end() Kirill Tkhai
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

We take the global fiq->waitq.lock every time we enter
this function, but interrupted requests are only a small
subset of all requests. This patch optimizes request_end()
and makes it take the lock only when it's really needed.

queue_interrupt() needs a small change for that. After req
is linked to the interrupt list, we issue smp_mb() and check
FR_FINISHED again. If the FR_FINISHED bit has appeared,
we remove req and leave the function:

CPU 0                                                CPU 1
queue_interrupt()                                    request_end()

  spin_lock(&fiq->waitq.lock)


  list_add_tail(&req->intr_entry, &fiq->interrupts)    test_and_set_bit(FR_FINISHED, &req->flags)
  smp_mb()                                             <memory barrier implied by test_and_set_bit()>
  if (test_bit(FR_FINISHED, &req->flags))              if (!list_empty(&req->intr_entry))

    list_del_init(&req->intr_entry)                      spin_lock(&fiq->waitq.lock)
                                                         list_del_init(&req->intr_entry)

The effect of the change is visible in a perf report:

1) First, mount fusexmp_fh:
   $fuse/example/.libs/fusexmp_fh mnt

2) Run a test doing
     futimes(fd, tv1);
     futimes(fd, tv2);
   in many threads endlessly.

3) perf record -g (whole-system load)

Without the patch, we spend 62.58% of do_writev() time in request_end():
(= 12.58 / 20.10 * 100%)

   55,22% entry_SYSCALL_64
     20,10% do_writev
      ...
          18,08% fuse_dev_do_write
           12,58% request_end
            10,08% __wake_up_common_lock
            1,97% queued_spin_lock_slowpath
           1,31% fuse_copy_args
           1,04% fuse_copy_one
           0,85% queued_spin_lock_slowpath

With the patch, the perf report improves, and we spend only 58.16%
of do_writev() time in request_end():

   54,15% entry_SYSCALL_64
     18,24% do_writev
      ...
          16,25% fuse_dev_do_write
           10,61% request_end
            10,25% __wake_up_common_lock
           1,34% fuse_copy_args
           1,06% fuse_copy_one
           0,88% queued_spin_lock_slowpath

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |   30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 7705f75c77a3..391498e680ec 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -427,10 +427,16 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 
 	if (test_and_set_bit(FR_FINISHED, &req->flags))
 		goto put_request;
-
-	spin_lock(&fiq->waitq.lock);
-	list_del_init(&req->intr_entry);
-	spin_unlock(&fiq->waitq.lock);
+	/*
+	 * test_and_set_bit() implies smp_mb() between bit
+	 * changing and below intr_entry check. Pairs with
+	 * smp_mb() from queue_interrupt().
+	 */
+	if (!list_empty(&req->intr_entry)) {
+		spin_lock(&fiq->waitq.lock);
+		list_del_init(&req->intr_entry);
+		spin_unlock(&fiq->waitq.lock);
+	}
 	WARN_ON(test_bit(FR_PENDING, &req->flags));
 	WARN_ON(test_bit(FR_SENT, &req->flags));
 	if (test_bit(FR_BACKGROUND, &req->flags)) {
@@ -470,13 +476,21 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 {
 	bool kill = false;
 
-	spin_lock(&fiq->waitq.lock);
-	if (test_bit(FR_FINISHED, &req->flags)) {
-		spin_unlock(&fiq->waitq.lock);
+	if (test_bit(FR_FINISHED, &req->flags))
 		return;
-	}
+	spin_lock(&fiq->waitq.lock);
 	if (list_empty(&req->intr_entry)) {
 		list_add_tail(&req->intr_entry, &fiq->interrupts);
+		/*
+		 * Pairs with smp_mb() implied by test_and_set_bit()
+		 * from request_end().
+		 */
+		smp_mb();
+		if (test_bit(FR_FINISHED, &req->flags)) {
+			list_del_init(&req->intr_entry);
+			spin_unlock(&fiq->waitq.lock);
+			return;
+		}
 		wake_up_locked(&fiq->waitq);
 		kill = true;
 	}



* [PATCH 3/6] fuse: Wake up req->waitq of only not background requests in request_end()
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
  2018-11-06  9:30 ` [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt() Kirill Tkhai
  2018-11-06  9:30 ` [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock Kirill Tkhai
@ 2018-11-06  9:30 ` Kirill Tkhai
  2018-11-06  9:30 ` [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write() Kirill Tkhai
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

Currently, we wait on req->waitq only in the
request_wait_answer() function, and it is never used for
background requests. wake_up() is not a lightweight macro;
it expands into a real function call whose locking
operations take CPU cycles. Let's avoid calling it in the
case where we know for sure it is completely useless.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 391498e680ec..739968ee8b0c 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -464,8 +464,11 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 		fc->active_background--;
 		flush_bg_queue(fc);
 		spin_unlock(&fc->bg_lock);
+	} else {
+		/* Wake up waiter sleeping in request_wait_answer() */
+		wake_up(&req->waitq);
 	}
-	wake_up(&req->waitq);
+
 	if (req->end)
 		req->end(fc, req);
 put_request:



* [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write()
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
                   ` (2 preceding siblings ...)
  2018-11-06  9:30 ` [PATCH 3/6] fuse: Wake up req->waitq of only not background requests in request_end() Kirill Tkhai
@ 2018-11-06  9:30 ` Kirill Tkhai
  2018-11-07 13:16   ` Miklos Szeredi
  2018-11-06  9:30 ` [PATCH 5/6] fuse: Do some refactoring " Kirill Tkhai
  2018-11-06  9:31 ` [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent Kirill Tkhai
  5 siblings, 1 reply; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

It's not possible to have an answer to a request
before the request is actually sent. Add a sanity
check for that.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 739968ee8b0c..c603f1ebf0fd 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1947,7 +1947,7 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
 		goto err_unlock_pq;
 
 	req = request_find(fpq, oh.unique & ~FUSE_INT_REQ_BIT);
-	if (!req)
+	if (!req || !test_bit(FR_SENT, &req->flags))
 		goto err_unlock_pq;
 
 	/* Is it an interrupt reply ID? */



* [PATCH 5/6] fuse: Do some refactoring in fuse_dev_do_write()
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
                   ` (3 preceding siblings ...)
  2018-11-06  9:30 ` [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write() Kirill Tkhai
@ 2018-11-06  9:30 ` Kirill Tkhai
  2018-11-06  9:31 ` [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent Kirill Tkhai
  5 siblings, 0 replies; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:30 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

This is needed for the next patch.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index c603f1ebf0fd..315d395d5c02 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1955,18 +1955,14 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
 		__fuse_get_request(req);
 		spin_unlock(&fpq->lock);
 
-		err = -EINVAL;
-		if (nbytes != sizeof(struct fuse_out_header)) {
-			fuse_put_request(fc, req);
-			goto err_finish;
-		}
-
-		if (oh.error == -ENOSYS)
+		if (nbytes != sizeof(struct fuse_out_header))
+			nbytes = -EINVAL;
+		else if (oh.error == -ENOSYS)
 			fc->no_interrupt = 1;
 		else if (oh.error == -EAGAIN)
 			queue_interrupt(&fc->iq, req);
-		fuse_put_request(fc, req);
 
+		fuse_put_request(fc, req);
 		fuse_copy_finish(cs);
 		return nbytes;
 	}



* [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
  2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
                   ` (4 preceding siblings ...)
  2018-11-06  9:30 ` [PATCH 5/6] fuse: Do some refactoring " Kirill Tkhai
@ 2018-11-06  9:31 ` Kirill Tkhai
  2018-11-07 13:55   ` Miklos Szeredi
  5 siblings, 1 reply; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-06  9:31 UTC (permalink / raw)
  To: miklos, ktkhai, linux-fsdevel, linux-kernel

When queue_interrupt() is called from fuse_dev_do_write(),
the call comes directly from userspace. Userspace may pass
any request id, even the id of a request we have not
interrupted (or even of a background request). This patch
adds a sanity check to make the kernel safe against that.

The problem is that a real interrupt may be queued and
requeued by two tasks on two cpus. In this case, the
requeuer has no guarantee of seeing the FR_INTERRUPTED bit
on its cpu (since we know nothing about the way userspace
manages requests between its threads and whether it uses
smp barriers).

To eliminate this problem, the queuer writes the
FR_INTERRUPTED bit once again under fiq->waitq.lock, and
this guarantees the requeuer will see the bit when it
checks it.

I tried to find a solution that does not affect
performance and does not force us to take more locks.
This is the reason the solution below is worse:

   request_wait_answer()
   {
     ...
  +  spin_lock(&fiq->waitq.lock);
     set_bit(FR_INTERRUPTED, &req->flags);
  +  spin_unlock(&fiq->waitq.lock);
     ...
   }

Also, it does not look like a better idea to extend fuse_dev_do_read()
with the fix, since it's already a big function:

   fuse_dev_do_read()
   {
     ...
     if (test_bit(FR_INTERRUPTED, &req->flags)) {
  +      /* Comment */
  +      barrier();
  +      set_bit(FR_INTERRUPTED, &req->flags);
         queue_interrupt(fiq, req);
     }
     ...
   }

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 fs/fuse/dev.c |   26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 315d395d5c02..3bfc5ed61c9a 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 	fuse_put_request(fc, req);
 }
 
-static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
+static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 {
 	bool kill = false;
 
 	if (test_bit(FR_FINISHED, &req->flags))
-		return;
+		return 0;
 	spin_lock(&fiq->waitq.lock);
+	/* Check that we have really sent an interrupt for this req */
+	if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
+		spin_unlock(&fiq->waitq.lock);
+		return -EINVAL;
+	}
+	/*
+	 * Interrupt may be queued from fuse_dev_do_read(), and
+	 * later requeued on other cpu by fuse_dev_do_write().
+	 * To make FR_INTERRUPTED bit visible for the requeuer
+	 * (under fiq->waitq.lock) we write it once again.
+	 */
+	barrier();
+	__set_bit(FR_INTERRUPTED, &req->flags);
+
 	if (list_empty(&req->intr_entry)) {
 		list_add_tail(&req->intr_entry, &fiq->interrupts);
 		/*
@@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 		if (test_bit(FR_FINISHED, &req->flags)) {
 			list_del_init(&req->intr_entry);
 			spin_unlock(&fiq->waitq.lock);
-			return;
+			return 0;
 		}
 		wake_up_locked(&fiq->waitq);
 		kill = true;
@@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 	spin_unlock(&fiq->waitq.lock);
 	if (kill)
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
+	return (int)kill;
 }
 
 static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
@@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
 			nbytes = -EINVAL;
 		else if (oh.error == -ENOSYS)
 			fc->no_interrupt = 1;
-		else if (oh.error == -EAGAIN)
-			queue_interrupt(&fc->iq, req);
+		else if (oh.error == -EAGAIN &&
+			 queue_interrupt(&fc->iq, req) < 0)
+			nbytes = -EINVAL;
 
 		fuse_put_request(fc, req);
 		fuse_copy_finish(cs);



* Re: [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt()
  2018-11-06  9:30 ` [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt() Kirill Tkhai
@ 2018-11-07 12:45   ` Miklos Szeredi
  0 siblings, 0 replies; 14+ messages in thread
From: Miklos Szeredi @ 2018-11-07 12:45 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: linux-fsdevel, linux-kernel

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> We should send the signal only when the interrupt is really queued.
> Not a real problem, but this makes the code clearer and more intuitive.
>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  fs/fuse/dev.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index fb2530ed84b3..7705f75c77a3 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -468,6 +468,8 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>
>  static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>  {
> +       bool kill = false;
> +
>         spin_lock(&fiq->waitq.lock);
>         if (test_bit(FR_FINISHED, &req->flags)) {
>                 spin_unlock(&fiq->waitq.lock);
> @@ -476,9 +478,11 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>         if (list_empty(&req->intr_entry)) {
>                 list_add_tail(&req->intr_entry, &fiq->interrupts);
>                 wake_up_locked(&fiq->waitq);
> +               kill = true;
>         }
>         spin_unlock(&fiq->waitq.lock);
> -       kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
> +       if (kill)
> +               kill_fasync(&fiq->fasync, SIGIO, POLL_IN);

All other cases just do the kill_fasync() inside the fiq->waitq.lock
locked region.  That seems the simpler and more readable solution to
this.

Thanks,
Miklos


* Re: [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock
  2018-11-06  9:30 ` [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock Kirill Tkhai
@ 2018-11-07 13:09   ` Miklos Szeredi
  0 siblings, 0 replies; 14+ messages in thread
From: Miklos Szeredi @ 2018-11-07 13:09 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: linux-fsdevel, linux-kernel

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> We take the global fiq->waitq.lock every time we enter
> this function, but interrupted requests are only a small
> subset of all requests. This patch optimizes request_end()
> and makes it take the lock only when it's really needed.
>
> queue_interrupt() needs a small change for that. After req
> is linked to the interrupt list, we issue smp_mb() and check
> FR_FINISHED again. If the FR_FINISHED bit has appeared,
> we remove req and leave the function:
>
> CPU 0                                                CPU 1
> queue_interrupt()                                    request_end()
>
>   spin_lock(&fiq->waitq.lock)
>
>
>   list_add_tail(&req->intr_entry, &fiq->interrupts)    test_and_set_bit(FR_FINISHED, &req->flags)
>   smp_mb()                                             <memory barrier implied by test_and_set_bit()>
>   if (test_bit(FR_FINISHED, &req->flags))              if (!list_empty(&req->intr_entry))
>
>     list_del_init(&req->intr_entry)                      spin_lock(&fiq->waitq.lock)
>                                                          list_del_init(&req->intr_entry)
>
> The effect of the change is visible in a perf report:
>
> 1) First, mount fusexmp_fh:
>   $fuse/example/.libs/fusexmp_fh mnt
>
> 2) Run a test doing
>     futimes(fd, tv1);
>     futimes(fd, tv2);
>   in many threads endlessly.
>
> 3) perf record -g (whole-system load)
>
> Without the patch, we spend 62.58% of do_writev() time in request_end():
> (= 12.58 / 20.10 * 100%)
>
>    55,22% entry_SYSCALL_64
>      20,10% do_writev
>       ...
>           18,08% fuse_dev_do_write
>            12,58% request_end
>             10,08% __wake_up_common_lock
>             1,97% queued_spin_lock_slowpath
>            1,31% fuse_copy_args
>            1,04% fuse_copy_one
>            0,85% queued_spin_lock_slowpath
>
> With the patch, the perf report improves, and we spend only 58.16%
> of do_writev() time in request_end():
>
>    54,15% entry_SYSCALL_64
>      18,24% do_writev
>       ...
>           16,25% fuse_dev_do_write
>            10,61% request_end
>             10,25% __wake_up_common_lock
>            1,34% fuse_copy_args
>            1,06% fuse_copy_one
>            0,88% queued_spin_lock_slowpath
>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  fs/fuse/dev.c |   30 ++++++++++++++++++++++--------
>  1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 7705f75c77a3..391498e680ec 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -427,10 +427,16 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>
>         if (test_and_set_bit(FR_FINISHED, &req->flags))
>                 goto put_request;
> -
> -       spin_lock(&fiq->waitq.lock);
> -       list_del_init(&req->intr_entry);
> -       spin_unlock(&fiq->waitq.lock);
> +       /*
> +        * test_and_set_bit() implies smp_mb() between bit
> +        * changing and below intr_entry check. Pairs with
> +        * smp_mb() from queue_interrupt().
> +        */
> +       if (!list_empty(&req->intr_entry)) {
> +               spin_lock(&fiq->waitq.lock);
> +               list_del_init(&req->intr_entry);
> +               spin_unlock(&fiq->waitq.lock);
> +       }
>         WARN_ON(test_bit(FR_PENDING, &req->flags));
>         WARN_ON(test_bit(FR_SENT, &req->flags));
>         if (test_bit(FR_BACKGROUND, &req->flags)) {
> @@ -470,13 +476,21 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>  {
>         bool kill = false;
>
> -       spin_lock(&fiq->waitq.lock);
> -       if (test_bit(FR_FINISHED, &req->flags)) {
> -               spin_unlock(&fiq->waitq.lock);
> +       if (test_bit(FR_FINISHED, &req->flags))

The only case where this test would make sense is if this
were a performance-sensitive path, which it's not.

>                 return;
> -       }
> +       spin_lock(&fiq->waitq.lock);
>         if (list_empty(&req->intr_entry)) {
>                 list_add_tail(&req->intr_entry, &fiq->interrupts);
> +               /*
> +                * Pairs with smp_mb() implied by test_and_set_bit()
> +                * from request_end().
> +                */
> +               smp_mb();
> +               if (test_bit(FR_FINISHED, &req->flags)) {
> +                       list_del_init(&req->intr_entry);
> +                       spin_unlock(&fiq->waitq.lock);
> +                       return;
> +               }
>                 wake_up_locked(&fiq->waitq);
>                 kill = true;
>         }
>


* Re: [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write()
  2018-11-06  9:30 ` [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write() Kirill Tkhai
@ 2018-11-07 13:16   ` Miklos Szeredi
  0 siblings, 0 replies; 14+ messages in thread
From: Miklos Szeredi @ 2018-11-07 13:16 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: linux-fsdevel, linux-kernel

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> It's not possible to have an answer to a request
> before the request is actually sent. Add a sanity
> check for that.

It's checking for the impossible.  That sometimes makes sense as a
WARN_ON() or in special cases a BUG_ON().


>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  fs/fuse/dev.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 739968ee8b0c..c603f1ebf0fd 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -1947,7 +1947,7 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
>                 goto err_unlock_pq;
>
>         req = request_find(fpq, oh.unique & ~FUSE_INT_REQ_BIT);
> -       if (!req)
> +       if (!req || !test_bit(FR_SENT, &req->flags))
>                 goto err_unlock_pq;
>
>         /* Is it an interrupt reply ID? */
>


* Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
  2018-11-06  9:31 ` [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent Kirill Tkhai
@ 2018-11-07 13:55   ` Miklos Szeredi
  2018-11-07 14:25     ` Kirill Tkhai
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2018-11-07 13:55 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: linux-fsdevel, linux-kernel

On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> When queue_interrupt() is called from fuse_dev_do_write(),
> the call comes directly from userspace. Userspace may pass
> any request id, even the id of a request we have not
> interrupted (or even of a background request). This patch
> adds a sanity check to make the kernel safe against that.

Okay, I understand this far.

> The problem is that a real interrupt may be queued and
> requeued by two tasks on two cpus. In this case, the
> requeuer has no guarantee of seeing the FR_INTERRUPTED bit
> on its cpu (since we know nothing about the way userspace
> manages requests between its threads and whether it uses
> smp barriers).

This sounds like BS.  There's an explicit smp_mb__after_atomic()
after the set_bit(FR_INTERRUPTED, ...), which means FR_INTERRUPTED is
going to be visible on all CPUs after this; no need to fool around
with setting FR_INTERRUPTED again, etc...



>
> To eliminate this problem, the queuer writes the
> FR_INTERRUPTED bit once again under fiq->waitq.lock, and
> this guarantees the requeuer will see the bit when it
> checks it.
>
> I tried to find a solution that does not affect
> performance and does not force us to take more locks.
> This is the reason the solution below is worse:
>
>    request_wait_answer()
>    {
>      ...
>   +  spin_lock(&fiq->waitq.lock);
>      set_bit(FR_INTERRUPTED, &req->flags);
>   +  spin_unlock(&fiq->waitq.lock);
>      ...
>    }
>
> Also, it does not look like a better idea to extend fuse_dev_do_read()
> with the fix, since it's already a big function:
>
>    fuse_dev_do_read()
>    {
>      ...
>      if (test_bit(FR_INTERRUPTED, &req->flags)) {
>   +      /* Comment */
>   +      barrier();
>   +      set_bit(FR_INTERRUPTED, &req->flags);
>          queue_interrupt(fiq, req);
>      }
>      ...
>    }
>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  fs/fuse/dev.c |   26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 315d395d5c02..3bfc5ed61c9a 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>         fuse_put_request(fc, req);
>  }
>
> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>  {
>         bool kill = false;
>
>         if (test_bit(FR_FINISHED, &req->flags))
> -               return;
> +               return 0;
>         spin_lock(&fiq->waitq.lock);
> +       /* Check that we have really sent an interrupt for this req */
> +       if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
> +               spin_unlock(&fiq->waitq.lock);
> +               return -EINVAL;
> +       }
> +       /*
> +        * Interrupt may be queued from fuse_dev_do_read(), and
> +        * later requeued on other cpu by fuse_dev_do_write().
> +        * To make FR_INTERRUPTED bit visible for the requeuer
> +        * (under fiq->waitq.lock) we write it once again.
> +        */
> +       barrier();
> +       __set_bit(FR_INTERRUPTED, &req->flags);
> +
>         if (list_empty(&req->intr_entry)) {
>                 list_add_tail(&req->intr_entry, &fiq->interrupts);
>                 /*
> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>                 if (test_bit(FR_FINISHED, &req->flags)) {
>                         list_del_init(&req->intr_entry);
>                         spin_unlock(&fiq->waitq.lock);
> -                       return;
> +                       return 0;
>                 }
>                 wake_up_locked(&fiq->waitq);
>                 kill = true;
> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>         spin_unlock(&fiq->waitq.lock);
>         if (kill)
>                 kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
> +       return (int)kill;
>  }
>
>  static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
>                         nbytes = -EINVAL;
>                 else if (oh.error == -ENOSYS)
>                         fc->no_interrupt = 1;
> -               else if (oh.error == -EAGAIN)
> -                       queue_interrupt(&fc->iq, req);
> +               else if (oh.error == -EAGAIN &&
> +                        queue_interrupt(&fc->iq, req) < 0)
> +                       nbytes = -EINVAL;
>
>                 fuse_put_request(fc, req);
>                 fuse_copy_finish(cs);
>


* Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
  2018-11-07 13:55   ` Miklos Szeredi
@ 2018-11-07 14:25     ` Kirill Tkhai
  2018-11-07 14:45       ` Miklos Szeredi
  0 siblings, 1 reply; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-07 14:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

On 07.11.2018 16:55, Miklos Szeredi wrote:
> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>> When queue_interrupt() is called from fuse_dev_do_write(),
>> the call comes directly from userspace. Userspace may pass
>> any request id, even the id of a request we have not
>> interrupted (or even of a background request). This patch
>> adds a sanity check to make the kernel safe against that.
> 
> Okay, I understand this far.
> 
>> The problem is that a real interrupt may be queued and
>> requeued by two tasks on two cpus. In this case, the
>> requeuer has no guarantee of seeing the FR_INTERRUPTED bit
>> on its cpu (since we know nothing about the way userspace
>> manages requests between its threads and whether it uses
>> smp barriers).
> 
> This sounds like BS.  There's an explicit smp_mb__after_atomic()
> after the set_bit(FR_INTERRUPTED, ...), which means FR_INTERRUPTED is
> going to be visible on all CPUs after this; no need to fool around
> with setting FR_INTERRUPTED again, etc...

Hm, but how does it make the bit visible on all CPUs?

The problem is that an smp_mb_xxx() barrier on one cpu makes
sense only when paired with the appropriate barrier on the
second cpu. There is no visibility guarantee if the second
cpu does not have a barrier:

  CPU 1                  CPU2                        CPU3                       CPU4                        CPU5

  set FR_INTERRUPTED     set FR_SENT                                            
  <smp mb>               <smp mb>                          
  test FR_SENT (== 0)    test FR_INTERRUPTED (==1)
                         list_add[&req->intr_entry]  read[req by intr_entry]
                                                     <place to insert a barrier>
                                                     goto userspace
                                                     write in userspace buffer
                                                                                read from userspace buffer  
                                                                                write to userspace buffer
                                                                                                             read from userspace buffer
                                                                                                             enter kernel
                                                                                                             <place to insert a barrier>
                                                                                                             test FR_INTERRUPTED <- Not visible

The sequence:

set_bit(FR_INTERRUPTED, ...)
smp_mb__after_atomic();
test_bit(FR_SENT, &req->flags)

just guarantees the expected ordering on CPU2, which uses <smp mb>,
but it gives CPU5 no guarantees at all.


This five-CPU picture is a scheme illustrating one possible way userspace
may manage interrupts. The <place to insert a barrier> tags mark places where
we do not have barriers yet but could introduce them. Yet even if we introduce
them, there is no way such barriers can help against CPU4. That is why we have
to set the FR_INTERRUPTED bit again, to make it visible under the spinlock
on CPU5.
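The essential pattern of the scheme above can be reduced to message passing
through an intermediary with no writer-side barrier. A litmus sketch in the
herd7 / tools/memory-model syntax (variable names are hypothetical; 'flag'
stands in for FR_INTERRUPTED, 'published' for the intr_entry list linkage,
'relayed' for the hand-off through userspace) — herd7 should report the
exists clause as Allowed, since nothing orders P0's two stores:

```
C MP+relay+no-writer-barrier

{}

P0(int *flag, int *published)
{
	WRITE_ONCE(*flag, 1);      (* set_bit(FR_INTERRUPTED) *)
	WRITE_ONCE(*published, 1); (* list_add, no barrier in between *)
}

P1(int *published, int *relayed)
{
	int r0;

	r0 = READ_ONCE(*published);
	if (r0)
		WRITE_ONCE(*relayed, 1); (* hand the id to another thread *)
}

P2(int *flag, int *relayed)
{
	int r0;
	int r1;

	r0 = READ_ONCE(*relayed);
	r1 = READ_ONCE(*flag); (* no dependency, no acquire: unordered *)
}

exists (2:r0=1 /\ 2:r1=0)
```

P2 standing in for the requeuer can observe the relayed id while still not
seeing the flag, which is the situation the patch defends against.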

Thanks,
Kirill

>>
>> To eliminate this problem, queuer writes FR_INTERRUPTED
>> bit again under fiq->waitq.lock, and this guarantees
>> requeuer will see the bit, when checks it.
>>
>> I try to introduce solution, which does not affect on
>> performance, and which does not force to take more
>> locks. This is the reason, the below solution is worse:
>>
>>    request_wait_answer()
>>    {
>>      ...
>>   +  spin_lock(&fiq->waitq.lock);
>>      set_bit(FR_INTERRUPTED, &req->flags);
>>   +  spin_unlock(&fiq->waitq.lock);
>>      ...
>>    }
>>
>> Also, it does not look a better idea to extend fuse_dev_do_read()
>> with the fix, since it's already a big function:
>>
>>    fuse_dev_do_read()
>>    {
>>      ...
>>      if (test_bit(FR_INTERRUPTED, &req->flags)) {
>>   +      /* Comment */
>>   +      barrier();
>>   +      set_bit(FR_INTERRUPTED, &req->flags);
>>          queue_interrupt(fiq, req);
>>      }
>>      ...
>>    }
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  fs/fuse/dev.c |   26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 315d395d5c02..3bfc5ed61c9a 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
>>         fuse_put_request(fc, req);
>>  }
>>
>> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>  {
>>         bool kill = false;
>>
>>         if (test_bit(FR_FINISHED, &req->flags))
>> -               return;
>> +               return 0;
>>         spin_lock(&fiq->waitq.lock);
>> +       /* Check for we've sent request to interrupt this req */
>> +       if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
>> +               spin_unlock(&fiq->waitq.lock);
>> +               return -EINVAL;
>> +       }
>> +       /*
>> +        * Interrupt may be queued from fuse_dev_do_read(), and
>> +        * later requeued on other cpu by fuse_dev_do_write().
>> +        * To make FR_INTERRUPTED bit visible for the requeuer
>> +        * (under fiq->waitq.lock) we write it once again.
>> +        */
>> +       barrier();
>> +       __set_bit(FR_INTERRUPTED, &req->flags);
>> +
>>         if (list_empty(&req->intr_entry)) {
>>                 list_add_tail(&req->intr_entry, &fiq->interrupts);
>>                 /*
>> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>                 if (test_bit(FR_FINISHED, &req->flags)) {
>>                         list_del_init(&req->intr_entry);
>>                         spin_unlock(&fiq->waitq.lock);
>> -                       return;
>> +                       return 0;
>>                 }
>>                 wake_up_locked(&fiq->waitq);
>>                 kill = true;
>> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>         spin_unlock(&fiq->waitq.lock);
>>         if (kill)
>>                 kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
>> +       return (int)kill;
>>  }
>>
>>  static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
>> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
>>                         nbytes = -EINVAL;
>>                 else if (oh.error == -ENOSYS)
>>                         fc->no_interrupt = 1;
>> -               else if (oh.error == -EAGAIN)
>> -                       queue_interrupt(&fc->iq, req);
>> +               else if (oh.error == -EAGAIN &&
>> +                        queue_interrupt(&fc->iq, req) < 0)
>> +                       nbytes = -EINVAL;
>>
>>                 fuse_put_request(fc, req);
>>                 fuse_copy_finish(cs);
>>



* Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
  2018-11-07 14:25     ` Kirill Tkhai
@ 2018-11-07 14:45       ` Miklos Szeredi
  2018-11-07 16:40         ` Kirill Tkhai
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2018-11-07 14:45 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: linux-fsdevel, linux-kernel

On Wed, Nov 7, 2018 at 3:25 PM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> On 07.11.2018 16:55, Miklos Szeredi wrote:
>> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>> When queue_interrupt() is called from fuse_dev_do_write(),
>>> it came from userspace directly. Userspace may pass any
>>> request id, even the request's we have not interrupted
>>> (or even background's request). This patch adds sanity
>>> check to make kernel safe against that.
>>
>> Okay, I understand this far.
>>
>>> The problem is real interrupt may be queued and requeued
>>> by two tasks on two cpus. This case, the requeuer has not
>>> guarantees it sees FR_INTERRUPTED bit on its cpu (since
>>> we know nothing about the way userspace manages requests
>>> between its threads and whether it uses smp barriers).
>>
>> This sounds like BS. There's an explicit  smp_mb__after_atomic()
>> after the set_bit(FR_INTERRUPTED,...).  Which means FR_INTERRUPTED is
>> going to be visible on all CPU's after this, no need to fool around
>> with setting FR_INTERRUPTED again, etc...
>
> Hm, but how does it make the bit visible on all CPUS?
>
> The problem is that smp_mb_xxx() barrier on a cpu has a sense
> only in pair with the appropriate barrier on the second cpu.
> There is no guarantee for visibility, if second cpu does not
> have a barrier:
>
>   CPU 1                  CPU2                        CPU3                       CPU4                        CPU5
>
>   set FR_INTERRUPTED     set FR_SENT
>   <smp mb>               <smp mb>
>   test FR_SENT (== 0)    test FR_INTERRUPTED (==1)
>                          list_add[&req->intr_entry]  read[req by intr_entry]
>                                                      <place to insert a barrier>
>                                                      goto userspace
>                                                      write in userspace buffer
>                                                                                 read from userspace buffer
>                                                                                 write to userspace buffer
>                                                                                                              read from userspace buffer
>                                                                                                              enter kernel
>                                                                                                              <place to insert a barrier>
>                                                                                                              test FR_INTERRUPTED <- Not visible
>
> The sequence:
>
> set_bit(FR_INTERRUPTED, ...)
> smp_mb__after_atomic();
> test_bit(FR_SENT, &req->flags)
>
> just guarantees the expected order on CPU2, which uses <smp mb>,
> but CPU5 does not have any guarantees.

What you are missing is that there are other things that synchronize
memory access besides the memory barrier.  In your example the
ordering will be guaranteed by the fiq->waitq.lock in
queue_interrupt() on CPU2: the set_bit() cannot move past the one-way
barrier provided by the spin_unlock().

Thanks,
Miklos


* Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent
  2018-11-07 14:45       ` Miklos Szeredi
@ 2018-11-07 16:40         ` Kirill Tkhai
  0 siblings, 0 replies; 14+ messages in thread
From: Kirill Tkhai @ 2018-11-07 16:40 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

On 07.11.2018 17:45, Miklos Szeredi wrote:
> On Wed, Nov 7, 2018 at 3:25 PM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>> On 07.11.2018 16:55, Miklos Szeredi wrote:
>>> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>> When queue_interrupt() is called from fuse_dev_do_write(),
>>>> it came from userspace directly. Userspace may pass any
>>>> request id, even the request's we have not interrupted
>>>> (or even background's request). This patch adds sanity
>>>> check to make kernel safe against that.
>>>
>>> Okay, I understand this far.
>>>
>>>> The problem is real interrupt may be queued and requeued
>>>> by two tasks on two cpus. This case, the requeuer has not
>>>> guarantees it sees FR_INTERRUPTED bit on its cpu (since
>>>> we know nothing about the way userspace manages requests
>>>> between its threads and whether it uses smp barriers).
>>>
>>> This sounds like BS. There's an explicit  smp_mb__after_atomic()
>>> after the set_bit(FR_INTERRUPTED,...).  Which means FR_INTERRUPTED is
>>> going to be visible on all CPU's after this, no need to fool around
>>> with setting FR_INTERRUPTED again, etc...
>>
>> Hm, but how does it make the bit visible on all CPUS?
>>
>> The problem is that smp_mb_xxx() barrier on a cpu has a sense
>> only in pair with the appropriate barrier on the second cpu.
>> There is no guarantee for visibility, if second cpu does not
>> have a barrier:
>>
>>   CPU 1                  CPU2                        CPU3                       CPU4                        CPU5
>>
>>   set FR_INTERRUPTED     set FR_SENT
>>   <smp mb>               <smp mb>
>>   test FR_SENT (== 0)    test FR_INTERRUPTED (==1)
>>                          list_add[&req->intr_entry]  read[req by intr_entry]
>>                                                      <place to insert a barrier>
>>                                                      goto userspace
>>                                                      write in userspace buffer
>>                                                                                 read from userspace buffer
>>                                                                                 write to userspace buffer
>>                                                                                                              read from userspace buffer
>>                                                                                                              enter kernel
>>                                                                                                              <place to insert a barrier>
>>                                                                                                              test FR_INTERRUPTED <- Not visible
>>
>> The sequence:
>>
>> set_bit(FR_INTERRUPTED, ...)
>> smp_mb__after_atomic();
>> test_bit(FR_SENT, &req->flags)
>>
>> just guarantees the expected order on CPU2, which uses <smp mb>,
>> but CPU5 does not have any guarantees.
> 
> What you are missing is that there are other things that synchronize
> memory access besides the memory barrier.  In your example the
> ordering will be guaranteed by the fiq->waitq.lock in
> queue_interrupt() on CPU2: the set_bit() cannot move past the one-way
> barrier provided by the spin_unlock().

I thought RELEASE only ordered memory operations made on the same CPU.
Strangely enough, the memory-model test below says that's not true. OK, I'll
change the patch; thanks for pointing this out.


===== tools/memory-model/litmus-tests/test.litmus =====
C SB+test

{}

P0(atomic_t *flags)
{
	int r0;

	atomic_add(1, flags);
	smp_mb__after_atomic();
	r0 = (atomic_read(flags) & 2);
}

P1(atomic_t *flags, int *linked)
{
	int r0;

	atomic_add(2, flags);
	smp_mb__after_atomic();
	r0 = (atomic_read(flags) & 1);

	if (r0) {
		spin_lock(mylock);
		*linked = 1;
		spin_unlock(mylock);
	}
}

P2(atomic_t *flags, int *linked)
{
	int r0;

	spin_lock(mylock);
	if (*linked) {
		r0 = atomic_read(flags);
	} else
		r0 = 0;
	spin_unlock(mylock);
}

exists (0:r0=0 /\ 1:r0=1 /\ 2:r0=2)

===================================

kirill@pro:~/linux-next/tools/memory-model$ ~/.opam/system/bin/herd7 -conf linux-kernel.cfg litmus-tests/test.litmus 
Test SB+test Allowed
States 5
0:r0=0; 1:r0=1; 2:r0=0;
0:r0=0; 1:r0=1; 2:r0=3;
0:r0=2; 1:r0=0; 2:r0=0;
0:r0=2; 1:r0=1; 2:r0=0;
0:r0=2; 1:r0=1; 2:r0=3;
No
Witnesses
Positive: 0 Negative: 7
Condition exists (0:r0=0 /\ 1:r0=1 /\ 2:r0=2)
Observation SB+test Never 0 7
Time SB+test 0.09
Hash=0213fd54f80c511af2326e1bd120a96b


end of thread, other threads:[~2018-11-07 16:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-06  9:30 [PATCH 0/6] fuse: Interrupt-related optimizations and improvements Kirill Tkhai
2018-11-06  9:30 ` [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt() Kirill Tkhai
2018-11-07 12:45   ` Miklos Szeredi
2018-11-06  9:30 ` [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock Kirill Tkhai
2018-11-07 13:09   ` Miklos Szeredi
2018-11-06  9:30 ` [PATCH 3/6] fuse: Wake up req->waitq of only not background requests in request_end() Kirill Tkhai
2018-11-06  9:30 ` [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write() Kirill Tkhai
2018-11-07 13:16   ` Miklos Szeredi
2018-11-06  9:30 ` [PATCH 5/6] fuse: Do some refactoring " Kirill Tkhai
2018-11-06  9:31 ` [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent Kirill Tkhai
2018-11-07 13:55   ` Miklos Szeredi
2018-11-07 14:25     ` Kirill Tkhai
2018-11-07 14:45       ` Miklos Szeredi
2018-11-07 16:40         ` Kirill Tkhai
