linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 1/2] bdi: make sure congestion states are clear on free
@ 2018-02-02 17:53 Tejun Heo
  2018-02-02 17:54 ` [PATCH 2/2] FUSE: fix congested state leak on aborted connections Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Tejun Heo @ 2018-02-02 17:53 UTC
  To: Jens Axboe, Miklos Szeredi
  Cc: Joshua Miller, kernel-team, Johannes Weiner, Jan Kara, stable,
	linux-kernel, linux-fsdevel

FUSE fails to clear congestion states if a connection gets aborted
while congested.  This can leave nr_wb_congested[] stuck at a non-zero
value until reboot, causing wait_iff_congested() to wait spuriously.

While the bdi owner, FUSE, is primarily responsible for clearing
congestion states before destroying bdi_writebacks, the bdi layer can
ensure that congestion states are not leaked beyond the bdi_writeback
lifecycle.  Do so by clearing both states whenever a
bdi_writeback_congested is freed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joshua Miller <joshmiller@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
---
 include/linux/backing-dev.h |   14 +++++++++++++-
 mm/backing-dev.c            |    2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)
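
The leak and the free-time fix can be modelled in userspace.  The
sketch below is only an illustration, not kernel code: struct
congested, nr_congested[] and the helper names are hypothetical
stand-ins for struct bdi_writeback_congested, nr_wb_congested[] and
set/clear_wb_congested(), and plain ints replace the kernel's atomics
and bit operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-ins for BLK_RW_ASYNC / BLK_RW_SYNC. */
enum { RW_ASYNC = 0, RW_SYNC = 1 };

/* Global per-direction count of congested objects (models nr_wb_congested[]). */
static int nr_congested[2];

/* Hypothetical stand-in for struct bdi_writeback_congested. */
struct congested {
	bool state[2];	/* per-direction congested flag */
	int refcnt;
};

static struct congested *congested_alloc(void)
{
	struct congested *c = calloc(1, sizeof(*c));

	if (c)
		c->refcnt = 1;
	return c;
}

/* Bump the global counter only on a 0 -> 1 flag transition. */
static void set_congested(struct congested *c, int sync)
{
	if (!c->state[sync]) {
		c->state[sync] = true;
		nr_congested[sync]++;
	}
}

/* Drop the global counter only on a 1 -> 0 flag transition. */
static void clear_congested(struct congested *c, int sync)
{
	if (c->state[sync]) {
		c->state[sync] = false;
		nr_congested[sync]--;
	}
}

/* Pre-patch behaviour: a still-set flag is lost together with the object. */
static void congested_put_leaky(struct congested *c)
{
	if (--c->refcnt == 0)
		free(c);
}

/* Post-patch behaviour: clear both directions before freeing. */
static void congested_put_fixed(struct congested *c)
{
	if (--c->refcnt == 0) {
		clear_congested(c, RW_SYNC);
		clear_congested(c, RW_ASYNC);
		free(c);
	}
}
```

Dropping the last reference with congested_put_leaky() while a flag
is set leaves nr_congested[] permanently non-zero, since nothing can
reach the freed object to clear it.  congested_put_fixed() returns
the counters to zero regardless of what the owner forgot to do.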

--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -220,6 +220,18 @@ static inline int bdi_sched_wait(void *w
 	return 0;
 }
 
+static inline void __wb_congested_free(struct bdi_writeback_congested *congested)
+{
+	/*
+	 * Make sure congestion states are cleared before freeing to avoid
+	 * nr_wb_congested[] corruption which can lead to misbehaving
+	 * wait_iff_congested().
+	 */
+	clear_wb_congested(congested, BLK_RW_SYNC);
+	clear_wb_congested(congested, BLK_RW_ASYNC);
+	kfree(congested);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct bdi_writeback_congested *
@@ -409,7 +421,7 @@ wb_congested_get_create(struct backing_d
 static inline void wb_congested_put(struct bdi_writeback_congested *congested)
 {
 	if (atomic_dec_and_test(&congested->refcnt))
-		kfree(congested);
+		__wb_congested_free(congested);
 }
 
 static inline struct bdi_writeback *wb_find_current(struct backing_dev_info *bdi)
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -509,7 +509,7 @@ void wb_congested_put(struct bdi_writeba
 	}
 
 	spin_unlock_irqrestore(&cgwb_lock, flags);
-	kfree(congested);
+	__wb_congested_free(congested);
 }
 
 static void cgwb_release_workfn(struct work_struct *work)


* [PATCH 2/2] FUSE: fix congested state leak on aborted connections
  2018-02-02 17:53 [PATCH 1/2] bdi: make sure congestion states are clear on free Tejun Heo
@ 2018-02-02 17:54 ` Tejun Heo
  2018-02-06 16:25   ` Jan Kara
  2018-02-05 23:02 ` [PATCH 1/2] bdi: make sure congestion states are clear on free Johannes Weiner
  2018-02-06 16:19 ` Jan Kara
  2 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2018-02-02 17:54 UTC
  To: Jens Axboe, Miklos Szeredi
  Cc: Joshua Miller, kernel-team, Johannes Weiner, Jan Kara, stable,
	linux-kernel, linux-fsdevel

If a connection gets aborted while congested, FUSE can leave
nr_wb_congested[] stuck until reboot, causing wait_iff_congested() to
wait spuriously, which can lead to severe performance degradation.

The leak is caused by gating congestion state clearing on the
fc->connected test in request_end().  The test was added back in 2009
by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
description doesn't explain why the test was added, it was most likely
there to avoid dereferencing the bdi after it got destroyed.

Since then, bdi lifetime rules have changed many times, and we are now
always guaranteed to have access to the bdi while the superblock is
alive (fc->sb).

Drop the fc->connected conditional to avoid leaking congestion states.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joshua Miller <joshmiller@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
---
 fs/fuse/dev.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
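
The effect of the fc->connected gate can be reproduced with a small
userspace model.  This is a sketch under stated assumptions, not the
FUSE code: struct conn and its helpers are hypothetical, has_sb
stands in for fc->sb, bdi_congested stands in for the bdi congestion
state, and an abort is modelled simply as connected going false
before the outstanding background requests are ended.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct fuse_conn. */
struct conn {
	unsigned int num_background;
	unsigned int congestion_threshold;
	bool connected;
	bool has_sb;		/* models fc->sb != NULL */
	bool bdi_congested;	/* models the bdi congestion state */
};

/* Queue one background request; mark congested when the threshold is hit. */
static void queue_background(struct conn *fc)
{
	fc->num_background++;
	if (fc->num_background == fc->congestion_threshold && fc->has_sb)
		fc->bdi_congested = true;
}

/*
 * End one background request.  gate_on_connected selects the pre-patch
 * condition (true) or the post-patch condition (false).
 */
static void end_background(struct conn *fc, bool gate_on_connected)
{
	bool may_clear = fc->has_sb && (!gate_on_connected || fc->connected);

	if (fc->num_background == fc->congestion_threshold && may_clear)
		fc->bdi_congested = false;
	fc->num_background--;
}

/* Become congested, abort, drain; report whether the state leaked. */
static bool congested_after_abort(bool gate_on_connected)
{
	struct conn fc = {
		.congestion_threshold = 2,
		.connected = true,
		.has_sb = true,
	};

	queue_background(&fc);
	queue_background(&fc);	/* threshold reached: congested */
	queue_background(&fc);

	fc.connected = false;	/* assumed model of a connection abort */
	while (fc.num_background)
		end_background(&fc, gate_on_connected);

	return fc.bdi_congested;
}
```

In this model congested_after_abort(true) reports a leaked congested
state, because the clearing branch is skipped once connected is
false, while congested_after_abort(false) drains cleanly.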

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -381,8 +381,7 @@ static void request_end(struct fuse_conn
 		if (!fc->blocked && waitqueue_active(&fc->blocked_waitq))
 			wake_up(&fc->blocked_waitq);
 
-		if (fc->num_background == fc->congestion_threshold &&
-		    fc->connected && fc->sb) {
+		if (fc->num_background == fc->congestion_threshold && fc->sb) {
 			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_SYNC);
 			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_ASYNC);
 		}


* Re: [PATCH 1/2] bdi: make sure congestion states are clear on free
  2018-02-02 17:53 [PATCH 1/2] bdi: make sure congestion states are clear on free Tejun Heo
  2018-02-02 17:54 ` [PATCH 2/2] FUSE: fix congested state leak on aborted connections Tejun Heo
@ 2018-02-05 23:02 ` Johannes Weiner
  2018-02-06 16:19 ` Jan Kara
  2 siblings, 0 replies; 6+ messages in thread
From: Johannes Weiner @ 2018-02-05 23:02 UTC
  To: Tejun Heo
  Cc: Jens Axboe, Miklos Szeredi, Joshua Miller, kernel-team, Jan Kara,
	stable, linux-kernel, linux-fsdevel

On Fri, Feb 02, 2018 at 09:53:28AM -0800, Tejun Heo wrote:
> FUSE has a bug where it fails to clear congestion states if a
> connection gets aborted while congested, which can leave
> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
> wait spuriously.
> 
> While the bdi owner, FUSE, is primarily responsible for clearing
> congestion states before destroying bdi_writebacks, bdi layer can
> ensure that congestion states are not leaked beyond bdi_writeback
> lifecycle.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joshua Miller <joshmiller@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH 1/2] bdi: make sure congestion states are clear on free
  2018-02-02 17:53 [PATCH 1/2] bdi: make sure congestion states are clear on free Tejun Heo
  2018-02-02 17:54 ` [PATCH 2/2] FUSE: fix congested state leak on aborted connections Tejun Heo
  2018-02-05 23:02 ` [PATCH 1/2] bdi: make sure congestion states are clear on free Johannes Weiner
@ 2018-02-06 16:19 ` Jan Kara
  2 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2018-02-06 16:19 UTC
  To: Tejun Heo
  Cc: Jens Axboe, Miklos Szeredi, Joshua Miller, kernel-team,
	Johannes Weiner, Jan Kara, stable, linux-kernel, linux-fsdevel

On Fri 02-02-18 09:53:28, Tejun Heo wrote:
> FUSE has a bug where it fails to clear congestion states if a
> connection gets aborted while congested, which can leave
> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
> wait spuriously.
> 
> While the bdi owner, FUSE, is primarily responsible for clearing
> congestion states before destroying bdi_writebacks, bdi layer can
> ensure that congestion states are not leaked beyond bdi_writeback
> lifecycle.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joshua Miller <joshmiller@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  include/linux/backing-dev.h |   14 +++++++++++++-
>  mm/backing-dev.c            |    2 +-
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -220,6 +220,18 @@ static inline int bdi_sched_wait(void *w
>  	return 0;
>  }
>  
> +static inline void __wb_congested_free(struct bdi_writeback_congested *congested)
> +{
> +	/*
> +	 * Make sure congestion states are cleared before freeing to avoid
> +	 * nr_wb_congested[] corruption which can lead to misbehaving
> +	 * wait_iff_congested().
> +	 */
> +	clear_wb_congested(congested, BLK_RW_SYNC);
> +	clear_wb_congested(congested, BLK_RW_ASYNC);
> +	kfree(congested);
> +}
> +
>  #ifdef CONFIG_CGROUP_WRITEBACK
>  
>  struct bdi_writeback_congested *
> @@ -409,7 +421,7 @@ wb_congested_get_create(struct backing_d
>  static inline void wb_congested_put(struct bdi_writeback_congested *congested)
>  {
>  	if (atomic_dec_and_test(&congested->refcnt))
> -		kfree(congested);
> +		__wb_congested_free(congested);
>  }
>  
>  static inline struct bdi_writeback *wb_find_current(struct backing_dev_info *bdi)
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -509,7 +509,7 @@ void wb_congested_put(struct bdi_writeba
>  	}
>  
>  	spin_unlock_irqrestore(&cgwb_lock, flags);
> -	kfree(congested);
> +	__wb_congested_free(congested);
>  }
>  
>  static void cgwb_release_workfn(struct work_struct *work)
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 2/2] FUSE: fix congested state leak on aborted connections
  2018-02-02 17:54 ` [PATCH 2/2] FUSE: fix congested state leak on aborted connections Tejun Heo
@ 2018-02-06 16:25   ` Jan Kara
  2018-05-30 14:22     ` Miklos Szeredi
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2018-02-06 16:25 UTC
  To: Tejun Heo
  Cc: Jens Axboe, Miklos Szeredi, Joshua Miller, kernel-team,
	Johannes Weiner, Jan Kara, stable, linux-kernel, linux-fsdevel

On Fri 02-02-18 09:54:14, Tejun Heo wrote:
> If a connection gets aborted while congested, FUSE can leave
> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
> wait spuriously which can lead to severe performance degradation.
> 
> The leak is caused by gating congestion state clearing with
> fc->connected test in request_end().  This was added way back in 2009
> by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
> description doesn't explain why the test was added, it most likely was
> to avoid dereferencing bdi after it got destroyed.
> 
> Since then, bdi lifetime rules have changed many times and now we're
> always guaranteed to have access to the bdi while the superblock is
> alive (fc->sb).
> 
> Drop fc->connected conditional to avoid leaking congestion states.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joshua Miller <joshmiller@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jan Kara <jack@suse.cz>
> Cc: stable@vger.kernel.org

Yeah, this should be fine AFAICT but my knowledge of FUSE is very cursory.
Anyway:

Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fuse/dev.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -381,8 +381,7 @@ static void request_end(struct fuse_conn
>  		if (!fc->blocked && waitqueue_active(&fc->blocked_waitq))
>  			wake_up(&fc->blocked_waitq);
>  
> -		if (fc->num_background == fc->congestion_threshold &&
> -		    fc->connected && fc->sb) {
> +		if (fc->num_background == fc->congestion_threshold && fc->sb) {
>  			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_SYNC);
>  			clear_bdi_congested(fc->sb->s_bdi, BLK_RW_ASYNC);
>  		}
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 2/2] FUSE: fix congested state leak on aborted connections
  2018-02-06 16:25   ` Jan Kara
@ 2018-05-30 14:22     ` Miklos Szeredi
  0 siblings, 0 replies; 6+ messages in thread
From: Miklos Szeredi @ 2018-05-30 14:22 UTC
  To: Jan Kara
  Cc: Tejun Heo, Jens Axboe, Joshua Miller, kernel-team,
	Johannes Weiner, stable, linux-kernel, linux-fsdevel

On Tue, Feb 6, 2018 at 5:25 PM, Jan Kara <jack@suse.cz> wrote:
> On Fri 02-02-18 09:54:14, Tejun Heo wrote:
>> If a connection gets aborted while congested, FUSE can leave
>> nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
>> wait spuriously which can lead to severe performance degradation.
>>
>> The leak is caused by gating congestion state clearing with
>> fc->connected test in request_end().  This was added way back in 2009
>> by 26c3679101db ("fuse: destroy bdi on umount").  While the commit
>> description doesn't explain why the test was added, it most likely was
>> to avoid dereferencing bdi after it got destroyed.
>>
>> Since then, bdi lifetime rules have changed many times and now we're
>> always guaranteed to have access to the bdi while the superblock is
>> alive (fc->sb).
>>
>> Drop fc->connected conditional to avoid leaking congestion states.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> Reported-by: Joshua Miller <joshmiller@fb.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Miklos Szeredi <miklos@szeredi.hu>
>> Cc: Jan Kara <jack@suse.cz>
>> Cc: stable@vger.kernel.org
>
> Yeah, this should be fine AFAICT but my knowledge of FUSE is very cursory.
> Anyway:
>
> Acked-by: Jan Kara <jack@suse.cz>

Can't say I fully understand how the global "is any bdi congested"
state is used in direct reclaim, but the patch is an obvious
improvement, so applied.

Thanks,
Miklos
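
For readers with the same question about the global state, here is a
toy model of the decision a reclaimer makes.  It is a sketch, not the
kernel's wait_iff_congested(): the function and enum names are
hypothetical, and the node_congested input is an assumption about an
additional per-node check that this thread does not describe.  What
the thread does establish is that a non-zero nr_wb_congested[] entry
can make wait_iff_congested() wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of the congestion check in reclaim. */
enum reclaim_action {
	RECLAIM_YIELD,	/* no throttling, at most a reschedule */
	RECLAIM_SLEEP,	/* sleep up to the timeout */
};

/*
 * nr_congested models one entry of the global nr_wb_congested[] array.
 * node_congested is an assumed per-node flag, see the note above.
 */
static enum reclaim_action reclaim_wait_action(int nr_congested,
					       bool node_congested)
{
	if (nr_congested == 0 || !node_congested)
		return RECLAIM_YIELD;
	return RECLAIM_SLEEP;
}
```

Under this model a counter leaked by an aborted connection never
returns to zero, so the first condition can no longer short-circuit
the sleep, which matches the spurious waits described in the patch.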
