All of lore.kernel.org
 help / color / mirror / Atom feed
From: Artem Bityutskiy <dedekind1@gmail.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCHv2 07/11] writeback: do not remove bdi from bdi_list
Date: Wed, 21 Jul 2010 12:31:42 +0300	[thread overview]
Message-ID: <1279704706-1267-8-git-send-email-dedekind1@gmail.com> (raw)
In-Reply-To: <1279704706-1267-1-git-send-email-dedekind1@gmail.com>

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

The forker thread removes bdis from 'bdi_list' before forking the bdi thread.
But this is wrong for at least 2 reasons.

Reason #1: if we temporary remove a bdi from the list, we may miss works which
           would otherwise be given to us.

Reason #2: this is racy; indeed, 'bdi_wb_shutdown()' expects that bdis are
           always in the 'bdi_list' (see 'bdi_remove_from_list()'), and when
           it races with the forker thread, it can shut down the bdi thread
           at the same time as the forker creates it.

This patch makes sure the forker thread never removes bdis from 'bdi_list'
(which was suggested by Christoph Hellwig).

In order to make sure that we do not race with 'bdi_wb_shutdown()', we have to
hold the 'bdi_lock' while walking the 'bdi_list' and setting the 'BDI_pending'
flag.

NOTE! The error path is interesting. Currently, when we fail to create a bdi
thread, we move the bdi to the tail of 'bdi_list'. But if we never remove the
bdi from the list, we cannot move it to the tail either, because then we can
mess up the RCU readers which walk the list. And also, we'll have the race
described above in "Reason #2".

But I not think that adding to the tail is any important so I just do not do
that.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/fs-writeback.c |    7 -------
 mm/backing-dev.c  |   31 ++++++++++---------------------
 2 files changed, 10 insertions(+), 28 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8cf53ba..eaef5c9 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -804,13 +804,6 @@ int bdi_writeback_thread(void *data)
 	unsigned long wait_jiffies = -1UL;
 	long pages_written;
 
-	/*
-	 * Add us to the active bdi_list
-	 */
-	spin_lock_bh(&bdi_lock);
-	list_add_rcu(&bdi->bdi_list, &bdi_list);
-	spin_unlock_bh(&bdi_lock);
-
 	current->flags |= PF_FLUSHER | PF_SWAPWRITE;
 	set_freezable();
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index dbc6681..672c17b 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -331,7 +331,7 @@ static int bdi_forker_thread(void *ptr)
 	for (;;) {
 		bool fork = false;
 		struct task_struct *task;
-		struct backing_dev_info *bdi, *tmp;
+		struct backing_dev_info *bdi;
 
 		/*
 		 * Temporary measure, we want to make sure we don't see
@@ -347,7 +347,7 @@ static int bdi_forker_thread(void *ptr)
 		 * Check if any existing bdi's have dirty data without
 		 * a thread registered. If so, set that up.
 		 */
-		list_for_each_entry_safe(bdi, tmp, &bdi_list, bdi_list) {
+		list_for_each_entry(bdi, &bdi_list, bdi_list) {
 			if (!bdi_cap_writeback_dirty(bdi))
 				continue;
 			if (bdi->wb.task)
@@ -359,8 +359,13 @@ static int bdi_forker_thread(void *ptr)
 			WARN(!test_bit(BDI_registered, &bdi->state),
 			     "bdi %p/%s is not registered!\n", bdi, bdi->name);
 
-			list_del_rcu(&bdi->bdi_list);
 			fork = true;
+
+			/*
+			 * Set the pending bit - if someone will try to
+			 * unregister this bdi - it'll wait on this bit.
+			 */
+			set_bit(BDI_pending, &bdi->state);
 			break;
 		}
 		spin_unlock_bh(&bdi_lock);
@@ -383,29 +388,13 @@ static int bdi_forker_thread(void *ptr)
 
 		__set_current_state(TASK_RUNNING);
 
-		/*
-		 * Set the pending bit - if someone will try to unregister this
-		 * bdi - it'll wait on this bit.
-		 */
-		set_bit(BDI_pending, &bdi->state);
-
-		/* Make sure no one uses the picked bdi */
-		synchronize_rcu();
-
 		task = kthread_run(bdi_writeback_thread, &bdi->wb, "flush-%s",
 				   dev_name(bdi->dev));
 		if (IS_ERR(task)) {
 			/*
-			 * If thread creation fails, then readd the bdi back to
-			 * the list and force writeout of the bdi from this
-			 * forker thread. That will free some memory and we can
-			 * try again. Add it to the tail so we get a chance to
-			 * flush other bdi's to free memory.
+			 * If thread creation fails, force writeout of the bdi
+			 * from the thread.
 			 */
-			spin_lock_bh(&bdi_lock);
-			list_add_tail_rcu(&bdi->bdi_list, &bdi_list);
-			spin_unlock_bh(&bdi_lock);
-
 			bdi_flush_io(bdi);
 		} else
 			bdi->wb.task = task;
-- 
1.7.1.1


  parent reply	other threads:[~2010-07-21  9:32 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-21  9:31 [PATCHv2 00/16] kill unnecessary bdi wakeups + cleanups Artem Bityutskiy
2010-07-21  9:31 ` [PATCHv2 01/11] writeback: harmonize writeback threads naming Artem Bityutskiy
2010-07-21  9:31 ` [PATCHv2 02/11] writeback: fix possible race when creating bdi threads Artem Bityutskiy
2010-07-21 11:57   ` Christoph Hellwig
2010-07-21  9:31 ` [PATCHv2 03/11] writeback: do not lose wake-ups in the forker thread - 1 Artem Bityutskiy
2010-07-21  9:31 ` [PATCHv2 04/11] writeback: do not lose wake-ups in the forker thread - 2 Artem Bityutskiy
2010-07-21 11:57   ` Christoph Hellwig
2010-07-21  9:31 ` [PATCHv2 05/11] writeback: do not lose wake-ups in bdi threads Artem Bityutskiy
2010-07-21  9:31 ` [PATCHv2 06/11] writeback: simplify bdi code a little Artem Bityutskiy
2010-07-21 12:01   ` Christoph Hellwig
2010-07-21  9:31 ` Artem Bityutskiy [this message]
2010-07-21 12:02   ` [PATCHv2 07/11] writeback: do not remove bdi from bdi_list Christoph Hellwig
2010-07-21  9:31 ` [PATCHv2 08/11] writeback: move last_active to bdi Artem Bityutskiy
2010-07-21  9:31 ` [PATCHv2 09/11] writeback: restructure bdi forker loop a little Artem Bityutskiy
2010-07-21 12:02   ` Christoph Hellwig
2010-07-21  9:31 ` [PATCHv2 10/11] writeback: move bdi threads exiting logic to the forker thread Artem Bityutskiy
2010-07-21 12:04   ` Christoph Hellwig
2010-07-21  9:31 ` [PATCHv2 11/11] writeback: prevent unnecessary bdi threads wakeups Artem Bityutskiy
2010-07-21 11:45   ` Artem Bityutskiy
2010-07-22  0:41     ` Dave Chinner
2010-07-22  6:50       ` Artem Bityutskiy
2010-07-22  6:50         ` Artem Bityutskiy
2010-07-22  9:00       ` Christoph Hellwig
2010-07-22  9:24         ` Artem Bityutskiy
2010-07-22 13:27         ` Artem Bityutskiy
2010-07-22 13:27           ` Artem Bityutskiy
2010-07-21 12:12   ` Christoph Hellwig
2010-07-22  3:19   ` Nick Piggin
2010-07-22  6:48     ` Artem Bityutskiy
2010-07-22  7:22       ` Tero.Kristo
2010-07-22  7:22         ` Tero.Kristo
2010-07-22  8:07         ` Nick Piggin
2010-07-22  8:05       ` Nick Piggin
2010-07-22  8:02         ` Artem Bityutskiy
2010-07-22  8:02           ` Artem Bityutskiy
2010-07-22  8:59           ` Nick Piggin
2010-07-22  9:50         ` Artem Bityutskiy
2010-07-22  9:50           ` Artem Bityutskiy
2010-07-23 15:03         ` Artem Bityutskiy
2010-07-23 15:03           ` Artem Bityutskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1279704706-1267-8-git-send-email-dedekind1@gmail.com \
    --to=dedekind1@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.