All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ming Lei <ming.lei@canonical.com>,
	Alex Riesen <raa.lkml@gmail.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Jens Axboe <axboe@kernel.dk>,
	USB list <linux-usb@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Arjan van de Ven <arjan@linux.intel.com>
Subject: [PATCH] async: fix __lowest_in_progress()
Date: Wed, 16 Jan 2013 09:19:52 -0800	[thread overview]
Message-ID: <20130116171952.GQ2668@htj.dyndns.org> (raw)
In-Reply-To: <20130115183204.GE2668@htj.dyndns.org>

083b804c4d3e1e3d0eace56bdbc0f674946d2847 ("async: use workqueue for
worker pool") made it possible that async jobs are moved from pending
to running out-of-order.  While pending async jobs will be queued and
dispatched for execution in the same order, nothing guarantees they'll
enter "1) move self to the running queue" of async_run_entry_fn() in
the same order.

This broke __lowest_in_progress().  running->domain may not be
properly sorted and is not guaranteed to contain lower cookies than
pending list when not empty.  Fix it by ensuring sort-inserting to the
running list and always looking at both pending and running when
trying to determine the lowest cookie.

Over time, the async synchronization implementation became quite
messy.  We better restructure it such that each async_entry is linked
to two lists - one global and one per domain - and not move it when
execution starts.  There's no reason to distinguish pending and
running.  They behave the same for synchronization purposes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: stable@vger.kernel.org
---
And here's the fix for the breakage I mentioned earlier.  It wouldn't
happen often in the wild and the effect of it happening wouldn't be
critical for modern distros but it's still kinda surprising nobody
noticed this.

We definitely need to rewrite async synchronization.  It was already
messy and this makes it worse and there's no reason to be messy here.

Thanks.

 kernel/async.c |   27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

--- a/kernel/async.c
+++ b/kernel/async.c
@@ -86,18 +86,27 @@ static atomic_t entry_count;
  */
 static async_cookie_t  __lowest_in_progress(struct async_domain *running)
 {
+	async_cookie_t first_running = next_cookie;	/* infinity value */
+	async_cookie_t first_pending = next_cookie;	/* ditto */
 	struct async_entry *entry;
 
+	/*
+	 * Both running and pending lists are sorted but not disjoint.
+	 * Take the first cookies from both and return the min.
+	 */
 	if (!list_empty(&running->domain)) {
 		entry = list_first_entry(&running->domain, typeof(*entry), list);
-		return entry->cookie;
+		first_running = entry->cookie;
 	}
 
-	list_for_each_entry(entry, &async_pending, list)
-		if (entry->running == running)
-			return entry->cookie;
+	list_for_each_entry(entry, &async_pending, list) {
+		if (entry->running == running) {
+			first_pending = entry->cookie;
+			break;
+		}
+	}
 
-	return next_cookie;	/* "infinity" value */
+	return min(first_running, first_pending);
 }
 
 static async_cookie_t  lowest_in_progress(struct async_domain *running)
@@ -118,13 +127,17 @@ static void async_run_entry_fn(struct wo
 {
 	struct async_entry *entry =
 		container_of(work, struct async_entry, work);
+	struct async_entry *pos;
 	unsigned long flags;
 	ktime_t uninitialized_var(calltime), delta, rettime;
 	struct async_domain *running = entry->running;
 
-	/* 1) move self to the running queue */
+	/* 1) move self to the running queue, make sure it stays sorted */
 	spin_lock_irqsave(&async_lock, flags);
-	list_move_tail(&entry->list, &running->domain);
+	list_for_each_entry_reverse(pos, &running->domain, list)
+		if (entry->cookie < pos->cookie)
+			break;
+	list_move_tail(&entry->list, &pos->list);
 	spin_unlock_irqrestore(&async_lock, flags);
 
 	/* 2) run (and print duration) */

  parent reply	other threads:[~2013-01-16 17:19 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-11 21:04 USB device cannot be reconnected and khubd "blocked for more than 120 seconds" Alex Riesen
2013-01-12  7:48 ` Alex Riesen
2013-01-12  9:18   ` Lan Tianyu
2013-01-12 17:37   ` Alan Stern
2013-01-12 19:39     ` Alex Riesen
2013-01-12 20:33       ` Alex Riesen
2013-01-12 22:52         ` Alan Stern
2013-01-13 12:09           ` Alex Riesen
2013-01-13 16:56             ` Alan Stern
2013-01-13 17:42               ` Alex Riesen
2013-01-13 19:16                 ` Oliver Neukum
2013-01-14  2:39                   ` Alan Stern
2013-01-14 16:43                     ` Alex Riesen
2013-01-14  3:47                 ` Ming Lei
2013-01-14  7:15                   ` Ming Lei
2013-01-14 17:30                     ` Linus Torvalds
2013-01-14 18:04                       ` Alan Stern
2013-01-14 18:34                         ` Linus Torvalds
2013-01-15  1:53                       ` Ming Lei
2013-01-15  6:23                         ` Ming Lei
2013-01-15 17:36                           ` Linus Torvalds
2013-01-15 18:18                             ` Linus Torvalds
2013-01-15 23:17                               ` Tejun Heo
2013-01-15 18:20                             ` Alan Stern
2013-01-15 18:39                               ` Tejun Heo
2013-01-15 18:32                             ` Tejun Heo
2013-01-15 20:18                               ` Linus Torvalds
2013-01-15 23:50                                 ` Tejun Heo
2013-01-16  0:25                                   ` Arjan van de Ven
2013-01-16  0:35                                     ` Tejun Heo
2013-01-16  4:01                                       ` Alan Stern
2013-01-16 16:12                                         ` Tejun Heo
2013-01-16 17:01                                           ` Alan Stern
2013-01-16 17:37                                             ` Tejun Heo
2013-01-16 17:51                                               ` Alan Stern
2013-01-16  0:36                                   ` Linus Torvalds
2013-01-16  0:40                                     ` Linus Torvalds
2013-01-16  2:52                                       ` [PATCH] module, async: async_synchronize_full() on module init iff async is used Tejun Heo
2013-01-16  3:00                                         ` Linus Torvalds
2013-01-16  3:25                                           ` Tejun Heo
2013-01-16  3:37                                             ` Linus Torvalds
2013-01-16 16:22                                               ` Arjan van de Ven
2013-01-16 16:48                                               ` Tejun Heo
2013-01-16 17:03                                                 ` Arjan van de Ven
2013-01-16 17:06                                                   ` Linus Torvalds
2013-01-16 21:30                                                     ` [PATCH 1/2] init, block: try to load default elevator module early during boot Tejun Heo
2013-01-17 18:05                                                       ` Linus Torvalds
2013-01-17 18:38                                                         ` Tejun Heo
2013-01-17 18:46                                                           ` Linus Torvalds
2013-01-17 18:59                                                             ` Tejun Heo
2013-01-17 19:00                                                               ` Linus Torvalds
2013-01-18  1:24                                                         ` [PATCH 1/3] workqueue: set PF_WQ_WORKER on rescuers Tejun Heo
2013-01-18  1:25                                                         ` [PATCH 2/3] workqueue, async: implement work/async_current_func() Tejun Heo
2013-01-18  2:47                                                           ` Linus Torvalds
2013-01-18  2:59                                                             ` Tejun Heo
2013-01-18  3:04                                                               ` Tejun Heo
2013-01-18  3:18                                                                 ` Linus Torvalds
2013-01-18  3:47                                                                   ` Tejun Heo
2013-01-18 22:08                                                                   ` [PATCH 1/5] workqueue: set PF_WQ_WORKER on rescuers Tejun Heo
2013-01-18 22:10                                                                   ` [PATCH 2/5] workqueue: rename kernel/workqueue_sched.h to kernel/workqueue_internal.h Tejun Heo
2013-01-18 22:11                                                                   ` [PATCH 3/5] workqueue: move struct worker definition to workqueue_internal.h Tejun Heo
2013-01-18 22:11                                                                   ` [PATCH 4/5] workqueue: implement current_is_async() Tejun Heo
2013-01-18 22:12                                                                   ` [PATCH 5/5] async, kmod: warn on synchronous request_module() from async workers Tejun Heo
2022-06-23  5:25                                                                     ` Saravana Kannan
2013-01-18  1:27                                                         ` [PATCH 3/3] " Tejun Heo
2013-01-23  0:53                                                       ` [PATCH v2 1/2] init, block: try to load default elevator module early during boot Tejun Heo
2013-01-16 21:31                                                     ` [PATCH 2/2] block: don't request module during elevator init Tejun Heo
2013-01-23  0:51                                                       ` [PATCH v2 " Tejun Heo
2013-01-16  3:30                                         ` [PATCH] module, async: async_synchronize_full() on module init iff async is used Ming Lei
2013-01-16  4:24                                         ` Rusty Russell
2013-01-16 11:36                                         ` Alex Riesen
2013-08-12  7:04                                         ` [3.8-rc3 -> 3.8-rc4 regression] " Jonathan Nieder
2013-08-12 15:09                                           ` Tejun Heo
2013-11-26 21:29                                             ` Josh Hunt
2013-11-26 21:53                                               ` Linus Torvalds
2013-11-26 22:12                                                 ` Josh Hunt
2013-11-26 22:29                                                   ` Tejun Heo
2013-12-03 14:28                                                     ` Josh Hunt
2013-12-03 15:19                                                       ` Tejun Heo
2013-12-04 23:01                                                         ` Josh Hunt
2013-12-04 23:12                                                           ` Tejun Heo
2013-11-26 22:30                                                   ` Linus Torvalds
2013-01-16  0:44                                     ` USB device cannot be reconnected and khubd "blocked for more than 120 seconds" Tejun Heo
2013-01-16 17:19                               ` Tejun Heo [this message]
2013-01-17 18:16                                 ` [PATCH] async: fix __lowest_in_progress() Linus Torvalds
2013-01-17 18:50                                   ` Tejun Heo
2013-01-23  0:15                                 ` [PATCH v2] " Tejun Heo
2013-01-23  0:22                                   ` Linus Torvalds
2013-01-16  3:05                             ` USB device cannot be reconnected and khubd "blocked for more than 120 seconds" Ming Lei
2013-01-16  4:14                               ` Linus Torvalds
2013-01-14  8:22                   ` Oliver Neukum
2013-01-14  8:40                     ` Ming Lei
2013-01-12 19:56     ` Alex Riesen
  -- strict thread matches above, loose matches on Subject: below --
2009-01-12 10:16 [PATCH] async: fix __lowest_in_progress() Arjan van de Ven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130116171952.GQ2668@htj.dyndns.org \
    --to=tj@kernel.org \
    --cc=arjan@linux.intel.com \
    --cc=axboe@kernel.dk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=ming.lei@canonical.com \
    --cc=raa.lkml@gmail.com \
    --cc=stern@rowland.harvard.edu \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.