linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH BUGFIX] block, bfq: fix bug causing crashes
@ 2017-07-03  8:00 Paolo Valente
  2017-07-03  8:00 ` [PATCH BUGFIX] block, bfq: don't change ioprio class for a bfq_queue on a service tree Paolo Valente
  2017-07-03 22:49 ` [PATCH BUGFIX] block, bfq: fix bug causing crashes Jens Axboe
  0 siblings, 2 replies; 4+ messages in thread
From: Paolo Valente @ 2017-07-03  8:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, ulf.hansson, broonie, Paolo Valente

Hi Jens,
I'm writing this short cover letter to hopefully help you decide what
to do with this patch, in this late phase of the development
cycle. This patch fixes a bug causing kernel crashes for at least
one year. Crashes apparently affect only a minority of users, but are
systematic for them (a crash every few tens of minutes for some).

Thanks,
Paolo

Paolo Valente (1):
  block, bfq: don't change ioprio class for a bfq_queue on a service
    tree

 block/bfq-iosched.c | 14 ++++++++++----
 block/bfq-iosched.h |  3 ++-
 block/bfq-wf2q.c    | 39 ++++++++++++++++++++++++++++++++++-----
 3 files changed, 46 insertions(+), 10 deletions(-)

--
2.10.0

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH BUGFIX] block, bfq: don't change ioprio class for a bfq_queue on a service tree
  2017-07-03  8:00 [PATCH BUGFIX] block, bfq: fix bug causing crashes Paolo Valente
@ 2017-07-03  8:00 ` Paolo Valente
  2017-07-03 22:49 ` [PATCH BUGFIX] block, bfq: fix bug causing crashes Jens Axboe
  1 sibling, 0 replies; 4+ messages in thread
From: Paolo Valente @ 2017-07-03  8:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, ulf.hansson, broonie, Paolo Valente

On each deactivation or re-scheduling (after being served) of a
bfq_queue, BFQ invokes the function __bfq_entity_update_weight_prio(),
to perform pending updates of ioprio, weight and ioprio class for the
bfq_queue. BFQ also invokes this function on I/O-request dispatches,
to raise or lower weights more quickly when needed, thereby improving
latency. However, the entity representing the bfq_queue may be on the
active (sub)tree of a service tree when this happens, and, although
with a very low probability, the bfq_queue may happen to also have a
pending change of its ioprio class. If both conditions hold when
__bfq_entity_update_weight_prio() is invoked, then the entity moves to
a sort of hybrid state: the new service tree for the entity, as
returned by bfq_entity_service_tree(), differs from service tree on
which the entity still is. The functions that handle activations and
deactivations of entities do not cope with such a hybrid state (and
would need to become more complex to cope).

This commit addresses this issue by just making
__bfq_entity_update_weight_prio() not perform also a possible pending
change of ioprio class, when invoked on an I/O-request dispatch for a
bfq_queue. Such a change is thus postponed to when
__bfq_entity_update_weight_prio() is invoked on deactivation or
re-scheduling of the bfq_queue.

Reported-by: Marco Piazza <mpiazza@gmail.com>
Reported-by: Laurentiu Nicola <lnicola@dend.ro>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Marco Piazza <mpiazza@gmail.com>
---
 block/bfq-iosched.c | 14 ++++++++++----
 block/bfq-iosched.h |  3 ++-
 block/bfq-wf2q.c    | 39 ++++++++++++++++++++++++++++++++++-----
 3 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index ed93da2..9d4919b 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3471,11 +3471,17 @@ static void bfq_update_wr_data(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 			}
 		}
 	}
-	/* Update weight both if it must be raised and if it must be lowered */
+	/*
+	 * To improve latency (for this or other queues), immediately
+	 * update weight both if it must be raised and if it must be
+	 * lowered. Since, entity may be on some active tree here, and
+	 * might have a pending change of its ioprio class, invoke
+	 * next function with the last parameter unset (see the
+	 * comments on the function).
+	 */
 	if ((entity->weight > entity->orig_weight) != (bfqq->wr_coeff > 1))
-		__bfq_entity_update_weight_prio(
-			bfq_entity_service_tree(entity),
-			entity);
+		__bfq_entity_update_weight_prio(bfq_entity_service_tree(entity),
+						entity, false);
 }
 
 /*
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 5c3bf98..8fd83b8 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -892,7 +892,8 @@ void bfq_put_idle_entity(struct bfq_service_tree *st,
 			 struct bfq_entity *entity);
 struct bfq_service_tree *
 __bfq_entity_update_weight_prio(struct bfq_service_tree *old_st,
-				struct bfq_entity *entity);
+				struct bfq_entity *entity,
+				bool update_class_too);
 void bfq_bfqq_served(struct bfq_queue *bfqq, int served);
 void bfq_bfqq_charge_time(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 			  unsigned long time_ms);
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 8726ede..5ec05cd 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -694,10 +694,28 @@ struct bfq_service_tree *bfq_entity_service_tree(struct bfq_entity *entity)
 	return sched_data->service_tree + idx;
 }
 
-
+/*
+ * Update weight and priority of entity. If update_class_too is true,
+ * then update the ioprio_class of entity too.
+ *
+ * The reason why the update of ioprio_class is controlled through the
+ * last parameter is as follows. Changing the ioprio class of an
+ * entity implies changing the destination service trees for that
+ * entity. If such a change occurred when the entity is already on one
+ * of the service trees for its previous class, then the state of the
+ * entity would become more complex: none of the new possible service
+ * trees for the entity, according to bfq_entity_service_tree(), would
+ * match any of the possible service trees on which the entity
+ * is. Complex operations involving these trees, such as entity
+ * activations and deactivations, should take into account this
+ * additional complexity.  To avoid this issue, this function is
+ * invoked with update_class_too unset in the points in the code where
+ * entity may happen to be on some tree.
+ */
 struct bfq_service_tree *
 __bfq_entity_update_weight_prio(struct bfq_service_tree *old_st,
-				struct bfq_entity *entity)
+				struct bfq_entity *entity,
+				bool update_class_too)
 {
 	struct bfq_service_tree *new_st = old_st;
 
@@ -739,9 +757,15 @@ __bfq_entity_update_weight_prio(struct bfq_service_tree *old_st,
 				  bfq_weight_to_ioprio(entity->orig_weight);
 		}
 
-		if (bfqq)
+		if (bfqq && update_class_too)
 			bfqq->ioprio_class = bfqq->new_ioprio_class;
-		entity->prio_changed = 0;
+
+		/*
+		 * Reset prio_changed only if the ioprio_class change
+		 * is not pending any longer.
+		 */
+		if (!bfqq || bfqq->ioprio_class == bfqq->new_ioprio_class)
+			entity->prio_changed = 0;
 
 		/*
 		 * NOTE: here we may be changing the weight too early,
@@ -867,7 +891,12 @@ static void bfq_update_fin_time_enqueue(struct bfq_entity *entity,
 {
 	struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
 
-	st = __bfq_entity_update_weight_prio(st, entity);
+	/*
+	 * When this function is invoked, entity is not in any service
+	 * tree, then it is safe to invoke next function with the last
+	 * parameter set (see the comments on the function).
+	 */
+	st = __bfq_entity_update_weight_prio(st, entity, true);
 	bfq_calc_finish(entity, entity->budget);
 
 	/*
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH BUGFIX] block, bfq: fix bug causing crashes
  2017-07-03  8:00 [PATCH BUGFIX] block, bfq: fix bug causing crashes Paolo Valente
  2017-07-03  8:00 ` [PATCH BUGFIX] block, bfq: don't change ioprio class for a bfq_queue on a service tree Paolo Valente
@ 2017-07-03 22:49 ` Jens Axboe
  2017-07-04  7:08   ` Paolo Valente
  1 sibling, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2017-07-03 22:49 UTC (permalink / raw)
  To: Paolo Valente; +Cc: linux-block, linux-kernel, ulf.hansson, broonie

On 07/03/2017 02:00 AM, Paolo Valente wrote:
> Hi Jens,
> I'm writing this short cover letter to hopefully help you decide what
> to do with this patch, in this late phase of the development
> cycle. This patch fixes a bug causing kernel crashes for at least
> one year. Crashes apparently affect only a minority of users, but are
> systematic for them (a crash every few tens of minutes for some).

By the time you wrote that email, 4.12 was already released for hours.
So there's really no choice this time but to queue it up for 4.13.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH BUGFIX] block, bfq: fix bug causing crashes
  2017-07-03 22:49 ` [PATCH BUGFIX] block, bfq: fix bug causing crashes Jens Axboe
@ 2017-07-04  7:08   ` Paolo Valente
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Valente @ 2017-07-04  7:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linux-Kernal, ulf.hansson, broonie


> Il giorno 04 lug 2017, alle ore 00:49, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 07/03/2017 02:00 AM, Paolo Valente wrote:
>> Hi Jens,
>> I'm writing this short cover letter to hopefully help you decide what
>> to do with this patch, in this late phase of the development
>> cycle. This patch fixes a bug causing kernel crashes for at least
>> one year. Crashes apparently affect only a minority of users, but are
>> systematic for them (a crash every few tens of minutes for some).
> 
> By the time you wrote that email, 4.12 was already released for hours.

Oops ...

> So there's really no choice this time but to queue it up for 4.13.
> 

At least, no decision to make :)

Thanks,
Paolo

> -- 
> Jens Axboe
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-07-04  7:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-03  8:00 [PATCH BUGFIX] block, bfq: fix bug causing crashes Paolo Valente
2017-07-03  8:00 ` [PATCH BUGFIX] block, bfq: don't change ioprio class for a bfq_queue on a service tree Paolo Valente
2017-07-03 22:49 ` [PATCH BUGFIX] block, bfq: fix bug causing crashes Jens Axboe
2017-07-04  7:08   ` Paolo Valente

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).