* [PATCH] blk-mq: start request gstate with gen 1
@ 2018-04-17  3:46 Jianchao Wang
  2018-04-17  3:56 ` Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jianchao Wang @ 2018-04-17  3:46 UTC (permalink / raw)
  To: axboe
  Cc: bart.vanassche, tj, ming.lei, Martin, stable, linux-block, linux-kernel

rq->gstate and rq->aborted_gstate are both zero before rqs are
allocated. If we have a small timeout, then when the timer fires
there could be rqs that have never been allocated, and there could
also be rqs that have been allocated but not yet initialized and
started. At that moment, rq->gstate and rq->aborted_gstate are both 0,
so blk_mq_terminate_expired will identify the rq as timed out and
invoke .timeout early.

For scsi, this will cause scsi_times_out to be invoked before the
scsi_cmnd is initialized. scsi_cmnd->device is still NULL at that
moment, so we will get a crash.

Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Martin Steigerwald <Martin@Lichtvoll.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-core.c | 4 ++++
 block/blk-mq.c   | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index abcb868..ce62681 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->part = NULL;
 	seqcount_init(&rq->gstate_seq);
 	u64_stats_init(&rq->aborted_gstate_sync);
+	/*
+	 * See comment of blk_mq_init_request
+	 */
+	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f5c7dbc..d62030a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 
 	seqcount_init(&rq->gstate_seq);
 	u64_stats_init(&rq->aborted_gstate_sync);
+	/*
+	 * start gstate with gen 1 instead of 0, otherwise it will be equal
+	 * to aborted_gstate, and be identified timed out by
+	 * blk_mq_terminate_expired.
+	 */
+	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
+
 	return 0;
 }
 
-- 
2.7.4
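
The condition described in the commit message can be modelled in user space. The following is a minimal sketch, not kernel code: struct model_rq and the model_* helpers are hypothetical names, and the 2-bit state field behind MODEL_GEN_INC is an assumption about how MQ_RQ_GEN_INC is laid out. It only illustrates why two zeroed fields compare equal and why starting gstate at generation 1 breaks that equality.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the low bits of gstate hold the state, the rest the generation. */
#define MODEL_STATE_BITS 2
#define MODEL_GEN_INC    (UINT64_C(1) << MODEL_STATE_BITS)

/* Hypothetical stand-in for the two fields of struct request discussed here. */
struct model_rq {
	uint64_t gstate;
	uint64_t aborted_gstate;
};

/* Before the patch: both fields keep their zeroed values. */
static void model_rq_init_old(struct model_rq *rq)
{
	memset(rq, 0, sizeof(*rq));
}

/* After the patch: gstate starts at generation 1, aborted_gstate stays 0. */
static void model_rq_init_new(struct model_rq *rq)
{
	memset(rq, 0, sizeof(*rq));
	rq->gstate = MODEL_GEN_INC;
}

/* The equality test the commit message attributes to blk_mq_terminate_expired. */
static bool model_looks_timed_out(const struct model_rq *rq)
{
	return rq->gstate == rq->aborted_gstate;
}
```

With model_rq_init_old() the request looks timed out even though it was never started; with model_rq_init_new() it does not.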


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17  3:46 [PATCH] blk-mq: start request gstate with gen 1 Jianchao Wang
@ 2018-04-17  3:56 ` Jens Axboe
  2018-04-17  4:46 ` Ming Lei
  2018-04-17 12:10 ` Martin Steigerwald
  2 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2018-04-17  3:56 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: bart.vanassche, tj, ming.lei, Martin, stable, linux-block, linux-kernel

On 4/16/18 9:46 PM, Jianchao Wang wrote:
> rq->gstate and rq->aborted_gstate are both zero before rqs are
> allocated. If we have a small timeout, then when the timer fires
> there could be rqs that have never been allocated, and there could
> also be rqs that have been allocated but not yet initialized and
> started. At that moment, rq->gstate and rq->aborted_gstate are both 0,
> so blk_mq_terminate_expired will identify the rq as timed out and
> invoke .timeout early.
> 
> For scsi, this will cause scsi_times_out to be invoked before the
> scsi_cmnd is initialized. scsi_cmnd->device is still NULL at that
> moment, so we will get a crash.

Oops, this looks good to me. Applied.

-- 
Jens Axboe


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17  3:46 [PATCH] blk-mq: start request gstate with gen 1 Jianchao Wang
  2018-04-17  3:56 ` Jens Axboe
@ 2018-04-17  4:46 ` Ming Lei
  2018-04-17 12:10 ` Martin Steigerwald
  2 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2018-04-17  4:46 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, bart.vanassche, tj, Martin, stable, linux-block, linux-kernel

On Tue, Apr 17, 2018 at 11:46:20AM +0800, Jianchao Wang wrote:
> rq->gstate and rq->aborted_gstate are both zero before rqs are
> allocated. If we have a small timeout, then when the timer fires
> there could be rqs that have never been allocated, and there could
> also be rqs that have been allocated but not yet initialized and
> started. At that moment, rq->gstate and rq->aborted_gstate are both 0,
> so blk_mq_terminate_expired will identify the rq as timed out and
> invoke .timeout early.
> 
> For scsi, this will cause scsi_times_out to be invoked before the
> scsi_cmnd is initialized. scsi_cmnd->device is still NULL at that
> moment, so we will get a crash.
> 
> Cc: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Martin Steigerwald <Martin@Lichtvoll.de>
> Cc: stable@vger.kernel.org
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>  block/blk-core.c | 4 ++++
>  block/blk-mq.c   | 7 +++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index abcb868..ce62681 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
>  	rq->part = NULL;
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * See comment of blk_mq_init_request
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
>  }
>  EXPORT_SYMBOL(blk_rq_init);
>  
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f5c7dbc..d62030a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
>  
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * start gstate with gen 1 instead of 0, otherwise it will be equal
> +	 * to aborted_gstate, and be identified timed out by
> +	 * blk_mq_terminate_expired.
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
> +
>  	return 0;
>  }

Good catch, blk_mq_check_expired() is bypassed, but it is still hit
by blk_mq_terminate_expired().

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming
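
The point about blk_mq_check_expired() being bypassed while blk_mq_terminate_expired() is still hit can be sketched as a two-pass model. This is a simplified user-space illustration under stated assumptions, not the kernel implementation: the model_* names, the overdue flag, and the 2-bit state layout are all hypothetical, and the sketch ignores locking, RCU and the RQF_MQ_TIMEOUT_EXPIRED flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: low bits of gstate are the state, upper bits the generation. */
#define MODEL_STATE_BITS 2
#define MODEL_STATE_MASK ((UINT64_C(1) << MODEL_STATE_BITS) - 1)
#define MODEL_GEN_INC    (UINT64_C(1) << MODEL_STATE_BITS)

enum model_state { MODEL_IDLE = 0, MODEL_IN_FLIGHT = 1 };

struct model_rq {
	uint64_t gstate;
	uint64_t aborted_gstate;
	bool overdue;		/* hypothetical: deadline has passed */
};

/* First pass: only in-flight, overdue requests are marked for abortion. */
static size_t model_check_expired(struct model_rq *rqs, size_t n)
{
	size_t nr_expired = 0;

	for (size_t i = 0; i < n; i++) {
		if ((rqs[i].gstate & MODEL_STATE_MASK) == MODEL_IN_FLIGHT &&
		    rqs[i].overdue) {
			rqs[i].aborted_gstate = rqs[i].gstate;
			nr_expired++;
		}
	}
	return nr_expired;
}

/* Second pass: no state check, only gstate == aborted_gstate. */
static size_t model_terminate_expired(const struct model_rq *rqs, size_t n)
{
	size_t nr_timed_out = 0;

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].gstate == rqs[i].aborted_gstate)
			nr_timed_out++;
	}
	return nr_timed_out;
}
```

In this model, a zeroed request is skipped by the first pass because it is idle, yet the second pass still counts it once any other request has expired. Starting its gstate at MODEL_GEN_INC removes it from the second pass as well.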


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17  3:46 [PATCH] blk-mq: start request gstate with gen 1 Jianchao Wang
  2018-04-17  3:56 ` Jens Axboe
  2018-04-17  4:46 ` Ming Lei
@ 2018-04-17 12:10 ` Martin Steigerwald
  2018-04-17 14:34   ` jianchao.wang
  2 siblings, 1 reply; 8+ messages in thread
From: Martin Steigerwald @ 2018-04-17 12:10 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, bart.vanassche, tj, ming.lei, stable, linux-block, linux-kernel

Hi Jianchao,

Jianchao Wang - 17.04.18, 05:46:
> rq->gstate and rq->aborted_gstate are both zero before rqs are
> allocated. If we have a small timeout, then when the timer fires
> there could be rqs that have never been allocated, and there could
> also be rqs that have been allocated but not yet initialized and
> started. At that moment, rq->gstate and rq->aborted_gstate are both 0,
> so blk_mq_terminate_expired will identify the rq as timed out and
> invoke .timeout early.

To test it, should I add it to 4.16.2 along with the patches I already have?

- '[PATCH] blk-mq_Directly schedule q->timeout_work when aborting a request.mbox'

- '[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}().mbox'

- '[PATCH V4 1_2] blk-mq_set RQF_MQ_TIMEOUT_EXPIRED when the rq'\''s timeout isn'\''t handled.mbox'

- '[PATCH V4 2_2] blk-mq_fix race between complete and BLK_EH_RESET_TIMER.mbox'

> For scsi, this will cause scsi_times_out to be invoked before the
> scsi_cmnd is initialized. scsi_cmnd->device is still NULL at that
> moment, so we will get a crash.
> 
> Cc: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Martin Steigerwald <Martin@Lichtvoll.de>
> Cc: stable@vger.kernel.org
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>  block/blk-core.c | 4 ++++
>  block/blk-mq.c   | 7 +++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index abcb868..ce62681 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
>  	rq->part = NULL;
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * See comment of blk_mq_init_request
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
>  }
>  EXPORT_SYMBOL(blk_rq_init);
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f5c7dbc..d62030a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
> 
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * start gstate with gen 1 instead of 0, otherwise it will be equal
> +	 * to aborted_gstate, and be identified timed out by
> +	 * blk_mq_terminate_expired.
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
> +
>  	return 0;
>  }


-- 
Martin


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17 12:10 ` Martin Steigerwald
@ 2018-04-17 14:34   ` jianchao.wang
  2018-04-23  7:07     ` Martin Steigerwald
  2018-04-23  8:39     ` Martin Steigerwald
  0 siblings, 2 replies; 8+ messages in thread
From: jianchao.wang @ 2018-04-17 14:34 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: axboe, bart.vanassche, tj, ming.lei, stable, linux-block, linux-kernel

Hi Martin

On 04/17/2018 08:10 PM, Martin Steigerwald wrote:
> For testing it I add it to 4.16.2 with the patches I have already?

You could try to only apply this patch to have a test. :)

> 
> - '[PATCH] blk-mq_Directly schedule q->timeout_work when aborting a request.mbox'
> 
> - '[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}().mbox'
> 
> - '[PATCH V4 1_2] blk-mq_set RQF_MQ_TIMEOUT_EXPIRED when the rq'\''s timeout isn'\''t handled.mbox'
> 
> - '[PATCH V4 2_2] blk-mq_fix race between complete and BLK_EH_RESET_TIMER.mbox'


Thanks
Jianchao


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17 14:34   ` jianchao.wang
@ 2018-04-23  7:07     ` Martin Steigerwald
  2018-04-23  8:39     ` Martin Steigerwald
  1 sibling, 0 replies; 8+ messages in thread
From: Martin Steigerwald @ 2018-04-23  7:07 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, bart.vanassche, tj, ming.lei, stable, linux-block, linux-kernel

Hi Jianchao.

jianchao.wang - 17.04.18, 16:34:
> On 04/17/2018 08:10 PM, Martin Steigerwald wrote:
> > For testing it I add it to 4.16.2 with the patches I have already?
> 
> You could try to only apply this patch to have a test. 

Compiling now to have a test.

Thanks,
-- 
Martin


* Re: [PATCH] blk-mq: start request gstate with gen 1
  2018-04-17 14:34   ` jianchao.wang
  2018-04-23  7:07     ` Martin Steigerwald
@ 2018-04-23  8:39     ` Martin Steigerwald
  1 sibling, 0 replies; 8+ messages in thread
From: Martin Steigerwald @ 2018-04-23  8:39 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, bart.vanassche, tj, ming.lei, stable, linux-block, linux-kernel

Hi Jianchao.

jianchao.wang - 17.04.18, 16:34:
> On 04/17/2018 08:10 PM, Martin Steigerwald wrote:
> > For testing it I add it to 4.16.2 with the patches I have already?
> 
> You could try to only apply this patch to have a test. :)

I tested 4.16.3 with just your patch (plus the unrelated btrfs trimming fix
I have been carrying for a long time) and it booted successfully at least
15 times (without hanging). So far there has also been no "error loading
smart data" mail, but it takes a few days of suspend/hibernation + resume
cycles to know for sure.

Thanks,
-- 
Martin

