linux-block.vger.kernel.org archive mirror
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Mike Snitzer <snitzer@redhat.com>,
	Bart Van Assche <bvanassche@acm.org>
Subject: Re: [PATCH v3] blk-mq: punt failed direct issue to dispatch list
Date: Fri, 7 Dec 2018 18:50:28 +0800
Message-ID: <20181207105027.GG29027@ming.t460p>
In-Reply-To: <aa6abfa7-30c3-57d7-7369-754d514584f5@kernel.dk>

On Thu, Dec 06, 2018 at 10:17:44PM -0700, Jens Axboe wrote:
> After the direct dispatch corruption fix, we permanently disallow direct
> dispatch of non read/write requests. This works fine off the normal IO
> path, as they will be retried like any other failed direct dispatch
> request. But for the blk_insert_cloned_request() that only DM uses to
> bypass the bottom level scheduler, we always first attempt direct
> dispatch. For some types of requests, that's now a permanent failure,
> and no amount of retrying will make that succeed. This results in a
> livelock.
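
The livelock described above can be illustrated with a small userspace
toy model. This is only a sketch under stated assumptions: every name
below (toy_op, toy_issue_bypass, toy_retry_until_ok) is hypothetical
and not a kernel symbol. The gate mirrors the read/write-only check
that the patch removes, and the retry loop stands in for a caller that
keeps resubmitting after a busy status.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel definitions. */
enum toy_op { TOY_READ, TOY_WRITE, TOY_FLUSH };
enum toy_status { TOY_OK, TOY_BUSY };

/* Mirrors the read/write-only gate that the quoted patch removes. */
static bool toy_can_direct_dispatch(enum toy_op op)
{
	return op == TOY_READ || op == TOY_WRITE;
}

/*
 * Models a bypass-insert issue path: the request is either issued
 * directly or the caller is told the device is busy. There is no
 * fallback list in this model.
 */
static enum toy_status toy_issue_bypass(enum toy_op op)
{
	if (!toy_can_direct_dispatch(op))
		return TOY_BUSY;	/* permanent, not transient */
	return TOY_OK;
}

/*
 * Retry up to max_tries times. Returns the number of attempts used
 * on success, or -1 if no attempt succeeded.
 */
static int toy_retry_until_ok(enum toy_op op, int max_tries)
{
	assert(max_tries > 0);
	for (int i = 1; i <= max_tries; i++) {
		if (toy_issue_bypass(op) == TOY_OK)
			return i;
	}
	return -1;
}
```

In this model toy_retry_until_ok(TOY_WRITE, 1000) returns 1, while
toy_retry_until_ok(TOY_FLUSH, 1000) returns -1 for any retry budget,
because the busy status does not depend on device state.
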
> 
> Instead of making special cases for what we can direct issue, and now
> having to deal with DM solving the livelock while still retaining a BUSY
> condition feedback loop, always just add a request that has been through
> ->queue_rq() to the hardware queue dispatch list. These are safe to use
> as no merging can take place there. Additionally, if requests do have
> prepped data from drivers, we aren't dependent on them not sharing space
> in the request structure to safely add them to the IO scheduler lists.
> 
> This basically reverts ffe81d45322c and is based on a patch from Ming,
> but with the list insert case covered as well.
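
The approach described above, parking a request that has already been
through ->queue_rq() on a per-queue dispatch list, can be sketched in
userspace as follows. This is a simplified model, not the kernel code:
the punt_* names, the free_slots counter and the FIFO ordering are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum punt_status { PUNT_OK, PUNT_BUSY };

struct punt_rq {
	int id;
	bool prepped;		/* set once the driver has seen the request */
	struct punt_rq *next;
};

struct punt_hctx {
	int free_slots;		/* device resources currently available */
	struct punt_rq *head;	/* dispatch list, oldest entry first */
	struct punt_rq *tail;
	size_t nr_dispatch;
};

/* Stand-in for a driver's queue_rq: prep the request, then try to issue. */
static enum punt_status punt_queue_rq(struct punt_hctx *h, struct punt_rq *rq)
{
	rq->prepped = true;
	if (h->free_slots == 0)
		return PUNT_BUSY;
	h->free_slots--;
	return PUNT_OK;
}

/* Park a prepped request on the dispatch list; nothing is merged here. */
static void punt_bypass_insert(struct punt_hctx *h, struct punt_rq *rq)
{
	assert(rq->prepped);
	rq->next = NULL;
	if (h->tail)
		h->tail->next = rq;
	else
		h->head = rq;
	h->tail = rq;
	h->nr_dispatch++;
}

/* Try direct issue; on a busy status, punt to the dispatch list. */
static bool punt_issue(struct punt_hctx *h, struct punt_rq *rq)
{
	if (punt_queue_rq(h, rq) == PUNT_OK)
		return true;
	punt_bypass_insert(h, rq);
	return false;
}

/* Issue parked requests in order while slots last; returns the count. */
static size_t punt_run_dispatch(struct punt_hctx *h)
{
	size_t issued = 0;

	while (h->head && h->free_slots > 0) {
		struct punt_rq *rq = h->head;

		h->head = rq->next;
		if (!h->head)
			h->tail = NULL;
		rq->next = NULL;
		h->nr_dispatch--;
		h->free_slots--;
		issued++;
	}
	return issued;
}
```

The point of the model is that a busy request always has somewhere to
go regardless of its type, and the place it goes never inspects or
rewrites the request, so driver-prepared state stays intact until a
slot frees up and punt_run_dispatch() picks it up again.
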
> 
> Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue")
> Cc: stable@vger.kernel.org
> Suggested-by: Ming Lei <ming.lei@redhat.com>
> Reported-by: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> ---
> 
> I've thrown the initial hang test reported by Bart at it, works fine.
> My reproducer for the corruption case is also happy, as expected.
> 
> I'm running blktests and xfstests on it overnight. If that passes as
> expected, this quells my initial worries on using ->dispatch as a
> holding place for these types of requests.
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3262d83b9e07..6a7566244de3 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1715,15 +1715,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
>  		break;
>  	case BLK_STS_RESOURCE:
>  	case BLK_STS_DEV_RESOURCE:
> -		/*
> -		 * If direct dispatch fails, we cannot allow any merging on
> -		 * this IO. Drivers (like SCSI) may have set up permanent state
> -		 * for this request, like SG tables and mappings, and if we
> -		 * merge to it later on then we'll still only do IO to the
> -		 * original part.
> -		 */
> -		rq->cmd_flags |= REQ_NOMERGE;
> -
>  		blk_mq_update_dispatch_busy(hctx, true);
>  		__blk_mq_requeue_request(rq);
>  		break;
> @@ -1736,18 +1727,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	return ret;
>  }
>  
> -/*
> - * Don't allow direct dispatch of anything but regular reads/writes,
> - * as some of the other commands can potentially share request space
> - * with data we need for the IO scheduler. If we attempt a direct dispatch
> - * on those and fail, we can't safely add it to the scheduler afterwards
> - * without potentially overwriting data that the driver has already written.
> - */
> -static bool blk_rq_can_direct_dispatch(struct request *rq)
> -{
> -	return req_op(rq) == REQ_OP_READ || req_op(rq) == REQ_OP_WRITE;
> -}
> -
>  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  						struct request *rq,
>  						blk_qc_t *cookie,
> @@ -1769,7 +1748,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  		goto insert;
>  	}
>  
> -	if (!blk_rq_can_direct_dispatch(rq) || (q->elevator && !bypass_insert))
> +	if (q->elevator && !bypass_insert)
>  		goto insert;
>  
>  	if (!blk_mq_get_dispatch_budget(hctx))
> @@ -1785,7 +1764,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	if (bypass_insert)
>  		return BLK_STS_RESOURCE;
>  
> -	blk_mq_sched_insert_request(rq, false, run_queue, false);
> +	blk_mq_request_bypass_insert(rq, run_queue);
>  	return BLK_STS_OK;
>  }
>  
> @@ -1801,7 +1780,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  
>  	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
>  	if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
> -		blk_mq_sched_insert_request(rq, false, true, false);
> +		blk_mq_request_bypass_insert(rq, true);
>  	else if (ret != BLK_STS_OK)
>  		blk_mq_end_request(rq, ret);
>  
> @@ -1831,15 +1810,13 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
>  		struct request *rq = list_first_entry(list, struct request,
>  				queuelist);
>  
> -		if (!blk_rq_can_direct_dispatch(rq))
> -			break;
> -
>  		list_del_init(&rq->queuelist);
>  		ret = blk_mq_request_issue_directly(rq);
>  		if (ret != BLK_STS_OK) {
>  			if (ret == BLK_STS_RESOURCE ||
>  					ret == BLK_STS_DEV_RESOURCE) {
> -				list_add(&rq->queuelist, list);
> +				blk_mq_request_bypass_insert(rq,
> +							list_empty(list));
>  				break;
>  			}
>  			blk_mq_end_request(rq, ret);
> 

Tested-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
