linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] block: fix two regressiones on bounce
@ 2017-12-18  7:40 Ming Lei
  2017-12-18  7:40 ` [PATCH 1/2] block: don't let passthrough IO go into .make_request_fn() Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Ming Lei @ 2017-12-18  7:40 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: Christoph Hellwig, NeilBrown, Michele Ballabio, Vincent Batts, Ming Lei

Hi Jens,

The 1st patch makes sure that passthrough IO won't enter .make_request_fn.

The 2nd one fixes blk_rq_append_bio(), which is from your post with
small change on failure handling.


Jens Axboe (1):
  block: fix blk_rq_append_bio

Ming Lei (1):
  block: don't let passthrough IO go into .make_request_fn()

 block/blk-map.c                    | 38 ++++++++++++++++++++++----------------
 block/bounce.c                     |  6 ++++--
 drivers/scsi/osd/osd_initiator.c   |  4 +++-
 drivers/target/target_core_pscsi.c |  4 ++--
 include/linux/blkdev.h             | 23 ++++++++++++++++++++---
 5 files changed, 51 insertions(+), 24 deletions(-)

-- 
2.9.5

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/2] block: don't let passthrough IO go into .make_request_fn()
  2017-12-18  7:40 [PATCH 0/2] block: fix two regressiones on bounce Ming Lei
@ 2017-12-18  7:40 ` Ming Lei
  2017-12-18  7:40 ` [PATCH 2/2] block: fix blk_rq_append_bio Ming Lei
  2017-12-18 20:22 ` [PATCH 0/2] block: fix two regressiones on bounce Jens Axboe
  2 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2017-12-18  7:40 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: Christoph Hellwig, NeilBrown, Michele Ballabio, Vincent Batts, Ming Lei

Commit a8821f3f3("block: Improvements to bounce-buffer handling") tries
to make sure that the bio to .make_request_fn won't exceed BIO_MAX_PAGES,
but ignores that passthrough I/O can use blk_queue_bounce() too.
Especially, passthrough IO may not be sector-aligned, and the check
of 'sectors < bio_sectors(*bio_orig)' inside __blk_queue_bounce() may
become true even though the max bvec number doesn't exceed BIO_MAX_PAGES,
then cause the bio splitted, and the original passthrough bio is submited
to generic_make_request().

This patch fixes this issue by checking if the bio is passthrough IO,
and use bio_kmalloc() to allocate the cloned passthrough bio.

Cc: NeilBrown <neilb@suse.com>
Fixes: a8821f3f3("block: Improvements to bounce-buffer handling")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bounce.c         |  6 ++++--
 include/linux/blkdev.h | 21 +++++++++++++++++++--
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/block/bounce.c b/block/bounce.c
index fceb1a96480b..1d05c422c932 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -200,6 +200,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	unsigned i = 0;
 	bool bounce = false;
 	int sectors = 0;
+	bool passthrough = bio_is_passthrough(*bio_orig);
 
 	bio_for_each_segment(from, *bio_orig, iter) {
 		if (i++ < BIO_MAX_PAGES)
@@ -210,13 +211,14 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
 	if (!bounce)
 		return;
 
-	if (sectors < bio_sectors(*bio_orig)) {
+	if (!passthrough && sectors < bio_sectors(*bio_orig)) {
 		bio = bio_split(*bio_orig, sectors, GFP_NOIO, bounce_bio_split);
 		bio_chain(bio, *bio_orig);
 		generic_make_request(*bio_orig);
 		*bio_orig = bio;
 	}
-	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, bounce_bio_set);
+	bio = bio_clone_bioset(*bio_orig, GFP_NOIO, passthrough ? NULL :
+			bounce_bio_set);
 
 	bio_for_each_segment_all(to, bio, i) {
 		struct page *page = to->bv_page;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8089ca17db9a..abd06f540863 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -241,14 +241,24 @@ struct request {
 	struct request *next_rq;
 };
 
+static inline bool blk_op_is_scsi(unsigned int op)
+{
+	return op == REQ_OP_SCSI_IN || op == REQ_OP_SCSI_OUT;
+}
+
+static inline bool blk_op_is_private(unsigned int op)
+{
+	return op == REQ_OP_DRV_IN || op == REQ_OP_DRV_OUT;
+}
+
 static inline bool blk_rq_is_scsi(struct request *rq)
 {
-	return req_op(rq) == REQ_OP_SCSI_IN || req_op(rq) == REQ_OP_SCSI_OUT;
+	return blk_op_is_scsi(req_op(rq));
 }
 
 static inline bool blk_rq_is_private(struct request *rq)
 {
-	return req_op(rq) == REQ_OP_DRV_IN || req_op(rq) == REQ_OP_DRV_OUT;
+	return blk_op_is_private(req_op(rq));
 }
 
 static inline bool blk_rq_is_passthrough(struct request *rq)
@@ -256,6 +266,13 @@ static inline bool blk_rq_is_passthrough(struct request *rq)
 	return blk_rq_is_scsi(rq) || blk_rq_is_private(rq);
 }
 
+static inline bool bio_is_passthrough(struct bio *bio)
+{
+	unsigned op = bio_op(bio);
+
+	return blk_op_is_scsi(op) || blk_op_is_private(op);
+}
+
 static inline unsigned short req_get_ioprio(struct request *req)
 {
 	return req->ioprio;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/2] block: fix blk_rq_append_bio
  2017-12-18  7:40 [PATCH 0/2] block: fix two regressiones on bounce Ming Lei
  2017-12-18  7:40 ` [PATCH 1/2] block: don't let passthrough IO go into .make_request_fn() Ming Lei
@ 2017-12-18  7:40 ` Ming Lei
  2018-01-25 13:58   ` [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio Jiří Paleček
  2018-01-30  1:42   ` [2/2] block: fix blk_rq_append_bio Jiri Palecek
  2017-12-18 20:22 ` [PATCH 0/2] block: fix two regressiones on bounce Jens Axboe
  2 siblings, 2 replies; 19+ messages in thread
From: Ming Lei @ 2017-12-18  7:40 UTC (permalink / raw)
  To: Jens Axboe, linux-block
  Cc: Christoph Hellwig, NeilBrown, Michele Ballabio, Vincent Batts,
	Jens Axboe, Christoph Hellwig, Ming Lei

From: Jens Axboe <axboe@kernel.dk>

Commit caa4b02476e3(blk-map: call blk_queue_bounce from blk_rq_append_bio)
moves blk_queue_bounce() into blk_rq_append_bio(), but don't consider
that the bounced bio becomes invisiable to caller since the parameter type is
'struct bio *', which should have been 'struct bio **'.

This patch fixes this issue by passing 'struct bio **' to
blk_rq_append_bio(), then the bounced bio can be returned to caller.

Also failure handling is considered too.

Fixes: caa4b02476e3 ("blk-map: call blk_queue_bounce from blk_rq_append_bio")
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Michele Ballabio <barra_cuda@katamail.com>
(handling failure of blk_rq_append_bio(), only call bio_get() after
blk_rq_append_bio() returns OK)
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-map.c                    | 38 ++++++++++++++++++++++----------------
 drivers/scsi/osd/osd_initiator.c   |  4 +++-
 drivers/target/target_core_pscsi.c |  4 ++--
 include/linux/blkdev.h             |  2 +-
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index b21f8e86f120..d3a94719f03f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -12,22 +12,29 @@
 #include "blk.h"
 
 /*
- * Append a bio to a passthrough request.  Only works can be merged into
- * the request based on the driver constraints.
+ * Append a bio to a passthrough request.  Only works if the bio can be merged
+ * into the request based on the driver constraints.
  */
-int blk_rq_append_bio(struct request *rq, struct bio *bio)
+int blk_rq_append_bio(struct request *rq, struct bio **bio)
 {
-	blk_queue_bounce(rq->q, &bio);
+	struct bio *orig_bio = *bio;
+
+	blk_queue_bounce(rq->q, bio);
 
 	if (!rq->bio) {
-		blk_rq_bio_prep(rq->q, rq, bio);
+		blk_rq_bio_prep(rq->q, rq, *bio);
 	} else {
-		if (!ll_back_merge_fn(rq->q, rq, bio))
+		if (!ll_back_merge_fn(rq->q, rq, *bio)) {
+			if (orig_bio != *bio) {
+				bio_put(*bio);
+				*bio = orig_bio;
+			}
 			return -EINVAL;
+		}
 
-		rq->biotail->bi_next = bio;
-		rq->biotail = bio;
-		rq->__data_len += bio->bi_iter.bi_size;
+		rq->biotail->bi_next = *bio;
+		rq->biotail = *bio;
+		rq->__data_len += (*bio)->bi_iter.bi_size;
 	}
 
 	return 0;
@@ -73,14 +80,12 @@ static int __blk_rq_map_user_iov(struct request *rq,
 	 * We link the bounce buffer in and could have to traverse it
 	 * later so we have to get a ref to prevent it from being freed
 	 */
-	ret = blk_rq_append_bio(rq, bio);
-	bio_get(bio);
+	ret = blk_rq_append_bio(rq, &bio);
 	if (ret) {
-		bio_endio(bio);
 		__blk_rq_unmap_user(orig_bio);
-		bio_put(bio);
 		return ret;
 	}
+	bio_get(bio);
 
 	return 0;
 }
@@ -213,7 +218,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	int reading = rq_data_dir(rq) == READ;
 	unsigned long addr = (unsigned long) kbuf;
 	int do_copy = 0;
-	struct bio *bio;
+	struct bio *bio, *orig_bio;
 	int ret;
 
 	if (len > (queue_max_hw_sectors(q) << 9))
@@ -236,10 +241,11 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	if (do_copy)
 		rq->rq_flags |= RQF_COPY_USER;
 
-	ret = blk_rq_append_bio(rq, bio);
+	orig_bio = bio;
+	ret = blk_rq_append_bio(rq, &bio);
 	if (unlikely(ret)) {
 		/* request is too big */
-		bio_put(bio);
+		bio_put(orig_bio);
 		return ret;
 	}
 
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index a4f28b7e4c65..e18877177f1b 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -1576,7 +1576,9 @@ static struct request *_make_request(struct request_queue *q, bool has_write,
 		return req;
 
 	for_each_bio(bio) {
-		ret = blk_rq_append_bio(req, bio);
+		struct bio *bounce_bio = bio;
+
+		ret = blk_rq_append_bio(req, &bounce_bio);
 		if (ret)
 			return ERR_PTR(ret);
 	}
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 7c69b4a9694d..0d99b242e82e 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -920,7 +920,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 					" %d i: %d bio: %p, allocating another"
 					" bio\n", bio->bi_vcnt, i, bio);
 
-				rc = blk_rq_append_bio(req, bio);
+				rc = blk_rq_append_bio(req, &bio);
 				if (rc) {
 					pr_err("pSCSI: failed to append bio\n");
 					goto fail;
@@ -938,7 +938,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	}
 
 	if (bio) {
-		rc = blk_rq_append_bio(req, bio);
+		rc = blk_rq_append_bio(req, &bio);
 		if (rc) {
 			pr_err("pSCSI: failed to append bio\n");
 			goto fail;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index abd06f540863..100d0df38026 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -965,7 +965,7 @@ extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 extern void blk_rq_unprep_clone(struct request *rq);
 extern blk_status_t blk_insert_cloned_request(struct request_queue *q,
 				     struct request *rq);
-extern int blk_rq_append_bio(struct request *rq, struct bio *bio);
+extern int blk_rq_append_bio(struct request *rq, struct bio **bio);
 extern void blk_delay_queue(struct request_queue *, unsigned long);
 extern void blk_queue_split(struct request_queue *, struct bio **);
 extern void blk_recount_segments(struct request_queue *, struct bio *);
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2] block: fix two regressiones on bounce
  2017-12-18  7:40 [PATCH 0/2] block: fix two regressiones on bounce Ming Lei
  2017-12-18  7:40 ` [PATCH 1/2] block: don't let passthrough IO go into .make_request_fn() Ming Lei
  2017-12-18  7:40 ` [PATCH 2/2] block: fix blk_rq_append_bio Ming Lei
@ 2017-12-18 20:22 ` Jens Axboe
  2017-12-19 16:25   ` Vincent Batts
  2018-01-02 21:16   ` Vincent Batts
  2 siblings, 2 replies; 19+ messages in thread
From: Jens Axboe @ 2017-12-18 20:22 UTC (permalink / raw)
  To: Ming Lei, linux-block
  Cc: Christoph Hellwig, NeilBrown, Michele Ballabio, Vincent Batts

On 12/18/2017 12:40 AM, Ming Lei wrote:
> Hi Jens,
> 
> The 1st patch makes sure that passthrough IO won't enter .make_request_fn.
> 
> The 2nd one fixes blk_rq_append_bio(), which is from your post with
> small change on failure handling.

Thanks Ming, queued up.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2] block: fix two regressiones on bounce
  2017-12-18 20:22 ` [PATCH 0/2] block: fix two regressiones on bounce Jens Axboe
@ 2017-12-19 16:25   ` Vincent Batts
  2018-01-02 21:16   ` Vincent Batts
  1 sibling, 0 replies; 19+ messages in thread
From: Vincent Batts @ 2017-12-19 16:25 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Ming Lei, linux-block, Christoph Hellwig, NeilBrown, Michele Ballabio

Just got a confirmation that this patch set, when applied to the 4.14
branch, fixes the issue.

On Mon, Dec 18, 2017 at 3:22 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On 12/18/2017 12:40 AM, Ming Lei wrote:
>> Hi Jens,
>>
>> The 1st patch makes sure that passthrough IO won't enter .make_request_fn.
>>
>> The 2nd one fixes blk_rq_append_bio(), which is from your post with
>> small change on failure handling.
>
> Thanks Ming, queued up.
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2] block: fix two regressiones on bounce
  2017-12-18 20:22 ` [PATCH 0/2] block: fix two regressiones on bounce Jens Axboe
  2017-12-19 16:25   ` Vincent Batts
@ 2018-01-02 21:16   ` Vincent Batts
  2018-01-03  1:29     ` Ming Lei
  1 sibling, 1 reply; 19+ messages in thread
From: Vincent Batts @ 2018-01-02 21:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Ming Lei, linux-block, Christoph Hellwig, NeilBrown, Michele Ballabio

Will this set make it to the 4.14.x stable branch?

vb

On Mon, Dec 18, 2017 at 3:22 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On 12/18/2017 12:40 AM, Ming Lei wrote:
>> Hi Jens,
>>
>> The 1st patch makes sure that passthrough IO won't enter .make_request_fn.
>>
>> The 2nd one fixes blk_rq_append_bio(), which is from your post with
>> small change on failure handling.
>
> Thanks Ming, queued up.
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/2] block: fix two regressiones on bounce
  2018-01-02 21:16   ` Vincent Batts
@ 2018-01-03  1:29     ` Ming Lei
  0 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2018-01-03  1:29 UTC (permalink / raw)
  To: Vincent Batts
  Cc: Jens Axboe, linux-block, Christoph Hellwig, NeilBrown, Michele Ballabio

Hi Vincent,

On Tue, Jan 02, 2018 at 04:16:44PM -0500, Vincent Batts wrote:
> Will this set make it to the 4.14.x stable branch?

Greg has posted out the two patches for 4.14-stable review:

	https://marc.info/?l=linux-kernel&m=151481772722222&w=2

So it will be in 4.14.x stable branch soon.

Thanks,
Ming

> 
> vb
> 
> On Mon, Dec 18, 2017 at 3:22 PM, Jens Axboe <axboe@kernel.dk> wrote:
> > On 12/18/2017 12:40 AM, Ming Lei wrote:
> >> Hi Jens,
> >>
> >> The 1st patch makes sure that passthrough IO won't enter .make_request_fn.
> >>
> >> The 2nd one fixes blk_rq_append_bio(), which is from your post with
> >> small change on failure handling.
> >
> > Thanks Ming, queued up.
> >
> > --
> > Jens Axboe
> >

-- 
Ming

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2017-12-18  7:40 ` [PATCH 2/2] block: fix blk_rq_append_bio Ming Lei
@ 2018-01-25 13:58   ` Jiří Paleček
  2018-01-30 12:53     ` Ming Lei
  2018-01-30  1:42   ` [2/2] block: fix blk_rq_append_bio Jiri Palecek
  1 sibling, 1 reply; 19+ messages in thread
From: Jiří Paleček @ 2018-01-25 13:58 UTC (permalink / raw)
  To: linux-block; +Cc: Ming Lei, Jens Axboe, Christoph Hellwig

 Avoids page leak from bounced requests
---
 block/blk-map.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index d3a94719f03f..702d68166689 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
 	} else {
 		if (!ll_back_merge_fn(rq->q, rq, *bio)) {
 			if (orig_bio != *bio) {
-				bio_put(*bio);
+				bio_inc_remaining(orig_bio);
+				bio_endio(*bio);
 				*bio = orig_bio;
 			}
 			return -EINVAL;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [2/2] block: fix blk_rq_append_bio
  2017-12-18  7:40 ` [PATCH 2/2] block: fix blk_rq_append_bio Ming Lei
  2018-01-25 13:58   ` [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio Jiří Paleček
@ 2018-01-30  1:42   ` Jiri Palecek
  2018-01-30  4:10     ` Ming Lei
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Palecek @ 2018-01-30  1:42 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Christoph Hellwig

Hello,

I see a problem with this patch:

Ming Lei <ming.lei@redhat.com> writes:

> From: Jens Axboe <axboe@kernel.dk>
>
> Commit caa4b02476e3(blk-map: call blk_queue_bounce from blk_rq_append_bio)
> moves blk_queue_bounce() into blk_rq_append_bio(), but don't consider
> that the bounced bio becomes invisiable to caller since the parameter type is
> 'struct bio *', which should have been 'struct bio **'.
>
> This patch fixes this issue by passing 'struct bio **' to
> blk_rq_append_bio(), then the bounced bio can be returned to caller.
>
> Also failure handling is considered too.
>
> Fixes: caa4b02476e3 ("blk-map: call blk_queue_bounce from blk_rq_append_bio")
> Cc: Christoph Hellwig <hch@lst.de>
> Reported-by: Michele Ballabio <barra_cuda@katamail.com>
> (handling failure of blk_rq_append_bio(), only call bio_get() after
> blk_rq_append_bio() returns OK)
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-map.c                    | 38 ++++++++++++++++++++++----------------
>  drivers/scsi/osd/osd_initiator.c   |  4 +++-
>  drivers/target/target_core_pscsi.c |  4 ++--
>  include/linux/blkdev.h             |  2 +-
>  4 files changed, 28 insertions(+), 20 deletions(-)
>
> diff --git a/block/blk-map.c b/block/blk-map.c
> index b21f8e86f120..d3a94719f03f 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -12,22 +12,29 @@
>  #include "blk.h"
>  
>  /*
> - * Append a bio to a passthrough request.  Only works can be merged into
> - * the request based on the driver constraints.
> + * Append a bio to a passthrough request.  Only works if the bio can be merged
> + * into the request based on the driver constraints.
>   */
> -int blk_rq_append_bio(struct request *rq, struct bio *bio)
> +int blk_rq_append_bio(struct request *rq, struct bio **bio)
>  {
> -	blk_queue_bounce(rq->q, &bio);
> +	struct bio *orig_bio = *bio;
> +
> +	blk_queue_bounce(rq->q, bio);
>  
>  	if (!rq->bio) {
> -		blk_rq_bio_prep(rq->q, rq, bio);
> +		blk_rq_bio_prep(rq->q, rq, *bio);
>  	} else {
> -		if (!ll_back_merge_fn(rq->q, rq, bio))
> +		if (!ll_back_merge_fn(rq->q, rq, *bio)) {
> +			if (orig_bio != *bio) {
> +				bio_put(*bio);
> +				*bio = orig_bio;

1) Suppose the bio *does* bounce, which allocates some pages from the
DMA pool, and subsequently this test fails. bio_put won't deallocate
those pages. Prior to caa4b02476e3, it would be done by bio_endio, which
is IMHO the correct way to do it.

2) What if the bio bounces and goes into the splitting path of
__blk_queue_bounce. That would submit the request, which would be a
problem should this test or some other later fail, cancelling the whole
request.

For 1, I will send a patch. As for 2, it would be a problem for a long
time so I have a hunch it doesn't actually happen, however, I'd like to
know if it is somehow guaranteed.

Regards
    Jiri Palecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [2/2] block: fix blk_rq_append_bio
  2018-01-30  1:42   ` [2/2] block: fix blk_rq_append_bio Jiri Palecek
@ 2018-01-30  4:10     ` Ming Lei
  0 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2018-01-30  4:10 UTC (permalink / raw)
  To: Jiri Palecek; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On Tue, Jan 30, 2018 at 02:42:41AM +0100, Jiri Palecek wrote:
> Hello,
> 
> I see a problem with this patch:
> 
> Ming Lei <ming.lei@redhat.com> writes:
> 
> > From: Jens Axboe <axboe@kernel.dk>
> >
> > Commit caa4b02476e3(blk-map: call blk_queue_bounce from blk_rq_append_bio)
> > moves blk_queue_bounce() into blk_rq_append_bio(), but don't consider
> > that the bounced bio becomes invisiable to caller since the parameter type is
> > 'struct bio *', which should have been 'struct bio **'.
> >
> > This patch fixes this issue by passing 'struct bio **' to
> > blk_rq_append_bio(), then the bounced bio can be returned to caller.
> >
> > Also failure handling is considered too.
> >
> > Fixes: caa4b02476e3 ("blk-map: call blk_queue_bounce from blk_rq_append_bio")
> > Cc: Christoph Hellwig <hch@lst.de>
> > Reported-by: Michele Ballabio <barra_cuda@katamail.com>
> > (handling failure of blk_rq_append_bio(), only call bio_get() after
> > blk_rq_append_bio() returns OK)
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-map.c                    | 38 ++++++++++++++++++++++----------------
> >  drivers/scsi/osd/osd_initiator.c   |  4 +++-
> >  drivers/target/target_core_pscsi.c |  4 ++--
> >  include/linux/blkdev.h             |  2 +-
> >  4 files changed, 28 insertions(+), 20 deletions(-)
> >
> > diff --git a/block/blk-map.c b/block/blk-map.c
> > index b21f8e86f120..d3a94719f03f 100644
> > --- a/block/blk-map.c
> > +++ b/block/blk-map.c
> > @@ -12,22 +12,29 @@
> >  #include "blk.h"
> >  
> >  /*
> > - * Append a bio to a passthrough request.  Only works can be merged into
> > - * the request based on the driver constraints.
> > + * Append a bio to a passthrough request.  Only works if the bio can be merged
> > + * into the request based on the driver constraints.
> >   */
> > -int blk_rq_append_bio(struct request *rq, struct bio *bio)
> > +int blk_rq_append_bio(struct request *rq, struct bio **bio)
> >  {
> > -	blk_queue_bounce(rq->q, &bio);
> > +	struct bio *orig_bio = *bio;
> > +
> > +	blk_queue_bounce(rq->q, bio);
> >  
> >  	if (!rq->bio) {
> > -		blk_rq_bio_prep(rq->q, rq, bio);
> > +		blk_rq_bio_prep(rq->q, rq, *bio);
> >  	} else {
> > -		if (!ll_back_merge_fn(rq->q, rq, bio))
> > +		if (!ll_back_merge_fn(rq->q, rq, *bio)) {
> > +			if (orig_bio != *bio) {
> > +				bio_put(*bio);
> > +				*bio = orig_bio;
> 
> 1) Suppose the bio *does* bounce, which allocates some pages from the
> DMA pool, and subsequently this test fails. bio_put won't deallocate
> those pages. Prior to caa4b02476e3, it would be done by bio_endio, which
> is IMHO the correct way to do it.

You are right, that is my fault, and will review your patch later.

> 
> 2) What if the bio bounces and goes into the splitting path of
> __blk_queue_bounce. That would submit the request, which would be a
> problem should this test or some other later fail, cancelling the whole
> request.

When bio comes here, bio_is_passthrough() is true, it won't go into that
path, so needn't to worry.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-01-25 13:58   ` [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio Jiří Paleček
@ 2018-01-30 12:53     ` Ming Lei
  2018-01-30 15:24       ` Jiri Palecek
  0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2018-01-30 12:53 UTC (permalink / raw)
  To: Jiří Paleček
  Cc: linux-block, Ming Lei, Jens Axboe, Christoph Hellwig

On Thu, Jan 25, 2018 at 9:58 PM, Ji=C5=99=C3=AD Pale=C4=8Dek <jpalecek@web.=
de> wrote:
>  Avoids page leak from bounced requests
> ---
>  block/blk-map.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-map.c b/block/blk-map.c
> index d3a94719f03f..702d68166689 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **=
bio)
>         } else {
>                 if (!ll_back_merge_fn(rq->q, rq, *bio)) {
>                         if (orig_bio !=3D *bio) {
> -                               bio_put(*bio);
> +                               bio_inc_remaining(orig_bio);
> +                               bio_endio(*bio);

'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bou=
nced
bio, otherwise this patch is fine.

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-01-30 12:53     ` Ming Lei
@ 2018-01-30 15:24       ` Jiri Palecek
  2018-01-31  5:24         ` Ming Lei
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Palecek @ 2018-01-30 15:24 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Jens Axboe, Christoph Hellwig


On 1/30/18 1:53 PM, Ming Lei wrote:
> On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček <jpalecek@web.de> wrote:
>>   Avoids page leak from bounced requests
>> ---
>>   block/blk-map.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-map.c b/block/blk-map.c
>> index d3a94719f03f..702d68166689 100644
>> --- a/block/blk-map.c
>> +++ b/block/blk-map.c
>> @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
>>          } else {
>>                  if (!ll_back_merge_fn(rq->q, rq, *bio)) {
>>                          if (orig_bio != *bio) {
>> -                               bio_put(*bio);
>> +                               bio_inc_remaining(orig_bio);
>> +                               bio_endio(*bio);
> 'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bounced
> bio, otherwise this patch is fine.

I believe it is needed or at least desirable. The situation when the 
request bounced is like this

bio (bounced) . bi_private ---> orig_bio

and at the end of bounce_end_io, bio_endio(bio->bi_private) [which is 
bio_endio(orig_bio) in our case] is called. That doesn't have any effect 
on __blk_rq_map_user_iov; its bios have .bi_end_io==0, therefore one 
call more or less doesn't matter. However, for other callers, like 
osd_initiator.c, this is not the case. They pass bios which have 
bi_end_io, and might be surprised if this was called unexpectedly.

Before  caa4b02476e3, blk_rq_append_request wouldn't touch/delete the 
passed bio at all under any circumstances. I believe it should stay that 
way and incrementing the remaining counter, which effectively nullifies 
the extra bio_endio call, does that pretty straightforwardly.

Regards

     Jiri Palecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-01-30 15:24       ` Jiri Palecek
@ 2018-01-31  5:24         ` Ming Lei
  2018-01-31 22:15           ` Jiri Palecek
  2018-02-20 15:21           ` Jiri Palecek
  0 siblings, 2 replies; 19+ messages in thread
From: Ming Lei @ 2018-01-31  5:24 UTC (permalink / raw)
  To: Jiri Palecek; +Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig

On Tue, Jan 30, 2018 at 04:24:14PM +0100, Jiri Palecek wrote:
> 
> On 1/30/18 1:53 PM, Ming Lei wrote:
> > On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček <jpalecek@web.de> wrote:
> > >   Avoids page leak from bounced requests
> > > ---
> > >   block/blk-map.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/blk-map.c b/block/blk-map.c
> > > index d3a94719f03f..702d68166689 100644
> > > --- a/block/blk-map.c
> > > +++ b/block/blk-map.c
> > > @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
> > >          } else {
> > >                  if (!ll_back_merge_fn(rq->q, rq, *bio)) {
> > >                          if (orig_bio != *bio) {
> > > -                               bio_put(*bio);
> > > +                               bio_inc_remaining(orig_bio);
> > > +                               bio_endio(*bio);
> > 'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bounced
> > bio, otherwise this patch is fine.
> 
> I believe it is needed or at least desirable. The situation when the request
> bounced is like this
> 
> bio (bounced) . bi_private ---> orig_bio
> 
> and at the end of bounce_end_io, bio_endio(bio->bi_private) [which is
> bio_endio(orig_bio) in our case] is called. That doesn't have any effect on
> __blk_rq_map_user_iov; its bios have .bi_end_io==0, therefore one call more
> or less doesn't matter. However, for other callers, like osd_initiator.c,
> this is not the case. They pass bios which have bi_end_io, and might be
> surprised if this was called unexpectedly.

OK, I think it is good to follow the rule of not calling .bi_end_io() in
the failure path, just like __blk_rq_map_user_iov()/blk_rq_map_kern().

But seems pscsi_map_sg() depends on bio_endio(orig_bio) to free the 'orig_bio',
could you fix it in this patch too?

> 
> Before  caa4b02476e3, blk_rq_append_request wouldn't touch/delete the passed
> bio at all under any circumstances. I believe it should stay that way and
> incrementing the remaining counter, which effectively nullifies the extra
> bio_endio call, does that pretty straightforwardly.

Seems too tricky to use bio_inc_remaining() for avoiding bio_endio(orig_bio),
if we have to do that, I suggest to add comment on that.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-01-31  5:24         ` Ming Lei
@ 2018-01-31 22:15           ` Jiri Palecek
  2018-02-20 15:21           ` Jiri Palecek
  1 sibling, 0 replies; 19+ messages in thread
From: Jiri Palecek @ 2018-01-31 22:15 UTC (permalink / raw)
  To: Ming Lei; +Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig


On 1/31/18 6:24 AM, Ming Lei wrote:
> On Tue, Jan 30, 2018 at 04:24:14PM +0100, Jiri Palecek wrote:
>> On 1/30/18 1:53 PM, Ming Lei wrote:
>>> On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček <jpalecek@web.de> wrote:
>>>>    Avoids page leak from bounced requests
>>>> ---
>>>>    block/blk-map.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/blk-map.c b/block/blk-map.c
>>>> index d3a94719f03f..702d68166689 100644
>>>> --- a/block/blk-map.c
>>>> +++ b/block/blk-map.c
>>>> @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
>>>>           } else {
>>>>                   if (!ll_back_merge_fn(rq->q, rq, *bio)) {
>>>>                           if (orig_bio != *bio) {
>>>> -                               bio_put(*bio);
>>>> +                               bio_inc_remaining(orig_bio);
>>>> +                               bio_endio(*bio);
>>> 'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bounced
>>> bio, otherwise this patch is fine.
>> I believe it is needed or at least desirable. The situation when the request
>> bounced is like this
>>
>> bio (bounced) . bi_private ---> orig_bio
>>
>> and at the end of bounce_end_io, bio_endio(bio->bi_private) [which is
>> bio_endio(orig_bio) in our case] is called. That doesn't have any effect on
>> __blk_rq_map_user_iov; its bios have .bi_end_io==0, therefore one call more
>> or less doesn't matter. However, for other callers, like osd_initiator.c,
>> this is not the case. They pass bios which have bi_end_io, and might be
>> surprised if this was called unexpectedly.
> OK, I think it is good to follow the rule of not calling .bi_end_io() in
> the failure path, just like __blk_rq_map_user_iov()/blk_rq_map_kern().
>
> But seems pscsi_map_sg() depends on bio_endio(orig_bio) to free the 'orig_bio',
> could you fix it in this patch too?
Yes, there seems to be leak in the error path of that code. However, 
that was there for more than a year so I didn't think that was urgent. 
I'll have a look at it, but I would prefer if someone familiar with 
pscsi had their say as well.
>
>> Before  caa4b02476e3, blk_rq_append_request wouldn't touch/delete the passed
>> bio at all under any circumstances. I believe it should stay that way and
>> incrementing the remaining counter, which effectively nullifies the extra
>> bio_endio call, does that pretty straightforwardly.
> Seems too tricky to use bio_inc_remaining() for avoiding bio_endio(orig_bio),
> if we have to do that, I suggest to add comment on that.

Okay. I thought that it didn't really reach the level of sophistication 
otherwise seen in block layer code :)

Regards

     Jiri Palecek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-01-31  5:24         ` Ming Lei
  2018-01-31 22:15           ` Jiri Palecek
@ 2018-02-20 15:21           ` Jiri Palecek
  2018-02-21 16:20             ` Boaz Harrosh
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Palecek @ 2018-02-20 15:21 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jiri Palecek, Ming Lei, linux-block, Jens Axboe,
	Christoph Hellwig, Boaz Harrosh, James E.J. Bottomley,
	Martin K. Petersen, Nicholas A. Bellinger

[-- Attachment #1: Type: text/plain, Size: 3486 bytes --]

Hello,

I had a look at the callers of blk_rq_append_bio and checked the
callers. Some changes may need to be done there and I'd like the input
of their maintainers as well before finalising the patch.

Ming Lei <ming.lei@redhat.com> writes:

> On Tue, Jan 30, 2018 at 04:24:14PM +0100, Jiri Palecek wrote:
>> 
>> On 1/30/18 1:53 PM, Ming Lei wrote:
>> > On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček <jpalecek@web.de> wrote:
>> > >   Avoids page leak from bounced requests
>> > > ---
>> > >   block/blk-map.c | 3 ++-
>> > >   1 file changed, 2 insertions(+), 1 deletion(-)
>> > > 
>> > > diff --git a/block/blk-map.c b/block/blk-map.c
>> > > index d3a94719f03f..702d68166689 100644
>> > > --- a/block/blk-map.c
>> > > +++ b/block/blk-map.c
>> > > @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
>> > >          } else {
>> > >                  if (!ll_back_merge_fn(rq->q, rq, *bio)) {
>> > >                          if (orig_bio != *bio) {
>> > > -                               bio_put(*bio);
>> > > +                               bio_inc_remaining(orig_bio);
>> > > +                               bio_endio(*bio);
>> > 'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bounced
>> > bio, otherwise this patch is fine.
>> 
>> I believe it is needed or at least desirable. The situation when the request
>> bounced is like this
>> 
>> bio (bounced) . bi_private ---> orig_bio
>> 
>> and at the end of bounce_end_io, bio_endio(bio->bi_private) [which is
>> bio_endio(orig_bio) in our case] is called. That doesn't have any effect on
>> __blk_rq_map_user_iov; its bios have .bi_end_io==0, therefore one call more
>> or less doesn't matter. However, for other callers, like osd_initiator.c,
>> this is not the case. They pass bios which have bi_end_io, and might be
>> surprised if this was called unexpectedly.
>
> OK, I think it is good to follow the rule of not calling .bi_end_io() in
> the failure path, just like __blk_rq_map_user_iov()/blk_rq_map_kern().
>
> But seems pscsi_map_sg() depends on bio_endio(orig_bio) to free the 'orig_bio',
> could you fix it in this patch too?

I've come up with the following patch. Some notes:

1) First of all, I think your suggestion to call bio_endio of the
   original bio in blk_rq_append_bio is right. It would make things a
   bit easier for the callers and those who I suspected need to hold on
   the bio are actually just ignorant about errors. However, it does
   change the api. So if you agree and the other parts are OK too, I'd
   make a patch without the bio_inc_remaining.

2) The osd_initiator.c seems to be a bit oblivious about errors, leaking
   not only the bios, but the request as well. I added some functions
   for proper cleanup there, with considerations:
   
   - I think it's better to unhook the .bi_next link of the bio before
     adding it to the request, because blk_rq_append_bio uses that for
     its own purposes.

   - Once the bios from osd_request have been spent (added to a
     blk_request), they can be NULL-ified.

   I'd like hear if these are OK.

3) PSCSI needs to free the bios in pscsi_map_sg, but also needs to clear
   the bios from the request in pscsi_execute_cmd in case of partial
   errors (ie. first sg segment makes it to the request, second
   fails). To my understanding, blk_put_request is insufficient in that
   case.

Regards
    Jiri Palecek


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0002-Some-error-paths.patch --]
[-- Type: text/x-diff, Size: 3067 bytes --]

>From fca495c6bf41775be152e7f7f00be6f0dc746ac3 Mon Sep 17 00:00:00 2001
From: =?iso8859-2?q?Ji=F8=ED=20Pale=E8ek?= <jpalecek@web.de>
Date: Sun, 4 Feb 2018 21:55:56 +0100
Subject: [PATCH 2/2] Some error paths

---
 block/blk-map.c                    |  4 +++-
 drivers/scsi/osd/osd_initiator.c   | 29 +++++++++++++++++++++++++----
 drivers/target/target_core_pscsi.c |  4 +++-
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 702d68166689..8378f4ddd419 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -246,7 +246,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	ret = blk_rq_append_bio(rq, &bio);
 	if (unlikely(ret)) {
 		/* request is too big */
-		bio_put(orig_bio);
+		bio_endio(orig_bio);
 		return ret;
 	}
 
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index e18877177f1b..f352fdda52b9 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -437,6 +437,16 @@ static void _osd_free_seg(struct osd_request *or __unused,
 	seg->alloc_size = 0;
 }
 
+static void _osd_end_bios(struct bio *bio)
+{
+	struct bio *nxt = NULL;
+	while (bio) {
+		nxt = bio->bi_next;
+		bio_endio(bio);
+		bio = nxt;
+	}
+}
+
 static void _put_request(struct request *rq)
 {
 	/*
@@ -469,6 +479,10 @@ void osd_end_request(struct osd_request *or)
 	_osd_free_seg(or, &or->set_attr);
 	_osd_free_seg(or, &or->cdb_cont);
 
+	/* Only in case of errors should these be non-NULL */
+	_osd_end_bios(or->in.bio);
+	_osd_end_bios(or->out.bio);
+
 	_osd_request_free(or);
 }
 EXPORT_SYMBOL(osd_end_request);
@@ -1575,13 +1589,20 @@ static struct request *_make_request(struct request_queue *q, bool has_write,
 	if (IS_ERR(req))
 		return req;
 
-	for_each_bio(bio) {
-		struct bio *bounce_bio = bio;
+	while(bio) {
+		struct bio *nxt = bio->bi_next;
+		bio->bi_next = NULL;
 
-		ret = blk_rq_append_bio(req, &bounce_bio);
-		if (ret)
+		ret = blk_rq_append_bio(req, &bio);
+		if (ret) {
+			_put_request(req);
+			bio_endio(bio);
+			oii->bio = nxt;
 			return ERR_PTR(ret);
+		}
+		bio = nxt;
 	}
+	oii->bio = NULL;
 
 	return req;
 }
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 0d99b242e82e..42c24356d683 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -922,6 +922,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 
 				rc = blk_rq_append_bio(req, &bio);
 				if (rc) {
+					bio_put(bio);
 					pr_err("pSCSI: failed to append bio\n");
 					goto fail;
 				}
@@ -940,6 +941,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	if (bio) {
 		rc = blk_rq_append_bio(req, &bio);
 		if (rc) {
+			bio_put(bio);
 			pr_err("pSCSI: failed to append bio\n");
 			goto fail;
 		}
@@ -1016,7 +1018,7 @@ pscsi_execute_cmd(struct se_cmd *cmd)
 	return 0;
 
 fail_put_request:
-	blk_put_request(req);
+	blk_end_request_all(req, BLK_STS_IOERR);
 fail:
 	kfree(pt);
 	return ret;
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-02-20 15:21           ` Jiri Palecek
@ 2018-02-21 16:20             ` Boaz Harrosh
  2018-02-21 17:12               ` Boaz Harrosh
       [not found]               ` <058e8999-3f89-ae20-ee38-229c78d6d1c8@web.de>
  0 siblings, 2 replies; 19+ messages in thread
From: Boaz Harrosh @ 2018-02-21 16:20 UTC (permalink / raw)
  To: Jiri Palecek, Ming Lei
  Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig,
	James E.J. Bottomley, Martin K. Petersen, Nicholas A. Bellinger

On 20/02/18 17:21, Jiri Palecek wrote:
> Hello,
> 
> I had a look at the callers of blk_rq_append_bio and checked the
> callers. Some changes may need to be done there and I'd like the input
> of their maintainers as well before finalising the patch.
> 
> Ming Lei <ming.lei@redhat.com> writes:
> 
>> On Tue, Jan 30, 2018 at 04:24:14PM +0100, Jiri Palecek wrote:
>>>
>>> On 1/30/18 1:53 PM, Ming Lei wrote:
>>>> On Thu, Jan 25, 2018 at 9:58 PM, Jiří Paleček <jpalecek@web.de> wrote:
>>>>>   Avoids page leak from bounced requests
>>>>> ---
>>>>>   block/blk-map.c | 3 ++-
>>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block/blk-map.c b/block/blk-map.c
>>>>> index d3a94719f03f..702d68166689 100644
>>>>> --- a/block/blk-map.c
>>>>> +++ b/block/blk-map.c
>>>>> @@ -26,7 +26,8 @@ int blk_rq_append_bio(struct request *rq, struct bio **bio)
>>>>>          } else {
>>>>>                  if (!ll_back_merge_fn(rq->q, rq, *bio)) {
>>>>>                          if (orig_bio != *bio) {
>>>>> -                               bio_put(*bio);
>>>>> +                               bio_inc_remaining(orig_bio);
>>>>> +                               bio_endio(*bio);
>>>> 'bio_inc_remaining(orig_bio);' shouldn't be needed since we don't chain bounced
>>>> bio, otherwise this patch is fine.
>>>
>>> I believe it is needed or at least desirable. The situation when the request
>>> bounced is like this
>>>
>>> bio (bounced) . bi_private ---> orig_bio
>>>
>>> and at the end of bounce_end_io, bio_endio(bio->bi_private) [which is
>>> bio_endio(orig_bio) in our case] is called. That doesn't have any effect on
>>> __blk_rq_map_user_iov; its bios have .bi_end_io==0, therefore one call more
>>> or less doesn't matter. However, for other callers, like osd_initiator.c,
>>> this is not the case. They pass bios which have bi_end_io, and might be
>>> surprised if this was called unexpectedly.
>>
>> OK, I think it is good to follow the rule of not calling .bi_end_io() in
>> the failure path, just like __blk_rq_map_user_iov()/blk_rq_map_kern().
>>
>> But seems pscsi_map_sg() depends on bio_endio(orig_bio) to free the 'orig_bio',
>> could you fix it in this patch too?
> 
> I've come up with the following patch. Some notes:
> 
> 1) First of all, I think your suggestion to call bio_endio of the
>    original bio in blk_rq_append_bio is right. It would make things a
>    bit easier for the callers and those who I suspected need to hold on
>    the bio are actually just ignorant about errors. However, it does
>    change the api. So if you agree and the other parts are OK too, I'd
>    make a patch without the bio_inc_remaining.
> 
> 2) The osd_initiator.c seems to be a bit oblivious about errors, leaking
>    not only the bios, but the request as well. I added some functions
>    for proper cleanup there, with considerations:
>    
>    - I think it's better to unhook the .bi_next link of the bio before
>      adding it to the request, because blk_rq_append_bio uses that for
>      its own purposes.
> 
>    - Once the bios from osd_request have been spent (added to a
>      blk_request), they can be NULL-ified.
> 

I have not followed closely the issue but ...

No! the osd_initiator.c is completely out of scope. And does not leak any bios
and need not to be fixed. let me explain.

These are BIDI commands that travel as a couple of chained requests. They are
sent as BLOCK_PC command and complete or fail as one hole unit. The system is not
allowed (And does not know how) to split them or complete them partially. This is
not an FS read or write request of data. But a unique OSD request that the system
knows nothing about. The all scsi-command is specially crafted by the caller.
It is all properly destroyed at osd_request end of life (sync or async)
(NOTE there is no bio-end function only req-end)

again not followed closely but if it is about the free of the bounce buffers
then I would just disable bouncing for osd all together. There are only a very
few (2) drivers that support bidi for osd. iscsi and iser so we know those are
totally cool and this is all not an existing problem.

Thanks
Boaz

>    I'd like hear if these are OK.
> 
> 3) PSCSI needs to free the bios in pscsi_map_sg, but also needs to clear
>    the bios from the request in pscsi_execute_cmd in case of partial
>    errors (ie. first sg segment makes it to the request, second
>    fails). To my understanding, blk_put_request is insufficient in that
>    case.
> 
> Regards
>     Jiri Palecek
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-02-21 16:20             ` Boaz Harrosh
@ 2018-02-21 17:12               ` Boaz Harrosh
       [not found]               ` <058e8999-3f89-ae20-ee38-229c78d6d1c8@web.de>
  1 sibling, 0 replies; 19+ messages in thread
From: Boaz Harrosh @ 2018-02-21 17:12 UTC (permalink / raw)
  To: Jiri Palecek, Ming Lei
  Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig,
	James E.J. Bottomley, Martin K. Petersen, Nicholas A. Bellinger

On 21/02/18 18:20, Boaz Harrosh wrote:
<>
> again not followed closely but if it is about the free of the bounce buffers
> then I would just disable bouncing for osd all together. There are only a very
> few (2) drivers that support bidi for osd. iscsi and iser so we know those are
> totally cool and this is all not an existing problem.
> 

And also it is possible that I have at all not understood the issue
please bring me up to speed. I would like to help any way I can.

> Thanks
> Boaz
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
       [not found]               ` <058e8999-3f89-ae20-ee38-229c78d6d1c8@web.de>
@ 2018-02-28 17:38                 ` Boaz Harrosh
  2018-02-28 17:50                   ` Boaz Harrosh
  0 siblings, 1 reply; 19+ messages in thread
From: Boaz Harrosh @ 2018-02-28 17:38 UTC (permalink / raw)
  To: Jiri Palecek, Ming Lei
  Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig,
	James E.J. Bottomley, Martin K. Petersen, Nicholas A. Bellinger

On 27/02/18 16:44, Jiri Palecek wrote:
<>
>> These are BIDI commands that travel as a couple of chained requests. They are
>> sent as BLOCK_PC command and complete or fail as one hole unit. The system is not
>> allowed (And does not know how) to split them or complete them partially. This is
>> not an FS read or write request of data. But a unique OSD request that the system
>> knows nothing about. The all scsi-command is specially crafted by the caller.
>> It is all properly destroyed at osd_request end of life (sync or async)
>> (NOTE there is no bio-end function only req-end)
> 
> That's correct, however, the assumed leak would happen when the
> preparation of the request fails.
> 
> The whole issue started with possible bounce buffer leaks in
> blk_rq_append_request, which didn't happen before it started to
> handle bouncing. However, Ming Lei noticed it would be simpler and
> easier to also free the original bio, as opposed to just the bounce
> buffer, on error. Because that wasn't so previously, the question now
> stands, what do the callers likely expect?
> 

Sigh! That bouncing thing. It is so not relevant for osd. I wish we could
just drop the bouncing and return an error, if it was ever needed
(because it never is for osd)

> This leads us to osd_initiator.c, which IMHO doesn't expect anything
> and simply lacks any sane handling of such possibility. If you say
> there can't be any leaks, please explain to me this riddle. Given the
> code:
> 
> static struct request *_make_request(struct request_queue *q, bool has_write,
>                   struct _osd_io_info *oii, gfp_t flags)
> {
>     struct request *req;
>     struct bio *bio = oii->bio;
>     int ret;
> 
>     req = blk_get_request(q, has_write ? REQ_OP_SCSI_OUT : REQ_OP_SCSI_IN,
>             flags); /** 1 **/
>     if (IS_ERR(req))
>         return req;
> 
>     for_each_bio(bio) {
>         struct bio *bounce_bio = bio;
> 
>         ret = blk_rq_append_bio(req, &bounce_bio);

[Sorry still did not see the all code]
What is the meaning of the pointer-to-pointer-to-bio here? it used to
be relevant to blk_queue_bounce, that I guess you internalized into
blk_rq_append_bio() but why do we need it outside the call now?

>         if (ret)
>             return ERR_PTR(ret); /** 2 **/
>     }
> 
>     return req;
> }
> 
> 1. If this function exits by line labeled "2", where is struct
>    request *req allocated on line 1 freed?

You are probably right, if you want to fix it? This used to never fail because
blk_queue_bounce() retuning void. And blk_rq_append_bio could only fail if
ll_back_merge_fn would fail, but for OSD supporting drivers this would never fail.
And also those checks were already preformed before (when more segments were added).

So in practice this could never fail and never showed in testing. But I agree
this is a layering violation and the code is wrong.

> 2. If this function exits by line 2, where are the bio(s) in oii->bio
>    freed? What if it fails on the second bio?
> 
> I assume osd_end_request would be called afterwards (as in exofs_read_kern).
> 

This code always scared me to bits because if you look closely it does nothing
It does not actually builds the chain it relays on bio->bi_next not being modified
and so we loop on doing nothing (resting the bio->bi_next to the same thing it was before)

So yes oii->bio(s) are freed in osd_end_request and/or by callers because the bios
might have one more ref on them, because they might be needed by caller
to finish up.

> Regards
>    Jiri Palecek
> 

Thank you for looking into this, sorry for the headache this code gives you.
If there was a way to just fail if bounce is needed that could be just fine
for osd. But else I guess that error exit needs fixing.

Based on v4.15 code feel free to add to your patchset
where needed

----
Don't leak the request if blk_rq_append_bio fails

Sign-of-by: Boaz Harrosh <ooo@electrozaur.com>
----
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index 2f2a991..3c97e0e 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -1572,8 +1572,10 @@ static struct request *_make_request(struct request_queue *q, bool has_write,
 
 		blk_queue_bounce(req->q, &bounce_bio);
 		ret = blk_rq_append_bio(req, bounce_bio);
-		if (ret)
+		if (ret) {
+			_put_request(req);
 			return ERR_PTR(ret);
+		}
 	}
 
 	return req;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio
  2018-02-28 17:38                 ` Boaz Harrosh
@ 2018-02-28 17:50                   ` Boaz Harrosh
  0 siblings, 0 replies; 19+ messages in thread
From: Boaz Harrosh @ 2018-02-28 17:50 UTC (permalink / raw)
  To: Jiri Palecek, Ming Lei
  Cc: Ming Lei, linux-block, Jens Axboe, Christoph Hellwig,
	James E.J. Bottomley, Martin K. Petersen, Nicholas A. Bellinger

On 28/02/18 19:38, Boaz Harrosh wrote:
<>
> Based on v4.15 code feel free to add to your patchset
> where needed
> 

Sorry is what happens when you work on so many Linux versions at
the same time. This one is based on v4.15

> ----
> Don't leak the request if blk_rq_append_bio fails
> 
> Sign-of-by: Boaz Harrosh <ooo@electrozaur.com>
> ----
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index e188771..58782dc 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -1579,8 +1579,10 @@ static struct request *_make_request(struct request_queue *q, bool has_write,
 		struct bio *bounce_bio = bio;
 
 		ret = blk_rq_append_bio(req, &bounce_bio);
-		if (ret)
+		if (ret) {
+			_put_request(req);
 			return ERR_PTR(ret);
+		}
 	}
 
 	return req;

^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-02-28 17:50 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-18  7:40 [PATCH 0/2] block: fix two regressiones on bounce Ming Lei
2017-12-18  7:40 ` [PATCH 1/2] block: don't let passthrough IO go into .make_request_fn() Ming Lei
2017-12-18  7:40 ` [PATCH 2/2] block: fix blk_rq_append_bio Ming Lei
2018-01-25 13:58   ` [PATCH] Use bio_endio instead of bio_put in error path of blk_rq_append_bio Jiří Paleček
2018-01-30 12:53     ` Ming Lei
2018-01-30 15:24       ` Jiri Palecek
2018-01-31  5:24         ` Ming Lei
2018-01-31 22:15           ` Jiri Palecek
2018-02-20 15:21           ` Jiri Palecek
2018-02-21 16:20             ` Boaz Harrosh
2018-02-21 17:12               ` Boaz Harrosh
     [not found]               ` <058e8999-3f89-ae20-ee38-229c78d6d1c8@web.de>
2018-02-28 17:38                 ` Boaz Harrosh
2018-02-28 17:50                   ` Boaz Harrosh
2018-01-30  1:42   ` [2/2] block: fix blk_rq_append_bio Jiri Palecek
2018-01-30  4:10     ` Ming Lei
2017-12-18 20:22 ` [PATCH 0/2] block: fix two regressiones on bounce Jens Axboe
2017-12-19 16:25   ` Vincent Batts
2018-01-02 21:16   ` Vincent Batts
2018-01-03  1:29     ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).