linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] lightnvm: pblk: fix bio leak on large sized io
       [not found] <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p1>
@ 2019-01-30  1:53 ` 김찬솔
  2019-01-30  6:58   ` Javier González
                     ` (2 more replies)
  2019-02-01  8:22 ` Chansol Kim
  1 sibling, 3 replies; 10+ messages in thread
From: 김찬솔 @ 2019-01-30  1:53 UTC (permalink / raw)
  To: linux-block; +Cc: matias.bjorling, javier


Changes:
 1. Function pblk_rw_io to get bio* as a reference
 2. In pblk_rw_io bio_put call on read case removed

A fix to address issue where
 1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
 2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
 3. In blk_queue_split, when there is a split, the original bio* (0xA)
    is passed to generic_make_requests, and the newly allocated bio is
    returned
 4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
    that is not the one returned by blk_queue_split
 5. As a result bio_endio is not called on the newly allocated bio.

Signed-off-by: chansol.kim <chansol.kim@samsung.com>
---
 drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index b57f764d..4efc929 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
 struct bio_set pblk_bio_set;
 
 static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
-			  struct bio *bio)
+			  struct bio **bio)
 {
-	int ret;
-
 	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
 	 * constraint. Writes can be of arbitrary size.
 	 */
-	if (bio_data_dir(bio) == READ) {
-		blk_queue_split(q, &bio);
-		ret = pblk_submit_read(pblk, bio);
-		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
-			bio_put(bio);
-
-		return ret;
+	if (bio_data_dir(*bio) == READ) {
+		blk_queue_split(q, bio);
+		return pblk_submit_read(pblk, *bio);
 	}
 
 	/* Prevent deadlock in the case of a modest LUN configuration and large
 	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
 	 * available for user I/O.
 	 */
-	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
-		blk_queue_split(q, &bio);
+	if (pblk_get_secs(*bio) > pblk_rl_max_io(&pblk->rl))
+		blk_queue_split(q, bio);
 
-	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
+	return pblk_write_to_cache(pblk, *bio, PBLK_IOTYPE_USER);
 }
 
 static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
@@ -69,7 +63,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
 		}
 	}
 
-	switch (pblk_rw_io(q, pblk, bio)) {
+	switch (pblk_rw_io(q, pblk, &bio)) {
 	case NVM_IO_ERR:
 		bio_io_error(bio);
 		break;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-01-30  1:53 ` [PATCH] lightnvm: pblk: fix bio leak on large sized io 김찬솔
@ 2019-01-30  6:58   ` Javier González
  2019-01-30 14:06   ` Matias Bjørling
  2019-01-31 21:14   ` Matias Bjørling
  2 siblings, 0 replies; 10+ messages in thread
From: Javier González @ 2019-01-30  6:58 UTC (permalink / raw)
  To: chansol.kim; +Cc: linux-block, Matias Bjorling

[-- Attachment #1: Type: text/plain, Size: 2923 bytes --]

> On 30 Jan 2019, at 02.53, 김찬솔 <chansol.kim@samsung.com> wrote:
> 
> 
> Changes:
> 1. Function pblk_rw_io to get bio* as a reference
> 2. In pblk_rw_io bio_put call on read case removed
> 
> A fix to address issue where
> 1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
> 2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
> 3. In blk_queue_split, when there is a split, the original bio* (0xA)
>    is passed to generic_make_requests, and the newly allocated bio is
>    returned
> 4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>    that is not the one returned by blk_queue_split
> 5. As a result bio_endio is not called on the newly allocated bio.
> 
> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
> ---
> drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
> 1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index b57f764d..4efc929 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
> struct bio_set pblk_bio_set;
> 
> static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> -			  struct bio *bio)
> +			  struct bio **bio)
> {
> -	int ret;
> -
> 	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
> 	 * constraint. Writes can be of arbitrary size.
> 	 */
> -	if (bio_data_dir(bio) == READ) {
> -		blk_queue_split(q, &bio);
> -		ret = pblk_submit_read(pblk, bio);
> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
> -			bio_put(bio);
> -
> -		return ret;
> +	if (bio_data_dir(*bio) == READ) {
> +		blk_queue_split(q, bio);
> +		return pblk_submit_read(pblk, *bio);
> 	}
> 
> 	/* Prevent deadlock in the case of a modest LUN configuration and large
> 	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
> 	 * available for user I/O.
> 	 */
> -	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> -		blk_queue_split(q, &bio);
> +	if (pblk_get_secs(*bio) > pblk_rl_max_io(&pblk->rl))
> +		blk_queue_split(q, bio);
> 
> -	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> +	return pblk_write_to_cache(pblk, *bio, PBLK_IOTYPE_USER);
> }
> 
> static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> @@ -69,7 +63,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> 		}
> 	}
> 
> -	switch (pblk_rw_io(q, pblk, bio)) {
> +	switch (pblk_rw_io(q, pblk, &bio)) {
> 	case NVM_IO_ERR:
> 		bio_io_error(bio);
> 		break;
> --
> 2.7.4

Thanks for the fix Chansol.

Matias: I see that checkpatch complains about the signed-ff, but it
seems to be due to the name mismatch english / korean - not sure if
there is a good fix for this.

Reviewed-by: Javier González <javier@javigon.com>


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-01-30  1:53 ` [PATCH] lightnvm: pblk: fix bio leak on large sized io 김찬솔
  2019-01-30  6:58   ` Javier González
@ 2019-01-30 14:06   ` Matias Bjørling
  2019-01-30 15:02     ` Javier González
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p2>
  2019-01-31 21:14   ` Matias Bjørling
  2 siblings, 2 replies; 10+ messages in thread
From: Matias Bjørling @ 2019-01-30 14:06 UTC (permalink / raw)
  To: chansol.kim, linux-block; +Cc: matias.bjorling, javier

On 1/30/19 2:53 AM, 김찬솔 wrote:
> 
> Changes:
>   1. Function pblk_rw_io to get bio* as a reference
>   2. In pblk_rw_io bio_put call on read case removed
> 
> A fix to address issue where
>   1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>   2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>   3. In blk_queue_split, when there is a split, the original bio* (0xA)
>      is passed to generic_make_requests, and the newly allocated bio is
>      returned
>   4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>      that is not the one returned by blk_queue_split
>   5. As a result bio_endio is not called on the newly allocated bio.

The commit message needs a bit of massaging. The Changes: paragraph is 
not needed (that is what the patch is for). Also, please redo the second 
paragraph in without the bullets.

> 
> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
> ---
>   drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>   1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index b57f764d..4efc929 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>   struct bio_set pblk_bio_set;
>   
>   static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> -			  struct bio *bio)
> +			  struct bio **bio)

I am not sure why the **bio change is needed. Can you elaborate?

>   {
> -	int ret;
> -
>   	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>   	 * constraint. Writes can be of arbitrary size.
>   	 */
> -	if (bio_data_dir(bio) == READ) {
> -		blk_queue_split(q, &bio);
> -		ret = pblk_submit_read(pblk, bio);
> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
> -			bio_put(bio);
> -
> -		return ret;
> +	if (bio_data_dir(*bio) == READ) {
> +		blk_queue_split(q, bio);
> +		return pblk_submit_read(pblk, *bio);
>   	}
>   
>   	/* Prevent deadlock in the case of a modest LUN configuration and large
>   	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
>   	 * available for user I/O.
>   	 */
> -	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> -		blk_queue_split(q, &bio);
> +	if (pblk_get_secs(*bio) > pblk_rl_max_io(&pblk->rl))
> +		blk_queue_split(q, bio);
>   
> -	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> +	return pblk_write_to_cache(pblk, *bio, PBLK_IOTYPE_USER);
>   }
>   
>   static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> @@ -69,7 +63,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>   		}
>   	}
>   
> -	switch (pblk_rw_io(q, pblk, bio)) {
> +	switch (pblk_rw_io(q, pblk, &bio)) {
>   	case NVM_IO_ERR:
>   		bio_io_error(bio);
>   		break;
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-01-30 14:06   ` Matias Bjørling
@ 2019-01-30 15:02     ` Javier González
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p2>
  1 sibling, 0 replies; 10+ messages in thread
From: Javier González @ 2019-01-30 15:02 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: chansol.kim, linux-block, Matias Bjorling

[-- Attachment #1: Type: text/plain, Size: 1870 bytes --]


> On 30 Jan 2019, at 15.06, Matias Bjørling <mb@lightnvm.io> wrote:
> 
> On 1/30/19 2:53 AM, 김찬솔 wrote:
>> Changes:
>>  1. Function pblk_rw_io to get bio* as a reference
>>  2. In pblk_rw_io bio_put call on read case removed
>> A fix to address issue where
>>  1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>  2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>  3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>     is passed to generic_make_requests, and the newly allocated bio is
>>     returned
>>  4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>     that is not the one returned by blk_queue_split
>>  5. As a result bio_endio is not called on the newly allocated bio.
> 
> The commit message needs a bit of massaging. The Changes: paragraph is not needed (that is what the patch is for). Also, please redo the second paragraph in without the bullets.
> 
>> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
>> ---
>>  drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>  1 file changed, 8 insertions(+), 14 deletions(-)
>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>> index b57f764d..4efc929 100644
>> --- a/drivers/lightnvm/pblk-init.c
>> +++ b/drivers/lightnvm/pblk-init.c
>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>  struct bio_set pblk_bio_set;
>>    static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>> -			  struct bio *bio)
>> +			  struct bio **bio)
> 
> I am not sure why the **bio change is needed. Can you elaborate?

bio_split creates a new bio and queues the original one again. If the
reference is not updated then the newly created bio is never freed when
NVM_IO_DONE is returned to pblk_make_rq(), thus resulting on a leak.

Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] lightnvm: pblk: fix bio leak on large sized io
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p2>
@ 2019-01-31  5:55       ` Chansol Kim
  0 siblings, 0 replies; 10+ messages in thread
From: Chansol Kim @ 2019-01-31  5:55 UTC (permalink / raw)
  To: Javier González, Matias Bjørling
  Cc: Chansol Kim, linux-block, Matias Bjorling

 

> -----Original Message-----
> From: Javier González [mailto:javier@javigon.com]
> Sent: Thursday, January 31, 2019 12:02 AM
> To: Matias Bjørling <mb@lightnvm.io>
> Cc: chansol.kim@samsung.com; linux-block@vger.kernel.org; Matias Bjorling <matias.bjorling@wdc.com>
> Subject: Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
> 
> 
> > On 30 Jan 2019, at 15.06, Matias Bjørling <mb@lightnvm.io> wrote:
> >
> > On 1/30/19 2:53 AM, 김찬솔 wrote:
> >> Changes:
> >>  1. Function pblk_rw_io to get bio* as a reference  2. In pblk_rw_io
> >> bio_put call on read case removed A fix to address issue where  1.
> >> pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
> >> 2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
> >> 3. In blk_queue_split, when there is a split, the original bio* (0xA)
> >>     is passed to generic_make_requests, and the newly allocated bio is
> >>     returned
> >>  4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
> >>     that is not the one returned by blk_queue_split  5. As a result
> >> bio_endio is not called on the newly allocated bio.
> >
> > The commit message needs a bit of massaging. The Changes: paragraph is not needed (that is what the
> patch is for). Also, please redo the second paragraph in without the bullets.
> >

Thank you for comment, I will format it better.

> >> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
> >> ---
> >>  drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
> >>  1 file changed, 8 insertions(+), 14 deletions(-) diff --git
> >> a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c index
> >> b57f764d..4efc929 100644
> >> --- a/drivers/lightnvm/pblk-init.c
> >> +++ b/drivers/lightnvm/pblk-init.c
> >> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);  struct bio_set
> >> pblk_bio_set;
> >>    static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> >> -			  struct bio *bio)
> >> +			  struct bio **bio)
> >
> > I am not sure why the **bio change is needed. Can you elaborate?
> 
> bio_split creates a new bio and queues the original one again. If the reference is not updated then
> the newly created bio is never freed when NVM_IO_DONE is returned to pblk_make_rq(), thus resulting on
> a leak.
> 
> Javier

Thank you Javier for the explanation.
Dear Matias, please review whether the above change is acceptabl. If further change is required, I will address that and include the above commit log amend along with it. Thank you.

Regards,
Chansol Kim


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-01-30  1:53 ` [PATCH] lightnvm: pblk: fix bio leak on large sized io 김찬솔
  2019-01-30  6:58   ` Javier González
  2019-01-30 14:06   ` Matias Bjørling
@ 2019-01-31 21:14   ` Matias Bjørling
  2 siblings, 0 replies; 10+ messages in thread
From: Matias Bjørling @ 2019-01-31 21:14 UTC (permalink / raw)
  To: chansol.kim, linux-block; +Cc: matias.bjorling, javier

On 1/30/19 2:53 AM, 김찬솔 wrote:
> 
> Changes:
>   1. Function pblk_rw_io to get bio* as a reference
>   2. In pblk_rw_io bio_put call on read case removed
> 
> A fix to address issue where
>   1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>   2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>   3. In blk_queue_split, when there is a split, the original bio* (0xA)
>      is passed to generic_make_requests, and the newly allocated bio is
>      returned
>   4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>      that is not the one returned by blk_queue_split
>   5. As a result bio_endio is not called on the newly allocated bio.
> 
> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
> ---
>   drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>   1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index b57f764d..4efc929 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>   struct bio_set pblk_bio_set;
>   
>   static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> -			  struct bio *bio)
> +			  struct bio **bio)
>   {
> -	int ret;
> -
>   	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>   	 * constraint. Writes can be of arbitrary size.
>   	 */
> -	if (bio_data_dir(bio) == READ) {
> -		blk_queue_split(q, &bio);
> -		ret = pblk_submit_read(pblk, bio);
> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
> -			bio_put(bio);

Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should 
achieve the same?

> -
> -		return ret;
> +	if (bio_data_dir(*bio) == READ) {
> +		blk_queue_split(q, bio);
> +		return pblk_submit_read(pblk, *bio);
>   	}
>   
>   	/* Prevent deadlock in the case of a modest LUN configuration and large
>   	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
>   	 * available for user I/O.
>   	 */
> -	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> -		blk_queue_split(q, &bio);
> +	if (pblk_get_secs(*bio) > pblk_rl_max_io(&pblk->rl))
> +		blk_queue_split(q, bio);
>   
> -	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> +	return pblk_write_to_cache(pblk, *bio, PBLK_IOTYPE_USER);
>   }
>   
>   static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> @@ -69,7 +63,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>   		}
>   	}
>   
> -	switch (pblk_rw_io(q, pblk, bio)) {
> +	switch (pblk_rw_io(q, pblk, &bio)) {
>   	case NVM_IO_ERR:
>   		bio_io_error(bio);
>   		break;
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] lightnvm: pblk: fix bio leak on large sized io
       [not found] <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p1>
  2019-01-30  1:53 ` [PATCH] lightnvm: pblk: fix bio leak on large sized io 김찬솔
@ 2019-02-01  8:22 ` Chansol Kim
  2019-02-05  9:23   ` Matias Bjørling
  1 sibling, 1 reply; 10+ messages in thread
From: Chansol Kim @ 2019-02-01  8:22 UTC (permalink / raw)
  To: Matias Bjørling, linux-block; +Cc: matias.bjorling, javier


On 01/31/19 22:14 PM, Matias Bjørling wrote:
> On 1/30/19 2:53 AM, 김찬솔 wrote:
>> 
>> Changes:
>>   1. Function pblk_rw_io to get bio* as a reference
>>   2. In pblk_rw_io bio_put call on read case removed
>> 
>> A fix to address issue where
>>   1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>   2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>   3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>      is passed to generic_make_requests, and the newly allocated bio is
>>      returned
>>   4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>      that is not the one returned by blk_queue_split
>>   5. As a result bio_endio is not called on the newly allocated bio.
>> 
>> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
>> ---
>>   drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>   1 file changed, 8 insertions(+), 14 deletions(-)
>> 
>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>> index b57f764d..4efc929 100644
>> --- a/drivers/lightnvm/pblk-init.c
>> +++ b/drivers/lightnvm/pblk-init.c
>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>   struct bio_set pblk_bio_set;
>>   
>>   static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>> -			  struct bio *bio)
>> +			  struct bio **bio)
>>   {
>> -	int ret;
>> -
>>   	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>>   	 * constraint. Writes can be of arbitrary size.
>>   	 */
>> -	if (bio_data_dir(bio) == READ) {
>> -		blk_queue_split(q, &bio);
>> -		ret = pblk_submit_read(pblk, bio);
>> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>> -			bio_put(bio);
>
> Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should 
> achieve the same?

I think it is possible to remove NVM_DONE_IO check here. And in that
case perhaps it is necessary to change bio_endio call to somewhere other
than pblk_make_rq, otherwise endio call would not be made to the new
bio*.

Assuming pblk_rw_io's second parameter is to be remained as bio*, There
are three cases I think needs consideration. NVM_IO_ERROR return case,
the read case and the write case.

In NVM_IO_ERROR return case, for both read and write. NVM_IO_ERROR
received by pblk_make_rq and bio_io_error called on bio, since this bio*
that pblk_submit_read and pblk_write_to_cache function tried and failed
might be a new one, so bio_io_error call needs to be made inside
pblk_rw_io.

In read case, there are three sub-cases. The first is All data is available
in ring buffer and NVM_IO_DONE is returned. The second is all to be read
from the device, which currently NVM_IO_OK is returned and endio is
called after read completion from the device. The third is partial read,
where the data that needs to be read from the device is read
synchronously and pblk_rw_io returns NVM_IO_DONE.

In write case, there are two sub-cases. Firstly, non REQ_PRE_FLUSH case,
pblk_write_cache wil return either NVM_IO_DONE or NVM_IO_ERROR. A endio
call is required in place somewhere NVM_IO_DONE is decided. 

For REQ_PREFLUSH case bio (new bio* if split) is added to w_ctx.bios,
pblk_write_to_cache will return either NVM_IO_OK or NVM_IO_ERROR. bio*
added to w_ctx.bios will be called by bio_endio on write completion to
the disk. So it is already taken care of.

In summary my feeling is that having pblk_rw_io receive bio* as a
reference and removing bio_put in pblk_rw_io would be the minimum
change. Please share your insight, I will try experimenting alternatives.
>
>> -
>> -		return ret;
>> +	if (bio_data_dir(*bio) == READ) {
>> +		blk_queue_split(q, bio);
>> +		return pblk_submit_read(pblk, *bio);
>>   	}
>>   
>>   	/* Prevent deadlock in the case of a modest LUN configuration and large
>>   	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
>>   	 * available for user I/O.
>>   	 */
>> -	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
>> -		blk_queue_split(q, &bio);
>> +	if (pblk_get_secs(*bio) > pblk_rl_max_io(&pblk->rl))
>> +		blk_queue_split(q, bio);
>>   
>> -	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
>> +	return pblk_write_to_cache(pblk, *bio, PBLK_IOTYPE_USER);
>>   }
>>   
>>   static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>> @@ -69,7 +63,7 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>>   		}
>>   	}
>>   
>> -	switch (pblk_rw_io(q, pblk, bio)) {
>> +	switch (pblk_rw_io(q, pblk, &bio)) {
>>   	case NVM_IO_ERR:
>>   		bio_io_error(bio);
>>   		break;
>> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-02-01  8:22 ` Chansol Kim
@ 2019-02-05  9:23   ` Matias Bjørling
  2019-02-05 10:20     ` Javier González
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p7>
  0 siblings, 2 replies; 10+ messages in thread
From: Matias Bjørling @ 2019-02-05  9:23 UTC (permalink / raw)
  To: chansol.kim, linux-block; +Cc: javier

On 2/1/19 9:22 AM, Chansol Kim wrote:
> 
> On 01/31/19 22:14 PM, Matias Bjørling wrote:
>> On 1/30/19 2:53 AM, 김찬솔 wrote:
>>>
>>> Changes:
>>>    1. Function pblk_rw_io to get bio* as a reference
>>>    2. In pblk_rw_io bio_put call on read case removed
>>>
>>> A fix to address issue where
>>>    1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>>    2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>>    3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>>       is passed to generic_make_requests, and the newly allocated bio is
>>>       returned
>>>    4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>>       that is not the one returned by blk_queue_split
>>>    5. As a result bio_endio is not called on the newly allocated bio.
>>>
>>> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
>>> ---
>>>    drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>>    1 file changed, 8 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>> index b57f764d..4efc929 100644
>>> --- a/drivers/lightnvm/pblk-init.c
>>> +++ b/drivers/lightnvm/pblk-init.c
>>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>>    struct bio_set pblk_bio_set;
>>>    
>>>    static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>>> -			  struct bio *bio)
>>> +			  struct bio **bio)
>>>    {
>>> -	int ret;
>>> -
>>>    	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>>>    	 * constraint. Writes can be of arbitrary size.
>>>    	 */
>>> -	if (bio_data_dir(bio) == READ) {
>>> -		blk_queue_split(q, &bio);
>>> -		ret = pblk_submit_read(pblk, bio);
>>> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>>> -			bio_put(bio);
>>
>> Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should
>> achieve the same?
> 
> I think it is possible to remove NVM_DONE_IO check here. And in that
> case perhaps it is necessary to change bio_endio call to somewhere other
> than pblk_make_rq, otherwise endio call would not be made to the new
> bio*.
> 
> Assuming pblk_rw_io's second parameter is to be remained as bio*, There
> are three cases I think needs consideration. NVM_IO_ERROR return case,
> the read case and the write case.
> 
> In NVM_IO_ERROR return case, for both read and write. NVM_IO_ERROR
> received by pblk_make_rq and bio_io_error called on bio, since this bio*
> that pblk_submit_read and pblk_write_to_cache function tried and failed
> might be a new one, so bio_io_error call needs to be made inside
> pblk_rw_io.
> 
> In read case, there are three sub-cases. The first is All data is available
> in ring buffer and NVM_IO_DONE is returned. The second is all to be read
> from the device, which currently NVM_IO_OK is returned and endio is
> called after read completion from the device. The third is partial read,
> where the data that needs to be read from the device is read
> synchronously and pblk_rw_io returns NVM_IO_DONE.
> 
> In write case, there are two sub-cases. Firstly, non REQ_PRE_FLUSH case,
> pblk_write_cache wil return either NVM_IO_DONE or NVM_IO_ERROR. A endio
> call is required in place somewhere NVM_IO_DONE is decided.
> 
> For REQ_PREFLUSH case bio (new bio* if split) is added to w_ctx.bios,
> pblk_write_to_cache will return either NVM_IO_OK or NVM_IO_ERROR. bio*
> added to w_ctx.bios will be called by bio_endio on write completion to
> the disk. So it is already taken care of.
> 
> In summary my feeling is that having pblk_rw_io receive bio* as a
> reference and removing bio_put in pblk_rw_io would be the minimum
> change. Please share your insight, I will try experimenting alternatives.

What rubs me the wrong way is that that pattern isn't used in the rest 
of kernel. I would rather move the calls to bio_io_error and bio_endio 
into the pblk_rw_io() function. The implementation of pblk_rw_io() leaks 
out to pblk_make_rq(). The code is a mismatch of some bio_endio calls 
inside the pblk_rw_io, and others outside. It's not coherent.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lightnvm: pblk: fix bio leak on large sized io
  2019-02-05  9:23   ` Matias Bjørling
@ 2019-02-05 10:20     ` Javier González
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p7>
  1 sibling, 0 replies; 10+ messages in thread
From: Javier González @ 2019-02-05 10:20 UTC (permalink / raw)
  To: Matias Bjørling; +Cc: chansol.kim, linux-block

[-- Attachment #1: Type: text/plain, Size: 4600 bytes --]


> On 5 Feb 2019, at 10.23, Matias Bjørling <mb@lightnvm.io> wrote:
> 
> On 2/1/19 9:22 AM, Chansol Kim wrote:
>> On 01/31/19 22:14 PM, Matias Bjørling wrote:
>>> On 1/30/19 2:53 AM, 김찬솔 wrote:
>>>> Changes:
>>>>   1. Function pblk_rw_io to get bio* as a reference
>>>>   2. In pblk_rw_io bio_put call on read case removed
>>>> 
>>>> A fix to address issue where
>>>>   1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>>>   2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>>>   3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>>>      is passed to generic_make_requests, and the newly allocated bio is
>>>>      returned
>>>>   4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>>>      that is not the one returned by blk_queue_split
>>>>   5. As a result bio_endio is not called on the newly allocated bio.
>>>> 
>>>> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
>>>> ---
>>>>   drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>>>   1 file changed, 8 insertions(+), 14 deletions(-)
>>>> 
>>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>>> index b57f764d..4efc929 100644
>>>> --- a/drivers/lightnvm/pblk-init.c
>>>> +++ b/drivers/lightnvm/pblk-init.c
>>>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>>>   struct bio_set pblk_bio_set;
>>>>      static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>>>> -			  struct bio *bio)
>>>> +			  struct bio **bio)
>>>>   {
>>>> -	int ret;
>>>> -
>>>>   	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>>>>   	 * constraint. Writes can be of arbitrary size.
>>>>   	 */
>>>> -	if (bio_data_dir(bio) == READ) {
>>>> -		blk_queue_split(q, &bio);
>>>> -		ret = pblk_submit_read(pblk, bio);
>>>> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>>>> -			bio_put(bio);
>>> 
>>> Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should
>>> achieve the same?
>> I think it is possible to remove NVM_DONE_IO check here. And in that
>> case perhaps it is necessary to change bio_endio call to somewhere other
>> than pblk_make_rq, otherwise endio call would not be made to the new
>> bio*.
>> Assuming pblk_rw_io's second parameter is to be remained as bio*, There
>> are three cases I think needs consideration. NVM_IO_ERROR return case,
>> the read case and the write case.
>> In NVM_IO_ERROR return case, for both read and write. NVM_IO_ERROR
>> received by pblk_make_rq and bio_io_error called on bio, since this bio*
>> that pblk_submit_read and pblk_write_to_cache function tried and failed
>> might be a new one, so bio_io_error call needs to be made inside
>> pblk_rw_io.
>> In read case, there are three sub-cases. The first is All data is available
>> in ring buffer and NVM_IO_DONE is returned. The second is all to be read
>> from the device, which currently NVM_IO_OK is returned and endio is
>> called after read completion from the device. The third is partial read,
>> where the data that needs to be read from the device is read
>> synchronously and pblk_rw_io returns NVM_IO_DONE.
>> In write case, there are two sub-cases. Firstly, non REQ_PRE_FLUSH case,
>> pblk_write_cache wil return either NVM_IO_DONE or NVM_IO_ERROR. A endio
>> call is required in place somewhere NVM_IO_DONE is decided.
>> For REQ_PREFLUSH case bio (new bio* if split) is added to w_ctx.bios,
>> pblk_write_to_cache will return either NVM_IO_OK or NVM_IO_ERROR. bio*
>> added to w_ctx.bios will be called by bio_endio on write completion to
>> the disk. So it is already taken care of.
>> In summary my feeling is that having pblk_rw_io receive bio* as a
>> reference and removing bio_put in pblk_rw_io would be the minimum
>> change. Please share your insight, I will try experimenting alternatives.
> 
> What rubs me the wrong way is that that pattern isn't used in the rest
> of kernel. I would rather move the calls to bio_io_error and bio_endio
> into the pblk_rw_io() function. The implementation of pblk_rw_io()
> leaks out to pblk_make_rq(). The code is a mismatch of some bio_endio
> calls inside the pblk_rw_io, and others outside. It's not coherent.

I agree that NVM_IO_DONE is now more confusing than anything - this
comes from the rrpc days... Removing it here will require some
refactoring on the partial read path, but nothing too dramatic.

I'm also OK with unfolding pblk_rw_io() into pblk_make_rq().

Chansol: do you want to give it a go?

Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] lightnvm: pblk: fix bio leak on large sized io
       [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p7>
@ 2019-02-11 11:14       ` Chansol Kim
  0 siblings, 0 replies; 10+ messages in thread
From: Chansol Kim @ 2019-02-11 11:14 UTC (permalink / raw)
  To: Javier González, Matias Bjørling; +Cc: linux-block

On 02/05/19 11:20 AM, Javier González wrote:
>
>> On 5 Feb 2019, at 10.23, Matias Bjørling <mb@lightnvm.io> wrote:
>> 
>> On 2/1/19 9:22 AM, Chansol Kim wrote:
>>> On 01/31/19 22:14 PM, Matias Bjørling wrote:
>>>> On 1/30/19 2:53 AM, 김찬솔 wrote:
>>>>> Changes:
>>>>>   1. Function pblk_rw_io to get bio* as a reference
>>>>>   2. In pblk_rw_io bio_put call on read case removed
>>>>> 
>>>>> A fix to address issue where
>>>>>   1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>>>>   2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>>>>   3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>>>>      is passed to generic_make_requests, and the newly allocated bio is
>>>>>      returned
>>>>>   4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>>>>      that is not the one returned by blk_queue_split
>>>>>   5. As a result bio_endio is not called on the newly allocated bio.
>>>>> 
>>>>> Signed-off-by: chansol.kim <chansol.kim@samsung.com>
>>>>> ---
>>>>>   drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>>>>   1 file changed, 8 insertions(+), 14 deletions(-)
>>>>> 
>>>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>>>> index b57f764d..4efc929 100644
>>>>> --- a/drivers/lightnvm/pblk-init.c
>>>>> +++ b/drivers/lightnvm/pblk-init.c
>>>>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>>>>   struct bio_set pblk_bio_set;
>>>>>      static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>>>>> -			  struct bio *bio)
>>>>> +			  struct bio **bio)
>>>>>   {
>>>>> -	int ret;
>>>>> -
>>>>>   	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>>>>>   	 * constraint. Writes can be of arbitrary size.
>>>>>   	 */
>>>>> -	if (bio_data_dir(bio) == READ) {
>>>>> -		blk_queue_split(q, &bio);
>>>>> -		ret = pblk_submit_read(pblk, bio);
>>>>> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>>>>> -			bio_put(bio);
>>>> 
>>>> Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should
>>>> achieve the same?
>>> I think it is possible to remove NVM_DONE_IO check here. And in that
>>> case perhaps it is necessary to change bio_endio call to somewhere other
>>> than pblk_make_rq, otherwise endio call would not be made to the new
>>> bio*.
>>> Assuming pblk_rw_io's second parameter is to be remained as bio*, There
>>> are three cases I think needs consideration. NVM_IO_ERROR return case,
>>> the read case and the write case.
>>> In NVM_IO_ERROR return case, for both read and write. NVM_IO_ERROR
>>> received by pblk_make_rq and bio_io_error called on bio, since this bio*
>>> that pblk_submit_read and pblk_write_to_cache function tried and failed
>>> might be a new one, so bio_io_error call needs to be made inside
>>> pblk_rw_io.
>>> In read case, there are three sub-cases. The first is All data is available
>>> in ring buffer and NVM_IO_DONE is returned. The second is all to be read
>>> from the device, which currently NVM_IO_OK is returned and endio is
>>> called after read completion from the device. The third is partial read,
>>> where the data that needs to be read from the device is read
>>> synchronously and pblk_rw_io returns NVM_IO_DONE.
>>> In write case, there are two sub-cases. Firstly, non REQ_PRE_FLUSH case,
>>> pblk_write_cache wil return either NVM_IO_DONE or NVM_IO_ERROR. A endio
>>> call is required in place somewhere NVM_IO_DONE is decided.
>>> For REQ_PREFLUSH case bio (new bio* if split) is added to w_ctx.bios,
>>> pblk_write_to_cache will return either NVM_IO_OK or NVM_IO_ERROR. bio*
>>> added to w_ctx.bios will be called by bio_endio on write completion to
>>> the disk. So it is already taken care of.
>>> In summary my feeling is that having pblk_rw_io receive bio* as a
>>> reference and removing bio_put in pblk_rw_io would be the minimum
>>> change. Please share your insight, I will try experimenting alternatives.
>> 
>> What rubs me the wrong way is that that pattern isn't used in the rest
>> of kernel. I would rather move the calls to bio_io_error and bio_endio
>> into the pblk_rw_io() function. The implementation of pblk_rw_io()
>> leaks out to pblk_make_rq(). The code is a mismatch of some bio_endio
>> calls inside the pblk_rw_io, and others outside. It's not coherent.
>
> I agree that NVM_IO_DONE is now more confusing than anything - this
> comes from the rrpc days... Removing it here will require some
> refactoring on the partial read path, but nothing too dramatic.
>
> I'm also OK with unfolding pblk_rw_io() into pblk_make_rq().
>
> Chansol: do you want to give it a go?
>
> Javier
>

Matias, like you mentioned and Javier suggested, unfolding pblk_rw_io
would make it more coherent with regards to call sites of bio_endio,
including REQ_OP_DISCARD with REQ_PREFLUSH unset case. pblk_make_rq
would be the place to call bio_io_error in case of NVM_IO_ERR, and to
call bio_endio for NVM_IO_DONE.

Javier: I am very up for it. Unfolding pblk_rw_io into pblk_make_rq
function. I will make the change, test etc, and submit the patch (with
better comment this time).

Thank you.

Chansol Kim

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-02-11 11:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p1>
2019-01-30  1:53 ` [PATCH] lightnvm: pblk: fix bio leak on large sized io 김찬솔
2019-01-30  6:58   ` Javier González
2019-01-30 14:06   ` Matias Bjørling
2019-01-30 15:02     ` Javier González
     [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p2>
2019-01-31  5:55       ` Chansol Kim
2019-01-31 21:14   ` Matias Bjørling
2019-02-01  8:22 ` Chansol Kim
2019-02-05  9:23   ` Matias Bjørling
2019-02-05 10:20     ` Javier González
     [not found]     ` <CGME20190130015343epcms2p14be92e88982e86f5e9d494e3bdc3fb2a@epcms2p7>
2019-02-11 11:14       ` Chansol Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).