* [PATCH v2] lightnvm: pblk: fix bio leak on large sized io
[not found] <CGME20190307072351epcms2p1966fd363dfb73cc7b34248b9c2269987@epcms2p1>
@ 2019-03-07 7:23 ` Chansol Kim
2019-03-07 12:20 ` Javier González
0 siblings, 1 reply; 3+ messages in thread
From: Chansol Kim @ 2019-03-07 7:23 UTC (permalink / raw)
To: mb, javier, hans.holmberg, linux-block
For large size io where blk_queue_split needs to be called inside
pblk_rw_io, results in bio leak as bio_endio is not called on the
newly allocated. One way to observe this is to mounting ext4
filesystem on the target and issuing 1MB io with dd, e.g., dd bs=1MB
if=/dev/null of=/mount/myvolume. kmemleak reports:
unreferenced object 0xffff88803d7d0100 (size 256):
comm "kworker/u16:1", pid 68, jiffies 4294899333 (age 284.120s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 60 e8 31 81 88 ff ff .........`.1....
01 40 00 00 06 06 00 00 00 00 00 00 05 00 00 00 .@..............
backtrace:
[<000000001f5aa04f>] kmem_cache_alloc+0x204/0x3c0
[<0000000040945aab>] mempool_alloc_slab+0x1d/0x30
[<00000000b4959ab4>] mempool_alloc+0x83/0x220
[<00000000646bad9b>] bio_alloc_bioset+0x229/0x320
[<000000009264b251>] bio_clone_fast+0x26/0xc0
[<0000000008250252>] bio_split+0x41/0x110
[<00000000e365cad0>] blk_queue_split+0x349/0x930
[<00000000eb5426bc>] pblk_make_rq+0x1b5/0x1f0
[<00000000eea09cec>] generic_make_request+0x2f9/0x690
[<00000000ae6acede>] submit_bio+0x12e/0x1f0
[<00000000f9b8b82a>] ext4_io_submit+0x64/0x80
[<000000009e4f817d>] ext4_bio_write_page+0x32e/0x890
[<00000000cbd0d106>] mpage_submit_page+0x65/0xc0
[<000000000eec7359>] mpage_map_and_submit_buffers+0x171/0x330
[<000000009a7afcb6>] ext4_writepages+0xd5e/0x1650
[<000000004476b096>] do_writepages+0x39/0xc0
In case there is a need for a split, blk_queue_split returns the newly
allocated bio to the caller by changing the value of pointer passed as
a reference, while the original is passed to generic_make_requests.
Although pblk_rw_io's local variable bio* has changed and passed to
pblk_submit_read and pblk_write_to_cache, work is done on this new
bio*, and pblk_rw_io returns NVM_IO_DONE, pblk_make_rq calls bio_endio
on the old bio* because it passed bio pointer by value to pblk_rw_io.
pblk_rw_io is unfolded into pblk_make_rq so that there is no copying
of bio* and bio_endio is called on the correct bio*.
Signed-off-by: Chansol Kim <chansol.kim@samsung.com>
---
v2:
- Instead of changing pblk_rw_io's parameter, unfolded it into pblk_make_rq
drivers/lightnvm/pblk-init.c | 47 ++++++++++++++++++--------------------------
1 file changed, 19 insertions(+), 28 deletions(-)
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 8b643d0..0c98305 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -47,36 +47,10 @@ static struct pblk_global_caches pblk_caches = {
struct bio_set pblk_bio_set;
-static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
- struct bio *bio)
-{
- int ret;
-
- /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
- * constraint. Writes can be of arbitrary size.
- */
- if (bio_data_dir(bio) == READ) {
- blk_queue_split(q, &bio);
- ret = pblk_submit_read(pblk, bio);
- if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
- bio_put(bio);
-
- return ret;
- }
-
- /* Prevent deadlock in the case of a modest LUN configuration and large
- * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
- * available for user I/O.
- */
- if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
- blk_queue_split(q, &bio);
-
- return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
-}
-
static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
{
struct pblk *pblk = q->queuedata;
+ int ret;
if (bio_op(bio) == REQ_OP_DISCARD) {
pblk_discard(pblk, bio);
@@ -86,7 +60,24 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
}
}
- switch (pblk_rw_io(q, pblk, bio)) {
+ /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
+ * constraint. Writes can be of arbitrary size.
+ */
+ if (bio_data_dir(bio) == READ) {
+ blk_queue_split(q, &bio);
+ ret = pblk_submit_read(pblk, bio);
+ } else {
+ /* Prevent deadlock in the case of a modest LUN configuration
+ * and large user I/Os. Unless stalled, the rate limiter
+ * leaves at least 256KB available for user I/O.
+ */
+ if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
+ blk_queue_split(q, &bio);
+
+ ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
+ }
+
+ switch (ret) {
case NVM_IO_ERR:
bio_io_error(bio);
break;
--
2.7.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v2] lightnvm: pblk: fix bio leak on large sized io
2019-03-07 7:23 ` [PATCH v2] lightnvm: pblk: fix bio leak on large sized io Chansol Kim
@ 2019-03-07 12:20 ` Javier González
2019-03-09 17:00 ` Matias Bjørling
0 siblings, 1 reply; 3+ messages in thread
From: Javier González @ 2019-03-07 12:20 UTC (permalink / raw)
To: chansol.kim; +Cc: Matias Bjørling, Hans Holmberg, linux-block
[-- Attachment #1: Type: text/plain, Size: 4845 bytes --]
> On 7 Mar 2019, at 08.23, Chansol Kim <chansol.kim@samsung.com> wrote:
>
> For large size io where blk_queue_split needs to be called inside
> pblk_rw_io, results in bio leak as bio_endio is not called on the
> newly allocated. One way to observe this is to mounting ext4
> filesystem on the target and issuing 1MB io with dd, e.g., dd bs=1MB
> if=/dev/null of=/mount/myvolume. kmemleak reports:
>
> unreferenced object 0xffff88803d7d0100 (size 256):
> comm "kworker/u16:1", pid 68, jiffies 4294899333 (age 284.120s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 60 e8 31 81 88 ff ff .........`.1....
> 01 40 00 00 06 06 00 00 00 00 00 00 05 00 00 00 .@..............
> backtrace:
> [<000000001f5aa04f>] kmem_cache_alloc+0x204/0x3c0
> [<0000000040945aab>] mempool_alloc_slab+0x1d/0x30
> [<00000000b4959ab4>] mempool_alloc+0x83/0x220
> [<00000000646bad9b>] bio_alloc_bioset+0x229/0x320
> [<000000009264b251>] bio_clone_fast+0x26/0xc0
> [<0000000008250252>] bio_split+0x41/0x110
> [<00000000e365cad0>] blk_queue_split+0x349/0x930
> [<00000000eb5426bc>] pblk_make_rq+0x1b5/0x1f0
> [<00000000eea09cec>] generic_make_request+0x2f9/0x690
> [<00000000ae6acede>] submit_bio+0x12e/0x1f0
> [<00000000f9b8b82a>] ext4_io_submit+0x64/0x80
> [<000000009e4f817d>] ext4_bio_write_page+0x32e/0x890
> [<00000000cbd0d106>] mpage_submit_page+0x65/0xc0
> [<000000000eec7359>] mpage_map_and_submit_buffers+0x171/0x330
> [<000000009a7afcb6>] ext4_writepages+0xd5e/0x1650
> [<000000004476b096>] do_writepages+0x39/0xc0
>
> In case there is a need for a split, blk_queue_split returns the newly
> allocated bio to the caller by changing the value of pointer passed as
> a reference, while the original is passed to generic_make_requests.
>
> Although pblk_rw_io's local variable bio* has changed and passed to
> pblk_submit_read and pblk_write_to_cache, work is done on this new
> bio*, and pblk_rw_io returns NVM_IO_DONE, pblk_make_rq calls bio_endio
> on the old bio* because it passed bio pointer by value to pblk_rw_io.
>
> pblk_rw_io is unfolded into pblk_make_rq so that there is no copying
> of bio* and bio_endio is called on the correct bio*.
>
> Signed-off-by: Chansol Kim <chansol.kim@samsung.com>
> ---
> v2:
> - Instead of changing pblk_rw_io's parameter, unfolded it into pblk_make_rq
>
> drivers/lightnvm/pblk-init.c | 47 ++++++++++++++++++--------------------------
> 1 file changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 8b643d0..0c98305 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -47,36 +47,10 @@ static struct pblk_global_caches pblk_caches = {
>
> struct bio_set pblk_bio_set;
>
> -static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> - struct bio *bio)
> -{
> - int ret;
> -
> - /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
> - * constraint. Writes can be of arbitrary size.
> - */
> - if (bio_data_dir(bio) == READ) {
> - blk_queue_split(q, &bio);
> - ret = pblk_submit_read(pblk, bio);
> - if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
> - bio_put(bio);
> -
> - return ret;
> - }
> -
> - /* Prevent deadlock in the case of a modest LUN configuration and large
> - * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
> - * available for user I/O.
> - */
> - if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> - blk_queue_split(q, &bio);
> -
> - return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> -}
> -
> static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> {
> struct pblk *pblk = q->queuedata;
> + int ret;
>
> if (bio_op(bio) == REQ_OP_DISCARD) {
> pblk_discard(pblk, bio);
> @@ -86,7 +60,24 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> }
> }
>
> - switch (pblk_rw_io(q, pblk, bio)) {
> + /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
> + * constraint. Writes can be of arbitrary size.
> + */
> + if (bio_data_dir(bio) == READ) {
> + blk_queue_split(q, &bio);
> + ret = pblk_submit_read(pblk, bio);
> + } else {
> + /* Prevent deadlock in the case of a modest LUN configuration
> + * and large user I/Os. Unless stalled, the rate limiter
> + * leaves at least 256KB available for user I/O.
> + */
> + if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> + blk_queue_split(q, &bio);
> +
> + ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> + }
> +
> + switch (ret) {
> case NVM_IO_ERR:
> bio_io_error(bio);
> break;
> --
> 2.7.4
Looks good to me.
Reviewed-by: Javier González <javier@javigon.com>
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] lightnvm: pblk: fix bio leak on large sized io
2019-03-07 12:20 ` Javier González
@ 2019-03-09 17:00 ` Matias Bjørling
0 siblings, 0 replies; 3+ messages in thread
From: Matias Bjørling @ 2019-03-09 17:00 UTC (permalink / raw)
To: Javier González, chansol.kim; +Cc: Hans Holmberg, linux-block
On 3/7/19 1:20 PM, Javier González wrote:
>> On 7 Mar 2019, at 08.23, Chansol Kim <chansol.kim@samsung.com> wrote:
>>
>> For large size io where blk_queue_split needs to be called inside
>> pblk_rw_io, results in bio leak as bio_endio is not called on the
>> newly allocated. One way to observe this is to mounting ext4
>> filesystem on the target and issuing 1MB io with dd, e.g., dd bs=1MB
>> if=/dev/null of=/mount/myvolume. kmemleak reports:
>>
>> unreferenced object 0xffff88803d7d0100 (size 256):
>> comm "kworker/u16:1", pid 68, jiffies 4294899333 (age 284.120s)
>> hex dump (first 32 bytes):
>> 00 00 00 00 00 00 00 00 00 60 e8 31 81 88 ff ff .........`.1....
>> 01 40 00 00 06 06 00 00 00 00 00 00 05 00 00 00 .@..............
>> backtrace:
>> [<000000001f5aa04f>] kmem_cache_alloc+0x204/0x3c0
>> [<0000000040945aab>] mempool_alloc_slab+0x1d/0x30
>> [<00000000b4959ab4>] mempool_alloc+0x83/0x220
>> [<00000000646bad9b>] bio_alloc_bioset+0x229/0x320
>> [<000000009264b251>] bio_clone_fast+0x26/0xc0
>> [<0000000008250252>] bio_split+0x41/0x110
>> [<00000000e365cad0>] blk_queue_split+0x349/0x930
>> [<00000000eb5426bc>] pblk_make_rq+0x1b5/0x1f0
>> [<00000000eea09cec>] generic_make_request+0x2f9/0x690
>> [<00000000ae6acede>] submit_bio+0x12e/0x1f0
>> [<00000000f9b8b82a>] ext4_io_submit+0x64/0x80
>> [<000000009e4f817d>] ext4_bio_write_page+0x32e/0x890
>> [<00000000cbd0d106>] mpage_submit_page+0x65/0xc0
>> [<000000000eec7359>] mpage_map_and_submit_buffers+0x171/0x330
>> [<000000009a7afcb6>] ext4_writepages+0xd5e/0x1650
>> [<000000004476b096>] do_writepages+0x39/0xc0
>>
>> In case there is a need for a split, blk_queue_split returns the newly
>> allocated bio to the caller by changing the value of pointer passed as
>> a reference, while the original is passed to generic_make_requests.
>>
>> Although pblk_rw_io's local variable bio* has changed and passed to
>> pblk_submit_read and pblk_write_to_cache, work is done on this new
>> bio*, and pblk_rw_io returns NVM_IO_DONE, pblk_make_rq calls bio_endio
>> on the old bio* because it passed bio pointer by value to pblk_rw_io.
>>
>> pblk_rw_io is unfolded into pblk_make_rq so that there is no copying
>> of bio* and bio_endio is called on the correct bio*.
>>
>> Signed-off-by: Chansol Kim <chansol.kim@samsung.com>
>> ---
>> v2:
>> - Instead of changing pblk_rw_io's parameter, unfolded it into pblk_make_rq
>>
>> drivers/lightnvm/pblk-init.c | 47 ++++++++++++++++++--------------------------
>> 1 file changed, 19 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>> index 8b643d0..0c98305 100644
>> --- a/drivers/lightnvm/pblk-init.c
>> +++ b/drivers/lightnvm/pblk-init.c
>> @@ -47,36 +47,10 @@ static struct pblk_global_caches pblk_caches = {
>>
>> struct bio_set pblk_bio_set;
>>
>> -static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>> - struct bio *bio)
>> -{
>> - int ret;
>> -
>> - /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>> - * constraint. Writes can be of arbitrary size.
>> - */
>> - if (bio_data_dir(bio) == READ) {
>> - blk_queue_split(q, &bio);
>> - ret = pblk_submit_read(pblk, bio);
>> - if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>> - bio_put(bio);
>> -
>> - return ret;
>> - }
>> -
>> - /* Prevent deadlock in the case of a modest LUN configuration and large
>> - * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
>> - * available for user I/O.
>> - */
>> - if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
>> - blk_queue_split(q, &bio);
>> -
>> - return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
>> -}
>> -
>> static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>> {
>> struct pblk *pblk = q->queuedata;
>> + int ret;
>>
>> if (bio_op(bio) == REQ_OP_DISCARD) {
>> pblk_discard(pblk, bio);
>> @@ -86,7 +60,24 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
>> }
>> }
>>
>> - switch (pblk_rw_io(q, pblk, bio)) {
>> + /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>> + * constraint. Writes can be of arbitrary size.
>> + */
>> + if (bio_data_dir(bio) == READ) {
>> + blk_queue_split(q, &bio);
>> + ret = pblk_submit_read(pblk, bio);
>> + } else {
>> + /* Prevent deadlock in the case of a modest LUN configuration
>> + * and large user I/Os. Unless stalled, the rate limiter
>> + * leaves at least 256KB available for user I/O.
>> + */
>> + if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
>> + blk_queue_split(q, &bio);
>> +
>> + ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
>> + }
>> +
>> + switch (ret) {
>> case NVM_IO_ERR:
>> bio_io_error(bio);
>> break;
>> --
>> 2.7.4
>
> Looks good to me.
>
> Reviewed-by: Javier González <javier@javigon.com>
>
Thanks. Applied to 5.2.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-03-09 17:00 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <CGME20190307072351epcms2p1966fd363dfb73cc7b34248b9c2269987@epcms2p1>
2019-03-07 7:23 ` [PATCH v2] lightnvm: pblk: fix bio leak on large sized io Chansol Kim
2019-03-07 12:20 ` Javier González
2019-03-09 17:00 ` Matias Bjørling
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).