All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
@ 2022-12-23 13:20 Mirsad Goran Todorovac
  2023-03-29 17:25 ` BUG FIX: [PATCH v1] " Mirsad Goran Todorovac
  0 siblings, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2022-12-23 13:20 UTC (permalink / raw)
  To: LKML
  Cc: Greg KH, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov,
	Ulf Hansson, Jens Axboe, Christophe JAILLET, Hannes Reinecke,
	Jiasheng Jiang, ye xingchen, linux-mmc

[-- Attachment #1: Type: text/plain, Size: 3366 bytes --]

Hi all,

When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled 
CONFIG_DEBUG_KMEMLEAK=y
and the result showed an unreferenced object in kworker process:

cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff888105028d80 (size 16):
   comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
   hex dump (first 16 bytes):
     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
   backtrace:
     [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
     [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
     [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
     [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
     [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
     [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
     [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
     [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
     [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
     [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
     [<ffffffffb676e556>] worker_thread+0x76/0x770
     [<ffffffffb677b468>] kthread+0x168/0x1a0
     [<ffffffffb6604c99>] ret_from_fork+0x29/0x50

mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
git bisect start
# bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
# bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
# bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
# bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
# bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
# bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
# bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4

Greg asked me if I would help bisect the bug, since I failed to 
reproduce it on pre 4.17 kernels, because they wouldn't boot (black 
screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that only 
reproduced that bug:

     product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
     vendor: LENOVO
     version: V530S-07ICB

I would welcome any advice.

Please find attached the lshw output and the build config from the last 
kernel version that also exhibits this bug, so the conclusion is that it 
is not fixed since the report on November 29th 2022:

https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t

With the hint of Tvrtko, I was able to extract the correct list of 
maintainers this time.

The bug occurs in one kernel memory leak, and it is unobvious whether a 
skilled attacker could use an abusive program to trigger the leak of 
enough 16 byte slabs (and overhead) to exhaust kernel memory and cause 
denial-of-service (crash of the system).

I apologise for the first unsuccessful attempt.

Kind regards,
Mirsad

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

[-- Attachment #2: config-6.1.0+.xz --]
[-- Type: application/octet-stream, Size: 57168 bytes --]

[-- Attachment #3: lshw.txt.xz --]
[-- Type: application/octet-stream, Size: 4628 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH v1] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2022-12-23 13:20 BUG: memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17 Mirsad Goran Todorovac
@ 2023-03-29 17:25 ` Mirsad Goran Todorovac
  2023-03-31 14:46   ` BUG FIX: [PATCH RFC v2] " Mirsad Goran Todorovac
  0 siblings, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-29 17:25 UTC (permalink / raw)
  To: LKML
  Cc: Greg KH, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov,
	Ulf Hansson, Jens Axboe, Christophe JAILLET, Hannes Reinecke,
	Jiasheng Jiang, ye xingchen, linux-mmc

On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
> Hi all,
> 
> When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
> and the result showed an unreferenced object in kworker process:
> 
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff888105028d80 (size 16):
>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>    hex dump (first 16 bytes):
>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>    backtrace:
>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> 
> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
> git bisect start
> # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
> git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
> # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
> git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
> # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
> git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
> git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
> # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
> git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
> 
> Greg asked me if I would help bisect the bug, since I failed to reproduce it on pre 4.17 kernels, because they wouldn't boot (black 
> screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that only reproduced that bug:
> 
>      product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
>      vendor: LENOVO
>      version: V530S-07ICB
> 
> I would welcome any advice.
> 
> Please find attached the lshw output and the build config from the last kernel version that also exhibits this bug, so the 
> conclusion is that it is not fixed since the report on November 29th 2022:
> 
> https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
> 
> With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
> 
> The bug occurs in one kernel memory leak, and it is unobvious whether a skilled attacker could use an abusive program to trigger the 
> leak of enough 16 byte slabs (and overhead) to exhaust kernel memory and cause denial-of-service (crash of the system).
> 
> I apologise for the first unsuccessful attempt.

static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)

calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
executes:

	if (strchr(s, '/')) {
		char *t;

		t = kstrdup(s, GFP_KERNEL);
		kfree_const(s);
		if (!t)
			return -ENOMEM;
		strreplace(t, '/', '!');
		s = t;
	}
	kfree_const(kobj->name);
	kobj->name = s;

so, this kobj->name was never freed in the "goto err_out" case in line 404.

380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
381 {
382         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
383                                             GFP_KERNEL);
384         struct memstick_dev *old_card = host->card;
385         struct ms_id_register id_reg;
386
387         if (card) {
388                 card->host = host;
389                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
390                 card->dev.parent = &host->dev;
391                 card->dev.bus = &memstick_bus_type;
392                 card->dev.release = memstick_free_card;
393                 card->check = memstick_dummy_check;
394
395                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
396                 card->reg_addr.r_length = sizeof(id_reg);
397                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
398                 card->reg_addr.w_length = sizeof(id_reg);
399
400                 init_completion(&card->mrq_complete);
401
402                 host->card = card;
403                 if (memstick_set_rw_addr(card))
404                         goto err_out;
405
406                 card->next_request = h_memstick_read_dev_id;
407                 memstick_new_req(host);
408                 wait_for_completion(&card->mrq_complete);
409
410                 if (card->current_mrq.error)
411                         goto err_out;
412         }
413         host->card = old_card;
414         return card;
415 err_out:
416         host->card = old_card;
421         kfree(card);
422         return NULL;
423 }

This little patch fixes it, also at the release() method.

However, release() had not yet been tested, and I am not certain that in that case
kobj->name would not be kfree_const()-ed automatically.

Maybe it ought to be separated in two independent patches?

diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..403ab06e3ffa 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
  {
         struct memstick_dev *card = container_of(dev, struct memstick_dev,
                                                  dev);
+       if ((card->dev).kobj.name) {
+               kfree_const((card->dev).kobj.name);
+               (card->dev).kobj.name = NULL;
+       }
         kfree(card);
  }

@@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
         return card;
  err_out:
         host->card = old_card;
+       if ((card->dev).kobj.name) {
+               kfree_const((card->dev).kobj.name);
+               (card->dev).kobj.name = NULL;
+       }
         kfree(card);
         return NULL;
  }

This morning I did not feel like we'd fix two memory leaks today.

This one was a nag for three months :-)

Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
output doesn't mean that it wouldn't break something, does it?

It is clearly the good wind of the Providence.

Best regards,
Mirsad

-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v2] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-03-29 17:25 ` BUG FIX: [PATCH v1] " Mirsad Goran Todorovac
@ 2023-03-31 14:46   ` Mirsad Goran Todorovac
  2023-03-31 16:32     ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-31 14:46 UTC (permalink / raw)
  To: LKML
  Cc: Greg KH, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov,
	Ulf Hansson, Jens Axboe, Christophe JAILLET, Hannes Reinecke,
	Jiasheng Jiang, ye xingchen, linux-mmc

On 29.3.2023. 19:25, Mirsad Goran Todorovac wrote:
> On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
>> Hi all,
>>
>> When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
>> and the result showed an unreferenced object in kworker process:
>>
>> cat /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff888105028d80 (size 16):
>>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>>    hex dump (first 16 bytes):
>>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>    backtrace:
>>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
>>
>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>> git bisect start
>> # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
>> git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
>> # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
>> git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
>> # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
>> git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>> # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>> git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
>> # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
>> git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
>> # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
>> git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
>>
>> Greg asked me if I would help bisect the bug, since I failed to reproduce it on pre 4.17 kernels, because they wouldn't boot 
>> (black screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that only reproduced that bug:
>>
>>      product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
>>      vendor: LENOVO
>>      version: V530S-07ICB
>>
>> I would welcome any advice.
>>
>> Please find attached the lshw output and the build config from the last kernel version that also exhibits this bug, so the 
>> conclusion is that it is not fixed since the report on November 29th 2022:
>>
>> https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
>>
>> With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
>>
>> The bug occurs in one kernel memory leak, and it is unobvious whether a skilled attacker could use an abusive program to trigger 
>> the leak of enough 16 byte slabs (and overhead) to exhaust kernel memory and cause denial-of-service (crash of the system).
>>
>> I apologise for the first unsuccessful attempt.
> 
> static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> 
> calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
> calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
> executes:
> 
>      if (strchr(s, '/')) {
>          char *t;
> 
>          t = kstrdup(s, GFP_KERNEL);
>          kfree_const(s);
>          if (!t)
>              return -ENOMEM;
>          strreplace(t, '/', '!');
>          s = t;
>      }
>      kfree_const(kobj->name);
>      kobj->name = s;
> 
> so, this kobj->name was never freed in the "goto err_out" case in line 404.
> 
> 380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> 381 {
> 382         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
> 383                                             GFP_KERNEL);
> 384         struct memstick_dev *old_card = host->card;
> 385         struct ms_id_register id_reg;
> 386
> 387         if (card) {
> 388                 card->host = host;
> 389                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
> 390                 card->dev.parent = &host->dev;
> 391                 card->dev.bus = &memstick_bus_type;
> 392                 card->dev.release = memstick_free_card;
> 393                 card->check = memstick_dummy_check;
> 394
> 395                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
> 396                 card->reg_addr.r_length = sizeof(id_reg);
> 397                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
> 398                 card->reg_addr.w_length = sizeof(id_reg);
> 399
> 400                 init_completion(&card->mrq_complete);
> 401
> 402                 host->card = card;
> 403                 if (memstick_set_rw_addr(card))
> 404                         goto err_out;
> 405
> 406                 card->next_request = h_memstick_read_dev_id;
> 407                 memstick_new_req(host);
> 408                 wait_for_completion(&card->mrq_complete);
> 409
> 410                 if (card->current_mrq.error)
> 411                         goto err_out;
> 412         }
> 413         host->card = old_card;
> 414         return card;
> 415 err_out:
> 416         host->card = old_card;
> 421         kfree(card);
> 422         return NULL;
> 423 }
> 
> This little patch fixes it, also at the release() method.
> 
> However, release() had not yet been tested, and I am not certain that in that case
> kobj->name would not be kfree_const()-ed automatically.
> 
> Maybe it ought to be separated in two independent patches?
> 
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..403ab06e3ffa 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>   {
>          struct memstick_dev *card = container_of(dev, struct memstick_dev,
>                                                   dev);
> +       if ((card->dev).kobj.name) {
> +               kfree_const((card->dev).kobj.name);
> +               (card->dev).kobj.name = NULL;
> +       }
>          kfree(card);
>   }
> 
> @@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>          return card;
>   err_out:
>          host->card = old_card;
> +       if ((card->dev).kobj.name) {
> +               kfree_const((card->dev).kobj.name);
> +               (card->dev).kobj.name = NULL;
> +       }
>          kfree(card);
>          return NULL;
>   }
> 
> This morning I did not feel like we'd fix two memory leaks today.
> 
> This one was a nag for three months :-)
> 
> Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
> output doesn't mean that it wouldn't break something, does it?
> 
> It is clearly the good wind of the Providence.

This is the second version of the patch, without the paranoid parentheses.

I am still in the process of convincing Thunderbird not to convert tabs to
spaces, so please use --ignore-whitespace when testing this patch. :-(

---
  drivers/memstick/core/memstick.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..390287c23f27 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
  {
         struct memstick_dev *card = container_of(dev, struct memstick_dev,
                                                  dev);
+       if (card->dev.kobj.name) {
+               kfree_const(card->dev.kobj.name);
+               card->dev.kobj.name = NULL;
+       }
         kfree(card);
  }

@@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
         return card;
  err_out:
         host->card = old_card;
+       if (card->dev.kobj.name) {
+               kfree_const(card->dev.kobj.name);
+               card->dev.kobj.name = NULL;
+       }
         kfree(card);
         return NULL;
  }


-- 
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union

Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v2] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-03-31 14:46   ` BUG FIX: [PATCH RFC v2] " Mirsad Goran Todorovac
@ 2023-03-31 16:32     ` Greg KH
  2023-03-31 20:48       ` Mirsad Goran Todorovac
  2023-04-01  6:23       ` BUG FIX: [PATCH RFC v3] " Mirsad Goran Todorovac
  0 siblings, 2 replies; 17+ messages in thread
From: Greg KH @ 2023-03-31 16:32 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Fri, Mar 31, 2023 at 04:46:03PM +0200, Mirsad Goran Todorovac wrote:
> On 29.3.2023. 19:25, Mirsad Goran Todorovac wrote:
> > On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
> > > Hi all,
> > > 
> > > When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
> > > and the result showed an unreferenced object in kworker process:
> > > 
> > > cat /sys/kernel/debug/kmemleak
> > > unreferenced object 0xffff888105028d80 (size 16):
> > >    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
> > >    hex dump (first 16 bytes):
> > >      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
> > >    backtrace:
> > >      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
> > >      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
> > >      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
> > >      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
> > >      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
> > >      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
> > >      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
> > >      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
> > >      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
> > >      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
> > >      [<ffffffffb676e556>] worker_thread+0x76/0x770
> > >      [<ffffffffb677b468>] kthread+0x168/0x1a0
> > >      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> > > 
> > > mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
> > > git bisect start
> > > # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
> > > git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
> > > # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
> > > git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
> > > # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
> > > git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
> > > # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> > > git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> > > # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> > > git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> > > # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
> > > git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
> > > # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
> > > git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
> > > 
> > > Greg asked me if I would help bisect the bug, since I failed to
> > > reproduce it on pre 4.17 kernels, because they wouldn't boot (black
> > > screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that
> > > only reproduced that bug:
> > > 
> > >      product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
> > >      vendor: LENOVO
> > >      version: V530S-07ICB
> > > 
> > > I would welcome any advice.
> > > 
> > > Please find attached the lshw output and the build config from the
> > > last kernel version that also exhibits this bug, so the conclusion
> > > is that it is not fixed since the report on November 29th 2022:
> > > 
> > > https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
> > > 
> > > With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
> > > 
> > > The bug occurs in one kernel memory leak, and it is unobvious
> > > whether a skilled attacker could use an abusive program to trigger
> > > the leak of enough 16 byte slabs (and overhead) to exhaust kernel
> > > memory and cause denial-of-service (crash of the system).
> > > 
> > > I apologise for the first unsuccessful attempt.
> > 
> > static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> > 
> > calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
> > calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
> > executes:
> > 
> >      if (strchr(s, '/')) {
> >          char *t;
> > 
> >          t = kstrdup(s, GFP_KERNEL);
> >          kfree_const(s);
> >          if (!t)
> >              return -ENOMEM;
> >          strreplace(t, '/', '!');
> >          s = t;
> >      }
> >      kfree_const(kobj->name);
> >      kobj->name = s;
> > 
> > so, this kobj->name was never freed in the "goto err_out" case in line 404.
> > 
> > 380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> > 381 {
> > 382         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
> > 383                                             GFP_KERNEL);
> > 384         struct memstick_dev *old_card = host->card;
> > 385         struct ms_id_register id_reg;
> > 386
> > 387         if (card) {
> > 388                 card->host = host;
> > 389                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
> > 390                 card->dev.parent = &host->dev;
> > 391                 card->dev.bus = &memstick_bus_type;
> > 392                 card->dev.release = memstick_free_card;
> > 393                 card->check = memstick_dummy_check;
> > 394
> > 395                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
> > 396                 card->reg_addr.r_length = sizeof(id_reg);
> > 397                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
> > 398                 card->reg_addr.w_length = sizeof(id_reg);
> > 399
> > 400                 init_completion(&card->mrq_complete);
> > 401
> > 402                 host->card = card;
> > 403                 if (memstick_set_rw_addr(card))
> > 404                         goto err_out;
> > 405
> > 406                 card->next_request = h_memstick_read_dev_id;
> > 407                 memstick_new_req(host);
> > 408                 wait_for_completion(&card->mrq_complete);
> > 409
> > 410                 if (card->current_mrq.error)
> > 411                         goto err_out;
> > 412         }
> > 413         host->card = old_card;
> > 414         return card;
> > 415 err_out:
> > 416         host->card = old_card;
> > 421         kfree(card);
> > 422         return NULL;
> > 423 }
> > 
> > This little patch fixes it, also at the release() method.
> > 
> > However, release() had not yet been tested, and I am not certain that in that case
> > kobj->name would not be kfree_const()-ed automatically.
> > 
> > Maybe it ought to be separated in two independent patches?
> > 
> > diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> > index bf7667845459..403ab06e3ffa 100644
> > --- a/drivers/memstick/core/memstick.c
> > +++ b/drivers/memstick/core/memstick.c
> > @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
> >   {
> >          struct memstick_dev *card = container_of(dev, struct memstick_dev,
> >                                                   dev);
> > +       if ((card->dev).kobj.name) {
> > +               kfree_const((card->dev).kobj.name);
> > +               (card->dev).kobj.name = NULL;
> > +       }
> >          kfree(card);
> >   }
> > 
> > @@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> >          return card;
> >   err_out:
> >          host->card = old_card;
> > +       if ((card->dev).kobj.name) {
> > +               kfree_const((card->dev).kobj.name);
> > +               (card->dev).kobj.name = NULL;
> > +       }
> >          kfree(card);
> >          return NULL;
> >   }
> > 
> > This morning I did not feel like we'd fix two memory leaks today.
> > 
> > This one was a nag for three months :-)
> > 
> > Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
> > output doesn't mean that it wouldn't break something, does it?
> > 
> > It is clearly the good wind of the Providence.
> 
> This is the second version of the patch, without the paranoid parentheses.
> 
> I am still in the process of convincing Thunderbird not to convert tabs to
> spaces, so please use --ignore-whitespace when testing this patch. :-(
> 
> ---
>  drivers/memstick/core/memstick.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..390287c23f27 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>  {
>         struct memstick_dev *card = container_of(dev, struct memstick_dev,
>                                                  dev);
> +       if (card->dev.kobj.name) {
> +               kfree_const(card->dev.kobj.name);

Ick, no, please don't mess around with a kobject name from within a
driver like this.  That's indicitave that something else is really
wrong.

Yes, the nvme core code does it, but it shouldn't.

Hm, the driver core does it in two places too, that's not good, I'll
look at fixing that up too.

This patch is implying that anyone who calls "dev_set_name()" also has
to do this hack, which shouldn't be the case at all.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v2] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-03-31 16:32     ` Greg KH
@ 2023-03-31 20:48       ` Mirsad Goran Todorovac
  2023-04-01  6:23       ` BUG FIX: [PATCH RFC v3] " Mirsad Goran Todorovac
  1 sibling, 0 replies; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-31 20:48 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 31. 03. 2023. 18:32, Greg KH wrote:
> On Fri, Mar 31, 2023 at 04:46:03PM +0200, Mirsad Goran Todorovac wrote:
>> On 29.3.2023. 19:25, Mirsad Goran Todorovac wrote:
>>> On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
>>>> Hi all,
>>>>
>>>> When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
>>>> and the result showed an unreferenced object in kworker process:
>>>>
>>>> cat /sys/kernel/debug/kmemleak
>>>> unreferenced object 0xffff888105028d80 (size 16):
>>>>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>>>>    hex dump (first 16 bytes):
>>>>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>>>    backtrace:
>>>>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>>>>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>>>>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>>>>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>>>>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>>>>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>>>>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>>>>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>>>>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>>>>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>>>>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>>>>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
>>>>
>>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>>> git bisect start
>>>> # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
>>>> git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
>>>> # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
>>>> git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
>>>> # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
>>>> git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
>>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>> # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>> git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
>>>> # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
>>>> git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
>>>> # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
>>>> git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
>>>>
>>>> Greg asked me if I would help bisect the bug, since I failed to
>>>> reproduce it on pre 4.17 kernels, because they wouldn't boot (black
>>>> screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that
>>>> only reproduced that bug:
>>>>
>>>>      product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
>>>>      vendor: LENOVO
>>>>      version: V530S-07ICB
>>>>
>>>> I would welcome any advice.
>>>>
>>>> Please find attached the lshw output and the build config from the
>>>> last kernel version that also exhibits this bug, so the conclusion
>>>> is that it is not fixed since the report on November 29th 2022:
>>>>
>>>> https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
>>>>
>>>> With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
>>>>
>>>> The bug occurs in one kernel memory leak, and it is unobvious
>>>> whether a skilled attacker could use an abusive program to trigger
>>>> the leak of enough 16 byte slabs (and overhead) to exhaust kernel
>>>> memory and cause denial-of-service (crash of the system).
>>>>
>>>> I apologise for the first unsuccessful attempt.
>>>
>>> static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>
>>> calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
>>> calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
>>> executes:
>>>
>>>      if (strchr(s, '/')) {
>>>          char *t;
>>>
>>>          t = kstrdup(s, GFP_KERNEL);
>>>          kfree_const(s);
>>>          if (!t)
>>>              return -ENOMEM;
>>>          strreplace(t, '/', '!');
>>>          s = t;
>>>      }
>>>      kfree_const(kobj->name);
>>>      kobj->name = s;
>>>
>>> so, this kobj->name was never freed in the "goto err_out" case in line 404.
>>>
>>> 380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>> 381 {
>>> 382         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
>>> 383                                             GFP_KERNEL);
>>> 384         struct memstick_dev *old_card = host->card;
>>> 385         struct ms_id_register id_reg;
>>> 386
>>> 387         if (card) {
>>> 388                 card->host = host;
>>> 389                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
>>> 390                 card->dev.parent = &host->dev;
>>> 391                 card->dev.bus = &memstick_bus_type;
>>> 392                 card->dev.release = memstick_free_card;
>>> 393                 card->check = memstick_dummy_check;
>>> 394
>>> 395                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
>>> 396                 card->reg_addr.r_length = sizeof(id_reg);
>>> 397                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
>>> 398                 card->reg_addr.w_length = sizeof(id_reg);
>>> 399
>>> 400                 init_completion(&card->mrq_complete);
>>> 401
>>> 402                 host->card = card;
>>> 403                 if (memstick_set_rw_addr(card))
>>> 404                         goto err_out;
>>> 405
>>> 406                 card->next_request = h_memstick_read_dev_id;
>>> 407                 memstick_new_req(host);
>>> 408                 wait_for_completion(&card->mrq_complete);
>>> 409
>>> 410                 if (card->current_mrq.error)
>>> 411                         goto err_out;
>>> 412         }
>>> 413         host->card = old_card;
>>> 414         return card;
>>> 415 err_out:
>>> 416         host->card = old_card;
>>> 421         kfree(card);
>>> 422         return NULL;
>>> 423 }
>>>
>>> This little patch fixes it, also at the release() method.
>>>
>>> However, release() had not yet been tested, and I am not certain that in that case
>>> kobj->name would not be kfree_const()-ed automatically.
>>>
>>> Maybe it ought to be separated in two independent patches?
>>>
>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>> index bf7667845459..403ab06e3ffa 100644
>>> --- a/drivers/memstick/core/memstick.c
>>> +++ b/drivers/memstick/core/memstick.c
>>> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>>>   {
>>>          struct memstick_dev *card = container_of(dev, struct memstick_dev,
>>>                                                   dev);
>>> +       if ((card->dev).kobj.name) {
>>> +               kfree_const((card->dev).kobj.name);
>>> +               (card->dev).kobj.name = NULL;
>>> +       }
>>>          kfree(card);
>>>   }
>>>
>>> @@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>          return card;
>>>   err_out:
>>>          host->card = old_card;
>>> +       if ((card->dev).kobj.name) {
>>> +               kfree_const((card->dev).kobj.name);
>>> +               (card->dev).kobj.name = NULL;
>>> +       }
>>>          kfree(card);
>>>          return NULL;
>>>   }
>>>
>>> This morning I did not feel like we'd fix two memory leaks today.
>>>
>>> This one was a nag for three months :-)
>>>
>>> Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
>>> output doesn't mean that it wouldn't break something, does it?
>>>
>>> It is clearly the good wind of the Providence.
>>
>> This is the second version of the patch, without the paranoid parentheses.
>>
>> I am still in the process of convincing Thunderbird not to convert tabs to
>> spaces, so please use --ignore-whitespace when testing this patch. :-(
>>
>> ---
>>  drivers/memstick/core/memstick.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>> index bf7667845459..390287c23f27 100644
>> --- a/drivers/memstick/core/memstick.c
>> +++ b/drivers/memstick/core/memstick.c
>> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>>  {
>>         struct memstick_dev *card = container_of(dev, struct memstick_dev,
>>                                                  dev);
>> +       if (card->dev.kobj.name) {
>> +               kfree_const(card->dev.kobj.name);
> 
> Ick, no, please don't mess around with a kobject name from within a
> driver like this.  That's indicitave that something else is really
> wrong.
> 
> Yes, the nvme core code does it, but it shouldn't.
> 
> Hm, the driver core does it in two places too, that's not good, I'll
> look at fixing that up too.
> 
> This patch is implying that anyone who calls "dev_set_name()" also has
> to do this hack, which shouldn't be the case at all.
> 
> thanks,
> 
> greg k-h

Hi, Mr. Greg,

I tend to agree with you that releasing card->dev.kobj.name
should be done from the destructor method of the card object.

AFAICT, the kobj.name is allocated deep withing dev_set_name in line 385
of drivers/memstick/core/memstick.c:

376 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
377 {
378         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
379                                             GFP_KERNEL);
380         struct memstick_dev *old_card = host->card;
381         struct ms_id_register id_reg;
382 
383         if (card) {
384                 card->host = host;
385                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
386                 card->dev.parent = &host->dev;
387                 card->dev.bus = &memstick_bus_type;
388                 card->dev.release = memstick_free_card;
389                 card->check = memstick_dummy_check;
390 
391                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
392                 card->reg_addr.r_length = sizeof(id_reg);
393                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
394                 card->reg_addr.w_length = sizeof(id_reg);
395 
396                 init_completion(&card->mrq_complete);
397 
398                 host->card = card;
399                 if (memstick_set_rw_addr(card))
400                         goto err_out;
401 
402                 card->next_request = h_memstick_read_dev_id;
403                 memstick_new_req(host);
404                 wait_for_completion(&card->mrq_complete);
405 
406                 if (card->current_mrq.error)
407                         goto err_out;
408         }
409         host->card = old_card;
410         return card;
411 err_out:
412         host->card = old_card;
413         if (card->dev.kobj.name) {
414                 kfree_const(card->dev.kobj.name);
415                 card->dev.kobj.name = NULL;
416         }
417         kfree(card);
418         return NULL;
419 }

I am not certain what is semantically right thing to do in this case, I also 
see there is another kfree(card) in memstick_check():

431 static void memstick_check(struct work_struct *work)
432 {
433         struct memstick_host *host = container_of(work, struct memstick_host,
434                                                   media_checker);
435         struct memstick_dev *card;
436 
437         dev_dbg(&host->dev, "memstick_check started\n");
438         pm_runtime_get_noresume(host->dev.parent);
439         mutex_lock(&host->lock);
440         if (!host->card) {
441                 if (memstick_power_on(host))
442                         goto out_power_off;
443         } else if (host->card->stop)
444                 host->card->stop(host->card);
445 
446         if (host->removing)
447                 goto out_power_off;
448 
449         card = memstick_alloc_card(host);
450 
451         if (!card) {
452                 if (host->card) {
453                         device_unregister(&host->card->dev);
454                         host->card = NULL;
455                 }
456         } else {
457                 dev_dbg(&host->dev, "new card %02x, %02x, %02x\n",
458                         card->id.type, card->id.category, card->id.class);
459                 if (host->card) {
460                         if (memstick_set_rw_addr(host->card)
461                             || !memstick_dev_match(host->card, &card->id)
462                             || !(host->card->check(host->card))) {
463                                 device_unregister(&host->card->dev);
464                                 host->card = NULL;
465                         } else if (host->card->start)
466                                 host->card->start(host->card);
467                 }
468 
469                 if (!host->card) {
470                         host->card = card;
471                         if (device_register(&card->dev)) {
472                                 put_device(&card->dev);
473                                 host->card = NULL;
474                         }
475                 } else
476                         kfree(card);
477         }
478 
479 out_power_off:
480         if (!host->card)
481                 host->set_param(host, MEMSTICK_POWER, MEMSTICK_POWER_OFF);
482 
483         mutex_unlock(&host->lock);
484         pm_runtime_put(host->dev.parent);
485         dev_dbg(&host->dev, "memstick_check finished\n");
486 }

Frankly, I was happy to remove /sys/kernel/debug/kmemleak report of the leak,
and this is still in pre-alpha stage.

This fix was not a formal patch but a PoC - once the blame is located at the
line 385 in memstick_alloc_card() and on dev_set_name(), I see there are other
places where the driver could leak memory safe for the initial patched fix.

It seems obvious that the card->dev.kobj.name has to be released and freed or
it will leak memory. I do not have sufficient insight into the Source to
do that. I think you will do a better job at it, having the advantage of seeing
the forest and not just the tree.

Thanks,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-03-31 16:32     ` Greg KH
  2023-03-31 20:48       ` Mirsad Goran Todorovac
@ 2023-04-01  6:23       ` Mirsad Goran Todorovac
  2023-04-01  6:28         ` Greg KH
  1 sibling, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-01  6:23 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 31. 03. 2023. 18:32, Greg KH wrote:
> On Fri, Mar 31, 2023 at 04:46:03PM +0200, Mirsad Goran Todorovac wrote:
>> On 29.3.2023. 19:25, Mirsad Goran Todorovac wrote:
>>> On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
>>>> Hi all,
>>>>
>>>> When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
>>>> and the result showed an unreferenced object in kworker process:
>>>>
>>>> cat /sys/kernel/debug/kmemleak
>>>> unreferenced object 0xffff888105028d80 (size 16):
>>>>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>>>>    hex dump (first 16 bytes):
>>>>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>>>    backtrace:
>>>>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>>>>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>>>>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>>>>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>>>>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>>>>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>>>>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>>>>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>>>>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>>>>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>>>>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>>>>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
>>>>
>>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>>> git bisect start
>>>> # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
>>>> git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
>>>> # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
>>>> git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
>>>> # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
>>>> git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
>>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>> # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
>>>> git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
>>>> # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
>>>> git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
>>>> # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
>>>> git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
>>>>
>>>> Greg asked me if I would help bisect the bug, since I failed to
>>>> reproduce it on pre 4.17 kernels, because they wouldn't boot (black
>>>> screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that
>>>> only reproduced that bug:
>>>>
>>>>      product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
>>>>      vendor: LENOVO
>>>>      version: V530S-07ICB
>>>>
>>>> I would welcome any advice.
>>>>
>>>> Please find attached the lshw output and the build config from the
>>>> last kernel version that also exhibits this bug, so the conclusion
>>>> is that it is not fixed since the report on November 29th 2022:
>>>>
>>>> https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
>>>>
>>>> With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
>>>>
>>>> The bug occurs in one kernel memory leak, and it is unobvious
>>>> whether a skilled attacker could use an abusive program to trigger
>>>> the leak of enough 16 byte slabs (and overhead) to exhaust kernel
>>>> memory and cause denial-of-service (crash of the system).
>>>>
>>>> I apologise for the first unsuccessful attempt.
>>>
>>> static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>
>>> calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
>>> calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
>>> executes:
>>>
>>>      if (strchr(s, '/')) {
>>>          char *t;
>>>
>>>          t = kstrdup(s, GFP_KERNEL);
>>>          kfree_const(s);
>>>          if (!t)
>>>              return -ENOMEM;
>>>          strreplace(t, '/', '!');
>>>          s = t;
>>>      }
>>>      kfree_const(kobj->name);
>>>      kobj->name = s;
>>>
>>> so, this kobj->name was never freed in the "goto err_out" case in line 404.
>>>
>>> 380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>> 381 {
>>> 382         struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
>>> 383                                             GFP_KERNEL);
>>> 384         struct memstick_dev *old_card = host->card;
>>> 385         struct ms_id_register id_reg;
>>> 386
>>> 387         if (card) {
>>> 388                 card->host = host;
>>> 389                 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
>>> 390                 card->dev.parent = &host->dev;
>>> 391                 card->dev.bus = &memstick_bus_type;
>>> 392                 card->dev.release = memstick_free_card;
>>> 393                 card->check = memstick_dummy_check;
>>> 394
>>> 395                 card->reg_addr.r_offset = offsetof(struct ms_register, id);
>>> 396                 card->reg_addr.r_length = sizeof(id_reg);
>>> 397                 card->reg_addr.w_offset = offsetof(struct ms_register, id);
>>> 398                 card->reg_addr.w_length = sizeof(id_reg);
>>> 399
>>> 400                 init_completion(&card->mrq_complete);
>>> 401
>>> 402                 host->card = card;
>>> 403                 if (memstick_set_rw_addr(card))
>>> 404                         goto err_out;
>>> 405
>>> 406                 card->next_request = h_memstick_read_dev_id;
>>> 407                 memstick_new_req(host);
>>> 408                 wait_for_completion(&card->mrq_complete);
>>> 409
>>> 410                 if (card->current_mrq.error)
>>> 411                         goto err_out;
>>> 412         }
>>> 413         host->card = old_card;
>>> 414         return card;
>>> 415 err_out:
>>> 416         host->card = old_card;
>>> 421         kfree(card);
>>> 422         return NULL;
>>> 423 }
>>>
>>> This little patch fixes it, also at the release() method.
>>>
>>> However, release() had not yet been tested, and I am not certain that in that case
>>> kobj->name would not be kfree_const()-ed automatically.
>>>
>>> Maybe it ought to be separated in two independent patches?
>>>
>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>> index bf7667845459..403ab06e3ffa 100644
>>> --- a/drivers/memstick/core/memstick.c
>>> +++ b/drivers/memstick/core/memstick.c
>>> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>>>   {
>>>          struct memstick_dev *card = container_of(dev, struct memstick_dev,
>>>                                                   dev);
>>> +       if ((card->dev).kobj.name) {
>>> +               kfree_const((card->dev).kobj.name);
>>> +               (card->dev).kobj.name = NULL;
>>> +       }
>>>          kfree(card);
>>>   }
>>>
>>> @@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>          return card;
>>>   err_out:
>>>          host->card = old_card;
>>> +       if ((card->dev).kobj.name) {
>>> +               kfree_const((card->dev).kobj.name);
>>> +               (card->dev).kobj.name = NULL;
>>> +       }
>>>          kfree(card);
>>>          return NULL;
>>>   }
>>>
>>> This morning I did not feel like we'd fix two memory leaks today.
>>>
>>> This one was a nag for three months :-)
>>>
>>> Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
>>> output doesn't mean that it wouldn't break something, does it?
>>>
>>> It is clearly the good wind of the Providence.
>>
>> This is the second version of the patch, without the paranoid parentheses.
>>
>> I am still in the process of convincing Thunderbird not to convert tabs to
>> spaces, so please use --ignore-whitespace when testing this patch. :-(
>>
>> ---
>>  drivers/memstick/core/memstick.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>> index bf7667845459..390287c23f27 100644
>> --- a/drivers/memstick/core/memstick.c
>> +++ b/drivers/memstick/core/memstick.c
>> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
>>  {
>>         struct memstick_dev *card = container_of(dev, struct memstick_dev,
>>                                                  dev);
>> +       if (card->dev.kobj.name) {
>> +               kfree_const(card->dev.kobj.name);
> 
> Ick, no, please don't mess around with a kobject name from within a
> driver like this.  That's indicitave that something else is really
> wrong.
> 
> Yes, the nvme core code does it, but it shouldn't.
> 
> Hm, the driver core does it in two places too, that's not good, I'll
> look at fixing that up too.
> 
> This patch is implying that anyone who calls "dev_set_name()" also has
> to do this hack, which shouldn't be the case at all.
> 
> thanks,
> 
> greg k-h

This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
see a more sensible way to patch this up.

This fix patches the leak for me, and I am running thorough `make kselftest` ATM.

Apparently, struct kobject is a member of struct device, which is a member of
struct memstick_dev (those are not pointers so that they would require object destruction).

However, IMHO; if the card-dev.kobj.name is not freed explicitly, it will be left
orphaned if the parent kobject is freed together with struct device and struct memstick_dev.

Please consider the following patch RFC (not an formal patch submission):

---
 drivers/memstick/core/memstick.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..5bfefdfbeca9 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -191,6 +191,8 @@ static void memstick_free_card(struct device *dev)
 {
        struct memstick_dev *card = container_of(dev, struct memstick_dev,
                                                 dev);
+       if (dev_name(&card->dev))
+               kfree_const(dev_name(&card->dev));
        kfree(card);
 }
 
@@ -410,6 +412,8 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
        return card;
 err_out:
        host->card = old_card;
+       if (dev_name(&card->dev))
+               kfree_const(dev_name(&card->dev));
        kfree(card);
        return NULL;
 }
@@ -468,8 +472,11 @@ static void memstick_check(struct work_struct *work)
                                put_device(&card->dev);
                                host->card = NULL;
                        }
-               } else
+               } else {
+                       if (dev_name(&card->dev))
+                               kfree_const(dev_name(&card->dev));
                        kfree(card);
+               }
        }
 
 out_power_off:
--

Kind regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  6:23       ` BUG FIX: [PATCH RFC v3] " Mirsad Goran Todorovac
@ 2023-04-01  6:28         ` Greg KH
  2023-04-01  6:33           ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2023-04-01  6:28 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> > This patch is implying that anyone who calls "dev_set_name()" also has
> > to do this hack, which shouldn't be the case at all.
> > 
> > thanks,
> > 
> > greg k-h
> 
> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> see a more sensible way to patch this up.

In sleeping on this, I think this has to move to the driver core.  I
don't understand why we haven't seen this before, except maybe no one
has really noticed before (i.e. we haven't had good leak detection tools
that run with removable devices?)

Anyway, let me see if I can come up with something this weekend, give me
a chance...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  6:28         ` Greg KH
@ 2023-04-01  6:33           ` Greg KH
  2023-04-01  9:18             ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2023-04-01  6:33 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> > > This patch is implying that anyone who calls "dev_set_name()" also has
> > > to do this hack, which shouldn't be the case at all.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> > see a more sensible way to patch this up.
> 
> In sleeping on this, I think this has to move to the driver core.  I
> don't understand why we haven't seen this before, except maybe no one
> has really noticed before (i.e. we haven't had good leak detection tools
> that run with removable devices?)
> 
> Anyway, let me see if I can come up with something this weekend, give me
> a chance...

Wait, no, this already should be handled by the kobject core, look at
kobject_cleanup(), at the bottom.  So your change should be merely
duplicating the logic there that already runs when the struct device is
freed, right?

So I don't understand why your change works, odd.  I need more coffee...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  6:33           ` Greg KH
@ 2023-04-01  9:18             ` Greg KH
  2023-04-01  9:23               ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2023-04-01  9:18 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
> > On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> > > > This patch is implying that anyone who calls "dev_set_name()" also has
> > > > to do this hack, which shouldn't be the case at all.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> > > see a more sensible way to patch this up.
> > 
> > In sleeping on this, I think this has to move to the driver core.  I
> > don't understand why we haven't seen this before, except maybe no one
> > has really noticed before (i.e. we haven't had good leak detection tools
> > that run with removable devices?)
> > 
> > Anyway, let me see if I can come up with something this weekend, give me
> > a chance...
> 
> Wait, no, this already should be handled by the kobject core, look at
> kobject_cleanup(), at the bottom.  So your change should be merely
> duplicating the logic there that already runs when the struct device is
> freed, right?
> 
> So I don't understand why your change works, odd.  I need more coffee...

I think you got half of the change correctly.  This init code is a maze
of twisty passages, let me take your patch and tweak it a bit into
something that I think should work.  This looks to be only a memstick
issue, not a driver core issue (which makes me feel better.)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  9:18             ` Greg KH
@ 2023-04-01  9:23               ` Greg KH
  2023-04-01  9:52                 ` Mirsad Goran Todorovac
  2023-04-01 11:25                 ` BUG FIX: [PATCH RFC v3] [TESTED OK] " Mirsad Goran Todorovac
  0 siblings, 2 replies; 17+ messages in thread
From: Greg KH @ 2023-04-01  9:23 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
> > On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
> > > On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> > > > > This patch is implying that anyone who calls "dev_set_name()" also has
> > > > > to do this hack, which shouldn't be the case at all.
> > > > > 
> > > > > thanks,
> > > > > 
> > > > > greg k-h
> > > > 
> > > > This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> > > > see a more sensible way to patch this up.
> > > 
> > > In sleeping on this, I think this has to move to the driver core.  I
> > > don't understand why we haven't seen this before, except maybe no one
> > > has really noticed before (i.e. we haven't had good leak detection tools
> > > that run with removable devices?)
> > > 
> > > Anyway, let me see if I can come up with something this weekend, give me
> > > a chance...
> > 
> > Wait, no, this already should be handled by the kobject core, look at
> > kobject_cleanup(), at the bottom.  So your change should be merely
> > duplicating the logic there that already runs when the struct device is
> > freed, right?
> > 
> > So I don't understand why your change works, odd.  I need more coffee...
> 
> I think you got half of the change correctly.  This init code is a maze
> of twisty passages, let me take your patch and tweak it a bit into
> something that I think should work.  This looks to be only a memstick
> issue, not a driver core issue (which makes me feel better.)

Oops, forgot the patch.  Can you try this change here and let me know if
that solves the problem or not?  I have compile-tested it only, so I
have no idea if it works.

If this does work, I'll make up a "real" function to replace the
horrible dev.kobj.name mess that a driver would have to do here as it
shouldn't be required that a driver author knows the internals of the
driver core that well...

thanks,

greg k-h

--------------------


diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..bbfaf6536903 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
 	return card;
 err_out:
 	host->card = old_card;
+	kfree_const(card->dev.kobj.name);
 	kfree(card);
 	return NULL;
 }
@@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
 				put_device(&card->dev);
 				host->card = NULL;
 			}
-		} else
+		} else {
+			kfree_const(card->dev.kobj.name);
 			kfree(card);
+		}
 	}
 
 out_power_off:

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  9:23               ` Greg KH
@ 2023-04-01  9:52                 ` Mirsad Goran Todorovac
  2023-04-01 10:01                   ` Mirsad Goran Todorovac
  2023-04-01 11:25                 ` BUG FIX: [PATCH RFC v3] [TESTED OK] " Mirsad Goran Todorovac
  1 sibling, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-01  9:52 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 01. 04. 2023. 11:23, Greg KH wrote:
> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> greg k-h
>>>>>
>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>> see a more sensible way to patch this up.
>>>>
>>>> In sleeping on this, I think this has to move to the driver core.  I
>>>> don't understand why we haven't seen this before, except maybe no one
>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>> that run with removable devices?)
>>>>
>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>> a chance...
>>>
>>> Wait, no, this already should be handled by the kobject core, look at
>>> kobject_cleanup(), at the bottom.  So your change should be merely
>>> duplicating the logic there that already runs when the struct device is
>>> freed, right?
>>>
>>> So I don't understand why your change works, odd.  I need more coffee...
>>
>> I think you got half of the change correctly.  This init code is a maze
>> of twisty passages, let me take your patch and tweak it a bit into
>> something that I think should work.  This looks to be only a memstick
>> issue, not a driver core issue (which makes me feel better.)
> 
> Oops, forgot the patch.  Can you try this change here and let me know if
> that solves the problem or not?  I have compile-tested it only, so I
> have no idea if it works.
> 
> If this does work, I'll make up a "real" function to replace the
> horrible dev.kobj.name mess that a driver would have to do here as it
> shouldn't be required that a driver author knows the internals of the
> driver core that well...
> 
> thanks,
> 
> greg k-h
> 
> --------------------
> 
> 
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..bbfaf6536903 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>  	return card;
>  err_out:
>  	host->card = old_card;
> +	kfree_const(card->dev.kobj.name);
>  	kfree(card);
>  	return NULL;
>  }
> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>  				put_device(&card->dev);
>  				host->card = NULL;
>  			}
> -		} else
> +		} else {
> +			kfree_const(card->dev.kobj.name);
>  			kfree(card);
> +		}
>  	}
>  
>  out_power_off:

I thought of this version, but I am not sure about tracking the device_register() and
device_unregister() calls?

put_device() calls put_kobject() which frees the const char *kobj.name ...

I thought how host cannot just be kfree()d when host->card is still allocated.
And it is a pointer. That also seems to me like a bug :-/

Kind regards,
Mirsad

---
diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..46c7bda9715d 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
 {
        struct memstick_host *host = container_of(dev, struct memstick_host,
                                                  dev);
+       if (host->card && host->card->dev)
+               put_device(&host->card->dev);
        kfree(host);
 }
 
@@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
        return card;
 err_out:
        host->card = old_card;
-       kfree(card);
+       put_device(&card->dev);
        return NULL;
 }
 
@@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
                                put_device(&card->dev);
                                host->card = NULL;
                        }
-               } else
-                       kfree(card);
+               } else {
+                       put_device(&card->dev);
+               }
        }
 
 out_power_off:


-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  9:52                 ` Mirsad Goran Todorovac
@ 2023-04-01 10:01                   ` Mirsad Goran Todorovac
  2023-04-01 10:14                     ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-01 10:01 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 01. 04. 2023. 11:52, Mirsad Goran Todorovac wrote:
> On 01. 04. 2023. 11:23, Greg KH wrote:
>> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> greg k-h
>>>>>>
>>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>>> see a more sensible way to patch this up.
>>>>>
>>>>> In sleeping on this, I think this has to move to the driver core.  I
>>>>> don't understand why we haven't seen this before, except maybe no one
>>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>>> that run with removable devices?)
>>>>>
>>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>>> a chance...
>>>>
>>>> Wait, no, this already should be handled by the kobject core, look at
>>>> kobject_cleanup(), at the bottom.  So your change should be merely
>>>> duplicating the logic there that already runs when the struct device is
>>>> freed, right?
>>>>
>>>> So I don't understand why your change works, odd.  I need more coffee...
>>>
>>> I think you got half of the change correctly.  This init code is a maze
>>> of twisty passages, let me take your patch and tweak it a bit into
>>> something that I think should work.  This looks to be only a memstick
>>> issue, not a driver core issue (which makes me feel better.)
>>
>> Oops, forgot the patch.  Can you try this change here and let me know if
>> that solves the problem or not?  I have compile-tested it only, so I
>> have no idea if it works.
>>
>> If this does work, I'll make up a "real" function to replace the
>> horrible dev.kobj.name mess that a driver would have to do here as it
>> shouldn't be required that a driver author knows the internals of the
>> driver core that well...
>>
>> thanks,
>>
>> greg k-h
>>
>> --------------------
>>
>>
>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>> index bf7667845459..bbfaf6536903 100644
>> --- a/drivers/memstick/core/memstick.c
>> +++ b/drivers/memstick/core/memstick.c
>> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>  	return card;
>>  err_out:
>>  	host->card = old_card;
>> +	kfree_const(card->dev.kobj.name);
>>  	kfree(card);
>>  	return NULL;
>>  }
>> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>>  				put_device(&card->dev);
>>  				host->card = NULL;
>>  			}
>> -		} else
>> +		} else {
>> +			kfree_const(card->dev.kobj.name);
>>  			kfree(card);
>> +		}
>>  	}
>>  
>>  out_power_off:
> 
> I thought of this version, but I am not sure about tracking the device_register() and
> device_unregister() calls?
> 
> put_device() calls put_kobject() which frees the const char *kobj.name ...
> 
> I thought how host cannot just be kfree()d when host->card is still allocated.
> And it is a pointer. That also seems to me like a bug :-/
> 
> Kind regards,
> Mirsad
> 
> ---
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..46c7bda9715d 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>  {
>         struct memstick_host *host = container_of(dev, struct memstick_host,
>                                                   dev);
> +       if (host->card && host->card->dev)
> +               put_device(&host->card->dev);
>         kfree(host);
>  }
>  
> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>         return card;
>  err_out:
>         host->card = old_card;
> -       kfree(card);
> +       put_device(&card->dev);
>         return NULL;
>  }
>  
> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>                                 put_device(&card->dev);
>                                 host->card = NULL;
>                         }
> -               } else
> -                       kfree(card);
> +               } else {
> +                       put_device(&card->dev);
> +               }
>         }
>  
>  out_power_off:

Thousand apologies, the previous version had a compilation error. I've sent the untested
version.

I must have become over-confident. But they say that a mistake that makes you humbled
is better than success that makes you arrogant :-|

I would like your opinion on the patch before I actually start the kernel, for I won't
be able to reboot clean that machine if it hangs in kernel until Tuesday :-(

It seems that put_device() would call the release method of the device and kfree() in
it, but I cannot say anything about the side effects, for I do not know the source so
well ...

Kind regards,
Mirsad

---
diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index bf7667845459..c63250322e26 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
 {
        struct memstick_host *host = container_of(dev, struct memstick_host,
                                                  dev);
+       if (host->card)
+               put_device(&host->card->dev);
        kfree(host);
 }
 
@@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
        return card;
 err_out:
        host->card = old_card;
-       kfree(card);
+       put_device(&card->dev);
        return NULL;
 }
 
@@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
                                put_device(&card->dev);
                                host->card = NULL;
                        }
-               } else
-                       kfree(card);
+               } else {
+                       put_device(&card->dev);
+               }
        }
 
 out_power_off:
 

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01 10:01                   ` Mirsad Goran Todorovac
@ 2023-04-01 10:14                     ` Greg KH
  2023-04-01 10:38                       ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2023-04-01 10:14 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 12:01:43PM +0200, Mirsad Goran Todorovac wrote:
> On 01. 04. 2023. 11:52, Mirsad Goran Todorovac wrote:
> > On 01. 04. 2023. 11:23, Greg KH wrote:
> >> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
> >>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
> >>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
> >>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> >>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
> >>>>>>> to do this hack, which shouldn't be the case at all.
> >>>>>>>
> >>>>>>> thanks,
> >>>>>>>
> >>>>>>> greg k-h
> >>>>>>
> >>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> >>>>>> see a more sensible way to patch this up.
> >>>>>
> >>>>> In sleeping on this, I think this has to move to the driver core.  I
> >>>>> don't understand why we haven't seen this before, except maybe no one
> >>>>> has really noticed before (i.e. we haven't had good leak detection tools
> >>>>> that run with removable devices?)
> >>>>>
> >>>>> Anyway, let me see if I can come up with something this weekend, give me
> >>>>> a chance...
> >>>>
> >>>> Wait, no, this already should be handled by the kobject core, look at
> >>>> kobject_cleanup(), at the bottom.  So your change should be merely
> >>>> duplicating the logic there that already runs when the struct device is
> >>>> freed, right?
> >>>>
> >>>> So I don't understand why your change works, odd.  I need more coffee...
> >>>
> >>> I think you got half of the change correctly.  This init code is a maze
> >>> of twisty passages, let me take your patch and tweak it a bit into
> >>> something that I think should work.  This looks to be only a memstick
> >>> issue, not a driver core issue (which makes me feel better.)
> >>
> >> Oops, forgot the patch.  Can you try this change here and let me know if
> >> that solves the problem or not?  I have compile-tested it only, so I
> >> have no idea if it works.
> >>
> >> If this does work, I'll make up a "real" function to replace the
> >> horrible dev.kobj.name mess that a driver would have to do here as it
> >> shouldn't be required that a driver author knows the internals of the
> >> driver core that well...
> >>
> >> thanks,
> >>
> >> greg k-h
> >>
> >> --------------------
> >>
> >>
> >> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> >> index bf7667845459..bbfaf6536903 100644
> >> --- a/drivers/memstick/core/memstick.c
> >> +++ b/drivers/memstick/core/memstick.c
> >> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> >>  	return card;
> >>  err_out:
> >>  	host->card = old_card;
> >> +	kfree_const(card->dev.kobj.name);
> >>  	kfree(card);
> >>  	return NULL;
> >>  }
> >> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
> >>  				put_device(&card->dev);
> >>  				host->card = NULL;
> >>  			}
> >> -		} else
> >> +		} else {
> >> +			kfree_const(card->dev.kobj.name);
> >>  			kfree(card);
> >> +		}
> >>  	}
> >>  
> >>  out_power_off:
> > 
> > I thought of this version, but I am not sure about tracking the device_register() and
> > device_unregister() calls?
> > 
> > put_device() calls put_kobject() which frees the const char *kobj.name ...
> > 
> > I thought how host cannot just be kfree()d when host->card is still allocated.
> > And it is a pointer. That also seems to me like a bug :-/
> > 
> > Kind regards,
> > Mirsad
> > 
> > ---
> > diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> > index bf7667845459..46c7bda9715d 100644
> > --- a/drivers/memstick/core/memstick.c
> > +++ b/drivers/memstick/core/memstick.c
> > @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
> >  {
> >         struct memstick_host *host = container_of(dev, struct memstick_host,
> >                                                   dev);
> > +       if (host->card && host->card->dev)
> > +               put_device(&host->card->dev);
> >         kfree(host);
> >  }
> >  
> > @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> >         return card;
> >  err_out:
> >         host->card = old_card;
> > -       kfree(card);
> > +       put_device(&card->dev);
> >         return NULL;
> >  }
> >  
> > @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
> >                                 put_device(&card->dev);
> >                                 host->card = NULL;
> >                         }
> > -               } else
> > -                       kfree(card);
> > +               } else {
> > +                       put_device(&card->dev);
> > +               }
> >         }
> >  
> >  out_power_off:
> 
> Thousand apologies, the previous version had a compilation error. I've sent the untested
> version.
> 
> I must have become over-confident. But they say that a mistake that makes you humbled
> is better than success that makes you arrogant :-|
> 
> I would like your opinion on the patch before I actually start the kernel, for I won't
> be able to reboot clean that machine if it hangs in kernel until Tuesday :-(
> 
> It seems that put_device() would call the release method of the device and kfree() in
> it, but I cannot say anything about the side effects, for I do not know the source so
> well ...
> 
> Kind regards,
> Mirsad
> 
> ---
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..c63250322e26 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>  {
>         struct memstick_host *host = container_of(dev, struct memstick_host,
>                                                   dev);
> +       if (host->card)
> +               put_device(&host->card->dev);

This isn't going to work as at this moment in time, the last reference
count has already happened, causing this release callback to be called,
so that the bus driver can free the memory for the device.

So you would be calling put_device() on a device already has 0 for a
reference count :)

>         kfree(host);
>  }
>  
> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>         return card;
>  err_out:
>         host->card = old_card;
> -       kfree(card);
> +       put_device(&card->dev);

No, the device was not registered here yet, right?  That would be
required _IFF_ there was a call to device_register().

>         return NULL;
>  }
>  
> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>                                 put_device(&card->dev);
>                                 host->card = NULL;
>                         }
> -               } else
> -                       kfree(card);
> +               } else {
> +                       put_device(&card->dev);

Same here, unless I'm reading this wrong, device_register() had not been
called yet, which is why the kfree was required (same for the above
call).

But hey, this driver really is a maze of twisty callbacks and workqueues
and complexity, for no obvious reason to me (maybe because of some async
requirement for memstick devices?  Thankfully I no longer have this
hardware...)  So I might be totally wrong...

I would recommend trying my version first, it "shouldn't" cause anything
worse to happen from what you have today, but hey, that's just my guess.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01 10:14                     ` Greg KH
@ 2023-04-01 10:38                       ` Mirsad Goran Todorovac
  0 siblings, 0 replies; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-01 10:38 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 01. 04. 2023. 12:14, Greg KH wrote:
> On Sat, Apr 01, 2023 at 12:01:43PM +0200, Mirsad Goran Todorovac wrote:
>> On 01. 04. 2023. 11:52, Mirsad Goran Todorovac wrote:
>>> On 01. 04. 2023. 11:23, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>>>>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>>>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> greg k-h
>>>>>>>>
>>>>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>>>>> see a more sensible way to patch this up.
>>>>>>>
>>>>>>> In sleeping on this, I think this has to move to the driver core.  I
>>>>>>> don't understand why we haven't seen this before, except maybe no one
>>>>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>>>>> that run with removable devices?)
>>>>>>>
>>>>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>>>>> a chance...
>>>>>>
>>>>>> Wait, no, this already should be handled by the kobject core, look at
>>>>>> kobject_cleanup(), at the bottom.  So your change should be merely
>>>>>> duplicating the logic there that already runs when the struct device is
>>>>>> freed, right?
>>>>>>
>>>>>> So I don't understand why your change works, odd.  I need more coffee...
>>>>>
>>>>> I think you got half of the change correctly.  This init code is a maze
>>>>> of twisty passages, let me take your patch and tweak it a bit into
>>>>> something that I think should work.  This looks to be only a memstick
>>>>> issue, not a driver core issue (which makes me feel better.)
>>>>
>>>> Oops, forgot the patch.  Can you try this change here and let me know if
>>>> that solves the problem or not?  I have compile-tested it only, so I
>>>> have no idea if it works.
>>>>
>>>> If this does work, I'll make up a "real" function to replace the
>>>> horrible dev.kobj.name mess that a driver would have to do here as it
>>>> shouldn't be required that a driver author knows the internals of the
>>>> driver core that well...
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>>
>>>> --------------------
>>>>
>>>>
>>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>>> index bf7667845459..bbfaf6536903 100644
>>>> --- a/drivers/memstick/core/memstick.c
>>>> +++ b/drivers/memstick/core/memstick.c
>>>> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>>  	return card;
>>>>  err_out:
>>>>  	host->card = old_card;
>>>> +	kfree_const(card->dev.kobj.name);
>>>>  	kfree(card);
>>>>  	return NULL;
>>>>  }
>>>> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>>>>  				put_device(&card->dev);
>>>>  				host->card = NULL;
>>>>  			}
>>>> -		} else
>>>> +		} else {
>>>> +			kfree_const(card->dev.kobj.name);
>>>>  			kfree(card);
>>>> +		}
>>>>  	}
>>>>  
>>>>  out_power_off:
>>>
>>> I thought of this version, but I am not sure about tracking the device_register() and
>>> device_unregister() calls?
>>>
>>> put_device() calls put_kobject() which frees the const char *kobj.name ...
>>>
>>> I thought how host cannot just be kfree()d when host->card is still allocated.
>>> And it is a pointer. That also seems to me like a bug :-/
>>>
>>> Kind regards,
>>> Mirsad
>>>
>>> ---
>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>> index bf7667845459..46c7bda9715d 100644
>>> --- a/drivers/memstick/core/memstick.c
>>> +++ b/drivers/memstick/core/memstick.c
>>> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>>>  {
>>>         struct memstick_host *host = container_of(dev, struct memstick_host,
>>>                                                   dev);
>>> +       if (host->card && host->card->dev)
>>> +               put_device(&host->card->dev);
>>>         kfree(host);
>>>  }
>>>  
>>> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>         return card;
>>>  err_out:
>>>         host->card = old_card;
>>> -       kfree(card);
>>> +       put_device(&card->dev);
>>>         return NULL;
>>>  }
>>>  
>>> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>>>                                 put_device(&card->dev);
>>>                                 host->card = NULL;
>>>                         }
>>> -               } else
>>> -                       kfree(card);
>>> +               } else {
>>> +                       put_device(&card->dev);
>>> +               }
>>>         }
>>>  
>>>  out_power_off:
>>
>> Thousand apologies, the previous version had a compilation error. I've sent the untested
>> version.
>>
>> I must have become over-confident. But they say that a mistake that makes you humbled
>> is better than success that makes you arrogant :-|
>>
>> I would like your opinion on the patch before I actually start the kernel, for I won't
>> be able to reboot clean that machine if it hangs in kernel until Tuesday :-(
>>
>> It seems that put_device() would call the release method of the device and kfree() in
>> it, but I cannot say anything about the side effects, for I do not know the source so
>> well ...
>>
>> Kind regards,
>> Mirsad
>>
>> ---
>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>> index bf7667845459..c63250322e26 100644
>> --- a/drivers/memstick/core/memstick.c
>> +++ b/drivers/memstick/core/memstick.c
>> @@ -179,6 +179,8 @@ static void memstick_free(struct device *dev)
>>  {
>>         struct memstick_host *host = container_of(dev, struct memstick_host,
>>                                                   dev);
>> +       if (host->card)
>> +               put_device(&host->card->dev);
> 
> This isn't going to work as at this moment in time, the last reference
> count has already happened, causing this release callback to be called,
> so that the bus driver can free the memory for the device.
> 
> So you would be calling put_device() on a device already has 0 for a
> reference count :)
> 
>>         kfree(host);
>>  }
>>  
>> @@ -410,7 +412,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>         return card;
>>  err_out:
>>         host->card = old_card;
>> -       kfree(card);
>> +       put_device(&card->dev);
> 
> No, the device was not registered here yet, right?  That would be
> required _IFF_ there was a call to device_register().
> 
>>         return NULL;
>>  }
>>  
>> @@ -468,8 +470,9 @@ static void memstick_check(struct work_struct *work)
>>                                 put_device(&card->dev);
>>                                 host->card = NULL;
>>                         }
>> -               } else
>> -                       kfree(card);
>> +               } else {
>> +                       put_device(&card->dev);
> 
> Same here, unless I'm reading this wrong, device_register() had not been
> called yet, which is why the kfree was required (same for the above
> call).
> 
> But hey, this driver really is a maze of twisty callbacks and workqueues
> and complexity, for no obvious reason to me (maybe because of some async
> requirement for memstick devices?  Thankfully I no longer have this
> hardware...)  So I might be totally wrong...
> 
> I would recommend trying my version first, it "shouldn't" cause anything
> worse to happen from what you have today, but hey, that's just my guess.
> 
> thanks,
> 
> greg k-h

Hi Mr. Greg,

Thank you for the additional insight.

I will build your patch ASAP and give feedback.

Kind regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] [TESTED OK] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01  9:23               ` Greg KH
  2023-04-01  9:52                 ` Mirsad Goran Todorovac
@ 2023-04-01 11:25                 ` Mirsad Goran Todorovac
  2023-04-01 14:56                   ` Greg KH
  1 sibling, 1 reply; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-01 11:25 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 01. 04. 2023. 11:23, Greg KH wrote:
> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> greg k-h
>>>>>
>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>> see a more sensible way to patch this up.
>>>>
>>>> In sleeping on this, I think this has to move to the driver core.  I
>>>> don't understand why we haven't seen this before, except maybe no one
>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>> that run with removable devices?)
>>>>
>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>> a chance...
>>>
>>> Wait, no, this already should be handled by the kobject core, look at
>>> kobject_cleanup(), at the bottom.  So your change should be merely
>>> duplicating the logic there that already runs when the struct device is
>>> freed, right?
>>>
>>> So I don't understand why your change works, odd.  I need more coffee...
>>
>> I think you got half of the change correctly.  This init code is a maze
>> of twisty passages, let me take your patch and tweak it a bit into
>> something that I think should work.  This looks to be only a memstick
>> issue, not a driver core issue (which makes me feel better.)
> 
> Oops, forgot the patch.  Can you try this change here and let me know if
> that solves the problem or not?  I have compile-tested it only, so I
> have no idea if it works.
> 
> If this does work, I'll make up a "real" function to replace the
> horrible dev.kobj.name mess that a driver would have to do here as it
> shouldn't be required that a driver author knows the internals of the
> driver core that well...
> 
> thanks,
> 
> greg k-h
> 
> --------------------
> 
> 
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..bbfaf6536903 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>  	return card;
>  err_out:
>  	host->card = old_card;
> +	kfree_const(card->dev.kobj.name);
>  	kfree(card);
>  	return NULL;
>  }
> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>  				put_device(&card->dev);
>  				host->card = NULL;
>  			}
> -		} else
> +		} else {
> +			kfree_const(card->dev.kobj.name);
>  			kfree(card);
> +		}
>  	}
>  
>  out_power_off:

RESULTS:

w/o patch:

[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat !$
cat /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffa09a93249590 (size 16):
  comm "kworker/u12:4", pid 371, jiffies 4294896466 (age 52.748s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
  backtrace:
    [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
    [<ffffffff93666a0a>] kstrdup+0x3a/0x70
    [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
    [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
    [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
    [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
    [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
    [<ffffffff933cb4c0>] process_one_work+0x250/0x530
    [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
    [<ffffffff933d6dff>] kthread+0x10f/0x140
    [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffffa09a97205990 (size 16):
  comm "kworker/u12:4", pid 371, jiffies 4294896471 (age 52.728s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
  backtrace:
    [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
    [<ffffffff93666a0a>] kstrdup+0x3a/0x70
    [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
    [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
    [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
    [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
    [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
    [<ffffffff933cb4c0>] process_one_work+0x250/0x530
    [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
    [<ffffffff933d6dff>] kthread+0x10f/0x140
    [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
[root@pc-mtodorov marvin]# uname -rms
Linux 6.3.0-rc4-mt-20230401-00199-g7b50567bdcad-dirty x86_64
[root@pc-mtodorov marvin]# 

After the patch:

[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak

So, congratulations, this did it!

This bug I detected on 2022-11-04, but it took me four months to find the leak,
before I was "blessed by the Source". You have asked me whether I would
help the memstick developers find a solution, and I like to keep promises. :-)

At your convenience, you might add in the patch:

Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>

It's been an honour serving with the memstick community with you and it was a real
brainstorming session for me.

Kind regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] [TESTED OK] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01 11:25                 ` BUG FIX: [PATCH RFC v3] [TESTED OK] " Mirsad Goran Todorovac
@ 2023-04-01 14:56                   ` Greg KH
  2023-04-04 10:37                     ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2023-04-01 14:56 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On Sat, Apr 01, 2023 at 01:25:21PM +0200, Mirsad Goran Todorovac wrote:
> On 01. 04. 2023. 11:23, Greg KH wrote:
> > On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
> >> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
> >>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
> >>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
> >>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
> >>>>>> to do this hack, which shouldn't be the case at all.
> >>>>>>
> >>>>>> thanks,
> >>>>>>
> >>>>>> greg k-h
> >>>>>
> >>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
> >>>>> see a more sensible way to patch this up.
> >>>>
> >>>> In sleeping on this, I think this has to move to the driver core.  I
> >>>> don't understand why we haven't seen this before, except maybe no one
> >>>> has really noticed before (i.e. we haven't had good leak detection tools
> >>>> that run with removable devices?)
> >>>>
> >>>> Anyway, let me see if I can come up with something this weekend, give me
> >>>> a chance...
> >>>
> >>> Wait, no, this already should be handled by the kobject core, look at
> >>> kobject_cleanup(), at the bottom.  So your change should be merely
> >>> duplicating the logic there that already runs when the struct device is
> >>> freed, right?
> >>>
> >>> So I don't understand why your change works, odd.  I need more coffee...
> >>
> >> I think you got half of the change correctly.  This init code is a maze
> >> of twisty passages, let me take your patch and tweak it a bit into
> >> something that I think should work.  This looks to be only a memstick
> >> issue, not a driver core issue (which makes me feel better.)
> > 
> > Oops, forgot the patch.  Can you try this change here and let me know if
> > that solves the problem or not?  I have compile-tested it only, so I
> > have no idea if it works.
> > 
> > If this does work, I'll make up a "real" function to replace the
> > horrible dev.kobj.name mess that a driver would have to do here as it
> > shouldn't be required that a driver author knows the internals of the
> > driver core that well...
> > 
> > thanks,
> > 
> > greg k-h
> > 
> > --------------------
> > 
> > 
> > diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> > index bf7667845459..bbfaf6536903 100644
> > --- a/drivers/memstick/core/memstick.c
> > +++ b/drivers/memstick/core/memstick.c
> > @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> >  	return card;
> >  err_out:
> >  	host->card = old_card;
> > +	kfree_const(card->dev.kobj.name);
> >  	kfree(card);
> >  	return NULL;
> >  }
> > @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
> >  				put_device(&card->dev);
> >  				host->card = NULL;
> >  			}
> > -		} else
> > +		} else {
> > +			kfree_const(card->dev.kobj.name);
> >  			kfree(card);
> > +		}
> >  	}
> >  
> >  out_power_off:
> 
> RESULTS:
> 
> w/o patch:
> 
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat !$
> cat /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffffa09a93249590 (size 16):
>   comm "kworker/u12:4", pid 371, jiffies 4294896466 (age 52.748s)
>   hex dump (first 16 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
>   backtrace:
>     [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
>     [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
>     [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
>     [<ffffffff93666a0a>] kstrdup+0x3a/0x70
>     [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
>     [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
>     [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
>     [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
>     [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
>     [<ffffffff933cb4c0>] process_one_work+0x250/0x530
>     [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
>     [<ffffffff933d6dff>] kthread+0x10f/0x140
>     [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
> unreferenced object 0xffffa09a97205990 (size 16):
>   comm "kworker/u12:4", pid 371, jiffies 4294896471 (age 52.728s)
>   hex dump (first 16 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
>   backtrace:
>     [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
>     [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
>     [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
>     [<ffffffff93666a0a>] kstrdup+0x3a/0x70
>     [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
>     [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
>     [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
>     [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
>     [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
>     [<ffffffff933cb4c0>] process_one_work+0x250/0x530
>     [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
>     [<ffffffff933d6dff>] kthread+0x10f/0x140
>     [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
> [root@pc-mtodorov marvin]# uname -rms
> Linux 6.3.0-rc4-mt-20230401-00199-g7b50567bdcad-dirty x86_64
> [root@pc-mtodorov marvin]# 
> 
> After the patch:
> 
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
> 
> So, congratulations, this did it!

Great, thanks for testing!  And for working to narrow this down, that's
the hard part here.

> This bug I detected on 2022-11-04, but it took me four months to find the leak,
> before I was "blessed by the Source". You have asked me whether I would
> help the memstick developers find a solution, and I like to keep promises. :-)
> 
> At your convenience, you might add in the patch:
> 
> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
> 
> It's been an honour serving with the memstick community with you and it was a real
> brainstorming session for me.

Thanks, as you did way over half the work here, I think a co-developed
tag would be better.  I'll send it out with that and you can provide a
signed-off-by on it that would be great.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: BUG FIX: [PATCH RFC v3] [TESTED OK] memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17
  2023-04-01 14:56                   ` Greg KH
@ 2023-04-04 10:37                     ` Mirsad Goran Todorovac
  0 siblings, 0 replies; 17+ messages in thread
From: Mirsad Goran Todorovac @ 2023-04-04 10:37 UTC (permalink / raw)
  To: Greg KH
  Cc: LKML, Thorsten Leemhuis, Maxim Levitsky, Alex Dubov, Ulf Hansson,
	Jens Axboe, Christophe JAILLET, Hannes Reinecke, Jiasheng Jiang,
	ye xingchen, linux-mmc

On 4/1/2023 16:56, Greg KH wrote:
> On Sat, Apr 01, 2023 at 01:25:21PM +0200, Mirsad Goran Todorovac wrote:
>> On 01. 04. 2023. 11:23, Greg KH wrote:
>>> On Sat, Apr 01, 2023 at 11:18:19AM +0200, Greg KH wrote:
>>>> On Sat, Apr 01, 2023 at 08:33:36AM +0200, Greg KH wrote:
>>>>> On Sat, Apr 01, 2023 at 08:28:07AM +0200, Greg KH wrote:
>>>>>> On Sat, Apr 01, 2023 at 08:23:26AM +0200, Mirsad Goran Todorovac wrote:
>>>>>>>> This patch is implying that anyone who calls "dev_set_name()" also has
>>>>>>>> to do this hack, which shouldn't be the case at all.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> greg k-h
>>>>>>>
>>>>>>> This is my best guess. Unless there is dev_free_name() or kobject_free_name(), I don't
>>>>>>> see a more sensible way to patch this up.
>>>>>>
>>>>>> In sleeping on this, I think this has to move to the driver core.  I
>>>>>> don't understand why we haven't seen this before, except maybe no one
>>>>>> has really noticed before (i.e. we haven't had good leak detection tools
>>>>>> that run with removable devices?)
>>>>>>
>>>>>> Anyway, let me see if I can come up with something this weekend, give me
>>>>>> a chance...
>>>>>
>>>>> Wait, no, this already should be handled by the kobject core, look at
>>>>> kobject_cleanup(), at the bottom.  So your change should be merely
>>>>> duplicating the logic there that already runs when the struct device is
>>>>> freed, right?
>>>>>
>>>>> So I don't understand why your change works, odd.  I need more coffee...
>>>>
>>>> I think you got half of the change correctly.  This init code is a maze
>>>> of twisty passages, let me take your patch and tweak it a bit into
>>>> something that I think should work.  This looks to be only a memstick
>>>> issue, not a driver core issue (which makes me feel better.)
>>>
>>> Oops, forgot the patch.  Can you try this change here and let me know if
>>> that solves the problem or not?  I have compile-tested it only, so I
>>> have no idea if it works.
>>>
>>> If this does work, I'll make up a "real" function to replace the
>>> horrible dev.kobj.name mess that a driver would have to do here as it
>>> shouldn't be required that a driver author knows the internals of the
>>> driver core that well...
>>>
>>> thanks,
>>>
>>> greg k-h
>>>
>>> --------------------
>>>
>>>
>>> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
>>> index bf7667845459..bbfaf6536903 100644
>>> --- a/drivers/memstick/core/memstick.c
>>> +++ b/drivers/memstick/core/memstick.c
>>> @@ -410,6 +410,7 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
>>>   	return card;
>>>   err_out:
>>>   	host->card = old_card;
>>> +	kfree_const(card->dev.kobj.name);
>>>   	kfree(card);
>>>   	return NULL;
>>>   }
>>> @@ -468,8 +469,10 @@ static void memstick_check(struct work_struct *work)
>>>   				put_device(&card->dev);
>>>   				host->card = NULL;
>>>   			}
>>> -		} else
>>> +		} else {
>>> +			kfree_const(card->dev.kobj.name);
>>>   			kfree(card);
>>> +		}
>>>   	}
>>>   
>>>   out_power_off:
>>
>> RESULTS:
>>
>> w/o patch:
>>
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat !$
>> cat /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
>> unreferenced object 0xffffa09a93249590 (size 16):
>>    comm "kworker/u12:4", pid 371, jiffies 4294896466 (age 52.748s)
>>    hex dump (first 16 bytes):
>>      6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
>>    backtrace:
>>      [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
>>      [<ffffffff93666a0a>] kstrdup+0x3a/0x70
>>      [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
>>      [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
>>      [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
>>      [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
>>      [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
>>      [<ffffffff933cb4c0>] process_one_work+0x250/0x530
>>      [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
>>      [<ffffffff933d6dff>] kthread+0x10f/0x140
>>      [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
>> unreferenced object 0xffffa09a97205990 (size 16):
>>    comm "kworker/u12:4", pid 371, jiffies 4294896471 (age 52.728s)
>>    hex dump (first 16 bytes):
>>      6d 65 6d 73 74 69 63 6b 30 00 cc cc cc cc cc cc  memstick0.......
>>    backtrace:
>>      [<ffffffff936fb25c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff93702b39>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff936773b9>] __kmalloc_node_track_caller+0x59/0x180
>>      [<ffffffff93666a0a>] kstrdup+0x3a/0x70
>>      [<ffffffff93666a7c>] kstrdup_const+0x2c/0x40
>>      [<ffffffff93aa99dc>] kvasprintf_const+0x7c/0xb0
>>      [<ffffffff9443b427>] kobject_set_name_vargs+0x27/0xa0
>>      [<ffffffff93d8ee37>] dev_set_name+0x57/0x80
>>      [<ffffffffc0aeff0f>] memstick_check+0x10f/0x3b0 [memstick]
>>      [<ffffffff933cb4c0>] process_one_work+0x250/0x530
>>      [<ffffffff933cb7f8>] worker_thread+0x48/0x3a0
>>      [<ffffffff933d6dff>] kthread+0x10f/0x140
>>      [<ffffffff93202fa9>] ret_from_fork+0x29/0x50
>> [root@pc-mtodorov marvin]# uname -rms
>> Linux 6.3.0-rc4-mt-20230401-00199-g7b50567bdcad-dirty x86_64
>> [root@pc-mtodorov marvin]#
>>
>> After the patch:
>>
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat /sys/kernel/debug/kmemleak
>>
>> So, congratulations, this did it!
> 
> Great, thanks for testing!  And for working to narrow this down, that's
> the hard part here.
> 
>> This bug I detected on 2022-11-04, but it took me four months to find the leak,
>> before I was "blessed by the Source". You have asked me whether I would
>> help the memstick developers find a solution, and I like to keep promises. :-)
>>
>> At your convenience, you might add in the patch:
>>
>> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
>>
>> It's been an honour serving with the memstick community with you and it was a real
>> brainstorming session for me.
> 
> Thanks, as you did way over half the work here, I think a co-developed
> tag would be better.  I'll send it out with that and you can provide a
> signed-off-by on it that would be great.
> 
> thanks,
> 
> greg k-h

Sorry, only seen the mail today.
My Thunderbird still amuses me with its threading of emails :-|

Yes, the

Signed-of-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>

would be neat.

Thank you,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
tel. +385 (0)1 3711 451
mob. +385 91 57 88 355

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-04-04 10:37 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-23 13:20 BUG: memstick_check() memleak in kernel 6.1.0+ introduced pre 4.17 Mirsad Goran Todorovac
2023-03-29 17:25 ` BUG FIX: [PATCH v1] " Mirsad Goran Todorovac
2023-03-31 14:46   ` BUG FIX: [PATCH RFC v2] " Mirsad Goran Todorovac
2023-03-31 16:32     ` Greg KH
2023-03-31 20:48       ` Mirsad Goran Todorovac
2023-04-01  6:23       ` BUG FIX: [PATCH RFC v3] " Mirsad Goran Todorovac
2023-04-01  6:28         ` Greg KH
2023-04-01  6:33           ` Greg KH
2023-04-01  9:18             ` Greg KH
2023-04-01  9:23               ` Greg KH
2023-04-01  9:52                 ` Mirsad Goran Todorovac
2023-04-01 10:01                   ` Mirsad Goran Todorovac
2023-04-01 10:14                     ` Greg KH
2023-04-01 10:38                       ` Mirsad Goran Todorovac
2023-04-01 11:25                 ` BUG FIX: [PATCH RFC v3] [TESTED OK] " Mirsad Goran Todorovac
2023-04-01 14:56                   ` Greg KH
2023-04-04 10:37                     ` Mirsad Goran Todorovac

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.