linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
@ 2023-03-28  9:23 Mirsad Todorovac
  2023-03-28 10:04 ` Mirsad Todorovac
  2023-03-28 10:06 ` Dan Carpenter
  0 siblings, 2 replies; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-28  9:23 UTC (permalink / raw)
  To: linux-kselftest
  Cc: LKML, Greg Kroah-Hartman, Russ Weight, Takashi Iwai,
	Tianfei zhang, Luis Chamberlain, Shuah Khan, Colin Ian King,
	Dan Carpenter, Randy Dunlap

Hi all,

Platform is AlmaLinux 8.7 (CentOS fork), Lenovo desktop
LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB with the BIOS M22KT49A dated
11/10/2022.

Running Torvalds vanilla kernel 6.3-rc3 commit 6981739a967c with
CONFIG_DEBUG_KMEMLEAK and CONFIG_DEBUG_{KOBJECT,KOBJECT_RELEASE} enabled.

The leak is cummulative, it can be reproduced with
tools/testing/selftests/firmware/*.sh scripts.

The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
reproduce w/o root privileges, as tests refuse to run as unprivileged user.
(This is not the proof of non-existence of an unprivileged automated exploit
that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.

This would mean about 96 MB / day or 3 GB / month (of kernel memory).

TEST RESULTS (showing the number of kmemleaks per test):

root@pc-mtodorov marvin]# grep -c 'comm "test_' linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw*.log
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_fallback.sh.log:0
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_filesystem.sh.log:60
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_lib.sh.log:9
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_run_tests.sh.log:196
linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_upload.sh.log:0
[root@pc-mtodorov marvin]#

Leaks look like this:

unreferenced object 0xffff943c390f8400 (size 1024):
   comm "test_firmware-0", pid 449178, jiffies 4381453603 (age 824.844s)
   hex dump (first 32 bytes):
     45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
     [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
     [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
     [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
     [<ffffffff907d6dcf>] kthread+0x10f/0x140
     [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f6400 (size 1024):
   comm "test_firmware-1", pid 449179, jiffies 4381453603 (age 824.844s)
   hex dump (first 32 bytes):
     45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
     [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
     [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
     [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
     [<ffffffff907d6dcf>] kthread+0x10f/0x140
     [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f0400 (size 1024):
   comm "test_firmware-2", pid 449180, jiffies 4381453603 (age 824.844s)
   hex dump (first 32 bytes):
     45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
     [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
     [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
     [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
     [<ffffffff907d6dcf>] kthread+0x10f/0x140
     [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff943a902f4000 (size 1024):
   comm "test_firmware-3", pid 449181, jiffies 4381453603 (age 824.844s)
   hex dump (first 32 bytes):
     45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
     [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
     [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
     [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
     [<ffffffff907d6dcf>] kthread+0x10f/0x140
     [<ffffffff90602fa9>] ret_from_fork+0x29/0x50

Please find the build config, lshw output and the output of
/sys/kernel/debug/kmemleak in the following directory:

https://domac.alu.hr/~mtodorov/linux/bugreports/kmemleak-firmware/

NOTE: sent to the maintainers listed for selftest/firmware and those
listed for lib/test_firmware.c .

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28  9:23 [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request() Mirsad Todorovac
@ 2023-03-28 10:04 ` Mirsad Todorovac
  2023-03-28 13:23   ` Mirsad Todorovac
  2023-03-28 10:06 ` Dan Carpenter
  1 sibling, 1 reply; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-28 10:04 UTC (permalink / raw)
  To: linux-kselftest
  Cc: LKML, Greg Kroah-Hartman, Russ Weight, Takashi Iwai,
	Tianfei zhang, Luis Chamberlain, Shuah Khan, Colin Ian King,
	Dan Carpenter, Randy Dunlap

On 3/28/23 11:23, Mirsad Todorovac wrote:
> Hi all,
> 
> Platform is AlmaLinux 8.7 (CentOS fork), Lenovo desktop
> LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB with the BIOS M22KT49A dated
> 11/10/2022.
> 
> Running Torvalds vanilla kernel 6.3-rc3 commit 6981739a967c with
> CONFIG_DEBUG_KMEMLEAK and CONFIG_DEBUG_{KOBJECT,KOBJECT_RELEASE} enabled.
> 
> The leak is cummulative, it can be reproduced with
> tools/testing/selftests/firmware/*.sh scripts.
> 
> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
> (This is not the proof of non-existence of an unprivileged automated exploit
> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
> 
> This would mean about 96 MB / day or 3 GB / month (of kernel memory).
> 
> TEST RESULTS (showing the number of kmemleaks per test):
> 
> root@pc-mtodorov marvin]# grep -c 'comm "test_' linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw*.log
> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_fallback.sh.log:0
> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_filesystem.sh.log:60
> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_lib.sh.log:9
> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_run_tests.sh.log:196
> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_upload.sh.log:0
> [root@pc-mtodorov marvin]#
> 
> Leaks look like this:
> 
> unreferenced object 0xffff943c390f8400 (size 1024):
>    comm "test_firmware-0", pid 449178, jiffies 4381453603 (age 824.844s)
>    hex dump (first 32 bytes):
>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
> unreferenced object 0xffff943a902f6400 (size 1024):
>    comm "test_firmware-1", pid 449179, jiffies 4381453603 (age 824.844s)
>    hex dump (first 32 bytes):
>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
> unreferenced object 0xffff943a902f0400 (size 1024):
>    comm "test_firmware-2", pid 449180, jiffies 4381453603 (age 824.844s)
>    hex dump (first 32 bytes):
>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
> unreferenced object 0xffff943a902f4000 (size 1024):
>    comm "test_firmware-3", pid 449181, jiffies 4381453603 (age 824.844s)
>    hex dump (first 32 bytes):
>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
> 
> Please find the build config, lshw output and the output of
> /sys/kernel/debug/kmemleak in the following directory:
> 
> https://domac.alu.hr/~mtodorov/linux/bugreports/kmemleak-firmware/
> 
> NOTE: sent to the maintainers listed for selftest/firmware and those
> listed for lib/test_firmware.c .

Hi, again!

The problem seems to be here:

lib/test_firmware.c:
-----------------------------------------------------------------------------------
  826 static int test_fw_run_batch_request(void *data)
  827 {
  828         struct test_batched_req *req = data;
  829
  830         if (!req) {
  831                 test_fw_config->test_result = -EINVAL;
  832                 return -EINVAL;
  833         }
  834
  835         if (test_fw_config->into_buf) {
  836                 void *test_buf;
  837
  838                 test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
  839                 if (!test_buf)
  840                         return -ENOSPC;
  841
  842                 if (test_fw_config->partial)
  843                         req->rc = request_partial_firmware_into_buf
  844                                                 (&req->fw,
  845                                                  req->name,
  846                                                  req->dev,
  847                                                  test_buf,
  848                                                  test_fw_config->buf_size,
  849                                                  test_fw_config->file_offset);
  850                 else
  851                         req->rc = request_firmware_into_buf
  852                                                 (&req->fw,
  853                                                  req->name,
  854                                                  req->dev,
  855                                                  test_buf,
  856                                                  test_fw_config->buf_size);
  857                 if (!req->fw)
  858                         kfree(test_buf);
  859         } else {
  860                 req->rc = test_fw_config->req_firmware(&req->fw,
  861                                                        req->name,
  862                                                        req->dev);
  863         }
  864
  865         if (req->rc) {
  866                 pr_info("#%u: batched sync load failed: %d\n",
  867                         req->idx, req->rc);
  868                 if (!test_fw_config->test_result)
  869                         test_fw_config->test_result = req->rc;
  870         } else if (req->fw) {
  871                 req->sent = true;
  872                 pr_info("#%u: batched sync loaded %zu\n",
  873                         req->idx, req->fw->size);
  874         }
  875         complete(&req->completion);
  876
  877         req->task = NULL;
  878
  879         return 0;
  880 }

The scope of test_buf is from its definition in line 836 to its end in line 859,
so in case req->fw != NULL the execution line loses track of the memory
kzalloc()'d in line 838.

Unless it is somewhere non-transparently referenced, it appears that the kernel
loses track of this allocated block.

Hope this helps.

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28  9:23 [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request() Mirsad Todorovac
  2023-03-28 10:04 ` Mirsad Todorovac
@ 2023-03-28 10:06 ` Dan Carpenter
  2023-03-28 10:33   ` Mirsad Todorovac
                     ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Dan Carpenter @ 2023-03-28 10:06 UTC (permalink / raw)
  To: Mirsad Todorovac, Luis Chamberlain
  Cc: linux-kselftest, LKML, Greg Kroah-Hartman, Russ Weight,
	Takashi Iwai, Tianfei zhang, Shuah Khan, Colin Ian King,
	Randy Dunlap

On Tue, Mar 28, 2023 at 11:23:00AM +0200, Mirsad Todorovac wrote:
> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
> (This is not the proof of non-existence of an unprivileged automated exploit
> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
> 
> This would mean about 96 MB / day or 3 GB / month (of kernel memory).

This is firmware testing stuff.  In the real world people aren't going
to run their test scripts in a loop for days.

There is no security implications.  This is root only.  Also if the
user could load firmware then that would be the headline.  Once someone
is can already load firmware then who cares if they leak 100MB per day?

It looks like if you call trigger_batched_requests_store() twice in a
row then it will leak memory.  Definitely test_fw_config->reqs is leaked.
That's different from what the bug report is complaining about, but the
point is that there are some obvious leaks.  It looks like you're
supposed to call trigger_batched_requests_store() in between runs?

There are other races like config_num_requests_store() should hold the
mutex over the call to test_dev_config_update_u8() instead of dropping
and retaking it.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28 10:06 ` Dan Carpenter
@ 2023-03-28 10:33   ` Mirsad Todorovac
  2023-03-30 12:14   ` [BUG] [PATCH RFC v1] " Mirsad Todorovac
  2023-03-30 15:12   ` [BUG] [PATCH RFC v2] " Mirsad Todorovac
  2 siblings, 0 replies; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-28 10:33 UTC (permalink / raw)
  To: Dan Carpenter, Luis Chamberlain
  Cc: linux-kselftest, LKML, Greg Kroah-Hartman, Russ Weight,
	Takashi Iwai, Tianfei zhang, Shuah Khan, Colin Ian King,
	Randy Dunlap

Hi, Dan,

On 3/28/23 12:06, Dan Carpenter wrote:
> On Tue, Mar 28, 2023 at 11:23:00AM +0200, Mirsad Todorovac wrote:
>> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
>> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
>> (This is not the proof of non-existence of an unprivileged automated exploit
>> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
>>
>> This would mean about 96 MB / day or 3 GB / month (of kernel memory).
> 
> This is firmware testing stuff.  In the real world people aren't going
> to run their test scripts in a loop for days.

Thank you for making that clear.

> There is no security implications.  This is root only.  Also if the
> user could load firmware then that would be the headline.  Once someone
> is can already load firmware then who cares if they leak 100MB per day?

Yes, this is correct, but I just don't like leaks even in the userland programs.
But that might be just me ...

IMHO the purpose of the tests is to find and fix bugs. There are probably
more critical issues, but pick seemed manageable.

> It looks like if you call trigger_batched_requests_store() twice in a
> row then it will leak memory.  Definitely test_fw_config->reqs is leaked.
> That's different from what the bug report is complaining about, but the
> point is that there are some obvious leaks.  It looks like you're
> supposed to call trigger_batched_requests_store() in between runs?
> 
> There are other races like config_num_requests_store() should hold the
> mutex over the call to test_dev_config_update_u8() instead of dropping
> and retaking it.

Please consider the scope of the void *test_buf in lines 836-859 and whether the
fact that test_buf is not kfree()-ed on (req->fw != NULL) and its going out of the
scope affects this issue.

I saw there is an additional race condition involved since the exact count of leaks
is not always the same (not deterministic), but I could not figure that out by myself.

Thank you again very much for your quick reply.

BTW, I can confirm that the leak still exists in 6.3.0-rc4-00025-g3a93e40326c8
build.

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"Something is quickly approaching ... Will it be friends with me?"

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28 10:04 ` Mirsad Todorovac
@ 2023-03-28 13:23   ` Mirsad Todorovac
  0 siblings, 0 replies; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-28 13:23 UTC (permalink / raw)
  To: linux-kselftest
  Cc: LKML, Greg Kroah-Hartman, Russ Weight, Takashi Iwai,
	Tianfei zhang, Luis Chamberlain, Shuah Khan, Colin Ian King,
	Dan Carpenter, Randy Dunlap

On 3/28/23 12:04, Mirsad Todorovac wrote:
> On 3/28/23 11:23, Mirsad Todorovac wrote:

>> Platform is AlmaLinux 8.7 (CentOS fork), Lenovo desktop
>> LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB with the BIOS M22KT49A dated
>> 11/10/2022.
>>
>> Running Torvalds vanilla kernel 6.3-rc3 commit 6981739a967c with
>> CONFIG_DEBUG_KMEMLEAK and CONFIG_DEBUG_{KOBJECT,KOBJECT_RELEASE} enabled.
>>
>> The leak is cummulative, it can be reproduced with
>> tools/testing/selftests/firmware/*.sh scripts.
>>
>> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
>> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
>> (This is not the proof of non-existence of an unprivileged automated exploit
>> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
>>
>> This would mean about 96 MB / day or 3 GB / month (of kernel memory).
>>
>> TEST RESULTS (showing the number of kmemleaks per test):
>>
>> root@pc-mtodorov marvin]# grep -c 'comm "test_' linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw*.log
>> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_fallback.sh.log:0
>> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_filesystem.sh.log:60
>> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_lib.sh.log:9
>> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_run_tests.sh.log:196
>> linux/kernel_bugs/memleaks-6.3-rc3/kmemleak-fw_upload.sh.log:0
>> [root@pc-mtodorov marvin]#
>>
>> Leaks look like this:
>>
>> unreferenced object 0xffff943c390f8400 (size 1024):
>>    comm "test_firmware-0", pid 449178, jiffies 4381453603 (age 824.844s)
>>    hex dump (first 32 bytes):
>>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
>> unreferenced object 0xffff943a902f6400 (size 1024):
>>    comm "test_firmware-1", pid 449179, jiffies 4381453603 (age 824.844s)
>>    hex dump (first 32 bytes):
>>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
>> unreferenced object 0xffff943a902f0400 (size 1024):
>>    comm "test_firmware-2", pid 449180, jiffies 4381453603 (age 824.844s)
>>    hex dump (first 32 bytes):
>>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
>> unreferenced object 0xffff943a902f4000 (size 1024):
>>    comm "test_firmware-3", pid 449181, jiffies 4381453603 (age 824.844s)
>>    hex dump (first 32 bytes):
>>      45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<ffffffff90aed68c>] slab_post_alloc_hook+0x8c/0x3e0
>>      [<ffffffff90af4f69>] __kmem_cache_alloc_node+0x1d9/0x2a0
>>      [<ffffffff90a6a6ae>] kmalloc_trace+0x2e/0xc0
>>      [<ffffffff90eb2350>] test_fw_run_batch_request+0x90/0x170
>>      [<ffffffff907d6dcf>] kthread+0x10f/0x140
>>      [<ffffffff90602fa9>] ret_from_fork+0x29/0x50
>>
>> Please find the build config, lshw output and the output of
>> /sys/kernel/debug/kmemleak in the following directory:
>>
>> https://domac.alu.hr/~mtodorov/linux/bugreports/kmemleak-firmware/
>>
>> NOTE: sent to the maintainers listed for selftest/firmware and those
>> listed for lib/test_firmware.c .
> 
> Hi, again!
> 
> The problem seems to be here:
> 
> lib/test_firmware.c:
> -----------------------------------------------------------------------------------
>   826 static int test_fw_run_batch_request(void *data)
>   827 {
>   828         struct test_batched_req *req = data;
>   829
>   830         if (!req) {
>   831                 test_fw_config->test_result = -EINVAL;
>   832                 return -EINVAL;
>   833         }
>   834
>   835         if (test_fw_config->into_buf) {
>   836                 void *test_buf;
>   837
>   838                 test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
>   839                 if (!test_buf)
>   840                         return -ENOSPC;
>   841
>   842                 if (test_fw_config->partial)
>   843                         req->rc = request_partial_firmware_into_buf
>   844                                                 (&req->fw,
>   845                                                  req->name,
>   846                                                  req->dev,
>   847                                                  test_buf,
>   848                                                  test_fw_config->buf_size,
>   849                                                  test_fw_config->file_offset);
>   850                 else
>   851                         req->rc = request_firmware_into_buf
>   852                                                 (&req->fw,
>   853                                                  req->name,
>   854                                                  req->dev,
>   855                                                  test_buf,
>   856                                                  test_fw_config->buf_size);
>   857                 if (!req->fw)
>   858                         kfree(test_buf);
>   859         } else {
>   860                 req->rc = test_fw_config->req_firmware(&req->fw,
>   861                                                        req->name,
>   862                                                        req->dev);
>   863         }
>   864
>   865         if (req->rc) {
>   866                 pr_info("#%u: batched sync load failed: %d\n",
>   867                         req->idx, req->rc);
>   868                 if (!test_fw_config->test_result)
>   869                         test_fw_config->test_result = req->rc;
>   870         } else if (req->fw) {
>   871                 req->sent = true;
>   872                 pr_info("#%u: batched sync loaded %zu\n",
>   873                         req->idx, req->fw->size);
>   874         }
>   875         complete(&req->completion);
>   876
>   877         req->task = NULL;
>   878
>   879         return 0;
>   880 }
> 
> The scope of test_buf is from its definition in line 836 to its end in line 859,
> so in case req->fw != NULL the execution line loses track of the memory
> kzalloc()'d in line 838.
> 
> Unless it is somewhere non-transparently referenced, it appears that the kernel
> loses track of this allocated block.

CORRECTION: Withdrawn that!

After doing some homework, it appeared that something non-transparent is happening
in lib/test_firmware.c after all, and we cannot just kfree(test_buf), presumably
fixing the problem.

In line

  141         fw_priv->data = dbuf;

Allocated test_buf copied to some firmware data and is assigned to dbuf through 4
levels of function calls and assigned to fw_priv->data.

drivers/base/firmware_loader/main.c:141,
called from drivers/base/firmware_loader/main.c:189: alloc_lookup_fw_priv()
	tmp = __allocate_fw_priv(fw_name, fwc, dbuf, size, offset, opt_flags);

called from drivers/base/firmware_loader/main.c:748: _request_firmware_prepare():
	ret = alloc_lookup_fw_priv(name, &fw_cache, &fw_priv, dbuf, size,
				   offset, opt_flags);

called from ...:814 _request_firmware():
	ret = _request_firmware_prepare(&fw, name, device, buf, size,
					offset, opt_flags);

called from ...:1035 request_firmware_into_buf():
	ret = _request_firmware(firmware_p, name, device, buf, size, 0,
				FW_OPT_UEVENT | FW_OPT_NOCACHE);

called from lib/test_firmware.c:851 test_fw_run_batch_request()
(Which is where the leak appears to reside.)

drivers/base/firmware_loader/main.c:
  112 static struct fw_priv *__allocate_fw_priv(const char *fw_name,
  113                                           struct firmware_cache *fwc,
  114                                           void *dbuf,
  115                                           size_t size,
  116                                           size_t offset,
  117                                           u32 opt_flags)
  118 {
  119         struct fw_priv *fw_priv;
  120
  121         /* For a partial read, the buffer must be preallocated. */
  122         if ((opt_flags & FW_OPT_PARTIAL) && !dbuf)
  123                 return NULL;
  124
  125         /* Only partial reads are allowed to use an offset. */
  126         if (offset != 0 && !(opt_flags & FW_OPT_PARTIAL))
  127                 return NULL;
  128
  129         fw_priv = kzalloc(sizeof(*fw_priv), GFP_ATOMIC);
  130         if (!fw_priv)
  131                 return NULL;
  132
  133         fw_priv->fw_name = kstrdup_const(fw_name, GFP_ATOMIC);
  134         if (!fw_priv->fw_name) {
  135                 kfree(fw_priv);
  136                 return NULL;
  137         }
  138
  139         kref_init(&fw_priv->ref);
  140         fw_priv->fwc = fwc;
  141         fw_priv->data = dbuf;
  142         fw_priv->allocated_size = size;
  143         fw_priv->offset = offset;
  144         fw_priv->opt_flags = opt_flags;
  145         fw_state_init(fw_priv);
  146 #ifdef CONFIG_FW_LOADER_USER_HELPER
  147         INIT_LIST_HEAD(&fw_priv->pending_list);
  148 #endif
  149
  150         pr_debug("%s: fw-%s fw_priv=%p\n", __func__, fw_name, fw_priv);
  151
  152         return fw_priv;
  153 }

So, the functions request_firmware_into_buf() and request_partial_firmware_into_buf()
have side-effect of actually assigning test_buf to the struct fw_priv's member
fw_priv->data.

But it seems a bit awkward semantically dubious to request firmware into something that
is immediately released and having only side effect four levels of fcalls deep add a
second reference to it.

Independently, besides that, the error code given in case of memory full and
failed kzalloc() is counterintuitive:

  837
  838                 test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
  839                 if (!test_buf)
  840                         return -ENOSPC;
  841

The rest of the driver code usually returns -ENOMEM on k*alloc() failures:

  837
  838                 test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
  839                 if (!test_buf)
  840                         return -ENOMEM;
  841

and this appears to be called only at one place:

  916		req->task = kthread_run(test_fw_run_batch_request, req,
  917					     "%s-%u", KBUILD_MODNAME, req->idx);

so the impact of the proposed change would be very low.

Who is actually consuming the error code in this case of kthread_run()?

(We are nowhere near to fixing the actual leak.)

Thank you.

Best regards,

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast? Very very fast.
... I wonder if it will be friends with me?"

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v1] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28 10:06 ` Dan Carpenter
  2023-03-28 10:33   ` Mirsad Todorovac
@ 2023-03-30 12:14   ` Mirsad Todorovac
  2023-03-30 15:12   ` [BUG] [PATCH RFC v2] " Mirsad Todorovac
  2 siblings, 0 replies; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-30 12:14 UTC (permalink / raw)
  To: Dan Carpenter, Luis Chamberlain
  Cc: linux-kselftest, LKML, Greg Kroah-Hartman, Russ Weight,
	Takashi Iwai, Tianfei zhang, Shuah Khan, Colin Ian King,
	Randy Dunlap

On 3/28/23 12:06, Dan Carpenter wrote:
> On Tue, Mar 28, 2023 at 11:23:00AM +0200, Mirsad Todorovac wrote:
>> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
>> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
>> (This is not the proof of non-existence of an unprivileged automated exploit
>> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
>>
>> This would mean about 96 MB / day or 3 GB / month (of kernel memory).
> 
> This is firmware testing stuff.  In the real world people aren't going
> to run their test scripts in a loop for days.
> 
> There is no security implications.  This is root only.  Also if the
> user could load firmware then that would be the headline.  Once someone
> is can already load firmware then who cares if they leak 100MB per day?
> 
> It looks like if you call trigger_batched_requests_store() twice in a
> row then it will leak memory.  Definitely test_fw_config->reqs is leaked.
> That's different from what the bug report is complaining about, but the
> point is that there are some obvious leaks.  It looks like you're
> supposed to call trigger_batched_requests_store() in between runs?
> 
> There are other races like config_num_requests_store() should hold the
> mutex over the call to test_dev_config_update_u8() instead of dropping
> and retaking it.

Hi Dan,

Following your insight and advice, I tried to mend this racing condition
like this:

diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..6723c234ccbb 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -402,6 +402,8 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
         return snprintf(buf, PAGE_SIZE, "%d\n", val);
  }

+static DEFINE_MUTEX(test_fw_mutex_update);
+
  static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
  {
         u8 val;
@@ -411,9 +413,9 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
         if (ret)
                 return ret;

-       mutex_lock(&test_fw_mutex);
+       mutex_lock(&test_fw_mutex_update);
         *(u8 *)cfg = val;
-       mutex_unlock(&test_fw_mutex);
+       mutex_unlock(&test_fw_mutex_update);

         /* Always return full write size even if we didn't consume all */
         return size;
@@ -471,10 +473,10 @@ static ssize_t config_num_requests_store(struct device *dev,
                 mutex_unlock(&test_fw_mutex);
                 goto out;
         }
-       mutex_unlock(&test_fw_mutex);

         rc = test_dev_config_update_u8(buf, count,
                                        &test_fw_config->num_requests);
+       mutex_unlock(&test_fw_mutex);

  out:
         return rc;

For the second trigger_batched_requests_store(), probably the desired behaviour
would be to extend the list of test_fw_config->reqs, rather than destroying them
and allocating the new ones?

I am not certain about the desired semantics and where is it documented.

Thank you.

Best regards,

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast? Very very fast.
... I wonder if it will be friends with me?"

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v2] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-28 10:06 ` Dan Carpenter
  2023-03-28 10:33   ` Mirsad Todorovac
  2023-03-30 12:14   ` [BUG] [PATCH RFC v1] " Mirsad Todorovac
@ 2023-03-30 15:12   ` Mirsad Todorovac
  2023-03-30 15:44     ` Dan Carpenter
  2 siblings, 1 reply; 11+ messages in thread
From: Mirsad Todorovac @ 2023-03-30 15:12 UTC (permalink / raw)
  To: Dan Carpenter, Luis Chamberlain
  Cc: linux-kselftest, LKML, Greg Kroah-Hartman, Russ Weight,
	Takashi Iwai, Tianfei zhang, Shuah Khan, Colin Ian King,
	Randy Dunlap

Hi, all,

This is not a formal patch, but please see if you think the way the
locking and race are solved correctly this time.

(Having two mutexes over the same set of resources is obviously a hazard.)

On 3/28/23 12:06, Dan Carpenter wrote:
> On Tue, Mar 28, 2023 at 11:23:00AM +0200, Mirsad Todorovac wrote:
>> The leaks are in chunks of 1024 bytes (+ overhead), but so far I could not
>> reproduce w/o root privileges, as tests refuse to run as unprivileged user.
>> (This is not the proof of non-existence of an unprivileged automated exploit
>> that would exhaust the kernel memory at approx. rate 4 MB/hour on our setup.
>>
>> This would mean about 96 MB / day or 3 GB / month (of kernel memory).
> 
> This is firmware testing stuff.  In the real world people aren't going
> to run their test scripts in a loop for days.
> 
> There is no security implications.  This is root only.  Also if the
> user could load firmware then that would be the headline.  Once someone
> is can already load firmware then who cares if they leak 100MB per day?
> 
> It looks like if you call trigger_batched_requests_store() twice in a
> row then it will leak memory.  Definitely test_fw_config->reqs is leaked.
> That's different from what the bug report is complaining about, but the
> point is that there are some obvious leaks.  It looks like you're
> supposed to call trigger_batched_requests_store() in between runs?
> 
> There are other races like config_num_requests_store() should hold the
> mutex over the call to test_dev_config_update_u8() instead of dropping
> and retaking it.

COMMENT: Like in libc putc() family of functions, there is also
putc_unlocked() The similar approach is applied here.

As the functions are callable from within both locked and non-locked
environment, we have to either:

1. have two or more locks, which is dubious in terms of concurrency
2. have locked and unlocked version of each function, for we cannot
    lock the same lock twice.

NOTE: Memory leaks are not solved with this patch, only a couple of
racing conditions.

---
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..d6ed20bd1eb0 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,6 +353,19 @@ static ssize_t config_test_show_str(char *dst,
         return len;
  }

+static inline int test_dev_config_update_bool_unlocked(const char *buf, size_t size,
+                                      bool *cfg)
+{
+       int ret;
+
+       if (kstrtobool(buf, cfg) < 0)
+               ret = -EINVAL;
+       else
+               ret = size;
+
+       return ret;
+}
+
  static int test_dev_config_update_bool(const char *buf, size_t size,
                                        bool *cfg)
  {
@@ -373,6 +386,24 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
         return snprintf(buf, PAGE_SIZE, "%d\n", val);
  }

+static int test_dev_config_update_size_t_unlocked(
+                                        const char *buf,
+                                        size_t size,
+                                        size_t *cfg)
+{
+       int ret;
+       long new;
+
+       ret = kstrtol(buf, 10, &new);
+       if (ret)
+               return ret;
+
+       *(size_t *)cfg = new;
+
+       /* Always return full write size even if we didn't consume all */
+       return size;
+}
+
  static int test_dev_config_update_size_t(const char *buf,
                                          size_t size,
                                          size_t *cfg)
@@ -402,6 +433,21 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
         return snprintf(buf, PAGE_SIZE, "%d\n", val);
  }

+static int test_dev_config_update_u8_unlocked(const char *buf, size_t size, u8 *cfg)
+{
+       u8 val;
+       int ret;
+
+       ret = kstrtou8(buf, 10, &val);
+       if (ret)
+               return ret;
+
+       *(u8 *)cfg = val;
+
+       /* Always return full write size even if we didn't consume all */
+       return size;
+}
+
  static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
  {
         u8 val;
@@ -471,10 +517,10 @@ static ssize_t config_num_requests_store(struct device *dev,
                 mutex_unlock(&test_fw_mutex);
                 goto out;
         }
-       mutex_unlock(&test_fw_mutex);

-       rc = test_dev_config_update_u8(buf, count,
-                                      &test_fw_config->num_requests);
+       rc = test_dev_config_update_u8_unlocked(buf, count,
+                                               &test_fw_config->num_requests);
+       mutex_unlock(&test_fw_mutex);

  out:
         return rc;
@@ -518,10 +564,10 @@ static ssize_t config_buf_size_store(struct device *dev,
                 mutex_unlock(&test_fw_mutex);
                 goto out;
         }
-       mutex_unlock(&test_fw_mutex);

-       rc = test_dev_config_update_size_t(buf, count,
-                                          &test_fw_config->buf_size);
+       rc = test_dev_config_update_size_t_unlocked(buf, count,
+                                                   &test_fw_config->buf_size);
+       mutex_unlock(&test_fw_mutex);

  out:
         return rc;
@@ -548,10 +594,10 @@ static ssize_t config_file_offset_store(struct device *dev,
                 mutex_unlock(&test_fw_mutex);
                 goto out;
         }
-       mutex_unlock(&test_fw_mutex);

-       rc = test_dev_config_update_size_t(buf, count,
-                                          &test_fw_config->file_offset);
+       rc = test_dev_config_update_size_t_unlocked(buf, count,
+                                                   &test_fw_config->file_offset);
+       mutex_unlock(&test_fw_mutex);

  out:
         return rc;

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

"What’s this thing suddenly coming towards me very fast? Very very fast.
... I wonder if it will be friends with me?"

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v2] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-30 15:12   ` [BUG] [PATCH RFC v2] " Mirsad Todorovac
@ 2023-03-30 15:44     ` Dan Carpenter
  2023-03-30 16:01       ` Takashi Iwai
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Carpenter @ 2023-03-30 15:44 UTC (permalink / raw)
  To: Mirsad Todorovac
  Cc: Luis Chamberlain, linux-kselftest, LKML, Greg Kroah-Hartman,
	Russ Weight, Takashi Iwai, Tianfei zhang, Shuah Khan,
	Colin Ian King, Randy Dunlap

I admire your enthusiam.  :)  What about if we just did this?  Does it
help with the leaks?

regards,
dan carpenter

diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..626b836895f4 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -895,6 +895,9 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
 
 	mutex_lock(&test_fw_mutex);
 
+	if (test_fw_config->reqs)
+		return -EBUSY;
+
 	test_fw_config->reqs =
 		vzalloc(array3_size(sizeof(struct test_batched_req),
 				    test_fw_config->num_requests, 2));

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v2] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-30 15:44     ` Dan Carpenter
@ 2023-03-30 16:01       ` Takashi Iwai
  2023-03-30 16:04         ` Dan Carpenter
  2023-03-30 18:42         ` Mirsad Goran Todorovac
  0 siblings, 2 replies; 11+ messages in thread
From: Takashi Iwai @ 2023-03-30 16:01 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Mirsad Todorovac, Luis Chamberlain, linux-kselftest, LKML,
	Greg Kroah-Hartman, Russ Weight, Takashi Iwai, Tianfei zhang,
	Shuah Khan, Colin Ian King, Randy Dunlap

On Thu, 30 Mar 2023 17:44:42 +0200,
Dan Carpenter wrote:
> 
> I admire your enthusiam.  :)  What about if we just did this?  Does it
> help with the leaks?
> 
> regards,
> dan carpenter
> 
> diff --git a/lib/test_firmware.c b/lib/test_firmware.c
> index 05ed84c2fc4c..626b836895f4 100644
> --- a/lib/test_firmware.c
> +++ b/lib/test_firmware.c
> @@ -895,6 +895,9 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
>  
>  	mutex_lock(&test_fw_mutex);
>  
> +	if (test_fw_config->reqs)
> +		return -EBUSY;
> +

This leaves the mutex locked.
It should be the following instead, I suppose?

	if (test_fw_config->reqs) {
		rc = -EBUSY;
		goto out_unlock;
	}


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v2] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-30 16:01       ` Takashi Iwai
@ 2023-03-30 16:04         ` Dan Carpenter
  2023-03-30 18:42         ` Mirsad Goran Todorovac
  1 sibling, 0 replies; 11+ messages in thread
From: Dan Carpenter @ 2023-03-30 16:04 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Mirsad Todorovac, Luis Chamberlain, linux-kselftest, LKML,
	Greg Kroah-Hartman, Russ Weight, Tianfei zhang, Shuah Khan,
	Colin Ian King, Randy Dunlap

On Thu, Mar 30, 2023 at 06:01:54PM +0200, Takashi Iwai wrote:
> On Thu, 30 Mar 2023 17:44:42 +0200,
> Dan Carpenter wrote:
> > 
> > I admire your enthusiam.  :)  What about if we just did this?  Does it
> > help with the leaks?
> > 
> > regards,
> > dan carpenter
> > 
> > diff --git a/lib/test_firmware.c b/lib/test_firmware.c
> > index 05ed84c2fc4c..626b836895f4 100644
> > --- a/lib/test_firmware.c
> > +++ b/lib/test_firmware.c
> > @@ -895,6 +895,9 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
> >  
> >  	mutex_lock(&test_fw_mutex);
> >  
> > +	if (test_fw_config->reqs)
> > +		return -EBUSY;
> > +
> 
> This leaves the mutex locked.
> It should be the following instead, I suppose?
> 
> 	if (test_fw_config->reqs) {
> 		rc = -EBUSY;
> 		goto out_unlock;
> 	}

So embarrassing...  Yes.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [BUG] [PATCH RFC v2] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request()
  2023-03-30 16:01       ` Takashi Iwai
  2023-03-30 16:04         ` Dan Carpenter
@ 2023-03-30 18:42         ` Mirsad Goran Todorovac
  1 sibling, 0 replies; 11+ messages in thread
From: Mirsad Goran Todorovac @ 2023-03-30 18:42 UTC (permalink / raw)
  To: Takashi Iwai, Dan Carpenter
  Cc: Luis Chamberlain, linux-kselftest, LKML, Greg Kroah-Hartman,
	Russ Weight, Tianfei zhang, Shuah Khan, Colin Ian King,
	Randy Dunlap

On 30. 03. 2023. 18:01, Takashi Iwai wrote:
> On Thu, 30 Mar 2023 17:44:42 +0200,
> Dan Carpenter wrote:
>>
>> I admire your enthusiam.  :)  What about if we just did this?  Does it
>> help with the leaks?
>>
>> regards,
>> dan carpenter
>>
>> diff --git a/lib/test_firmware.c b/lib/test_firmware.c
>> index 05ed84c2fc4c..626b836895f4 100644
>> --- a/lib/test_firmware.c
>> +++ b/lib/test_firmware.c
>> @@ -895,6 +895,9 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
>>  
>>  	mutex_lock(&test_fw_mutex);
>>  
>> +	if (test_fw_config->reqs)
>> +		return -EBUSY;
>> +
> 
> This leaves the mutex locked.
> It should be the following instead, I suppose?
> 
> 	if (test_fw_config->reqs) {
> 		rc = -EBUSY;
> 		goto out_unlock;
> 	}
> 
> 
> Takashi

Hi, Dan, Takashi,

Unfortunately, it did not suffice.

What I was building with was
commit 8bb95a1662f8:Merge tag 'sound-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
with the following patch for lib/test_firmware.c:


---
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 05ed84c2fc4c..4daa38bd2cac 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -353,6 +353,19 @@ static ssize_t config_test_show_str(char *dst,
        return len;
 }
 
+static inline int test_dev_config_update_bool_unlocked(const char *buf, size_t size,
+                                      bool *cfg)
+{
+       int ret;
+
+       if (kstrtobool(buf, cfg) < 0)
+               ret = -EINVAL;
+       else
+               ret = size;
+
+       return ret;
+}
+
 static int test_dev_config_update_bool(const char *buf, size_t size,
                                       bool *cfg)
 {
@@ -373,6 +386,24 @@ static ssize_t test_dev_config_show_bool(char *buf, bool val)
        return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static int test_dev_config_update_size_t_unlocked(
+                                        const char *buf,
+                                        size_t size,
+                                        size_t *cfg)
+{
+       int ret;
+       long new;
+
+       ret = kstrtol(buf, 10, &new);
+       if (ret)
+               return ret;
+
+       *(size_t *)cfg = new;
+
+       /* Always return full write size even if we didn't consume all */
+       return size;
+}
+
 static int test_dev_config_update_size_t(const char *buf,
                                         size_t size,
                                         size_t *cfg)
@@ -402,6 +433,21 @@ static ssize_t test_dev_config_show_int(char *buf, int val)
        return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static int test_dev_config_update_u8_unlocked(const char *buf, size_t size, u8 *cfg)
+{
+       u8 val;
+       int ret;
+
+       ret = kstrtou8(buf, 10, &val);
+       if (ret)
+               return ret;
+
+       *(u8 *)cfg = val;
+
+       /* Always return full write size even if we didn't consume all */
+       return size;
+}
+
 static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
 {
        u8 val;
@@ -471,10 +517,10 @@ static ssize_t config_num_requests_store(struct device *dev,
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
-       mutex_unlock(&test_fw_mutex);
 
-       rc = test_dev_config_update_u8(buf, count,
-                                      &test_fw_config->num_requests);
+       rc = test_dev_config_update_u8_unlocked(buf, count,
+                                               &test_fw_config->num_requests);
+       mutex_unlock(&test_fw_mutex);
 
 out:
        return rc;
@@ -518,10 +564,10 @@ static ssize_t config_buf_size_store(struct device *dev,
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
-       mutex_unlock(&test_fw_mutex);
 
-       rc = test_dev_config_update_size_t(buf, count,
-                                          &test_fw_config->buf_size);
+       rc = test_dev_config_update_size_t_unlocked(buf, count,
+                                                   &test_fw_config->buf_size);
+       mutex_unlock(&test_fw_mutex);
 
 out:
        return rc;
@@ -548,10 +594,10 @@ static ssize_t config_file_offset_store(struct device *dev,
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
-       mutex_unlock(&test_fw_mutex);
 
-       rc = test_dev_config_update_size_t(buf, count,
-                                          &test_fw_config->file_offset);
+       rc = test_dev_config_update_size_t_unlocked(buf, count,
+                                                   &test_fw_config->file_offset);
+       mutex_unlock(&test_fw_mutex);
 
 out:
        return rc;
@@ -895,6 +941,11 @@ static ssize_t trigger_batched_requests_store(struct device *dev,
 
        mutex_lock(&test_fw_mutex);
 
+       if (test_fw_config->reqs) {
+               rc = -EBUSY;
+               goto out_unlock;
+       }
+
        test_fw_config->reqs =
                vzalloc(array3_size(sizeof(struct test_batched_req),
                                    test_fw_config->num_requests, 2));
@@ -993,6 +1044,11 @@ ssize_t trigger_batched_requests_async_store(struct device *dev,
 
        mutex_lock(&test_fw_mutex);
 
+       if (test_fw_config->reqs) {
+               rc = -EBUSY;
+               goto out;
+       }
+
        test_fw_config->reqs =
                vzalloc(array3_size(sizeof(struct test_batched_req),
                                    test_fw_config->num_requests, 2));

The leaks are the same:

unreferenced object 0xffff96deccc99c00 (size 1024):
  comm "test_firmware-2", pid 3093, jiffies 4294945062 (age 605.444s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff96ded72be400 (size 1024):
  comm "test_firmware-3", pid 3094, jiffies 4294945062 (age 605.444s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff96dec9e32800 (size 1024):
  comm "test_firmware-0", pid 3101, jiffies 4294945072 (age 605.404s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff96df0ab17000 (size 1024):
  comm "test_firmware-1", pid 3102, jiffies 4294945073 (age 605.432s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff96decd6f6400 (size 1024):
  comm "test_firmware-2", pid 3103, jiffies 4294945073 (age 605.432s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff96df0dc69c00 (size 1024):
  comm "test_firmware-3", pid 3104, jiffies 4294945073 (age 605.432s)
  hex dump (first 32 bytes):
    45 46 47 48 34 35 36 37 0a 00 00 00 00 00 00 00  EFGH4567........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb58fb26c>] slab_post_alloc_hook+0x8c/0x3e0
    [<ffffffffb5902b49>] __kmem_cache_alloc_node+0x1d9/0x2a0
    [<ffffffffb587696e>] kmalloc_trace+0x2e/0xc0
    [<ffffffffb5cc00c0>] test_fw_run_batch_request+0x90/0x170
    [<ffffffffb55d6dff>] kthread+0x10f/0x140
    [<ffffffffb5402fa9>] ret_from_fork+0x29/0x50
[root@pc-mtodorov linux_torvalds]# uname -rms
Linux 6.3.0-rc4mt+20230330-00051-g8bb95a1662f8-dirty x86_64
[root@pc-mtodorov linux_torvalds]# 

My gut feeling tells me that it is not test_fw_config->reqs because
there are 75 instances leaked.

Regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

"I see something approaching fast ... Will it be friends with me?"


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-03-30 18:43 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-28  9:23 [BUG] selftests/firmware: copious kernel memory leaks in test_fw_run_batch_request() Mirsad Todorovac
2023-03-28 10:04 ` Mirsad Todorovac
2023-03-28 13:23   ` Mirsad Todorovac
2023-03-28 10:06 ` Dan Carpenter
2023-03-28 10:33   ` Mirsad Todorovac
2023-03-30 12:14   ` [BUG] [PATCH RFC v1] " Mirsad Todorovac
2023-03-30 15:12   ` [BUG] [PATCH RFC v2] " Mirsad Todorovac
2023-03-30 15:44     ` Dan Carpenter
2023-03-30 16:01       ` Takashi Iwai
2023-03-30 16:04         ` Dan Carpenter
2023-03-30 18:42         ` Mirsad Goran Todorovac

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).