linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
@ 2022-01-07  6:48 Lukas Bulwahn
  2022-01-07 17:54 ` Shoaib Rao
  2022-01-09 21:20 ` Jakub Kicinski
  0 siblings, 2 replies; 11+ messages in thread
From: Lukas Bulwahn @ 2022-01-07  6:48 UTC (permalink / raw)
  To: Rao Shoaib, David S. Miller
  Cc: Linux Kernel Mailing List, Netdev, Sudip Mukherjee

Dear Rao and David,


In our syzkaller instance running on linux-next,
https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
observing a memory leak in prepare_creds,
https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
for quite some time.

It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
So, it is in mainline, was released and has not been fixed in
linux-next yet.

As syzkaller also provides a reproducer, we bisected this memory leak
to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
support").

We also tested that reverting this commit on torvalds' current tree
made the memory leak with the reproducer go away.

Could you please have a look how your commit introduces this memory
leak? We will gladly support testing your fix in case help is needed.


Best regards,

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-07  6:48 Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Lukas Bulwahn
@ 2022-01-07 17:54 ` Shoaib Rao
  2022-01-09  4:10   ` Lukas Bulwahn
  2022-01-09 21:20 ` Jakub Kicinski
  1 sibling, 1 reply; 11+ messages in thread
From: Shoaib Rao @ 2022-01-07 17:54 UTC (permalink / raw)
  To: Lukas Bulwahn, David S. Miller
  Cc: Linux Kernel Mailing List, Netdev, Sudip Mukherjee

Hi Lukas,

I took a look at the patch and I fail to see how prepare_creds() could 
be impacted by the patch. The only reference to a cred in the patch is 
via maybe_add_creds().

prepare_creds() is called to make a copy of the current creds which will 
be later modified. If there is any leak it would be in the caller not 
releasing the memory. The patch does not do anything with creds.

If there is any more information that can help identify the issue, I 
will be happy to look into it.

Note that a lot of bugs are timing related, so while it might seem that 
a change is causing the problem, it may not be the cause, it may just be 
changing the environment for the bug to show up.

Shoaib

On 1/6/22 22:48, Lukas Bulwahn wrote:
> Dear Rao and David,
>
>
> In our syzkaller instance running on linux-next,
> https://urldefense.com/v3/__https://elisa-builder-00.iol.unh.edu/syzkaller-next/__;!!ACWV5N9M2RV99hQ!YR_lD5j1kvA5QfrbPcM5nMVZZkWNcF-UEE4vKA20TPkslzzGDVPqpL-6heEhBZ_e$ , we have been
> observing a memory leak in prepare_creds,
> https://urldefense.com/v3/__https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3__;!!ACWV5N9M2RV99hQ!YR_lD5j1kvA5QfrbPcM5nMVZZkWNcF-UEE4vKA20TPkslzzGDVPqpL-6hS1luOMv$ ,
> for quite some time.
>
> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> So, it is in mainline, was released and has not been fixed in
> linux-next yet.
>
> As syzkaller also provides a reproducer, we bisected this memory leak
> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> support").
>
> We also tested that reverting this commit on torvalds' current tree
> made the memory leak with the reproducer go away.
>
> Could you please have a look how your commit introduces this memory
> leak? We will gladly support testing your fix in case help is needed.
>
>
> Best regards,
>
> Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-07 17:54 ` Shoaib Rao
@ 2022-01-09  4:10   ` Lukas Bulwahn
       [not found]     ` <20220109064905.1594-1-hdanton@sina.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Lukas Bulwahn @ 2022-01-09  4:10 UTC (permalink / raw)
  To: Shoaib Rao
  Cc: David S. Miller, Linux Kernel Mailing List, Netdev, Sudip Mukherjee

On Fri, Jan 7, 2022 at 6:55 PM Shoaib Rao <rao.shoaib@oracle.com> wrote:
>
> Hi Lukas,
>
> I took a look at the patch and I fail to see how prepare_creds() could
> be impacted by the patch. The only reference to a cred in the patch is
> via maybe_add_creds().
>
> prepare_creds() is called to make a copy of the current creds which will
> be later modified. If there is any leak it would be in the caller not
> releasing the memory. The patch does not do anything with creds.
>
> If there is any more information that can help identify the issue, I
> will be happy to look into it.
>

Here is more information:

Here are all crash reports:

https://elisa-builder-00.iol.unh.edu/syzkaller-next/crash?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3

and here at the bottom of the page is a C program that shows the
memory leak with the typical memory leak detections switched on:

https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3

Please try to reproduce this on your machine. If you need more
instructions on how to set up the kernel to get this program to
reproduce the issue, please let us know.

> Note that a lot of bugs are timing related, so while it might seem that
> a change is causing the problem, it may not be the cause, it may just be
> changing the environment for the bug to show up.
>

Well, we are pretty sure that this commit makes it show up and
disappear depending on where it is included or reverted, respectively,
tested now on multiple kernel versions. So, to resolve the issue, we
just need to revert the commit.

Lukas

> Shoaib
>
> On 1/6/22 22:48, Lukas Bulwahn wrote:
> > Dear Rao and David,
> >
> >
> > In our syzkaller instance running on linux-next,
> > https://urldefense.com/v3/__https://elisa-builder-00.iol.unh.edu/syzkaller-next/__;!!ACWV5N9M2RV99hQ!YR_lD5j1kvA5QfrbPcM5nMVZZkWNcF-UEE4vKA20TPkslzzGDVPqpL-6heEhBZ_e$ , we have been
> > observing a memory leak in prepare_creds,
> > https://urldefense.com/v3/__https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3__;!!ACWV5N9M2RV99hQ!YR_lD5j1kvA5QfrbPcM5nMVZZkWNcF-UEE4vKA20TPkslzzGDVPqpL-6hS1luOMv$ ,
> > for quite some time.
> >
> > It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> > So, it is in mainline, was released and has not been fixed in
> > linux-next yet.
> >
> > As syzkaller also provides a reproducer, we bisected this memory leak
> > to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> > support").
> >
> > We also tested that reverting this commit on torvalds' current tree
> > made the memory leak with the reproducer go away.
> >
> > Could you please have a look how your commit introduces this memory
> > leak? We will gladly support testing your fix in case help is needed.
> >
> >
> > Best regards,
> >
> > Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-07  6:48 Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Lukas Bulwahn
  2022-01-07 17:54 ` Shoaib Rao
@ 2022-01-09 21:20 ` Jakub Kicinski
  2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 14:54   ` Lukas Bulwahn
  1 sibling, 2 replies; 11+ messages in thread
From: Jakub Kicinski @ 2022-01-09 21:20 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> Dear Rao and David,
> 
> 
> In our syzkaller instance running on linux-next,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> observing a memory leak in prepare_creds,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> for quite some time.
> 
> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> So, it is in mainline, was released and has not been fixed in
> linux-next yet.
> 
> As syzkaller also provides a reproducer, we bisected this memory leak
> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> support").
> 
> We also tested that reverting this commit on torvalds' current tree
> made the memory leak with the reproducer go away.
> 
> Could you please have a look how your commit introduces this memory
> leak? We will gladly support testing your fix in case help is needed.

Let's test the regression/bug report tracking bot :)

#regzbot introduced: 314001f0bf92

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-09 21:20 ` Jakub Kicinski
@ 2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 16:19     ` Lukas Bulwahn
  2022-01-10 16:21     ` Jakub Kicinski
  2022-01-10 14:54   ` Lukas Bulwahn
  1 sibling, 2 replies; 11+ messages in thread
From: Thorsten Leemhuis @ 2022-01-10 14:02 UTC (permalink / raw)
  To: Jakub Kicinski, Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions


On 09.01.22 22:20, Jakub Kicinski wrote:
> On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
>> Dear Rao and David,
>>
>>
>> In our syzkaller instance running on linux-next,
>> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
>> observing a memory leak in prepare_creds,
>> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
>> for quite some time.
>>
>> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
>> So, it is in mainline, was released and has not been fixed in
>> linux-next yet.
>>
>> As syzkaller also provides a reproducer, we bisected this memory leak
>> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
>> support").
>>
>> We also tested that reverting this commit on torvalds' current tree
>> made the memory leak with the reproducer go away.
>>
>> Could you please have a look how your commit introduces this memory
>> leak? We will gladly support testing your fix in case help is needed.
> 
> Let's test the regression/bug report tracking bot :)
> 
> #regzbot introduced: 314001f0bf92

Great, thx for trying, you only did a small mistake: it lacked a caret
(^) before the "introduced", which would have told regzbot that the
parent mail (the one you quoted) is the one containing the report (which
later is linked in patch descriptions of fixes and allows rezgbot to
connect things). That's why regzbot now thinks you reported the issue
and looks out for patches and commits that link to your mail. :-/

Don't worry, I just added it properly and now mark this as duplicate:

#regzbot dup-of:
https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/

Thx again for trying.


I wonder if this mistake could be avoided. I came up with one idea while
walking the dog:

 * if there is *no* parent mail, then "regzbot introduce" could consider
the current mail as the report

 * if there *is* a parent mail, then "regzbot introduce" could consider
the parent as the report

Then regzbot would have done the right thing in this case. But there is
a "but": I wonder if such an approach would be too much black magic that
confuses more than it helps. What do you think?

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-09 21:20 ` Jakub Kicinski
  2022-01-10 14:02   ` Thorsten Leemhuis
@ 2022-01-10 14:54   ` Lukas Bulwahn
  1 sibling, 0 replies; 11+ messages in thread
From: Lukas Bulwahn @ 2022-01-10 14:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On Sun, Jan 9, 2022 at 10:20 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> > Dear Rao and David,
> >
> >
> > In our syzkaller instance running on linux-next,
> > https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> > observing a memory leak in prepare_creds,
> > https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> > for quite some time.
> >
> > It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> > So, it is in mainline, was released and has not been fixed in
> > linux-next yet.
> >
> > As syzkaller also provides a reproducer, we bisected this memory leak
> > to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> > support").
> >
> > We also tested that reverting this commit on torvalds' current tree
> > made the memory leak with the reproducer go away.
> >
> > Could you please have a look how your commit introduces this memory
> > leak? We will gladly support testing your fix in case help is needed.
>
> Let's test the regression/bug report tracking bot :)
>
> #regzbot introduced: 314001f0bf92

Here is all relevant information:

We have a reproducer program and this reproducer setup:


Kernel Build:

make mrproper && make defconfig && make kvm_guest.config &&
scripts/config -e KCOV -e KCOV_INSTRUMENT_ALL -e
KCOV_ENABLE_COMPARISONS -e DEBUG_FS -e DEBUG_KMEMLEAK -e DEBUG_INFO -e
KALLSYMS -e KALLSYMS_ALL -e NAMESPACES -e UTS_NS -e IPC_NS -e PID_NS
-e NET_NS -e CGROUP_PIDS -e MEMCG -e USER_NS -e CONFIGFS_FS -e
SECURITYFS -e FAULT_INJECTION -e FAULT_INJECTION_DEBUG_FS -e
FAULT_INJECTION_USERCOPY -e FAILSLAB -e FAIL_PAGE_ALLOC -e
FAIL_MAKE_REQUEST -e FAIL_IO_TIMEOUT -e FAIL_FUTEX -e LOCKDEP -e
PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PROVE_RCU -e DEBUG_VM -e
REFCOUNT_FULL -e FORTIFY_SOURCE -e HARDENED_USERCOPY -e
LOCKUP_DETECTOR -e SOFTLOCKUP_DETECTOR -e HARDLOCKUP_DETECTOR -e
BOOTPARAM_HARDLOCKUP_PANIC -e DETECT_HUNG_TASK -e WQ_WATCHDOG -e
USB_GADGET -e USB_RAW_GADGET -e TUN -e KCSAN -d RANDOMIZE_BASE -e
MAC80211_HWSIM -e IEEE802154 -e MAC802154 -e IEEE802154_DRIVERS -e
IEEE802154_HWSIM -e BT -e BT_HCIVHCI && make olddefconfig && make -j
24

(This is not a minimal config for the reproducer.)


QEMU Command:

qemu-system-x86_64 -m 2048 -smp 2 -chardev
socket,id=SOCKSYZ,server,nowait,host=localhost,port=46514 -mon
chardev=SOCKSYZ,mode=control -display none -serial stdio -no-reboot
-name VM-test -device virtio-rng-pci -enable-kvm -cpu
host,migratable=off -device e1000,netdev=net0 -netdev
user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:28993-:22 -hda
bullseye.img -snapshot -kernel bzImage -append "root=/dev/sda
console=ttyS0"

Reproducer: see C program at the bottom of
https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3

Trigger in QEMU: compile reproducer with gcc and run it

We observe the memory leak output below on next-20220110 with the setup above.
We do NOT observe the memory leak output below on next-20220110, when
disabling AF_UNIX_OOB.

So, no memory leak for a kernel build with this diff in the repository
and everything else same as above. That is also why the bisection
identified commit 314001f0bf92 to introduce this memory leak.

diff --git a/net/unix/Kconfig b/net/unix/Kconfig
index b7f811216820..e4175feb1809 100644
--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -28,7 +28,7 @@ config UNIX_SCM
 config AF_UNIX_OOB
        bool
        depends on UNIX
-       default y
+       default n

 config UNIX_DIAG
        tristate "UNIX: socket monitoring interface"



memory leak output:

BUG: memory leak
unreferenced object 0xffff888012fd0240 (size 192):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<00000000587efbf5>] prepare_creds+0x27/0x420
    [<0000000095b9beb6>] copy_creds+0x3a/0x600
    [<000000004e59ddd9>] copy_process+0x830/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff88800536dba0 (size 32):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000e0ec3d82>] __kmalloc+0x161/0x270
    [<000000006e2dab2d>] security_prepare_creds+0xa3/0xd0
    [<0000000090cfc7fd>] prepare_creds+0x2d6/0x420
    [<0000000095b9beb6>] copy_creds+0x3a/0x600
    [<000000004e59ddd9>] copy_process+0x830/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff88800dba5c00 (size 232):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de  .............N..
    ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff  ................
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<0000000069a36692>] alloc_pid+0x6d/0x670
    [<000000006f39f82c>] copy_process+0x1a95/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff888014438e80 (size 1856):
  comm "a.out", pid 251, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 58 01 00 00 00 00 00 00  ........X.......
    01 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<000000004c97eff8>] sk_prot_alloc+0x3e/0x1b0
    [<0000000034d397b2>] sk_alloc+0x34/0x320
    [<0000000046549569>] unix_create1+0x84/0x260
    [<00000000e72cbd15>] unix_create+0x90/0x120
    [<000000000d82ff9e>] __sock_create+0x285/0x520
    [<00000000087d9b40>] __sys_socketpair+0x142/0x380
    [<00000000f7586b33>] __x64_sys_socketpair+0x1e/0x30
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff8880135311c0 (size 32):
  comm "a.out", pid 251, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01 00 00 00 01 00 00 00 18 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000b535b6cd>] kmem_cache_alloc_trace+0x144/0x220
    [<000000008c20c9fd>] selinux_sk_alloc_security+0x52/0xf0
    [<00000000cdf964c1>] security_sk_alloc+0x39/0x70
    [<0000000005d51b11>] sk_prot_alloc+0x89/0x1b0
    [<0000000034d397b2>] sk_alloc+0x34/0x320
    [<0000000046549569>] unix_create1+0x84/0x260
    [<00000000e72cbd15>] unix_create+0x90/0x120
    [<000000000d82ff9e>] __sock_create+0x285/0x520
    [<00000000087d9b40>] __sys_socketpair+0x142/0x380
    [<00000000f7586b33>] __x64_sys_socketpair+0x1e/0x30
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
       [not found]     ` <20220109064905.1594-1-hdanton@sina.com>
@ 2022-01-10 15:15       ` Lukas Bulwahn
  0 siblings, 0 replies; 11+ messages in thread
From: Lukas Bulwahn @ 2022-01-10 15:15 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Shoaib Rao, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee

On Sun, Jan 9, 2022 at 7:49 AM Hillf Danton <hdanton@sina.com> wrote:
>
> On Sun, 9 Jan 2022 05:10:48 +0100 Lukas Bulwahn wrote:
> > On Fri, Jan 7, 2022 at 6:55 PM Shoaib Rao <rao.shoaib@oracle.com> wrote:
> > >
> > > Hi Lukas,
> > >
> > > I took a look at the patch and I fail to see how prepare_creds() could
> > > be impacted by the patch. The only reference to a cred in the patch is
> > > via maybe_add_creds().
> > >
> > > prepare_creds() is called to make a copy of the current creds which will
> > > be later modified. If there is any leak it would be in the caller not
> > > releasing the memory. The patch does not do anything with creds.
> > >
> > > If there is any more information that can help identify the issue, I
> > > will be happy to look into it.
> > >
> >
> > Here is more information:
> >
> > Here are all crash reports:
> >
> > https://elisa-builder-00.iol.unh.edu/syzkaller-next/crash?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3
>
> More weid is the failure of Ctrl-f "unix" at [1] except for a bunch of
> clone. Can you specify why report at [1] has a direct link to af_unix?
>
>         Hillf
>
> [1] https://elisa-builder-00.iol.unh.edu/syzkaller-next/file?name=crashes%2f1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3%2freport15
>

Hillf,

I agree that is really weird. I fear we have some issue with our
syzkaller instance, somehow the database is collecting error logs that
seem to be different from the error logs I observe when manually
running the reproducer. The heuristics of aggregating error messages
is black magic.

Importantly, we have a reproducer, which is clearly related to the
af_unix functionality and we can manually trigger a reasonable error
trace. Ignore all the rest.

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 14:02   ` Thorsten Leemhuis
@ 2022-01-10 16:19     ` Lukas Bulwahn
  2022-01-10 16:29       ` Jakub Kicinski
  2022-01-10 16:21     ` Jakub Kicinski
  1 sibling, 1 reply; 11+ messages in thread
From: Lukas Bulwahn @ 2022-01-10 16:19 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Jakub Kicinski, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, Jan 10, 2022 at 3:02 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
>
>
> On 09.01.22 22:20, Jakub Kicinski wrote:
> > On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> >> Dear Rao and David,
> >>
> >>
> >> In our syzkaller instance running on linux-next,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> >> observing a memory leak in prepare_creds,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> >> for quite some time.
> >>
> >> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> >> So, it is in mainline, was released and has not been fixed in
> >> linux-next yet.
> >>
> >> As syzkaller also provides a reproducer, we bisected this memory leak
> >> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> >> support").
> >>
> >> We also tested that reverting this commit on torvalds' current tree
> >> made the memory leak with the reproducer go away.
> >>
> >> Could you please have a look how your commit introduces this memory
> >> leak? We will gladly support testing your fix in case help is needed.
> >
> > Let's test the regression/bug report tracking bot :)
> >
> > #regzbot introduced: 314001f0bf92
>
> Great, thx for trying, you only did a small mistake: it lacked a caret
> (^) before the "introduced", which would have told regzbot that the
> parent mail (the one you quoted) is the one containing the report (which
> later is linked in patch descriptions of fixes and allows rezgbot to
> connect things). That's why regzbot now thinks you reported the issue
> and looks out for patches and commits that link to your mail. :-/
>
> Don't worry, I just added it properly and now mark this as duplicate:
>
> #regzbot dup-of:
> https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/
>
> Thx again for trying.
>

Thorsten, Jakub, formally this may or may not be a "regression"---as
Thorsten defines it:

It's a regression if some application or practical use case running fine on one
Linux kernel works worse or not at all with a newer version compiled using a
similar configuration.

The af_unix functionality without oob support works before
314001f0bf92 ("af_unix: Add OOB support").
The af_unix functionality without oob support works after 314001f0bf92
("af_unix: Add OOB support").
The af_unix with oob support after the new feature with 314001f0bf92
("af_unix: Add OOB support") makes a memory leak visible; we do not
know if this feature even triggers it or just makes it visible.

Now, if we disable oob support we get a kernel without an observable
memory leak. However, oob support is added by default, and this makes
this memory leak visible. So, if oob support is turned into a
non-default option or nobody ever made use of the oob support before,
it really does not count as regression at all. The oob support did not
work before and now it works but just leaks a bit of memory---it is
potentially a bug, but not a regression. Of course, maybe oob support
is also just intended to make this memory leak observable, who knows?
Then, it is not even a bug, but a feature.

Thorsten's database is still quite empty, so let us keep tracking the
progress with regzbot. I guess we cannot mark "issues" in regzbot as a
true regression or as a bug (an issue that appears with a new
feature).

Also, this reproducer is automatically generated, so it barely
qualifies as "some application or practical use case", but at best as
some derived "stress test program" or "micro benchmark".

The syzbot CI and kernel CI database are also planning to track such
things (once all databases and all the interfaces all work smoothly),
so in the long term, such issues as this one would not qualify for
regzbot. For now, many things in these pipelines are still manual and
hence, triggering and investigation is manual effort, as well as
manually informing the involved developers, which also means that
tracking remains manual effort, for which regzbot is probably the
right new tool for now.

We will learn what should go into regzbot's tracker and what should
not---as we move on in the community: various information from other
systems (syzbot, kernel CI, kernel test robot etc.) and their reports
are also still difficult to add, find, track, bisect etc.

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 16:19     ` Lukas Bulwahn
@ 2022-01-10 16:21     ` Jakub Kicinski
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2022-01-10 16:21 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Lukas Bulwahn, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, 10 Jan 2022 15:02:23 +0100 Thorsten Leemhuis wrote:
> On 09.01.22 22:20, Jakub Kicinski wrote:
> > On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:  
> >> Dear Rao and David,
> >>
> >>
> >> In our syzkaller instance running on linux-next,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> >> observing a memory leak in prepare_creds,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> >> for quite some time.
> >>
> >> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> >> So, it is in mainline, was released and has not been fixed in
> >> linux-next yet.
> >>
> >> As syzkaller also provides a reproducer, we bisected this memory leak
> >> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> >> support").
> >>
> >> We also tested that reverting this commit on torvalds' current tree
> >> made the memory leak with the reproducer go away.
> >>
> >> Could you please have a look how your commit introduces this memory
> >> leak? We will gladly support testing your fix in case help is needed.  
> > 
> > Let's test the regression/bug report tracking bot :)
> > 
> > #regzbot introduced: 314001f0bf92  
> 
> Great, thx for trying, you only did a small mistake: it lacked a caret
> (^) before the "introduced", which would have told regzbot that the
> parent mail (the one you quoted) is the one containing the report (which
> later is linked in patch descriptions of fixes and allows rezgbot to
> connect things). That's why regzbot now thinks you reported the issue
> and looks out for patches and commits that link to your mail. :-/
> 
> Don't worry, I just added it properly and now mark this as duplicate:
> 
> #regzbot dup-of:
> https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/
> 
> Thx again for trying.

Ah, thanks for the fix up, I copy/pasted the example with the hash and
forgot about the caret.

> I wonder if this mistake could be avoided. I came up with one idea while
> walking the dog:
> 
>  * if there is *no* parent mail, then "regzbot introduce" could consider
> the current mail as the report
> 
>  * if there *is* a parent mail, then "regzbot introduce" could consider
> the parent as the report
> 
> Then regzbot would have done the right thing in this case. But there is
> a "but": I wonder if such an approach would be too much black magic that
> confuses more than it helps. What do you think?

IMHO heuristics may do more harm than good. At least for maintainers.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 16:19     ` Lukas Bulwahn
@ 2022-01-10 16:29       ` Jakub Kicinski
  2022-01-28 13:17         ` Thorsten Leemhuis
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2022-01-10 16:29 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Thorsten Leemhuis, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, 10 Jan 2022 17:19:56 +0100 Lukas Bulwahn wrote:
> It's a regression if some application or practical use case running fine on one
> Linux kernel works worse or not at all with a newer version compiled using a
> similar configuration.
> 
> The af_unix functionality without oob support works before
> 314001f0bf92 ("af_unix: Add OOB support").
> The af_unix functionality without oob support works after 314001f0bf92
> ("af_unix: Add OOB support").
> The af_unix with oob support after the new feature with 314001f0bf92
> ("af_unix: Add OOB support") makes a memory leak visible; we do not
> know if this feature even triggers it or just makes it visible.
> 
> Now, if we disable oob support we get a kernel without an observable
> memory leak. However, oob support is added by default, and this makes
> this memory leak visible. So, if oob support is turned into a
> non-default option or nobody ever made use of the oob support before,
> it really does not count as regression at all. The oob support did not
> work before and now it works but just leaks a bit of memory---it is
> potentially a bug, but not a regression. Of course, maybe oob support
> is also just intended to make this memory leak observable, who knows?
> Then, it is not even a bug, but a feature.

I see, thanks for the explanation. It wasn't clear from the "does not
repro on 5.15, does repro on net-next" type of wording that the repro
actually uses the new functionality.

> Thorsten's database is still quite empty, so let us keep tracking the
> progress with regzbot. I guess we cannot mark "issues" in regzbot as a
> true regression or as a bug (an issue that appears with a new
> feature).
> 
> Also, this reproducer is automatically generated, so it barely
> qualifies as "some application or practical use case", but at best as
> some derived "stress test program" or "micro benchmark".
> 
> The syzbot CI and kernel CI database are also planning to track such
> things (once all databases and all the interfaces all work smoothly),
> so in the long term, such issues as this one would not qualify for
> regzbot. For now, many things in these pipelines are still manual and
> hence, triggering and investigation is manual effort, as well as
> manually informing the involved developers, which also means that
> tracking remains manual effort, for which regzbot is probably the
> right new tool for now.

Right, that was my thinking.

> We will learn what should go into regzbot's tracker and what should
> not---as we move on in the community: various information from other
> systems (syzbot, kernel CI, kernel test robot etc.) and their reports
> are also still difficult to add, find, track, bisect etc.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 16:29       ` Jakub Kicinski
@ 2022-01-28 13:17         ` Thorsten Leemhuis
  0 siblings, 0 replies; 11+ messages in thread
From: Thorsten Leemhuis @ 2022-01-28 13:17 UTC (permalink / raw)
  To: Jakub Kicinski, Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On 10.01.22 17:29, Jakub Kicinski wrote:
> On Mon, 10 Jan 2022 17:19:56 +0100 Lukas Bulwahn wrote:
>> It's a regression if some application or practical use case running fine on one
>> Linux kernel works worse or not at all with a newer version compiled using a
>> similar configuration.
>>
>> The af_unix functionality without oob support works before
>> 314001f0bf92 ("af_unix: Add OOB support").
>> The af_unix functionality without oob support works after 314001f0bf92
>> ("af_unix: Add OOB support").
>> The af_unix with oob support after the new feature with 314001f0bf92
>> ("af_unix: Add OOB support") makes a memory leak visible; we do not
>> know if this feature even triggers it or just makes it visible.
>>
>> Now, if we disable oob support we get a kernel without an observable
>> memory leak. However, oob support is added by default, and this makes
>> this memory leak visible. So, if oob support is turned into a
>> non-default option or nobody ever made use of the oob support before,
>> it really does not count as regression at all. The oob support did not
>> work before and now it works but just leaks a bit of memory---it is
>> potentially a bug, but not a regression. Of course, maybe oob support
>> is also just intended to make this memory leak observable, who knows?
>> Then, it is not even a bug, but a feature.
> I see, thanks for the explanation. It wasn't clear from the "does not
> repro on 5.15, does repro on net-next" type of wording that the repro
> actually uses the new functionality.

Thx from my side, too.

>> Thorsten's database is still quite empty, so let us keep tracking the
>> progress with regzbot. I guess we cannot mark "issues" in regzbot as a
>> true regression or as a bug (an issue that appears with a new
>> feature).

Yeah, I definitely don't want regzbot to be used to track ordinary
issues, but sometimes it's hard to find clear boundaries between issue
and regression. This is one, but I tend to say it's an issue, as users s
won't notice the leak in practice afaics. But there are other areas that
are greyish; an kernel Oops/Warning/BUG or a hang sometimes might also
not strictly be a regression, but I guess tracking them might be a good
idea, so I'm open to those.

Anyway:

>> Also, this reproducer is automatically generated, so it barely
>> qualifies as "some application or practical use case", but at best as
>> some derived "stress test program" or "micro benchmark".

I left it tracked until now, but after Jakub's reply nothing to actually
address the issue seem to have happened. I guess in that case it's not
that important and it's time to remove it from the list of open
regressions tracked by regzbot, if that is okay for everyone:

#regzbot invalid: not strictly a regression

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-01-28 13:18 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-07  6:48 Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Lukas Bulwahn
2022-01-07 17:54 ` Shoaib Rao
2022-01-09  4:10   ` Lukas Bulwahn
     [not found]     ` <20220109064905.1594-1-hdanton@sina.com>
2022-01-10 15:15       ` Lukas Bulwahn
2022-01-09 21:20 ` Jakub Kicinski
2022-01-10 14:02   ` Thorsten Leemhuis
2022-01-10 16:19     ` Lukas Bulwahn
2022-01-10 16:29       ` Jakub Kicinski
2022-01-28 13:17         ` Thorsten Leemhuis
2022-01-10 16:21     ` Jakub Kicinski
2022-01-10 14:54   ` Lukas Bulwahn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).