regressions.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
       [not found] <CAKXUXMzZkQvHJ35nwVhcJe+DrtEXGw+eKGVD04=xRJkVUC2sPA@mail.gmail.com>
@ 2022-01-09 21:20 ` Jakub Kicinski
  2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 14:54   ` Lukas Bulwahn
  2022-01-10 13:15 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") #forregzbot Thorsten Leemhuis
  1 sibling, 2 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-01-09 21:20 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> Dear Rao and David,
> 
> 
> In our syzkaller instance running on linux-next,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> observing a memory leak in prepare_creds,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> for quite some time.
> 
> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> So, it is in mainline, was released and has not been fixed in
> linux-next yet.
> 
> As syzkaller also provides a reproducer, we bisected this memory leak
> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> support").
> 
> We also tested that reverting this commit on torvalds' current tree
> made the memory leak with the reproducer go away.
> 
> Could you please have a look how your commit introduces this memory
> leak? We will gladly support testing your fix in case help is needed.

Let's test the regression/bug report tracking bot :)

#regzbot introduced: 314001f0bf92

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") #forregzbot
       [not found] <CAKXUXMzZkQvHJ35nwVhcJe+DrtEXGw+eKGVD04=xRJkVUC2sPA@mail.gmail.com>
  2022-01-09 21:20 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Jakub Kicinski
@ 2022-01-10 13:15 ` Thorsten Leemhuis
  1 sibling, 0 replies; 8+ messages in thread
From: Thorsten Leemhuis @ 2022-01-10 13:15 UTC (permalink / raw)
  To: regressions

On 07.01.22 07:48, Lukas Bulwahn wrote:
> Dear Rao and David,
> 
> 
> In our syzkaller instance running on linux-next,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> observing a memory leak in prepare_creds,
> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> for quite some time.
> 
> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> So, it is in mainline, was released and has not been fixed in
> linux-next yet.
> 
> As syzkaller also provides a reproducer, we bisected this memory leak
> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> support").
> 
> We also tested that reverting this commit on torvalds' current tree
> made the memory leak with the reproducer go away.
> 
> Could you please have a look how your commit introduces this memory
> leak? We will gladly support testing your fix in case help is needed.

#regzbot ^introduced: 314001f0bf92

TWIMC: this mail is primarily send for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-09 21:20 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Jakub Kicinski
@ 2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 16:19     ` Lukas Bulwahn
  2022-01-10 16:21     ` Jakub Kicinski
  2022-01-10 14:54   ` Lukas Bulwahn
  1 sibling, 2 replies; 8+ messages in thread
From: Thorsten Leemhuis @ 2022-01-10 14:02 UTC (permalink / raw)
  To: Jakub Kicinski, Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions


On 09.01.22 22:20, Jakub Kicinski wrote:
> On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
>> Dear Rao and David,
>>
>>
>> In our syzkaller instance running on linux-next,
>> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
>> observing a memory leak in prepare_creds,
>> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
>> for quite some time.
>>
>> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
>> So, it is in mainline, was released and has not been fixed in
>> linux-next yet.
>>
>> As syzkaller also provides a reproducer, we bisected this memory leak
>> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
>> support").
>>
>> We also tested that reverting this commit on torvalds' current tree
>> made the memory leak with the reproducer go away.
>>
>> Could you please have a look how your commit introduces this memory
>> leak? We will gladly support testing your fix in case help is needed.
> 
> Let's test the regression/bug report tracking bot :)
> 
> #regzbot introduced: 314001f0bf92

Great, thx for trying, you only did a small mistake: it lacked a caret
(^) before the "introduced", which would have told regzbot that the
parent mail (the one you quoted) is the one containing the report (which
later is linked in patch descriptions of fixes and allows rezgbot to
connect things). That's why regzbot now thinks you reported the issue
and looks out for patches and commits that link to your mail. :-/

Don't worry, I just added it properly and now mark this as duplicate:

#regzbot dup-of:
https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/

Thx again for trying.


I wonder if this mistake could be avoided. I came up with one idea while
walking the dog:

 * if there is *no* parent mail, then "regzbot introduce" could consider
the current mail as the report

 * if there *is* a parent mail, then "regzbot introduce" could consider
the parent as the report

Then regzbot would have done the right thing in this case. But there is
a "but": I wonder if such an approach would be too much black magic that
confuses more than it helps. What do you think?

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-09 21:20 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Jakub Kicinski
  2022-01-10 14:02   ` Thorsten Leemhuis
@ 2022-01-10 14:54   ` Lukas Bulwahn
  1 sibling, 0 replies; 8+ messages in thread
From: Lukas Bulwahn @ 2022-01-10 14:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On Sun, Jan 9, 2022 at 10:20 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> > Dear Rao and David,
> >
> >
> > In our syzkaller instance running on linux-next,
> > https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> > observing a memory leak in prepare_creds,
> > https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> > for quite some time.
> >
> > It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> > So, it is in mainline, was released and has not been fixed in
> > linux-next yet.
> >
> > As syzkaller also provides a reproducer, we bisected this memory leak
> > to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> > support").
> >
> > We also tested that reverting this commit on torvalds' current tree
> > made the memory leak with the reproducer go away.
> >
> > Could you please have a look how your commit introduces this memory
> > leak? We will gladly support testing your fix in case help is needed.
>
> Let's test the regression/bug report tracking bot :)
>
> #regzbot introduced: 314001f0bf92

Here is all relevant information:

We have a reproducer program and this reproducer setup:


Kernel Build:

make mrproper && make defconfig && make kvm_guest.config &&
scripts/config -e KCOV -e KCOV_INSTRUMENT_ALL -e
KCOV_ENABLE_COMPARISONS -e DEBUG_FS -e DEBUG_KMEMLEAK -e DEBUG_INFO -e
KALLSYMS -e KALLSYMS_ALL -e NAMESPACES -e UTS_NS -e IPC_NS -e PID_NS
-e NET_NS -e CGROUP_PIDS -e MEMCG -e USER_NS -e CONFIGFS_FS -e
SECURITYFS -e FAULT_INJECTION -e FAULT_INJECTION_DEBUG_FS -e
FAULT_INJECTION_USERCOPY -e FAILSLAB -e FAIL_PAGE_ALLOC -e
FAIL_MAKE_REQUEST -e FAIL_IO_TIMEOUT -e FAIL_FUTEX -e LOCKDEP -e
PROVE_LOCKING -e DEBUG_ATOMIC_SLEEP -e PROVE_RCU -e DEBUG_VM -e
REFCOUNT_FULL -e FORTIFY_SOURCE -e HARDENED_USERCOPY -e
LOCKUP_DETECTOR -e SOFTLOCKUP_DETECTOR -e HARDLOCKUP_DETECTOR -e
BOOTPARAM_HARDLOCKUP_PANIC -e DETECT_HUNG_TASK -e WQ_WATCHDOG -e
USB_GADGET -e USB_RAW_GADGET -e TUN -e KCSAN -d RANDOMIZE_BASE -e
MAC80211_HWSIM -e IEEE802154 -e MAC802154 -e IEEE802154_DRIVERS -e
IEEE802154_HWSIM -e BT -e BT_HCIVHCI && make olddefconfig && make -j
24

(This is not a minimal config for the reproducer.)


QEMU Command:

qemu-system-x86_64 -m 2048 -smp 2 -chardev
socket,id=SOCKSYZ,server,nowait,host=localhost,port=46514 -mon
chardev=SOCKSYZ,mode=control -display none -serial stdio -no-reboot
-name VM-test -device virtio-rng-pci -enable-kvm -cpu
host,migratable=off -device e1000,netdev=net0 -netdev
user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:28993-:22 -hda
bullseye.img -snapshot -kernel bzImage -append "root=/dev/sda
console=ttyS0"

Reproducer: see C program at the bottom of
https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3

Trigger in QEMU: compile reproducer with gcc and run it

We observe the memory leak output below on next-20220110 with the setup above.
We do NOT observe the memory leak output below on next-20220110, when
disabling AF_UNIX_OOB.

So, no memory leak for a kernel build with this diff in the repository
and everything else same as above. That is also why the bisection
identified commit 314001f0bf92 to introduce this memory leak.

diff --git a/net/unix/Kconfig b/net/unix/Kconfig
index b7f811216820..e4175feb1809 100644
--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -28,7 +28,7 @@ config UNIX_SCM
 config AF_UNIX_OOB
        bool
        depends on UNIX
-       default y
+       default n

 config UNIX_DIAG
        tristate "UNIX: socket monitoring interface"



memory leak output:

BUG: memory leak
unreferenced object 0xffff888012fd0240 (size 192):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<00000000587efbf5>] prepare_creds+0x27/0x420
    [<0000000095b9beb6>] copy_creds+0x3a/0x600
    [<000000004e59ddd9>] copy_process+0x830/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff88800536dba0 (size 32):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000e0ec3d82>] __kmalloc+0x161/0x270
    [<000000006e2dab2d>] security_prepare_creds+0xa3/0xd0
    [<0000000090cfc7fd>] prepare_creds+0x2d6/0x420
    [<0000000095b9beb6>] copy_creds+0x3a/0x600
    [<000000004e59ddd9>] copy_process+0x830/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff88800dba5c00 (size 232):
  comm "a.out", pid 250, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 ad 4e ad de  .............N..
    ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff  ................
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<0000000069a36692>] alloc_pid+0x6d/0x670
    [<000000006f39f82c>] copy_process+0x1a95/0x2b80
    [<000000005840a46d>] kernel_clone+0x89/0xbf0
    [<0000000070c730ab>] __do_sys_clone+0x88/0xb0
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff888014438e80 (size 1856):
  comm "a.out", pid 251, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 58 01 00 00 00 00 00 00  ........X.......
    01 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace:
    [<00000000a3bddd5a>] kmem_cache_alloc+0x133/0x2d0
    [<000000004c97eff8>] sk_prot_alloc+0x3e/0x1b0
    [<0000000034d397b2>] sk_alloc+0x34/0x320
    [<0000000046549569>] unix_create1+0x84/0x260
    [<00000000e72cbd15>] unix_create+0x90/0x120
    [<000000000d82ff9e>] __sock_create+0x285/0x520
    [<00000000087d9b40>] __sys_socketpair+0x142/0x380
    [<00000000f7586b33>] __x64_sys_socketpair+0x1e/0x30
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff8880135311c0 (size 32):
  comm "a.out", pid 251, jiffies 4294908743 (age 13.529s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01 00 00 00 01 00 00 00 18 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000b535b6cd>] kmem_cache_alloc_trace+0x144/0x220
    [<000000008c20c9fd>] selinux_sk_alloc_security+0x52/0xf0
    [<00000000cdf964c1>] security_sk_alloc+0x39/0x70
    [<0000000005d51b11>] sk_prot_alloc+0x89/0x1b0
    [<0000000034d397b2>] sk_alloc+0x34/0x320
    [<0000000046549569>] unix_create1+0x84/0x260
    [<00000000e72cbd15>] unix_create+0x90/0x120
    [<000000000d82ff9e>] __sock_create+0x285/0x520
    [<00000000087d9b40>] __sys_socketpair+0x142/0x380
    [<00000000f7586b33>] __x64_sys_socketpair+0x1e/0x30
    [<00000000f5b1c158>] do_syscall_64+0x3a/0x80
    [<000000004a0e7245>] entry_SYSCALL_64_after_hwframe+0x44/0xae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 14:02   ` Thorsten Leemhuis
@ 2022-01-10 16:19     ` Lukas Bulwahn
  2022-01-10 16:29       ` Jakub Kicinski
  2022-01-10 16:21     ` Jakub Kicinski
  1 sibling, 1 reply; 8+ messages in thread
From: Lukas Bulwahn @ 2022-01-10 16:19 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Jakub Kicinski, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, Jan 10, 2022 at 3:02 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
>
>
> On 09.01.22 22:20, Jakub Kicinski wrote:
> > On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:
> >> Dear Rao and David,
> >>
> >>
> >> In our syzkaller instance running on linux-next,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> >> observing a memory leak in prepare_creds,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> >> for quite some time.
> >>
> >> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> >> So, it is in mainline, was released and has not been fixed in
> >> linux-next yet.
> >>
> >> As syzkaller also provides a reproducer, we bisected this memory leak
> >> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> >> support").
> >>
> >> We also tested that reverting this commit on torvalds' current tree
> >> made the memory leak with the reproducer go away.
> >>
> >> Could you please have a look how your commit introduces this memory
> >> leak? We will gladly support testing your fix in case help is needed.
> >
> > Let's test the regression/bug report tracking bot :)
> >
> > #regzbot introduced: 314001f0bf92
>
> Great, thx for trying, you only did a small mistake: it lacked a caret
> (^) before the "introduced", which would have told regzbot that the
> parent mail (the one you quoted) is the one containing the report (which
> later is linked in patch descriptions of fixes and allows rezgbot to
> connect things). That's why regzbot now thinks you reported the issue
> and looks out for patches and commits that link to your mail. :-/
>
> Don't worry, I just added it properly and now mark this as duplicate:
>
> #regzbot dup-of:
> https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/
>
> Thx again for trying.
>

Thorsten, Jakub, formally this may or may not be a "regression"---as
Thorsten defines it:

It's a regression if some application or practical use case running fine on one
Linux kernel works worse or not at all with a newer version compiled using a
similar configuration.

The af_unix functionality without oob support works before
314001f0bf92 ("af_unix: Add OOB support").
The af_unix functionality without oob support works after 314001f0bf92
("af_unix: Add OOB support").
The af_unix with oob support after the new feature with 314001f0bf92
("af_unix: Add OOB support") makes a memory leak visible; we do not
know if this feature even triggers it or just makes it visible.

Now, if we disable oob support we get a kernel without an observable
memory leak. However, oob support is added by default, and this makes
this memory leak visible. So, if oob support is turned into a
non-default option or nobody ever made use of the oob support before,
it really does not count as regression at all. The oob support did not
work before and now it works but just leaks a bit of memory---it is
potentially a bug, but not a regression. Of course, maybe oob support
is also just intended to make this memory leak observable, who knows?
Then, it is not even a bug, but a feature.

Thorsten's database is still quite empty, so let us keep tracking the
progress with regzbot. I guess we cannot mark "issues" in regzbot as a
true regression or as a bug (an issue that appears with a new
feature).

Also, this reproducer is automatically generated, so it barely
qualifies as "some application or practical use case", but at best as
some derived "stress test program" or "micro benchmark".

The syzbot CI and kernel CI database are also planning to track such
things (once all databases and all the interfaces all work smoothly),
so in the long term, such issues as this one would not qualify for
regzbot. For now, many things in these pipelines are still manual and
hence, triggering and investigation is manual effort, as well as
manually informing the involved developers, which also means that
tracking remains manual effort, for which regzbot is probably the
right new tool for now.

We will learn what should go into regzbot's tracker and what should
not---as we move on in the community: various information from other
systems (syzbot, kernel CI, kernel test robot etc.) and their reports
are also still difficult to add, find, track, bisect etc.

Lukas

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 14:02   ` Thorsten Leemhuis
  2022-01-10 16:19     ` Lukas Bulwahn
@ 2022-01-10 16:21     ` Jakub Kicinski
  1 sibling, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2022-01-10 16:21 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Lukas Bulwahn, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, 10 Jan 2022 15:02:23 +0100 Thorsten Leemhuis wrote:
> On 09.01.22 22:20, Jakub Kicinski wrote:
> > On Fri, 7 Jan 2022 07:48:46 +0100 Lukas Bulwahn wrote:  
> >> Dear Rao and David,
> >>
> >>
> >> In our syzkaller instance running on linux-next,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/, we have been
> >> observing a memory leak in prepare_creds,
> >> https://elisa-builder-00.iol.unh.edu/syzkaller-next/report?id=1dcac8539d69ad9eb94ab2c8c0d99c11a0b516a3,
> >> for quite some time.
> >>
> >> It is reproducible on v5.15-rc1, v5.15, v5.16-rc8 and next-20220104.
> >> So, it is in mainline, was released and has not been fixed in
> >> linux-next yet.
> >>
> >> As syzkaller also provides a reproducer, we bisected this memory leak
> >> to be introduced with  commit 314001f0bf92 ("af_unix: Add OOB
> >> support").
> >>
> >> We also tested that reverting this commit on torvalds' current tree
> >> made the memory leak with the reproducer go away.
> >>
> >> Could you please have a look how your commit introduces this memory
> >> leak? We will gladly support testing your fix in case help is needed.  
> > 
> > Let's test the regression/bug report tracking bot :)
> > 
> > #regzbot introduced: 314001f0bf92  
> 
> Great, thx for trying, you only did a small mistake: it lacked a caret
> (^) before the "introduced", which would have told regzbot that the
> parent mail (the one you quoted) is the one containing the report (which
> later is linked in patch descriptions of fixes and allows rezgbot to
> connect things). That's why regzbot now thinks you reported the issue
> and looks out for patches and commits that link to your mail. :-/
> 
> Don't worry, I just added it properly and now mark this as duplicate:
> 
> #regzbot dup-of:
> https://lore.kernel.org/lkml/CAKXUXMzZkQvHJ35nwVhcJe%2BDrtEXGw%2BeKGVD04=xRJkVUC2sPA@mail.gmail.com/
> 
> Thx again for trying.

Ah, thanks for the fix up, I copy/pasted the example with the hash and
forgot about the caret.

> I wonder if this mistake could be avoided. I came up with one idea while
> walking the dog:
> 
>  * if there is *no* parent mail, then "regzbot introduce" could consider
> the current mail as the report
> 
>  * if there *is* a parent mail, then "regzbot introduce" could consider
> the parent as the report
> 
> Then regzbot would have done the right thing in this case. But there is
> a "but": I wonder if such an approach would be too much black magic that
> confuses more than it helps. What do you think?

IMHO heuristics may do more harm than good. At least for maintainers.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 16:19     ` Lukas Bulwahn
@ 2022-01-10 16:29       ` Jakub Kicinski
  2022-01-28 13:17         ` Thorsten Leemhuis
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2022-01-10 16:29 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Thorsten Leemhuis, Rao Shoaib, David S. Miller,
	Linux Kernel Mailing List, Netdev, Sudip Mukherjee, regressions

On Mon, 10 Jan 2022 17:19:56 +0100 Lukas Bulwahn wrote:
> It's a regression if some application or practical use case running fine on one
> Linux kernel works worse or not at all with a newer version compiled using a
> similar configuration.
> 
> The af_unix functionality without oob support works before
> 314001f0bf92 ("af_unix: Add OOB support").
> The af_unix functionality without oob support works after 314001f0bf92
> ("af_unix: Add OOB support").
> The af_unix with oob support after the new feature with 314001f0bf92
> ("af_unix: Add OOB support") makes a memory leak visible; we do not
> know if this feature even triggers it or just makes it visible.
> 
> Now, if we disable oob support we get a kernel without an observable
> memory leak. However, oob support is added by default, and this makes
> this memory leak visible. So, if oob support is turned into a
> non-default option or nobody ever made use of the oob support before,
> it really does not count as regression at all. The oob support did not
> work before and now it works but just leaks a bit of memory---it is
> potentially a bug, but not a regression. Of course, maybe oob support
> is also just intended to make this memory leak observable, who knows?
> Then, it is not even a bug, but a feature.

I see, thanks for the explanation. It wasn't clear from the "does not
repro on 5.15, does repro on net-next" type of wording that the repro
actually uses the new functionality.

> Thorsten's database is still quite empty, so let us keep tracking the
> progress with regzbot. I guess we cannot mark "issues" in regzbot as a
> true regression or as a bug (an issue that appears with a new
> feature).
> 
> Also, this reproducer is automatically generated, so it barely
> qualifies as "some application or practical use case", but at best as
> some derived "stress test program" or "micro benchmark".
> 
> The syzbot CI and kernel CI database are also planning to track such
> things (once all databases and all the interfaces all work smoothly),
> so in the long term, such issues as this one would not qualify for
> regzbot. For now, many things in these pipelines are still manual and
> hence, triggering and investigation is manual effort, as well as
> manually informing the involved developers, which also means that
> tracking remains manual effort, for which regzbot is probably the
> right new tool for now.

Right, that was my thinking.

> We will learn what should go into regzbot's tracker and what should
> not---as we move on in the community: various information from other
> systems (syzbot, kernel CI, kernel test robot etc.) and their reports
> are also still difficult to add, find, track, bisect etc.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
  2022-01-10 16:29       ` Jakub Kicinski
@ 2022-01-28 13:17         ` Thorsten Leemhuis
  0 siblings, 0 replies; 8+ messages in thread
From: Thorsten Leemhuis @ 2022-01-28 13:17 UTC (permalink / raw)
  To: Jakub Kicinski, Lukas Bulwahn
  Cc: Rao Shoaib, David S. Miller, Linux Kernel Mailing List, Netdev,
	Sudip Mukherjee, regressions

On 10.01.22 17:29, Jakub Kicinski wrote:
> On Mon, 10 Jan 2022 17:19:56 +0100 Lukas Bulwahn wrote:
>> It's a regression if some application or practical use case running fine on one
>> Linux kernel works worse or not at all with a newer version compiled using a
>> similar configuration.
>>
>> The af_unix functionality without oob support works before
>> 314001f0bf92 ("af_unix: Add OOB support").
>> The af_unix functionality without oob support works after 314001f0bf92
>> ("af_unix: Add OOB support").
>> The af_unix with oob support after the new feature with 314001f0bf92
>> ("af_unix: Add OOB support") makes a memory leak visible; we do not
>> know if this feature even triggers it or just makes it visible.
>>
>> Now, if we disable oob support we get a kernel without an observable
>> memory leak. However, oob support is added by default, and this makes
>> this memory leak visible. So, if oob support is turned into a
>> non-default option or nobody ever made use of the oob support before,
>> it really does not count as regression at all. The oob support did not
>> work before and now it works but just leaks a bit of memory---it is
>> potentially a bug, but not a regression. Of course, maybe oob support
>> is also just intended to make this memory leak observable, who knows?
>> Then, it is not even a bug, but a feature.
> I see, thanks for the explanation. It wasn't clear from the "does not
> repro on 5.15, does repro on net-next" type of wording that the repro
> actually uses the new functionality.

Thx from my side, too.

>> Thorsten's database is still quite empty, so let us keep tracking the
>> progress with regzbot. I guess we cannot mark "issues" in regzbot as a
>> true regression or as a bug (an issue that appears with a new
>> feature).

Yeah, I definitely don't want regzbot to be used to track ordinary
issues, but sometimes it's hard to find clear boundaries between issue
and regression. This is one, but I tend to say it's an issue, as users s
won't notice the leak in practice afaics. But there are other areas that
are greyish; an kernel Oops/Warning/BUG or a hang sometimes might also
not strictly be a regression, but I guess tracking them might be a good
idea, so I'm open to those.

Anyway:

>> Also, this reproducer is automatically generated, so it barely
>> qualifies as "some application or practical use case", but at best as
>> some derived "stress test program" or "micro benchmark".

I left it tracked until now, but after Jakub's reply nothing to actually
address the issue seem to have happened. I guess in that case it's not
that important and it's time to remove it from the list of open
regressions tracked by regzbot, if that is okay for everyone:

#regzbot invalid: not strictly a regression

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-01-28 13:17 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAKXUXMzZkQvHJ35nwVhcJe+DrtEXGw+eKGVD04=xRJkVUC2sPA@mail.gmail.com>
2022-01-09 21:20 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") Jakub Kicinski
2022-01-10 14:02   ` Thorsten Leemhuis
2022-01-10 16:19     ` Lukas Bulwahn
2022-01-10 16:29       ` Jakub Kicinski
2022-01-28 13:17         ` Thorsten Leemhuis
2022-01-10 16:21     ` Jakub Kicinski
2022-01-10 14:54   ` Lukas Bulwahn
2022-01-10 13:15 ` Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support") #forregzbot Thorsten Leemhuis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).