linux-kselftest.vger.kernel.org archive mirror
From: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
To: netdev@vger.kernel.org, linux-kselftest@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Shuah Khan <shuah@kernel.org>, Eric Dumazet <edumazet@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: BUG: selftest/net/tun: Hang in unregister_netdevice
Date: Tue, 14 Mar 2023 17:00:47 +0100
Message-ID: <27769d34-521c-f0ef-b6c2-6bd452e4f9bf@alu.unizg.hr>
In-Reply-To: <d7a64812-73db-feb2-e6d6-e1d8c09a6fed@alu.unizg.hr>

On 3/14/23 14:52, Mirsad Todorovac wrote:
> On 3/14/23 12:45, Mirsad Todorovac wrote:
>> Hi, all!
>>
>> After running tools/testing/selftests/net/tun, there seems to be some kind of hang
>> in test "FAIL  tun.reattach_delete_close" or "FAIL  tun.reattach_close_delete".
>>
>> Two tests exit by timeout, but the processes left are unkillable, even with kill -9 PID:
>>
>> [root@pc-mtodorov linux_torvalds]# ps -ef | grep tun
>> root        1140       1  0 12:16 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
>> root        1333       1  0 12:16 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
>> root        3930    2309  0 12:20 pts/1    00:00:00 tools/testing/selftests/net/tun
>> root        3952    2309  0 12:21 pts/1    00:00:00 tools/testing/selftests/net/tun
>> root        4056    3765  0 12:25 pts/1    00:00:00 grep --color=auto tun
>> [root@pc-mtodorov linux_torvalds]# kill -9 3930 3952
>> [root@pc-mtodorov linux_torvalds]# ps -ef | grep tun
>> root        1140       1  0 12:16 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
>> root        1333       1  0 12:16 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
>> root        3930    2309  0 12:20 pts/1    00:00:00 tools/testing/selftests/net/tun
>> root        3952    2309  0 12:21 pts/1    00:00:00 tools/testing/selftests/net/tun
>> root        4060    3765  0 12:25 pts/1    00:00:00 grep --color=auto tun
>> [root@pc-mtodorov linux_torvalds]#
>>
>> The kernel seems to be stuck in a loop, filling the log with the
>> following messages until reboot. The reboot itself also waits a very long
>> time for the situation to time out, which apparently never happens.
>>
>> Mar 14 11:54:09 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
>> Mar 14 11:54:19 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
>> Mar 14 11:54:29 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
>> Mar 14 11:54:40 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
>> Mar 14 11:54:50 pc-mtodorov kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = 3
>>
>> The platform is kernel 6.3.0-rc2 on AlmaLinux 8.7 and a LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB
>> (lshw output attached).
>>
>> The .config is here:
>>
>> https://domac.alu.hr/~mtodorov/linux/selftests/net-tun/config-6.3.0-rc2-mg-andy-devres-00006-gfc89d7fb499b
>>
>> Basically, it is a vanilla Torvalds tree kernel with MGLRU, KMEMLEAK, and CONFIG_DEBUG_KOBJECT enabled,
>> and with a devres patch applied.
>>
>> Please find the strace of the net/tun run attached.
>>
>> I am available for additional diagnostics.
> 
> Hi, again!
> 
> I kept busy while waiting for a reply and wondered how a vanilla kernel
> would fare in the test, considering that I've been testing a number of
> patches lately.
> 
> I did a fresh git clone from the repo and, whoa.
> 
> Surprisingly, the test with CONFIG_DEBUG_KOBJECT turned off passes:
> 
> [root@pc-mtodorov linux_torvalds]# tools/testing/selftests/net/tun
> TAP version 13
> 1..5
> # Starting 5 tests from 1 test cases.
> #  RUN           tun.delete_detach_close ...
> #            OK  tun.delete_detach_close
> ok 1 tun.delete_detach_close
> #  RUN           tun.detach_delete_close ...
> #            OK  tun.detach_delete_close
> ok 2 tun.detach_delete_close
> #  RUN           tun.detach_close_delete ...
> #            OK  tun.detach_close_delete
> ok 3 tun.detach_close_delete
> #  RUN           tun.reattach_delete_close ...
> #            OK  tun.reattach_delete_close
> ok 4 tun.reattach_delete_close
> #  RUN           tun.reattach_close_delete ...
> #            OK  tun.reattach_close_delete
> ok 5 tun.reattach_close_delete
> # PASSED: 5 / 5 tests passed.
> # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0
> [root@pc-mtodorov linux_torvalds]#
> 
> So there are no unkillable hung processes now.
> 
> If you think it is worth exploring the lockup that occurs with
> CONFIG_DEBUG_KOBJECT=y, I will rebuild once again with these options
> turned on, to clear up any doubts.

Confirmed.

With the sole difference of:

[marvin@pc-mtodorov linux_torvalds]$ grep KOBJECT /boot/config-6.3.0-rc2-vanilla-00006-gfc89d7fb499b
CONFIG_DEBUG_KOBJECT=y
CONFIG_DEBUG_KOBJECT_RELEASE=y
# CONFIG_SAMPLE_KOBJECT is not set
[marvin@pc-mtodorov linux_torvalds]$

we again get unkillable processes:

[root@pc-mtodorov linux_torvalds]# ps -ef | grep tun
root        1157       1  0 16:44 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
root        1331       1  0 16:44 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
root        3479    2315  0 16:45 pts/1    00:00:00 tools/testing/selftests/net/tun
root        3512    2315  0 16:45 pts/1    00:00:00 tools/testing/selftests/net/tun
root        4091    3364  0 16:49 pts/1    00:00:00 grep --color=auto tun
[root@pc-mtodorov linux_torvalds]# kill -9 3479 3512
[root@pc-mtodorov linux_torvalds]# ps -ef | grep tun
root        1157       1  0 16:44 ?        00:00:00 /bin/bash /usr/sbin/ksmtuned
root        1331       1  0 16:44 ?        00:00:01 /usr/libexec/platform-python -Es /usr/sbin/tuned -l -P
root        3479    2315  0 16:45 pts/1    00:00:00 tools/testing/selftests/net/tun
root        3512    2315  0 16:45 pts/1    00:00:00 tools/testing/selftests/net/tun
root        4095    3364  0 16:50 pts/1    00:00:00 grep --color=auto tun
[root@pc-mtodorov linux_torvalds]#
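
A minimal sketch of how the state of such processes could be inspected,
assuming they sit in uninterruptible sleep (state D), which the transcript
above does not show. The helper names proc_state and report_stuck and the
PROC_ROOT override are hypothetical; reading /proc/PID/stack assumes root
and a kernel built with stack trace support.

```shell
#!/bin/sh
# Root of the proc filesystem; overridable for testing (hypothetical knob).
PROC_ROOT=${PROC_ROOT:-/proc}

# Print the state letter (R, S, D, Z, ...) of a PID, taken from the
# "State:" line of PROC_ROOT/PID/status.
proc_state() {
    awk '/^State:/ { print $2; exit }' "$PROC_ROOT/$1/status"
}

# For each PID given, report those in uninterruptible sleep (state D)
# and, where readable, dump their kernel stack.
report_stuck() {
    for pid in "$@"; do
        # Skip PIDs that no longer exist.
        state=$(proc_state "$pid" 2>/dev/null) || continue
        if [ "$state" = "D" ]; then
            echo "$pid is in uninterruptible sleep (state D)"
            # The stack file may be missing or unreadable; ignore that.
            cat "$PROC_ROOT/$pid/stack" 2>/dev/null || true
        fi
    done
}
```

With the PIDs from the listing above this would be run as
"report_stuck 3479 3512". A task in state D does not act on SIGKILL until
it leaves the kernel wait, which would be consistent with kill -9 having
no visible effect.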

The kernel command line (/proc/cmdline) may also be relevant:

[root@pc-mtodorov linux_torvalds]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt5)/vmlinuz-6.3.0-rc2-vanilla-00006-gfc89d7fb499b root=/dev/mapper/almalinux_desktop--mtodorov-root ro 
crashkernel=auto resume=/dev/mapper/almalinux_desktop--mtodorov-swap rd.lvm.lv=almalinux_desktop-mtodorov/root 
rd.lvm.lv=almalinux_desktop-mtodorov/swap loglevel=7 i915.alpha_support=1 debug devres.log=1
[root@pc-mtodorov linux_torvalds]#

After a while, the kernel messages start looping:

  kernel:unregister_netdevice: waiting for tap0 to become free. Usage count = 3

Message from syslogd@pc-mtodorov at Mar 14 16:57:15 ...
  kernel:unregister_netdevice: waiting for tap0 to become free. Usage count = 3

Message from syslogd@pc-mtodorov at Mar 14 16:57:24 ...
  kernel:unregister_netdevice: waiting for tap0 to become free. Usage count = 3

Message from syslogd@pc-mtodorov at Mar 14 16:57:26 ...
  kernel:unregister_netdevice: waiting for tap0 to become free. Usage count = 3

This hangs the processes until a very late stage of shutdown.
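
To quantify the looping message, the kernel log could be summarized per
device and usage count. The following is only a sketch: the function name
summarize_unregister_waits is hypothetical, and it assumes the message
format is exactly the one quoted above.

```shell
#!/bin/sh
# Read kernel log lines on stdin. For every device and usage count that
# appears in an "unregister_netdevice: waiting for ..." message, print
# how many times that message occurred.
summarize_unregister_waits() {
    sed -n 's/.*unregister_netdevice: waiting for \([^ ]*\) to become free\. Usage count = \([0-9]*\).*/\1 \2/p' |
        sort | uniq -c |
        awk '{ print $2, "usage_count=" $3, "messages=" $1 }'
}
```

It can be fed from the kernel ring buffer, for example
"dmesg | summarize_unregister_waits". A usage count that stays the same
across many messages suggests that the remaining references are never
dropped rather than released slowly.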

I can confirm that CONFIG_DEBUG_{KOBJECT,KOBJECT_RELEASE}=y were the only changes
to .config between the two builds.
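
One way such a claim could be checked is to diff the two config files
after sorting them. This is a sketch only; the function name config_diff
is hypothetical, and it uses plain grep, sort and diff (the kernel tree
also ships scripts/diffconfig for the same purpose).

```shell
#!/bin/sh
# Compare two kernel config files, looking only at option lines
# ("CONFIG_FOO=..." and "# CONFIG_FOO is not set"), so that ordering and
# header comments do not show up as differences.
# Returns the exit status of diff: 0 if identical, 1 if different.
config_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | sort > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Called as "config_diff OLD_CONFIG NEW_CONFIG" (both paths are
placeholders), lines prefixed with "<" come from the first file and lines
prefixed with ">" from the second.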

Best regards,
Mirsad

-- 
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
