linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	ast@kernel.org, axboe@kernel.dk, bfields@fieldses.org,
	bridge@lists.linux-foundation.org, chainsaw@gentoo.org,
	christian.brauner@ubuntu.com, chuck.lever@oracle.com,
	davem@davemloft.net, dhowells@redhat.com,
	gregkh@linuxfoundation.org, jarkko.sakkinen@linux.intel.com,
	jmorris@namei.org, josh@joshtriplett.org, keescook@chromium.org,
	keyrings@vger.kernel.org, kuba@kernel.org,
	lars.ellenberg@linbit.com, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	nikolay@cumulusnetworks.com, philipp.reisner@linbit.com,
	ravenexp@gmail.com, roopa@cumulusnetworks.com, serge@hallyn.com,
	slyfox@gentoo.org, viro@zeniv.linux.org.uk,
	yangtiezhu@loongson.cn, netdev@vger.kernel.org,
	markward@linux.ibm.com, linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
Date: Wed, 24 Jun 2020 20:32:12 +0200	[thread overview]
Message-ID: <4d8fbcea-a892-3453-091f-d57c03f9aa90@de.ibm.com> (raw)
In-Reply-To: <4e27098e-ac8d-98f0-3a9a-ea25242e24ec@de.ibm.com>



On 24.06.20 20:09, Christian Borntraeger wrote:
> 
> 
> On 24.06.20 19:58, Christian Borntraeger wrote:
>>
>>
>> On 24.06.20 18:09, Luis Chamberlain wrote:
>>> On Wed, Jun 24, 2020 at 05:54:46PM +0200, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 24.06.20 16:43, Christoph Hellwig wrote:
>>>>> On Wed, Jun 24, 2020 at 01:11:54PM +0200, Christian Borntraeger wrote:
>>>>>> Does anyone have an idea why "umh: fix processed error when UMH_WAIT_PROC is used" breaks the
>>>>>> linux-bridge on s390?
>>>>>
>>>>> Are we even sure this is s390 specific and doesn't happen on other
>>>>> architectures with the same bridge setup?
>>>>
>>>> Fair point. AFAIK nobody has tested this yet on x86.
>>>
>>> Regardless, can you enable dynamic debug prints, to see if the kernel
>>> reveals anything on the bridge code which may be relevant:
>>>
>>> echo "file net/bridge/* +p" > /sys/kernel/debug/dynamic_debug/control
>>>
>>>   Luis
>>
>> When I start a guest the following happens with the patch:
>>
>> [   47.420237] virbr0: port 2(vnet0) entered blocking state
>> [   47.420242] virbr0: port 2(vnet0) entered disabled state
>> [   47.420315] device vnet0 entered promiscuous mode
>> [   47.420365] virbr0: port 2(vnet0) event 16
>> [   47.420366] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   47.420373] virbr0: toggle option: 12 state: 0 -> 0
>> [   47.420536] virbr0: port 2(vnet0) entered blocking state
>> [   47.420538] virbr0: port 2(vnet0) event 16
>> [   47.420539] virbr0: br_fill_info event 16 port vnet0 master virbr0
>>
>> and the nothing happens.
>>
>>
>> without the patch
>> [   33.805410] virbr0: hello timer expired
>> [   35.805413] virbr0: hello timer expired
>> [   36.184349] virbr0: port 2(vnet0) entered blocking state
>> [   36.184353] virbr0: port 2(vnet0) entered disabled state
>> [   36.184427] device vnet0 entered promiscuous mode
>> [   36.184479] virbr0: port 2(vnet0) event 16
>> [   36.184480] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   36.184487] virbr0: toggle option: 12 state: 0 -> 0
>> [   36.184636] virbr0: port 2(vnet0) entered blocking state
>> [   36.184638] virbr0: port 2(vnet0) entered listening state
>> [   36.184639] virbr0: port 2(vnet0) event 16
>> [   36.184640] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   36.184645] virbr0: port 2(vnet0) event 16
>> [   36.184646] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   37.805478] virbr0: hello timer expired
>> [   38.205413] virbr0: port 2(vnet0) forward delay timer
>> [   38.205414] virbr0: port 2(vnet0) entered learning state
>> [   38.205427] virbr0: port 2(vnet0) event 16
>> [   38.205430] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   38.765414] virbr0: port 2(vnet0) hold timer expired
>> [   39.805415] virbr0: hello timer expired
>> [   40.285410] virbr0: port 2(vnet0) forward delay timer
>> [   40.285411] virbr0: port 2(vnet0) entered forwarding state
>> [   40.285418] virbr0: topology change detected, propagating
>> [   40.285420] virbr0: decreasing ageing time to 400
>> [   40.285427] virbr0: port 2(vnet0) event 16
>> [   40.285432] virbr0: br_fill_info event 16 port vnet0 master virbr0
>> [   40.765408] virbr0: port 2(vnet0) hold timer expired
>> [   41.805415] virbr0: hello timer expired
>> [   42.765426] virbr0: port 2(vnet0) hold timer expired
>> [   43.805425] virbr0: hello timer expired
>> [   44.765426] virbr0: port 2(vnet0) hold timer expired
>> [   45.805418] virbr0: hello timer expired
>>
>> and continuing....
> 
> Just reverting the umh.c parts like this makes the problem go away.
> 
> diff --git a/kernel/umh.c b/kernel/umh.c
> index f81e8698e36e..79f139a7ca03 100644
> --- a/kernel/umh.c
> +++ b/kernel/umh.c
> @@ -154,8 +154,8 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>                  * the real error code is already in sub_info->retval or
>                  * sub_info->retval is 0 anyway, so don't mess with it then.
>                  */
> -               if (KWIFEXITED(ret))
> -                       sub_info->retval = KWEXITSTATUS(ret);
> +               if (ret)
> +                       sub_info->retval = ret;
>         }
>  
>         /* Restore default kernel sig handler */

I instrumented this:

[    5.118528] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[    9.409235] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.114914] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.116308] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.117690] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.118732] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.119899] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.121365] ret=0x100 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x1 
[   10.128044] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.945814] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.962799] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.983755] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   10.992705] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.118877] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.124359] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.129364] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.142139] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.148385] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.153686] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.158861] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.164068] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.192445] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.200605] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.208493] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.216612] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.226467] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.525363] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.532758] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.535279] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.697660] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.701634] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.818652] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   11.836228] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   12.082266] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 
[   40.965898] ret=0x0 KWIFEXITED(ret) = 0x1 KWEXITSTATUS(ret)= 0x0 

So the translations look correct. But your change is actually a sematic change
if(ret) will only trigger if there is an error
if (KWIFEXITED(ret)) will always trigger when the process ends. So we will always overwrite -ECHILD
and we did not do it before. 

  reply	other threads:[~2020-06-24 18:32 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-10 15:49 [PATCH 0/5] kmod/umh: a few fixes Luis R. Rodriguez
2020-06-10 15:49 ` [PATCH 1/5] selftests: kmod: Use variable NAME in kmod_test_0001() Luis R. Rodriguez
2020-06-10 15:49 ` [PATCH 2/5] kmod: Remove redundant "be an" in the comment Luis R. Rodriguez
2020-06-10 15:49 ` [PATCH 3/5] test_kmod: Avoid potential double free in trigger_config_run_type() Luis R. Rodriguez
2020-06-10 15:49 ` [PATCH 4/5] umh: fix processed error when UMH_WAIT_PROC is used Luis R. Rodriguez
2020-06-23 14:11   ` linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected) Christian Borntraeger
2020-06-23 14:23     ` Christian Borntraeger
2020-06-24 11:11       ` Christian Borntraeger
2020-06-24 12:05         ` Luis Chamberlain
2020-06-24 13:17           ` Luis Chamberlain
2020-06-24 16:13             ` Luis Chamberlain
2020-06-24 14:43         ` Christoph Hellwig
2020-06-24 15:54           ` Christian Borntraeger
2020-06-24 16:09             ` Luis Chamberlain
2020-06-24 17:58               ` Christian Borntraeger
2020-06-24 18:09                 ` Christian Borntraeger
2020-06-24 18:32                   ` Christian Borntraeger [this message]
2020-06-24 18:37                     ` Christian Borntraeger
2020-06-25 13:26                       ` Christian Borntraeger
2020-06-26  2:54                       ` Luis Chamberlain
2020-06-26  5:22                         ` Christian Borntraeger
2020-06-26  9:00                           ` Christoph Hellwig
2020-06-26 11:40                             ` Luis Chamberlain
2020-06-26 11:50                               ` Luis Chamberlain
2020-06-30 17:57                         ` Luis Chamberlain
2020-07-01 10:08                           ` Christian Borntraeger
2020-07-01 13:24                             ` Tetsuo Handa
2020-07-01 13:53                               ` Luis Chamberlain
2020-07-01 14:08                                 ` Tetsuo Handa
2020-07-01 15:38                                   ` Luis Chamberlain
2020-07-01 15:48                                     ` Christian Borntraeger
2020-07-01 15:58                                       ` Luis Chamberlain
2020-07-01 16:01                                         ` Christian Borntraeger
2020-07-02  4:26                                     ` Tetsuo Handa
2020-07-02 19:46                                       ` Luis Chamberlain
2020-07-03  0:52                                         ` Tetsuo Handa
2020-07-03 13:28                                           ` Luis Chamberlain
2020-07-01 15:26                                 ` Christian Borntraeger
2020-07-01 13:46                             ` Luis Chamberlain
2020-06-10 15:49 ` [PATCH 5/5] selftests: simplify kmod failure value Luis R. Rodriguez
2020-06-18  0:43 ` [PATCH 0/5] kmod/umh: a few fixes Andrew Morton
2020-06-19 20:46   ` Luis Chamberlain
2020-06-19 21:07     ` Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4d8fbcea-a892-3453-091f-d57c03f9aa90@de.ibm.com \
    --to=borntraeger@de.ibm.com \
    --cc=ast@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bfields@fieldses.org \
    --cc=bridge@lists.linux-foundation.org \
    --cc=chainsaw@gentoo.org \
    --cc=christian.brauner@ubuntu.com \
    --cc=chuck.lever@oracle.com \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=jarkko.sakkinen@linux.intel.com \
    --cc=jmorris@namei.org \
    --cc=josh@joshtriplett.org \
    --cc=keescook@chromium.org \
    --cc=keyrings@vger.kernel.org \
    --cc=kuba@kernel.org \
    --cc=lars.ellenberg@linbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=markward@linux.ibm.com \
    --cc=mcgrof@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=philipp.reisner@linbit.com \
    --cc=ravenexp@gmail.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=serge@hallyn.com \
    --cc=slyfox@gentoo.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yangtiezhu@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).