From: ebiederm@xmission.com (Eric W. Biederman)
To: Boris Sukholitko <boris.sukholitko@broadcom.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, keescook@chromium.org,
	yzaikin@google.com
Subject: Re: [PATCH] __register_sysctl_table: do not drop subdir
Date: Thu, 28 May 2020 09:04:02 -0500
Message-ID: <874ks02m25.fsf@x220.int.ebiederm.org>
In-Reply-To: <20200528080812.GA21974@noodle> (Boris Sukholitko's message of "Thu, 28 May 2020 11:08:12 +0300")

Boris Sukholitko <boris.sukholitko@broadcom.com> writes:

> On Wed, May 27, 2020 at 12:58:05PM +0000, Luis Chamberlain wrote:
>> Eric, since you authored the code which this code claims to fix, your
>> review would be appreciated.
>> 
>> On Wed, May 27, 2020 at 01:48:48PM +0300, Boris Sukholitko wrote:
>> > Successful get_subdir returns dir with its header.nreg properly
>> > adjusted. No need to drop the dir in that case.
>> 
>> This commit log is not that clear to me. Can you explain what happens
>> without this patch, and how critical it is to fix it? How did you
>> notice this issue?
>
> Apologies for being too terse with my explanation. I'll try to expand
> below.
>
> In testing of our kernel (based on 4.19, tainted, sorry!) on our
> aarch64-based hardware we've come upon the following oops (lightly
> edited to omit irrelevant details):

How does your 4.19 proc_sysctl.c compare with the latest proc_sysctl.c?
Have you backported all of the most recent bug fixes?

> 000:50:01.133 Unable to handle kernel paging request at virtual address 0000000000007a12
> 000:50:02.209 Process brctl (pid: 14467, stack limit = 0x00000000bcf7a578)
> 000:50:02.209 CPU: 1 PID: 14467 Comm: brctl Tainted: P                  4.19.122 #1
> 000:50:02.209 Hardware name: Broadcom-v8A (DT)
> 000:50:02.209 pstate: 60000005 (nZCv daif -PAN -UAO)
> 000:50:02.209 pc : unregister_sysctl_table+0x1c/0xa0
> 000:50:02.209 lr : unregister_net_sysctl_table+0xc/0x20
> 000:50:02.209 sp : ffffff800e5ab9e0
> 000:50:02.209 x29: ffffff800e5ab9e0 x28: ffffffc016439ec0 
> 000:50:02.209 x27: 0000000000000000 x26: ffffff8008804078 
> 000:50:02.209 x25: ffffff80087b4dd8 x24: ffffffc015d65000 
> 000:50:02.209 x23: ffffffc01f0d6010 x22: ffffffc01f0d6000 
> 000:50:02.209 x21: ffffffc0166c4eb0 x20: 00000000000000bd 
> 000:50:02.209 x19: ffffffc01f0d6030 x18: 0000000000000400 
> 000:50:02.256 x17: 0000000000000000 x16: 0000000000000000 
> 000:50:02.256 x15: 0000000000000400 x14: 0000000000000129 
> 000:50:02.256 x13: 0000000000000001 x12: 0000000000000030 
> 000:50:02.256 x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
> 000:50:02.256 x9 : feff646663687161 x8 : ffffffffffffffff 
> 000:50:02.256 x7 : fefefefefefefefe x6 : 0000000000008080 
> 000:50:02.256 x5 : 00000000ffffffff x4 : ffffff8008905c38 
> 000:50:02.256 x3 : ffffffc01f0d602c x2 : 00000000000000bd 
> 000:50:02.256 x1 : ffffffc01f0d60c0 x0 : 0000000000007a12 
> 000:50:02.256 Call trace:
> 000:50:02.256  unregister_sysctl_table+0x1c/0xa0
> 000:50:02.256  unregister_net_sysctl_table+0xc/0x20
> 000:50:02.256  __devinet_sysctl_unregister.isra.0+0x2c/0x60
> 000:50:02.256  inetdev_event+0x198/0x510
> 000:50:02.256  notifier_call_chain+0x58/0xa0
> 000:50:02.303  raw_notifier_call_chain+0x14/0x20
> 000:50:02.303  call_netdevice_notifiers_info+0x34/0x80
> 000:50:02.303  rollback_registered_many+0x384/0x600
> 000:50:02.303  unregister_netdevice_queue+0x8c/0x110
> 000:50:02.303  br_dev_delete+0x8c/0xa0
> 000:50:02.303  br_del_bridge+0x44/0x70
> 000:50:02.303  br_ioctl_deviceless_stub+0xcc/0x310
> 000:50:02.303  sock_ioctl+0x194/0x3f0
> 000:50:02.303  compat_sock_ioctl+0x678/0xc00
> 000:50:02.303  __arm64_compat_sys_ioctl+0xf0/0xcb0
> 000:50:02.303  el0_svc_common+0x70/0x170
> 000:50:02.303  el0_svc_compat_handler+0x1c/0x30
> 000:50:02.303  el0_svc_compat+0x8/0x18
> 000:50:02.303 Code: a90153f3 aa0003f3 f9401000 b40000c0 (f9400001) 
>
> The crash is in the call to count_subheaders(header->ctl_table_arg).
>
> Although the header (being in x19 == 0xffffffc01f0d6030) looks like a
> normal kernel pointer, ctl_table_arg (x0 == 0x0000000000007a12) looks
> invalid.
>
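
The "Code:" line of the oops can be checked against this reading. Below is
a minimal sketch in Python (not kernel code) that decodes only the 64-bit
A64 "LDR (immediate, unsigned offset)" encoding and skips every other
instruction word. The helper name decode_ldr_x_uimm is made up for this
illustration, and that offset 32 in the header is ctl_table_arg is taken
from the analysis above, not verified here.

```python
def decode_ldr_x_uimm(word):
    """Decode a 64-bit A64 LDR (immediate, unsigned offset).

    Returns (rt, rn, byte_offset), or None if the word is some other
    instruction. Only this single encoding is handled.
    """
    if word & 0xFFC00000 != 0xF9400000:
        return None
    imm12 = (word >> 10) & 0xFFF   # scaled by 8 for 64-bit loads
    rn = (word >> 5) & 0x1F        # base register
    rt = word & 0x1F               # destination register
    return rt, rn, imm12 * 8


# Instruction words from the "Code:" line; the last one (shown in
# parentheses in the oops) is the faulting instruction.
words = [0xA90153F3, 0xAA0003F3, 0xF9401000, 0xB40000C0, 0xF9400001]

for w in words:
    decoded = decode_ldr_x_uimm(w)
    if decoded is not None:
        rt, rn, off = decoded
        print(f"{w:08x}: ldr x{rt}, [x{rn}, #{off}]")
```

Two words decode as loads: f9401000 is "ldr x0, [x0, #32]" and the faulting
f9400001 is "ldr x1, [x0, #0]". A load through x0 right after x0 was
fetched from offset 32 of the header fits the fault address matching
x0 == 0x7a12.
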
> Trying to find the issue, we've started tracing header allocation being
> done by kzalloc in __register_sysctl_table and header freeing being done
> in drop_sysctl_table.
>
> Then we've noticed headers being freed which were not allocated before.
> The faulty freeing was done on parent->header at the end of
> drop_sysctl_table.
>
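
This kind of alloc/free tracing can be post-processed mechanically. A
minimal sketch, assuming a hypothetical printk format of the form
"sysctl-trace: alloc <hex address>" and "sysctl-trace: free <hex address>"
(the actual trace lines used are not shown in this thread):

```python
import re

# Hypothetical trace format, e.g. "sysctl-trace: alloc ffffffc01f0d6030".
LINE = re.compile(r"sysctl-trace: (alloc|free) ([0-9a-f]+)")


def unmatched_frees(lines):
    """Return addresses that were freed without a preceding traced alloc."""
    live = set()
    bad = []
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue  # unrelated log line
        op, addr = m.group(1), int(m.group(2), 16)
        if op == "alloc":
            live.add(addr)
        elif addr in live:
            live.remove(addr)
        else:
            bad.append(addr)
    return bad
```

Note that such a check only flags frees of objects whose allocation site
was not traced. Whether each flagged free is actually faulty still has to
be decided by looking at where that object was allocated.
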
> From this we've started to suspect some infelicity in header.nreg
> refcounting, thus leading us to the __register_sysctl_table fix in the
> patch.
>
> Here is a more detailed explanation of the fix.
>
> The current __register_sysctl_table logic looks like:
>
> 1. We start with some root dir, incrementing its header.nreg.
>
> 2. Then we find suitable dir using get_subdir function.
>
> 3. get_subdir decrements nreg on the parent dir and increments it on the
>    dir being returned. See found label there.
>
> 4. We decrement dir's header.nreg for the symmetry with step 1.
>
> IMHO, the bug is in step 4. If another dir is being returned by
> get_subdir we decrement its nreg. I.e. the returned dir's nreg stays 1
> despite having children added to it.
>
> This leads eventually to the innocent parent header being freed.
>

But the insertion of children in insert_header also increases the count
so it does not look like that should be true.
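
The two views can be compared with a toy model. This is a sketch in
Python, not the kernel code: it follows steps 1-4 as described above and
adds the increment in insert_header. The Dir class, the starting count of
0, and the drop_at_end switch are assumptions made for this illustration
only.

```python
class Dir:
    """Toy stand-in for a sysctl directory; nreg counts references to it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.nreg = 0        # assumption: only references taken below count
        self.subdirs = {}


def insert_header(parent):
    # Inserting a child takes a reference on its directory.
    parent.nreg += 1


def drop(d):
    d.nreg -= 1


def get_subdir(parent, name):
    sub = parent.subdirs.get(name)
    if sub is None:
        sub = Dir(name, parent)
        parent.subdirs[name] = sub
        insert_header(parent)    # a new subdir is itself a child of parent
    sub.nreg += 1                # step 3: reference moves to the returned dir
    drop(parent)                 # step 3: ... and off the parent
    return sub


def register(root, path, drop_at_end=True):
    d = root
    d.nreg += 1                  # step 1
    for name in path.split("/"):
        d = get_subdir(d, name)  # steps 2 and 3
    insert_header(d)             # the new table pins its directory
    if drop_at_end:
        drop(d)                  # step 4
    return d


root = Dir("root")
leaf = register(root, "net/ipv4")
leaf = register(root, "net/ipv4")
print(root.nreg, leaf.parent.nreg, leaf.nreg)
```

In this model, with step 4 kept, every directory ends up with nreg equal
to its number of children (1, 1 and 2 after the two registrations above).
With drop_at_end=False the leaf count grows by two per registration while
only one child is added. Whether the real code behaves like this model is
the open question here.
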

>> If you don't apply this patch what issue do you see?
>
> For some unexplained reason, the crashes are very rare and require
> stressing the system while creating and destroying network interfaces.
>
>> 
>> Do we test for it? Can we?
>> 
>
> With some printk tracing the issue is easy to see while doing simple
> brctl addbr / delbr to create and destroy bridge interface.
>
> Probably there is some SLUB debug option which may allow catching the
> faulty free.

I see some recent (within the last year) fixes to proc_sysctl.c in this
area.  Do you have those?  It looks like bridge up and down is stressing
this code.  Either those most recent fixes are wrong, your kernel is
missing them, or this needs some more investigation.

Eric

Thread overview: 9+ messages
2020-05-27 10:48 [PATCH] __register_sysctl_table: do not drop subdir Boris Sukholitko
2020-05-27 12:58 ` Luis Chamberlain
2020-05-28  8:08   ` Boris Sukholitko
2020-05-28 14:04     ` Eric W. Biederman [this message]
2020-05-28 14:20       ` Luis Chamberlain
2020-05-31 11:44         ` Boris Sukholitko
2020-06-01 13:17           ` Luis Chamberlain
2020-05-31 11:39       ` Boris Sukholitko
2020-06-03  1:07 ` [__register_sysctl_table] 4092a9304d: WARNING:at_fs/proc/proc_sysctl.c:#retire_sysctl_set kernel test robot
