linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
       [not found] <30808b0b-367a-266a-7ef4-de69c08e1319@internode.on.net>
@ 2019-12-09  1:49 ` Arthur Marsh
  2019-12-09  2:23   ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Arthur Marsh @ 2019-12-09  1:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: SCSI development list, Linux Kernel Mailing List, linux-cifs,
	James E.J. Bottomley



Arthur Marsh wrote on 12/5/19 2:14 PM:
> Hi, I came across a problem when un-mounting CIFS mounts that bisected 
> back to the commit "[ef2cc88e2a205b8a11a19e78db63a70d3728cdf5] Merge tag 
> 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi".


This still happens with 5.5.0-rc1:

[  301.679819] ------------[ cut here ]------------
[  301.680095] refcount_t: underflow; use-after-free.
[  301.680192] WARNING: CPU: 1 PID: 3569 at lib/refcount.c:28 
refcount_warn_saturate+0xb4/0xf3
[  301.680298] Modules linked in: md4 sha512_generic cmac hmac cifs 
libdes dns_resolver fscache libarc4 nfc rfkill cpufreq_conservative 
cpufreq_userspace cpufreq_powersave binfmt_misc nls_utf8 nls_cp437 vfat 
fat lm75 it87 hwmon_vid tun cuse fuse ib_iser rdma_cm iw_cm ib_cm 
ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 
parport_pc ppdev lp parport radeon ttm snd_hda_codec_realtek 
snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel 
snd_intel_dspcfg drm_kms_helper snd_hda_codec drm i2c_algo_bit 
snd_hda_core fb_sys_fops iTCO_wdt iTCO_vendor_support intel_powerclamp 
syscopyarea sysfillrect snd_hwdep sysimgblt snd_pcm evdev snd_timer 
pcspkr snd serio_raw rng_core soundcore button acpi_cpufreq ext4 crc16 
mbcache jbd2 sr_mod cdrom sg sd_mod ata_generic psmouse ata_piix tg3 
pata_it821x firewire_ohci uhci_hcd i2c_i801 lpc_ich firewire_core 
mfd_core crc_itu_t libata ehci_pci scsi_mod ehci_hcd usbcore usb_common 
ptp pps_core libphy
[  301.681361] CPU: 1 PID: 3569 Comm: umount Not tainted 5.5.0-rc1 #1692
[  301.681444] Hardware name:  /8I945P Pro, BIOS F10 04/06/2006
[  301.681521] EIP: refcount_warn_saturate+0xb4/0xf3
[  301.681586] Code: 24 f8 58 8b c1 e8 ab 91 d0 ff 0f 0b c9 c3 80 3d c3 
be 9d c1 00 75 91 c6 05 c3 be 9d c1 01 c7 04 24 4c 59 8b c1 e8 8b 91 d0 
ff <0f> 0b c9 c3 80 3d c1 be 9d c1 00 0f 85 6d ff ff ff c6 05 c1 be 9d
[  301.681807] EAX: 00000026 EBX: ed5ae6d8 ECX: 00000007 EDX: f62950cc
[  301.681889] ESI: ed5ae6e4 EDI: ed5d7000 EBP: ed503e2c ESP: ed503e28
[  301.681971] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010292
[  301.682057] CR0: 80050033 CR2: 0058f00c CR3: 2a20a000 CR4: 000006d0
[  301.682139] Call Trace:
[  301.682240]  close_shroot+0x97/0xda [cifs]
[  301.682351]  SMB2_tdis+0x7c/0x176 [cifs]
[  301.682456]  ? _get_xid+0x58/0x91 [cifs]
[  301.682563]  cifs_put_tcon.part.0+0x99/0x202 [cifs]
[  301.682637]  ? ida_free+0x99/0x10a
[  301.682727]  ? cifs_umount+0x3d/0x9d [cifs]
[  301.682829]  cifs_put_tlink+0x3a/0x50 [cifs]
[  301.682929]  cifs_umount+0x44/0x9d [cifs]
[  301.683023]  cifs_kill_sb+0x16/0x19 [cifs]
[  301.683084]  deactivate_locked_super+0x28/0x5c
[  301.683147]  deactivate_super+0x30/0x33
[  301.683204]  cleanup_mnt+0x8a/0x103
[  301.683256]  __cleanup_mnt+0xb/0xd
[  301.683308]  task_work_run+0x6c/0x8b
[  301.683364]  exit_to_usermode_loop+0xbe/0xc0
[  301.683427]  do_fast_syscall_32+0x291/0x364
[  301.683493]  entry_SYSENTER_32+0xaa/0x102
[  301.683549] EIP: 0xb7fa9af1
[  301.683592] Code: 00 89 d3 5b 5e 5f 5d c3 b8 00 09 3d 00 eb c4 8b 04 
24 c3 8b 14 24 c3 8b 1c 24 c3 8b 34 24 c3 90 90 51 52 55 89 e5 0f 34 cd 
80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
[  301.683811] EAX: 00000000 EBX: 0058ccf0 ECX: 00000000 EDX: b7f12000
[  301.683892] ESI: bfe75280 EDI: b7f12000 EBP: bfe74168 ESP: bfe740fc
[  301.683973] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000286
[  301.687861] irq event stamp: 8202
[  301.691737] hardirqs last  enabled at (8201): [<c10c741c>] 
console_unlock+0x45a/0x598
[  301.695687] hardirqs last disabled at (8202): [<c100171f>] 
trace_hardirqs_off_thunk+0xc/0x1d
[  301.699620] softirqs last  enabled at (8132): [<c14f9770>] 
peernet2id+0x3c/0x56
[  301.703583] softirqs last disabled at (8130): [<c14f9756>] 
peernet2id+0x22/0x56
[  301.707502] ---[ end trace fbe0ba2a50b93358 ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  1:49 ` refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5 Arthur Marsh
@ 2019-12-09  2:23   ` Linus Torvalds
  2019-12-09  2:52     ` Al Viro
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Linus Torvalds @ 2019-12-09  2:23 UTC (permalink / raw)
  To: Arthur Marsh
  Cc: SCSI development list, Linux Kernel Mailing List, CIFS,
	James E.J. Bottomley

On Sun, Dec 8, 2019 at 5:49 PM Arthur Marsh
<arthur.marsh@internode.on.net> wrote:
>
> This still happens with 5.5.0-rc1:

Does it happen 100% of the time?

Your bisection result looks pretty nonsensical - not that it's
impossible (anything is possible), but it really doesn't look very
likely. Which makes me think maybe it's slightly timing-sensitive or
something?

Would you mind trying to re-do the bisection, and for each kernel try
the mount thing at least a few times before you decide a kernel is
good?

Bisection is very powerful, but if _any_ of the kernels you marked
good weren't really good (they just happened to not trigger the
problem), bisection ends up giving completely the wrong answer. And
with that bisection commit, there's not even a hint of what could have
gone wrong.

             Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  2:23   ` Linus Torvalds
@ 2019-12-09  2:52     ` Al Viro
  2019-12-09  3:10       ` Linus Torvalds
  2019-12-09  3:18     ` Steve French
  2019-12-09 14:45     ` Arthur Marsh
  2 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2019-12-09  2:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arthur Marsh, SCSI development list, Linux Kernel Mailing List,
	CIFS, James E.J. Bottomley

On Sun, Dec 08, 2019 at 06:23:02PM -0800, Linus Torvalds wrote:
> On Sun, Dec 8, 2019 at 5:49 PM Arthur Marsh
> <arthur.marsh@internode.on.net> wrote:
> >
> > This still happens with 5.5.0-rc1:
> 
> Does it happen 100% of the time?
> 
> Your bisection result looks pretty nonsensical - not that it's
> impossible (anything is possible), but it really doesn't look very
> likely. Which makes me think maybe it's slightly timing-sensitive or
> something?
> 
> Would you mind trying to re-do the bisection, and for each kernel try
> the mount thing at least a few times before you decide a kernel is
> good?
> 
> Bisection is very powerful, but if _any_ of the kernels you marked
> good weren't really good (they just happened to not trigger the
> problem), bisection ends up giving completely the wrong answer. And
> with that bisection commit, there's not even a hint of what could have
> gone wrong.

FWIW, the thing that is IME absolutely incompatible with bisection
is CONFIG_GCC_PLUGIN_RANDSTRUCT.  It can affect frequencies badly
enough, even in the cases when the bug isn't directly dependent
upon that thing.

I suspect that nonsense bisects spewed by CI bots lately (bisect on
x86 oops ending up at commit limited to arch/parisc, etc.) are at
least partially due to that kind of garbage...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  2:52     ` Al Viro
@ 2019-12-09  3:10       ` Linus Torvalds
  2019-12-09  3:15         ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2019-12-09  3:10 UTC (permalink / raw)
  To: Al Viro
  Cc: Arthur Marsh, SCSI development list, Linux Kernel Mailing List,
	CIFS, James E.J. Bottomley

On Sun, Dec 8, 2019 at 6:52 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> FWIW, the thing that is IME absolutely incompatible with bisection
> is CONFIG_GCC_PLUGIN_RANDSTRUCT.  It can affect frequencies badly
> enough, even in the cases when the bug isn't directly dependent
> upon that thing.

It will easily affect timing in major ways, yes.

You're right that at least the CI bots might want to disable it for
bisecting. Or force a particular seed for RANDSTRUCT - I seem to
recall that there was some way to make it be at least repeatable for
any particular structure.

         Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  3:10       ` Linus Torvalds
@ 2019-12-09  3:15         ` Linus Torvalds
  0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2019-12-09  3:15 UTC (permalink / raw)
  To: Al Viro
  Cc: Arthur Marsh, SCSI development list, Linux Kernel Mailing List,
	CIFS, James E.J. Bottomley

On Sun, Dec 8, 2019 at 7:10 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> You're right that at least the CI bots might want to disable it for
> bisecting. Or force a particular seed for RANDSTRUCT - I seem to
> recall that there was some way to make it be at least repeatable for
> any particular structure.

Yeah, if you build in the same directory and don't do a "distclean" or
"git clean -dqfx", the seed remains in
scripts/gcc-plgins/randomize_layout_seed.h across builds, and the
result then _should_ be repeatable.

But yes, it's the kind of noise that is likely not worth fighting for
bisection (or any random bug hunting) at all, and turning off
RANDSTRUCT is probably the right thing to do.

But I don't think Arthur had RANDSTRUCT enabled, so that should be fine.

            Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  2:23   ` Linus Torvalds
  2019-12-09  2:52     ` Al Viro
@ 2019-12-09  3:18     ` Steve French
  2019-12-09  3:27       ` Steve French
  2019-12-09 14:45     ` Arthur Marsh
  2 siblings, 1 reply; 8+ messages in thread
From: Steve French @ 2019-12-09  3:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arthur Marsh, SCSI development list, Linux Kernel Mailing List,
	CIFS, James E.J. Bottomley, ronnie sahlberg

On Sun, Dec 8, 2019 at 8:23 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, Dec 8, 2019 at 5:49 PM Arthur Marsh
> <arthur.marsh@internode.on.net> wrote:
> >
> > This still happens with 5.5.0-rc1:
>
> Does it happen 100% of the time?

I can reproduce it (although it was a little more difficult since WiFi doesn't
work on RC1 on some of my hardware - due to the 802.11 driver regression oops.
I was able to reproduce it to Samba localhost).


> Your bisection result looks pretty nonsensical - not that it's
> impossible (anything is possible), but it really doesn't look very
> likely. Which makes me think maybe it's slightly timing-sensitive or
> something?

The bisection result is implausible.  I just did some experiments and
it looks far more likely is that it is related to commit
72e73c78c446e ("cifs: close the shared root handle on tree disconnect")
so added Ronnie to the cc.  That patch added a call (at unmount time)
to close_shroot.
The idea of that patch made sense - although tree disconnect (and then
logoff of the session)
will indirectly free any open handles on the server for that session,
it is a little
cleaner to close the cached root SMB3 file handle explicitly.

void close_shroot(struct cached_fid *cfid)
{
        mutex_lock(&cfid->fid_mutex);
        kref_put(&cfid->refcount, smb2_close_cached_fid);
        mutex_unlock(&cfid->fid_mutex);
}


Taking out the one line change in the patch from last week that calls
close_shroot from
umount (SMB2_tdis, ie tree_disconnect) I don't see the problem so far
more likely
that it is related to that commit.   The problem seems to be related
to servers which
don't support directory leases.  Will spin up a patch to fix this if
Ronnie hasn't already fixed it

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  3:18     ` Steve French
@ 2019-12-09  3:27       ` Steve French
  0 siblings, 0 replies; 8+ messages in thread
From: Steve French @ 2019-12-09  3:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arthur Marsh, SCSI development list, Linux Kernel Mailing List,
	CIFS, James E.J. Bottomley, ronnie sahlberg

Arthur,
If you want to avoid the issue which causes the oops you can remove
this one line change (see below) but I (or Ronnie) will post a patch
for this - but wanted to discuss with him briefly first.

# git show 72e73c78c446e
commit 72e73c78c446e3c009a29b017c7fa3d79463e2aa
Author: Ronnie Sahlberg <lsahlber@redhat.com>
Date:   Thu Nov 7 17:00:38 2019 +1000

    cifs: close the shared root handle on tree disconnect

    Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>

diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 05149862aea4..acb70f67efc9 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -1807,6 +1807,8 @@ SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon)
        if ((tcon->need_reconnect) || (tcon->ses->need_reconnect))
                return 0;

+       close_shroot(&tcon->crfid);
+
        rc = smb2_plain_req_init(SMB2_TREE_DISCONNECT, tcon, (void **) &req,
                             &total_len);
        if (rc)

On Sun, Dec 8, 2019 at 9:18 PM Steve French <smfrench@gmail.com> wrote:
>
> On Sun, Dec 8, 2019 at 8:23 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, Dec 8, 2019 at 5:49 PM Arthur Marsh
> > <arthur.marsh@internode.on.net> wrote:
> > >
> > > This still happens with 5.5.0-rc1:
> >
> > Does it happen 100% of the time?
>
> I can reproduce it (although it was a little more difficult since WiFi doesn't
> work on RC1 on some of my hardware - due to the 802.11 driver regression oops.
> I was able to reproduce it to Samba localhost).
>
>
> > Your bisection result looks pretty nonsensical - not that it's
> > impossible (anything is possible), but it really doesn't look very
> > likely. Which makes me think maybe it's slightly timing-sensitive or
> > something?
>
> The bisection result is implausible.  I just did some experiments and
> it looks far more likely is that it is related to commit
> 72e73c78c446e ("cifs: close the shared root handle on tree disconnect")
> so added Ronnie to the cc.  That patch added a call (at unmount time)
> to close_shroot.
> The idea of that patch made sense - although tree disconnect (and then
> logoff of the session)
> will indirectly free any open handles on the server for that session,
> it is a little
> cleaner to close the cached root SMB3 file handle explicitly.
>
> void close_shroot(struct cached_fid *cfid)
> {
>         mutex_lock(&cfid->fid_mutex);
>         kref_put(&cfid->refcount, smb2_close_cached_fid);
>         mutex_unlock(&cfid->fid_mutex);
> }
>
>
> Taking out the one line change in the patch from last week that calls
> close_shroot from
> umount (SMB2_tdis, ie tree_disconnect) I don't see the problem so far
> more likely
> that it is related to that commit.   The problem seems to be related
> to servers which
> don't support directory leases.  Will spin up a patch to fix this if
> Ronnie hasn't already fixed it



-- 
Thanks,

Steve

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5
  2019-12-09  2:23   ` Linus Torvalds
  2019-12-09  2:52     ` Al Viro
  2019-12-09  3:18     ` Steve French
@ 2019-12-09 14:45     ` Arthur Marsh
  2 siblings, 0 replies; 8+ messages in thread
From: Arthur Marsh @ 2019-12-09 14:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: SCSI development list, Linux Kernel Mailing List, CIFS,
	James E.J. Bottomley

Hi, I ran the last good kernel with several boot-up, cifs mount, un-mount, shut down cycles without encountering the problem.

After applying the patch from <ronniesahlberg@gmail.com>:

diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 0ab6b1200288..d2658f51ff60 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -1847,7 +1847,8 @@ SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon)
if ((tcon->need_reconnect) || (tcon->ses->need_reconnect))
return 0;

- close_shroot(&tcon->crfid);
+ if (tcon->crfid.is_valid)
+ close_shroot(&tcon->crfid);


 to kernel 5.5.0-rc1 I no longer experience the problem.

Regards,

Arthur. 

On 9 December 2019 12:53:02 pm ACDT, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Sun, Dec 8, 2019 at 5:49 PM Arthur Marsh
><arthur.marsh@internode.on.net> wrote:
>>
>> This still happens with 5.5.0-rc1:
>
>Does it happen 100% of the time?
>
>Your bisection result looks pretty nonsensical - not that it's
>impossible (anything is possible), but it really doesn't look very
>likely. Which makes me think maybe it's slightly timing-sensitive or
>something?
>
>Would you mind trying to re-do the bisection, and for each kernel try
>the mount thing at least a few times before you decide a kernel is
>good?
>
>Bisection is very powerful, but if _any_ of the kernels you marked
>good weren't really good (they just happened to not trigger the
>problem), bisection ends up giving completely the wrong answer. And
>with that bisection commit, there's not even a hint of what could have
>gone wrong.
>
>             Linus

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-12-09 14:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <30808b0b-367a-266a-7ef4-de69c08e1319@internode.on.net>
2019-12-09  1:49 ` refcount_t: underflow; use-after-free with CIFS umount after scsi-misc commit ef2cc88e2a205b8a11a19e78db63a70d3728cdf5 Arthur Marsh
2019-12-09  2:23   ` Linus Torvalds
2019-12-09  2:52     ` Al Viro
2019-12-09  3:10       ` Linus Torvalds
2019-12-09  3:15         ` Linus Torvalds
2019-12-09  3:18     ` Steve French
2019-12-09  3:27       ` Steve French
2019-12-09 14:45     ` Arthur Marsh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).