Linux-NFS Archive on lore.kernel.org
* [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
@ 2020-07-15 14:48 Kai-Heng Feng
  2020-07-15 15:02 ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Kai-Heng Feng @ 2020-07-15 14:48 UTC
  To: chuck.lever
  Cc: matthew.ruffell, linux-stable, linux-nfs,
	open list:NETWORKING DRIVERS, open list

Hi,

Multiple users have reported that NFS causes a NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".

The same issue happens on the upstream stable 5.4.y branch.
The mainline kernel doesn't have this issue, though.

Should we revert them? Or are there missing commits that need to be backported to v5.4?

[1] https://bugs.launchpad.net/bugs/1886277

Kai-Heng


* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-15 14:48 [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y Kai-Heng Feng
@ 2020-07-15 15:02 ` Chuck Lever
  2020-07-15 15:08   ` Kai-Heng Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-15 15:02 UTC
  To: Kai-Heng Feng
  Cc: matthew.ruffell, linux-stable, Linux NFS Mailing List,
	open list:NETWORKING DRIVERS, open list



> On Jul 15, 2020, at 10:48 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
> 
> Hi,
> 
> Multiple users have reported that NFS causes a NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
> 
> The same issue happens on the upstream stable 5.4.y branch.
> The mainline kernel doesn't have this issue, though.
> 
> Should we revert them? Or are there missing commits that need to be backported to v5.4?
> 
> [1] https://bugs.launchpad.net/bugs/1886277
> 
> Kai-Heng

31c9590ae468 ("SUNRPC: Add "@len" parameter to gss_unwrap()") is a refactoring
change. It shouldn't have introduced any behavior difference. But in theory,
practice and theory should be the same...

Check if 0a8e7b7d0846 ("SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")")
is also applied to 5.4.0-40-generic.

It would help to know if v5.5 stable is working for you. I haven't had any
problems with it.


--
Chuck Lever





* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-15 15:02 ` Chuck Lever
@ 2020-07-15 15:08   ` Kai-Heng Feng
  2020-07-15 15:14     ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Kai-Heng Feng @ 2020-07-15 15:08 UTC
  To: Chuck Lever
  Cc: matthew.ruffell, linux-stable, Linux NFS Mailing List,
	open list:NETWORKING DRIVERS, open list



> On Jul 15, 2020, at 23:02, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Jul 15, 2020, at 10:48 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
>> 
>> Hi,
>> 
>> Multiple users have reported that NFS causes a NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
>> 
>> The same issue happens on the upstream stable 5.4.y branch.
>> The mainline kernel doesn't have this issue, though.
>> 
>> Should we revert them? Or are there missing commits that need to be backported to v5.4?
>> 
>> [1] https://bugs.launchpad.net/bugs/1886277
>> 
>> Kai-Heng
> 
> 31c9590ae468 ("SUNRPC: Add "@len" parameter to gss_unwrap()") is a refactoring
> change. It shouldn't have introduced any behavior difference. But in theory,
> practice and theory should be the same...
> 
> Check if 0a8e7b7d0846 ("SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")")
> is also applied to 5.4.0-40-generic.

Yes, it's included. The commit is part of upstream stable 5.4.

> 
> It would help to know if v5.5 stable is working for you. I haven't had any
> problems with it.

I'll ask users to test it out.
Thanks for your quick reply!

Kai-Heng

> 
> 
> --
> Chuck Lever
> 
> 
> 



* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-15 15:08   ` Kai-Heng Feng
@ 2020-07-15 15:14     ` Chuck Lever
  2020-07-15 18:54       ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-15 15:14 UTC
  To: Kai-Heng Feng
  Cc: matthew.ruffell, linux-stable, Linux NFS Mailing List,
	open list:NETWORKING DRIVERS, open list



> On Jul 15, 2020, at 11:08 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
> 
>> On Jul 15, 2020, at 23:02, Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>>> On Jul 15, 2020, at 10:48 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Multiple users have reported that NFS causes a NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
>>> 
>>> The same issue happens on the upstream stable 5.4.y branch.
>>> The mainline kernel doesn't have this issue, though.
>>> 
>>> Should we revert them? Or are there missing commits that need to be backported to v5.4?
>>> 
>>> [1] https://bugs.launchpad.net/bugs/1886277
>>> 
>>> Kai-Heng
>> 
>> 31c9590ae468 ("SUNRPC: Add "@len" parameter to gss_unwrap()") is a refactoring
>> change. It shouldn't have introduced any behavior difference. But in theory,
>> practice and theory should be the same...
>> 
>> Check if 0a8e7b7d0846 ("SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")")
>> is also applied to 5.4.0-40-generic.
> 
> Yes, it's included. The commit is part of upstream stable 5.4.
> 
>> 
>> It would help to know if v5.5 stable is working for you. I haven't had any
>> problems with it.
> 
> I'll ask users to test it out.
> Thanks for your quick reply!

Another thought: Please ask what encryption type is in use. The
kerberos_v1 enctypes might exercise a code path I wasn't able to
test.


--
Chuck Lever





* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-15 15:14     ` Chuck Lever
@ 2020-07-15 18:54       ` Chuck Lever
  2020-07-16 18:40         ` Pierre Sauter
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-15 18:54 UTC
  To: Kai-Heng Feng
  Cc: matthew.ruffell, linux-stable, Linux NFS Mailing List,
	open list:NETWORKING DRIVERS, open list



> On Jul 15, 2020, at 11:14 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Jul 15, 2020, at 11:08 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
>> 
>>> On Jul 15, 2020, at 23:02, Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>>> On Jul 15, 2020, at 10:48 AM, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Multiple users have reported that NFS causes a NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
>>>> 
>>>> The same issue happens on the upstream stable 5.4.y branch.
>>>> The mainline kernel doesn't have this issue, though.
>>>> 
>>>> Should we revert them? Or are there missing commits that need to be backported to v5.4?
>>>> 
>>>> [1] https://bugs.launchpad.net/bugs/1886277
>>>> 
>>>> Kai-Heng
>>> 
>>> 31c9590ae468 ("SUNRPC: Add "@len" parameter to gss_unwrap()") is a refactoring
>>> change. It shouldn't have introduced any behavior difference. But in theory,
>>> practice and theory should be the same...
>>> 
>>> Check if 0a8e7b7d0846 ("SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")")
>>> is also applied to 5.4.0-40-generic.
>> 
>> Yes, it's included. The commit is part of upstream stable 5.4.
>> 
>>> 
>>> It would help to know if v5.5 stable is working for you. I haven't had any
>>> problems with it.
>> 
>> I'll ask users to test it out.
>> Thanks for your quick reply!
> 
> Another thought: Please ask what encryption type is in use. The
> kerberos_v1 enctypes might exercise a code path I wasn't able to
> test.

OK.

v5.4.40 does not have 31c9590ae468 and friends, but the claim is this
one crashes?

And v5.4.51 has those three and 89a3c9f5b9f0, which Pierre claims fixes
the problem for him; but another commenter says v5.4.51 still crashes.

So we're getting inconsistent problem reports.

Have the testers enable memory debugging: KASAN or SLUB debugging
might provide more information. I might have some time later this week
to try reproducing on upstream stable, but no guarantees.


--
Chuck Lever





* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-15 18:54       ` Chuck Lever
@ 2020-07-16 18:40         ` Pierre Sauter
  2020-07-16 19:25           ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Pierre Sauter @ 2020-07-16 18:40 UTC
  To: Chuck Lever
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner

Hi,

On 2020-07-15 20:54, Chuck Lever wrote:
> v5.4.40 does not have 31c9590ae468 and friends, but the claim is this
> one crashes?

To my knowledge 31c9590ae468 and friends are in v5.4.40.

> And v5.4.51 has those three and 89a3c9f5b9f0, which Pierre claims fixes
> the problem for him; but another commenter says v5.4.51 still crashes.

v5.4.51 still crashes for me (and AFAIK it does not have 89a3c9f5b9f0).
I applied 89a3c9f5b9f0 to the original v5.4.40, which mostly helps.

My krb5 etype is aes256-cts-hmac-sha1-96.

Below is the bug in the original v5.4.40 with KASAN enabled. It happened
immediately after mounting /home, no login necessary:

[   21.501730] ==================================================================
[   21.501756] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xe9/0x200 [sunrpc]
[   21.501759] Write of size 64 at addr ffff8883bc9f3244 by task update-desktop-/1478

[   21.501763] CPU: 0 PID: 1478 Comm: update-desktop- Tainted: G           OE     5.4.0-40-generic #44
[   21.501764] Hardware name: XXXXXXXXXXXXXXXXXXXXXX
[   21.501765] Call Trace:
[   21.501769]  dump_stack+0x96/0xca
[   21.501772]  print_address_description.constprop.0+0x20/0x210
[   21.501789]  ? _copy_from_pages+0xe9/0x200 [sunrpc]
[   21.501790]  __kasan_report.cold+0x1b/0x41
[   21.501806]  ? _copy_from_pages+0xe9/0x200 [sunrpc]
[   21.501807]  kasan_report+0x12/0x20
[   21.501809]  check_memory_region+0x129/0x1b0
[   21.501811]  memcpy+0x38/0x50
[   21.501825]  _copy_from_pages+0xe9/0x200 [sunrpc]
[   21.501839]  ? call_decode+0x2fd/0x7e0 [sunrpc]
[   21.501854]  ? __rpc_execute+0x204/0xbd0 [sunrpc]
[   21.501869]  xdr_shrink_pagelen+0x198/0x3c0 [sunrpc]
[   21.501871]  ? trailing_symlink+0x6fe/0x810
[   21.501886]  xdr_align_pages+0x15f/0x580 [sunrpc]
[   21.501904]  ? decode_setattr+0x120/0x120 [nfsv4]
[   21.501920]  xdr_read_pages+0x44/0x290 [sunrpc]
[   21.501935]  ? __decode_op_hdr+0x29/0x430 [nfsv4]
[   21.501949]  nfs4_xdr_dec_readlink+0x238/0x390 [nfsv4]
[   21.501963]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
[   21.501966]  ? __kasan_slab_free+0x14e/0x180
[   21.501970]  ? gss_validate+0x37e/0x610 [auth_rpcgss]
[   21.501972]  ? kasan_slab_free+0xe/0x10
[   21.501985]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
[   21.502001]  rpcauth_unwrap_resp_decode+0xaa/0x100 [sunrpc]
[   21.502005]  gss_unwrap_resp+0x99d/0x1570 [auth_rpcgss]
[   21.502009]  ? gss_destroy_cred+0x460/0x460 [auth_rpcgss]
[   21.502011]  ? finish_task_switch+0x163/0x670
[   21.502014]  ? __switch_to_asm+0x34/0x70
[   21.502017]  ? gss_wrap_req+0x830/0x830 [auth_rpcgss]
[   21.502020]  ? prepare_to_wait+0xea/0x2b0
[   21.502036]  rpcauth_unwrap_resp+0xac/0x100 [sunrpc]
[   21.502049]  call_decode+0x454/0x7e0 [sunrpc]
[   21.502063]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
[   21.502065]  ? var_wake_function+0x140/0x140
[   21.502078]  ? call_transmit_status+0x31e/0x5d0 [sunrpc]
[   21.502091]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
[   21.502106]  __rpc_execute+0x204/0xbd0 [sunrpc]
[   21.502119]  ? xprt_wait_for_reply_request_def+0x170/0x170 [sunrpc]
[   21.502134]  ? rpc_exit+0xc0/0xc0 [sunrpc]
[   21.502135]  ? __kasan_check_read+0x11/0x20
[   21.502137]  ? wake_up_bit+0x42/0x50
[   21.502151]  rpc_execute+0x1a0/0x1f0 [sunrpc]
[   21.502165]  rpc_run_task+0x454/0x5e0 [sunrpc]
[   21.502179]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
[   21.502192]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
[   21.502194]  ? __read_once_size_nocheck.constprop.0+0x10/0x10
[   21.502207]  ? nfs4_call_sync_custom+0x70/0x70 [nfsv4]
[   21.502209]  ? __kasan_check_read+0x11/0x20
[   21.502211]  ? rmqueue+0x397/0x2410
[   21.502225]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
[   21.502238]  ? _nfs4_proc_getdeviceinfo+0x350/0x350 [nfsv4]
[   21.502254]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
[   21.502268]  ? nfs4_proc_link+0x1c0/0x1c0 [nfsv4]
[   21.502271]  ? add_to_page_cache_locked+0x20/0x20
[   21.502285]  nfs_symlink_filler+0xdc/0x190 [nfs]
[   21.502287]  do_read_cache_page+0x60e/0x1490
[   21.502297]  ? nfs4_do_lookup_revalidate+0x1a1/0x2d0 [nfs]
[   21.502308]  ? nfs_get_link+0x370/0x370 [nfs]
[   21.502311]  ? xas_load+0x23/0x250
[   21.502312]  ? pagecache_get_page+0x760/0x760
[   21.502315]  ? lockref_get_not_dead+0xe3/0x1c0
[   21.502317]  ? __kasan_check_write+0x14/0x20
[   21.502318]  ? lockref_get_not_dead+0xe3/0x1c0
[   21.502320]  ? __kasan_check_write+0x14/0x20
[   21.502321]  ? _raw_spin_lock+0x7b/0xd0
[   21.502323]  ? _raw_write_trylock+0x110/0x110
[   21.502325]  read_cache_page+0x4c/0x80
[   21.502335]  nfs_get_link+0x75/0x370 [nfs]
[   21.502337]  trailing_symlink+0x6fe/0x810
[   21.502347]  ? nfs_destroy_readpagecache+0x20/0x20 [nfs]
[   21.502349]  path_lookupat.isra.0+0x188/0x7d0
[   21.502351]  ? do_syscall_64+0x9f/0x3a0
[   21.502353]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   21.502355]  ? path_parentat.isra.0+0x110/0x110
[   21.502357]  ? stack_trace_save+0x94/0xc0
[   21.502359]  ? stack_trace_consume_entry+0x170/0x170
[   21.502361]  filename_lookup+0x185/0x3b0
[   21.502362]  ? nd_jump_link+0x1d0/0x1d0
[   21.502364]  ? kasan_slab_free+0xe/0x10
[   21.502366]  ? __kasan_check_read+0x11/0x20
[   21.502367]  ? __check_object_size+0x249/0x316
[   21.502369]  ? strncpy_from_user+0x80/0x290
[   21.502370]  ? kmem_cache_alloc+0x180/0x250
[   21.502372]  ? getname_flags+0x100/0x520
[   21.502374]  user_path_at_empty+0x3a/0x50
[   21.502375]  vfs_statx+0xca/0x150
[   21.502377]  ? vfs_statx_fd+0x90/0x90
[   21.502379]  ? __kasan_slab_free+0x14e/0x180
[   21.502381]  __do_sys_newstat+0x9a/0x100
[   21.502382]  ? cp_new_stat+0x5d0/0x5d0
[   21.502384]  ? __kasan_check_write+0x14/0x20
[   21.502385]  ? _raw_spin_lock_irq+0x82/0xe0
[   21.502386]  ? _raw_read_lock_irq+0x50/0x50
[   21.502388]  ? __blkcg_punt_bio_submit+0x1c0/0x1c0
[   21.502390]  ? __kasan_check_write+0x14/0x20
[   21.502392]  ? switch_fpu_return+0x13a/0x2d0
[   21.502393]  ? fpregs_mark_activate+0x150/0x150
[   21.502395]  __x64_sys_newstat+0x54/0x80
[   21.502397]  do_syscall_64+0x9f/0x3a0
[   21.502398]  ? prepare_exit_to_usermode+0xee/0x1a0
[   21.502400]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   21.502402] RIP: 0033:0x7f9afc30049a
[   21.502404] Code: 00 00 75 05 48 83 c4 18 c3 e8 f2 24 02 00 66 90 f3 0f 1e fa 41 89 f8 48 89 f7 48 89 d6 41 83 f8 01 77 2d b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 c1 a9 0d 00 f7
[   21.502405] RSP: 002b:00007ffe7cd464c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
[   21.502407] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f9afc30049a
[   21.502408] RDX: 00007ffe7cd464d0 RSI: 00007ffe7cd464d0 RDI: 000055c9330705f0
[   21.502409] RBP: 000055c9330705f0 R08: 0000000000000001 R09: 0000000000000001
[   21.502410] R10: 0000000000000017 R11: 0000000000000246 R12: 000055c933074de3
[   21.502410] R13: 000055c93306e1a0 R14: 000055c93306c940 R15: 000055c93306c990

[   21.502414] Allocated by task 1478:
[   21.502416]  save_stack+0x23/0x90
[   21.502418]  __kasan_kmalloc.constprop.0+0xcf/0xe0
[   21.502419]  kasan_slab_alloc+0xe/0x10
[   21.502420]  kmem_cache_alloc+0xd7/0x250
[   21.502422]  mempool_alloc_slab+0x17/0x20
[   21.502424]  mempool_alloc+0x126/0x330
[   21.502439]  rpc_malloc+0x1f2/0x270 [sunrpc]
[   21.502452]  call_allocate+0x3b9/0x9d0 [sunrpc]
[   21.502467]  __rpc_execute+0x204/0xbd0 [sunrpc]
[   21.502480]  rpc_execute+0x1a0/0x1f0 [sunrpc]
[   21.502493]  rpc_run_task+0x454/0x5e0 [sunrpc]
[   21.502507]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
[   21.502520]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
[   21.502533]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
[   21.502546]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
[   21.502558]  nfs_symlink_filler+0xdc/0x190 [nfs]
[   21.502560]  do_read_cache_page+0x60e/0x1490
[   21.502561]  read_cache_page+0x4c/0x80
[   21.502571]  nfs_get_link+0x75/0x370 [nfs]
[   21.502572]  trailing_symlink+0x6fe/0x810
[   21.502574]  path_lookupat.isra.0+0x188/0x7d0
[   21.502575]  filename_lookup+0x185/0x3b0
[   21.502576]  user_path_at_empty+0x3a/0x50
[   21.502578]  vfs_statx+0xca/0x150
[   21.502579]  __do_sys_newstat+0x9a/0x100
[   21.502581]  __x64_sys_newstat+0x54/0x80
[   21.502582]  do_syscall_64+0x9f/0x3a0
[   21.502584]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   21.502585] Freed by task 0:
[   21.502586] (stack is not available)

[   21.502589] The buggy address belongs to the object at ffff8883bc9f2a80
                which belongs to the cache rpc_buffers of size 2048
[   21.502591] The buggy address is located 1988 bytes inside of
                2048-byte region [ffff8883bc9f2a80, ffff8883bc9f3280)
[   21.502592] The buggy address belongs to the page:
[   21.502595] page:ffffea000ef27c00 refcount:1 mapcount:0 mapping:ffff8883bf61e600 index:0x0 compound_mapcount: 0
[   21.502597] flags: 0x17ffffc0010200(slab|head)
[   21.502600] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff8883bf61e600
[   21.502602] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
[   21.502603] page dumped because: kasan: bad access detected

[   21.502605] Memory state around the buggy address:
[   21.502607]  ffff8883bc9f3180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   21.502608]  ffff8883bc9f3200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   21.502610] >ffff8883bc9f3280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   21.502611]                    ^
[   21.502612]  ffff8883bc9f3300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   21.502614]  ffff8883bc9f3380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   21.502615] ==================================================================
[   21.502616] Disabling lock debugging due to kernel taint

Best Regards
-- 
Pierre Sauter
Studentenwerk München
-------
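
As a rough cross-check of the numbers in the KASAN report above, the arithmetic can be sketched in Python. Only the addresses and sizes are taken from the report; the constant names and the helper overrun_bytes() are made up for illustration.

```python
# Figures copied from the KASAN report; the names are hypothetical.
OBJECT_START = 0xffff8883bc9f2a80  # start of the rpc_buffers object
OBJECT_SIZE = 2048                 # "cache rpc_buffers of size 2048"
ACCESS_ADDR = 0xffff8883bc9f3244   # "Write of size 64 at addr ..."
ACCESS_SIZE = 64


def overrun_bytes(obj_start, obj_size, addr, size):
    """Return how many bytes of the access fall past the end of the object."""
    obj_end = obj_start + obj_size   # first byte after the object
    access_end = addr + size         # first byte after the access
    return max(0, access_end - obj_end)


# Offset of the write inside the object; the report says 1988.
offset = ACCESS_ADDR - OBJECT_START
overrun = overrun_bytes(OBJECT_START, OBJECT_SIZE, ACCESS_ADDR, ACCESS_SIZE)

print(f"offset into object: {offset}")
print(f"bytes written past the end: {overrun}")
```

Under these figures the 64-byte write starts 1988 bytes into the 2048-byte buffer, so its last 4 bytes land in the redzone that begins at ffff8883bc9f3280.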




* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-16 18:40         ` Pierre Sauter
@ 2020-07-16 19:25           ` Chuck Lever
  2020-07-17 17:29             ` Pierre Sauter
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-16 19:25 UTC
  To: Pierre Sauter
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner

Hi Pierre-

> On Jul 16, 2020, at 2:40 PM, Pierre Sauter <pierre.sauter@stwm.de> wrote:
> 
> Hi,
> 
> On 2020-07-15 20:54, Chuck Lever wrote:
>> v5.4.40 does not have 31c9590ae468 and friends, but the claim is this
>> one crashes?
> 
> To my knowledge 31c9590ae468 and friends are in v5.4.40.

Those upstream commits were merged in v5.4.42.

The last commit that is applied to net/sunrpc/auth_gss/gss_krb5_wrap.c
in v5.4.40 is 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()").


>> And v5.4.51 has those three and 89a3c9f5b9f0, which Pierre claims fixes
>> the problem for him; but another commenter says v5.4.51 still crashes.
> 
> v5.4.51 still crashes for me (and AFAIK it does not have 89a3c9f5b9f0).
> I applied 89a3c9f5b9f0 to the original v5.4.40, which mostly helps.

In the v5.4.51 upstream stable kernel, I see:

commit 7b99577ff376d58addf149eebe0ffff46351b3d7
Author:     Chuck Lever <chuck.lever@oracle.com>
AuthorDate: Thu Jun 25 11:32:34 2020 -0400
Commit:     Sasha Levin <sashal@kernel.org>
CommitDate: Tue Jun 30 15:37:12 2020 -0400

    SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
    
    commit 89a3c9f5b9f0bcaa9aea3e8b2a616fcaea9aad78 upstream.

According to "git describe --contains", 7b99577ff376 was merged in v5.4.50.

So this makes me think there's a possibility you are not using upstream
stable kernels. I can't help if I don't know what source code and commit
stream you are using. It also makes me question the bisect result.


> My krb5 etype is aes256-cts-hmac-sha1-96.

Thanks! And what is your NFS server and filesystem? It's possible that the
client is not estimating the size of the reply correctly. Variables include
the size of file handles, MIC verifiers, and wrap tokens.


> Below is the bug in the original v5.4.40 with KASAN enabled. It happened
> immediately after mounting /home, no login necessary:
> 
> [   21.501730] ==================================================================
> [   21.501756] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xe9/0x200 [sunrpc]
> [   21.501759] Write of size 64 at addr ffff8883bc9f3244 by task update-desktop-/1478
> 
> [   21.501763] CPU: 0 PID: 1478 Comm: update-desktop- Tainted: G           OE     5.4.0-40-generic #44

So, I don't know what 5.4.0-40-generic is, but it's not an upstream
stable kernel. If it's an Ubuntu kernel, you should work with them
directly to nail this issue down.


> [   21.501764] Hardware name: XXXXXXXXXXXXXXXXXXXXXX
> [   21.501765] Call Trace:
> [   21.501769]  dump_stack+0x96/0xca
> [   21.501772]  print_address_description.constprop.0+0x20/0x210
> [   21.501789]  ? _copy_from_pages+0xe9/0x200 [sunrpc]
> [   21.501790]  __kasan_report.cold+0x1b/0x41
> [   21.501806]  ? _copy_from_pages+0xe9/0x200 [sunrpc]
> [   21.501807]  kasan_report+0x12/0x20
> [   21.501809]  check_memory_region+0x129/0x1b0
> [   21.501811]  memcpy+0x38/0x50
> [   21.501825]  _copy_from_pages+0xe9/0x200 [sunrpc]
> [   21.501839]  ? call_decode+0x2fd/0x7e0 [sunrpc]
> [   21.501854]  ? __rpc_execute+0x204/0xbd0 [sunrpc]
> [   21.501869]  xdr_shrink_pagelen+0x198/0x3c0 [sunrpc]

You might try:

e8d70b321ecc ("SUNRPC: Fix another issue with MIC buffer space")



--
Chuck Lever
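
The version reasoning in the message above can be sketched as a small Python check. It assumes only what is stated there: 31c9590ae468 and friends first appear in v5.4.42, and 89a3c9f5b9f0 (backported as 7b99577ff376) first appears in v5.4.50. The table and helper names are hypothetical, and the authoritative answer for any given tree still comes from "git describe --contains".

```python
# First v5.4.y release containing each backport, as stated in this thread.
# Keys are upstream commit IDs; the mapping itself is illustrative.
FIRST_STABLE_RELEASE = {
    "31c9590ae468": "v5.4.42",  # SUNRPC: Add "@len" parameter to gss_unwrap()
    "89a3c9f5b9f0": "v5.4.50",  # backported to 5.4.y as 7b99577ff376
}


def parse_version(tag):
    """Turn a tag such as 'v5.4.51' into a comparable tuple (5, 4, 51)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def release_contains(release, upstream_commit):
    """True if the given v5.4.y release is at or after the first release
    that carries the backport of upstream_commit."""
    first = FIRST_STABLE_RELEASE[upstream_commit]
    return parse_version(release) >= parse_version(first)


for release in ("v5.4.40", "v5.4.51"):
    for commit in FIRST_STABLE_RELEASE:
        print(release, commit, release_contains(release, commit))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as "v5.4.9" sorting after "v5.4.42".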





* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-16 19:25           ` Chuck Lever
@ 2020-07-17 17:29             ` Pierre Sauter
  2020-07-17 17:34               ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Pierre Sauter @ 2020-07-17 17:29 UTC
  To: Chuck Lever
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner

Hi Chuck,

On Thursday, 16 July 2020, 21:25:40 CEST, Chuck Lever wrote:
> So this makes me think there's a possibility you are not using upstream
> stable kernels. I can't help if I don't know what source code and commit
> stream you are using. It also makes me question the bisect result.

Yes, you are right; I was referring to Ubuntu kernels 5.4.0-XX. From the
discussion in the Ubuntu bugtracker I got the impression that Ubuntu kernels
5.4.0-XX and upstream 5.4.XX are closely related, but evidently they are not. The
bisection was done by the original bug reporter and also refers to the Ubuntu
kernel.

In the meantime I tested v5.4.51 upstream, which shows no problems. Sorry for
the bother.

> > My krb5 etype is aes256-cts-hmac-sha1-96.
> 
> Thanks! And what is your NFS server and filesystem? It's possible that the
> client is not estimating the size of the reply correctly. Variables include
> the size of file handles, MIC verifiers, and wrap tokens.

The server is Debian with v4.19.130 upstream, filesystem ext4.

> You might try:
> 
> e8d70b321ecc ("SUNRPC: Fix another issue with MIC buffer space")

That one is actually in Ubuntu's 5.4.0-40, from looking at the code.

Best Regards
-- 
Pierre Sauter
Studentenwerk München
-------




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-17 17:29             ` Pierre Sauter
@ 2020-07-17 17:34               ` Chuck Lever
  2020-07-17 17:56                 ` Kai-Heng Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-17 17:34 UTC (permalink / raw)
  To: Pierre Sauter
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner



> On Jul 17, 2020, at 1:29 PM, Pierre Sauter <pierre.sauter@stwm.de> wrote:
> 
> Hi Chuck,
> 
> On Thursday, 16 July 2020, 21:25:40 CEST, Chuck Lever wrote:
>> So this makes me think there's a possibility you are not using upstream
>> stable kernels. I can't help if I don't know what source code and commit
>> stream you are using. It also makes me question the bisect result.
> 
> Yes, you are right; I was referring to Ubuntu kernels 5.4.0-XX. From the
> discussion in the Ubuntu bugtracker I got the impression that Ubuntu kernels
> 5.4.0-XX and upstream 5.4.XX are closely related, but evidently they are not. The
> bisection was done by the original bug reporter and also refers to the Ubuntu
> kernel.
> 
> In the meantime I tested v5.4.51 upstream, which shows no problems. Sorry for
> the bother.

Pierre, thanks for confirming!

Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40,
but I don't have any good suggestions.


>>> My krb5 etype is aes256-cts-hmac-sha1-96.
>> 
>> Thanks! And what is your NFS server and filesystem? It's possible that the
>> client is not estimating the size of the reply correctly. Variables include
>> the size of file handles, MIC verifiers, and wrap tokens.
> 
> The server is Debian with v4.19.130 upstream, filesystem ext4.
> 
>> You might try:
>> 
>> e8d70b321ecc ("SUNRPC: Fix another issue with MIC buffer space")
> 
> That one is actually in Ubuntu's 5.4.0-40, from looking at the code.

--
Chuck Lever




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-17 17:34               ` Chuck Lever
@ 2020-07-17 17:56                 ` Kai-Heng Feng
  2020-07-17 19:46                   ` Pierre Sauter
  0 siblings, 1 reply; 13+ messages in thread
From: Kai-Heng Feng @ 2020-07-17 17:56 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Pierre Sauter, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner



> On Jul 18, 2020, at 01:34, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Jul 17, 2020, at 1:29 PM, Pierre Sauter <pierre.sauter@stwm.de> wrote:
>> 
>> Hi Chuck,
>> 
>> On Thursday, 16 July 2020, 21:25:40 CEST, Chuck Lever wrote:
>>> So this makes me think there's a possibility you are not using upstream
>>> stable kernels. I can't help if I don't know what source code and commit
>>> stream you are using. It also makes me question the bisect result.
>> 
>> Yes, you are right; I was referring to Ubuntu kernels 5.4.0-XX. From the
>> discussion in the Ubuntu bugtracker I got the impression that Ubuntu kernels
>> 5.4.0-XX and upstream 5.4.XX are closely related, but evidently they are not. The
>> bisection was done by the original bug reporter and also refers to the Ubuntu
>> kernel.
>> 
>> In the meantime I tested v5.4.51 upstream, which shows no problems. Sorry for
>> the bother.
> 
> Pierre, thanks for confirming!
> 
> Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40,
> but I don't have any good suggestions.

Well, Ubuntu's 5.4 kernel is based on upstream stable v5.4, so I asked users to test stable v5.4.51. However, the feedback was negative, which is why I raised the issue here.

Anyway, good to know that it's fixed in upstream stable, everything's good now!
Thanks for your effort, Chuck.

Kai-Heng


> 
> 
>>>> My krb5 etype is aes256-cts-hmac-sha1-96.
>>> 
>>> Thanks! And what is your NFS server and filesystem? It's possible that the
>>> client is not estimating the size of the reply correctly. Variables include
>>> the size of file handles, MIC verifiers, and wrap tokens.
>> 
>> The server is Debian with v4.19.130 upstream, filesystem ext4.
>> 
>>> You might try:
>>> 
>>> e8d70b321ecc ("SUNRPC: Fix another issue with MIC buffer space")
>> 
>> That one is actually in Ubuntu's 5.4.0-40, from looking at the code.
> 
> --
> Chuck Lever


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-17 17:56                 ` Kai-Heng Feng
@ 2020-07-17 19:46                   ` Pierre Sauter
  2020-07-18 15:55                     ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Pierre Sauter @ 2020-07-17 19:46 UTC (permalink / raw)
  To: Chuck Lever, Kai-Heng Feng
  Cc: matthew.ruffell, linux-stable, Linux NFS Mailing List,
	open list:NETWORKING DRIVERS, open list, linux-kernel-owner

On Friday, 17 July 2020, 19:56:09 CEST, Kai-Heng Feng wrote:
> > Pierre, thanks for confirming!
> > 
> > Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40,
> > but I don't have any good suggestions.
> 
> Well, Ubuntu's 5.4 kernel is based on upstream stable v5.4, so I asked users to test stable v5.4.51. However, the feedback was negative, which is why I raised the issue here.
> 
> Anyway, good to know that it's fixed in upstream stable, everything's good now!
> Thanks for your effort, Chuck.
> 
> Kai-Heng

Sorry to have caused premature happiness. Kai-Heng's last message reminded me
that I had seen the bug earlier in the week on Ubuntu Mainline v5.4.51.
So I decided to rebuild vanilla v5.4.51 with Ubuntu's config + KASAN, and voilà.
It seems that their config is just really good at exposing the bug on mount. I
am off for the weekend, but can do more testing next week.

[   21.664580] ==================================================================
[   21.664657] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xed/0x210 [sunrpc]
[   21.664705] Write of size 64 at addr ffff8883b6b7d444 by task update-desktop-/1345

[   21.664764] CPU: 0 PID: 1345 Comm: update-desktop- Not tainted 5.4.51 #1
[   21.664765] Hardware name: XXXXXX
[   21.664766] Call Trace:
[   21.664771]  dump_stack+0x96/0xca
[   21.664775]  print_address_description.constprop.0+0x20/0x210
[   21.664795]  ? _copy_from_pages+0xed/0x210 [sunrpc]
[   21.664797]  __kasan_report.cold+0x1b/0x41
[   21.664816]  ? _copy_from_pages+0xed/0x210 [sunrpc]
[   21.664819]  kasan_report+0x14/0x20
[   21.664820]  check_memory_region+0x129/0x1b0
[   21.664822]  memcpy+0x38/0x50
[   21.664840]  _copy_from_pages+0xed/0x210 [sunrpc]
[   21.664859]  xdr_shrink_pagelen+0x1d6/0x440 [sunrpc]
[   21.664877]  xdr_align_pages+0x15f/0x580 [sunrpc]
[   21.664897]  ? decode_setattr+0x120/0x120 [nfsv4]
[   21.664916]  xdr_read_pages+0x44/0x290 [sunrpc]
[   21.664933]  ? __decode_op_hdr+0x29/0x430 [nfsv4]
[   21.664950]  nfs4_xdr_dec_readlink+0x238/0x390 [nfsv4]
[   21.664966]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
[   21.664969]  ? __kasan_slab_free+0x14e/0x180
[   21.664985]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
[   21.665003]  rpcauth_unwrap_resp_decode+0xaa/0x100 [sunrpc]
[   21.665009]  gss_unwrap_resp+0x99d/0x1570 [auth_rpcgss]
[   21.665014]  ? gss_destroy_cred+0x460/0x460 [auth_rpcgss]
[   21.665016]  ? finish_task_switch+0x163/0x670
[   21.665019]  ? __switch_to_asm+0x34/0x70
[   21.665023]  ? gss_wrap_req+0x1700/0x1700 [auth_rpcgss]
[   21.665026]  ? prepare_to_wait+0xea/0x2b0
[   21.665045]  rpcauth_unwrap_resp+0xac/0x100 [sunrpc]
[   21.665061]  call_decode+0x454/0x7e0 [sunrpc]
[   21.665077]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
[   21.665079]  ? var_wake_function+0x140/0x140
[   21.665095]  ? call_transmit_status+0x31e/0x5d0 [sunrpc]
[   21.665110]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
[   21.665127]  __rpc_execute+0x204/0xbd0 [sunrpc]
[   21.665143]  ? xprt_wait_for_reply_request_def+0x170/0x170 [sunrpc]
[   21.665160]  ? rpc_exit+0xc0/0xc0 [sunrpc]
[   21.665162]  ? __kasan_check_read+0x11/0x20
[   21.665164]  ? wake_up_bit+0x42/0x50
[   21.665181]  rpc_execute+0x1a0/0x1f0 [sunrpc]
[   21.665197]  rpc_run_task+0x454/0x5e0 [sunrpc]
[   21.665213]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
[   21.665229]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
[   21.665244]  ? nfs4_call_sync_custom+0x70/0x70 [nfsv4]
[   21.665247]  ? get_page_from_freelist+0x24d0/0x45f0
[   21.665263]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
[   21.665280]  ? _nfs4_proc_getdeviceinfo+0x350/0x350 [nfsv4]
[   21.665282]  ? release_pages+0x44b/0xca0
[   21.665284]  ? __mod_lruvec_state+0x8f/0x320
[   21.665286]  ? pagevec_lru_move_fn+0x18d/0x230
[   21.665303]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
[   21.665320]  ? nfs4_proc_link+0x1c0/0x1c0 [nfsv4]
[   21.665322]  ? add_to_page_cache_locked+0x20/0x20
[   21.665339]  nfs_symlink_filler+0xdc/0x190 [nfs]
[   21.665341]  do_read_cache_page+0x60e/0x1490
[   21.665353]  ? nfs4_do_lookup_revalidate+0x1a1/0x2d0 [nfs]
[   21.665365]  ? nfs_get_link+0x370/0x370 [nfs]
[   21.665367]  ? xas_load+0x23/0x250
[   21.665369]  ? pagecache_get_page+0x760/0x760
[   21.665372]  ? lockref_get_not_dead+0xe3/0x1c0
[   21.665374]  ? __kasan_check_write+0x14/0x20
[   21.665376]  ? lockref_get_not_dead+0xe3/0x1c0
[   21.665378]  ? __kasan_check_write+0x14/0x20
[   21.665380]  ? _raw_spin_lock+0x7b/0xd0
[   21.665382]  ? _raw_write_trylock+0x110/0x110
[   21.665384]  read_cache_page+0x4c/0x80
[   21.665396]  nfs_get_link+0x75/0x370 [nfs]
[   21.665399]  trailing_symlink+0x6fe/0x810
[   21.665411]  ? nfs_destroy_readpagecache+0x20/0x20 [nfs]
[   21.665413]  path_lookupat.isra.0+0x188/0x7d0
[   21.665416]  ? do_syscall_64+0x9f/0x3a0
[   21.665418]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   21.665420]  ? path_parentat.isra.0+0x110/0x110
[   21.665423]  ? stack_trace_save+0x94/0xc0
[   21.665424]  ? stack_trace_consume_entry+0x170/0x170
[   21.665427]  filename_lookup+0x185/0x3b0
[   21.665429]  ? nd_jump_link+0x1d0/0x1d0
[   21.665431]  ? kasan_slab_free+0xe/0x10
[   21.665434]  ? __kasan_check_read+0x11/0x20
[   21.665436]  ? __check_object_size+0x249/0x316
[   21.665438]  ? strncpy_from_user+0x80/0x290
[   21.665440]  ? kmem_cache_alloc+0x180/0x250
[   21.665442]  ? getname_flags+0x100/0x520
[   21.665444]  user_path_at_empty+0x3a/0x50
[   21.665447]  vfs_statx+0xca/0x150
[   21.665449]  ? vfs_statx_fd+0x90/0x90
[   21.665451]  ? __kasan_slab_free+0x14e/0x180
[   21.665453]  __do_sys_newstat+0x9a/0x100
[   21.665455]  ? cp_new_stat+0x5d0/0x5d0
[   21.665457]  ? __kasan_check_write+0x14/0x20
[   21.665459]  ? _raw_spin_lock_irq+0x82/0xe0
[   21.665461]  ? _raw_read_lock_irq+0x50/0x50
[   21.665464]  ? __blkcg_punt_bio_submit+0x1c0/0x1c0
[   21.665466]  ? __kasan_check_write+0x14/0x20
[   21.665469]  ? switch_fpu_return+0x13a/0x2d0
[   21.665471]  ? fpregs_mark_activate+0x150/0x150
[   21.665474]  __x64_sys_newstat+0x54/0x80
[   21.665476]  do_syscall_64+0x9f/0x3a0
[   21.665478]  ? prepare_exit_to_usermode+0xee/0x1a0
[   21.665480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   21.665482] RIP: 0033:0x7f6e05f5c49a
[   21.665485] Code: 00 00 75 05 48 83 c4 18 c3 e8 f2 24 02 00 66 90 f3 0f 1e fa 41 89 f8 48 89 f7 48 89 d6 41 83 f8 01 77 2d b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 c1 a9 0d 00 f7
[   21.665486] RSP: 002b:00007fff043e5f18 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
[   21.665488] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6e05f5c49a
[   21.665489] RDX: 00007fff043e5f20 RSI: 00007fff043e5f20 RDI: 000055af4c4ea5f0
[   21.665490] RBP: 000055af4c4ea5f0 R08: 0000000000000001 R09: 0000000000000001
[   21.665491] R10: 0000000000000017 R11: 0000000000000246 R12: 000055af4c4eede3
[   21.665492] R13: 000055af4c4e81a0 R14: 000055af4c4e6940 R15: 000055af4c4e6990

[   21.665508] Allocated by task 1345:
[   21.665532]  save_stack+0x23/0x90
[   21.665534]  __kasan_kmalloc.constprop.0+0xcf/0xe0
[   21.665536]  kasan_slab_alloc+0xe/0x10
[   21.665538]  kmem_cache_alloc+0xd7/0x250
[   21.665539]  mempool_alloc_slab+0x17/0x20
[   21.665541]  mempool_alloc+0x126/0x330
[   21.665558]  rpc_malloc+0x1f2/0x270 [sunrpc]
[   21.665574]  call_allocate+0x3b9/0x9d0 [sunrpc]
[   21.665591]  __rpc_execute+0x204/0xbd0 [sunrpc]
[   21.665607]  rpc_execute+0x1a0/0x1f0 [sunrpc]
[   21.665623]  rpc_run_task+0x454/0x5e0 [sunrpc]
[   21.665638]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
[   21.665653]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
[   21.665668]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
[   21.665684]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
[   21.665698]  nfs_symlink_filler+0xdc/0x190 [nfs]
[   21.665699]  do_read_cache_page+0x60e/0x1490
[   21.665701]  read_cache_page+0x4c/0x80
[   21.665713]  nfs_get_link+0x75/0x370 [nfs]
[   21.665714]  trailing_symlink+0x6fe/0x810
[   21.665716]  path_lookupat.isra.0+0x188/0x7d0
[   21.665718]  filename_lookup+0x185/0x3b0
[   21.665719]  user_path_at_empty+0x3a/0x50
[   21.665721]  vfs_statx+0xca/0x150
[   21.665723]  __do_sys_newstat+0x9a/0x100
[   21.665725]  __x64_sys_newstat+0x54/0x80
[   21.665727]  do_syscall_64+0x9f/0x3a0
[   21.665729]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   21.665743] Freed by task 0:
[   21.665762] (stack is not available)

[   21.665798] The buggy address belongs to the object at ffff8883b6b7cc80
                which belongs to the cache rpc_buffers of size 2048
[   21.665871] The buggy address is located 1988 bytes inside of
                2048-byte region [ffff8883b6b7cc80, ffff8883b6b7d480)
[   21.665939] The buggy address belongs to the page:
[   21.665970] page:ffffea000edade00 refcount:1 mapcount:0 mapping:ffff88840afecc00 index:0x0 compound_mapcount: 0
[   21.666029] flags: 0x17ffffc0010200(slab|head)
[   21.666059] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff88840afecc00
[   21.666107] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
[   21.666152] page dumped because: kasan: bad access detected

[   21.666197] Memory state around the buggy address:
[   21.666228]  ffff8883b6b7d380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   21.666272]  ffff8883b6b7d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   21.666315] >ffff8883b6b7d480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   21.666358]                    ^
[   21.666379]  ffff8883b6b7d500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   21.666423]  ffff8883b6b7d580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   21.666465] ==================================================================
[   21.666509] Disabling lock debugging due to kernel taint
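
The figures in the report above are enough to work out how far the write runs past the buffer. The following is a minimal sketch of that arithmetic in Python; the addresses and sizes are copied from the splat, while the function and variable names are hypothetical and chosen only for illustration. It assumes the reported region end is exclusive, as the `[start, end)` notation in the log suggests.

```python
# Values copied from the KASAN report above; names are illustrative only.
REGION_START = 0xffff8883b6b7cc80  # start of the rpc_buffers object
REGION_END = 0xffff8883b6b7d480    # end of the object (exclusive)
WRITE_ADDR = 0xffff8883b6b7d444    # "Write of size 64 at addr ..."
WRITE_SIZE = 64


def overrun(region_start, region_end, addr, size):
    """Return (offset of addr inside the region, bytes written past its end)."""
    offset = addr - region_start
    past_end = max(0, addr + size - region_end)
    return offset, past_end


region_size = REGION_END - REGION_START
offset, past_end = overrun(REGION_START, REGION_END, WRITE_ADDR, WRITE_SIZE)

print(f"region size : {region_size} bytes")
print(f"write offset: {offset} bytes into the region")
print(f"overrun     : {past_end} bytes past the end")
```

Under these assumptions the write starts 1988 bytes into the 2048-byte object, matching the "located 1988 bytes inside of" line, and its last 4 bytes land in the redzone that follows the object.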




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-17 19:46                   ` Pierre Sauter
@ 2020-07-18 15:55                     ` Chuck Lever
  2020-07-20 21:22                       ` Chuck Lever
  0 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2020-07-18 15:55 UTC (permalink / raw)
  To: Pierre Sauter
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner



> On Jul 17, 2020, at 3:46 PM, Pierre Sauter <pierre.sauter@stwm.de> wrote:
> 
> On Friday, 17 July 2020, 19:56:09 CEST, Kai-Heng Feng wrote:
>>> Pierre, thanks for confirming!
>>> 
>>> Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40,
>>> but I don't have any good suggestions.
>> 
>> Well, Ubuntu's 5.4 kernel is based on upstream stable v5.4, so I asked users to test stable v5.4.51. However, the feedback was negative, which is why I raised the issue here.
>> 
>> Anyway, good to know that it's fixed in upstream stable, everything's good now!
>> Thanks for your effort, Chuck.
>> 
>> Kai-Heng
> 
> Sorry to have caused premature happiness. Kai-Heng's last message reminded me
> that I had seen the bug earlier in the week on Ubuntu Mainline v5.4.51.
> So I decided to rebuild vanilla v5.4.51 with Ubuntu's config + KASAN, and voilà.
> It seems that their config is just really good at exposing the bug on mount. I
> am off for the weekend, but can do more testing next week.
> 
> [   21.664580] ==================================================================
> [   21.664657] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xed/0x210 [sunrpc]
> [   21.664705] Write of size 64 at addr ffff8883b6b7d444 by task update-desktop-/1345
> 
> [   21.664764] CPU: 0 PID: 1345 Comm: update-desktop- Not tainted 5.4.51 #1
> [   21.664765] Hardware name: XXXXXX
> [   21.664766] Call Trace:
> [   21.664771]  dump_stack+0x96/0xca
> [   21.664775]  print_address_description.constprop.0+0x20/0x210
> [   21.664795]  ? _copy_from_pages+0xed/0x210 [sunrpc]
> [   21.664797]  __kasan_report.cold+0x1b/0x41
> [   21.664816]  ? _copy_from_pages+0xed/0x210 [sunrpc]
> [   21.664819]  kasan_report+0x14/0x20
> [   21.664820]  check_memory_region+0x129/0x1b0
> [   21.664822]  memcpy+0x38/0x50
> [   21.664840]  _copy_from_pages+0xed/0x210 [sunrpc]
> [   21.664859]  xdr_shrink_pagelen+0x1d6/0x440 [sunrpc]
> [   21.664877]  xdr_align_pages+0x15f/0x580 [sunrpc]
> [   21.664897]  ? decode_setattr+0x120/0x120 [nfsv4]
> [   21.664916]  xdr_read_pages+0x44/0x290 [sunrpc]
> [   21.664933]  ? __decode_op_hdr+0x29/0x430 [nfsv4]
> [   21.664950]  nfs4_xdr_dec_readlink+0x238/0x390 [nfsv4]

READLINK appears to be a common element in these splats. Is there
an especially large symbolic link in your home directory? Knowing
that might help me reproduce the problem here.

You confirmed the crash does not occur in v5.5.19, but the 5.8-ish
kernel you tested was Ubuntu's. Do you have test results for a
stock upstream v5.8-rc5 kernel?

Do you know if v5.6.19 has this issue?


> [   21.664966]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
> [   21.664969]  ? __kasan_slab_free+0x14e/0x180
> [   21.664985]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
> [   21.665003]  rpcauth_unwrap_resp_decode+0xaa/0x100 [sunrpc]
> [   21.665009]  gss_unwrap_resp+0x99d/0x1570 [auth_rpcgss]
> [   21.665014]  ? gss_destroy_cred+0x460/0x460 [auth_rpcgss]
> [   21.665016]  ? finish_task_switch+0x163/0x670
> [   21.665019]  ? __switch_to_asm+0x34/0x70
> [   21.665023]  ? gss_wrap_req+0x1700/0x1700 [auth_rpcgss]
> [   21.665026]  ? prepare_to_wait+0xea/0x2b0
> [   21.665045]  rpcauth_unwrap_resp+0xac/0x100 [sunrpc]
> [   21.665061]  call_decode+0x454/0x7e0 [sunrpc]
> [   21.665077]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
> [   21.665079]  ? var_wake_function+0x140/0x140
> [   21.665095]  ? call_transmit_status+0x31e/0x5d0 [sunrpc]
> [   21.665110]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
> [   21.665127]  __rpc_execute+0x204/0xbd0 [sunrpc]
> [   21.665143]  ? xprt_wait_for_reply_request_def+0x170/0x170 [sunrpc]
> [   21.665160]  ? rpc_exit+0xc0/0xc0 [sunrpc]
> [   21.665162]  ? __kasan_check_read+0x11/0x20
> [   21.665164]  ? wake_up_bit+0x42/0x50
> [   21.665181]  rpc_execute+0x1a0/0x1f0 [sunrpc]
> [   21.665197]  rpc_run_task+0x454/0x5e0 [sunrpc]
> [   21.665213]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
> [   21.665229]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
> [   21.665244]  ? nfs4_call_sync_custom+0x70/0x70 [nfsv4]
> [   21.665247]  ? get_page_from_freelist+0x24d0/0x45f0
> [   21.665263]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
> [   21.665280]  ? _nfs4_proc_getdeviceinfo+0x350/0x350 [nfsv4]
> [   21.665282]  ? release_pages+0x44b/0xca0
> [   21.665284]  ? __mod_lruvec_state+0x8f/0x320
> [   21.665286]  ? pagevec_lru_move_fn+0x18d/0x230
> [   21.665303]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
> [   21.665320]  ? nfs4_proc_link+0x1c0/0x1c0 [nfsv4]
> [   21.665322]  ? add_to_page_cache_locked+0x20/0x20
> [   21.665339]  nfs_symlink_filler+0xdc/0x190 [nfs]
> [   21.665341]  do_read_cache_page+0x60e/0x1490
> [   21.665353]  ? nfs4_do_lookup_revalidate+0x1a1/0x2d0 [nfs]
> [   21.665365]  ? nfs_get_link+0x370/0x370 [nfs]
> [   21.665367]  ? xas_load+0x23/0x250
> [   21.665369]  ? pagecache_get_page+0x760/0x760
> [   21.665372]  ? lockref_get_not_dead+0xe3/0x1c0
> [   21.665374]  ? __kasan_check_write+0x14/0x20
> [   21.665376]  ? lockref_get_not_dead+0xe3/0x1c0
> [   21.665378]  ? __kasan_check_write+0x14/0x20
> [   21.665380]  ? _raw_spin_lock+0x7b/0xd0
> [   21.665382]  ? _raw_write_trylock+0x110/0x110
> [   21.665384]  read_cache_page+0x4c/0x80
> [   21.665396]  nfs_get_link+0x75/0x370 [nfs]
> [   21.665399]  trailing_symlink+0x6fe/0x810
> [   21.665411]  ? nfs_destroy_readpagecache+0x20/0x20 [nfs]
> [   21.665413]  path_lookupat.isra.0+0x188/0x7d0
> [   21.665416]  ? do_syscall_64+0x9f/0x3a0
> [   21.665418]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   21.665420]  ? path_parentat.isra.0+0x110/0x110
> [   21.665423]  ? stack_trace_save+0x94/0xc0
> [   21.665424]  ? stack_trace_consume_entry+0x170/0x170
> [   21.665427]  filename_lookup+0x185/0x3b0
> [   21.665429]  ? nd_jump_link+0x1d0/0x1d0
> [   21.665431]  ? kasan_slab_free+0xe/0x10
> [   21.665434]  ? __kasan_check_read+0x11/0x20
> [   21.665436]  ? __check_object_size+0x249/0x316
> [   21.665438]  ? strncpy_from_user+0x80/0x290
> [   21.665440]  ? kmem_cache_alloc+0x180/0x250
> [   21.665442]  ? getname_flags+0x100/0x520
> [   21.665444]  user_path_at_empty+0x3a/0x50
> [   21.665447]  vfs_statx+0xca/0x150
> [   21.665449]  ? vfs_statx_fd+0x90/0x90
> [   21.665451]  ? __kasan_slab_free+0x14e/0x180
> [   21.665453]  __do_sys_newstat+0x9a/0x100
> [   21.665455]  ? cp_new_stat+0x5d0/0x5d0
> [   21.665457]  ? __kasan_check_write+0x14/0x20
> [   21.665459]  ? _raw_spin_lock_irq+0x82/0xe0
> [   21.665461]  ? _raw_read_lock_irq+0x50/0x50
> [   21.665464]  ? __blkcg_punt_bio_submit+0x1c0/0x1c0
> [   21.665466]  ? __kasan_check_write+0x14/0x20
> [   21.665469]  ? switch_fpu_return+0x13a/0x2d0
> [   21.665471]  ? fpregs_mark_activate+0x150/0x150
> [   21.665474]  __x64_sys_newstat+0x54/0x80
> [   21.665476]  do_syscall_64+0x9f/0x3a0
> [   21.665478]  ? prepare_exit_to_usermode+0xee/0x1a0
> [   21.665480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   21.665482] RIP: 0033:0x7f6e05f5c49a
> [   21.665485] Code: 00 00 75 05 48 83 c4 18 c3 e8 f2 24 02 00 66 90 f3 0f 1e fa 41 89 f8 48 89 f7 48 89 d6 41 83 f8 01 77 2d b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 c1 a9 0d 00 f7
> [   21.665486] RSP: 002b:00007fff043e5f18 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
> [   21.665488] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6e05f5c49a
> [   21.665489] RDX: 00007fff043e5f20 RSI: 00007fff043e5f20 RDI: 000055af4c4ea5f0
> [   21.665490] RBP: 000055af4c4ea5f0 R08: 0000000000000001 R09: 0000000000000001
> [   21.665491] R10: 0000000000000017 R11: 0000000000000246 R12: 000055af4c4eede3
> [   21.665492] R13: 000055af4c4e81a0 R14: 000055af4c4e6940 R15: 000055af4c4e6990
> 
> [   21.665508] Allocated by task 1345:
> [   21.665532]  save_stack+0x23/0x90
> [   21.665534]  __kasan_kmalloc.constprop.0+0xcf/0xe0
> [   21.665536]  kasan_slab_alloc+0xe/0x10
> [   21.665538]  kmem_cache_alloc+0xd7/0x250
> [   21.665539]  mempool_alloc_slab+0x17/0x20
> [   21.665541]  mempool_alloc+0x126/0x330
> [   21.665558]  rpc_malloc+0x1f2/0x270 [sunrpc]
> [   21.665574]  call_allocate+0x3b9/0x9d0 [sunrpc]
> [   21.665591]  __rpc_execute+0x204/0xbd0 [sunrpc]
> [   21.665607]  rpc_execute+0x1a0/0x1f0 [sunrpc]
> [   21.665623]  rpc_run_task+0x454/0x5e0 [sunrpc]
> [   21.665638]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
> [   21.665653]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
> [   21.665668]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
> [   21.665684]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
> [   21.665698]  nfs_symlink_filler+0xdc/0x190 [nfs]
> [   21.665699]  do_read_cache_page+0x60e/0x1490
> [   21.665701]  read_cache_page+0x4c/0x80
> [   21.665713]  nfs_get_link+0x75/0x370 [nfs]
> [   21.665714]  trailing_symlink+0x6fe/0x810
> [   21.665716]  path_lookupat.isra.0+0x188/0x7d0
> [   21.665718]  filename_lookup+0x185/0x3b0
> [   21.665719]  user_path_at_empty+0x3a/0x50
> [   21.665721]  vfs_statx+0xca/0x150
> [   21.665723]  __do_sys_newstat+0x9a/0x100
> [   21.665725]  __x64_sys_newstat+0x54/0x80
> [   21.665727]  do_syscall_64+0x9f/0x3a0
> [   21.665729]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> [   21.665743] Freed by task 0:
> [   21.665762] (stack is not available)
> 
> [   21.665798] The buggy address belongs to the object at ffff8883b6b7cc80
>                which belongs to the cache rpc_buffers of size 2048
> [   21.665871] The buggy address is located 1988 bytes inside of
>                2048-byte region [ffff8883b6b7cc80, ffff8883b6b7d480)
> [   21.665939] The buggy address belongs to the page:
> [   21.665970] page:ffffea000edade00 refcount:1 mapcount:0 mapping:ffff88840afecc00 index:0x0 compound_mapcount: 0
> [   21.666029] flags: 0x17ffffc0010200(slab|head)
> [   21.666059] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff88840afecc00
> [   21.666107] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
> [   21.666152] page dumped because: kasan: bad access detected
> 
> [   21.666197] Memory state around the buggy address:
> [   21.666228]  ffff8883b6b7d380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   21.666272]  ffff8883b6b7d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   21.666315] >ffff8883b6b7d480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   21.666358]                    ^
> [   21.666379]  ffff8883b6b7d500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   21.666423]  ffff8883b6b7d580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   21.666465] ==================================================================
> [   21.666509] Disabling lock debugging due to kernel taint
> 
> 
> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y
  2020-07-18 15:55                     ` Chuck Lever
@ 2020-07-20 21:22                       ` Chuck Lever
  0 siblings, 0 replies; 13+ messages in thread
From: Chuck Lever @ 2020-07-20 21:22 UTC (permalink / raw)
  To: Pierre Sauter
  Cc: Kai-Heng Feng, matthew.ruffell, linux-stable,
	Linux NFS Mailing List, open list:NETWORKING DRIVERS, open list,
	linux-kernel-owner



> On Jul 18, 2020, at 11:55 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Jul 17, 2020, at 3:46 PM, Pierre Sauter <pierre.sauter@stwm.de> wrote:
>> 
>> On Friday, 17 July 2020, 19:56:09 CEST, Kai-Heng Feng wrote:
>>>> Pierre, thanks for confirming!
>>>> 
>>>> Kai-Heng suspected an upstream stable commit that is missing in 5.4.0-40,
>>>> but I don't have any good suggestions.
>>> 
>>> Well, Ubuntu's 5.4 kernel is based on upstream stable v5.4, so I asked users to test stable v5.4.51. However, the feedback was negative, which is why I raised the issue here.
>>> 
>>> Anyway, good to know that it's fixed in upstream stable, everything's good now!
>>> Thanks for your effort, Chuck.
>>> 
>>> Kai-Heng
>> 
>> Sorry to have caused premature happiness. Kai-Heng's last message reminded me
>> that I had seen the bug earlier in the week on Ubuntu Mainline v5.4.51.
>> So I decided to rebuild vanilla v5.4.51 with Ubuntu's config + KASAN, and voilà.
>> It seems that their config is just really good at exposing the bug on mount. I
>> am off for the weekend, but can do more testing next week.
>> 
>> [   21.664580] ==================================================================
>> [   21.664657] BUG: KASAN: slab-out-of-bounds in _copy_from_pages+0xed/0x210 [sunrpc]
>> [   21.664705] Write of size 64 at addr ffff8883b6b7d444 by task update-desktop-/1345
>> 
>> [   21.664764] CPU: 0 PID: 1345 Comm: update-desktop- Not tainted 5.4.51 #1
>> [   21.664765] Hardware name: XXXXXX
>> [   21.664766] Call Trace:
>> [   21.664771]  dump_stack+0x96/0xca
>> [   21.664775]  print_address_description.constprop.0+0x20/0x210
>> [   21.664795]  ? _copy_from_pages+0xed/0x210 [sunrpc]
>> [   21.664797]  __kasan_report.cold+0x1b/0x41
>> [   21.664816]  ? _copy_from_pages+0xed/0x210 [sunrpc]
>> [   21.664819]  kasan_report+0x14/0x20
>> [   21.664820]  check_memory_region+0x129/0x1b0
>> [   21.664822]  memcpy+0x38/0x50
>> [   21.664840]  _copy_from_pages+0xed/0x210 [sunrpc]
>> [   21.664859]  xdr_shrink_pagelen+0x1d6/0x440 [sunrpc]
>> [   21.664877]  xdr_align_pages+0x15f/0x580 [sunrpc]
>> [   21.664897]  ? decode_setattr+0x120/0x120 [nfsv4]
>> [   21.664916]  xdr_read_pages+0x44/0x290 [sunrpc]
>> [   21.664933]  ? __decode_op_hdr+0x29/0x430 [nfsv4]
>> [   21.664950]  nfs4_xdr_dec_readlink+0x238/0x390 [nfsv4]
> 
> READLINK appears to be a common element in these splats. Is there
> an especially large symbolic link in your home directory? Knowing
> that might help me reproduce the problem here.
> 
> You confirmed the crash does not occur in v5.5.19, but the 5.8-ish
> kernel you tested was Ubuntu's. Do you have test results for a
> stock upstream v5.8-rc5 kernel?
> 
> Do you know if v5.6.19 has this issue?

I have a workload that can reproduce this exact KASAN splat on
v5.4.51. Looking into it now.


>> [   21.664966]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
>> [   21.664969]  ? __kasan_slab_free+0x14e/0x180
>> [   21.664985]  ? nfs4_xdr_dec_read+0x3c0/0x3c0 [nfsv4]
>> [   21.665003]  rpcauth_unwrap_resp_decode+0xaa/0x100 [sunrpc]
>> [   21.665009]  gss_unwrap_resp+0x99d/0x1570 [auth_rpcgss]
>> [   21.665014]  ? gss_destroy_cred+0x460/0x460 [auth_rpcgss]
>> [   21.665016]  ? finish_task_switch+0x163/0x670
>> [   21.665019]  ? __switch_to_asm+0x34/0x70
>> [   21.665023]  ? gss_wrap_req+0x1700/0x1700 [auth_rpcgss]
>> [   21.665026]  ? prepare_to_wait+0xea/0x2b0
>> [   21.665045]  rpcauth_unwrap_resp+0xac/0x100 [sunrpc]
>> [   21.665061]  call_decode+0x454/0x7e0 [sunrpc]
>> [   21.665077]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
>> [   21.665079]  ? var_wake_function+0x140/0x140
>> [   21.665095]  ? call_transmit_status+0x31e/0x5d0 [sunrpc]
>> [   21.665110]  ? rpc_decode_header+0x10a0/0x10a0 [sunrpc]
>> [   21.665127]  __rpc_execute+0x204/0xbd0 [sunrpc]
>> [   21.665143]  ? xprt_wait_for_reply_request_def+0x170/0x170 [sunrpc]
>> [   21.665160]  ? rpc_exit+0xc0/0xc0 [sunrpc]
>> [   21.665162]  ? __kasan_check_read+0x11/0x20
>> [   21.665164]  ? wake_up_bit+0x42/0x50
>> [   21.665181]  rpc_execute+0x1a0/0x1f0 [sunrpc]
>> [   21.665197]  rpc_run_task+0x454/0x5e0 [sunrpc]
>> [   21.665213]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
>> [   21.665229]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
>> [   21.665244]  ? nfs4_call_sync_custom+0x70/0x70 [nfsv4]
>> [   21.665247]  ? get_page_from_freelist+0x24d0/0x45f0
>> [   21.665263]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
>> [   21.665280]  ? _nfs4_proc_getdeviceinfo+0x350/0x350 [nfsv4]
>> [   21.665282]  ? release_pages+0x44b/0xca0
>> [   21.665284]  ? __mod_lruvec_state+0x8f/0x320
>> [   21.665286]  ? pagevec_lru_move_fn+0x18d/0x230
>> [   21.665303]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
>> [   21.665320]  ? nfs4_proc_link+0x1c0/0x1c0 [nfsv4]
>> [   21.665322]  ? add_to_page_cache_locked+0x20/0x20
>> [   21.665339]  nfs_symlink_filler+0xdc/0x190 [nfs]
>> [   21.665341]  do_read_cache_page+0x60e/0x1490
>> [   21.665353]  ? nfs4_do_lookup_revalidate+0x1a1/0x2d0 [nfs]
>> [   21.665365]  ? nfs_get_link+0x370/0x370 [nfs]
>> [   21.665367]  ? xas_load+0x23/0x250
>> [   21.665369]  ? pagecache_get_page+0x760/0x760
>> [   21.665372]  ? lockref_get_not_dead+0xe3/0x1c0
>> [   21.665374]  ? __kasan_check_write+0x14/0x20
>> [   21.665376]  ? lockref_get_not_dead+0xe3/0x1c0
>> [   21.665378]  ? __kasan_check_write+0x14/0x20
>> [   21.665380]  ? _raw_spin_lock+0x7b/0xd0
>> [   21.665382]  ? _raw_write_trylock+0x110/0x110
>> [   21.665384]  read_cache_page+0x4c/0x80
>> [   21.665396]  nfs_get_link+0x75/0x370 [nfs]
>> [   21.665399]  trailing_symlink+0x6fe/0x810
>> [   21.665411]  ? nfs_destroy_readpagecache+0x20/0x20 [nfs]
>> [   21.665413]  path_lookupat.isra.0+0x188/0x7d0
>> [   21.665416]  ? do_syscall_64+0x9f/0x3a0
>> [   21.665418]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [   21.665420]  ? path_parentat.isra.0+0x110/0x110
>> [   21.665423]  ? stack_trace_save+0x94/0xc0
>> [   21.665424]  ? stack_trace_consume_entry+0x170/0x170
>> [   21.665427]  filename_lookup+0x185/0x3b0
>> [   21.665429]  ? nd_jump_link+0x1d0/0x1d0
>> [   21.665431]  ? kasan_slab_free+0xe/0x10
>> [   21.665434]  ? __kasan_check_read+0x11/0x20
>> [   21.665436]  ? __check_object_size+0x249/0x316
>> [   21.665438]  ? strncpy_from_user+0x80/0x290
>> [   21.665440]  ? kmem_cache_alloc+0x180/0x250
>> [   21.665442]  ? getname_flags+0x100/0x520
>> [   21.665444]  user_path_at_empty+0x3a/0x50
>> [   21.665447]  vfs_statx+0xca/0x150
>> [   21.665449]  ? vfs_statx_fd+0x90/0x90
>> [   21.665451]  ? __kasan_slab_free+0x14e/0x180
>> [   21.665453]  __do_sys_newstat+0x9a/0x100
>> [   21.665455]  ? cp_new_stat+0x5d0/0x5d0
>> [   21.665457]  ? __kasan_check_write+0x14/0x20
>> [   21.665459]  ? _raw_spin_lock_irq+0x82/0xe0
>> [   21.665461]  ? _raw_read_lock_irq+0x50/0x50
>> [   21.665464]  ? __blkcg_punt_bio_submit+0x1c0/0x1c0
>> [   21.665466]  ? __kasan_check_write+0x14/0x20
>> [   21.665469]  ? switch_fpu_return+0x13a/0x2d0
>> [   21.665471]  ? fpregs_mark_activate+0x150/0x150
>> [   21.665474]  __x64_sys_newstat+0x54/0x80
>> [   21.665476]  do_syscall_64+0x9f/0x3a0
>> [   21.665478]  ? prepare_exit_to_usermode+0xee/0x1a0
>> [   21.665480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [   21.665482] RIP: 0033:0x7f6e05f5c49a
>> [   21.665485] Code: 00 00 75 05 48 83 c4 18 c3 e8 f2 24 02 00 66 90 f3 0f 1e fa 41 89 f8 48 89 f7 48 89 d6 41 83 f8 01 77 2d b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 c1 a9 0d 00 f7
>> [   21.665486] RSP: 002b:00007fff043e5f18 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
>> [   21.665488] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6e05f5c49a
>> [   21.665489] RDX: 00007fff043e5f20 RSI: 00007fff043e5f20 RDI: 000055af4c4ea5f0
>> [   21.665490] RBP: 000055af4c4ea5f0 R08: 0000000000000001 R09: 0000000000000001
>> [   21.665491] R10: 0000000000000017 R11: 0000000000000246 R12: 000055af4c4eede3
>> [   21.665492] R13: 000055af4c4e81a0 R14: 000055af4c4e6940 R15: 000055af4c4e6990
>> 
>> [   21.665508] Allocated by task 1345:
>> [   21.665532]  save_stack+0x23/0x90
>> [   21.665534]  __kasan_kmalloc.constprop.0+0xcf/0xe0
>> [   21.665536]  kasan_slab_alloc+0xe/0x10
>> [   21.665538]  kmem_cache_alloc+0xd7/0x250
>> [   21.665539]  mempool_alloc_slab+0x17/0x20
>> [   21.665541]  mempool_alloc+0x126/0x330
>> [   21.665558]  rpc_malloc+0x1f2/0x270 [sunrpc]
>> [   21.665574]  call_allocate+0x3b9/0x9d0 [sunrpc]
>> [   21.665591]  __rpc_execute+0x204/0xbd0 [sunrpc]
>> [   21.665607]  rpc_execute+0x1a0/0x1f0 [sunrpc]
>> [   21.665623]  rpc_run_task+0x454/0x5e0 [sunrpc]
>> [   21.665638]  nfs4_call_sync_custom+0x12/0x70 [nfsv4]
>> [   21.665653]  nfs4_call_sync_sequence+0x143/0x1f0 [nfsv4]
>> [   21.665668]  _nfs4_proc_readlink+0x1a6/0x250 [nfsv4]
>> [   21.665684]  nfs4_proc_readlink+0x101/0x2c0 [nfsv4]
>> [   21.665698]  nfs_symlink_filler+0xdc/0x190 [nfs]
>> [   21.665699]  do_read_cache_page+0x60e/0x1490
>> [   21.665701]  read_cache_page+0x4c/0x80
>> [   21.665713]  nfs_get_link+0x75/0x370 [nfs]
>> [   21.665714]  trailing_symlink+0x6fe/0x810
>> [   21.665716]  path_lookupat.isra.0+0x188/0x7d0
>> [   21.665718]  filename_lookup+0x185/0x3b0
>> [   21.665719]  user_path_at_empty+0x3a/0x50
>> [   21.665721]  vfs_statx+0xca/0x150
>> [   21.665723]  __do_sys_newstat+0x9a/0x100
>> [   21.665725]  __x64_sys_newstat+0x54/0x80
>> [   21.665727]  do_syscall_64+0x9f/0x3a0
>> [   21.665729]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> 
>> [   21.665743] Freed by task 0:
>> [   21.665762] (stack is not available)
>> 
>> [   21.665798] The buggy address belongs to the object at ffff8883b6b7cc80
>>               which belongs to the cache rpc_buffers of size 2048
>> [   21.665871] The buggy address is located 1988 bytes inside of
>>               2048-byte region [ffff8883b6b7cc80, ffff8883b6b7d480)
>> [   21.665939] The buggy address belongs to the page:
>> [   21.665970] page:ffffea000edade00 refcount:1 mapcount:0 mapping:ffff88840afecc00 index:0x0 compound_mapcount: 0
>> [   21.666029] flags: 0x17ffffc0010200(slab|head)
>> [   21.666059] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff88840afecc00
>> [   21.666107] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
>> [   21.666152] page dumped because: kasan: bad access detected
>> 
>> [   21.666197] Memory state around the buggy address:
>> [   21.666228]  ffff8883b6b7d380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [   21.666272]  ffff8883b6b7d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [   21.666315] >ffff8883b6b7d480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> [   21.666358]                    ^
>> [   21.666379]  ffff8883b6b7d500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> [   21.666423]  ffff8883b6b7d580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> [   21.666465] ==================================================================
>> [   21.666509] Disabling lock debugging due to kernel taint
>> 
>> 
>> 
> 
> --
> Chuck Lever

--
Chuck Lever





Thread overview: 13+ messages
2020-07-15 14:48 [Regression] "SUNRPC: Add "@len" parameter to gss_unwrap()" breaks NFS Kerberos on upstream stable 5.4.y Kai-Heng Feng
2020-07-15 15:02 ` Chuck Lever
2020-07-15 15:08   ` Kai-Heng Feng
2020-07-15 15:14     ` Chuck Lever
2020-07-15 18:54       ` Chuck Lever
2020-07-16 18:40         ` Pierre Sauter
2020-07-16 19:25           ` Chuck Lever
2020-07-17 17:29             ` Pierre Sauter
2020-07-17 17:34               ` Chuck Lever
2020-07-17 17:56                 ` Kai-Heng Feng
2020-07-17 19:46                   ` Pierre Sauter
2020-07-18 15:55                     ` Chuck Lever
2020-07-20 21:22                       ` Chuck Lever
