From: "J. Bruce Fields" <bfields@fieldses.org>
To: NeilBrown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 00/14] SUNRPC: clean up server thread management.
Date: Sun, 21 Nov 2021 19:56:39 -0500	[thread overview]
Message-ID: <20211122005639.GA12035@fieldses.org> (raw)
In-Reply-To: <163753863448.13692.4142092237119935826@noble.neil.brown.name>

On Mon, Nov 22, 2021 at 10:50:34AM +1100, NeilBrown wrote:
> On Thu, 18 Nov 2021, J. Bruce Fields wrote:
> > On Wed, Nov 17, 2021 at 11:46:49AM +1100, NeilBrown wrote:
> > > I have a dream of making nfsd threads start and stop dynamically.
> > 
> > It's a good dream!
> > 
> > I haven't had a chance to look at these at all yet, I just kicked off
> > tests to run overnight, and woke up to the below.
> > 
> > This happened on the client, probably the first time it attempted to do
> > an nfsv4 mount, so something went wrong with setup of the callback
> > server.
> 
> I cannot reproduce this and cannot see any way it could possibly happen.

Huh.  Well, it's possible I mixed up the results somehow.  I'll see if I
can reproduce tonight or tomorrow.

> Could you please confirm the patches were applied on a vanilla 5.16-rc1
> kernel, and that you don't have the "pool_mode" module parameter set.

/sys/module/sunrpc/parameters/pool_mode is "global", the default.

> As I said, serv->sv_nrpools is zero, so either it hasn't been set, or it
> was set but the 'serv' has been cleared and freed (or freed, reallocated
> and cleared), or it was set to zero.
> 
> svc_pool_map_get() doesn't explicitly protect against npools==0 (maybe
> it should), only npools < 0.  But even without that I cannot see it ever
> setting ->npools to zero.
> 
> I have changed refcounting, so maybe something could get freed early,
> but all the changes I made happen *before* the point in the code where
> it is crashing.
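> 
> For anyone following along: the pool lookup boils down to a modulo on
> the pool count, so sv_nrpools == 0 produces exactly the "divide error"
> in the oops below.  A simplified userspace sketch (illustrative only,
> not the actual net/sunrpc/svc.c code; the struct fields here are pared
> down and the NULL-return guard is the check being discussed, not what
> the kernel currently does):
> 
> 	#include <stddef.h>
> 
> 	struct svc_pool { int sp_id; };
> 
> 	struct svc_serv {
> 		unsigned int sv_nrpools;
> 		struct svc_pool *sv_pools;
> 	};
> 
> 	static struct svc_pool *pool_for_cpu(struct svc_serv *serv, int cpu)
> 	{
> 		/* the guard svc_pool_map_get() arguably ought to imply */
> 		if (serv->sv_nrpools == 0)
> 			return NULL;	/* otherwise: cpu % 0 => #DE on x86 */
> 		return &serv->sv_pools[cpu % serv->sv_nrpools];
> 	}
> 
> With sv_nrpools == 4 and cpu == 5 this selects pool 1; with
> sv_nrpools == 0 the unguarded modulo would fault, which matches the
> RIP landing on a div instruction in svc_pool_for_cpu+0xc7.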

OK, I'll take another look and let you know....

--b.

> 
> So I'm thoroughly perplexed.
> 
> Thanks,
> NeilBrown
> 
> 
> > 
> > --b.
> > 
> > [  285.585061] divide error: 0000 [#1] PREEMPT SMP KASAN PTI
> > [  285.585754] CPU: 0 PID: 5864 Comm: mount.nfs Not tainted 5.16.0-rc1-00014-g659e13af1f87 #1017
> > [  285.586828] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> > [  285.587828] RIP: 0010:svc_pool_for_cpu+0xc7/0x1b0 [sunrpc]
> > [  285.588501] Code: 8b ab f0 00 00 00 48 ba 00 00 00 00 00 fc ff df 48 89 f9 48 c1 e9 03 0f b6 14 11 84 d2 74 09 80 fa 03 0f 8e 8d 00 00 00 31 d2 <f7> b3 e8 00 00 00 48 83 c4 08 5b 48 8d 04 52 48 c1 e0 06 48 01 e8
> > [  285.590820] RSP: 0018:ffff88801526f8f8 EFLAGS: 00010246
> > [  285.591418] RAX: 0000000000000000 RBX: ffff88800db3bc00 RCX: 1ffff11001b6779d
> > [  285.592267] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88800db3bce8
> > [  285.593145] RBP: 0000000000000010 R08: 0000000000000001 R09: ffff888014b7403f
> > [  285.594057] R10: ffffed100296e807 R11: 0000000000000001 R12: ffff888014b74038
> > [  285.594940] R13: ffff888014b74010 R14: ffff888014b74000 R15: ffff88800db3bc00
> > [  285.595826] FS:  00007f489f68a440(0000) GS:ffff88806d400000(0000) knlGS:0000000000000000
> > [  285.596851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  285.597578] CR2: 00007f2dffa0b198 CR3: 000000000c486003 CR4: 0000000000170ef0
> > [  285.598510] Call Trace:
> > [  285.598824]  <TASK>
> > [  285.599097]  svc_xprt_do_enqueue+0x164/0x900 [sunrpc]
> > [  285.599767]  svc_xprt_received+0x181/0x3a0 [sunrpc]
> > [  285.600411]  _svc_create_xprt+0x2bd/0x740 [sunrpc]
> > [  285.601049]  ? svc_add_new_perm_xprt+0x140/0x140 [sunrpc]
> > [  285.601787]  ? lock_release+0x3b8/0x6d0
> > [  285.602318]  ? nfs_callback_up+0x7ad/0xdb0 [nfsv4]
> > [  285.603617]  svc_create_xprt+0x36/0x90 [sunrpc]
> > [  285.604306]  nfs_callback_up+0x81f/0xdb0 [nfsv4]
> > [  285.604972]  nfs4_init_client+0x1db/0x450 [nfsv4]
> > [  285.605605]  ? nfs41_init_client+0x70/0x70 [nfsv4]
> > [  285.606304]  nfs4_set_client+0x25f/0x410 [nfsv4]
> > [  285.606912]  ? nfs4_add_trunk.isra.0+0x280/0x280 [nfsv4]
> > [  285.607606]  nfs4_create_server+0x5f0/0xda0 [nfsv4]
> > [  285.608250]  ? lock_is_held_type+0xd7/0x130
> > [  285.608786]  ? nfs4_server_common_setup+0x670/0x670 [nfsv4]
> > [  285.609505]  ? __module_get+0x47/0x60
> > [  285.610077]  nfs4_try_get_tree+0xd3/0x250 [nfsv4]
> > [  285.610690]  vfs_get_tree+0x8a/0x2d0
> > [  285.611152]  path_mount+0x3f9/0x19e0
> > [  285.611608]  ? debug_check_no_obj_freed+0x1f3/0x3c0
> > [  285.612227]  ? lock_is_held_type+0xd7/0x130
> > [  285.612757]  ? finish_automount+0x8c0/0x8c0
> > [  285.613281]  ? user_path_at_empty+0x45/0x50
> > [  285.613832]  ? rcu_read_lock_sched_held+0x3f/0x70
> > [  285.614456]  ? kmem_cache_free+0xd9/0x1b0
> > [  285.614965]  __x64_sys_mount+0x1d6/0x240
> > [  285.615455]  ? path_mount+0x19e0/0x19e0
> > [  285.615941]  ? syscall_enter_from_user_mode+0x1d/0x50
> > [  285.616572]  do_syscall_64+0x43/0x90
> > [  285.617043]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  285.617693] RIP: 0033:0x7f489fd4182e
> > [  285.618206] Code: 48 8b 0d 4d 16 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1a 16 0c 00 f7 d8 64 89 01 48
> > [  285.620595] RSP: 002b:00007ffdc3bdd5b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> > [  285.621532] RAX: ffffffffffffffda RBX: 00007ffdc3bdd750 RCX: 00007f489fd4182e
> > [  285.622492] RDX: 000055da46c0a510 RSI: 000055da46c0a550 RDI: 000055da46c0c2f0
> > [  285.623372] RBP: 00007ffdc3bdd750 R08: 000055da46c0d050 R09: 0037332e3232312e
> > [  285.624271] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > [  285.625158] R13: 00007ffdc3bdd660 R14: 000055da46c0ce00 R15: 000055da46c0ce90
> > [  285.626097]  </TASK>
> > [  285.626381] Modules linked in: nfsv4 rpcsec_gss_krb5 nfsv3 nfs_acl nfs lockd grace auth_rpcgss sunrpc
> > [  285.627622] ---[ end trace 0ea273cc87891325 ]---
> > [  285.628222] RIP: 0010:svc_pool_for_cpu+0xc7/0x1b0 [sunrpc]
> > [  285.628945] Code: 8b ab f0 00 00 00 48 ba 00 00 00 00 00 fc ff df 48 89 f9 48 c1 e9 03 0f b6 14 11 84 d2 74 09 80 fa 03 0f 8e 8d 00 00 00 31 d2 <f7> b3 e8 00 00 00 48 83 c4 08 5b 48 8d 04 52 48 c1 e0 06 48 01 e8
> > [  285.631830] RSP: 0018:ffff88801526f8f8 EFLAGS: 00010246
> > [  285.632557] RAX: 0000000000000000 RBX: ffff88800db3bc00 RCX: 1ffff11001b6779d
> > [  285.634319] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88800db3bce8
> > [  285.635430] RBP: 0000000000000010 R08: 0000000000000001 R09: ffff888014b7403f
> > [  285.636408] R10: ffffed100296e807 R11: 0000000000000001 R12: ffff888014b74038
> > [  285.637369] R13: ffff888014b74010 R14: ffff888014b74000 R15: ffff88800db3bc00
> > [  285.638346] FS:  00007f489f68a440(0000) GS:ffff88806d400000(0000) knlGS:0000000000000000
> > [  285.639434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  285.640233] CR2: 00007f2dffa0b198 CR3: 000000000c486003 CR4: 0000000000170ef0
> > [  285.641194] note: mount.nfs[5864] exited with preempt_count 1
> > [  562.003788] kworker/dying (773) used greatest stack depth: 23128 bytes left
> > [ 1356.888419] clocksource: timekeeping watchdog on CPU1: acpi_pm retried 2 times before success
> > [ 2396.888656] clocksource: timekeeping watchdog on CPU1: acpi_pm retried 2 times before success
> > [ 3071.387007] clocksource: timekeeping watchdog on CPU0: acpi_pm retried 2 times before success
> > [ 3074.395010] clocksource: timekeeping watchdog on CPU0: acpi_pm retried 2 times before success
> > [ 3082.395298] clocksource: timekeeping watchdog on CPU0: acpi_pm retried 2 times before success
> > [ 5736.389488] clocksource: timekeeping watchdog on CPU0: acpi_pm retried 2 times before success
> > [root@test3 ~]# uname -a
> > Linux test3.fieldses.org 5.16.0-rc1-00014-g659e13af1f87 #1017 SMP PREEMPT Tue Nov 16 20:51:49 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
> > 
> > 


Thread overview: 27+ messages
2021-11-17  0:46 [PATCH 00/14] SUNRPC: clean up server thread management NeilBrown
2021-11-17  0:46 ` [PATCH 03/14] NFSD: narrow nfsd_mutex protection in nfsd thread NeilBrown
2021-11-17  0:46 ` [PATCH 02/14] nfsd: make nfsd_stats.th_cnt atomic_t NeilBrown
2021-11-17  0:46 ` [PATCH 01/14] SUNRPC: stop using ->sv_nrthreads as a refcount NeilBrown
2021-11-17  0:46 ` [PATCH 14/14] lockd: use svc_set_num_threads() for thread start and stop NeilBrown
2021-11-17  0:46 ` [PATCH 10/14] lockd: move lockd_start_svc() call into lockd_create_svc() NeilBrown
2021-11-17  0:46 ` [PATCH 06/14] SUNRPC: discard svo_setup and rename svc_set_num_threads_sync() NeilBrown
2021-11-17  0:46 ` [PATCH 05/14] NFSD: Make it possible to use svc_set_num_threads_sync NeilBrown
2021-11-17  0:46 ` [PATCH 08/14] lockd: introduce nlmsvc_serv NeilBrown
2021-11-17  0:46 ` [PATCH 12/14] lockd: introduce lockd_put() NeilBrown
2021-11-17  0:46 ` [PATCH 09/14] lockd: simplify management of network status notifiers NeilBrown
2021-11-17  0:46 ` [PATCH 11/14] lockd: move svc_exit_thread() into the thread NeilBrown
2021-11-17  0:46 ` [PATCH 07/14] NFSD: simplify locking for network notifier NeilBrown
2021-11-17  0:46 ` [PATCH 04/14] SUNRPC: use sv_lock to protect updates to sv_nrthreads NeilBrown
2021-11-17  0:46 ` [PATCH 13/14] lockd: rename lockd_create_svc() to lockd_get() NeilBrown
2021-11-17 14:12 ` [PATCH 00/14] SUNRPC: clean up server thread management J. Bruce Fields
2021-11-19  3:24   ` NeilBrown
2021-11-21 23:50   ` NeilBrown
2021-11-22  0:56     ` J. Bruce Fields [this message]
2021-11-22  0:59       ` J. Bruce Fields
2021-11-22  1:13         ` NeilBrown
2021-11-22  2:37           ` J. Bruce Fields
2021-11-22  3:38             ` J. Bruce Fields
2021-11-22 15:18               ` J. Bruce Fields
2021-11-22 23:25                 ` NeilBrown
2021-11-17 17:12 ` Chuck Lever III
2021-11-19  3:19   ` NeilBrown
