Problem with providing implementation id in NFSv4.1

From: NeilBrown @ 2022-07-07  1:13 UTC
  To: Trond Myklebust, Anna Schumaker, linux-nfs

In NFSv4.1, when we send EXCHANGE_ID to a new server (possibly a pNFS
Data Server that we haven't talked to before), we send an implementation
id by default.  This is built from several fields obtained from
utsname().  utsname() depends on current->nsproxy, and will crash if
that is NULL.

When a process exits it calls, among other things,

	exit_task_namespaces(tsk);
	exit_task_work(tsk);

exit_task_namespaces() sets ->nsproxy to NULL.
exit_task_work() then runs delayed work items, including fput() on all
files that were still open when the process exited.  For NFS, this
causes any pending writes to be flushed.

So if a process writes to a file on a pNFS server and then exits, and the
MDS tells the client to send the data to a DS which it hasn't established
a connection with before, the client will crash in encode_exchange_id().
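
To make the ordering concrete, here is a small userspace model of the
sequence described above.  It is only a sketch: every name ending in
_model (or starting with model_) is hypothetical and stands in for the
kernel code; none of it is the real do_exit().

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct nsproxy. */
struct nsproxy_model {
	int unused;
};

/* Hypothetical stand-in for the exiting task. */
struct task_model {
	struct nsproxy_model *nsproxy;
	int pending_flush;		/* a deferred fput() with dirty NFS data */
	int flush_saw_null_nsproxy;	/* set when the flush ran without nsproxy */
};

/* Models exit_task_namespaces(): the task loses its nsproxy. */
static void model_exit_task_namespaces(struct task_model *t)
{
	t->nsproxy = NULL;
}

/* Models exit_task_work(): deferred work, including the flush, runs now. */
static void model_exit_task_work(struct task_model *t)
{
	if (t->pending_flush) {
		/* This is the point where utsname() would be dereferenced. */
		t->flush_saw_null_nsproxy = (t->nsproxy == NULL);
		t->pending_flush = 0;
	}
}

/* Same call order as quoted from do_exit() above. */
static void model_do_exit(struct task_model *t)
{
	model_exit_task_namespaces(t);
	model_exit_task_work(t);
}
```

With a task that still has a pending flush, model_do_exit() always ends
with flush_saw_null_nsproxy set, which is the situation in which
encode_exchange_id() would dereference a NULL ->nsproxy.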

That order of calls in do_exit() is deliberate, so we cannot swap them; see
commit 8aac62706ada ("move exit_task_namespaces() outside of exit_notify()")

The options that I can see are:
1/ generate the implementation-id string at mount time and keep it
   around much like we do for cl_owner_id
2/ Check current->nsproxy in encode_exchange_id() and skip the
   implementation id if ->nsproxy is not available.
   Note that there is no risk of a race when testing ->nsproxy.
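
A rough userspace sketch of the two options, to show the shape of each.
All the types and function names here are hypothetical, and the choice
and layout of the utsname fields in the string is an assumption made for
illustration; this is not the actual encode_exchange_id() code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures. */
struct uts_model {
	const char *sysname, *release, *version, *machine;
};
struct nsproxy_model {
	const struct uts_model *uts;
};
struct task_model {
	const struct nsproxy_model *nsproxy;
};

#define IMPL_NAME_MAX 128

/* Hypothetical per-client state, analogous to where cl_owner_id lives. */
struct client_model {
	char impl_name[IMPL_NAME_MAX];
};

/* Format the id string; returns its length, or 0 if it does not fit. */
static size_t format_impl_name(const struct uts_model *u, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s %s %s %s",
			 u->sysname, u->release, u->version, u->machine);

	if (n < 0 || (size_t)n >= len) {
		if (len)
			buf[0] = '\0';
		return 0;
	}
	return (size_t)n;
}

/*
 * Option 1: build the string once at mount time, while the mounting
 * task certainly has an nsproxy, and keep it with the client.
 */
static size_t client_init_impl_name(struct client_model *clp,
				    const struct task_model *mounter)
{
	return format_impl_name(mounter->nsproxy->uts,
				clp->impl_name, sizeof(clp->impl_name));
}

/*
 * Option 2: build the string at encode time, but skip it (return 0)
 * when the current task no longer has an nsproxy.
 */
static size_t encode_impl_name(const struct task_model *cur,
			       char *buf, size_t len)
{
	if (len)
		buf[0] = '\0';
	if (cur->nsproxy == NULL)
		return 0;
	return format_impl_name(cur->nsproxy->uts, buf, len);
}
```

In this model, option 1 never looks at the current task when encoding, so
it still sends the id from an exiting process, while option 2 simply
omits the id in that case.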

Does anyone have a strong opinion on which is best?  I'm inclined to
go with '2', but mostly because it is less coding.

Thanks,
NeilBrown
