* DFS tests failing in buildbot
@ 2022-03-08 11:38 Shyam Prasad N
2022-03-08 15:26 ` Paulo Alcantara
From: Shyam Prasad N @ 2022-03-08 11:38 UTC
To: CIFS, ronnie sahlberg, Pavel Shilovsky, Steve French, Paulo Alcantara
Hi,
Once every few runs, we see the DFS tests failing in buildbot.
I did some digging into this, and here's my conclusion.
Please let me know if you can point out some issue with the root cause
or the fix.
There is a race between cifsd and the I/O threads when the TCP
connection breaks. The cifsd thread marks the server/session/tcon
structures for reconnect, recreates the socket, and resets this
server to 1 credit. The count stays at 1 until the next
negotiate/session setup completes and the server can get more credits.
During this window, any ongoing I/O that requires more than 1 credit
will return with smb3_insufficient_credits (note that slightly
earlier in the same code, we identify the reconnect with
smb3_reconnect_detected, but do nothing about it). The I/O will then
leak -EHOSTDOWN or -EAGAIN into userspace.
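To make the window concrete, here is a rough userspace model of it. This is only a sketch: struct sim_server and the sim_* helpers are hypothetical names, not the cifs.ko structures or functions, and returning -EAGAIN for the short-credit case is an assumption taken from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the per-server state (hypothetical, not cifs.ko). */
struct sim_server {
	int credits;          /* credits currently available */
	bool need_reconnect;  /* set when the connection is found broken */
};

/* What the "cifsd" side does on a broken connection: drop to 1 credit. */
static void sim_mark_reconnect(struct sim_server *s)
{
	s->need_reconnect = true;
	s->credits = 1;
}

/* Negotiate/session setup finished; the granted count is made up. */
static void sim_session_setup_done(struct sim_server *s, int granted)
{
	assert(granted >= 1);
	s->need_reconnect = false;
	s->credits = granted;
}

/* I/O side: take `needed` credits, or fail with a negative errno. */
static int sim_take_credits(struct sim_server *s, int needed)
{
	if (s->credits < needed)
		return -EAGAIN;  /* assumption: the error seen in the window */
	s->credits -= needed;
	return 0;
}
```

In this model, a request needing 3 credits fails between sim_mark_reconnect() and sim_session_setup_done(), while a 1-credit request still goes through.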
I feel that we should return a special error (-ERESTARTSYS?) when we
hit smb3_reconnect_detected, and use this errno to ask the caller to
restart the syscall.
Ronnie/Pavel/Paulo: Please let me know what you think about this.
--
Regards,
Shyam
* Re: DFS tests failing in buildbot
2022-03-08 11:38 DFS tests failing in buildbot Shyam Prasad N
@ 2022-03-08 15:26 ` Paulo Alcantara
From: Paulo Alcantara @ 2022-03-08 15:26 UTC
To: Shyam Prasad N, CIFS, ronnie sahlberg, Pavel Shilovsky, Steve French
Shyam Prasad N <nspmangalore@gmail.com> writes:
> There is a race between cifsd and the I/O threads when the TCP
> connection breaks. The cifsd thread marks the server/session/tcon
> structures for reconnect, recreates the socket, and resets this
> server to 1 credit. The count stays at 1 until the next
> negotiate/session setup completes and the server can get more credits.
> During this window, any ongoing I/O that requires more than 1 credit
> will return with smb3_insufficient_credits (note that slightly
> earlier in the same code, we identify the reconnect with
> smb3_reconnect_detected, but do nothing about it). The I/O will then
> leak -EHOSTDOWN or -EAGAIN into userspace.
I don't see why returning either -EAGAIN or -EHOSTDOWN back to
userspace on *soft* mounts would be a problem. Isn't this what we want?
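For what it's worth, a test or application on a soft mount could absorb those two errors itself. A minimal sketch, assuming a bounded retry with a fixed delay is acceptable; is_transient_errno() and read_with_retry() are hypothetical helper names, not an existing API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* True for the two errors discussed in this thread. */
static bool is_transient_errno(int err)
{
	return err == EAGAIN || err == EHOSTDOWN;
}

/*
 * Call read() up to max_tries times, sleeping delay_sec between
 * attempts, as long as it keeps failing with a transient errno.
 * Returns what the last read() returned; errno is left as read() set it.
 */
static ssize_t read_with_retry(int fd, void *buf, size_t len,
			       int max_tries, unsigned int delay_sec)
{
	assert(max_tries >= 1);
	for (int attempt = 1; ; attempt++) {
		ssize_t n = read(fd, buf, len);

		if (n >= 0 || !is_transient_errno(errno) ||
		    attempt == max_tries)
			return n;
		sleep(delay_sec);  /* give the reconnect time to finish */
	}
}
```

Whether the buildbot tests should retry like this, or treat the error as a real failure, is a separate question.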
If the syscall gets signaled while we are waiting for the TCP connection
to be reestablished, then we return -ERESTARTSYS. See
wait_event_interruptible_timeout() in smb2_reconnect().
> I feel that we should return a special error (-ERESTARTSYS?) when we
> hit smb3_reconnect_detected, and use this errno to ask the caller to
> restart the syscall.
Userspace doesn't handle -ERESTARTSYS. When we return -ERESTARTSYS from
a signaled syscall, the kernel will either handle the signal and
restart the syscall from the beginning, or return -EINTR back to
userspace.
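The two outcomes can be observed from userspace with a small experiment. This is a sketch only: it uses a blocking read() on an empty pipe as a stand-in for a CIFS syscall, and interrupted_read() and on_alarm() are hypothetical names. The handler writes one byte so that a restarted read() has something to return.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;  /* write end of the pipe, used by the handler */

static void on_alarm(int sig)
{
	(void)sig;
	/* write() is async-signal-safe; the byte lets a restarted read() finish */
	ssize_t ignored = write(wake_fd, "x", 1);
	(void)ignored;
}

/*
 * Block in read() on an empty pipe, interrupt it with SIGALRM after
 * 100 ms, and report what userspace sees: the byte count (>= 0) or a
 * negative errno. `restart` selects SA_RESTART for the handler.
 */
static int interrupted_read(bool restart)
{
	struct sigaction sa;
	struct itimerval tv;
	char c;
	int p[2];
	int rc = pipe(p);

	assert(rc == 0);
	(void)rc;
	wake_fd = p[1];

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = restart ? SA_RESTART : 0;
	sigaction(SIGALRM, &sa, NULL);

	memset(&tv, 0, sizeof(tv));
	tv.it_value.tv_usec = 100000;  /* one-shot timer, 100 ms */
	setitimer(ITIMER_REAL, &tv, NULL);

	ssize_t n = read(p[0], &c, 1);
	int result = n < 0 ? -errno : (int)n;

	close(p[0]);
	close(p[1]);
	return result;
}
```

On Linux, interrupted_read(false) returns -EINTR and interrupted_read(true) returns 1. In neither case does the caller ever see -ERESTARTSYS itself.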