DFS tests failing in buildbot

* DFS tests failing in buildbot
@ 2022-03-08 11:38 Shyam Prasad N
  2022-03-08 15:26 ` Paulo Alcantara
  0 siblings, 1 reply; 2+ messages in thread
From: Shyam Prasad N @ 2022-03-08 11:38 UTC
  To: CIFS, ronnie sahlberg, Pavel Shilovsky, Steve French, Paulo Alcantara

Hi,

Once every few runs, we see the DFS tests failing in buildbot.
I did some digging into this, and here's my conclusion.
Please let me know if you can point out some issue with the root cause
or the fix.

There is a race between cifsd and the I/O threads when the TCP
connection breaks. The cifsd thread marks the server/session/tcon
structures for reconnect, recreates the socket, and sets the server's
credits to 1. That only changes once the next negotiate/session-setup
completes, after which it can get more credits. During this window,
any ongoing I/O that needs more than 1 credit returns with
smb3_insufficient_credits (note that slightly earlier in the same
code we identify the reconnect with smb3_reconnect_detected, but do
nothing about it). The I/O then leaks -EHOSTDOWN or -EAGAIN into
userspace.
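
To make the window concrete, here is a rough userspace model of it. This
is only a sketch: the struct and the model_* functions are made up for
illustration and are not the fs/cifs code, and -EAGAIN merely stands in
for whichever error ends up leaking.

```c
#include <errno.h>

/* Hypothetical model of the per-server state; not the real kernel struct. */
struct model_server {
	int credits;             /* credits currently available */
	unsigned int reconnects; /* bumped each time the socket is recreated */
};

/* What cifsd does when the connection breaks, as described above. */
static void model_cifsd_reconnect(struct model_server *srv)
{
	srv->reconnects++;
	srv->credits = 1;
}

/* Negotiate/session-setup has completed; more credits are available. */
static void model_session_setup_done(struct model_server *srv, int granted)
{
	srv->credits = granted;
}

/*
 * An I/O thread asks for `needed` credits. `seen_reconnects` is the
 * reconnect count it sampled when the request was built.
 */
static int model_wait_for_credits(struct model_server *srv,
				  unsigned int seen_reconnects, int needed)
{
	int reconnect_seen = (srv->reconnects != seen_reconnects);

	(void)reconnect_seen; /* noticed, but not acted upon */

	if (srv->credits < needed)
		return -EAGAIN; /* stand-in for the error that leaks */

	srv->credits -= needed;
	return 0;
}
```

In this model, a 4-credit request issued between model_cifsd_reconnect()
and model_session_setup_done() fails, while the same request succeeds
once session setup has completed.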

I feel that we should return a special error (-ERESTARTSYS?) when
smb3_reconnect_detected is hit, and use this errno to ask the caller
to restart the syscall.
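
A rough userspace sketch of the idea, to show the intended control flow
only. Everything named sketch_* is hypothetical, SKETCH_ERESTARTSYS just
mirrors the kernel-internal value 512 (userspace <errno.h> does not
export ERESTARTSYS), and the retry loop is a stand-in for the syscall
being restarted, not how the kernel would actually do it.

```c
#include <errno.h>

/* Kernel-internal ERESTARTSYS is 512; redefined here for the model only. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical model of the per-server state; not the real kernel struct. */
struct sketch_server {
	int credits;             /* credits currently available */
	unsigned int reconnects; /* bumped each time the socket is recreated */
};

/* Check for a reconnect first and report it, instead of ignoring it. */
static int sketch_get_credits(struct sketch_server *srv,
			      unsigned int seen_reconnects, int needed)
{
	if (srv->reconnects != seen_reconnects)
		return -SKETCH_ERESTARTSYS; /* ask the caller to restart */

	if (srv->credits < needed)
		return -EAGAIN;

	srv->credits -= needed;
	return 0;
}

/* Models negotiate/session-setup completing after a reconnect. */
static void sketch_session_setup(struct sketch_server *srv, int granted)
{
	srv->credits = granted;
}

/* Stand-in for the restarted syscall: redo the request after a reconnect. */
static int sketch_do_io(struct sketch_server *srv, unsigned int seen_reconnects,
			int needed, int max_restarts)
{
	int rc = -SKETCH_ERESTARTSYS;

	for (int attempt = 0; attempt <= max_restarts; attempt++) {
		rc = sketch_get_credits(srv, seen_reconnects, needed);
		if (rc != -SKETCH_ERESTARTSYS)
			return rc;

		/* Restart path: session is set up again, then resample. */
		sketch_session_setup(srv, 64);
		seen_reconnects = srv->reconnects;
	}
	return rc;
}
```

The point of the sketch is that the reconnect case is separated from a
genuine credit shortage, so only the former triggers a restart.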

Ronnie/Pavel/Paulo: Please let me know what you think about this.

-- 
Regards,
Shyam
