Steve, Pavel,

This patch goes on top of Pavel's patch. Maybe it should be merged with
Pavel's patch, since his patch changes the behaviour from "we only send a
close() on an interrupted open()" to "we send a close() on either an
interrupted open() or an interrupted close()", so both the comments and the
log messages are updated.

Additionally, it adds logging of the MID that failed in the case of an
interrupted Open(), so that it is easy to find in wireshark and check
whether that SMB2 file handle was indeed handled by an SMB2_Close() or not.

From testing, Pavel's patch appears to work: when the close() is interrupted
we don't leak handles, as far as I can tell.

We do still have a leak in the Open() case, though. It seems that even
though we set things up and flag the MID to be cancelled, we never actually
end up calling smb2_cancelled_close_fid(), and thus we never send an
SMB2_Close(). I haven't found the root cause yet, but I suspect we mess up
the mid flags or state somewhere. It did work in the past, when Sachin
provided the initial implementation, so I think we have regressed.

I have added a new test, 'cifs/102', to the buildbot that checks for this,
but I have not integrated it into the cifs-testing run yet since we still
fail the test. At least we will not have further regressions once we fix
this and enable the test.

ronnie s

----- Original Message -----
From: "Ronnie Sahlberg"
To: "Pavel Shilovsky"
Cc: "Frank Sorenson", "linux-cifs"
Sent: Wednesday, 13 November, 2019 3:19:11 PM
Subject: Re: A process killed while opening a file can result in leaked open handle on the server

----- Original Message -----
> From: "Pavel Shilovsky"
> To: "Frank Sorenson"
> Cc: "linux-cifs"
> Sent: Wednesday, 13 November, 2019 12:37:38 PM
> Subject: Re: A process killed while opening a file can result in leaked open handle on the server
>
> Tue, 12 Nov 2019 at 11:34, Frank Sorenson :
> >
> > If a process opening a file is killed while waiting for a SMB2_CREATE
> > response from the server, the response may not be handled by the client,
> > leaking an open file handle on the server.
> >
>
> Thanks for reporting the problem.
>
> > This can be reproduced with the following:
> >
> > # mount //vm3/user1 /mnt/vm3 -overs=3,sec=ntlmssp,credentials=/root/.user1_smb_creds
> > # cd /mnt/vm3
> > # echo foo > foo
> >
> > # for i in {1..100} ; do cat foo >/dev/null 2>&1 & sleep 0.0001 ; kill -9 $! ; done
> >
> > (increase the count if necessary -- 100 appears sufficient to cause
> > multiple leaked file handles)
> >
>
> This is a known problem and the client handles it by closing unmatched
> opens (the ones that don't have a corresponding waiting thread)
> immediately. It is indicated by the message you observed: "Close
> unmatched open".
>
> > The client will stop waiting for the response, and will output the
> > following when the response does arrive:
> >
> > CIFS VFS: Close unmatched open
> >
> > On the server side, an open file handle is leaked. If using samba, the
> > following will show these open files:
> >
> > # smbstatus | grep -i Locked -A1000
> > Locked files:
> > Pid    Uid  DenyMode   Access    R/W     Oplock      SharePath    Name  Time
> > --------------------------------------------------------------------------------------------------
> > 25936  501  DENY_NONE  0x80      RDONLY  NONE        /home/user1  .     Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x80      RDONLY  NONE        /home/user1  .     Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x120089  RDONLY  LEASE(RWH)  /home/user1  foo   Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x120089  RDONLY  LEASE(RWH)  /home/user1  foo   Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x120089  RDONLY  LEASE(RWH)  /home/user1  foo   Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x120089  RDONLY  LEASE(RWH)  /home/user1  foo   Tue Nov 12 12:29:24 2019
> > 25936  501  DENY_NONE  0x120089  RDONLY  LEASE(RWH)  /home/user1  foo   Tue Nov 12 12:29:24 2019
>
> I tried it locally myself; it seems that we have another issue, related
> to interrupted close requests:
>
> [1656476.740311] fs/cifs/file.c: Flush inode 0000000078729371 file 000000004d5f9348 rc 0
> [1656476.740313] fs/cifs/file.c: closing last open instance for inode 0000000078729371
> [1656476.740314] fs/cifs/file.c: CIFS VFS: in _cifsFileInfo_put as Xid: 457 with uid: 1000
> [1656476.740315] fs/cifs/smb2pdu.c: Close
> [1656476.740317] fs/cifs/transport.c: signal is pending before sending any data
>
> This returns a -512 error to the upper layer, but we do not save the
> problematic handle anywhere so that we could close it later. The
> op->release() error code is not checked by the VFS anyway.
>
> We should do a similar thing to what we do today for cancelled mids:
> allocate a "struct close_cancelled_open" and queue the lazy work to
> close the handle.
>
> I attached the patch implementing this idea. The patch handles the
> interrupt errors in smb2_close_file, which is called during the close
> system call. It fixed the problem in my setup. At the same time, I
> think I should probably move the handling to the PDU layer function
> SMB2_close() to handle *all* interrupted close requests, including ones
> closing temporary handles. I am wondering if anyone has thoughts about
> it. Anyway, review of the patch is highly appreciated.

I think your patch makes it better. I can see that your patch now fixes the
leak for the Close() case.

But there is still a leak in Open(). I just got one instance where I leaked
one handle. The log shows:

CIFS VFS: \\192.168.124.207 Cancelling wait for mid 25 cmd: 5

and wireshark shows an Open() request/response for this MID, but we never
send a Close() for the handle.

>
> @Frank, could you please test the patch in your environment?
>
> --
> Best regards,
> Pavel Shilovsky
>
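
For reference, below is a rough sketch of the "allocate a struct
close_cancelled_open and queue the lazy work" idea Pavel describes above.
It is only an illustration of the shape of the mechanism, not the in-tree
fs/cifs code: queue_cancelled_close() and close_file_sketch() are made-up
names for the example, and the real field names, refcounting and error
handling around smb2_cancelled_close_fid() may differ.

/*
 * Illustrative sketch only -- not the actual fs/cifs implementation.
 * Assumes the usual cifs-internal types (struct cifs_tcon, struct cifs_fid,
 * cifsiod_wq, SMB2_close()) plus <linux/workqueue.h> and <linux/slab.h>.
 */
struct close_cancelled_open {
        struct cifs_fid         fid;    /* persistent/volatile handle pair */
        struct cifs_tcon        *tcon;  /* reference dropped by the work item */
        struct work_struct      work;
};

static void smb2_cancelled_close_work(struct work_struct *work)
{
        struct close_cancelled_open *cancelled =
                container_of(work, struct close_cancelled_open, work);

        /* Send the SMB2_CLOSE that the interrupted thread never sent. */
        SMB2_close(0, cancelled->tcon, cancelled->fid.persistent_fid,
                   cancelled->fid.volatile_fid);
        cifs_put_tcon(cancelled->tcon);
        kfree(cancelled);
}

/* Hypothetical helper: defer the close of an orphaned handle to a work item. */
static int queue_cancelled_close(struct cifs_tcon *tcon,
                                 __u64 persistent_fid, __u64 volatile_fid)
{
        struct close_cancelled_open *cancelled;

        cancelled = kzalloc(sizeof(*cancelled), GFP_ATOMIC);
        if (!cancelled)
                return -ENOMEM;

        cancelled->fid.persistent_fid = persistent_fid;
        cancelled->fid.volatile_fid = volatile_fid;
        /* The caller must hold a tcon reference for the work item to drop. */
        cancelled->tcon = tcon;
        INIT_WORK(&cancelled->work, smb2_cancelled_close_work);
        queue_work(cifsiod_wq, &cancelled->work);

        return 0;
}

/*
 * Hypothetical call site for the interrupted-close case: if sending the
 * close was interrupted (-512 in the log above is -ERESTARTSYS), queue the
 * deferred close instead of dropping the handle on the floor.
 */
static int close_file_sketch(const unsigned int xid, struct cifs_tcon *tcon,
                             struct cifs_fid *fid)
{
        int rc;

        rc = SMB2_close(xid, tcon, fid->persistent_fid, fid->volatile_fid);
        if (rc == -ERESTARTSYS || rc == -EINTR)
                rc = queue_cancelled_close(tcon, fid->persistent_fid,
                                           fid->volatile_fid);
        return rc;
}

For the interrupted Open() case, the equivalent work item would be queued
from the cancelled-mid callback path (the smb2_cancelled_close_fid() path
mentioned at the top of this mail), which is the part that currently does
not seem to fire.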