* cifs-rdma: KASAN-detected UAF when using rxe driver
@ 2023-01-24 17:48 David Howells
2023-01-25 7:48 ` David Howells
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2023-01-24 17:48 UTC (permalink / raw)
To: Steve French
Cc: dhowells, Shyam Prasad N, Rohith Surabattula, Tom Talpey,
Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
Hi Steve,
I was trying to test cifs rdma and KASAN detected a UAF when using the
softRoCE RDMA driver (rxe):
BUG: KASAN: use-after-free in smbd_reconnect (fs/cifs/smbdirect.c:1427)
if (server->smbd_conn->transport_status == SMBD_CONNECTED) {
I've attached the oops log below. This is with v6.2-rc5 with no additional
patches. One thing I'm wondering is if smbd_destroy() should clear
server->smbd_conn before returning since it kfrees the smbd_connection struct
that that was pointing to.
The commands I was using:
rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
cd ~/xfstests-dev; ./check generic/001
The xfstests config:
FSTYP=cifs
TEST_DEV=//carina/test
TEST_DIR=/xfstest.test
TEST_FS_MOUNT_OPTS='-ousername=shares,password=foobar,vers=3.1.1,rdma'
export MOUNT_OPTIONS='-ousername=shares,password=foobar,vers=3.1.1,rdma'
export SCRATCH_DEV=//carina/scratch
export SCRATCH_MNT=/xfstest.scratch
The mounted filesystem:
//carina/test /xfstest.test cifs rw,context=system_u:object_r:root_t:s0,relatime,vers=3.1.1,cache=strict,username=shares,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.6.1,rdma,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=524224,wsize=524224,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=5 0 0
It's talking to ksmbd on carina.
David
---
infiniband rxe0: set active
infiniband rxe0: added enp6s0
RDS/IB: rxe0: added
CIFS: Attempting to mount \\carina\test
CIFS: VFS: RDMA transport established
CIFS: Attempting to mount \\carina\scratch
CIFS: Attempting to mount \\carina\scratch
run fstests generic/001 at 2023-01-24 17:31:24
CIFS: VFS: smbd_recv_buf:1887 disconnected
==================================================================
BUG: KASAN: use-after-free in smbd_reconnect+0xba/0x1a9
Read of size 4 at addr ffff888119014000 by task cifsd/4963
CPU: 0 PID: 4963 Comm: cifsd Not tainted 6.2.0-rc5-build2 #729
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4c/0x5f
print_address_description.constprop.0+0x80/0x2b2
print_report+0x10f/0x1f2
? __virt_addr_valid+0xcd/0x113
? smbd_reconnect+0xba/0x1a9
? smbd_reconnect+0xba/0x1a9
kasan_report+0x88/0xa7
? smbd_reconnect+0xba/0x1a9
smbd_reconnect+0xba/0x1a9
__cifs_reconnect+0x4ca/0x637
? cifs_mark_tcp_ses_conns_for_reconnect+0x20a/0x20a
? __raw_spin_lock_init+0x83/0x83
? cifs_readv_from_socket+0x28f/0x2e6
? cifs_readv_from_socket+0x28f/0x2e6
cifs_readv_from_socket+0x1e7/0x2e6
cifs_read_from_socket+0xb5/0xef
? cifs_readv_from_socket+0x2e6/0x2e6
? mempool_kmalloc+0x11/0x11
? reacquire_held_locks+0x1bb/0x1bb
? memset+0x21/0x3f
cifs_demultiplex_thread+0x19f/0xbae
? cifs_handle_standard+0x277/0x277
? reacquire_held_locks+0x1bb/0x1bb
? __kthread_parkme+0x65/0xe8
? rcu_read_lock_bh_held+0xb1/0xb1
? preempt_count_sub+0x18/0xba
? _raw_spin_unlock_irqrestore+0x39/0x4c
? cifs_handle_standard+0x277/0x277
kthread+0x164/0x173
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 4959:
stack_trace_save+0x8d/0xba
kasan_save_stack+0x1c/0x38
kasan_set_track+0x21/0x26
____kasan_kmalloc+0x69/0x73
_smbd_get_connection+0xcf/0x124c
smbd_get_connection+0x21/0x3e
cifs_get_tcp_session.part.0+0x7f6/0xb87
cifs_mount_get_session+0x53/0x164
cifs_mount+0x8d/0x227
cifs_smb3_do_mount+0x168/0x465
smb3_get_tree+0x55/0x8a
vfs_get_tree+0x43/0x14d
do_new_mount+0x197/0x2b4
path_mount+0x6c7/0x705
do_mount+0x9c/0xdb
__do_sys_mount+0x141/0x16e
do_syscall_64+0x39/0x46
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 4963:
stack_trace_save+0x8d/0xba
kasan_save_stack+0x1c/0x38
kasan_set_track+0x21/0x26
kasan_save_free_info+0x27/0x37
____kasan_slab_free+0xb6/0xd2
__kmem_cache_free+0x93/0xd2
smbd_destroy+0x8da/0x91c
__cifs_reconnect+0x48d/0x637
cifs_readv_from_socket+0x1e7/0x2e6
cifs_read_from_socket+0xb5/0xef
cifs_demultiplex_thread+0x19f/0xbae
kthread+0x164/0x173
ret_from_fork+0x1f/0x30
Last potentially related work creation:
stack_trace_save+0x8d/0xba
kasan_save_stack+0x1c/0x38
__kasan_record_aux_stack+0x5f/0x65
insert_work+0x30/0xaf
__queue_work+0x3cc/0x3ef
queue_work_on+0x4e/0x68
__ib_process_cq+0x228/0x276
ib_poll_handler+0x41/0x14f
irq_poll_softirq+0xd9/0x1ad
__do_softirq+0x201/0x470
Second to last potentially related work creation:
stack_trace_save+0x8d/0xba
kasan_save_stack+0x1c/0x38
__kasan_record_aux_stack+0x5f/0x65
insert_work+0x30/0xaf
__queue_work+0x3cc/0x3ef
queue_work_on+0x4e/0x68
recv_done+0x171/0x714
__ib_process_cq+0x228/0x276
ib_poll_handler+0x41/0x14f
irq_poll_softirq+0xd9/0x1ad
__do_softirq+0x201/0x470
The buggy address belongs to the object at ffff888119014000
which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 0 bytes inside of
4096-byte region [ffff888119014000, ffff888119015000)
The buggy address belongs to the physical page:
page:00000000a28ee5c4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x119014
head:00000000a28ee5c4 order:1 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0
flags: 0x200000000010200(slab|head|node=0|zone=2)
raw: 0200000000010200 ffff888100040900 ffffea0004513490 ffffea0004581e10
raw: 0000000000000000 ffff888119014000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888119013f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888119013f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888119014000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888119014080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888119014100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: cifs-rdma: KASAN-detected UAF when using rxe driver
2023-01-24 17:48 cifs-rdma: KASAN-detected UAF when using rxe driver David Howells
@ 2023-01-25 7:48 ` David Howells
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
1 sibling, 0 replies; 18+ messages in thread
From: David Howells @ 2023-01-25 7:48 UTC (permalink / raw)
Cc: dhowells, Steve French, Shyam Prasad N, Rohith Surabattula,
Tom Talpey, Long Li, Namjae Jeon, Stefan Metzmacher,
Paulo Alcantara, Jeff Layton, linux-cifs
David Howells <dhowells@redhat.com> wrote:
> I was trying to test cifs rdma and KASAN detected a UAF when using the
> softRoCE RDMA driver (rxe):
>
> BUG: KASAN: use-after-free in smbd_reconnect (fs/cifs/smbdirect.c:1427
> if (server->smbd_conn->transport_status == SMBD_CONNECTED) {
Okay, this seems to go back at least to v5.19, so it's been around for a
while.
David
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-24 17:48 cifs-rdma: KASAN-detected UAF when using rxe driver David Howells
2023-01-25 7:48 ` David Howells
@ 2023-01-25 14:02 ` David Howells
2023-01-25 14:47 ` Tom Talpey
` (4 more replies)
1 sibling, 5 replies; 18+ messages in thread
From: David Howells @ 2023-01-25 14:02 UTC (permalink / raw)
To: Steve French
Cc: dhowells, Shyam Prasad N, Rohith Surabattula, Tom Talpey,
Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
Hi Steve,
That attached patch stops the kernel from oopsing, but it still tries
endlessly to send with softRoCE. I'm having better luck with softIWarp - with
some other patches, I can run generic/001 to completion with that transport.
David
---
commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
Author: David Howells <dhowells@redhat.com>
Date: Wed Jan 25 12:42:07 2023 +0000
cifs: Fix oops due to uncleared server->smbd_conn in reconnect
In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
smbd_connection struct that it points to so that reconnection doesn't get
confused.
Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Long Li <longli@microsoft.com>
cc: Steve French <smfrench@gmail.com>
cc: Pavel Shilovsky <pshilov@microsoft.com>
cc: Ronnie Sahlberg <lsahlber@redhat.com>
cc: linux-cifs@vger.kernel.org
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 90789aaa6567..8c816b25ce7c 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
 	destroy_workqueue(info->workqueue);
 	log_rdma_event(INFO, "rdma session destroyed\n");
 	kfree(info);
+	server->smbd_conn = NULL;
 }

 /*
^ permalink raw reply related [flat|nested] 18+ messages in thread
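The pattern behind the one-line fix can be illustrated in userspace. The following is only a sketch under stated assumptions: `struct conn_info`, `struct server_info` and the `demo_*` functions are hypothetical stand-ins invented for illustration, not the kernel's `smbd_connection`, `TCP_Server_Info` or `smbd_destroy()`. It shows why clearing the owner's pointer after freeing lets a later reconnect-style check test for NULL instead of dereferencing freed memory.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
enum transport_status { SMBD_CONNECTED, SMBD_DISCONNECTED };

struct conn_info {
    enum transport_status transport_status;
};

struct server_info {
    struct conn_info *smbd_conn;
};

/* Allocate a connection and attach it to the server; returns 0 on success. */
int demo_connect(struct server_info *server)
{
    struct conn_info *info = calloc(1, sizeof(*info));

    if (!info)
        return -1;
    info->transport_status = SMBD_CONNECTED;
    server->smbd_conn = info;
    return 0;
}

/* Free the connection and clear the owner's pointer so it cannot dangle. */
void demo_destroy(struct server_info *server)
{
    free(server->smbd_conn);
    server->smbd_conn = NULL; /* the kind of step the patch adds */
}

/* Reconnect-style check: only inspect the old connection if it still exists. */
int demo_needs_teardown(const struct server_info *server)
{
    return server->smbd_conn &&
           server->smbd_conn->transport_status == SMBD_CONNECTED;
}
```

Without the assignment in `demo_destroy()`, a call to `demo_needs_teardown()` after destruction would read freed memory, which is the class of access KASAN reported in the trace at the top of the thread.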
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
@ 2023-01-25 14:47 ` Tom Talpey
2023-01-25 15:52 ` Tom Talpey
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Tom Talpey @ 2023-01-25 14:47 UTC (permalink / raw)
To: David Howells, Steve French
Cc: Shyam Prasad N, Rohith Surabattula, Long Li, Namjae Jeon,
Stefan Metzmacher, Jeff Layton, linux-cifs
On 1/25/2023 9:02 AM, David Howells wrote:
> Hi Steve,
>
> That attached patch stops the kernel from oopsing, but it still tries
> endlessly to send with softRoCE. I'm having better luck with softIWarp - with
> some other patches, I can run generic/001 to completion with that transport.
Do you have any logging from the softRoCE runs? I'd suspect some kind
of RDMA-specific scatter/gather overflow which might be server-side
as easily as client-side.
On client, try:
echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
On server:
ksmbd.control -d conn
ksmbd.control -d rdma
> ---
> commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5
> Author: David Howells <dhowells@redhat.com>
> Date: Wed Jan 25 12:42:07 2023 +0000
>
> cifs: Fix oops due to uncleared server->smbd_conn in reconnect
>
> In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
> smbd_connection struct that it points to so that reconnection doesn't get
> confused.
>
> Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Long Li <longli@microsoft.com>
> cc: Steve French <smfrench@gmail.com>
> cc: Pavel Shilovsky <pshilov@microsoft.com>
> cc: Ronnie Sahlberg <lsahlber@redhat.com>
> cc: linux-cifs@vger.kernel.org
>
> diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
> index 90789aaa6567..8c816b25ce7c 100644
> --- a/fs/cifs/smbdirect.c
> +++ b/fs/cifs/smbdirect.c
> @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server)
> 	destroy_workqueue(info->workqueue);
> 	log_rdma_event(INFO, "rdma session destroyed\n");
> 	kfree(info);
> +	server->smbd_conn = NULL;
> }
>
> /*
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect 2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells 2023-01-25 14:47 ` Tom Talpey @ 2023-01-25 15:52 ` Tom Talpey 2023-01-25 16:20 ` Steve French ` (2 subsequent siblings) 4 siblings, 0 replies; 18+ messages in thread From: Tom Talpey @ 2023-01-25 15:52 UTC (permalink / raw) To: David Howells, Steve French Cc: Shyam Prasad N, Rohith Surabattula, Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs Re the one-liner... Acked-by: Tom Talpey <tom@talpey.com> On 1/25/2023 9:02 AM, David Howells wrote: > Hi Steve, > > That attached patch stops the kernel from oopsing, but it still tries > endlessly to send with softRoCE. I'm having better luck with softIWarp - with > some other patches, I can run generic/001 to completion with that transport. > > David > > --- > commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5 > Author: David Howells <dhowells@redhat.com> > Date: Wed Jan 25 12:42:07 2023 +0000 > > cifs: Fix oops due to uncleared server->smbd_conn in reconnect > > In smbd_destroy(), clear the server->smbd_conn pointer after freeing the > smbd_connection struct that it points to so that reconnection doesn't get > confused. 
> > Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection") > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Long Li <longli@microsoft.com> > cc: Steve French <smfrench@gmail.com> > cc: Pavel Shilovsky <pshilov@microsoft.com> > cc: Ronnie Sahlberg <lsahlber@redhat.com> > cc: linux-cifs@vger.kernel.org > > diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c > index 90789aaa6567..8c816b25ce7c 100644 > --- a/fs/cifs/smbdirect.c > +++ b/fs/cifs/smbdirect.c > @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server) > destroy_workqueue(info->workqueue); > log_rdma_event(INFO, "rdma session destroyed\n"); > kfree(info); > + server->smbd_conn = NULL; > } > > /* > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect 2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells 2023-01-25 14:47 ` Tom Talpey 2023-01-25 15:52 ` Tom Talpey @ 2023-01-25 16:20 ` Steve French 2023-01-25 20:41 ` David Howells 2023-01-26 15:20 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells 4 siblings, 0 replies; 18+ messages in thread From: Steve French @ 2023-01-25 16:20 UTC (permalink / raw) To: David Howells Cc: Steve French, Shyam Prasad N, Rohith Surabattula, Tom Talpey, Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs minor cleanup of description and pushed to cifs-2.6.git for-next On Wed, Jan 25, 2023 at 8:05 AM David Howells <dhowells@redhat.com> wrote: > > Hi Steve, > > That attached patch stops the kernel from oopsing, but it still tries > endlessly to send with softRoCE. I'm having better luck with softIWarp - with > some other patches, I can run generic/001 to completion with that transport. > > David > > --- > commit 820cb3802c6a73c54e2e215b674eb5870fd5d0e5 > Author: David Howells <dhowells@redhat.com> > Date: Wed Jan 25 12:42:07 2023 +0000 > > cifs: Fix oops due to uncleared server->smbd_conn in reconnect > > In smbd_destroy(), clear the server->smbd_conn pointer after freeing the > smbd_connection struct that it points to so that reconnection doesn't get > confused. 
> > Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection") > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Long Li <longli@microsoft.com> > cc: Steve French <smfrench@gmail.com> > cc: Pavel Shilovsky <pshilov@microsoft.com> > cc: Ronnie Sahlberg <lsahlber@redhat.com> > cc: linux-cifs@vger.kernel.org > > diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c > index 90789aaa6567..8c816b25ce7c 100644 > --- a/fs/cifs/smbdirect.c > +++ b/fs/cifs/smbdirect.c > @@ -1405,6 +1405,7 @@ void smbd_destroy(struct TCP_Server_Info *server) > destroy_workqueue(info->workqueue); > log_rdma_event(INFO, "rdma session destroyed\n"); > kfree(info); > + server->smbd_conn = NULL; > } > > /* > -- Thanks, Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
` (2 preceding siblings ...)
2023-01-25 16:20 ` Steve French
@ 2023-01-25 20:41 ` David Howells
2023-01-25 22:24 ` Tom Talpey
2023-01-25 22:43 ` David Howells
2023-01-26 15:20 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
4 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2023-01-25 20:41 UTC (permalink / raw)
To: Tom Talpey
Cc: dhowells, Steve French, Shyam Prasad N, Rohith Surabattula,
Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
Hi Tom,
Steve suggested I should ask you about this.
I have IWarp RDMA mostly working with my iteratorisation patches - certainly
better than without them, but I think that's mostly due to the patch that
Stefan Metzmacher so dislikes ("cifs: Fix problem with encrypted RDMA data
read").
However, fallocate doesn't work:
# rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp
# mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
# fallocate -l 1M /xfstest.test/hello
fallocate: fallocate failed: Resource temporarily unavailable
Because smb3_simple_fallocate_write_range() calls SMB2_write(), which calls
cifs_send_recv() then compound_send_recv() and thence to smb_send_rqst().
smb_send_rqst() encrypts the buffer it is given and smbd_send() attempts to
shovel it to the server using Direct Data Placement - which I think might fail
because the data is encrypted.
In one run of the above commands, the data in the kvec array looked like:
fe534d42400001000000000009000a0000000000000000001600000000000000a01300000200
0000000000000000000000000000000000000000000000000000000000000000000000000000
before the smb_send_rqst() gets to ->init_transform_rq() and like:
98eddc1bc31da7c55c00341e4dc769fa4c8b2b0ecdacbad33eb31855ec162fa2458b8437edc7
88ee0a033c84aa857b65ab31ce553594d412719cc3daf925e873e80062ec16b97c855721a42d
after. The encrypted data is seen on the wire in DDP/RDMA packets.
Any thoughts as to how to fix this?
Does it need to pass a flag down to suppress the encryption or suppress the
use of direct data placement? Or should it perhaps go through something like
->write_iter()?
Note also that it encrypts the buffer in place and then
smb3_simple_fallocate_write_range() reuses the buffer multiple times without
clearing it.
I've pushed my cifs iteratorisation patches to:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cifs
I can post them by email a bit later.
David
^ permalink raw reply [flat|nested] 18+ messages in thread
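The buffer-reuse hazard noted at the end of that message can be sketched in isolation. This is a toy model, not the cifs code: `toy_transform_in_place()` is a hypothetical byte-wise addition standing in for a real cipher, and `count_dirty_rounds()` is an invented helper that counts how many times a supposedly zero-filled buffer is handed to the transform while already containing leftover output from the previous round. The `refill` flag models clearing the buffer before each reuse.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an in-place transform; NOT the real SMB3 encryption. */
void toy_transform_in_place(unsigned char *buf, size_t len, unsigned char key)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(buf[i] + key);
}

/*
 * "Send" the same buffer `rounds` times, transforming it in place each time.
 * The caller passes a zero-filled buffer. Returns the number of rounds in
 * which the data handed to the transform was no longer all zeroes.
 * If `refill` is non-zero the buffer is cleared before every round.
 */
size_t count_dirty_rounds(unsigned char *buf, size_t len, size_t rounds,
                          unsigned char key, int refill)
{
    size_t dirty = 0;

    for (size_t r = 0; r < rounds; r++) {
        if (refill)
            memset(buf, 0, len);
        for (size_t i = 0; i < len; i++) {
            if (buf[i] != 0) {
                dirty++;
                break;
            }
        }
        toy_transform_in_place(buf, len, key);
    }
    return dirty;
}
```

With four rounds and no refill, every round after the first starts from transformed data rather than zeroes; with refill enabled, none do. Whether the real fix should be clearing the buffer, transforming a copy, or something else is left open in the thread.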
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 20:41 ` David Howells
@ 2023-01-25 22:24 ` Tom Talpey
2023-01-25 22:43 ` David Howells
1 sibling, 0 replies; 18+ messages in thread
From: Tom Talpey @ 2023-01-25 22:24 UTC (permalink / raw)
To: David Howells
Cc: Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
On 1/25/2023 3:41 PM, David Howells wrote:
> Hi Tom,
>
> Steve suggested I should ask you about this.
>
> I have IWarp RDMA mostly working with my iteratorisation patches - certainly
> better than without them, but I think that's mostly due to the patch that
> Stefan Metzmacher so dislikes ("cifs: Fix problem with encrypted RDMA data
> read").
The encryption problem is real, and Metze is correct. The client
shouldn't be requesting, and the server shouldn't be responding, with
unencrypted messages on encrypted shares.
The problem is, the proper fix is complicated.
- We've reported the issue to Microsoft, but they have not yet said
how the Windows client and server are intended to behave, and they
have not yet revealed how the protocol document will be changed.
At this time, the Linux implementation conforms, dangerously, with
the published spec.
- There is some unexplained behavior in the client when the connection
is lost after failing to decrypt the unencrypted response. In my
earlier look at the traces, for some reason it reconnects and retries
without requesting RDMA. This succeeds, because the "inline" requests
and responses are encrypted and decrypted successfully.
It's interesting that this occurs on a compounded fallocate call.
That might be a clue, too.
What are you trying to test? Since encrypted SMBDirect traffic is known
to have an issue, I guess I'd suggest turning off encryption-by-default
on the share.
I'll poke Microsoft again on the protocol ticket.
Tom.
> However, fallocate doesn't work: > > # rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp > # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma > # fallocate -l 1M /xfstest.test/hello > fallocate: fallocate failed: Resource temporarily unavailable > > Because smb3_simple_fallocate_write_range() calls SMB2_write(), which calls > cifs_send_recv() then compound_send_recv() and thence to smb_send_rqst(). > > smb_send_rqst() encrypts the buffer it is given and smbd_send() attempts to > shovel it to the server using Direct Data Placement - which I think might fail > because the data is encrypted. > > In one run of the above commands, the data in the kvec array looked like: > > fe534d42400001000000000009000a0000000000000000001600000000000000a01300000200 > 0000000000000000000000000000000000000000000000000000000000000000000000000000 > > before the smb_send_rqst() gets to ->init_transform_rq() and like: > > 98eddc1bc31da7c55c00341e4dc769fa4c8b2b0ecdacbad33eb31855ec162fa2458b8437edc7 > 88ee0a033c84aa857b65ab31ce553594d412719cc3daf925e873e80062ec16b97c855721a42d > > after. The encrypted data is seen on the wire in DDP/RDMA packets. > > Any thoughts as to how to fix this? > > Does it need to pass a flag down to suppress the encryption or suppress the > use of direct data placement? Or should it perhaps go through something like > ->write_iter()? > > Note also that it encrypts the buffer in place and then > smb3_simple_fallocate_write_range() reuses the buffer multiple times without > clearing it. > > I've pushed my cifs iteratorisation patches to: > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cifs > > I can post them by email a bit later. > > David > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 20:41 ` David Howells
2023-01-25 22:24 ` Tom Talpey
@ 2023-01-25 22:43 ` David Howells
2023-01-25 22:56 ` Tom Talpey
` (2 more replies)
1 sibling, 3 replies; 18+ messages in thread
From: David Howells @ 2023-01-25 22:43 UTC (permalink / raw)
To: Tom Talpey
Cc: dhowells, Steve French, Shyam Prasad N, Rohith Surabattula,
Long Li, Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
Tom Talpey <tom@talpey.com> wrote:
> What are you trying to test?
I'm trying to make sure my iteratorisation patches work, including with RDMA.
I have some functions to decant some data from an iterator either into a
scatterlist or into an RDMA SGE array without the need to get refs on pages.
> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
> suggest turning off encryption-by-default on the share.
How do I do that? In the ksmbd config?
[global]
smb3 encryption = yes
David
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 22:43 ` David Howells
@ 2023-01-25 22:56 ` Tom Talpey
2023-01-25 23:42 ` Namjae Jeon
2023-01-26 14:42 ` pcap of misbehaving fallocate over cifs rdma David Howells
2 siblings, 0 replies; 18+ messages in thread
From: Tom Talpey @ 2023-01-25 22:56 UTC (permalink / raw)
To: David Howells
Cc: Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
On 1/25/2023 5:43 PM, David Howells wrote:
> Tom Talpey <tom@talpey.com> wrote:
>
>> What are you trying to test?
>
> I'm trying to make sure my iteratorisation patches work, including with RDMA.
> I have some functions to decant some data an iterator either into a
> scatterlist and into an RDMA SGE array without the need to get refs on pages.
Most excellent. Great name for the task too. :)
There are going to be a couple of paths to test eventually. In the
non-encrypted case, the data will be coming down with a rather
different set of sges/segments than after it goes through the
scrambler. Since we're not ready to implement the encrypted
SMBDirect traffic yet, it's best to put off the encrypted path
work/testing, agree?
>> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
>> suggest turning off encryption-by-default on the share.
>
> How do I do that? In the ksmbd config?
>
> [global]
> smb3 encryption = yes
That's definitely needed, but also check that the share stanzas
do not request encryption, as well.
Tom.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 22:43 ` David Howells
2023-01-25 22:56 ` Tom Talpey
@ 2023-01-25 23:42 ` Namjae Jeon
2023-01-26 14:42 ` pcap of misbehaving fallocate over cifs rdma David Howells
2 siblings, 0 replies; 18+ messages in thread
From: Namjae Jeon @ 2023-01-25 23:42 UTC (permalink / raw)
To: David Howells
Cc: Tom Talpey, Steve French, Shyam Prasad N, Rohith Surabattula,
Long Li, Stefan Metzmacher, Jeff Layton, linux-cifs
2023-01-26 7:43 GMT+09:00, David Howells <dhowells@redhat.com>:
> Tom Talpey <tom@talpey.com> wrote:
>
>> What are you trying to test?
>
> I'm trying to make sure my iteratorisation patches work, including with
> RDMA.
> I have some functions to decant some data an iterator either into a
> scatterlist and into an RDMA SGE array without the need to get refs on
> pages.
>
>> Since encrypted SMBDirect traffic is known to have an issue, I guess I'd
>> suggest turning off encryption-by-default on the share.
>
> How do I do that? In the ksmbd config?
>
> [global]
> smb3 encryption = yes
I recently changed the input of the smb3 encryption parameters.
It is "auto" by default. Requests/responses will not be encrypted unless
you give the seal option in the mount options. So please update the
latest ksmbd-tools for your test.
man ksmbd.conf
smb3 encryption (G)
    Client is disallowed, allowed, or required to use SMB3 encryption.
    With smb3 encryption = disabled, SMB3 encryption is disallowed even
    if it is requested by the client. With smb3 encryption = auto, SMB3
    encryption is allowed if it is requested by the client. With smb3
    encryption = mandatory, SMB3 encryption is required. i.e. clients
    that do not support encryption will be denied access to the share.
    Default: smb3 encryption = auto
Thanks.
>
> David
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* pcap of misbehaving fallocate over cifs rdma
2023-01-25 22:43 ` David Howells
2023-01-25 22:56 ` Tom Talpey
2023-01-25 23:42 ` Namjae Jeon
@ 2023-01-26 14:42 ` David Howells
2023-01-26 19:54 ` David Howells
2 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2023-01-26 14:42 UTC (permalink / raw)
To: Tom Talpey, Steve French
Cc: dhowells, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs
[-- Attachment #1: Type: text/plain, Size: 1409 bytes --]
Hi Tom, Steve,
Could you take a look at the attached and see if you can tell me why it's
going wrong? It's a server-side packet capture of:
# rdma link add siw0 type siw netdev enp6s0 # andromeda, softIWarp
# mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
# fallocate -l 1M /xfstest.test/hello
fallocate: fallocate failed: Resource temporarily unavailable
# dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2
1+0 records in
1+0 records out
16384 bytes (16 kB, 16 KiB) copied, 0.108858 s, 151 kB/s
# umount /xfstest.test
I altered the code to only send 16K of data at a time during the fallocate so
that each block should fit within one message, but it fails whilst sending the
first write.
The fallocate starts at frame 74. There's an Ioctl exchange and then it
starts using "DDP/RDMA Send" to shovel data across (the data looks right), but
the server sends a Terminate packet in frame 90 before the client's Send is
complete. The Send completes in frame 92 and the wireshark decoder seems to
like it.
For comparison I also did a DIO write with dd. That starts in frame 125 and
uses a different mechanism (DDP/RDMA Read Request and Read Response) to shovel
the data - and that completes successfully.
I've switched the encryption back to auto, so it's not doing transport
encryption.
Thanks,
David
[-- Attachment #2: cifs-iwarp-falloc.pcap.gz --]
[-- Type: application/gzip, Size: 9177 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pcap of misbehaving fallocate over cifs rdma
2023-01-26 14:42 ` pcap of misbehaving fallocate over cifs rdma David Howells
@ 2023-01-26 19:54 ` David Howells
2023-01-26 20:29 ` Tom Talpey
2023-01-26 20:47 ` David Howells
0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2023-01-26 19:54 UTC (permalink / raw)
To: Steve French
Cc: dhowells, Tom Talpey, Steve French, Shyam Prasad N,
Rohith Surabattula, Long Li, Namjae Jeon, Stefan Metzmacher,
Jeff Layton, linux-cifs
Steve French <smfrench@gmail.com> wrote:
> I am puzzled ... you show the fallocate failing but why do you mention
> it sending data, sending writes
smb3_simple_fallocate_write_range() sends data.
> - when I try the fallocate you pasted above I see what is in the attached
> screenshot go over the network (no writes) - and your example looks like it
> simply doesn't send anything then resets the session at frame 93
Look at frame 92. That's the concluding packet of the write performed by
smb3_simple_fallocate_write_range().
74 4.568861795 192.168.6.2 -> 192.168.6.1 SMB2 250 Ioctl Request FSCTL_QUERY_ALLOCATED_RANGES File: hello
75 4.569429926 192.168.6.1 -> 192.168.6.2 SMB2 242 Ioctl Response FSCTL_QUERY_ALLOCATED_RANGES
77 4.680495774 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
78 4.680496219 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
79 4.680496364 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
80 4.680496552 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
81 4.680496698 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
82 4.680496844 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
83 4.680496989 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
84 4.680497177 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
88 4.680638842 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
89 4.680639016 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
90 4.680704523 192.168.6.1 -> 192.168.6.2 DDP/RDMA 114 5445 > 50018 Terminate [last DDP segment]
91 4.680735089 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
92 4.680735359 192.168.6.2 -> 192.168.6.1 SMB2 946 Write Request Len:16384 Off:204800 File: hello
David
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pcap of misbehaving fallocate over cifs rdma
2023-01-26 19:54 ` David Howells
@ 2023-01-26 20:29 ` Tom Talpey
2023-01-26 20:47 ` David Howells
1 sibling, 0 replies; 18+ messages in thread
From: Tom Talpey @ 2023-01-26 20:29 UTC (permalink / raw)
To: David Howells, Steve French
Cc: Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs

On 1/26/2023 2:54 PM, David Howells wrote:
> Steve French <smfrench@gmail.com> wrote:
>
>> I am puzzled ... you show the fallocate failing but why do you mention
>> it sending data, sending writes
>
> smb3_simple_fallocate_write_range() sends data.
>
>> - when I try the fallocate you pasted above I see what is in the attached
>> screenshot go over the network (no writes) - and your example looks like it
>> simply doesn't send anything then resets the session at frame 93
>
> Look at frame 92. That's the concluding packet of the write performed by
> smb3_simple_fallocate_write_range().
>
> 74 4.568861795 192.168.6.2 -> 192.168.6.1 SMB2 250 Ioctl Request FSCTL_QUERY_ALLOCATED_RANGES File: hello
> 75 4.569429926 192.168.6.1 -> 192.168.6.2 SMB2 242 Ioctl Response FSCTL_QUERY_ALLOCATED_RANGES
> 77 4.680495774 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 78 4.680496219 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 79 4.680496364 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 80 4.680496552 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 81 4.680496698 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 82 4.680496844 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 83 4.680496989 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 84 4.680497177 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 88 4.680638842 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 89 4.680639016 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 90 4.680704523 192.168.6.1 -> 192.168.6.2 DDP/RDMA 114 5445 > 50018 Terminate [last DDP segment]
> 91 4.680735089 192.168.6.2 -> 192.168.6.1 DDP/RDMA 1514 50018 > 5445 Send [more DDP segments]
> 92 4.680735359 192.168.6.2 -> 192.168.6.1 SMB2 946 Write Request Len:16384 Off:204800 File: hello
>

That's a really large SMBDirect Send operation, it looks like it's
trying to send the entire write in one message and it overflows
the receive buffer.

I'm still fighting with wireshark and can't decode the layers
above TCP. Can you look at the SMBDirect negotiation at the
start of the trace, and tell me what the max send/receive
values were set by each side?

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pcap of misbehaving fallocate over cifs rdma
2023-01-26 19:54 ` David Howells
2023-01-26 20:29 ` Tom Talpey
@ 2023-01-26 20:47 ` David Howells
1 sibling, 0 replies; 18+ messages in thread
From: David Howells @ 2023-01-26 20:47 UTC (permalink / raw)
To: Tom Talpey
Cc: dhowells, Steve French, Steve French, Shyam Prasad N,
Rohith Surabattula, Long Li, Namjae Jeon, Stefan Metzmacher,
Jeff Layton, linux-cifs

Tom Talpey <tom@talpey.com> wrote:

> That's a really large SMBDirect Send operation, it looks like it's
> trying to send the entire write in one message and it overflows
> the receive buffer.
>
> I'm still fighting with wireshark and can't decode the layers
> above TCP. Can you look at the SMBDirect negotiation at the
> start of the trace, and tell me what the max send/receive
> values were set by each side?

Frame 8: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 33, Ack: 33, Len: 44
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    NegotiateRequest
        MinVersion: 0x0100
        MaxVersion: 0x0100
        CreditsRequested: 255
        PreferredSendSize: 1364
        MaxReceiveSize: 1364
        MaxFragmentedSize: 1048576

Frame 9: 122 bytes on wire (976 bits), 122 bytes captured (976 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac), Dst: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30)
Internet Protocol Version 4, Src: 192.168.6.1, Dst: 192.168.6.2
Transmission Control Protocol, Src Port: 5445, Dst Port: 50018, Seq: 33, Ack: 77, Len: 56
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    NegotiateResponse
        MinVersion: 0x0100
        MaxVersion: 0x0100
        NegotiatedVersion: 0x0100
        CreditsRequested: 255
        CreditsGranted: 254
        Status: STATUS_SUCCESS (0x00000000)
        MaxReadWriteSize: 524224
        PreferredSendSize: 1364
        MaxReceiveSize: 1364
        MaxFragmentedSize: 173910

Frame 10: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 77, Ack: 89, Len: 44
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    DataMessage
        CreditsRequested: 255
        CreditsGranted: 255
        Flags: 0x0000
            .... .... .... ...0 = ResponseRequested: False
        RemainingLength: 0
        DataOffset: 0
        DataLength: 0

Frame 11: 346 bytes on wire (2768 bits), 346 bytes captured (2768 bits) on interface enp2s0, id 0
Ethernet II, Src: IntelCor_bb:e6:30 (00:1b:21:bb:e6:30), Dst: IntelCor_bb:e6:ac (00:1b:21:bb:e6:ac)
Internet Protocol Version 4, Src: 192.168.6.2, Dst: 192.168.6.1
Transmission Control Protocol, Src Port: 50018, Dst Port: 5445, Seq: 121, Ack: 89, Len: 280
iWARP Marker Protocol data unit Aligned framing
iWARP Direct Data Placement and Remote Direct Memory Access Protocol
SMB-Direct (SMB RDMA Transport)
    DataMessage
        CreditsRequested: 255
        CreditsGranted: 0
        Flags: 0x0000
            .... .... .... ...0 = ResponseRequested: False
        RemainingLength: 0
        DataOffset: 24
        DataLength: 232
SMB2 (Server Message Block Protocol version 2)
    SMB2 Header
        ProtocolId: 0xfe534d42
        Header Length: 64
        Credit Charge: 0
        Channel Sequence: 0
        Reserved: 0000
        Command: Negotiate Protocol (0)
        Credits requested: 10
        Flags: 0x00000000
        Chain Offset: 0x00000000
        Message ID: 0
        Process Id: 0x000013c5
        Tree Id: 0x00000000
        Session Id: 0x0000000000000000
        Signature: 00000000000000000000000000000000
        [Response in: 13]
    Negotiate Protocol Request (0x00)
        [Preauth Hash: 81cd52dea94ed363a171b7effe222c0003574f5c54f6c7a1cbb041676ea9ddf15245b2a4…]
        StructureSize: 0x0024
        Dialect count: 4
        Security mode: 0x01, Signing enabled
        Reserved: 0000
        Capabilities: 0x00000077, DFS, LEASING, LARGE MTU, PERSISTENT HANDLES, DIRECTORY LEASING, ENCRYPTION
        Client Guid: c494649a-e636-d94c-a55e-be00d5a02a30
        NegotiateContextOffset: 0x00000070
        NegotiateContextCount: 4
        Reserved: 0000
        Dialect: SMB 2.1 (0x0210)
        Dialect: SMB 3.0 (0x0300)
        Dialect: SMB 3.0.2 (0x0302)
        Dialect: SMB 3.1.1 (0x0311)
        Negotiate Context: SMB2_PREAUTH_INTEGRITY_CAPABILITIES
            Type: SMB2_PREAUTH_INTEGRITY_CAPABILITIES (0x0001)
            DataLength: 38
            Reserved: 00000000
            HashAlgorithmCount: 1
            SaltLength: 32
            HashAlgorithm: SHA-512 (0x0001)
            Salt: 1d6e14b44264b6cc1db622478c3826c4cd09df1dc70abf73f13b9261724d4181
        Negotiate Context: SMB2_ENCRYPTION_CAPABILITIES
            Type: SMB2_ENCRYPTION_CAPABILITIES (0x0002)
            DataLength: 8
            Reserved: 00000000
            CipherCount: 3
            CipherId: AES-128-GCM (0x0002)
            CipherId: AES-256-GCM (0x0004)
            CipherId: AES-128-CCM (0x0001)
        Negotiate Context: SMB2_NETNAME_NEGOTIATE_CONTEXT_ID
            Type: SMB2_NETNAME_NEGOTIATE_CONTEXT_ID (0x0005)
            DataLength: 22
            Reserved: 00000000
            Netname: 192.168.6.1
        Negotiate Context: SMB2_POSIX_EXTENSIONS_CAPABILITIES
            Type: SMB2_POSIX_EXTENSIONS_CAPABILITIES (0x0100)
            DataLength: 16
            Reserved: 00000000
            POSIX Reserved: 93ad25509cb411e7b42383de968bcd7c

^ permalink raw reply [flat|nested] 18+ messages in thread
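To make the negotiated limits concrete, here is a minimal sketch of the arithmetic behind the "overflows the receive buffer" concern. It is not taken from the cifs or ksmbd sources. The function name `fragment`, the 24-byte header constant (chosen to match "DataOffset: 24" in frame 11) and the 16496-byte message size (a 16384-byte write plus an assumed 112 bytes of SMB2 header and write request) are all illustrative assumptions. Only the MaxReceiveSize of 1364 comes from the capture.

```python
# Assumption: each SMB Direct data message carries a 24-byte header before
# its payload, matching "DataOffset: 24" seen in frame 11 of the capture.
SMBD_DATA_HEADER = 24


def fragment(total_len, max_send_size, header_len=SMBD_DATA_HEADER):
    """Split one upper-layer message into SMB Direct data messages.

    Returns a list of (data_length, remaining_length) pairs, one per send,
    where remaining_length is the number of bytes still to follow.
    """
    payload_max = max_send_size - header_len
    if payload_max <= 0:
        raise ValueError("max_send_size must exceed the header length")
    frags = []
    remaining = total_len
    while remaining > 0:
        n = min(payload_max, remaining)
        remaining -= n
        frags.append((n, remaining))
    return frags


# Hypothetical message: 16384 bytes of write data plus an assumed 112 bytes
# of SMB2 header and write request, sent to a peer with MaxReceiveSize 1364.
frags = fragment(16384 + 112, 1364)
print(len(frags), frags[0], frags[-1])
```

Under these assumptions no single send may carry more than 1340 payload bytes, so a 16 KiB write would need to be spread over more than a dozen data messages rather than going out as one send.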
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
` (3 preceding siblings ...)
2023-01-25 20:41 ` David Howells
@ 2023-01-26 15:20 ` David Howells
2023-01-26 19:22 ` Tom Talpey
2023-01-26 19:49 ` David Howells
4 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2023-01-26 15:20 UTC (permalink / raw)
To: Tom Talpey
Cc: dhowells, Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs

[-- Attachment #1: Type: text/plain, Size: 947 bytes --]

Tom Talpey <tom@talpey.com> wrote:

> Do you have any logging from the softRoCE runs? I'd suspect some
> kind of RDMA-specific scatter/gather overflow which might be
> server-side as easily as client-side.
>
> On client, try:
> echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
>
> On server:
> ksmbd.control -d conn
> ksmbd.control -d rdma

Okay, on -rc5 without my patches, using:

# rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
# mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
# dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2

the dd hangs. I've captured the client and server logging you requested plus
a pcap file on the server (see attached).

Note also I tried md5summing a 1MiB file and that produced a different MD5 sum
each time. I couldn't see enough data being transferred in the pcap to
indicate that that was happening.

David

[-- Attachment #2: Client dmesg log --]
[-- Type: application/gzip, Size: 7196 bytes --]

[-- Attachment #3: Server dmesg log --]
[-- Type: application/gzip, Size: 4350 bytes --]

[-- Attachment #4: Server packet capture --]
[-- Type: application/gzip, Size: 9728 bytes --]

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-26 15:20 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
@ 2023-01-26 19:22 ` Tom Talpey
2023-01-26 19:49 ` David Howells
1 sibling, 0 replies; 18+ messages in thread
From: Tom Talpey @ 2023-01-26 19:22 UTC (permalink / raw)
To: David Howells
Cc: Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs

On 1/26/2023 10:20 AM, David Howells wrote:
> Tom Talpey <tom@talpey.com> wrote:
>
>> Do you have any logging from the softRoCE runs? I'd suspect some
>> kind of RDMA-specific scatter/gather overflow which might be
>> server-side as easily as client-side.
>>
>> On client, try:
>> echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
>>
>> On server:
>> ksmbd.control -d conn
>> ksmbd.control -d rdma
>
> Okay, on -rc5 without my patches, using:
>
> # rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
> # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
> # dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2
>
> the dd hangs. I've captured the client and server logging you requested plus
> a pcap file on the server (see attached).
>
> Note also I tried md5summing a 1MiB file and that produced a different MD5 sum
> each time. I couldn't see enough data being transferred in the pcap to
> indicate that that was happening.

It looks like the server is seeing transmit timeouts on its responses,
there are 7 of these in server-log.txt:

[3700697.936899] ksmbd: smb_direct: read/write error. opcode = 0, status = transport retry counter exceeded(12)
[3700697.937043] ksmbd: Failed to send message: -107

Maybe this is a softiWARP issue?

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-26 15:20 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
2023-01-26 19:22 ` Tom Talpey
@ 2023-01-26 19:49 ` David Howells
1 sibling, 0 replies; 18+ messages in thread
From: David Howells @ 2023-01-26 19:49 UTC (permalink / raw)
To: Tom Talpey
Cc: dhowells, Steve French, Shyam Prasad N, Rohith Surabattula, Long Li,
Namjae Jeon, Stefan Metzmacher, Jeff Layton, linux-cifs

Tom Talpey <tom@talpey.com> wrote:

> Maybe this is a softiWARP issue?

That should be softRoCE.

David

^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2023-01-26 20:49 UTC | newest]

Thread overview: 18+ messages
2023-01-24 17:48 cifs-rdma: KASAN-detected UAF when using rxe driver David Howells
2023-01-25 7:48 ` David Howells
2023-01-25 14:02 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
2023-01-25 14:47 ` Tom Talpey
2023-01-25 15:52 ` Tom Talpey
2023-01-25 16:20 ` Steve French
2023-01-25 20:41 ` David Howells
2023-01-25 22:24 ` Tom Talpey
2023-01-25 22:43 ` David Howells
2023-01-25 22:56 ` Tom Talpey
2023-01-25 23:42 ` Namjae Jeon
2023-01-26 14:42 ` pcap of misbehaving fallocate over cifs rdma David Howells
2023-01-26 19:54 ` David Howells
2023-01-26 20:29 ` Tom Talpey
2023-01-26 20:47 ` David Howells
2023-01-26 15:20 ` [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect David Howells
2023-01-26 19:22 ` Tom Talpey
2023-01-26 19:49 ` David Howells