* NFS over RDMA issues on Linux 5.4
@ 2020-08-03 15:05 Timo Rothenpieler
2020-08-03 16:24 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-03 15:05 UTC
To: linux-nfs
Hello,
I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB
cards and wanted to set up NFS over RDMA on it.
However, while mounting the FS over RDMA works fine, actually using it
results in the following messages absolutely hammering dmesg on both
client and server:
> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
The spam only stops once I forcibly reboot the client. The filesystem
gets nowhere during all this. The retrans counter in nfsstat just keeps
going up, nothing actually gets done.
This is on Linux 5.4.54, using nfs-utils 2.4.3.
The mlx5 driver had enhanced-mode disabled in order to enable IPoIB
connected mode with an MTU of 65520.
Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only
when I mount via rdma that things go wrong.
Is this an issue on my end, or did I run into a bug somewhere here?
Any pointers, patches and solutions to test are welcome.
Thanks,
Timo Rothenpieler
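The retrans counter mentioned above can also be sampled directly. A minimal Python sketch, assuming the client-side RPC statistics are exposed in /proc/net/rpc/nfs with an "rpc" line whose first two numbers are total calls and retransmissions (the file nfsstat reads); the helper names are illustrative only:

```python
from pathlib import Path

# Assumption: client RPC statistics live here, as consumed by nfsstat.
RPC_STATS = Path("/proc/net/rpc/nfs")


def parse_rpc_counters(text):
    """Return (calls, retrans) from the 'rpc' line of the stats text."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no 'rpc' line found")


def counter_delta(before, after):
    """Return (new_calls, new_retrans) between two (calls, retrans) samples."""
    return after[0] - before[0], after[1] - before[1]


# Usage on a live client (not executed here):
#   first = parse_rpc_counters(RPC_STATS.read_text())
#   ... wait a few seconds ...
#   second = parse_rpc_counters(RPC_STATS.read_text())
#   print(counter_delta(first, second))
```

A retransmission delta that keeps growing while the call delta stays near zero would match the symptom described in this report.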
* Re: NFS over RDMA issues on Linux 5.4
2020-08-03 15:05 NFS over RDMA issues on Linux 5.4 Timo Rothenpieler
@ 2020-08-03 16:24 ` Chuck Lever
2020-08-04 9:36 ` Leon Romanovsky
0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-03 16:24 UTC
To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

Hi Timo-

> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>
> Hello,
>
> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to set up NFS over RDMA on it.
>
> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
>
>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
>
> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
>
> This is on Linux 5.4.54, using nfs-utils 2.4.3.
> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
>
> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
>
> Is this an issue on my end, or did I run into a bug somewhere here?
> Any pointers, patches and solutions to test are welcome.

I haven't seen that failure mode here, so the best I can recommend is
to keep investigating. I've copied linux-rdma in case they have any
advice.

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-03 16:24 ` Chuck Lever
@ 2020-08-04 9:36 ` Leon Romanovsky
2020-08-04 10:52 ` Timo Rothenpieler
0 siblings, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 9:36 UTC
To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma

On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
> Hi Timo-
>
> > On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >
> > Hello,
> >
> > I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to set up NFS over RDMA on it.
> >
> > However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> >
> >> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> >
> > The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> >
> > This is on Linux 5.4.54, using nfs-utils 2.4.3.
> > The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> >
> > Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> >
> > Is this an issue on my end, or did I run into a bug somewhere here?
> > Any pointers, patches and solutions to test are welcome.
>
> I haven't seen that failure mode here, so the best I can recommend is
> to keep investigating. I've copied linux-rdma in case they have any
> advice.

The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
Are you running NFS over IPoIB?

From a brief look at the CQE error syndrome (local length error), the client sends a wrong WQE.

Thanks

>
> --
> Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 9:36 ` Leon Romanovsky
@ 2020-08-04 10:52 ` Timo Rothenpieler
2020-08-04 12:25 ` Leon Romanovsky
0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 10:52 UTC
To: Leon Romanovsky, Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 11:36, Leon Romanovsky wrote:
> On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
>> Hi Timo-
>>
>>> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>
>>> Hello,
>>>
>>> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to set up NFS over RDMA on it.
>>>
>>> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
>>>
>>>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
>>>
>>> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
>>>
>>> This is on Linux 5.4.54, using nfs-utils 2.4.3.
>>> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
>>>
>>> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
>>>
>>> Is this an issue on my end, or did I run into a bug somewhere here?
>>> Any pointers, patches and solutions to test are welcome.
>>
>> I haven't seen that failure mode here, so the best I can recommend is
>> to keep investigating. I've copied linux-rdma in case they have any
>> advice.
>
> The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
> Are you running NFS over IPoIB?

For all I'm aware, NFS over RDMA still needs an IP and port to be targeted to, so IPoIB is mandatory?
At least the admin guide in the kernel says so.

Right now I actually am running NFS over IPoIB (without RDMA), because of the issue at hand. And would like to turn on RDMA for enhanced performance.

> From a brief look at the CQE error syndrome (local length error), the client sends a wrong WQE.

Does that point at an issue in the kernel code, or something I did wrong?

The fstab entries for these mounts look like this:

10.110.10.200:/home /home nfs4 rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0

Is there anything more I can investigate? I tried turning connected mode off and lowering the mtu in turn, but that did not have any effect.
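The fstab entry quoted above can be checked mechanically. A small Python sketch, assuming the usual whitespace-separated fstab layout and that a bare "rdma" option is shorthand for "proto=rdma" as described in nfs(5); parse_fstab_entry and rdma_settings are hypothetical helper names, not part of any tool:

```python
def parse_fstab_entry(line):
    """Split one fstab line into device, mount point, type and an options dict."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 fstab fields")
    device, mountpoint, fstype, raw_opts = fields[:4]
    options = {}
    for opt in raw_opts.split(","):
        key, sep, value = opt.partition("=")
        # Flags such as "rw" carry no value; store True for them.
        options[key] = value if sep else True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}


def rdma_settings(entry):
    """Return (uses_rdma, port) for a parsed entry; port is None if unset."""
    opts = entry["options"]
    uses_rdma = "rdma" in opts or opts.get("proto") == "rdma"
    port = int(opts["port"]) if "port" in opts else None
    return uses_rdma, port


entry = parse_fstab_entry(
    "10.110.10.200:/home /home nfs4 "
    "rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0"
)
print(rdma_settings(entry))
```

Port 20049 is the port conventionally used for NFS over RDMA, so the entry itself looks consistent with an RDMA mount.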
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 10:52 ` Timo Rothenpieler
@ 2020-08-04 12:25 ` Leon Romanovsky
2020-08-04 12:49 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 12:25 UTC
To: Timo Rothenpieler; +Cc: Chuck Lever, Linux NFS Mailing List, linux-rdma

On Tue, Aug 04, 2020 at 12:52:27PM +0200, Timo Rothenpieler wrote:
> On 04.08.2020 11:36, Leon Romanovsky wrote:
> > On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
> > > Hi Timo-
> > >
> > > > On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to set up NFS over RDMA on it.
> > > >
> > > > However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> > > >
> > > > > https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> > > >
> > > > The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> > > >
> > > > This is on Linux 5.4.54, using nfs-utils 2.4.3.
> > > > The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> > > >
> > > > Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> > > >
> > > > Is this an issue on my end, or did I run into a bug somewhere here?
> > > > Any pointers, patches and solutions to test are welcome.
> > >
> > > I haven't seen that failure mode here, so the best I can recommend is
> > > to keep investigating. I've copied linux-rdma in case they have any
> > > advice.
> >
> > The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
> > Are you running NFS over IPoIB?
>
> For all I'm aware, NFS over RDMA still needs an IP and port to be targeted
> to, so IPoIB is mandatory?
> At least the admin guide in the kernel says so.
>
> Right now I actually am running NFS over IPoIB (without RDMA), because of
> the issue at hand. And would like to turn on RDMA for enhanced performance.
>
> > From a brief look at the CQE error syndrome (local length error), the client sends a wrong WQE.
>
> Does that point at an issue in the kernel code, or something I did wrong?
>
> The fstab entries for these mounts look like this:
>
> 10.110.10.200:/home /home nfs4 rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0
>
> Is there anything more I can investigate? I tried turning connected mode off
> and lowering the mtu in turn, but that did not have any effect.

Chuck,

You probably know which traces Timo should enable on the client.
The fact that NFS over (non-enhanced) IPoIB works makes a driver/FW issue much less likely.

Thanks
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 12:25 ` Leon Romanovsky
@ 2020-08-04 12:49 ` Chuck Lever
2020-08-04 13:08 ` Timo Rothenpieler
0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 12:49 UTC
To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma, Leon Romanovsky

> On Aug 4, 2020, at 8:25 AM, Leon Romanovsky <leon@kernel.org> wrote:
>
> On Tue, Aug 04, 2020 at 12:52:27PM +0200, Timo Rothenpieler wrote:
>> On 04.08.2020 11:36, Leon Romanovsky wrote:
>>> On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
>>>> Hi Timo-
>>>>
>>>>> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to set up NFS over RDMA on it.
>>>>>
>>>>> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
>>>>>
>>>>>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
>>>>>
>>>>> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
>>>>>
>>>>> This is on Linux 5.4.54, using nfs-utils 2.4.3.
>>>>> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
>>>>>
>>>>> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
>>>>>
>>>>> Is this an issue on my end, or did I run into a bug somewhere here?
>>>>> Any pointers, patches and solutions to test are welcome.
>>>>
>>>> I haven't seen that failure mode here, so the best I can recommend is
>>>> to keep investigating. I've copied linux-rdma in case they have any
>>>> advice.
>>>
>>> The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
>>> Are you running NFS over IPoIB?
>>
>> For all I'm aware, NFS over RDMA still needs an IP and port to be targeted
>> to, so IPoIB is mandatory?
>> At least the admin guide in the kernel says so.
>>
>> Right now I actually am running NFS over IPoIB (without RDMA), because of
>> the issue at hand. And would like to turn on RDMA for enhanced performance.
>>
>>> From a brief look at the CQE error syndrome (local length error), the client sends a wrong WQE.
>>
>> Does that point at an issue in the kernel code, or something I did wrong?
>>
>> The fstab entries for these mounts look like this:
>>
>> 10.110.10.200:/home /home nfs4 rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0
>>
>> Is there anything more I can investigate? I tried turning connected mode off
>> and lowering the mtu in turn, but that did not have any effect.
>
> Chuck,
>
> You probably know which traces Timo should enable on the client.
> The fact that NFS over (non-enhanced) IPoIB works makes a driver/FW issue much less likely.

Timo, I tend to think this is not a configuration issue.
Do you know of a known working kernel?

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 12:49 ` Chuck Lever
@ 2020-08-04 13:08 ` Timo Rothenpieler
2020-08-04 13:12 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:08 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 14:49, Chuck Lever wrote:
> Timo, I tend to think this is not a configuration issue.
> Do you know of a known working kernel?
>

This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.

Also keep in mind that the dmesg spam happens on both server and client simultaneously.

I'll see if I can borrow two of the nodes to turn into a temporary test system for this.

The Kernel for this system is self-built and not any distribution kernel.
This could not be a missing kernel config option or something?
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:08 ` Timo Rothenpieler
@ 2020-08-04 13:12 ` Chuck Lever
2020-08-04 13:19 ` Timo Rothenpieler
2020-08-04 13:46 ` Leon Romanovsky
0 siblings, 2 replies; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:12 UTC
To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>
> On 04.08.2020 14:49, Chuck Lever wrote:
>> Timo, I tend to think this is not a configuration issue.
>> Do you know of a known working kernel?
>
> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>
> Also keep in mind that the dmesg spam happens on both server and client simultaneously.

Let's start with the client only, since restarting it seems to clear the problem.

> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
>
> The Kernel for this system is self-built and not any distribution kernel.

Would it be easy to try a kernel earlier in the 5.4.y stable series?

> This could not be a missing kernel config option or something?

Doubtful.

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:12 ` Chuck Lever
@ 2020-08-04 13:19 ` Timo Rothenpieler
2020-08-04 13:24 ` Chuck Lever
2020-08-04 13:46 ` Leon Romanovsky
1 sibling, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:19 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 15:12, Chuck Lever wrote:
>
>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>
>> On 04.08.2020 14:49, Chuck Lever wrote:
>>> Timo, I tend to think this is not a configuration issue.
>>> Do you know of a known working kernel?
>>
>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>
>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>
> Let's start with the client only, since restarting it seems to clear the problem.
>
>> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
>>
>> The Kernel for this system is self-built and not any distribution kernel.
>
> Would it be easy to try a kernel earlier in the 5.4.y stable series?

Yes, that should be very straightforward, since I can just use the same config.
Got any specific version in mind?
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:19 ` Timo Rothenpieler
@ 2020-08-04 13:24 ` Chuck Lever
2020-08-04 13:40 ` Timo Rothenpieler
0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:24 UTC
To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

> On Aug 4, 2020, at 9:19 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>
> On 04.08.2020 15:12, Chuck Lever wrote:
>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>
>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>> Timo, I tend to think this is not a configuration issue.
>>>> Do you know of a known working kernel?
>>>
>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>>
>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>> Let's start with the client only, since restarting it seems to clear the problem.
>>> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
>>>
>>> The Kernel for this system is self-built and not any distribution kernel.
>> Would it be easy to try a kernel earlier in the 5.4.y stable series?
>
> Yes, that should be very straightforward, since I can just use the same config.
> Got any specific version in mind?

Start with an early one, like 5.4.16.

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:24 ` Chuck Lever
@ 2020-08-04 13:40 ` Timo Rothenpieler
0 siblings, 0 replies; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:40 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 15:24, Chuck Lever wrote:
> Start with an early one, like 5.4.16.
>

Still happening with 5.4.16 on the client.
I'll see if I can get a 4.19 one going soon.
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:12 ` Chuck Lever
2020-08-04 13:19 ` Timo Rothenpieler
@ 2020-08-04 13:46 ` Leon Romanovsky
2020-08-04 13:53 ` Chuck Lever
1 sibling, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 13:46 UTC
To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma

On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>
> > On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >
> > On 04.08.2020 14:49, Chuck Lever wrote:
> >> Timo, I tend to think this is not a configuration issue.
> >> Do you know of a known working kernel?
> >
> > This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
> >
> > Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>
> Let's start with the client only, since restarting it seems to clear the problem.

It is the client, because according to the server CQE errors it is a Remote_Invalid_Request_Error, described under "9.7.5.2.2 NAK CODES" in the IBTA spec.

Thanks
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:46 ` Leon Romanovsky
@ 2020-08-04 13:53 ` Chuck Lever
2020-08-04 15:34 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:53 UTC
To: Leon Romanovsky, Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
>
> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>>
>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>
>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>> Timo, I tend to think this is not a configuration issue.
>>>> Do you know of a known working kernel?
>>>
>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>>
>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>>
>> Let's start with the client only, since restarting it seems to clear the problem.
>
> It is the client, because according to the server CQE errors it is a Remote_Invalid_Request_Error,
> described under "9.7.5.2.2 NAK CODES" in the IBTA spec.

Thanks! OK, then let's use ftrace.

Timo, can you install trace-cmd on your client? Then:

1. # trace-cmd record -e rpcrdma -e sunrpc

2. Trigger the problem

3. Control-C the trace-cmd, and copy the trace.dat file to another system

4. Reboot your client

Then send me your trace.dat. You don't have to cc the mailing lists.

--
Chuck Lever
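Step 1 above can be wrapped in a small helper when the capture has to be repeated on several clients. A minimal Python sketch that only builds the command line; the function name is hypothetical, the event names are the two from this message, and "-o" is trace-cmd's option for naming the output file (trace.dat is its default):

```python
import shlex


def trace_cmd_record_argv(events=("rpcrdma", "sunrpc"), output="trace.dat"):
    """Build the argv for 'trace-cmd record' with one -e flag per event system."""
    argv = ["trace-cmd", "record", "-o", output]
    for event in events:
        argv += ["-e", event]
    return argv


# Print a copy-pasteable command line; run it as root on the client,
# trigger the problem, then stop the recording with Control-C.
print(shlex.join(trace_cmd_record_argv()))
```

The helper deliberately does not launch trace-cmd itself: the recording is meant to be stopped by hand with Control-C, and trace-cmd needs to finish writing its output file after that.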
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 13:53 ` Chuck Lever
@ 2020-08-04 15:34 ` Chuck Lever
2020-08-04 15:39 ` Timo Rothenpieler
2020-08-04 15:55 ` Leon Romanovsky
0 siblings, 2 replies; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 15:34 UTC
To: Leon Romanovsky, Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

> On Aug 4, 2020, at 9:53 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
>> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
>>
>> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>>>
>>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>>
>>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>>> Timo, I tend to think this is not a configuration issue.
>>>>> Do you know of a known working kernel?
>>>>
>>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>>>
>>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>>>
>>> Let's start with the client only, since restarting it seems to clear the problem.
>>
>> It is the client, because according to the server CQE errors it is a Remote_Invalid_Request_Error,
>> described under "9.7.5.2.2 NAK CODES" in the IBTA spec.
>
> Thanks! OK, then let's use ftrace.
>
> Timo, can you install trace-cmd on your client? Then:
>
> 1. # trace-cmd record -e rpcrdma -e sunrpc
>
> 2. Trigger the problem
>
> 3. Control-C the trace-cmd, and copy the trace.dat file to another system
>
> 4. Reboot your client
>
> Then send me your trace.dat. You don't have to cc the mailing lists.

I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
Send was too large?

Timo, what filesystem are you sharing on your NFS server? The thing that
comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 15:34 ` Chuck Lever
@ 2020-08-04 15:39 ` Timo Rothenpieler
2020-08-04 15:46 ` Chuck Lever
2020-08-04 15:55 ` Leon Romanovsky
1 sibling, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 15:39 UTC
To: Chuck Lever, Leon Romanovsky; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:34, Chuck Lever wrote:
> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
> Send was too large?
>
> Timo, what filesystem are you sharing on your NFS server? The thing that
> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>

The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.

Should I try to apply the proposed fix you posted on that bug report on the client (and server)?
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 15:39 ` Timo Rothenpieler
@ 2020-08-04 15:46 ` Chuck Lever
2020-08-04 15:50 ` Timo Rothenpieler
0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 15:46 UTC
To: Timo Rothenpieler; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma

> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>
> On 04.08.2020 17:34, Chuck Lever wrote:
>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>> Send was too large?
>> Timo, what filesystem are you sharing on your NFS server? The thing that
>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>
> The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.
>
> Should I try to apply the proposed fix you posted on that bug report on the client (and server)?

If you are hitting that bug, the server is the problem. The client
should work fine once the server is fixed. (I'm not happy about
the client's looping behavior either, but that will go away once
the server behaves).

I'm not hopeful that the fix applies cleanly to v4.19, but it
might. Another option would be upgrading your NFS server.

--
Chuck Lever
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 15:46 ` Chuck Lever
@ 2020-08-04 15:50 ` Timo Rothenpieler
2020-08-04 16:07 ` Timo Rothenpieler
0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 15:50 UTC
To: Chuck Lever; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:46, Chuck Lever wrote:
>
>> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>
>> On 04.08.2020 17:34, Chuck Lever wrote:
>>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>>> Send was too large?
>>> Timo, what filesystem are you sharing on your NFS server? The thing that
>>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>>
>> The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.
>>
>> Should I try to apply the proposed fix you posted on that bug report on the client (and server)?
>
> If you are hitting that bug, the server is the problem. The client
> should work fine once the server is fixed. (I'm not happy about
> the client's looping behavior either, but that will go away once
> the server behaves).
>
> I'm not hopeful that the fix applies cleanly to v4.19, but it
> might. Another option would be upgrading your NFS server.

It's running on 5.4.54 and the patch applies with no fuzz whatsoever:

> patching file fs/nfsd/nfs4xdr.c
> Hunk #1 succeeded at 3530 (offset 9 lines).
> Hunk #2 succeeded at 3556 (offset 9 lines).
> patching file include/linux/sunrpc/svc.h
> patching file include/linux/sunrpc/svc_rdma.h
> Hunk #2 succeeded at 172 (offset 1 line).
> Hunk #3 succeeded at 192 (offset 1 line).
> patching file include/linux/sunrpc/svc_xprt.h
> patching file net/sunrpc/svc.c
> Hunk #1 succeeded at 1635 (offset -2 lines).
> patching file net/sunrpc/svcsock.c
> Hunk #2 succeeded at 660 (offset 2 lines).
> Hunk #3 succeeded at 1181 (offset 4 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> Hunk #1 succeeded at 193 (offset 2 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_rw.c
> Hunk #1 succeeded at 481 (offset -3 lines).
> Hunk #2 succeeded at 500 (offset -3 lines).
> Hunk #3 succeeded at 510 (offset -3 lines).
> Hunk #4 succeeded at 524 (offset -3 lines).
> Hunk #5 succeeded at 538 (offset -3 lines).
> Hunk #6 succeeded at 578 (offset -3 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_sendto.c
> Hunk #1 succeeded at 856 (offset -15 lines).
> Hunk #2 succeeded at 891 with fuzz 2 (offset -22 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_transport.c
> Hunk #1 succeeded at 81 (offset -1 lines).

I will deploy the patch to both server and client and report back.
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 15:50 ` Timo Rothenpieler
@ 2020-08-04 16:07 ` Timo Rothenpieler
0 siblings, 0 replies; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 16:07 UTC
To: Chuck Lever; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:50, Timo Rothenpieler wrote:
> On 04.08.2020 17:46, Chuck Lever wrote:
>>
>>> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>
>>> On 04.08.2020 17:34, Chuck Lever wrote:
>>>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>>>> Send was too large?
>>>> Timo, what filesystem are you sharing on your NFS server? The thing that
>>>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>>>
>>> The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.
>>>
>>> Should I try to apply the proposed fix you posted on that bug report on the client (and server)?
>>
>> If you are hitting that bug, the server is the problem. The client
>> should work fine once the server is fixed. (I'm not happy about
>> the client's looping behavior either, but that will go away once
>> the server behaves).
>>
>> I'm not hopeful that the fix applies cleanly to v4.19, but it
>> might. Another option would be upgrading your NFS server.
>
> It's running on 5.4.54 and the patch applies with no fuzz whatsoever:
>
> I will deploy the patch to both server and client and report back.

Reporting success.
With the patch from that bug applied, no error spam is happening anymore. Plus, the filesystem actually works and definitely got a whole lot snappier than before. Which is not all that unexpected.

Thank you so much for your help analyzing this and for the fix!
I hope it can get applied to mainline soon and will reach 5.4 backports eventually. Until then, I will carry it as a local patch for the systems.

Thanks again,
Timo
* Re: NFS over RDMA issues on Linux 5.4
2020-08-04 15:34 ` Chuck Lever
2020-08-04 15:39 ` Timo Rothenpieler
@ 2020-08-04 15:55 ` Leon Romanovsky
1 sibling, 0 replies; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 15:55 UTC (permalink / raw)
To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma
On Tue, Aug 04, 2020 at 11:34:05AM -0400, Chuck Lever wrote:
>
>
> > On Aug 4, 2020, at 9:53 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> >
> >
> >> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
> >>
> >> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
> >>>
> >>>
> >>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >>>>
> >>>> On 04.08.2020 14:49, Chuck Lever wrote:
> >>>>> Timo, I tend to think this is not a configuration issue.
> >>>>> Do you know of a known working kernel?
> >>>>
> >>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
> >>>>
> >>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
> >>>
> >>> Let's start with the client only, since restarting it seems to clear the problem.
> >>
> >> It is client because according to the server CQE errors, it is Remote_Invalid_Request_Error
> >> with "9.7.5.2.2 NAK CODES" from IBTA.
> >
> > Thanks! OK, then let's use ftrace.
> >
> > Timo, can you install trace-cmd on your client? Then:
> >
> > 1. # trace-cmd record -e rpcrdma -e sunrpc
> >
> > 2. Trigger the problem
> >
> > 3. Control-C the trace-cmd, and copy the trace.dat file to another system
> >
> > 4. reboot your client
> >
> > Then send me your trace.dat. You don't have to cc the mailing lists.
>
> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
> Send was too large?
1. We have a local_length_error counter; it can help to check it on the server and the clients.
[leonro@vm ~]$ cat /sys/class/infiniband/ibp0s9/ports/1/hw_counters/resp_local_length_error
0
resp_local_length_error - "Number of times responder detected local length errors."
2. LOC_LEN_ERR is consistent with what is written in the CQE error on the client.
This is what is written in our HW document:
IB compliant completion with error syndrome 0x1: Local_Length_Error
3. From IBTA, 11.6.2 COMPLETION RETURN STATUS
Local Length Error - Generated for a Work Request posted to the local
Send Queue when the sum of the Data Segment lengths exceeds the message
length for the channel adapter port. Generated for a Work Request posted
to the local Receive Queue when the sum of the Data Segment lengths is
too small to receive a valid incoming message or the length of the
incoming message is greater than the maximum message size supported by
the HCA port that received the message.
So if "1" works :), we will be able to distinguish whether the client sends a WR
that is too large or receives a message that is too large.
Thanks
>
> Timo, what filesystem are you sharing on your NFS server? The thing that
> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>
>
> --
> Chuck Lever
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
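The counter check from point 1 of the message above could be scripted so that every RDMA port on a host is read at once, for example before and after reproducing the problem on both server and client. This is only a rough sketch: the sysfs layout is taken from the single path quoted in the message, the helper name `read_hw_counter` is made up for illustration, and whether a given device exposes this counter depends on the driver.

```python
from pathlib import Path


def read_hw_counter(name, sysfs_root="/sys/class/infiniband"):
    """Return {"<device>/<port>": value} for every port that exposes `name`.

    `read_hw_counter` is a hypothetical helper, not part of any existing tool.
    """
    values = {}
    root = Path(sysfs_root)
    # Layout assumed from the path quoted in the message:
    # <root>/<device>/ports/<port>/hw_counters/<name>
    for counter in sorted(root.glob(f"*/ports/*/hw_counters/{name}")):
        port_dir = counter.parent.parent          # .../<device>/ports/<port>
        device = port_dir.parent.parent.name      # <device>
        try:
            values[f"{device}/{port_dir.name}"] = int(counter.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric counter: skip it rather than guess.
            continue
    return values


if __name__ == "__main__":
    for port, value in read_hw_counter("resp_local_length_error").items():
        print(f"{port}: resp_local_length_error = {value}")
```

Running it once before triggering the failure and once after, on each machine, would show on which side the counter increases.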
end of thread, other threads:[~2020-08-04 16:08 UTC | newest]
Thread overview: 19+ messages
2020-08-03 15:05 NFS over RDMA issues on Linux 5.4 Timo Rothenpieler
2020-08-03 16:24 ` Chuck Lever
2020-08-04 9:36 ` Leon Romanovsky
2020-08-04 10:52 ` Timo Rothenpieler
2020-08-04 12:25 ` Leon Romanovsky
2020-08-04 12:49 ` Chuck Lever
2020-08-04 13:08 ` Timo Rothenpieler
2020-08-04 13:12 ` Chuck Lever
2020-08-04 13:19 ` Timo Rothenpieler
2020-08-04 13:24 ` Chuck Lever
2020-08-04 13:40 ` Timo Rothenpieler
2020-08-04 13:46 ` Leon Romanovsky
2020-08-04 13:53 ` Chuck Lever
2020-08-04 15:34 ` Chuck Lever
2020-08-04 15:39 ` Timo Rothenpieler
2020-08-04 15:46 ` Chuck Lever
2020-08-04 15:50 ` Timo Rothenpieler
2020-08-04 16:07 ` Timo Rothenpieler
2020-08-04 15:55 ` Leon Romanovsky