* NFS over RDMA issues on Linux 5.4
@ 2020-08-03 15:05 Timo Rothenpieler
  2020-08-03 16:24 ` Chuck Lever
  0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-03 15:05 UTC (permalink / raw)
  To: linux-nfs

Hello,

I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB
cards and wanted to set up NFS over RDMA on it.

However, while mounting the FS over RDMA works fine, actually using it 
results in the following messages absolutely hammering dmesg on both 
client and server:

> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log

The spam only stops once I forcibly reboot the client. The filesystem 
gets nowhere during all this. The retrans counter in nfsstat just keeps 
going up, nothing actually gets done.
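
A rough way to watch the retrans counter without eyeballing nfsstat output is sketched below. It assumes the client RPC statistics are exposed in /proc/net/rpc/nfs with an "rpc" line laid out as calls, retransmissions, authrefresh; the function names are made up for illustration.

```python
def parse_rpc_retrans(stats_text):
    """Return (calls, retrans) from the 'rpc' line of client RPC statistics.

    Assumed layout of that line: rpc <calls> <retransmissions> <authrefresh>
    """
    for line in stats_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no 'rpc' line found in statistics text")


def retrans_delta(before_text, after_text):
    """Number of retransmissions that occurred between two snapshots."""
    return parse_rpc_retrans(after_text)[1] - parse_rpc_retrans(before_text)[1]
```

Taking one snapshot of the file before touching the RDMA mount and another a few seconds later, then passing both to retrans_delta(), would show whether the counter is still climbing.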

This is on Linux 5.4.54, using nfs-utils 2.4.3.
The mlx5 driver had enhanced-mode disabled in order to enable IPoIB 
connected mode with an MTU of 65520.

Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only 
when I mount via rdma that things go wrong.

Is this an issue on my end, or did I run into a bug somewhere here?
Any pointers, patches and solutions to test are welcome.


Thanks,
Timo Rothenpieler

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: NFS over RDMA issues on Linux 5.4
  2020-08-03 15:05 NFS over RDMA issues on Linux 5.4 Timo Rothenpieler
@ 2020-08-03 16:24 ` Chuck Lever
  2020-08-04  9:36   ` Leon Romanovsky
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-03 16:24 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma

Hi Timo-

> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> 
> Hello,
> 
> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
> 
> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> 
>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> 
> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> 
> This is on Linux 5.4.54, using nfs-utils 2.4.3.
> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> 
> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> 
> Is this an issue on my end, or did I run into a bug somewhere here?
> Any pointers, patches and solutions to test are welcome.

I haven't seen that failure mode here, so best I can recommend is
keep investigating. I've copied linux-rdma in case they have any
advice.

--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-03 16:24 ` Chuck Lever
@ 2020-08-04  9:36   ` Leon Romanovsky
  2020-08-04 10:52     ` Timo Rothenpieler
  0 siblings, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04  9:36 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma

On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
> Hi Timo-
>
> > On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >
> > Hello,
> >
> > I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
> >
> > However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> >
> >> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> >
> > The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> >
> > This is on Linux 5.4.54, using nfs-utils 2.4.3.
> > The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> >
> > Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> >
> > Is this an issue on my end, or did I run into a bug somewhere here?
> > Any pointers, patches and solutions to test are welcome.
>
> I haven't seen that failure mode here, so best I can recommend is
> keep investigating. I've copied linux-rdma in case they have any
> advice.

The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
Are you running NFS over IPoIB?

From a brief look at the CQE error syndrome (local length error), the client sends a wrong WQE.

Thanks

>
> --
> Chuck Lever
>
>
>


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04  9:36   ` Leon Romanovsky
@ 2020-08-04 10:52     ` Timo Rothenpieler
  2020-08-04 12:25       ` Leon Romanovsky
  0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 10:52 UTC (permalink / raw)
  To: Leon Romanovsky, Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 11:36, Leon Romanovsky wrote:
> On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
>> Hi Timo-
>>
>>> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>
>>> Hello,
>>>
>>> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
>>>
>>> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
>>>
>>>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
>>>
>>> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
>>>
>>> This is on Linux 5.4.54, using nfs-utils 2.4.3.
>>> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
>>>
>>> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
>>>
>>> Is this an issue on my end, or did I run into a bug somewhere here?
>>> Any pointers, patches and solutions to test are welcome.
>>
>> I haven't seen that failure mode here, so best I can recommend is
>> keep investigating. I've copied linux-rdma in case they have any
>> advice.
> 
> The mentioning of IPoIB is a slightly confusing in the context of NFS-over-RDMA.
> Are you running NFS over IPoIB?

As far as I'm aware, NFS over RDMA still needs an IP address and port to
target, so IPoIB is mandatory?
At least the admin guide in the kernel says so.

Right now I actually am running NFS over IPoIB (without RDMA), because
of the issue at hand, and I would like to turn on RDMA for better
performance.

>  From brief look on CQE error syndrome (local length error), the client sends wrong WQE.

Does that point at an issue in the kernel code, or something I did wrong?

The fstab entries for these mounts look like this:

10.110.10.200:/home /home nfs4 rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0
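
For anyone wanting to check such entries mechanically, here is a minimal sketch that splits an fstab line (written on a single line) into its fields and reports whether it asks for the RDMA transport. The helper names are hypothetical; the only assumptions are the usual whitespace-separated fstab layout and that either "rdma" or "proto=rdma" selects RDMA.

```python
def parse_fstab_entry(line):
    """Split one fstab line into device, mount point, type and an options dict."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 whitespace-separated fields")
    device, mountpoint, fstype, option_string = fields[:4]
    options = {}
    for item in option_string.split(","):
        key, sep, value = item.partition("=")
        # Flag-style options such as "rw" or "rdma" carry no value.
        options[key] = value if sep else True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}


def uses_rdma(entry):
    """True if the entry selects RDMA via 'rdma' or 'proto=rdma'."""
    opts = entry["options"]
    return opts.get("rdma") is True or opts.get("proto") == "rdma"


entry = parse_fstab_entry(
    "10.110.10.200:/home /home nfs4 "
    "rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0"
)
print(uses_rdma(entry), entry["options"].get("port"))
```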

Is there anything more I can investigate? I tried turning connected mode 
off and lowering the mtu in turn, but that did not have any effect.


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 10:52     ` Timo Rothenpieler
@ 2020-08-04 12:25       ` Leon Romanovsky
  2020-08-04 12:49         ` Chuck Lever
  0 siblings, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 12:25 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Chuck Lever, Linux NFS Mailing List, linux-rdma

On Tue, Aug 04, 2020 at 12:52:27PM +0200, Timo Rothenpieler wrote:
> On 04.08.2020 11:36, Leon Romanovsky wrote:
> > On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
> > > Hi Timo-
> > >
> > > > On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
> > > >
> > > > However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> > > >
> > > > > https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> > > >
> > > > The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> > > >
> > > > This is on Linux 5.4.54, using nfs-utils 2.4.3.
> > > > The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> > > >
> > > > Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> > > >
> > > > Is this an issue on my end, or did I run into a bug somewhere here?
> > > > Any pointers, patches and solutions to test are welcome.
> > >
> > > I haven't seen that failure mode here, so best I can recommend is
> > > keep investigating. I've copied linux-rdma in case they have any
> > > advice.
> >
> > The mentioning of IPoIB is a slightly confusing in the context of NFS-over-RDMA.
> > Are you running NFS over IPoIB?
>
> For all I'm aware, NFS over RDMA still needs an IP and port to be targeted
> to, so IPoIB is mandatory?
> At least the admin guide in the kernel says so.
>
> Right now I actually am running NFS over IPoIB (without RDMA), because of
> the issue at hand. And would like to turn on RDMA for enhanced performance.
>
> >  From brief look on CQE error syndrome (local length error), the client sends wrong WQE.
>
> Does that point at an issue in the kernel code, or something I did wrong?
>
> The fstab entries for these mounts look like this:
>
> 10.110.10.200:/home /home nfs4
> rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0
>
> Is there anything more I can investigate? I tried turning connected mode off
> and lowering the mtu in turn, but that did not have any effect.

Chuck,

You probably know which traces Timo should enable on the client.
The fact that NFS over (non-enhanced) IPoIB works makes driver/FW
issues much less likely.

Thanks


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 12:25       ` Leon Romanovsky
@ 2020-08-04 12:49         ` Chuck Lever
  2020-08-04 13:08           ` Timo Rothenpieler
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 12:49 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma, Leon Romanovsky



> On Aug 4, 2020, at 8:25 AM, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Tue, Aug 04, 2020 at 12:52:27PM +0200, Timo Rothenpieler wrote:
>> On 04.08.2020 11:36, Leon Romanovsky wrote:
>>> On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
>>>> Hi Timo-
>>>> 
>>>>> On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
>>>>> 
>>>>> However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
>>>>> 
>>>>>> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
>>>>> 
>>>>> The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
>>>>> 
>>>>> This is on Linux 5.4.54, using nfs-utils 2.4.3.
>>>>> The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
>>>>> 
>>>>> Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
>>>>> 
>>>>> Is this an issue on my end, or did I run into a bug somewhere here?
>>>>> Any pointers, patches and solutions to test are welcome.
>>>> 
>>>> I haven't seen that failure mode here, so best I can recommend is
>>>> keep investigating. I've copied linux-rdma in case they have any
>>>> advice.
>>> 
>>> The mentioning of IPoIB is a slightly confusing in the context of NFS-over-RDMA.
>>> Are you running NFS over IPoIB?
>> 
>> For all I'm aware, NFS over RDMA still needs an IP and port to be targeted
>> to, so IPoIB is mandatory?
>> At least the admin guide in the kernel says so.
>> 
>> Right now I actually am running NFS over IPoIB (without RDMA), because of
>> the issue at hand. And would like to turn on RDMA for enhanced performance.
>> 
>>> From brief look on CQE error syndrome (local length error), the client sends wrong WQE.
>> 
>> Does that point at an issue in the kernel code, or something I did wrong?
>> 
>> The fstab entries for these mounts look like this:
>> 
>> 10.110.10.200:/home /home nfs4
>> rw,rdma,port=20049,noatime,async,vers=4.2,_netdev 0 0
>> 
>> Is there anything more I can investigate? I tried turning connected mode off
>> and lowering the mtu in turn, but that did not have any effect.
> 
> Chuck,
> 
> You probably know which traces Timo should enable on the client.
> The fact that NFS over (not-enahnced) IPoIB works highly reduces
> driver/FW issues.

Timo, I tend to think this is not a configuration issue.
Do you know of a known working kernel?


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 12:49         ` Chuck Lever
@ 2020-08-04 13:08           ` Timo Rothenpieler
  2020-08-04 13:12             ` Chuck Lever
  0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:08 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 14:49, Chuck Lever wrote:
> Timo, I tend to think this is not a configuration issue.
> Do you know of a known working kernel?
> 

This is a brand new system, it's never been running with any kernel 
older than 5.4, and downgrading it to 4.19 or something else while in 
operation is unfortunately not easily possible. For a client it would 
definitely not be out of the question, but the main nfs server I cannot 
easily downgrade.

Also keep in mind that the dmesg spam happens on both server and client 
simultaneously.

I'll see if I can borrow two of the nodes to turn into a temporary test 
system for this.

The kernel for this system is self-built, not a distribution kernel.
Could this be caused by a missing kernel config option or something?


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:08           ` Timo Rothenpieler
@ 2020-08-04 13:12             ` Chuck Lever
  2020-08-04 13:19               ` Timo Rothenpieler
  2020-08-04 13:46               ` Leon Romanovsky
  0 siblings, 2 replies; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:12 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma



> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> 
> On 04.08.2020 14:49, Chuck Lever wrote:
>> Timo, I tend to think this is not a configuration issue.
>> Do you know of a known working kernel?
> 
> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
> 
> Also keep in mind that the dmesg spam happens on both server and client simultaneously.

Let's start with the client only, since restarting it seems to clear the problem.


> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
> 
> The Kernel for this system is self-built and not any distribution kernel.

Would it be easy to try a kernel earlier in the 5.4.y stable series?


> This could not be a missing kernel config option or something?

Doubtful.


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:12             ` Chuck Lever
@ 2020-08-04 13:19               ` Timo Rothenpieler
  2020-08-04 13:24                 ` Chuck Lever
  2020-08-04 13:46               ` Leon Romanovsky
  1 sibling, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:19 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 15:12, Chuck Lever wrote:
> 
> 
>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>
>> On 04.08.2020 14:49, Chuck Lever wrote:
>>> Timo, I tend to think this is not a configuration issue.
>>> Do you know of a known working kernel?
>>
>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>
>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
> 
> Let's start with the client only, since restarting it seems to clear the problem.
> 
> 
>> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
>>
>> The Kernel for this system is self-built and not any distribution kernel.
> 
> Would it be easy to try a kernel earlier in the 5.4.y stable series?

Yes, that should be very straightforward, since I can just use the same
config.
Got any specific version in mind?





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:19               ` Timo Rothenpieler
@ 2020-08-04 13:24                 ` Chuck Lever
  2020-08-04 13:40                   ` Timo Rothenpieler
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:24 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma



> On Aug 4, 2020, at 9:19 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> 
> On 04.08.2020 15:12, Chuck Lever wrote:
>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>> 
>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>> Timo, I tend to think this is not a configuration issue.
>>>> Do you know of a known working kernel?
>>> 
>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>> 
>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>> Let's start with the client only, since restarting it seems to clear the problem.
>>> I'll see if I can borrow two of the nodes to turn into a temporary test system for this.
>>> 
>>> The Kernel for this system is self-built and not any distribution kernel.
>> Would it be easy to try a kernel earlier in the 5.4.y stable series?
> 
> Yes, that should be very straight forward, since I can just use the same config.
> Got any specific version in mind?

Start with an early one, like 5.4.16.


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:24                 ` Chuck Lever
@ 2020-08-04 13:40                   ` Timo Rothenpieler
  0 siblings, 0 replies; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 13:40 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 15:24, Chuck Lever wrote:
> Start with an early one, like 5.4.16.
> 

Still happening with 5.4.16 on the client. I'll see if I can get a 4.19 
one going soon.


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:12             ` Chuck Lever
  2020-08-04 13:19               ` Timo Rothenpieler
@ 2020-08-04 13:46               ` Leon Romanovsky
  2020-08-04 13:53                 ` Chuck Lever
  1 sibling, 1 reply; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 13:46 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma

On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>
>
> > On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >
> > On 04.08.2020 14:49, Chuck Lever wrote:
> >> Timo, I tend to think this is not a configuration issue.
> >> Do you know of a known working kernel?
> >
> > This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
> >
> > Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>
> Let's start with the client only, since restarting it seems to clear the problem.

It is the client, because according to the server CQE errors it is a
Remote_Invalid_Request_Error; see "9.7.5.2.2 NAK CODES" in the IBTA spec.

Thanks


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:46               ` Leon Romanovsky
@ 2020-08-04 13:53                 ` Chuck Lever
  2020-08-04 15:34                   ` Chuck Lever
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 13:53 UTC (permalink / raw)
  To: Leon Romanovsky, Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma



> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>> 
>> 
>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>> 
>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>> Timo, I tend to think this is not a configuration issue.
>>>> Do you know of a known working kernel?
>>> 
>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>> 
>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>> 
>> Let's start with the client only, since restarting it seems to clear the problem.
> 
> It is client because according to the server CQE errors, it is Remote_Invalid_Request_Error
> with "9.7.5.2.2 NAK CODES" from IBTA.

Thanks! OK, then let's use ftrace.

Timo, can you install trace-cmd on your client? Then:

1. # trace-cmd record -e rpcrdma -e sunrpc

2. Trigger the problem

3. Control-C the trace-cmd, and copy the trace.dat file to another system

4. reboot your client

Then send me your trace.dat. You don't have to cc the mailing lists.
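
If the capture has to be repeated on several clients, the command from step 1 can be assembled by a small helper like the sketch below. The function name is hypothetical; the event names are the ones given above, and "-o" is trace-cmd's option for naming the output file (trace.dat is already the default).

```python
import shlex


def trace_record_command(events=("rpcrdma", "sunrpc"), output="trace.dat"):
    """Build the argv for 'trace-cmd record' with one -e flag per event."""
    argv = ["trace-cmd", "record", "-o", output]
    for event in events:
        argv += ["-e", event]
    return argv


# Print the command line so it can be reviewed before running it as root.
print(shlex.join(trace_record_command()))
```

The resulting list can be handed to subprocess.run() on the client; the recording still has to be stopped with Control-C once the problem has been triggered.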


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 13:53                 ` Chuck Lever
@ 2020-08-04 15:34                   ` Chuck Lever
  2020-08-04 15:39                     ` Timo Rothenpieler
  2020-08-04 15:55                     ` Leon Romanovsky
  0 siblings, 2 replies; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 15:34 UTC (permalink / raw)
  To: Leon Romanovsky, Timo Rothenpieler; +Cc: Linux NFS Mailing List, linux-rdma



> On Aug 4, 2020, at 9:53 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
>> 
>> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
>>> 
>>> 
>>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>> 
>>>> On 04.08.2020 14:49, Chuck Lever wrote:
>>>>> Timo, I tend to think this is not a configuration issue.
>>>>> Do you know of a known working kernel?
>>>> 
>>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
>>>> 
>>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
>>> 
>>> Let's start with the client only, since restarting it seems to clear the problem.
>> 
>> It is client because according to the server CQE errors, it is Remote_Invalid_Request_Error
>> with "9.7.5.2.2 NAK CODES" from IBTA.
> 
> Thanks! OK, then let's use ftrace.
> 
> Timo, can you install trace-cmd on your client? Then:
> 
> 1. # trace-cmd record -e rpcrdma -e sunrpc
> 
> 2. Trigger the problem
> 
> 3. Control-C the trace-cmd, and copy the trace.dat file to another system
> 
> 4. reboot your client
> 
> Then send me your trace.dat. You don't have to cc the mailing lists.

I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
Send was too large?
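
To make the reasoning behind that question concrete: a Receive completes with a local length error when the incoming message does not fit into the buffers posted for it. The sketch below is only a toy model of that check, not the mlx5 or xprtrdma code, and the buffer sizes in the example are invented.

```python
def receive_completion_status(sge_lengths, incoming_length):
    """Toy model of the receive-side length check.

    sge_lengths: sizes of the data segments posted in the Receive work request.
    incoming_length: size of the message the peer sent.
    """
    capacity = sum(sge_lengths)
    if incoming_length > capacity:
        # The posted Receive is too small for what the peer sent.
        return "LOC_LEN_ERR"
    return "SUCCESS"


# Hypothetical numbers: a 4096-byte receive buffer and a 5000-byte Send.
print(receive_completion_status([4096], 5000))
```

In this model the error is reported on the receiving side even though the oversized message was produced by the sender, which is why a LOC_LEN_ERR on the client's Receive points at the server's Send.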

Timo, what filesystem are you sharing on your NFS server? The thing that
comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 15:34                   ` Chuck Lever
@ 2020-08-04 15:39                     ` Timo Rothenpieler
  2020-08-04 15:46                       ` Chuck Lever
  2020-08-04 15:55                     ` Leon Romanovsky
  1 sibling, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 15:39 UTC (permalink / raw)
  To: Chuck Lever, Leon Romanovsky; +Cc: Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:34, Chuck Lever wrote:
> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
> Send was too large?
> 
> Timo, what filesystem are you sharing on your NFS server? The thing that
> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
> 

The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), 
just as in that bug report.

Should I try to apply the proposed fix you posted on that bug report on
the client (and server)?


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 15:39                     ` Timo Rothenpieler
@ 2020-08-04 15:46                       ` Chuck Lever
  2020-08-04 15:50                         ` Timo Rothenpieler
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2020-08-04 15:46 UTC (permalink / raw)
  To: Timo Rothenpieler; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma



> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> 
> On 04.08.2020 17:34, Chuck Lever wrote:
>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>> Send was too large?
>> Timo, what filesystem are you sharing on your NFS server? The thing that
>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
> 
> The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.
> 
> Should I try to apply the proposed fix you posted on that bug report on the client (and server?).

If you are hitting that bug, the server is the problem. The client
should work fine once the server is fixed. (I'm not happy about
the client's looping behavior either, but that will go away once
the server behaves).

I'm not hopeful that the fix applies cleanly to v4.19, but it
might. Another option would be upgrading your NFS server.


--
Chuck Lever





* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 15:46                       ` Chuck Lever
@ 2020-08-04 15:50                         ` Timo Rothenpieler
  2020-08-04 16:07                           ` Timo Rothenpieler
  0 siblings, 1 reply; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 15:50 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:46, Chuck Lever wrote:
> 
> 
>> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>
>> On 04.08.2020 17:34, Chuck Lever wrote:
>>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>>> Send was too large?
>>> Timo, what filesystem are you sharing on your NFS server? The thing that
>>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>>
>> The filesystem on the server is indeed a zfs-on-linux (version 0.8.4), just as in that bug report.
>>
>> Should I try to apply the proposed fix you posted on that bug report on the client (and server?).
> 
> If you are hitting that bug, the server is the problem. The client
> should work fine once the server is fixed. (I'm not happy about
> the client's looping behavior either, but that will go away once
> the server behaves).
> 
> I'm not hopeful that the fix applies cleanly to v4.19, but it
> might. Another option would be upgrading your NFS server.

It's running on 5.4.54 and the patch applies without any rejects (only
line offsets, plus fuzz 2 on one hunk):

> patching file fs/nfsd/nfs4xdr.c
> Hunk #1 succeeded at 3530 (offset 9 lines).
> Hunk #2 succeeded at 3556 (offset 9 lines).
> patching file include/linux/sunrpc/svc.h
> patching file include/linux/sunrpc/svc_rdma.h
> Hunk #2 succeeded at 172 (offset 1 line).
> Hunk #3 succeeded at 192 (offset 1 line).
> patching file include/linux/sunrpc/svc_xprt.h
> patching file net/sunrpc/svc.c
> Hunk #1 succeeded at 1635 (offset -2 lines).
> patching file net/sunrpc/svcsock.c
> Hunk #2 succeeded at 660 (offset 2 lines).
> Hunk #3 succeeded at 1181 (offset 4 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> Hunk #1 succeeded at 193 (offset 2 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_rw.c
> Hunk #1 succeeded at 481 (offset -3 lines).
> Hunk #2 succeeded at 500 (offset -3 lines).
> Hunk #3 succeeded at 510 (offset -3 lines).
> Hunk #4 succeeded at 524 (offset -3 lines).
> Hunk #5 succeeded at 538 (offset -3 lines).
> Hunk #6 succeeded at 578 (offset -3 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_sendto.c
> Hunk #1 succeeded at 856 (offset -15 lines).
> Hunk #2 succeeded at 891 with fuzz 2 (offset -22 lines).
> patching file net/sunrpc/xprtrdma/svc_rdma_transport.c
> Hunk #1 succeeded at 81 (offset -1 lines).

I will deploy the patch to both server and client and report back.
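
Since the output above contains one hunk that was applied with fuzz, a quick way to spot such hunks in a long log is sketched here. The regular expressions are written against the message format shown in this mail and may not cover every variant that patch prints; the function name is hypothetical.

```python
import re

FILE_RE = re.compile(r"patching file (\S+)")
HUNK_RE = re.compile(
    r"Hunk #(\d+) succeeded at (\d+)"
    r"(?: with fuzz (\d+))?"
    r"(?: \(offset (-?\d+) lines?\))?"
)


def summarize_patch_output(text):
    """Return one dict per applied hunk: file, hunk number, line, fuzz, offset."""
    hunks = []
    current_file = None
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate mail quoting
        file_match = FILE_RE.search(line)
        if file_match:
            current_file = file_match.group(1)
            continue
        hunk_match = HUNK_RE.search(line)
        if hunk_match:
            number, at, fuzz, offset = hunk_match.groups()
            hunks.append({
                "file": current_file,
                "hunk": int(number),
                "line": int(at),
                "fuzz": int(fuzz) if fuzz else 0,
                "offset": int(offset) if offset else 0,
            })
    return hunks
```

Filtering the result for entries with a non-zero "fuzz" value lists the hunks worth reviewing by hand before deploying the patched kernel.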


* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 15:34                   ` Chuck Lever
  2020-08-04 15:39                     ` Timo Rothenpieler
@ 2020-08-04 15:55                     ` Leon Romanovsky
  1 sibling, 0 replies; 19+ messages in thread
From: Leon Romanovsky @ 2020-08-04 15:55 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Timo Rothenpieler, Linux NFS Mailing List, linux-rdma

On Tue, Aug 04, 2020 at 11:34:05AM -0400, Chuck Lever wrote:
>
>
> > On Aug 4, 2020, at 9:53 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> >
> >
> >> On Aug 4, 2020, at 9:46 AM, Leon Romanovsky <leon@kernel.org> wrote:
> >>
> >> On Tue, Aug 04, 2020 at 09:12:55AM -0400, Chuck Lever wrote:
> >>>
> >>>
> >>>> On Aug 4, 2020, at 9:08 AM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
> >>>>
> >>>> On 04.08.2020 14:49, Chuck Lever wrote:
> >>>>> Timo, I tend to think this is not a configuration issue.
> >>>>> Do you know of a known working kernel?
> >>>>
> >>>> This is a brand new system, it's never been running with any kernel older than 5.4, and downgrading it to 4.19 or something else while in operation is unfortunately not easily possible. For a client it would definitely not be out of the question, but the main nfs server I cannot easily downgrade.
> >>>>
> >>>> Also keep in mind that the dmesg spam happens on both server and client simultaneously.
> >>>
> >>> Let's start with the client only, since restarting it seems to clear the problem.
> >>
> >> It is client because according to the server CQE errors, it is Remote_Invalid_Request_Error
> >> with "9.7.5.2.2 NAK CODES" from IBTA.
> >
> > Thanks! OK, then let's use ftrace.
> >
> > Timo, can you install trace-cmd on your client? Then:
> >
> > 1. # trace-cmd record -e rpcrdma -e sunrpc
> >
> > 2. Trigger the problem
> >
> > 3. Control-C the trace-cmd, and copy the trace.dat file to another system
> >
> > 4. reboot your client
> >
> > Then send me your trace.dat. You don't have to cc the mailing lists.
>
> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
> Send was too large?

1.
We have a local_length_error counter; it can help to check it on both the server and the clients.
[leonro@vm ~]$ cat /sys/class/infiniband/ibp0s9/ports/1/hw_counters/resp_local_length_error
0

resp_local_length_error - "Number of times responder detected local length errors."
2.
LOC_LEN_ERR is consistent with what is written in the CQE error on the client.
This is what is written in our HW document:
 IB compliant completion with error syndrome
	0x1: Local_Length_Error
3.
From IBTA, 11.6.2 COMPLETION RETURN STATUS
Local Length Error - Generated for a Work Request posted to the local
Send Queue when the sum of the Data Segment lengths exceeds the message
length for the channel adapter port. Generated for a Work Request posted
to the local Receive Queue when the sum of the Data Segment lengths is
too small to receive a valid incoming message or the length of the incoming
message is greater than the maximum message size supported by the HCA port
that received the message.


So if "1" works :), we will be able to distinguish whether the client sends
a too-large WR or receives a too-large message.
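
To make the check in "1" easier to repeat on the server and on each client,
here is a minimal sketch that reads the counter for every device and port and
compares a snapshot taken before reproducing the problem with one taken
after. It assumes the sysfs layout shown in the cat command above;
read_hw_counter and counter_delta are hypothetical helper names, and the only
counter name taken from this thread is resp_local_length_error.

```python
from pathlib import Path


def read_hw_counter(counter, root="/sys/class/infiniband"):
    """Return {(device, port): value} for one hw_counters entry.

    Assumes the layout <root>/<device>/ports/<port>/hw_counters/<counter>.
    """
    results = {}
    for path in sorted(Path(root).glob(f"*/ports/*/hw_counters/{counter}")):
        device = path.parents[3].name  # e.g. ibp0s9
        port = path.parents[1].name    # e.g. 1
        try:
            results[(device, port)] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Skip counters that are unreadable or not plain integers.
            continue
    return results


def counter_delta(before, after):
    """Return only the (device, port) entries whose value changed."""
    return {
        key: value - before.get(key, 0)
        for key, value in after.items()
        if value != before.get(key, 0)
    }


if __name__ == "__main__":
    name = "resp_local_length_error"
    before = read_hw_counter(name)
    input("Reproduce the problem, then press Enter... ")
    after = read_hw_counter(name)
    for (dev, port), diff in counter_delta(before, after).items():
        print(f"{dev} port {port}: {name} increased by {diff}")
```

Running it on both ends shows on which side, if any, the responder counter
moves while the errors are being logged.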

Thanks

>
> Timo, what filesystem are you sharing on your NFS server? The thing that
> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>
>
> --
> Chuck Lever
>
>
>

* Re: NFS over RDMA issues on Linux 5.4
  2020-08-04 15:50                         ` Timo Rothenpieler
@ 2020-08-04 16:07                           ` Timo Rothenpieler
  0 siblings, 0 replies; 19+ messages in thread
From: Timo Rothenpieler @ 2020-08-04 16:07 UTC
  To: Chuck Lever; +Cc: Leon Romanovsky, Linux NFS Mailing List, linux-rdma

On 04.08.2020 17:50, Timo Rothenpieler wrote:
> On 04.08.2020 17:46, Chuck Lever wrote:
>>
>>
>>> On Aug 4, 2020, at 11:39 AM, Timo Rothenpieler 
>>> <timo@rothenpieler.org> wrote:
>>>
>>> On 04.08.2020 17:34, Chuck Lever wrote:
>>>> I see a LOC_LEN_ERR on a Receive. Leon, doesn't that mean the server's
>>>> Send was too large?
>>>> Timo, what filesystem are you sharing on your NFS server? The thing 
>>>> that
>>>> comes to mind is https://bugzilla.kernel.org/show_bug.cgi?id=198053
>>>
>>> The filesystem on the server is indeed a zfs-on-linux (version 
>>> 0.8.4), just as in that bug report.
>>>
>>> Should I try to apply the proposed fix you posted on that bug report 
>>> on the client (and server?).
>>
>> If you are hitting that bug, the server is the problem. The client
>> should work fine once the server is fixed. (I'm not happy about
>> the client's looping behavior either, but that will go away once
>> the server behaves).
>>
>> I'm not hopeful that the fix applies cleanly to v4.19, but it
>> might. Another option would be upgrading your NFS server.
> 
> It's running on 5.4.54 and the patch applies with no fuzz whatsoever:

> 
> I will deploy the patch to both server and client and report back.

Reporting success.

With the patch from that bug applied, no error spam is happening anymore.
Plus, the filesystem actually works and definitely got a whole lot 
snappier than before. Which is not all that unexpected.

Thank you so much for your help analyzing this and for the fix!
I hope it can get applied to mainline soon and will reach 5.4 backports 
eventually.
Until then, I will carry it as a local patch for the systems.


Thanks again,
Timo

Thread overview: 19+ messages
2020-08-03 15:05 NFS over RDMA issues on Linux 5.4 Timo Rothenpieler
2020-08-03 16:24 ` Chuck Lever
2020-08-04  9:36   ` Leon Romanovsky
2020-08-04 10:52     ` Timo Rothenpieler
2020-08-04 12:25       ` Leon Romanovsky
2020-08-04 12:49         ` Chuck Lever
2020-08-04 13:08           ` Timo Rothenpieler
2020-08-04 13:12             ` Chuck Lever
2020-08-04 13:19               ` Timo Rothenpieler
2020-08-04 13:24                 ` Chuck Lever
2020-08-04 13:40                   ` Timo Rothenpieler
2020-08-04 13:46               ` Leon Romanovsky
2020-08-04 13:53                 ` Chuck Lever
2020-08-04 15:34                   ` Chuck Lever
2020-08-04 15:39                     ` Timo Rothenpieler
2020-08-04 15:46                       ` Chuck Lever
2020-08-04 15:50                         ` Timo Rothenpieler
2020-08-04 16:07                           ` Timo Rothenpieler
2020-08-04 15:55                     ` Leon Romanovsky