linux-rdma.vger.kernel.org archive mirror
* NFS dmesg errors in 5.14-rc1
@ 2021-07-14 16:40 Marciniszyn, Mike
  2021-07-14 19:43 ` Chuck Lever III
  0 siblings, 1 reply; 2+ messages in thread
From: Marciniszyn, Mike @ 2021-07-14 16:40 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: linux-rdma

Chuck,

We are now seeing this in the first RC:


[31868.644165] ------------[ cut here ]------------
[31868.650059] failed to drain recv queue: -22
[31868.655191] WARNING: CPU: 32 PID: 559 at drivers/infiniband/core/verbs.c:2738 __ib_drain_rq+0x163/0x1a0 [ib_core]
[31868.657234] ------------[ cut here ]------------
[31868.667133] Modules linked in: nfsv3
[31868.672832] failed to drain send queue: -22
[31868.677279]  nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs tcp_diag udp_diag raw_diag inet_diag rfkill ib_isert iscsi_target_mod target_core_mod rpcrdma ib_iser rdma_ucm opa_vnic rdma_cm ib_umad libiscsi ib_ipoib scsi_transport_iscsi ib_cm iw_cm sunrpc hfi1 mgag200 intel_rapl_msr intel_rapl_common drm_kms_helper sb_edac syscopyarea rdmavt x86_pkg_temp_thermal sysfillrect intel_powerclamp ipmi_si ib_uverbs sysimgblt coretemp fb_sys_fops cec ipmi_devintf drm crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support ghash_clmulni_intel ib_core mei_me rapl intel_cstate mei lpc_ich mxm_wmi i2c_i801
[31868.682425] WARNING: CPU: 65 PID: 608575 at drivers/infiniband/core/verbs.c:2705 __ib_drain_sq+0x14d/0x190 [ib_core]

On the same tests, the mount command fails with a connection refused...

Any ideas on this?

5.13.1 (the first 5.13.y release) tests fine.

Mike


* Re: NFS dmesg errors in 5.14-rc1
  2021-07-14 16:40 NFS dmesg errors in 5.14-rc1 Marciniszyn, Mike
@ 2021-07-14 19:43 ` Chuck Lever III
  0 siblings, 0 replies; 2+ messages in thread
From: Chuck Lever III @ 2021-07-14 19:43 UTC (permalink / raw)
  To: Marciniszyn, Mike; +Cc: linux-rdma

Hi Mike-

> On Jul 14, 2021, at 12:40 PM, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
> 
> Chuck,
> 
> We are now seeing this in the first RC:
> 
> 
> [31868.644165] ------------[ cut here ]------------
> [31868.650059] failed to drain recv queue: -22
> [31868.655191] WARNING: CPU: 32 PID: 559 at drivers/infiniband/core/verbs.c:2738 __ib_drain_rq+0x163/0x1a0 [ib_core]
> [31868.657234] ------------[ cut here ]------------
> [31868.667133] Modules linked in: nfsv3
> [31868.672832] failed to drain send queue: -22
> [31868.677279]  nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs tcp_diag udp_diag raw_diag inet_diag rfkill ib_isert iscsi_target_mod target_core_mod rpcrdma ib_iser rdma_ucm opa_vnic rdma_cm ib_umad libiscsi ib_ipoib scsi_transport_iscsi ib_cm iw_cm sunrpc hfi1 mgag200 intel_rapl_msr intel_rapl_common drm_kms_helper sb_edac syscopyarea rdmavt x86_pkg_temp_thermal sysfillrect intel_powerclamp ipmi_si ib_uverbs sysimgblt coretemp fb_sys_fops cec ipmi_devintf drm crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support ghash_clmulni_intel ib_core mei_me rapl intel_cstate mei lpc_ich mxm_wmi i2c_i801
> [31868.682425] WARNING: CPU: 65 PID: 608575 at drivers/infiniband/core/verbs.c:2705 __ib_drain_sq+0x14d/0x190 [ib_core]

The above warnings tell us ib_modify_qp() is returning -EINVAL,
twice in a row. ib_drain_qp() was not able to put the QP in the
ERR state, so it did not try to post the drain sentinels.
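
For readers who want to check the reading of the log, here is a minimal
sketch (not part of the original exchange) that decodes the -22 return
codes and pulls the warning sites out of the dmesg lines quoted above.
The regular expressions and the decode() helper are illustrative
assumptions; only the log lines themselves come from the report.

```python
import errno
import re

# Error and WARNING lines copied from the report above.
LOG = """\
[31868.650059] failed to drain recv queue: -22
[31868.655191] WARNING: CPU: 32 PID: 559 at drivers/infiniband/core/verbs.c:2738 __ib_drain_rq+0x163/0x1a0 [ib_core]
[31868.672832] failed to drain send queue: -22
[31868.682425] WARNING: CPU: 65 PID: 608575 at drivers/infiniband/core/verbs.c:2705 __ib_drain_sq+0x14d/0x190 [ib_core]
"""

# Hypothetical patterns written for these specific lines.
FAIL_RE = re.compile(r"failed to drain (\w+) queue: (-?\d+)")
WARN_RE = re.compile(r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+) (\w+)\+")


def decode(log):
    """Return (failures, warnings) parsed from dmesg text."""
    # Kernel return codes are negated errno values, so flip the sign
    # before looking up the symbolic name.
    failures = [(queue, int(rc), errno.errorcode.get(-int(rc), "?"))
                for queue, rc in FAIL_RE.findall(log)]
    warnings = [(func, f"{path}:{line}")
                for _cpu, _pid, path, line, func in WARN_RE.findall(log)]
    return failures, warnings


failures, warnings = decode(LOG)
for queue, rc, name in failures:
    print(f"{queue} queue drain failed: {rc} ({name})")
for func, where in warnings:
    print(f"warned in {func} at {where}")
```

Both failures decode to EINVAL, and the two warning sites are the
receive-side and send-side drain helpers in verbs.c, which matches the
reading given above.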


> On the same tests, the mount command fails with a connection refused...
> 
> Any ideas on this?
> 
> 5.13.1 (the first 5.13.y release) tests fine.

There is exactly one change to the client components in
net/sunrpc/xprtrdma/ in v5.14-rc1:

  e86be3a04bc4 ("SUNRPC: More fixes for backlog congestion")

Based on these two facts, my first inclination is that this is
a problem with the verbs provider, not with rpcrdma.ko.

Let's collect a little more information. Enable tracing on
your client before trying your test again:

 # trace-cmd record -e sunrpc -e rpcrdma -e rdma_core -e rdma_cma

When the test fails, ^C the trace-cmd, and have a look at the
trace.dat file (and/or, send it to me).
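
If the capture needs to be scripted as part of a test run, one possible
sketch follows. The helper names record_cmd() and report_cmd() are
hypothetical; the sketch only assembles the command lines, assuming
trace-cmd's "-o" (output file for record) and "-i" (input file for
report) options, and leaves running them (as root) to the caller.

```python
from typing import List

# Event systems named in the trace-cmd command above.
EVENTS = ["sunrpc", "rpcrdma", "rdma_core", "rdma_cma"]


def record_cmd(events: List[str], output: str = "trace.dat") -> List[str]:
    """Build the argv for 'trace-cmd record' with one -e per event system."""
    cmd = ["trace-cmd", "record", "-o", output]
    for ev in events:
        cmd += ["-e", ev]
    return cmd


def report_cmd(path: str = "trace.dat") -> List[str]:
    """Build the argv for turning the binary capture into readable text."""
    return ["trace-cmd", "report", "-i", path]


print(" ".join(record_cmd(EVENTS)))
print(" ".join(report_cmd()))
```

The resulting lists can be handed to subprocess.run() once the failing
mount has been reproduced.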


--
Chuck Lever



