From: Luse, Paul E <paul.e.luse at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] SPDK errors
Date: Thu, 31 Aug 2017 22:36:45 +0000	[thread overview]
Message-ID: <82C9F782B054C94B9FC04A331649C77A68F3B372@fmsmsx104.amr.corp.intel.com> (raw)
In-Reply-To: CAKJZjiw2N+7xxLbUhLmSdyLxwTbsqdMMHhyys1Kb73vhOqF+Qw@mail.gmail.com


Well those are good steps… hopefully someone else will jump in as well. I will see if I can get my HW setup to repro over the long weekend and let ya know how it goes…

From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Santhebachalli Ganesh
Sent: Wednesday, August 30, 2017 12:44 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] SPDK errors

Thanks.

I did change a few cables, and even targets, on the assumption that this statistically rules out a poorly seated cable.
Could not do IB instead of Eth as my adapter does not support IB.

Also, the error messages are not consistent. That was a snapshot of one of the runs.
Then there are also the older ConnectX-3 adapters (latest FW flashed) running with newer kernels and the latest SPDK/DPDK.

I have seen the following:
>nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
>bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
>bdev.c: 511:spdk_bdev_finish: *ERROR*: bdev IO pool count is 65533 but should be 65536
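
For what it's worth, rc = -12 is -ENOMEM on Linux, so the readv failure most likely means bdev_nvme_queue_cmd() could not allocate/queue a request (e.g. the qpair's request pool was exhausted under load) rather than a device-level I/O error. The small sketch below is not SPDK code; it only illustrates the errno mapping:

    /* Sketch only (not SPDK source): decode the rc = -12 seen above. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        int rc = -12;
        /* On Linux, 12 == ENOMEM, so strerror() prints "Cannot allocate memory". */
        printf("rc=%d -> %s\n", rc, strerror(-rc));
        if (rc == -ENOMEM) {
            printf("request could not be queued; back off and retry, or lower the QD\n");
        }
        return 0;
    }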

On Tue, Aug 29, 2017 at 7:33 PM, Vladislav Bolkhovitin <vst(a)vlnb.net> wrote:

Santhebachalli Ganesh wrote on 08/29/2017 10:14 AM:
> Folks,
> My name is Ganesh, and I am working on NVMe-oF performance metrics using SPDK (and the kernel).
> I would appreciate your expert insights.
>
> I am observing errors most of the time when the QD in perf is increased to >= 64,
> and sometimes even at <= 16.
> The errors are not consistent.
>
> Attached are some details.
>
> Please let me know if you have any additional questions.
>
> Thanks.
> -Ganesh
>
> SPDK errors 1.txt
>
>
> Setup details:
> -- Some info on setup
> Same HW/SW on target and initiator.
>
> adminuser(a)dell730-80:~> hostnamectl
>    Static hostname: dell730-80
>          Icon name: computer-server
>            Chassis: server
>         Machine ID: b5abb0fe67afd04c59521c40599b3115
>            Boot ID: f825aa6338194338a6f80125caa836c7
>   Operating System: openSUSE Leap 42.3
>        CPE OS Name: cpe:/o:opensuse:leap:42.3
>             Kernel: Linux 4.12.8-1.g4d7933a-default
>       Architecture: x86-64
>
> adminuser(a)dell730-80:~> lscpu | grep -i socket
> Core(s) per socket:    12
> Socket(s):             2
>
> 2MB and/or 1GB huge pages set,
>
> Latest spdk/dpdk from respective GIT,
>
> compiled with RDMA flag,
>
> nvmf.conf file (I have played around with the values; a rough sketch follows below):
> reactor mask 0x5555
> AcceptorCore 2
> 1 - 3 Subsystems on cores 4,8,10
>
> adminuser(a)dell730-80:~> sudo gits/spdk/app/nvmf_tgt/nvmf_tgt -c gits/spdk/etc/spdk/nvmf.conf -p 6
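> 
> As a rough illustration only (the section and key names below follow the legacy INI config format of that SPDK era and may not match the exact file; only the first of the three subsystems is shown):
> 
>   [Global]
>     ReactorMask 0x5555
> 
>   [Nvmf]
>     AcceptorCore 2
> 
>   [Subsystem1]
>     NQN nqn.2016-06.io.spdk:cnode1
>     Core 4
>     Listen RDMA 1.1.1.80:4420
>     NVMe 0000:04:00.0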
>
> PCI, NVME cards (16GB)
> adminuser(a)dell730-80:~> sudo lspci | grep -i pmc
> 04:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
> 06:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
> 85:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
>
> Network cards: (latest associated FW from vendor)
> adminuser(a)dell730-80:~> sudo lspci | grep -i connect
> 05:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
>
> --- initiator cmd line
> sudo ./perf -q 32 -s 512 -w randread -t 30 -r 'trtype:RDMA adrfam:IPv4 traddr:1.1.1.80 trsvcid:4420 subnqn:nqn.2016-06.io.spdk:cnode1' -c 0x2
>
> --- errors on stdout on the target
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:201 cdw0:0 sqhd:0094 p:0 m:0 dnr:0
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:201 cdw0:0 sqhd:0094 p:0 m:0 dnr:0
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:198 cdw0:0 sqhd:0094 p:0 m:0 dnr:0
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:222 cdw0:0 sqhd:0094 p:0 m:0 dnr:0
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not map to outstanding cmd
> Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:222 cdw0:0 sqhd:0094 p:0 m:0 dnr:0
> Aug 24 17:14:09 dell730-80 nvmf[38006]: bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
> Aug 24 17:14:09 dell730-80 nvmf[38006]: bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
> Aug 24 17:14:09 dell730-80 nvmf[38006]: bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
> Aug 24 17:14:09 dell730-80 nvmf[38006]: bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
> Aug 24 17:14:09 dell730-80 nvmf[38006]: bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
> Aug 24 17:14:13 dell730-80 nvmf[38006]: rdma.c:1622:spdk_nvmf_rdma_poll: *ERROR*: CQ error on CQ 0x7f8a3803cae0, Request 0x140231622050400 (12): transport retry counter exceeded
>
> --- errors seen on the client
> nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on Queue Pair 0x1fdb580, Response Index 33408520 (13): RNR retry counter exceeded
> nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on Queue Pair 0x1fdb580, Response Index 33408016 (5): Work Request Flushed Error
> nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on Queue Pair 0x1fdb580, Response Index 14 (5): Work Request Flushed Error
> nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on Queue Pair 0x1fdb580, Response Index 15 (5): Work Request Flushed Error

Actually, these might be HW errors, because retries are supposed to be engaged only on
packet loss/corruption. It might be bad or not fully inserted cables.

Vlad
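
For reference, the numeric codes in those CQ error messages look like standard ibv_wc_status values from libibverbs (5 = IBV_WC_WR_FLUSH_ERR, 12 = IBV_WC_RETRY_EXC_ERR, 13 = IBV_WC_RNR_RETRY_EXC_ERR), which fits the retry-exhaustion explanation above. A minimal sketch for decoding them with the stock verbs helper (nothing SPDK-specific is assumed here):

    /* Decode the ibv_wc_status codes quoted in the logs above. */
    #include <stdio.h>
    #include <infiniband/verbs.h>

    int main(void)
    {
        enum ibv_wc_status seen[] = {
            IBV_WC_WR_FLUSH_ERR,      /* 5:  Work Request Flushed Error       */
            IBV_WC_RETRY_EXC_ERR,     /* 12: transport retry counter exceeded */
            IBV_WC_RNR_RETRY_EXC_ERR, /* 13: RNR retry counter exceeded       */
        };

        for (size_t i = 0; i < sizeof(seen) / sizeof(seen[0]); i++) {
            printf("%d: %s\n", seen[i], ibv_wc_status_str(seen[i]));
        }
        return 0;
    }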

_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk


