From: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>,
	"Robert Randall (rrandall)"
	<rrandall-AL4WhLSQfzjQT0dZR+AlfA@public.gmane.org>,
	Keith Busch <keith.busch-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org"
	<linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: NVMeoF Linux GIT repo
Date: Wed, 26 Oct 2016 13:00:42 +0300
Message-ID: <6f07770e-5b21-1c4f-9215-58f5821aae34@mellanox.com>
In-Reply-To: <2aeadd6a-0f5c-5c59-cafa-10116ccc91c0-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

On 10/22/2016 1:19 AM, Sagi Grimberg wrote:
> Hey Robert,
> 
>> Sorry Keith, I'm back to the same question again.  I've tried using
>> the released 4.8.2 kernel and I'm seeing errors in the Linux RDMA
>> layer.  Log file is attached.  My guess is this may have been fixed
>> already but since I'm not writing code on Linux it is difficult to
>> keep up with which repo and which branch I should be using.
>>
>> It reports a syndrome 5 which appears to mean "work request flush error".
>>
>> Setup is stable 4.8.2 kernel with Mellanox RoCE v2.
>>
>> So, where do I grab the latest and greatest code these days?
> 
> So from a quick look at the log the FLUSH errors are
> just side effects. Once a queue-pair transitions to
> ERROR state it flushes all the pending work requests with
> a FLUSH syndrome, so we should look at the first error which
> is:
> 
> mlx5_1:poll_soft_wc:647:(pid 3422): polled software generated completion
> on CQ 0x14
> 
> This seems to come from the GSI QP completion emulation from
> Haggai (CC'd). CQ 0x14 is not the nvmet-rdma completion queue (from
> the log it's 0x5d), so something went wrong, but it does not
> seem to be nvmet-rdma's fault.
I'm not sure this line means anything wrong has happened. It just means
that the software-emulated CQ has received a packet (a MAD), and that
debugging prints are on.

We did have a bug with that code, and it was fixed in [1] (kernel 4.8), so
you should have the fix.

> 
> Haggai, any tips for Robert?

I'll take another look at the logs and see if I think of anything.

[1] https://patchwork.kernel.org/patch/9211211/
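
As a rough illustration of the triage described in the quoted text (flush
errors are side effects, so look for the first completion that is neither
a success nor a flush), here is a minimal sketch in C. The record layout,
the enum names and the function name are hypothetical and exist only for
this example; they are not taken from the mlx5 driver or from nvmet-rdma.
The value 5 simply mirrors the "syndrome 5" mentioned above.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch only. The value 5 mirrors
 * the "syndrome 5" (work request flush error) reported in the thread. */
enum wc_status {
    WC_SUCCESS      = 0,
    WC_WR_FLUSH_ERR = 5,
};

/* Hypothetical record of one completion, e.g. as parsed from a log. */
struct wc_record {
    unsigned int cq;  /* completion queue number */
    int status;       /* completion status / syndrome */
};

/*
 * Return the index of the first completion that is neither a success
 * nor a flush error, or -1 if there is none. Flush errors are skipped
 * because they only report that the QP was already in the ERROR state.
 */
long first_root_error(const struct wc_record *wc, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (wc[i].status != WC_SUCCESS &&
            wc[i].status != WC_WR_FLUSH_ERR)
            return (long)i;
    }
    return -1;
}
```

A return value of -1 would mean the records contain only successes and
flushes, in which case the original failure has to be looked for
elsewhere (for example on the other side of the connection).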
