* Failure with 8K Write operations
@ 2016-09-12 20:49 Narayan Ayalasomayajula
  2016-09-13  9:16 ` Sagi Grimberg
  0 siblings, 1 reply; 10+ messages in thread
From: Narayan Ayalasomayajula @ 2016-09-12 20:49 UTC (permalink / raw)


Hello All,

I am running into a failure with the 4.8.0 branch and wanted to see if this is a known issue or whether there is something I am not doing right in my setup/configuration. The issue is that the Host indicates a NAK (Remote Access Error) condition when executing an FIO script that performs 100% 8K Write operations. Trace analysis shows that the target has the expected Virtual Address and R_KEY values in the READ REQUEST, but for some reason the Host flags the request as an access violation. I ran a similar test with iWARP Host and Target systems and did see a Terminate followed by a FIN from the Host. The cause of both failures appears to be the same.
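
For background: a NAK (Remote Access Error) on an incoming RDMA READ means the responder's HCA rejected the request's R_KEY / Virtual Address / length against its registered memory regions. Below is a minimal userspace-verbs sketch of the host-side registration that must hold for the target's READ REQUEST to succeed - illustrative only, since the kernel nvme-rdma host uses fast-registration MRs rather than ibv_reg_mr(), and the helper name here is made up:

    #include <infiniband/verbs.h>
    #include <stddef.h>

    /*
     * The target's RDMA READ succeeds only if the rkey it carries maps to a
     * valid, not-yet-invalidated MR with remote-read permission and the
     * VA/length fall inside that MR; otherwise the host HCA answers with a
     * NAK (Remote Access Error).
     */
    struct ibv_mr *register_for_remote_read(struct ibv_pd *pd, void *buf, size_t len)
    {
            /* Grant the remote peer permission to RDMA READ this buffer. */
            return ibv_reg_mr(pd, buf, len, IBV_ACCESS_REMOTE_READ);
    }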

The details of the configuration on the Host and Target systems are provided below and attached as well.

The repository where I retrieved the 4.8.0 branch is:

       git clone git://git.infradead.org/nvme-fabrics.git --branch=nvmf-4.8-rc

Test Configuration:
The Linux Host is using a ConnectX-4 (single port 100G, firmware-version: 12.16.1020) connected to a Mellanox switch, and the Linux Target is using a ConnectX-3 (single port 10G, firmware-version: 2.36.5000) connected to the same switch. Normal flow control is enabled on all ports of the switch (along with Jumbo frames). I am using null_blk as the target (so the IOs are not being serviced by a "real" nvme target).

FIO script:
	[global]
	ioengine=libaio
	direct=1

	runtime=10m
	size=800g
	time_based
	norandommap
	group_reporting

	bs=8k
	numjobs=8
	iodepth=16

	[rand_write]
	filename=/dev/nvme0n1
	rw=randwrite

If I missed providing any information on the failure, please let me know. Any help or guidance is appreciated.

Thanks,
Narayan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Configuration_for_NAK_issue.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 13396 bytes
Desc: Configuration_for_NAK_issue.docx
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160912/8ca9b75d/attachment-0003.docx>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Linux_Host_dmesg_log_for_NAK_issue.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 17285 bytes
Desc: Linux_Host_dmesg_log_for_NAK_issue.docx
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160912/8ca9b75d/attachment-0004.docx>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Linux_Target_dmesg_log_for_NAK_issue.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 20879 bytes
Desc: Linux_Target_dmesg_log_for_NAK_issue.docx
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160912/8ca9b75d/attachment-0005.docx>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-12 20:49 Failure with 8K Write operations Narayan Ayalasomayajula
@ 2016-09-13  9:16 ` Sagi Grimberg
  2016-09-13 20:04   ` Narayan Ayalasomayajula
  0 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2016-09-13  9:16 UTC (permalink / raw)



> Hello All,

Hi Narayan,

> I am running into a failure with the 4.8.0 branch and wanted to see this is a known issue or whether there is something I am not doing right in my setup/configuration. The issue that I am running into is that the Host is indicating a NAK (Remote Access Error) condition when executing an FIO script that is performing 100% 8K Write operations. Trace analysis shows that the target has the expected Virtual Address and R_KEY values in the READ REQUEST but for some reason, the Host flags the request as an access violation. I ran a similar test with iWARP Host and Target systems and the did see a Terminate followed by FIN from the Host. The cause for both failures appears to be the same.
>

I cannot reproduce what you are seeing on my setup (Steve, can you?)
I'm running 2 VMs connected over SRIOV on the same PC though...

Can you share the log on the host side?

Can you also add this print to verify that the
host driver programmed the same sgl as it sent
to the target:
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c2c2c28e6eb5..248fa2e5cabf 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
         sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
                         NVME_SGL_FMT_INVALIDATE;

+       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
+               __func__, req->mr->rkey, req->mr->iova, req->mr->length);
+
         return 0;
  }
--
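
For reference when comparing the printed values with the capsule on the wire: the keyed SGL data block descriptor in the NVMe command carries the same address/length/key triple. A sketch of its layout, paraphrased from include/linux/nvme.h of that era (consult your kernel's header for the authoritative definition):

    #include <linux/types.h>

    /*
     * Keyed SGL data block descriptor as carried in the command capsule.
     * The host fills addr from mr->iova, length from mr->length and key
     * from mr->rkey, so the pr_err() above should match the SGL seen in
     * the RDMA trace.
     */
    struct nvme_keyed_sgl_desc {
            __le64  addr;           /* remote virtual address (iova) */
            __u8    length[3];      /* 24-bit transfer length */
            __u8    key[4];         /* R_KEY used by the target's RDMA READ */
            __u8    type;           /* descriptor type / sub-type */
    };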

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-13  9:16 ` Sagi Grimberg
@ 2016-09-13 20:04   ` Narayan Ayalasomayajula
  2016-09-13 23:51     ` J Freyensee
  0 siblings, 1 reply; 10+ messages in thread
From: Narayan Ayalasomayajula @ 2016-09-13 20:04 UTC (permalink / raw)


Hi Sagi,

Thanks for the print statement to verify that the sgls in the command capsule match what the Host programmed. I added this print statement and compared the Virtual Address and R_Key information in /var/log with the NVMe commands in the trace file, and found the two to match. I have the trace and Host log files from this failure (the trace is ~6M) - would they be useful to someone looking into this issue?

Regarding the host side log information you mentioned, I had attached that in my prior email (attached again). Is this what you are requesting? That was collected prior to adding the print statement that you suggested.

Just to summarize, the failure is seen in the following configuration:

1. Host is an 8-core Ubuntu server running the 4.8.0 kernel. It has a ConnectX-4 RNIC (1x100G) and is connected to a Mellanox Switch.
2. Target is an 8-core Ubuntu server running the 4.8.0 kernel. It has a ConnectX-3 RNIC (1x10G) and is connected to a Mellanox Switch.
3. Switch has normal Pause and Jumbo frame support enabled on all ports.
4. Test fails with Host sending a NAK (Remote Access Error) for the following FIO workload:

	[global]
	ioengine=libaio
	direct=1
	runtime=10m
	size=800g
	time_based
	norandommap
	group_reporting
	bs=8k
	numjobs=8
	iodepth=16

	[rand_write]
	filename=/dev/nvme0n1
	rw=randwrite 

I have found that the failure happens with numjobs set to 1 as well.

Thanks again for your response,
Narayan

-----Original Message-----
From: Sagi Grimberg [mailto:sagi@grimberg.me] 
Sent: Tuesday, September 13, 2016 2:16 AM
To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.com>; linux-nvme at lists.infradead.org
Subject: Re: Failure with 8K Write operations


> Hello All,

Hi Narayan,

> I am running into a failure with the 4.8.0 branch and wanted to see this is a known issue or whether there is something I am not doing right in my setup/configuration. The issue that I am running into is that the Host is indicating a NAK (Remote Access Error) condition when executing an FIO script that is performing 100% 8K Write operations. Trace analysis shows that the target has the expected Virtual Address and R_KEY values in the READ REQUEST but for some reason, the Host flags the request as an access violation. I ran a similar test with iWARP Host and Target systems and the did see a Terminate followed by FIN from the Host. The cause for both failures appears to be the same.
>

I cannot reproduce what you are seeing on my setup (Steve, can you?) I'm running 2 VMs connected over SRIOV on the same PC though...

Can you share the log on the host side?

Can you also add this print to verify that the host driver programmed the same sgl as it sent the target:
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index c2c2c28e6eb5..248fa2e5cabf 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
         sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
                         NVME_SGL_FMT_INVALIDATE;

+       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
+               __func__, req->mr->rkey, req->mr->iova, 
+ req->mr->length);
+
         return 0;
  }
--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Linux_Host_dmesg_log_for_NAK_issue.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 17285 bytes
Desc: Linux_Host_dmesg_log_for_NAK_issue.docx
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20160913/90d1fa17/attachment-0001.docx>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-13 20:04   ` Narayan Ayalasomayajula
@ 2016-09-13 23:51     ` J Freyensee
  2016-09-14  0:03       ` Narayan Ayalasomayajula
  0 siblings, 1 reply; 10+ messages in thread
From: J Freyensee @ 2016-09-13 23:51 UTC (permalink / raw)


On Tue, 2016-09-13 at 20:04 +0000, Narayan Ayalasomayajula wrote:
> Hi Sagi,
> 
> Thanks for the print statement to verify that the sgls in the command
> capsule match what the Host programmed. I added this print statement
> and compared the Virtual Address and R_Key information in the
> /var/log to the NVMe Commands in the trace file and found the two to
> match. I have the trace and Host log files from this failure (trace
> is ~6M) - will it be useful for someone who may be looking into this
> issue?
> 
> Regarding the host side log information you mentioned, I had attached
> that in my prior email (attached again). Is this what you are
> requesting? That was collected prior to adding the print statement
> that you suggested.
> 
> Just to summarize, the failure is seen in the following
> configuration:
> 
> 1. Host is an 8-core Ubuntu server running the 4.8.0 kernel. It has a
> ConnectX-4 RNIC (1x100G) and is connected to a Mellanox Switch.
> 2. Target is an 8-core Ubuntu server running the 4.8.0 kernel. It has
> a ConnectX-3 RNIC (1x10G) and is connected to a Mellanox Switch.
> 3. Switch has normal Pause and Jumbo frame support enabled on all
> ports.
> 4. Test fails with Host sending a NAK (Remote Access Error) for the
> following FIO workload:
> 
> 	[global]
> 	ioengine=libaio
> 	direct=1
> 	runtime=10m
> 	size=800g
> 	time_based
> 	norandommap
> 	group_reporting
> 	bs=8k
> 	numjobs=8
> 	iodepth=16
> 
> 	[rand_write]
> 	filename=/dev/nvme0n1
> 	rw=randwrite
> 

Hi Narayan:

I have a 2-host, 2-target 1RU server data network using a 32x Arista
switch, and using your FIO setup above I am not seeing any errors. I
tried running your script on each Host at the same time targeting the
same NVMe Target (but with different SSDs targeted by each Host), as
well as running the script on one Host only, and didn't see any errors.
I also tried 'numjobs=1' and didn't reproduce what you see.

Both Hosts and Targets for me are using the 4.8-rc4 kernel. Both the
Host and Target are using dual-port Mellanox ConnectX-3 Pro EN 40Gb (so
I'm using a RoCE setup). My Hosts are 32-processor machines and Targets
are 28-processor machines. All are filled with various Intel SSDs.

Something must be unique about your setup.

Jay


> I have found that the failure happens with numjobs set to 1 as well.
> 
> Thanks again for your response,
> Narayan
> 
> -----Original Message-----
> From: Sagi Grimberg [mailto:sagi at grimberg.me]
> Sent: Tuesday, September 13, 2016 2:16 AM
> To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.c
> om>; linux-nvme at lists.infradead.org
> Subject: Re: Failure with 8K Write operations
> 
> 
> > 
> > Hello All,
> 
> Hi Narayan,
> 
> > 
> > I am running into a failure with the 4.8.0 branch and wanted to see
> > this is a known issue or whether there is something I am not doing
> > right in my setup/configuration. The issue that I am running into
> > is that the Host is indicating a NAK (Remote Access Error)
> > condition when executing an FIO script that is performing 100% 8K
> > Write operations. Trace analysis shows that the target has the
> > expected Virtual Address and R_KEY values in the READ REQUEST but
> > for some reason, the Host flags the request as an access violation.
> > I ran a similar test with iWARP Host and Target systems and the did
> > see a Terminate followed by FIN from the Host. The cause for both
> > failures appears to be the same.
> > 
> 
> I cannot reproduce what you are seeing on my setup (Steve, can you?)
> I'm running 2 VMs connected over SRIOV on the same PC though...
> 
> Can you share the log on the host side?
> 
> Can you also add this print to verify that the host driver programmed
> the same sgl as it sent the target:
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index c2c2c28e6eb5..248fa2e5cabf 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct
> nvme_rdma_queue *queue,
>          sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
>                          NVME_SGL_FMT_INVALIDATE;
> 
> +       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
> +               __func__, req->mr->rkey, req->mr->iova,
> + req->mr->length);
> +
>          return 0;
>   }
> --
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-13 23:51     ` J Freyensee
@ 2016-09-14  0:03       ` Narayan Ayalasomayajula
  2016-09-14 16:52         ` J Freyensee
  0 siblings, 1 reply; 10+ messages in thread
From: Narayan Ayalasomayajula @ 2016-09-14  0:03 UTC (permalink / raw)


Hi Jay,

Thanks for taking the effort to emulate the behavior.

I did not mention this in my last email but had indicated it in the earlier posting. I am using null_blk as the target (so the IOs are not being serviced by a real nvme target). I am not sure if that could somehow be the catalyst for the failure. Is it possible for you to re-run your test with null_blk as the target? 

Thanks,
Narayan

-----Original Message-----
From: J Freyensee [mailto:james_p_freyensee@linux.intel.com] 
Sent: Tuesday, September 13, 2016 4:51 PM
To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.com>; Sagi Grimberg <sagi at grimberg.me>; linux-nvme at lists.infradead.org
Subject: Re: Failure with 8K Write operations

On Tue, 2016-09-13 at 20:04 +0000, Narayan Ayalasomayajula wrote:
> Hi Sagi,
> 
> Thanks for the print statement to verify that the sgls in the command 
> capsule match what the Host programmed. I added this print statement 
> and compared the Virtual Address and R_Key information in the /var/log 
> to the NVMe Commands in the trace file and found the two to match. I 
> have the trace and Host log files from this failure (trace is ~6M) - 
> will it be useful for someone who may be looking into this issue?
> 
> Regarding the host side log information you mentioned, I had attached 
> that in my prior email (attached again). Is this what you are 
> requesting? That was collected prior to adding the print statement 
> that you suggested.
> 
> Just to summarize, the failure is seen in the following
> configuration:
> 
> 1. Host is an 8-core Ubuntu server running the 4.8.0 kernel. It has a
> ConnectX-4 RNIC (1x100G) and is connected to a Mellanox Switch.
> 2. Target is an 8-core Ubuntu server running the 4.8.0 kernel. It has 
> a ConnectX-3 RNIC (1x10G) and is connected to a Mellanox Switch.
> 3. Switch has normal Pause and Jumbo frame support enabled on all 
> ports.
> 4. Test fails with Host sending a NAK (Remote Access Error) for the 
> following FIO workload:
> 
> 	[global]
> 	ioengine=libaio
> 	direct=1
> 	runtime=10m
> 	size=800g
> 	time_based
> 	norandommap
> 	group_reporting
> 	bs=8k
> 	numjobs=8
> 	iodepth=16
> 
> 	[rand_write]
> 	filename=/dev/nvme0n1
> 	rw=randwrite
> 

Hi Narayan:

I have a 2 host, 2 target 1RU server data network using a 32x Arista
switch and using your FIO setup above, I am not seeing any errors. I
tried running your script on each Host at the same time targeting the
same NVMe Target (but different SSDs targeted by each Host) as well as
only running the script on 1 Host only and didn't see any errors. Also
tried 'numjobs=1' and didn't reproduce what you see.

Both Host and Targets for me are using the 4.8-rc4 kernel. Both the
Host and Target are using dual port Mellanox ConnectX-3 Pro EN 40Gb (so
I'm using a RoCE setup). My Hosts are 32 processor machines and Targets
are 28 Processor machine. All filled w/various Intel SSDs.

Something unique about your setup.

Jay


> I have found that the failure happens with numjobs set to 1 as well.
> 
> Thanks again for your response,
> Narayan
> 
> -----Original Message-----
> From: Sagi Grimberg [mailto:sagi at grimberg.me]
> Sent: Tuesday, September 13, 2016 2:16 AM
> To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.c
> om>; linux-nvme at lists.infradead.org
> Subject: Re: Failure with 8K Write operations
> 
> 
> > 
> > Hello All,
> 
> Hi Narayan,
> 
> > 
> > I am running into a failure with the 4.8.0 branch and wanted to see
> > this is a known issue or whether there is something I am not doing
> > right in my setup/configuration. The issue that I am running into
> > is that the Host is indicating a NAK (Remote Access Error)
> > condition when executing an FIO script that is performing 100% 8K
> > Write operations. Trace analysis shows that the target has the
> > expected Virtual Address and R_KEY values in the READ REQUEST but
> > for some reason, the Host flags the request as an access violation.
> > I ran a similar test with iWARP Host and Target systems and the did
> > see a Terminate followed by FIN from the Host. The cause for both
> > failures appears to be the same.
> > 
> 
> I cannot reproduce what you are seeing on my setup (Steve, can you?)
> I'm running 2 VMs connected over SRIOV on the same PC though...
> 
> Can you share the log on the host side?
> 
> Can you also add this print to verify that the host driver programmed
> the same sgl as it sent the target:
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index c2c2c28e6eb5..248fa2e5cabf 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct
> nvme_rdma_queue *queue,
>          sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
>                          NVME_SGL_FMT_INVALIDATE;
> 
> +       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
> +               __func__, req->mr->rkey, req->mr->iova,
> + req->mr->length);
> +
>          return 0;
>   }
> --
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-14  0:03       ` Narayan Ayalasomayajula
@ 2016-09-14 16:52         ` J Freyensee
  2016-09-15 13:36           ` Narayan Ayalasomayajula
  0 siblings, 1 reply; 10+ messages in thread
From: J Freyensee @ 2016-09-14 16:52 UTC (permalink / raw)


On Wed, 2016-09-14 at 00:03 +0000, Narayan Ayalasomayajula wrote:
> Hi Jay,
> 
> Thanks for taking the effort to emulate the behavior.
> 
> I did not mention this in my last email but had indicated it in the
> earlier posting. I am using null_blk as the target (so the IOs are
> not being serviced by a real nvme target). I am not sure if that
> could somehow be the catalyst for the failure. Is it possible for you
> to re-run your test with null_blk as the target?

As we talked off-line, try the latest mainline kernel from kernel.org
and see if you see anything different.

> 
> Thanks,
> Narayan
> 
> -----Original Message-----
> From: J Freyensee [mailto:james_p_freyensee at linux.intel.com]
> Sent: Tuesday, September 13, 2016 4:51 PM
> To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.c
> om>; Sagi Grimberg <sagi at grimberg.me>; linux-nvme at lists.infradead.org
> Subject: Re: Failure with 8K Write operations
> 
> On Tue, 2016-09-13 at 20:04 +0000, Narayan Ayalasomayajula wrote:
> > 
> > Hi Sagi,
> > 
> > Thanks for the print statement to verify that the sgls in the
> > command
> > capsule match what the Host programmed. I added this print
> > statement
> > and compared the Virtual Address and R_Key information in the
> > /var/log
> > to the NVMe Commands in the trace file and found the two to match.
> > I
> > have the trace and Host log files from this failure (trace is ~6M)
> > -
> > will it be useful for someone who may be looking into this issue?
> > 
> > Regarding the host side log information you mentioned, I had
> > attached
> > that in my prior email (attached again). Is this what you are
> > requesting? That was collected prior to adding the print statement
> > that you suggested.
> > 
> > Just to summarize, the failure is seen in the following
> > configuration:
> > 
> > 1. Host is an 8-core Ubuntu server running the 4.8.0 kernel. It has
> > a
> > ConnectX-4 RNIC (1x100G) and is connected to a Mellanox Switch.
> > 2. Target is an 8-core Ubuntu server running the 4.8.0 kernel. It
> > has
> > a ConnectX-3 RNIC (1x10G) and is connected to a Mellanox Switch.
> > 3. Switch has normal Pause and Jumbo frame support enabled on all
> > ports.
> > 4. Test fails with Host sending a NAK (Remote Access Error) for
> > the
> > following FIO workload:
> > 
> > 	[global]
> > 	ioengine=libaio
> > 	direct=1
> > 	runtime=10m
> > 	size=800g
> > 	time_based
> > 	norandommap
> > 	group_reporting
> > 	bs=8k
> > 	numjobs=8
> > 	iodepth=16
> > 
> > 	[rand_write]
> > 	filename=/dev/nvme0n1
> > 	rw=randwrite
> > 
> 
> Hi Narayan:
> 
> I have a 2 host, 2 target 1RU server data network using a 32x Arista
> switch and using your FIO setup above, I am not seeing any errors. I
> tried running your script on each Host at the same time targeting the
> same NVMe Target (but different SSDs targeted by each Host) as well
> as
> only running the script on 1 Host only and didn't see any errors.
> Also
> tried 'numjobs=1' and didn't reproduce what you see.
> 
> Both Host and Targets for me are using the 4.8-rc4 kernel. Both the
> Host and Target are using dual port Mellanox ConnectX-3 Pro EN 40Gb
> (so
> I'm using a RoCE setup). My Hosts are 32 processor machines and
> Targets
> are 28 Processor machine. All filled w/various Intel SSDs.
> 
> Something unique about your setup.
> 
> Jay
> 
> 
> > 
> > I have found that the failure happens with numjobs set to 1 as
> > well.
> > 
> > Thanks again for your response,
> > Narayan
> > 
> > -----Original Message-----
> > From: Sagi Grimberg [mailto:sagi at grimberg.me]
> > Sent: Tuesday, September 13, 2016 2:16 AM
> > To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks
> > .c
> > om>; linux-nvme at lists.infradead.org
> > Subject: Re: Failure with 8K Write operations
> > 
> > 
> > > 
> > > 
> > > Hello All,
> > 
> > Hi Narayan,
> > 
> > > 
> > > 
> > > I am running into a failure with the 4.8.0 branch and wanted to
> > > see
> > > this is a known issue or whether there is something I am not
> > > doing
> > > right in my setup/configuration. The issue that I am running into
> > > is that the Host is indicating a NAK (Remote Access Error)
> > > condition when executing an FIO script that is performing 100% 8K
> > > Write operations. Trace analysis shows that the target has the
> > > expected Virtual Address and R_KEY values in the READ REQUEST but
> > > for some reason, the Host flags the request as an access
> > > violation.
> > > I ran a similar test with iWARP Host and Target systems and the
> > > did
> > > see a Terminate followed by FIN from the Host. The cause for both
> > > failures appears to be the same.
> > > 
> > 
> > I cannot reproduce what you are seeing on my setup (Steve, can
> > you?)
> > I'm running 2 VMs connected over SRIOV on the same PC though...
> > 
> > Can you share the log on the host side?
> > 
> > Can you also add this print to verify that the host driver
> > programmed
> > the same sgl as it sent the target:
> > --
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index c2c2c28e6eb5..248fa2e5cabf 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct
> > nvme_rdma_queue *queue,
> >          sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
> >                          NVME_SGL_FMT_INVALIDATE;
> > 
> > +       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
> > +               __func__, req->mr->rkey, req->mr->iova,
> > + req->mr->length);
> > +
> >          return 0;
> >   }
> > --
> > _______________________________________________
> > Linux-nvme mailing list
> > Linux-nvme at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-14 16:52         ` J Freyensee
@ 2016-09-15 13:36           ` Narayan Ayalasomayajula
  2016-11-01 23:07             ` Ming Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Narayan Ayalasomayajula @ 2016-09-15 13:36 UTC (permalink / raw)


Hi Jay,

Thanks for pointing out that I was not running the latest version of the kernel. I updated to 4.8-rc6, and my FIO test that had previously failed with the Linux NVMeF target (using a null_blk device as the target) now completes successfully. I am still seeing the same NAK (Remote Access Error) failure when I use our target instead. I will debug this further, but updating to 4.8-rc6 did improve things.

(Sagi, thanks for the print statements to display what the driver is passing down to the RNIC. I will use that change in 4.8rc6 to debug further). 

Thanks,
Narayan

-----Original Message-----
From: J Freyensee [mailto:james_p_freyensee@linux.intel.com] 
Sent: Wednesday, September 14, 2016 9:52 AM
To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.com>; Sagi Grimberg <sagi at grimberg.me>; linux-nvme at lists.infradead.org
Subject: Re: Failure with 8K Write operations

On Wed, 2016-09-14 at 00:03 +0000, Narayan Ayalasomayajula wrote:
> Hi Jay,
> 
> Thanks for taking the effort to emulate the behavior.
> 
> I did not mention this in my last email but had indicated it in the 
> earlier posting. I am using null_blk as the target (so the IOs are not 
> being serviced by a real nvme target). I am not sure if that could 
> somehow be the catalyst for the failure. Is it possible for you to 
> re-run your test with null_blk as the target?

As we talked off-line, try the latest mainline kernel from kernel.org and see if you see anything different.

> 
> Thanks,
> Narayan
> 
> -----Original Message-----
> From: J Freyensee [mailto:james_p_freyensee at linux.intel.com]
> Sent: Tuesday, September 13, 2016 4:51 PM
> To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.c
> om>; Sagi Grimberg <sagi at grimberg.me>; linux-nvme at lists.infradead.org
> Subject: Re: Failure with 8K Write operations
> 
> On Tue, 2016-09-13 at 20:04 +0000, Narayan Ayalasomayajula wrote:
> > 
> > Hi Sagi,
> > 
> > Thanks for the print statement to verify that the sgls in the 
> > command capsule match what the Host programmed. I added this print 
> > statement and compared the Virtual Address and R_Key information in 
> > the /var/log to the NVMe Commands in the trace file and found the 
> > two to match.
> > I
> > have the trace and Host log files from this failure (trace is ~6M)
> > -
> > will it be useful for someone who may be looking into this issue?
> > 
> > Regarding the host side log information you mentioned, I had 
> > attached that in my prior email (attached again). Is this what you 
> > are requesting? That was collected prior to adding the print 
> > statement that you suggested.
> > 
> > Just to summarize, the failure is seen in the following
> > configuration:
> > 
> > 1. Host is an 8-core Ubuntu server running the 4.8.0 kernel. It has 
> > a
> > ConnectX-4 RNIC (1x100G) and is connected to a Mellanox Switch.
> > 2. Target is an 8-core Ubuntu server running the 4.8.0 kernel. It 
> > has a ConnectX-3 RNIC (1x10G) and is connected to a Mellanox Switch.
> > 3. Switch has normal Pause and Jumbo frame support enabled on all 
> > ports.
> > 4. Test fails with Host sending a NAK (Remote Access Error) for the 
> > following FIO workload:
> > 
> > 	[global]
> > 	ioengine=libaio
> > 	direct=1
> > 	runtime=10m
> > 	size=800g
> > 	time_based
> > 	norandommap
> > 	group_reporting
> > 	bs=8k
> > 	numjobs=8
> > 	iodepth=16
> > 
> > 	[rand_write]
> > 	filename=/dev/nvme0n1
> > 	rw=randwrite
> > 
> 
> Hi Narayan:
> 
> I have a 2 host, 2 target 1RU server data network using a 32x Arista 
> switch and using your FIO setup above, I am not seeing any errors. I
> tried running your script on each Host at the same time targeting the 
> same NVMe Target (but different SSDs targeted by each Host) as well as 
> only running the script on 1 Host only and didn't see any errors.
> Also
> tried 'numjobs=1' and didn't reproduce what you see.
> 
> Both Host and Targets for me are using the 4.8-rc4 kernel. Both the
> Host and Target are using dual port Mellanox ConnectX-3 Pro EN 40Gb
> (so I'm using a RoCE setup). My Hosts are 32 processor machines and
> Targets are 28 Processor machine. All filled w/various Intel SSDs.
> 
> Something unique about your setup.
> 
> Jay
> 
> 
> > 
> > I have found that the failure happens with numjobs set to 1 as well.
> > 
> > Thanks again for your response,
> > Narayan
> > 
> > -----Original Message-----
> > From: Sagi Grimberg [mailto:sagi at grimberg.me]
> > Sent: Tuesday, September 13, 2016 2:16 AM
> > To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks
> > .c
> > om>; linux-nvme at lists.infradead.org
> > Subject: Re: Failure with 8K Write operations
> > 
> > 
> > > 
> > > 
> > > Hello All,
> > 
> > Hi Narayan,
> > 
> > > 
> > > 
> > > I am running into a failure with the 4.8.0 branch and wanted to 
> > > see this is a known issue or whether there is something I am not 
> > > doing right in my setup/configuration. The issue that I am running 
> > > into is that the Host is indicating a NAK (Remote Access Error) 
> > > condition when executing an FIO script that is performing 100% 8K 
> > > Write operations. Trace analysis shows that the target has the 
> > > expected Virtual Address and R_KEY values in the READ REQUEST but 
> > > for some reason, the Host flags the request as an access 
> > > violation.
> > > I ran a similar test with iWARP Host and Target systems and the 
> > > did see a Terminate followed by FIN from the Host. The cause for 
> > > both failures appears to be the same.
> > > 
> > 
> > I cannot reproduce what you are seeing on my setup (Steve, can
> > you?)
> > I'm running 2 VMs connected over SRIOV on the same PC though...
> > 
> > Can you share the log on the host side?
> > 
> > Can you also add this print to verify that the host driver 
> > programmed the same sgl as it sent the target:
> > --
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c 
> > index c2c2c28e6eb5..248fa2e5cabf 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -955,6 +955,9 @@ static int nvme_rdma_map_sg_fr(struct 
> > nvme_rdma_queue *queue,
> >          sg->type = (NVME_KEY_SGL_FMT_DATA_DESC << 4) |
> >                          NVME_SGL_FMT_INVALIDATE;
> > 
> > +       pr_err("%s: rkey=%#x iova=%#llx length=%#x\n",
> > +               __func__, req->mr->rkey, req->mr->iova,
> > + req->mr->length);
> > +
> >          return 0;
> >   }
> > --
> > _______________________________________________
> > Linux-nvme mailing list
> > Linux-nvme at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-09-15 13:36           ` Narayan Ayalasomayajula
@ 2016-11-01 23:07             ` Ming Lin
  2016-11-01 23:27               ` Ming Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lin @ 2016-11-01 23:07 UTC (permalink / raw)


On Thu, Sep 15, 2016 at 6:36 AM, Narayan Ayalasomayajula
<narayan.ayalasomayajula@kazan-networks.com> wrote:
> Hi Jay,
>
> Thanks for pointing out that I was not running the latest version of the kernel. I updated to 4.8rc6 and my FIO test that had previously failed with the Linux NVMeF target (using null_blk device as the target) is now completing successfully. I am still seeing the same NAK (Remote Access Error) failure when I use our target instead. I will debug this further but updating to 4.8rc6 did improve things.

Hi Narayan,

I also saw a similar error with 8K writes when using my own target implementation.
Did you fix it already?

Thanks,
Ming

>
> (Sagi, thanks for the print statements to display what the driver is passing down to the RNIC. I will use that change in 4.8rc6 to debug further).
>
> Thanks,
> Narayan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-11-01 23:07             ` Ming Lin
@ 2016-11-01 23:27               ` Ming Lin
  2016-11-02  1:24                 ` Narayan Ayalasomayajula
  0 siblings, 1 reply; 10+ messages in thread
From: Ming Lin @ 2016-11-01 23:27 UTC (permalink / raw)


On Tue, Nov 1, 2016 at 4:07 PM, Ming Lin <mlin@kernel.org> wrote:
> On Thu, Sep 15, 2016 at 6:36 AM, Narayan Ayalasomayajula
> <narayan.ayalasomayajula@kazan-networks.com> wrote:
>> Hi Jay,
>>
>> Thanks for pointing out that I was not running the latest version of the kernel. I updated to 4.8rc6 and my FIO test that had previously failed with the Linux NVMeF target (using null_blk device as the target) is now completing successfully. I am still seeing the same NAK (Remote Access Error) failure when I use our target instead. I will debug this further but updating to 4.8rc6 did improve things.
>
> Hi Narayan,
>
> I also saw similar error with 8k write when I use my own target implementation.
> Did you fix it already?

Hi Narayan,

With Sagi's great help off-line, I just fixed it.
In my code, when I post the RDMA_READ, I didn't set rdma_wr.next to NULL.

Shame on me ... example code below:

int rw_ctx_post(NvmetRdmaRsp *rsp)
{
    rsp->rdma_wr.next = NULL;
    return ibv_post_send(cm_id->qp, &rsp->rdma_wr, &bad_wr);
}

Possibly you don't have this kind of stupid bug ...
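
To make the failure mode concrete: ibv_post_send() walks the wr->next chain until it hits NULL, so an uninitialized next field can make the provider post whatever stale memory follows the intended WR, and a bogus rkey/VA in such a garbage WR is exactly the kind of thing the peer NAKs. A self-contained sketch of posting a single RDMA READ with the chain explicitly terminated - everything beyond the standard verbs API (the function name and parameters) is illustrative:

    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    /* Pull `len` bytes from the peer's (raddr, rkey) into the local buffer
     * described by `mr`.  Zeroing the WR (or at least setting .next = NULL)
     * keeps ibv_post_send() from chaining stale work requests. */
    static int post_rdma_read(struct ibv_qp *qp, struct ibv_mr *mr,
                              uint64_t raddr, uint32_t rkey, uint32_t len)
    {
            struct ibv_sge sge = {
                    .addr   = (uintptr_t)mr->addr,
                    .length = len,
                    .lkey   = mr->lkey,
            };
            struct ibv_send_wr wr, *bad_wr = NULL;

            memset(&wr, 0, sizeof(wr));     /* .next == NULL, no accidental chain */
            wr.opcode              = IBV_WR_RDMA_READ;
            wr.sg_list             = &sge;
            wr.num_sge             = 1;
            wr.send_flags          = IBV_SEND_SIGNALED;
            wr.wr.rdma.remote_addr = raddr;
            wr.wr.rdma.rkey        = rkey;

            return ibv_post_send(qp, &wr, &bad_wr);
    }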

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Failure with 8K Write operations
  2016-11-01 23:27               ` Ming Lin
@ 2016-11-02  1:24                 ` Narayan Ayalasomayajula
  0 siblings, 0 replies; 10+ messages in thread
From: Narayan Ayalasomayajula @ 2016-11-02  1:24 UTC (permalink / raw)


Hi Ming,

We have a HW implementation of the NVMeF target, and the issue there was different :) I believe it was related to some buffer misuse under heavy workload.

Thanks,
Narayan

-----Original Message-----
From: Ming Lin [mailto:mlin@kernel.org] 
Sent: Tuesday, November 01, 2016 4:28 PM
To: Narayan Ayalasomayajula <narayan.ayalasomayajula at kazan-networks.com>
Cc: J Freyensee <james_p_freyensee at linux.intel.com>; Sagi Grimberg <sagi at grimberg.me>; linux-nvme at lists.infradead.org
Subject: Re: Failure with 8K Write operations

On Tue, Nov 1, 2016 at 4:07 PM, Ming Lin <mlin@kernel.org> wrote:
> On Thu, Sep 15, 2016 at 6:36 AM, Narayan Ayalasomayajula 
> <narayan.ayalasomayajula@kazan-networks.com> wrote:
>> Hi Jay,
>>
>> Thanks for pointing out that I was not running the latest version of the kernel. I updated to 4.8rc6 and my FIO test that had previously failed with the Linux NVMeF target (using null_blk device as the target) is now completing successfully. I am still seeing the same NAK (Remote Access Error) failure when I use our target instead. I will debug this further but updating to 4.8rc6 did improve things.
>
> Hi Narayan,
>
> I also saw similar error with 8k write when I use my own target implementation.
> Did you fix it already?

Hi Narayan,

With Sagi's great help off-line, I just fixed it.
In my code, when I post RDMA_READ, I didn't set rdma_wr.next to NULL.

Shame on myself ... example code as below
	
int rw_ctx_post(NvmetRdmaRsp *rsp)
{
    rsp->rdma_wr.next = NULL;
    return ibv_post_send(cm_id->qp, &rsp->rdma_wr, &bad_wr);
}

Possibly you may don't have this kind of stupid bug ...

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread

Thread overview: 10 messages
2016-09-12 20:49 Failure with 8K Write operations Narayan Ayalasomayajula
2016-09-13  9:16 ` Sagi Grimberg
2016-09-13 20:04   ` Narayan Ayalasomayajula
2016-09-13 23:51     ` J Freyensee
2016-09-14  0:03       ` Narayan Ayalasomayajula
2016-09-14 16:52         ` J Freyensee
2016-09-15 13:36           ` Narayan Ayalasomayajula
2016-11-01 23:07             ` Ming Lin
2016-11-01 23:27               ` Ming Lin
2016-11-02  1:24                 ` Narayan Ayalasomayajula
