* [PATCH 0/1] nvmet: allow user to set req alloc flag
@ 2020-10-20 1:44 Chaitanya Kulkarni
2020-10-20 1:44 ` [PATCH 1/1] " Chaitanya Kulkarni
2020-10-20 8:22 ` [PATCH 0/1] " Sagi Grimberg
0 siblings, 2 replies; 5+ messages in thread
From: Chaitanya Kulkarni @ 2020-10-20 1:44 UTC (permalink / raw)
To: linux-nvme; +Cc: logang, hch, Chaitanya Kulkarni, sagi
Hi,
When using the NVMeOF target in passthru mode, we allocate the request
with the BLK_MQ_REQ_NOWAIT flag. This allocates the request in the
following manner :-

nvme_alloc_request()
 blk_mq_alloc_request()
  blk_mq_queue_enter()
   if (flag & BLK_MQ_REQ_NOWAIT)
           return -EBUSY; <-- return if busy.
On one of our NVMe controllers, running a fio random write workload in
parallel on 32 namespaces at a higher queue depth results in an I/O
error, with blk_mq_queue_enter() returning -EBUSY as shown above. The
problem is not easy to reproduce, but it occurs once in a while with the
following error (see 1 for the detailed log) :-
test1: (groupid=0, jobs=32): err= 5
(file:io_u.c:1744, func=io_u error, error=Input/output error):
When the BLK_MQ_REQ_NOWAIT flag is removed from the allocation, the
workload doesn't result in the error.

This patch fixes the request allocation problem by adding a new configfs
attribute so that the user can optionally decide whether to use
BLK_MQ_REQ_NOWAIT. We retain the default behavior by using
BLK_MQ_REQ_NOWAIT when creating the nvmet passthru subsystem.
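As a sketch of the intended usage: with the patch applied, the attribute
would live under the nvmet configfs tree (something like
/sys/kernel/config/nvmet/subsystems/<nqn>/passthru/req_nowait — the path
is an assumption, and it requires the patched module loaded). The
commands below exercise the same read/write pattern against a scratch
directory standing in for configfs, since the real tree is not available
here:

```shell
# Scratch directory standing in for the configfs mount; the attribute
# name "req_nowait" matches the patch, everything else is illustrative.
SUBSYS="$(mktemp -d)/subsystems/demo-nqn"
mkdir -p "$SUBSYS/passthru"

# Default per the patch: BLK_MQ_REQ_NOWAIT in effect.
echo 1 > "$SUBSYS/passthru/req_nowait"
cat "$SUBSYS/passthru/req_nowait"        # -> 1

# Opt out so request allocation may wait instead of failing with -EBUSY.
echo 0 > "$SUBSYS/passthru/req_nowait"
cat "$SUBSYS/passthru/req_nowait"        # -> 0
```

Note that the patch rejects writes while the passthru controller is
enabled, so on a real system the flag has to be set before enabling
passthru.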
Regards,
Chaitanya
Chaitanya Kulkarni (1):
nvmet: allow user to set req alloc flag
drivers/nvme/target/configfs.c | 36 ++++++++++++++++++++++++++++++++++
drivers/nvme/target/core.c | 3 +++
drivers/nvme/target/nvmet.h | 1 +
drivers/nvme/target/passthru.c | 3 ++-
4 files changed, 42 insertions(+), 1 deletion(-)
1. FIO workload resulting in the I/O error with default behavior :-
-------------------------------------------------------------------
# cat nvmet-req-nowait-1.fio.log
fio-3.8-5-g464b
Starting 32 processes
test1: (groupid=0, jobs=32): err= 5 (file:io_u.c:1744, func=io_u error, error=Input/output error): pid=4580: Mon Oct 19 16:07:42 2020
write: IOPS=871k, BW=3403MiB/s (3569MB/s)(1664MiB/489msec)
slat (usec): min=6, max=17846, avg=33.13, stdev=228.94
clat (usec): min=50, max=40690, avg=4618.04, stdev=1154.78
lat (usec): min=194, max=40702, avg=4651.20, stdev=1170.03
clat percentiles (usec):
| 1.00th=[ 3195], 5.00th=[ 3621], 10.00th=[ 3720], 20.00th=[ 3752],
| 30.00th=[ 3818], 40.00th=[ 3884], 50.00th=[ 3982], 60.00th=[ 4359],
| 70.00th=[ 5604], 80.00th=[ 5735], 90.00th=[ 5997], 95.00th=[ 6194],
| 99.00th=[ 6915], 99.50th=[ 7504], 99.90th=[16057], 99.95th=[17695],
| 99.99th=[22938]
lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.21%, 4=50.40%, 10=49.19%, 20=0.09%, 50=0.04%
cpu : usr=4.07%, sys=34.86%, ctx=5296, majf=0, minf=393
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.2%, >=64=99.5%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=0,426154,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
WRITE: bw=3403MiB/s (3569MB/s), 3403MiB/s-3403MiB/s (3569MB/s-3569MB/s), io=1664MiB (1745MB), run=489-489msec
Disk stats (read/write):
nvme2n1: ios=0/9329, merge=0/0, ticks=0/16477, in_queue=16478, util=1.81%
nvme2n2: ios=40/7807, merge=0/0, ticks=3/13759, in_queue=13763, util=1.92%
nvme2n3: ios=40/7577, merge=0/0, ticks=7/17245, in_queue=17253, util=1.98%
nvme2n4: ios=40/7647, merge=0/0, ticks=7/27139, in_queue=27147, util=2.14%
nvme2n5: ios=40/8678, merge=0/0, ticks=5/20387, in_queue=20393, util=2.16%
nvme2n6: ios=40/9069, merge=0/0, ticks=7/26973, in_queue=26981, util=2.29%
nvme2n7: ios=40/6509, merge=0/0, ticks=8/19087, in_queue=19095, util=2.34%
nvme2n8: ios=40/9491, merge=0/0, ticks=4/19053, in_queue=19058, util=2.42%
nvme2n9: ios=40/6604, merge=0/0, ticks=4/27047, in_queue=27051, util=2.43%
nvme2n10: ios=40/9525, merge=0/0, ticks=8/27259, in_queue=27268, util=2.60%
nvme2n11: ios=40/8255, merge=0/0, ticks=7/27218, in_queue=27226, util=2.65%
nvme2n12: ios=40/7239, merge=0/0, ticks=5/26904, in_queue=26910, util=2.75%
nvme2n13: ios=40/7588, merge=0/0, ticks=6/18549, in_queue=18556, util=2.75%
nvme2n14: ios=40/7474, merge=0/0, ticks=5/18387, in_queue=18392, util=2.91%
nvme2n15: ios=40/7293, merge=0/0, ticks=4/14218, in_queue=14223, util=3.23%
nvme2n16: ios=0/8255, merge=0/0, ticks=0/27107, in_queue=27108, util=3.39%
nvme2n17: ios=0/8383, merge=0/0, ticks=0/27563, in_queue=27563, util=3.79%
nvme2n18: ios=0/7900, merge=0/0, ticks=0/27229, in_queue=27229, util=3.94%
nvme2n19: ios=0/9406, merge=0/0, ticks=0/14151, in_queue=14152, util=4.37%
nvme2n20: ios=0/9022, merge=0/0, ticks=0/27223, in_queue=27223, util=4.89%
nvme2n21: ios=0/7231, merge=0/0, ticks=0/14102, in_queue=14102, util=4.98%
nvme2n22: ios=0/8891, merge=0/0, ticks=0/27090, in_queue=27090, util=5.52%
nvme2n23: ios=0/6223, merge=0/0, ticks=0/27066, in_queue=27067, util=5.54%
nvme2n24: ios=0/7366, merge=0/0, ticks=0/27291, in_queue=27291, util=6.09%
nvme2n25: ios=0/7301, merge=0/0, ticks=0/26984, in_queue=26985, util=7.55%
nvme2n26: ios=0/7716, merge=0/0, ticks=0/19097, in_queue=19098, util=8.43%
nvme2n27: ios=0/9144, merge=0/0, ticks=0/27355, in_queue=27356, util=10.98%
nvme2n28: ios=0/6591, merge=0/0, ticks=0/14058, in_queue=14058, util=12.28%
nvme2n29: ios=0/8915, merge=0/0, ticks=0/27237, in_queue=27237, util=17.74%
nvme2n30: ios=0/7098, merge=0/0, ticks=0/25610, in_queue=25610, util=28.49%
nvme2n31: ios=0/6404, merge=0/0, ticks=0/21010, in_queue=21011, util=31.86%
nvme2n32: ios=0/9678, merge=0/0, ticks=0/19048, in_queue=19049, util=73.70%
2. FIO workload without any error when not using BLK_MQ_REQ_NOWAIT :-
---------------------------------------------------------------------
# cat nvmet-req-nowait-0.fio.log
fio-3.8-5-g464b
Starting 32 processes
test1: (groupid=0, jobs=32): err= 0: pid=4665: Mon Oct 19 16:09:06 2020
write: IOPS=832k, BW=3250MiB/s (3408MB/s)(195GiB/61570msec)
slat (usec): min=6, max=1984.3k, avg=34.57, stdev=1293.12
clat (usec): min=42, max=3012.5k, avg=4804.90, stdev=16251.84
lat (usec): min=80, max=3012.5k, avg=4839.58, stdev=16320.43
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 5],
| 30.00th=[ 5], 40.00th=[ 5], 50.00th=[ 5], 60.00th=[ 5],
| 70.00th=[ 5], 80.00th=[ 5], 90.00th=[ 7], 95.00th=[ 7],
| 99.00th=[ 7], 99.50th=[ 8], 99.90th=[ 13], 99.95th=[ 19],
| 99.99th=[ 776]
bw ( KiB/s): min= 56, max=132007, per=3.23%, avg=107439.18, stdev=20819.41, samples=3790
iops : min= 14, max=33001, avg=26859.75, stdev=5204.86, samples=3790
lat (usec) : 50=0.01%, 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.01%, 4=13.50%, 10=86.26%, 20=0.19%, 50=0.01%
lat (msec) : 100=0.01%, 250=0.01%, 500=0.02%, 750=0.01%, 1000=0.01%
cpu : usr=3.89%, sys=33.38%, ctx=5596245, majf=0, minf=438
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued rwts: total=0,51228543,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=128
Run status group 0 (all jobs):
WRITE: bw=3250MiB/s (3408MB/s), 3250MiB/s-3250MiB/s (3408MB/s-3408MB/s), io=195GiB (210GB), run=61570-61570msec
Disk stats (read/write):
nvme2n1: ios=96/1583859, merge=0/0, ticks=13/431764, in_queue=431778, util=74.77%
nvme2n2: ios=80/1626714, merge=0/0, ticks=9/441432, in_queue=441441, util=77.34%
nvme2n3: ios=0/1656183, merge=0/0, ticks=0/468030, in_queue=468030, util=79.37%
nvme2n4: ios=80/1557939, merge=0/0, ticks=8/424431, in_queue=424439, util=73.46%
nvme2n5: ios=80/1649598, merge=0/0, ticks=8/428050, in_queue=428059, util=79.85%
nvme2n6: ios=40/1609485, merge=0/0, ticks=4/426484, in_queue=426488, util=79.08%
nvme2n7: ios=89/1583523, merge=0/0, ticks=11/428733, in_queue=428744, util=76.78%
nvme2n8: ios=96/1582927, merge=0/0, ticks=11/423911, in_queue=423922, util=77.75%
nvme2n9: ios=96/1666151, merge=0/0, ticks=13/447659, in_queue=447671, util=81.52%
nvme2n10: ios=96/1524152, merge=0/0, ticks=12/420001, in_queue=420013, util=76.71%
nvme2n11: ios=96/1630334, merge=0/0, ticks=14/449232, in_queue=449246, util=80.11%
nvme2n12: ios=96/1582044, merge=0/0, ticks=11/433235, in_queue=433246, util=79.40%
nvme2n13: ios=96/1663883, merge=0/0, ticks=10/477186, in_queue=477196, util=84.84%
nvme2n14: ios=96/1522140, merge=0/0, ticks=14/442828, in_queue=442842, util=76.87%
nvme2n15: ios=80/1564809, merge=0/0, ticks=11/437801, in_queue=437812, util=80.18%
nvme2n16: ios=80/1596635, merge=0/0, ticks=10/441174, in_queue=441183, util=82.13%
nvme2n17: ios=80/1646794, merge=0/0, ticks=7/467660, in_queue=467667, util=86.82%
nvme2n18: ios=80/1589903, merge=0/0, ticks=11/449100, in_queue=449112, util=83.04%
nvme2n19: ios=80/1613331, merge=0/0, ticks=12/433444, in_queue=433456, util=86.20%
nvme2n20: ios=80/1594185, merge=0/0, ticks=9/433980, in_queue=433988, util=87.45%
nvme2n21: ios=80/1566221, merge=0/0, ticks=11/430998, in_queue=431009, util=86.10%
nvme2n22: ios=80/1636843, merge=0/0, ticks=8/462546, in_queue=462554, util=87.66%
nvme2n23: ios=80/1522876, merge=0/0, ticks=13/441547, in_queue=441560, util=84.79%
nvme2n24: ios=80/1636153, merge=0/0, ticks=8/465240, in_queue=465248, util=90.78%
nvme2n25: ios=80/1507958, merge=0/0, ticks=9/419422, in_queue=419431, util=83.93%
nvme2n26: ios=0/1536856, merge=0/0, ticks=0/428050, in_queue=428050, util=88.16%
nvme2n27: ios=0/1608849, merge=0/0, ticks=0/457471, in_queue=457470, util=91.70%
nvme2n28: ios=0/1607794, merge=0/0, ticks=0/467126, in_queue=467126, util=94.52%
nvme2n29: ios=0/1623665, merge=0/0, ticks=0/432039, in_queue=432039, util=93.98%
nvme2n30: ios=0/1633017, merge=0/0, ticks=0/441836, in_queue=441836, util=97.04%
nvme2n31: ios=0/1644927, merge=0/0, ticks=0/455211, in_queue=455212, util=96.53%
nvme2n32: ios=0/1582765, merge=0/0, ticks=0/447856, in_queue=447856, util=94.97%
--
2.22.1
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/1] nvmet: allow user to set req alloc flag
2020-10-20 1:44 [PATCH 0/1] nvmet: allow user to set req alloc flag Chaitanya Kulkarni
@ 2020-10-20 1:44 ` Chaitanya Kulkarni
2020-10-20 8:22 ` [PATCH 0/1] " Sagi Grimberg
1 sibling, 0 replies; 5+ messages in thread
From: Chaitanya Kulkarni @ 2020-10-20 1:44 UTC (permalink / raw)
To: linux-nvme; +Cc: logang, hch, Chaitanya Kulkarni, sagi
By default, we set the passthru request allocation flag such that the
allocation returns an error in the following code path, and we fail the
I/O, when BLK_MQ_REQ_NOWAIT is set :-

nvme_alloc_request()
 blk_mq_alloc_request()
  blk_mq_queue_enter()
   if (flag & BLK_MQ_REQ_NOWAIT)
           return -EBUSY; <-- return if busy.
On some controllers, using BLK_MQ_REQ_NOWAIT results in an I/O error
even though the controller is perfectly healthy and not in a degraded
state. The block layer request allocation allows us to wait instead of
returning the error immediately when the BLK_MQ_REQ_NOWAIT flag is not
used. This has been shown to fix the I/O error problem reported under a
heavy random write workload.

This patch fixes the request allocation problem by adding a new configfs
attribute so that the user can optionally decide whether to use
BLK_MQ_REQ_NOWAIT. We retain the default behavior by using
BLK_MQ_REQ_NOWAIT when creating the nvmet passthru subsystem.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
drivers/nvme/target/configfs.c | 36 ++++++++++++++++++++++++++++++++++
drivers/nvme/target/core.c | 3 +++
drivers/nvme/target/nvmet.h | 1 +
drivers/nvme/target/passthru.c | 3 ++-
4 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 37e1d7784e17..3892b8fb0ff6 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -6,6 +6,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/kernel.h>
#include <linux/module.h>
+#include <linux/blk-mq.h>
#include <linux/slab.h>
#include <linux/stat.h>
#include <linux/ctype.h>
@@ -736,9 +737,44 @@ static ssize_t nvmet_passthru_enable_store(struct config_item *item,
}
CONFIGFS_ATTR(nvmet_passthru_, enable);
+static ssize_t nvmet_passthru_req_nowait_show(struct config_item *item,
+ char *page)
+{
+ struct nvmet_subsys *subsys = to_subsys(item->ci_parent);
+
+ return snprintf(page, PAGE_SIZE, "%d\n", subsys->req_nowait);
+}
+
+static ssize_t nvmet_passthru_req_nowait_store(struct config_item *item,
+ const char *page, size_t count)
+{
+ struct nvmet_subsys *subsys = to_subsys(item->ci_parent);
+ bool req_nowait;
+ int ret = 0;
+
+ mutex_lock(&subsys->lock);
+ if (subsys->passthru_ctrl) {
+ pr_err("disable passthru ctrl before setting req_nowait\n");
+ ret = -EBUSY;
+ goto out;
+ }
+
+ if (strtobool(page, &req_nowait)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ subsys->req_nowait = req_nowait ? BLK_MQ_REQ_NOWAIT : 0;
+out:
+ mutex_unlock(&subsys->lock);
+ return ret ? ret : count;
+}
+CONFIGFS_ATTR(nvmet_passthru_, req_nowait);
+
static struct configfs_attribute *nvmet_passthru_attrs[] = {
&nvmet_passthru_attr_device_path,
&nvmet_passthru_attr_enable,
+ &nvmet_passthru_attr_req_nowait,
NULL,
};
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index aafcbc424b7a..50634945bd42 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -1511,6 +1511,9 @@ struct nvmet_subsys *nvmet_subsys_alloc(const char *subsysnqn,
}
subsys->cntlid_min = NVME_CNTLID_MIN;
subsys->cntlid_max = NVME_CNTLID_MAX;
+#ifdef CONFIG_NVME_TARGET_PASSTHRU
+ subsys->req_nowait = BLK_MQ_REQ_NOWAIT;
+#endif
kref_init(&subsys->ref);
mutex_init(&subsys->lock);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 559a15ccc322..a7ca8f5860c9 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -249,6 +249,7 @@ struct nvmet_subsys {
struct nvme_ctrl *passthru_ctrl;
char *passthru_ctrl_path;
struct config_group passthru_group;
+ blk_mq_req_flags_t req_nowait;
#endif /* CONFIG_NVME_TARGET_PASSTHRU */
};
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 56c571052216..7a2ce00f7a25 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -215,6 +215,7 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
{
+ blk_mq_req_flags_t req_nowait = req->sq->ctrl->subsys->req_nowait;
struct nvme_ctrl *ctrl = nvmet_req_passthru_ctrl(req);
struct request_queue *q = ctrl->admin_q;
struct nvme_ns *ns = NULL;
@@ -236,7 +237,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
q = ns->queue;
}
- rq = nvme_alloc_request(q, req->cmd, BLK_MQ_REQ_NOWAIT, NVME_QID_ANY);
+ rq = nvme_alloc_request(q, req->cmd, req_nowait, NVME_QID_ANY);
if (IS_ERR(rq)) {
status = NVME_SC_INTERNAL;
goto out_put_ns;
--
2.22.1
* Re: [PATCH 0/1] nvmet: allow user to set req alloc flag
2020-10-20 1:44 [PATCH 0/1] nvmet: allow user to set req alloc flag Chaitanya Kulkarni
2020-10-20 1:44 ` [PATCH 1/1] " Chaitanya Kulkarni
@ 2020-10-20 8:22 ` Sagi Grimberg
2020-10-20 15:39 ` Logan Gunthorpe
1 sibling, 1 reply; 5+ messages in thread
From: Sagi Grimberg @ 2020-10-20 8:22 UTC (permalink / raw)
To: Chaitanya Kulkarni, linux-nvme; +Cc: logang, hch
> Hi,
>
> When using the NVMeOF target in passthru mode, we allocate the request
> with the BLK_MQ_REQ_NOWAIT flag. This allocates the request in the
> following manner :-
>
> nvme_alloc_request()
>  blk_mq_alloc_request()
>   blk_mq_queue_enter()
>    if (flag & BLK_MQ_REQ_NOWAIT)
>            return -EBUSY; <-- return if busy.
>
> On one of our NVMe controllers, running a fio random write workload in
> parallel on 32 namespaces at a higher queue depth results in an I/O
> error, with blk_mq_queue_enter() returning -EBUSY as shown above. The
> problem is not easy to reproduce, but it occurs once in a while with the
> following error (see 1 for the detailed log) :-
>
> test1: (groupid=0, jobs=32): err= 5
> (file:io_u.c:1744, func=io_u error, error=Input/output error):
>
> When the BLK_MQ_REQ_NOWAIT flag is removed from the allocation, the
> workload doesn't result in the error.
>
> This patch fixes the request allocation problem by adding a new configfs
> attribute so that the user can optionally decide whether to use
> BLK_MQ_REQ_NOWAIT. We retain the default behavior by using
> BLK_MQ_REQ_NOWAIT when creating the nvmet passthru subsystem.
Why should we ever set REQ_NOWAIT at all? Nothing prevents the
host(s) queue depth from exceeding the controller queue depth...
* Re: [PATCH 0/1] nvmet: allow user to set req alloc flag
2020-10-20 8:22 ` [PATCH 0/1] " Sagi Grimberg
@ 2020-10-20 15:39 ` Logan Gunthorpe
2020-10-20 15:53 ` Keith Busch
0 siblings, 1 reply; 5+ messages in thread
From: Logan Gunthorpe @ 2020-10-20 15:39 UTC (permalink / raw)
To: Sagi Grimberg, Chaitanya Kulkarni, linux-nvme; +Cc: hch
On 2020-10-20 2:22 a.m., Sagi Grimberg wrote:
>
>> Hi,
>>
>> When using the NVMeOF target in passthru mode, we allocate the request
>> with the BLK_MQ_REQ_NOWAIT flag. This allocates the request in the
>> following manner :-
>>
>> nvme_alloc_request()
>>  blk_mq_alloc_request()
>>   blk_mq_queue_enter()
>>    if (flag & BLK_MQ_REQ_NOWAIT)
>>            return -EBUSY; <-- return if busy.
>>
>> On one of our NVMe controllers, running a fio random write workload in
>> parallel on 32 namespaces at a higher queue depth results in an I/O
>> error, with blk_mq_queue_enter() returning -EBUSY as shown above. The
>> problem is not easy to reproduce, but it occurs once in a while with the
>> following error (see 1 for the detailed log) :-
>>
>> test1: (groupid=0, jobs=32): err= 5
>> (file:io_u.c:1744, func=io_u error, error=Input/output error):
>>
>> When the BLK_MQ_REQ_NOWAIT flag is removed from the allocation, the
>> workload doesn't result in the error.
>>
>> This patch fixes the request allocation problem by adding a new configfs
>> attribute so that the user can optionally decide whether to use
>> BLK_MQ_REQ_NOWAIT. We retain the default behavior by using
>> BLK_MQ_REQ_NOWAIT when creating the nvmet passthru subsystem.
>
> Why should we ever set REQ_NOWAIT at all? Nothing prevents the
> host(s) queue depth from exceeding the controller queue depth...
I agree... I certainly found adding a configfs attribute for this rather
off-putting. Why would the user want an option that turns on random errors?
Logan
* Re: [PATCH 0/1] nvmet: allow user to set req alloc flag
2020-10-20 15:39 ` Logan Gunthorpe
@ 2020-10-20 15:53 ` Keith Busch
0 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2020-10-20 15:53 UTC (permalink / raw)
To: Logan Gunthorpe; +Cc: linux-nvme, Sagi Grimberg, Chaitanya Kulkarni, hch
On Tue, Oct 20, 2020 at 09:39:28AM -0600, Logan Gunthorpe wrote:
>
>
> On 2020-10-20 2:22 a.m., Sagi Grimberg wrote:
> >
> >> Hi,
> >>
> >> When using the NVMeOF target in passthru mode, we allocate the request
> >> with the BLK_MQ_REQ_NOWAIT flag. This allocates the request in the
> >> following manner :-
> >>
> >> nvme_alloc_request()
> >>  blk_mq_alloc_request()
> >>   blk_mq_queue_enter()
> >>    if (flag & BLK_MQ_REQ_NOWAIT)
> >>            return -EBUSY; <-- return if busy.
> >>
> >> On one of our NVMe controllers, running a fio random write workload in
> >> parallel on 32 namespaces at a higher queue depth results in an I/O
> >> error, with blk_mq_queue_enter() returning -EBUSY as shown above. The
> >> problem is not easy to reproduce, but it occurs once in a while with the
> >> following error (see 1 for the detailed log) :-
> >>
> >> test1: (groupid=0, jobs=32): err= 5
> >> (file:io_u.c:1744, func=io_u error, error=Input/output error):
> >>
> >> When the BLK_MQ_REQ_NOWAIT flag is removed from the allocation, the
> >> workload doesn't result in the error.
> >>
> >> This patch fixes the request allocation problem by adding a new configfs
> >> attribute so that the user can optionally decide whether to use
> >> BLK_MQ_REQ_NOWAIT. We retain the default behavior by using
> >> BLK_MQ_REQ_NOWAIT when creating the nvmet passthru subsystem.
> >
> > Why should we ever set REQ_NOWAIT at all? Nothing prevents the
> > host(s) queue depth from exceeding the controller queue depth...
>
> I agree... I certainly found adding a configfs attribute for this rather
> off-putting. Why would the user want an option that turns on random errors?
Right, you really only want to use NOWAIT when waiting can lead to bad
things like a deadlock. I don't think the target passthrough has any
such issue, so you should be able to safely leave that flag out.