All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] nvme-core: make gencounter feature tunable
@ 2021-12-10 11:21 Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 1/3] nvme-core: make cid gencounter configurable Chaitanya Kulkarni
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-10 11:21 UTC (permalink / raw)
  To: kbusch, hch, sagi; +Cc: linux-nvme, Chaitanya Kulkarni

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="y", Size: 14215 bytes --]

From: Chaitanya Kulkarni <kch@nvidia.com>

Hi,

The recent commit uses a combination of command id and a gencounter
to calculate the command id so that we can call out buggy controllers
for spurious completions and that also avoids use after free.
This commit adds various if statements and bitwise operations
such as &, <<, >>, & along with comparison operations to validate the  
gencounter and print out the errors.
 
This feature is required to catch the buggy controller, but in the  
production environment where controller is validated and known to be
stable this achieves nothing but adds additional code and runtime CPU  
instructions for the PCIe controller.
Worst case scenario when NVMeOF setup is using passthru backend then
these instructions gets duplicated on the host and on the target side
for each controller.
 
This patch-series makes the gencounter feature configurable by   
defining a new KConfig macro CONFIG_NVME_DEBUG_USE_CID_GENCTR and adding
a new KConfig section Debug with subsection for enabling the  
gencounter debug feature. We move the current tag + gencounter code when
CONFIG_NVME_DEBUG_USE_CID_GENCTR is defined and keep the original code
where we use tag based command id where
CONFIG_NVME_DEBUG_USE_CID_GENCTR is not defined.
The newly added Debug section is kept under NVMe Driver section under
host side.
 
In this way we achieve:-
1. Removing the entire gencounter code at the compile-time and keeping the  
   fastpath as lean as possible when the controller is known to be stable.
2. Retaining the gencounter behavior by enabling the debug feature.
3. Letting user decide which debug features to enable instead of forcing
   any.
4. Avoid measurable differences in fio performance numbers. see below.

* fio BW/iops/slat cid gencounter on avg IOPS = 75k, BW = 295MiB :-
read: IOPS=75.3k, BW=294MiB/s
read: IOPS=75.8k, BW=296MiB/s 
read: IOPS=75.2k, BW=294MiB/s 
read: IOPS=76.0k, BW=297MiB/s 
read: IOPS=75.4k, BW=294MiB/s 
read: IOPS=76.0k, BW=297MiB/s 
read: IOPS=75.5k, BW=295MiB/s 
read: IOPS=75.4k, BW=294MiB/s 

Submission latency avg = 6698 nsec :-
slat (nsec): avg=5057.99
slat (nsec): avg=7434.15
slat (nsec): avg=6310.50
slat (nsec): avg=8083.82
slat (nsec): avg=7297.77
slat (nsec): avg=6003.77
slat (nsec): avg=6657.75
slat (nsec): avg=6745.34

* fio BW/iops/slat cid gencounter off IOPS avg = 76k, BW = 297MiB :-
read: IOPS=77.7k, BW=304MiB/s
read: IOPS=76.5k, BW=299MiB/s 
read: IOPS=74.1k, BW=290MiB/s 
read: IOPS=75.8k, BW=296MiB/s 
read: IOPS=75.9k, BW=297MiB/s 
read: IOPS=75.2k, BW=294MiB/s 
read: IOPS=77.6k, BW=303MiB/s 
read: IOPS=76.8k, BW=300MiB/s 

Submission latency avg = 5030 nsec :-

slat (nsec): avg=3859.20
slat (nsec): avg=4669.78
slat (nsec): avg=3778.51
slat (nsec): avg=8436.18
slat (nsec): avg=5653.83
slat (nsec): avg=5286.60
slat (nsec): avg=4176.05
slat (nsec): avg=4385.19

-ck

Chaitanya Kulkarni (3):
  nvme-core: make cid gencounter configurable
  nvme-core: move gencounter check into nvme_cid()
  nvme: add KConfig options for debug features

 drivers/nvme/host/Kconfig | 10 ++++++++++
 drivers/nvme/host/core.c  |  3 ---
 drivers/nvme/host/nvme.h  | 31 +++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 3 deletions(-)

-- 
2.29.0

Test Log :-

root@dev nvme (nvme-5.16) # ./compile_nvme.sh 
+ umount /mnt/nvme0n1
+ clear_dmesg
./compile_nvme.sh: line 3: clear_dmesg: command not found
umount: /mnt/nvme0n1: no mount point specified.
+ modprobe -r nvme-fabrics
+ modprobe -r nvme_loop
+ modprobe -r nvmet
+ modprobe -r nvme
+ sleep 1
+ modprobe -r nvme-core
+ lsmod
+ grep nvme
+ sleep 1
+ git diff
+ sleep 1
++ nproc
+ make -j 48 M=drivers/nvme/ modules
  MODPOST drivers/nvme/Module.symvers
  CC [M]  drivers/nvme/host/nvme-core.mod.o
  CC [M]  drivers/nvme/host/nvme-fabrics.mod.o
  CC [M]  drivers/nvme/host/nvme-fc.mod.o
  CC [M]  drivers/nvme/host/nvme-rdma.mod.o
  CC [M]  drivers/nvme/host/nvme-tcp.mod.o
  CC [M]  drivers/nvme/host/nvme.mod.o
  CC [M]  drivers/nvme/target/nvme-fcloop.mod.o
  CC [M]  drivers/nvme/target/nvme-loop.mod.o
  CC [M]  drivers/nvme/target/nvmet-fc.mod.o
  CC [M]  drivers/nvme/target/nvmet-rdma.mod.o
  CC [M]  drivers/nvme/target/nvmet-tcp.mod.o
  CC [M]  drivers/nvme/target/nvmet.mod.o
  LD [M]  drivers/nvme/host/nvme-rdma.ko
  LD [M]  drivers/nvme/host/nvme-core.ko
  LD [M]  drivers/nvme/host/nvme-fc.ko
  LD [M]  drivers/nvme/target/nvme-loop.ko
  LD [M]  drivers/nvme/target/nvmet.ko
  LD [M]  drivers/nvme/target/nvmet-rdma.ko
  LD [M]  drivers/nvme/host/nvme-tcp.ko
  LD [M]  drivers/nvme/target/nvme-fcloop.ko
  LD [M]  drivers/nvme/target/nvmet-fc.ko
  LD [M]  drivers/nvme/target/nvmet-tcp.ko
  LD [M]  drivers/nvme/host/nvme-fabrics.ko
  LD [M]  drivers/nvme/host/nvme.ko
  BTF [M] drivers/nvme/target/nvme-loop.ko
  BTF [M] drivers/nvme/host/nvme-rdma.ko
  BTF [M] drivers/nvme/target/nvme-fcloop.ko
  BTF [M] drivers/nvme/host/nvme-fabrics.ko
  BTF [M] drivers/nvme/target/nvmet-rdma.ko
  BTF [M] drivers/nvme/target/nvmet-tcp.ko
  BTF [M] drivers/nvme/host/nvme.ko
  BTF [M] drivers/nvme/host/nvme-tcp.ko
  BTF [M] drivers/nvme/target/nvmet-fc.ko
  BTF [M] drivers/nvme/host/nvme-fc.ko
  BTF [M] drivers/nvme/host/nvme-core.ko
  BTF [M] drivers/nvme/target/nvmet.ko
+ HOST=drivers/nvme/host
+ TARGET=drivers/nvme/target
++ uname -r
+ HOST_DEST=/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/
++ uname -r
+ TARGET_DEST=/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target/
+ cp drivers/nvme/host/nvme-core.ko drivers/nvme/host/nvme-fabrics.ko drivers/nvme/host/nvme-fc.ko drivers/nvme/host/nvme.ko drivers/nvme/host/nvme-rdma.ko drivers/nvme/host/nvme-tcp.ko /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host//
+ cp drivers/nvme/target/nvme-fcloop.ko drivers/nvme/target/nvme-loop.ko drivers/nvme/target/nvmet-fc.ko drivers/nvme/target/nvmet.ko drivers/nvme/target/nvmet-rdma.ko drivers/nvme/target/nvmet-tcp.ko /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//
+ ls -lrth /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/ /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//
/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/:
total 6.6M
-rw-r--r--. 1 root root  2.7M Dec 10 01:49 nvme-core.ko
-rw-r--r--. 1 root root  446K Dec 10 01:49 nvme-fabrics.ko
-rw-r--r--. 1 root root  964K Dec 10 01:49 nvme-fc.ko
-rw-r--r--. 1 root root  732K Dec 10 01:49 nvme.ko
-rw-r--r--. 1 root root 1019K Dec 10 01:49 nvme-rdma.ko
-rw-r--r--. 1 root root  893K Dec 10 01:49 nvme-tcp.ko

/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//:
total 6.7M
-rw-r--r--. 1 root root 498K Dec 10 01:49 nvme-fcloop.ko
-rw-r--r--. 1 root root 435K Dec 10 01:49 nvme-loop.ko
-rw-r--r--. 1 root root 767K Dec 10 01:49 nvmet-fc.ko
-rw-r--r--. 1 root root 3.3M Dec 10 01:49 nvmet.ko
-rw-r--r--. 1 root root 992K Dec 10 01:49 nvmet-rdma.ko
-rw-r--r--. 1 root root 768K Dec 10 01:49 nvmet-tcp.ko
+ modprobe nvme
root@dev nvme (nvme-5.16) # grep NVME_DEBUG .config
# CONFIG_NVME_DEBUG_USE_CID_GENCTR is not set
(reverse-i-search)`for': mv 0001-nvme-code-command_id-with-a-genctr-^Cr-use-after-fre.patch  sagi-genctr.patch
root@dev nvme (nvme-5.16) # ./passthru_config.sh /dev/nvme0
umount: /mnt/nvme0n1: no mount point specified.
+ mkdir /sys/kernel/config/nvmet/subsystems/pt-nqn
+ sleep 1
+ echo 'Initializing passthru ctrl path ...'
Initializing passthru ctrl path ...
+ echo -n /dev/nvme0
+ sleep 1
+ echo 1
+ echo 1
+ mkdir /sys/kernel/config/nvmet/ports/1/
+ sleep 1
+ echo -n loop
+ sleep 1
+ echo 'Connecting passthru ctrl to the port '
Connecting passthru ctrl to the port 
+ ln -s /sys/kernel/config/nvmet/subsystems/pt-nqn /sys/kernel/config/nvmet/ports/1/subsystems/
+ sleep 1
+ nvme connect -t loop -n pt-nqn
+ sleep 1
+ tr -s ' ' ' '
+ set +x
root@dev nvme (nvme-5.16) # for i in `seq 1 10`; do fio fio/randread.fio --filename=/dev/nvme1n1                                                                           --output=no-cid-genctr-${i}.fio; done
root@dev nvme (nvme-5.16) # makemnconfig 
configuration written to .config

*** End of the configuration.
*** Execute 'make' to start the build or try 'make help'.

root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # ./kernel_compile.sh 
  DEPMOD  /lib/modules/5.16.0-rc1nvme+
sh ./arch/x86/boot/install.sh 5.16.0-rc1nvme+ \
	arch/x86/boot/bzImage System.map "/boot"
Generating grub configuration file ...
done
real	1m33.758s
user	4m54.633s
sys	1m50.129s
root@dev nvme (nvme-5.16) # reboot 
Connection to 192.168.0.46 closed by remote host.
Connection to 192.168.0.46 closed.
[neo@fedora ~]$ dev
root@192.168.0.46's password: 
Last login: Fri Dec 10 01:49:40 2021 from 192.168.0.63
root@dev ~ # cd nvme/
root@dev nvme (nvme-5.16) # ./compile_nvme.sh 
+ umount /mnt/nvme0n1
+ clear_dmesg
./compile_nvme.sh: line 3: clear_dmesg: command not found
umount: /mnt/nvme0n1: no mount point specified.
+ modprobe -r nvme-fabrics
+ modprobe -r nvme_loop
+ modprobe -r nvmet
+ modprobe -r nvme
+ sleep 1
+ modprobe -r nvme-core
+ lsmod
+ grep nvme
+ sleep 1
./+ git diff
+ sleep 1
++ nproc
+ make -j 48 M=drivers/nvme/ modules
  MODPOST drivers/nvme/Module.symvers
  CC [M]  drivers/nvme/host/nvme-core.mod.o
  CC [M]  drivers/nvme/host/nvme-fabrics.mod.o
  CC [M]  drivers/nvme/host/nvme-fc.mod.o
  CC [M]  drivers/nvme/host/nvme-rdma.mod.o
  CC [M]  drivers/nvme/host/nvme-tcp.mod.o
  CC [M]  drivers/nvme/host/nvme.mod.o
  CC [M]  drivers/nvme/target/nvme-fcloop.mod.o
  CC [M]  drivers/nvme/target/nvme-loop.mod.o
  CC [M]  drivers/nvme/target/nvmet-fc.mod.o
  CC [M]  drivers/nvme/target/nvmet-rdma.mod.o
  CC [M]  drivers/nvme/target/nvmet-tcp.mod.o
  CC [M]  drivers/nvme/target/nvmet.mod.o
  LD [M]  drivers/nvme/target/nvme-loop.ko
  LD [M]  drivers/nvme/target/nvmet.ko
  LD [M]  drivers/nvme/host/nvme.ko
  LD [M]  drivers/nvme/target/nvmet-rdma.ko
  LD [M]  drivers/nvme/host/nvme-rdma.ko
  LD [M]  drivers/nvme/host/nvme-tcp.ko
  LD [M]  drivers/nvme/host/nvme-core.ko
  LD [M]  drivers/nvme/target/nvmet-fc.ko
  LD [M]  drivers/nvme/target/nvmet-tcp.ko
  LD [M]  drivers/nvme/host/nvme-fabrics.ko
  LD [M]  drivers/nvme/host/nvme-fc.ko
  BTF [M] drivers/nvme/target/nvme-loop.ko
  LD [M]  drivers/nvme/target/nvme-fcloop.ko
  BTF [M] drivers/nvme/target/nvmet-rdma.ko
  BTF [M] drivers/nvme/host/nvme-fabrics.ko
  BTF [M] drivers/nvme/target/nvme-fcloop.ko
  BTF [M] drivers/nvme/host/nvme-rdma.ko
  BTF [M] drivers/nvme/host/nvme-tcp.ko
  BTF [M] drivers/nvme/target/nvmet-fc.ko
  BTF [M] drivers/nvme/target/nvmet-tcp.ko
  BTF [M] drivers/nvme/host/nvme.ko
  BTF [M] drivers/nvme/host/nvme-fc.ko
  BTF [M] drivers/nvme/target/nvmet.ko
  BTF [M] drivers/nvme/host/nvme-core.ko
+ HOST=drivers/nvme/host
+ TARGET=drivers/nvme/target
++ uname -r
+ HOST_DEST=/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/
++ uname -r
+ TARGET_DEST=/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target/
+ cp drivers/nvme/host/nvme-core.ko drivers/nvme/host/nvme-fabrics.ko drivers/nvme/host/nvme-fc.ko drivers/nvme/host/nvme.ko drivers/nvme/host/nvme-rdma.ko drivers/nvme/host/nvme-tcp.ko /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host//
+ cp drivers/nvme/target/nvme-fcloop.ko drivers/nvme/target/nvme-loop.ko drivers/nvme/target/nvmet-fc.ko drivers/nvme/target/nvmet.ko drivers/nvme/target/nvmet-rdma.ko drivers/nvme/target/nvmet-tcp.ko /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//
+ ls -lrth /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/ /lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//
/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/host/:
total 6.6M
-rw-r--r--. 1 root root  2.7M Dec 10 02:17 nvme-core.ko
-rw-r--r--. 1 root root  446K Dec 10 02:17 nvme-fabrics.ko
-rw-r--r--. 1 root root  964K Dec 10 02:17 nvme-fc.ko
-rw-r--r--. 1 root root  732K Dec 10 02:17 nvme.ko
-rw-r--r--. 1 root root 1019K Dec 10 02:17 nvme-rdma.ko
-rw-r--r--. 1 root root  897K Dec 10 02:17 nvme-tcp.ko

/lib/modules/5.16.0-rc1nvme+/kernel/drivers/nvme/target//:
total 6.7M
-rw-r--r--. 1 root root 498K Dec 10 02:17 nvme-fcloop.ko
-rw-r--r--. 1 root root 436K Dec 10 02:17 nvme-loop.ko
-rw-r--r--. 1 root root 767K Dec 10 02:17 nvmet-fc.ko
-rw-r--r--. 1 root root 3.3M Dec 10 02:17 nvmet.ko
-rw-r--r--. 1 root root 992K Dec 10 02:17 nvmet-rdma.ko
-rw-r--r--. 1 root root 768K Dec 10 02:17 nvmet-tcp.ko
+ modprobe nvme
root@dev nvme (nvme-5.16) # ./passthru_config.sh /dev/nvme0
umount: /mnt/nvme0n1: no mount point specified.
+ mkdir /sys/kernel/config/nvmet/subsystems/pt-nqn
+ sleep 1
+ echo 'Initializing passthru ctrl path ...'
Initializing passthru ctrl path ...
+ echo -n /dev/nvme0
+ sleep 1
+ echo 1
+ echo 1
+ mkdir /sys/kernel/config/nvmet/ports/1/
+ sleep 1
+ echo -n loop
+ sleep 1
+ echo 'Connecting passthru ctrl to the port '
Connecting passthru ctrl to the port 
+ ln -s /sys/kernel/config/nvmet/subsystems/pt-nqn /sys/kernel/config/nvmet/ports/1/subsystems/
+ sleep 1
+ nvme connect -t loop -n pt-nqn
+ sleep 1
root@dev nvme (nvme-5.16) # grep NVME_DEBUG .config
CONFIG_NVME_DEBUG_USE_CID_GENCTR=y
root@dev nvme (nvme-5.16) # for i in `seq 1 10`; do fio fio/randread.fio --filename=/dev/nvme1n1                                                                           --output=yes-cid-genctr-${i}.fio; done
root@dev nvme (nvme-5.16) # .0%][r=297MiB/s][r=76.0k IOPS][eta 00m:00s]
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # 
root@dev nvme (nvme-5.16) # grep slat yes*genctr* | awk '{print $6}' | cut -f 2 -d '=' | cut -f1 -d ','
5057.99
7434.15
6310.50
8083.82
7297.77
6003.77
6657.75
6745.34
root@dev nvme (nvme-5.16) # grep slat no*genctr* | awk '{print $6}' | cut -f 2 -d '=' | cut -f1 -d ','
3859.20
4669.78
3778.51
8436.18
5653.83
5286.60
4176.05
4385.19
root@dev nvme (nvme-5.16) # sum=0;for i in `grep slat yes-cid-genctr-* | awk '{print $6}' | cut -f 2 -d '=' | cut -f1 -d ','| cut -f 1 -d '.'`; do let sum=sum+$i; done ; echo $sum/8|bc
6698
root@dev nvme (nvme-5.16) # sum=0;for i in `grep slat no-cid-genctr-* | awk '{print $6}' | cut -f 2 -d '=' | cut -f1 -d ','| cut -f 1 -d '.'`; do let sum=sum+$i; done ; echo $sum/8|bc
5030
root@dev nvme (nvme-5.16) # 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] nvme-core: make cid gencounter configurable
  2021-12-10 11:21 [PATCH 0/3] nvme-core: make gencounter feature tunable Chaitanya Kulkarni
@ 2021-12-10 11:21 ` Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 2/3] nvme-core: move gencounter check into nvme_cid() Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 3/3] nvme: add KConfig options for debug features Chaitanya Kulkarni
  2 siblings, 0 replies; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-10 11:21 UTC (permalink / raw)
  To: kbusch, hch, sagi; +Cc: linux-nvme, Chaitanya Kulkarni

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="y", Size: 2807 bytes --]

From: Chaitanya Kulkarni <kch@nvidia.com>

The recent commit uses a combination of command id and a gencounter
to calculate the command id so that we can call out buggy controllers
for spurious completions and that also avoids use after free.
This commit adds various if statements and bitwise operations
such as &, <<, >>, & along with comparison operations to validate the  
gencounter and print out the errors.
 
This feature is required to catch the buggy controller, but in the  
production environment where controller is validated and known to be
stable this achieves nothing but adds additional code and runtime CPU  
instructions for the PCIe controller.
Worst case scenario when NVMeOF setup is using passthru backend then
these instructions get duplicated on the host and on the target side
for each controller.
 
This patch makes the gencounter feature configurable by   
using new macro CONFIG_NVME_DEBUG_USE_CID_GENCTR. We move the current
tag + gencounter code when CONFIG_NVME_DEBUG_USE_CID_GENCTR is defined
and keep the original code where we use tag-based command id where
NVME_DEBUG_USE_CID_GENCTR is not defined.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/nvme.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index b334af8aa264..98d7627cfdce 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -153,7 +153,9 @@ enum nvme_quirks {
 struct nvme_request {
 	struct nvme_command	*cmd;
 	union nvme_result	result;
+#ifdef CONFIG_NVME_DEBUG_USE_CID_GENCTR
 	u8			genctr;
+#endif
 	u8			retries;
 	u8			flags;
 	u16			status;
@@ -496,6 +498,7 @@ struct nvme_ctrl_ops {
 	int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
 };
 
+#ifdef CONFIG_NVME_DEBUG_USE_CID_GENCTR
 /*
  * nvme command_id is constructed as such:
  * | xxxx | xxxxxxxxxxxx |
@@ -538,6 +541,32 @@ static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
 {
 	return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
 }
+#else
+#define nvme_tag_from_cid(cid)			(cid)
+static inline u16 nvme_cid(struct request *rq)
+{
+	return rq->tag;
+}
+
+static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
+		u16 command_id)
+{
+	return blk_mq_tag_to_rq(tags, command_id);
+}
+
+static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
+		u16 command_id)
+{
+	u16 tag = nvme_tag_from_cid(command_id);
+	struct request *rq;
+
+	rq = blk_mq_tag_to_rq(tags, tag);
+	if (unlikely(!rq))
+		pr_err("could not locate request for tag %#x\n", tag);
+
+	return rq;
+}
+#endif /* CONFIG_NVME_DEBUG_USE_CID_GENCTR */
 
 #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
 void nvme_fault_inject_init(struct nvme_fault_inject *fault_inj,
-- 
2.29.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] nvme-core: move gencounter check into nvme_cid()
  2021-12-10 11:21 [PATCH 0/3] nvme-core: make gencounter feature tunable Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 1/3] nvme-core: make cid gencounter configurable Chaitanya Kulkarni
@ 2021-12-10 11:21 ` Chaitanya Kulkarni
  2021-12-10 15:43   ` Keith Busch
  2021-12-10 11:21 ` [PATCH 3/3] nvme: add KConfig options for debug features Chaitanya Kulkarni
  2 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-10 11:21 UTC (permalink / raw)
  To: kbusch, hch, sagi; +Cc: linux-nvme, Chaitanya Kulkarni

From: Chaitanya Kulkarni <kch@nvidia.com>

Move gencounter related code to nvme_cid to keep the code under its own
interface, this also avoids any open coding and if condition when
gencounter feature is turned off.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/core.c | 3 ---
 drivers/nvme/host/nvme.h | 2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3c9f221379bd..7008cea69400 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -990,7 +990,6 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
 blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 {
 	struct nvme_command *cmd = nvme_req(req)->cmd;
-	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;
 
 	if (!(req->rq_flags & RQF_DONTPREP))
@@ -1037,8 +1036,6 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		return BLK_STS_IOERR;
 	}
 
-	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
-		nvme_req(req)->genctr++;
 	cmd->common.command_id = nvme_cid(req);
 	trace_nvme_setup_cmd(req, cmd);
 	return ret;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 98d7627cfdce..2be0191e1a1f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -511,6 +511,8 @@ struct nvme_ctrl_ops {
 
 static inline u16 nvme_cid(struct request *rq)
 {
+	if (!(nvme_req(rq)->ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
+		nvme_req(rq)->genctr++;
 	return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
 }
 
-- 
2.29.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-10 11:21 [PATCH 0/3] nvme-core: make gencounter feature tunable Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 1/3] nvme-core: make cid gencounter configurable Chaitanya Kulkarni
  2021-12-10 11:21 ` [PATCH 2/3] nvme-core: move gencounter check into nvme_cid() Chaitanya Kulkarni
@ 2021-12-10 11:21 ` Chaitanya Kulkarni
  2021-12-12  9:22   ` Sagi Grimberg
  2 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-10 11:21 UTC (permalink / raw)
  To: kbusch, hch, sagi; +Cc: linux-nvme, Chaitanya Kulkarni

From: Chaitanya Kulkarni <kch@nvidia.com>

Add KConfig menu option to enable and disable gencounter debug
feature that uses config NVME_DEBUG_USE_CID_GENCTR. 

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/nvme/host/Kconfig | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
index dc0450ca23a3..dfa2609b7006 100644
--- a/drivers/nvme/host/Kconfig
+++ b/drivers/nvme/host/Kconfig
@@ -1,4 +1,14 @@
 # SPDX-License-Identifier: GPL-2.0-only
+menu "Debug (Enable driver debug features)"
+config NVME_DEBUG_USE_CID_GENCTR
+	bool "Enable command ID gen counter for spurious request completion"
+	depends on NVME_CORE
+	help
+	  The NVM Express driver will use generation conunter
+	  when calculating the command id. This is needed to debug the
+	  spurious request completions coming from a buggy controller.
+endmenu
+
 config NVME_CORE
 	tristate
 	select BLK_DEV_INTEGRITY_T10 if BLK_DEV_INTEGRITY
-- 
2.29.0



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nvme-core: move gencounter check into nvme_cid()
  2021-12-10 11:21 ` [PATCH 2/3] nvme-core: move gencounter check into nvme_cid() Chaitanya Kulkarni
@ 2021-12-10 15:43   ` Keith Busch
  2021-12-10 17:46     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 14+ messages in thread
From: Keith Busch @ 2021-12-10 15:43 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: hch, sagi, linux-nvme, Chaitanya Kulkarni

On Fri, Dec 10, 2021 at 03:21:15AM -0800, Chaitanya Kulkarni wrote:
> -	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> -		nvme_req(req)->genctr++;
>  	cmd->common.command_id = nvme_cid(req);
>  	trace_nvme_setup_cmd(req, cmd);
>  	return ret;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 98d7627cfdce..2be0191e1a1f 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -511,6 +511,8 @@ struct nvme_ctrl_ops {
>  
>  static inline u16 nvme_cid(struct request *rq)
>  {
> +	if (!(nvme_req(rq)->ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> +		nvme_req(rq)->genctr++;
>  	return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
>  }

No, this will incrememnt the counter every time someone queries the
command id for this request, which happens in multiple places. The
counter can't be modified for the lifetime of the request.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nvme-core: move gencounter check into nvme_cid()
  2021-12-10 15:43   ` Keith Busch
@ 2021-12-10 17:46     ` Chaitanya Kulkarni
  2021-12-12  9:19       ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-10 17:46 UTC (permalink / raw)
  To: Keith Busch; +Cc: hch, sagi, linux-nvme, Chaitanya Kulkarni

On 12/10/21 7:43 AM, Keith Busch wrote:
> External email: Use caution opening links or attachments
> 
> 
> On Fri, Dec 10, 2021 at 03:21:15AM -0800, Chaitanya Kulkarni wrote:
>> -     if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>> -             nvme_req(req)->genctr++;
>>        cmd->common.command_id = nvme_cid(req);
>>        trace_nvme_setup_cmd(req, cmd);
>>        return ret;
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index 98d7627cfdce..2be0191e1a1f 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -511,6 +511,8 @@ struct nvme_ctrl_ops {
>>
>>   static inline u16 nvme_cid(struct request *rq)
>>   {
>> +     if (!(nvme_req(rq)->ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>> +             nvme_req(rq)->genctr++;
>>        return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
>>   }
> 
> No, this will incrememnt the counter every time someone queries the
> command id for this request, which happens in multiple places. The
> counter can't be modified for the lifetime of the request.
> 

Yes, the right thing to do here is to create a new stub for
CONFIG_NVME_DEBUG_USE_GENCTR and !CONFIG_NVME_DEBUG_USE_GENCTR
case and move the quick check in there, will send out V2.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nvme-core: move gencounter check into nvme_cid()
  2021-12-10 17:46     ` Chaitanya Kulkarni
@ 2021-12-12  9:19       ` Sagi Grimberg
  2021-12-13  7:07         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2021-12-12  9:19 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Keith Busch; +Cc: hch, linux-nvme


>> External email: Use caution opening links or attachments
>>
>>
>> On Fri, Dec 10, 2021 at 03:21:15AM -0800, Chaitanya Kulkarni wrote:
>>> -     if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>>> -             nvme_req(req)->genctr++;
>>>         cmd->common.command_id = nvme_cid(req);
>>>         trace_nvme_setup_cmd(req, cmd);
>>>         return ret;
>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>>> index 98d7627cfdce..2be0191e1a1f 100644
>>> --- a/drivers/nvme/host/nvme.h
>>> +++ b/drivers/nvme/host/nvme.h
>>> @@ -511,6 +511,8 @@ struct nvme_ctrl_ops {
>>>
>>>    static inline u16 nvme_cid(struct request *rq)
>>>    {
>>> +     if (!(nvme_req(rq)->ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>>> +             nvme_req(rq)->genctr++;
>>>         return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
>>>    }
>>
>> No, this will incrememnt the counter every time someone queries the
>> command id for this request, which happens in multiple places. The
>> counter can't be modified for the lifetime of the request.
>>
> 
> Yes, the right thing to do here is to create a new stub for
> CONFIG_NVME_DEBUG_USE_GENCTR and !CONFIG_NVME_DEBUG_USE_GENCTR
> case and move the quick check in there, will send out V2.

Don't understand the global config option at all, distributions
will probably enable it anyways (as the default needs to be set
to Y anyways). However, this set was obviously not tested with nvme-tcp
because it breaks it completely. And it is also in general a bad
practice IMO to increment the genctr on every install. It needs to
be in a different helper.

Please make sure to test nvme-tcp on your v2.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-10 11:21 ` [PATCH 3/3] nvme: add KConfig options for debug features Chaitanya Kulkarni
@ 2021-12-12  9:22   ` Sagi Grimberg
  2021-12-13  7:39     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2021-12-12  9:22 UTC (permalink / raw)
  To: Chaitanya Kulkarni, kbusch, hch; +Cc: linux-nvme, Chaitanya Kulkarni


> From: Chaitanya Kulkarni <kch@nvidia.com>
> 
> Add KConfig menu option to enable and disable gencounter debug
> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
> 
> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
> ---
>   drivers/nvme/host/Kconfig | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
> index dc0450ca23a3..dfa2609b7006 100644
> --- a/drivers/nvme/host/Kconfig
> +++ b/drivers/nvme/host/Kconfig
> @@ -1,4 +1,14 @@
>   # SPDX-License-Identifier: GPL-2.0-only
> +menu "Debug (Enable driver debug features)"
> +config NVME_DEBUG_USE_CID_GENCTR
> +	bool "Enable command ID gen counter for spurious request completion"
> +	depends on NVME_CORE
> +	help
> +	  The NVM Express driver will use generation conunter
> +	  when calculating the command id. This is needed to debug the
> +	  spurious request completions coming from a buggy controller.

This is not just to debug - it is also to protect against such a
controller. What is the purpose of this config option anyways?
The main distributions will (as they should) enable it anyways...


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] nvme-core: move gencounter check into nvme_cid()
  2021-12-12  9:19       ` Sagi Grimberg
@ 2021-12-13  7:07         ` Chaitanya Kulkarni
  0 siblings, 0 replies; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-13  7:07 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: hch, Keith Busch, linux-nvme

Sagi,

On 12/12/21 1:19 AM, Sagi Grimberg wrote:
> External email: Use caution opening links or attachments
> 
> 
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Fri, Dec 10, 2021 at 03:21:15AM -0800, Chaitanya Kulkarni wrote:
>>>> -     if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>>>> -             nvme_req(req)->genctr++;
>>>>         cmd->common.command_id = nvme_cid(req);
>>>>         trace_nvme_setup_cmd(req, cmd);
>>>>         return ret;
>>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>>>> index 98d7627cfdce..2be0191e1a1f 100644
>>>> --- a/drivers/nvme/host/nvme.h
>>>> +++ b/drivers/nvme/host/nvme.h
>>>> @@ -511,6 +511,8 @@ struct nvme_ctrl_ops {
>>>>
>>>>    static inline u16 nvme_cid(struct request *rq)
>>>>    {
>>>> +     if (!(nvme_req(rq)->ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
>>>> +             nvme_req(rq)->genctr++;
>>>>         return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
>>>>    }
>>>
>>> No, this will incrememnt the counter every time someone queries the
>>> command id for this request, which happens in multiple places. The
>>> counter can't be modified for the lifetime of the request.
>>>
>>
>> Yes, the right thing to do here is to create a new stub for
>> CONFIG_NVME_DEBUG_USE_GENCTR and !CONFIG_NVME_DEBUG_USE_GENCTR
>> case and move the quick check in there, will send out V2.
> 
> Don't understand the global config option at all, distributions
> will probably enable it anyways (as the default needs to be set
> to Y anyways). However, this set was obviously not tested with nvme-tcp
> because it breaks it completely. And it is also in general a bad
> practice IMO to increment the genctr on every install. It needs to
> be in a different helper.
> 
> Please make sure to test nvme-tcp on your v2.

Yes the nvme_cid() chanege is wrong, let me send a V2 tested on the
tcp and correct mentioned helper.

-ck


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-12  9:22   ` Sagi Grimberg
@ 2021-12-13  7:39     ` Chaitanya Kulkarni
  2021-12-13  9:27       ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-13  7:39 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, Chaitanya Kulkarni, hch, kbusch

On 12/12/21 1:22 AM, Sagi Grimberg wrote:
> External email: Use caution opening links or attachments
> 
> 
>> From: Chaitanya Kulkarni <kch@nvidia.com>
>>
>> Add KConfig menu option to enable and disable gencounter debug
>> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
>>
>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>> ---
>>   drivers/nvme/host/Kconfig | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>> index dc0450ca23a3..dfa2609b7006 100644
>> --- a/drivers/nvme/host/Kconfig
>> +++ b/drivers/nvme/host/Kconfig
>> @@ -1,4 +1,14 @@
>>   # SPDX-License-Identifier: GPL-2.0-only
>> +menu "Debug (Enable driver debug features)"
>> +config NVME_DEBUG_USE_CID_GENCTR
>> +     bool "Enable command ID gen counter for spurious request 
>> completion"
>> +     depends on NVME_CORE
>> +     help
>> +       The NVM Express driver will use generation counter
>> +       when calculating the command id. This is needed to debug the
>> +       spurious request completions coming from a buggy controller.
> 
> This is not just to debug - it is also to protect against such a
> controller. What is the purpose of this config option anyways?
> The main distributions will (as they should) enable it anyways...

I can rewrite the text and rename it to "driver features".
We are protecting against such a controller which is not stable
(buggy), i.e. it is doing things which it shouldn't be
doing at the first place. Consider a case if controller is not
buggy then it adds instructions in the fast path which are not
needed at all.

A controller(s) that is used in the production environment goes
through qualification process from vendors and from the consumers
to make sure they are stable, something like spurious completions
detection is a basic part of the qualification, hence we should
allow user to configure genctr than forcing additional
instructions in the fast path and keep this pattern for future
such cases.

Maybe I didn't understand, can you please explain what are the
benefits of having gen-counter where controller is stable?

I'll wait to send V2 if you can suggest any other way
(than kconfig something like module param, I'm not sure) so user
can configure genctr calculations, please let me know.

-ck


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-13  7:39     ` Chaitanya Kulkarni
@ 2021-12-13  9:27       ` Sagi Grimberg
  2021-12-14  7:54         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2021-12-13  9:27 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme, hch, kbusch


>>> Add KConfig menu option to enable and disable gencounter debug
>>> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
>>>
>>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>>> ---
>>>    drivers/nvme/host/Kconfig | 10 ++++++++++
>>>    1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>>> index dc0450ca23a3..dfa2609b7006 100644
>>> --- a/drivers/nvme/host/Kconfig
>>> +++ b/drivers/nvme/host/Kconfig
>>> @@ -1,4 +1,14 @@
>>>    # SPDX-License-Identifier: GPL-2.0-only
>>> +menu "Debug (Enable driver debug features)"
>>> +config NVME_DEBUG_USE_CID_GENCTR
>>> +     bool "Enable command ID gen counter for spurious request
>>> completion"
>>> +     depends on NVME_CORE
>>> +     help
>>> +       The NVM Express driver will use generation counter
>>> +       when calculating the command id. This is needed to debug the
>>> +       spurious request completions coming from a buggy controller.
>>
>> This is not just to debug - it is also to protect against such a
>> controller. What is the purpose of this config option anyways?
>> The main distributions will (as they should) enable it anyways...
> 
> I can rewrite the text and rename it to "driver features".

It is not a feature.

> We are protecting against such a controller which is not stable
> (buggy), i.e. it is doing things which it shouldn't be
> doing at the first place.

Yes, this is exactly what we are doing.

> Consider a case if controller is not
> buggy then it adds instructions in the fast path which are not
> needed at all.

We keep repeating this, if the controller assumes _anything_ on
the command identifier then it _is_ buggy, period.

Again, the fact that the controller is buggy is perfectly fine
really, that's why we have quirks.

What happened to your nvme-cli argument? that would make it
specific to that controller.

The host side cost is very cheap as shown before.

> A controller(s) that is used in the production environment goes
> through qualification process from vendors and from the consumers
> to make sure they are stable, something like spurious completions
> detection is a basic part of the qualification,

I don't think that assuming that a production controller does not
have any hidden bugs is a great practice, and controllers evolve during
their production lifetime.

> hence we should
> allow user to configure genctr than forcing additional
> instructions in the fast path and keep this pattern for future
> such cases.

I personally think that given that the worst-case option for such
a bug is possible data-corruption then we should not make it optional,
but I won't insist on it if others think otherwise.

> Maybe I didn't understand, can you please explain what are the
> benefits of having gen-counter where controller is stable?

The benefit is that the host code does not "assume" anything
about the controller.

> I'll wait to send V2 if you can suggest any other way
> (than kconfig something like module param, I'm not sure) so user
> can configure genctr calculations, please let me know.

As I said, the nvme-cli skip-genctr argument is perfectly valid in my
mind.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-13  9:27       ` Sagi Grimberg
@ 2021-12-14  7:54         ` Chaitanya Kulkarni
  2021-12-14 11:03           ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-14  7:54 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, hch, kbusch

On 12/13/21 1:27 AM, Sagi Grimberg wrote:
> External email: Use caution opening links or attachments
> 
> 
>>>> Add KConfig menu option to enable and disable gencounter debug
>>>> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
>>>>
>>>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>>>> ---
>>>>    drivers/nvme/host/Kconfig | 10 ++++++++++
>>>>    1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>>>> index dc0450ca23a3..dfa2609b7006 100644
>>>> --- a/drivers/nvme/host/Kconfig
>>>> +++ b/drivers/nvme/host/Kconfig
>>>> @@ -1,4 +1,14 @@
>>>>    # SPDX-License-Identifier: GPL-2.0-only
>>>> +menu "Debug (Enable driver debug features)"
>>>> +config NVME_DEBUG_USE_CID_GENCTR
>>>> +     bool "Enable command ID gen counter for spurious request
>>>> completion"
>>>> +     depends on NVME_CORE
>>>> +     help
>>>> +       The NVM Express driver will use generation counter
>>>> +       when calculating the command id. This is needed to debug the
>>>> +       spurious request completions coming from a buggy controller.
>>>
>>> This is not just to debug - it is also to protect against such a
>>> controller. What is the purpose of this config option anyways?
>>> The main distributions will (as they should) enable it anyways...
>>
>> I can rewrite the text and rename it to "driver features".
> 
> It is not a feature.

that is what I thought, hence I kept it under debug.

> 
>> We are protecting against such a controller which is not stable
>> (buggy), i.e. it is doing things which it shouldn't be
>> doing at the first place.
> 
> Yes, this is exactly what we are doing.
> 
>> Consider a case if controller is not
>> buggy then it adds instructions in the fast path which are not
>> needed at all.
> 
> We keep repeating this, if the controller assumes _anything_ on
> the command identifier then it _is_ buggy, period.
> 

Completely agree on this.

What I mean buggy in this context is not related to assuming anything
with command identifier population, it meant controller that is
causing spurious completions -> buggy. The controller that is not
able to process non-seq command ids and assumes anything is out of
picture for this patch-series (for that nvme-cli skip approach is
fine), see below.

> Again, the fact that the controller is buggy is perfectly fine
> really, that's why we have quirks.
> 

A controller which is suffering from spurious completions needs
gencounter to debug the issue, using quirk and masking gencounter
should be avoided in this case.

> What happened to your nvme-cli argument? that would make it
> specific to that controller.
> 

That is completely different issue, the nvme-cli argument tries
to fix the issue with a controller that fails to process
non-sequntial command-id such as Apple ctrl and needs quirk.

This path-series has nothing to do with that issue.
This patch-series is only focused on keeping the fast path lean
and allowing user(s) to configure gencounter in the environment
where :-

1. Controller is stable.
2. Able to process the non-sequential command ids
    (hence no quirk needed) and does not assume anything about the
    command id population unlike Apple controller.
3. Does not suffer from spurious completion problem.

For this controller, user can disable gencounter and keep the fast
path lean.

> The host side cost is very cheap as shown before.
> 
>> A controller(s) that is used in the production environment goes
>> through qualification process from vendors and from the consumers
>> to make sure they are stable, something like spurious completions
>> detection is a basic part of the qualification,
> 
> I don't think that assuming that a production controller does not
> have any hidden bugs is a great practice, and controllers evolve during
> their production lifetime.
> 

That is why we should keep it configurable and let user decide.

>> hence we should
>> allow user to configure genctr than forcing additional
>> instructions in the fast path and keep this pattern for future
>> such cases.
> 
> I personally think that given that the worst-case option for such
> a bug is possible data-corruption then we should not make it optional,
> but I won't insist on it if others think otherwise.
> 

Even if we keep the gencouter on unconditionally once spurious
completion is posted, controller is already in the inconsistent
state, there is no guarantee that data will be consistent. In
fact what we need is some way of signaling uevent on spurious
completion so that userspace can get notification on the top
of your work with gencounter.

>> Maybe I didn't understand, can you please explain what are the
>> benefits of having gen-counter where controller is stable?
> 
> The benefit is that the host code does not "assume" anything
> about the controller.
> 
>> I'll wait to send V2 if you can suggest any other way
>> (than kconfig something like module param, I'm not sure) so user
>> can configure genctr calculations, please let me know.
> 
> As I said, the nvme-cli skip-genctr argument is perfectly valid in my
> mind.

That is for the broken controllers which are not able to process
non-seq command id generation like Apple, this patch is not for that.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-14  7:54         ` Chaitanya Kulkarni
@ 2021-12-14 11:03           ` Sagi Grimberg
  2021-12-17  6:52             ` Chaitanya Kulkarni
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2021-12-14 11:03 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: linux-nvme, hch, kbusch


>>>>> Add KConfig menu option to enable and disable gencounter debug
>>>>> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
>>>>>
>>>>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>>>>> ---
>>>>>     drivers/nvme/host/Kconfig | 10 ++++++++++
>>>>>     1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>>>>> index dc0450ca23a3..dfa2609b7006 100644
>>>>> --- a/drivers/nvme/host/Kconfig
>>>>> +++ b/drivers/nvme/host/Kconfig
>>>>> @@ -1,4 +1,14 @@
>>>>>     # SPDX-License-Identifier: GPL-2.0-only
>>>>> +menu "Debug (Enable driver debug features)"
>>>>> +config NVME_DEBUG_USE_CID_GENCTR
>>>>> +     bool "Enable command ID gen counter for spurious request
>>>>> completion"
>>>>> +     depends on NVME_CORE
>>>>> +     help
>>>>> +       The NVM Express driver will use generation counter
>>>>> +       when calculating the command id. This is needed to debug the
>>>>> +       spurious request completions coming from a buggy controller.
>>>>
>>>> This is not just to debug - it is also to protect against such a
>>>> controller. What is the purpose of this config option anyways?
>>>> The main distributions will (as they should) enable it anyways...
>>>
>>> I can rewrite the text and rename it to "driver features".
>>
>> It is not a feature.
> 
> that is what I thought, hence I kept it under debug.

It is not debug, I mean it can certainly be used for debug, but
it is protecting against kernel-crash (or worse) with such controllers.
The fact that a buggy controller can crash the kernel (or worse) has
been raised on this list by users multiple times in the past.

>>> We are protecting against such a controller which is not stable
>>> (buggy), i.e. it is doing things which it shouldn't be
>>> doing at the first place.
>>
>> Yes, this is exactly what we are doing.
>>
>>> Consider a case if controller is not
>>> buggy then it adds instructions in the fast path which are not
>>> needed at all.
>>
>> We keep repeating this, if the controller assumes _anything_ on
>> the command identifier then it _is_ buggy, period.
>>
> 
> Completely agree on this.
> 
> What I mean buggy in this context is not related to assuming anything
> with command identifier population, it meant controller that is
> causing spurious completions -> buggy. The controller that is not
> able to process non-seq command ids and assumes anything is out of
> picture for this patch-series (for that nvme-cli skip approach is
> fine), see below.
> 
>> Again, the fact that the controller is buggy is perfectly fine
>> really, that's why we have quirks.
>>
> 
> A controller which is suffering from spurious completions needs
> gencounter to debug the issue, using quirk and masking gencounter
> should be avoided in this case.

Well of course, the quirk is for controllers that don't suffer
from this but happen to break because the host is doing something
unexpected with the command id (even though it is allowed to).

> 
>> What happened to your nvme-cli argument? that would make it
>> specific to that controller.
>>
> 
> That is completely different issue, the nvme-cli argument tries
> to fix the issue with a controller that fails to process
> non-sequntial command-id such as Apple ctrl and needs quirk.
> 
> This path-series has nothing to do with that issue.
> This patch-series is only focused on keeping the fast path lean
> and allowing user(s) to configure gencounter in the environment
> where :-
> 
> 1. Controller is stable.
> 2. Able to process the non-sequential command ids
>      (hence no quirk needed) and does not assume anything about the
>      command id population unlike Apple controller.
> 3. Does not suffer from spurious completion problem.
> 
> For this controller, user can disable gencounter and keep the fast
> path lean.

I understand, although I'd argue that a vendor may tell a customer
to disable that globally... That is not necessarily something we want
to allow.

> 
>> The host side cost is very cheap as shown before.
>>
>>> A controller(s) that is used in the production environment goes
>>> through qualification process from vendors and from the consumers
>>> to make sure they are stable, something like spurious completions
>>> detection is a basic part of the qualification,
>>
>> I don't think that assuming that a production controller does not
>> have any hidden bugs is a great practice, and controllers evolve during
>> their production lifetime.
>>
> 
> That is why we should keep it configurable and let user decide.

OK, so this allows a user that is absolutely set on shaving few
nano-seconds to sacrifice a level of protection against buggy
controllers. Is this a real use-case?

For that, I guess the config option is fine, but this at the very least
needs to default to y.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] nvme: add KConfig options for debug features
  2021-12-14 11:03           ` Sagi Grimberg
@ 2021-12-17  6:52             ` Chaitanya Kulkarni
  0 siblings, 0 replies; 14+ messages in thread
From: Chaitanya Kulkarni @ 2021-12-17  6:52 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, hch, kbusch

>>>> A controller(s) that is used in the production environment goes
>>>> through qualification process from vendors and from the consumers
>>>> to make sure they are stable, something like spurious completions
>>>> detection is a basic part of the qualification,
>>>
>>> I don't think that assuming that a production controller does not
>>> have any hidden bugs is a great practice, and controllers evolve during
>>> their production lifetime.
>>>
>>
>> That is why we should keep it configurable and let user decide.
> 
> OK, so this allows a user that is absolutely set on shaving few
> nano-seconds to sacrifice a level of protection against buggy
> controllers. Is this a real use-case?
> 
> For that, I guess the config option is fine, but this at the very least
> needs to default to y.

Okay, I'll send out V2 with default options set to y.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-12-17  6:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-10 11:21 [PATCH 0/3] nvme-core: make gencounter feature tunable Chaitanya Kulkarni
2021-12-10 11:21 ` [PATCH 1/3] nvme-core: make cid gencounter configurable Chaitanya Kulkarni
2021-12-10 11:21 ` [PATCH 2/3] nvme-core: move gencounter check into nvme_cid() Chaitanya Kulkarni
2021-12-10 15:43   ` Keith Busch
2021-12-10 17:46     ` Chaitanya Kulkarni
2021-12-12  9:19       ` Sagi Grimberg
2021-12-13  7:07         ` Chaitanya Kulkarni
2021-12-10 11:21 ` [PATCH 3/3] nvme: add KConfig options for debug features Chaitanya Kulkarni
2021-12-12  9:22   ` Sagi Grimberg
2021-12-13  7:39     ` Chaitanya Kulkarni
2021-12-13  9:27       ` Sagi Grimberg
2021-12-14  7:54         ` Chaitanya Kulkarni
2021-12-14 11:03           ` Sagi Grimberg
2021-12-17  6:52             ` Chaitanya Kulkarni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.