Linux-Block Archive on lore.kernel.org
* [PATCHv4 for-next 00/19] Misc update for rnbd
@ 2021-04-14 12:23 Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module Gioh Kim
                   ` (18 more replies)
  0 siblings, 19 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim

Hi,

I am sending v4 because there has been no reply to the v3 I sent last week.
The changes from v3 to v4 are described below.

This is the misc update for rnbd. It includes:
- Change maintainer
- Change domain address of maintainers' email: from cloud.ionos.com to ionos.com
- Add polling IO mode and update the documentation
- Fix a memory leak and some bugs detected by static code analysis tools
- Code refactoring

V3->V4
- Add "Acked-by Jason" to patches including changes for RTRS
- Add "Reviewed-by Chaitanya" to patches reviewed by Chaitanya
- Add "Acked-by Jack" to "Documentation/sysfs-block-rnbd: Add descriptions
for remap_device and resize"

V2->V3
- Exclude patches relevant to the fault-injection feature

V1->V2
- Change the title: for-rc -> for-next
- Remove unnecessary (void) casting requested by Leon

Best regards

Danil Kipnis (1):
  MAINTAINERS: Change maintainer for rnbd module

Dima Stepanov (2):
  block/rnbd-clt-sysfs: Remove copy buffer overlap in
    rnbd_clt_get_path_name
  block/rnbd: Use strscpy instead of strlcpy

Gioh Kim (8):
  Documentation/sysfs-block-rnbd: Add descriptions for remap_device and
    resize
  block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT}
  block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in
    parallel
  block/rnbd-srv: Remove force_close file after holding a lock
  block/rnbd-clt: Fix missing a memory free when unloading the module
  block/rnbd-clt: Support polling mode for IO latency optimization
  Documentation/ABI/rnbd-clt: Add description for nr_poll_queues
  block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev

Guoqing Jiang (5):
  block/rnbd-clt: Remove some arguments from
    insert_dev_if_not_exists_devpath
  block/rnbd-clt: Remove some arguments from rnbd_client_setup_device
  block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk
  block/rnbd: Kill rnbd_clt_destroy_default_group
  block/rnbd: Kill destroy_device_cb

Jack Wang (1):
  block/rnbd-clt: Remove max_segment_size

Md Haris Iqbal (1):
  block/rnbd-clt: Generate kobject_uevent when the rnbd device state
    changes

Tom Rix (1):
  block/rnbd-clt: Improve find_or_create_sess() return check

 Documentation/ABI/testing/sysfs-block-rnbd    |  18 ++
 .../ABI/testing/sysfs-class-rnbd-client       |  13 ++
 MAINTAINERS                                   |   4 +-
 drivers/block/rnbd/rnbd-clt-sysfs.c           |  85 ++++++---
 drivers/block/rnbd/rnbd-clt.c                 | 167 ++++++++++++------
 drivers/block/rnbd/rnbd-clt.h                 |   6 +-
 drivers/block/rnbd/rnbd-srv-sysfs.c           |   5 +-
 drivers/block/rnbd/rnbd-srv.c                 |  69 +++-----
 drivers/block/rnbd/rnbd-srv.h                 |   3 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c        |  75 +++++---
 drivers/infiniband/ulp/rtrs/rtrs-clt.h        |   1 -
 drivers/infiniband/ulp/rtrs/rtrs-pri.h        |   1 +
 drivers/infiniband/ulp/rtrs/rtrs-srv.c        |   4 +-
 drivers/infiniband/ulp/rtrs/rtrs.h            |  13 +-
 14 files changed, 303 insertions(+), 161 deletions(-)

-- 
2.25.1



* [PATCHv4 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 02/19] Documentation/sysfs-block-rnbd: Add descriptions for remap_device and resize Gioh Kim
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Danil Kipnis, Md Haris Iqbal, Jack Wang, Gioh Kim

From: Danil Kipnis <danil.kipnis@cloud.ionos.com>

Danil steps down and Haris will take over.
Also update the email addresses to ionos.com; the old
cloud.ionos.com addresses will still work for some time.

Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Acked-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index bf947775390c..723ba354dce6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15358,8 +15358,8 @@ N:	riscv
 K:	riscv
 
 RNBD BLOCK DRIVERS
-M:	Danil Kipnis <danil.kipnis@cloud.ionos.com>
-M:	Jack Wang <jinpu.wang@cloud.ionos.com>
+M:	Md. Haris Iqbal <haris.iqbal@ionos.com>
+M:	Jack Wang <jinpu.wang@ionos.com>
 L:	linux-block@vger.kernel.org
 S:	Maintained
 F:	drivers/block/rnbd/
-- 
2.25.1



* [PATCHv4 for-next 02/19] Documentation/sysfs-block-rnbd: Add descriptions for remap_device and resize
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 03/19] block/rnbd-clt: Remove some arguments from insert_dev_if_not_exists_devpath Gioh Kim
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Jack Wang

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

Descriptions of two sysfs entries, remap_device and resize, are missing.

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 Documentation/ABI/testing/sysfs-block-rnbd | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block-rnbd b/Documentation/ABI/testing/sysfs-block-rnbd
index 14a6fe9422b3..ec716e1c31a8 100644
--- a/Documentation/ABI/testing/sysfs-block-rnbd
+++ b/Documentation/ABI/testing/sysfs-block-rnbd
@@ -44,3 +44,15 @@ Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
 Description:	Contains the device access mode: ro, rw or migration.
+
+What:		/sys/block/rnbd<N>/rnbd/resize
+Date:		Feb 2020
+KernelVersion:	5.7
+Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description:	Write the number of sectors to change the size of the disk.
+
+What:		/sys/block/rnbd<N>/rnbd/remap_device
+Date:		Feb 2020
+KernelVersion:	5.7
+Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description:	Remap the disconnected device if the session is not destroyed yet.
-- 
2.25.1



* [PATCHv4 for-next 03/19] block/rnbd-clt: Remove some arguments from insert_dev_if_not_exists_devpath
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 02/19] Documentation/sysfs-block-rnbd: Add descriptions for remap_device and resize Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 04/19] block/rnbd-clt: Remove some arguments from rnbd_client_setup_device Gioh Kim
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Guoqing Jiang, Danil Kipnis, Gioh Kim, Jack Wang,
	Chaitanya Kulkarni

From: Guoqing Jiang <guoqing.jiang@gmx.com>

Remove 'pathname' and 'sess' since we can dereference them from 'dev'.

Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 45a470076652..5a5c8dea38dc 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1471,14 +1471,13 @@ static bool exists_devpath(const char *pathname, const char *sessname)
 	return found;
 }
 
-static bool insert_dev_if_not_exists_devpath(const char *pathname,
-					     struct rnbd_clt_session *sess,
-					     struct rnbd_clt_dev *dev)
+static bool insert_dev_if_not_exists_devpath(struct rnbd_clt_dev *dev)
 {
 	bool found;
+	struct rnbd_clt_session *sess = dev->sess;
 
 	mutex_lock(&sess_lock);
-	found = __exists_dev(pathname, sess->sessname);
+	found = __exists_dev(dev->pathname, sess->sessname);
 	if (!found) {
 		mutex_lock(&sess->lock);
 		list_add_tail(&dev->list, &sess->devs_list);
@@ -1522,7 +1521,7 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 		ret = PTR_ERR(dev);
 		goto put_sess;
 	}
-	if (insert_dev_if_not_exists_devpath(pathname, sess, dev)) {
+	if (insert_dev_if_not_exists_devpath(dev)) {
 		ret = -EEXIST;
 		goto put_dev;
 	}
-- 
2.25.1



* [PATCHv4 for-next 04/19] block/rnbd-clt: Remove some arguments from rnbd_client_setup_device
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (2 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 03/19] block/rnbd-clt: Remove some arguments from insert_dev_if_not_exists_devpath Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 05/19] block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk Gioh Kim
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Guoqing Jiang, Danil Kipnis, Gioh Kim, Jack Wang,
	Chaitanya Kulkarni

From: Guoqing Jiang <guoqing.jiang@gmx.com>

Remove 'sess' and 'idx' since both can be dereferenced from 'dev'. In
addition, 'sess' is not used in the function.

Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 5a5c8dea38dc..ecb83c10013d 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1354,10 +1354,9 @@ static void rnbd_clt_setup_gen_disk(struct rnbd_clt_dev *dev, int idx)
 		blk_queue_flag_set(QUEUE_FLAG_NONROT, dev->queue);
 }
 
-static int rnbd_client_setup_device(struct rnbd_clt_session *sess,
-				     struct rnbd_clt_dev *dev, int idx)
+static int rnbd_client_setup_device(struct rnbd_clt_dev *dev)
 {
-	int err;
+	int err, idx = dev->clt_device_id;
 
 	dev->size = dev->nsectors * dev->logical_block_size;
 
@@ -1535,7 +1534,7 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 	mutex_lock(&dev->lock);
 	pr_debug("Opened remote device: session=%s, path='%s'\n",
 		 sess->sessname, pathname);
-	ret = rnbd_client_setup_device(sess, dev, dev->clt_device_id);
+	ret = rnbd_client_setup_device(dev);
 	if (ret) {
 		rnbd_clt_err(dev,
 			      "map_device: Failed to configure device, err: %d\n",
-- 
2.25.1



* [PATCHv4 for-next 05/19] block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (3 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 04/19] block/rnbd-clt: Remove some arguments from rnbd_client_setup_device Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 06/19] block/rnbd: Kill rnbd_clt_destroy_default_group Gioh Kim
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Guoqing Jiang, Danil Kipnis, Gioh Kim, Jack Wang,
	Chaitanya Kulkarni

From: Guoqing Jiang <guoqing.jiang@gmx.com>

It makes more sense to add the gendisk in rnbd_clt_setup_gen_disk instead
of doing it in rnbd_clt_map_device.

Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index ecb83c10013d..f864f06a49b3 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1352,6 +1352,7 @@ static void rnbd_clt_setup_gen_disk(struct rnbd_clt_dev *dev, int idx)
 
 	if (!dev->rotational)
 		blk_queue_flag_set(QUEUE_FLAG_NONROT, dev->queue);
+	add_disk(dev->gd);
 }
 
 static int rnbd_client_setup_device(struct rnbd_clt_dev *dev)
@@ -1553,8 +1554,6 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 		       dev->max_hw_sectors, dev->rotational, dev->wc, dev->fua);
 
 	mutex_unlock(&dev->lock);
-
-	add_disk(dev->gd);
 	rnbd_clt_put_sess(sess);
 
 	return dev;
-- 
2.25.1



* [PATCHv4 for-next 06/19] block/rnbd: Kill rnbd_clt_destroy_default_group
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (4 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 05/19] block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 07/19] block/rnbd: Kill destroy_device_cb Gioh Kim
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Guoqing Jiang, Guoqing Jiang, Danil Kipnis, Gioh Kim,
	Chaitanya Kulkarni

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

No need to have it since we can call sysfs_remove_group in
rnbd_clt_destroy_sysfs_files.

Then rnbd_clt_destroy_sysfs_files is paired with its counterpart
rnbd_clt_create_sysfs_files.

Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c | 6 +-----
 drivers/block/rnbd/rnbd-clt.c       | 1 -
 drivers/block/rnbd/rnbd-clt.h       | 1 -
 3 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index d4aa6bfc9555..58c2cc0725b6 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -639,13 +639,9 @@ int rnbd_clt_create_sysfs_files(void)
 	return err;
 }
 
-void rnbd_clt_destroy_default_group(void)
-{
-	sysfs_remove_group(&rnbd_dev->kobj, &default_attr_group);
-}
-
 void rnbd_clt_destroy_sysfs_files(void)
 {
+	sysfs_remove_group(&rnbd_dev->kobj, &default_attr_group);
 	kobject_del(rnbd_devs_kobj);
 	kobject_put(rnbd_devs_kobj);
 	device_destroy(rnbd_dev_class, MKDEV(0, 0));
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index f864f06a49b3..4e687ec88721 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1675,7 +1675,6 @@ static void rnbd_destroy_sessions(void)
 	struct rnbd_clt_dev *dev, *tn;
 
 	/* Firstly forbid access through sysfs interface */
-	rnbd_clt_destroy_default_group();
 	rnbd_clt_destroy_sysfs_files();
 
 	/*
diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h
index 537d499dad3b..714d426b449b 100644
--- a/drivers/block/rnbd/rnbd-clt.h
+++ b/drivers/block/rnbd/rnbd-clt.h
@@ -159,7 +159,6 @@ int rnbd_clt_resize_disk(struct rnbd_clt_dev *dev, size_t newsize);
 int rnbd_clt_create_sysfs_files(void);
 
 void rnbd_clt_destroy_sysfs_files(void);
-void rnbd_clt_destroy_default_group(void);
 
 void rnbd_clt_remove_dev_symlink(struct rnbd_clt_dev *dev);
 
-- 
2.25.1



* [PATCHv4 for-next 07/19] block/rnbd: Kill destroy_device_cb
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (5 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 06/19] block/rnbd: Kill rnbd_clt_destroy_default_group Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 08/19] block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT} Gioh Kim
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Guoqing Jiang, Guoqing Jiang, Danil Kipnis, Gioh Kim,
	Chaitanya Kulkarni

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

We can use destroy_device directly since destroy_device_cb is just a
wrapper around destroy_device.
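
The pattern described above can be sketched in user space, assuming
simplified stand-ins for the kernel's kref and container_of. The names
demo_dev, demo_dev_create, demo_put_dev and destroyed_count are
hypothetical, and the counter here is a plain int rather than the atomic
refcount the kernel uses. The release callback takes the embedded kref
and recovers the enclosing object itself, so no separate wrapper is
needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel's kref and container_of (assumptions). */
struct kref {
	int refcount;	/* not atomic in this sketch */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void kref_init(struct kref *k)
{
	k->refcount = 1;
}

static void kref_get(struct kref *k)
{
	k->refcount++;
}

/* Calls release() and returns 1 when the last reference is dropped. */
static int kref_put(struct kref *k, void (*release)(struct kref *k))
{
	assert(k->refcount > 0);
	if (--k->refcount == 0) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical device object that embeds its reference counter. */
struct demo_dev {
	int id;
	struct kref kref;
};

static int destroyed_count;	/* lets a caller observe the release */

/* Release callback: recovers the enclosing object from the kref directly. */
static void destroy_device(struct kref *kref)
{
	struct demo_dev *dev = container_of(kref, struct demo_dev, kref);

	destroyed_count++;
	free(dev);
}

static struct demo_dev *demo_dev_create(int id)
{
	struct demo_dev *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->id = id;
	kref_init(&dev->kref);
	return dev;
}

static void demo_put_dev(struct demo_dev *dev)
{
	kref_put(&dev->kref, destroy_device);
}
```

With this shape, the put helper passes destroy_device straight to
kref_put, which is the simplification the patch makes.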

Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-srv.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index a6a68d44f517..a4fd9f167c18 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -178,8 +178,10 @@ static int process_rdma(struct rtrs_srv *sess,
 	return err;
 }
 
-static void destroy_device(struct rnbd_srv_dev *dev)
+static void destroy_device(struct kref *kref)
 {
+	struct rnbd_srv_dev *dev = container_of(kref, struct rnbd_srv_dev, kref);
+
 	WARN_ONCE(!list_empty(&dev->sess_dev_list),
 		  "Device %s is being destroyed but still in use!\n",
 		  dev->id);
@@ -198,18 +200,9 @@ static void destroy_device(struct rnbd_srv_dev *dev)
 		kfree(dev);
 }
 
-static void destroy_device_cb(struct kref *kref)
-{
-	struct rnbd_srv_dev *dev;
-
-	dev = container_of(kref, struct rnbd_srv_dev, kref);
-
-	destroy_device(dev);
-}
-
 static void rnbd_put_srv_dev(struct rnbd_srv_dev *dev)
 {
-	kref_put(&dev->kref, destroy_device_cb);
+	kref_put(&dev->kref, destroy_device);
 }
 
 void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev, bool keep_id)
-- 
2.25.1



* [PATCHv4 for-next 08/19] block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT}
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (6 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 07/19] block/rnbd: Kill destroy_device_cb Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 09/19] block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel Gioh Kim
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Leon Romanovsky, linux-rdma, Guoqing Jiang, Gioh Kim,
	Chaitanya Kulkarni, Jason Gunthorpe

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

They are defined with the same values and similar meanings, so let's keep
only RTRS_PERMIT_{WAIT,NOWAIT} and remove {NO_WAIT,WAIT}.

Also change the type of 'wait' from 'int' to 'enum wait_type' to make
it clear.
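
A minimal sketch of the idea, assuming a hypothetical permit pool: only
the enum values mirror the patch, while demo_pool, demo_result,
demo_get_permit and demo_get_iu are made up for illustration. Once every
layer takes the same enum type, the caller's choice is forwarded
unchanged and no int-to-enum translation is needed in between:

```c
#include <assert.h>

/* Values mirror the enum in the patch; everything else is hypothetical. */
enum wait_type {
	RTRS_PERMIT_NOWAIT = 0,
	RTRS_PERMIT_WAIT   = 1
};

enum demo_result {
	DEMO_OK,		/* a permit was available */
	DEMO_BUSY,		/* no permit and the caller must not wait */
	DEMO_WOULD_SLEEP	/* no permit and the caller is willing to wait */
};

struct demo_pool {
	int free_permits;
};

/*
 * Lowest layer: a real implementation would sleep when no permit is free
 * and wait == RTRS_PERMIT_WAIT; this sketch only reports that it would.
 */
static enum demo_result demo_get_permit(struct demo_pool *pool,
					enum wait_type wait)
{
	assert(pool->free_permits >= 0);
	if (pool->free_permits > 0) {
		pool->free_permits--;
		return DEMO_OK;
	}
	return wait == RTRS_PERMIT_WAIT ? DEMO_WOULD_SLEEP : DEMO_BUSY;
}

/* Upper layer: passes the enum through as is, with no "wait ? A : B". */
static enum demo_result demo_get_iu(struct demo_pool *pool,
				    enum wait_type wait)
{
	return demo_get_permit(pool, wait);
}
```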

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/block/rnbd/rnbd-clt.c          | 42 +++++++++++---------------
 drivers/infiniband/ulp/rtrs/rtrs-clt.c |  4 +--
 drivers/infiniband/ulp/rtrs/rtrs.h     |  6 ++--
 3 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 4e687ec88721..652b41cc4492 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -312,13 +312,11 @@ static void rnbd_rerun_all_if_idle(struct rnbd_clt_session *sess)
 
 static struct rtrs_permit *rnbd_get_permit(struct rnbd_clt_session *sess,
 					     enum rtrs_clt_con_type con_type,
-					     int wait)
+					     enum wait_type wait)
 {
 	struct rtrs_permit *permit;
 
-	permit = rtrs_clt_get_permit(sess->rtrs, con_type,
-				      wait ? RTRS_PERMIT_WAIT :
-				      RTRS_PERMIT_NOWAIT);
+	permit = rtrs_clt_get_permit(sess->rtrs, con_type, wait);
 	if (likely(permit))
 		/* We have a subtle rare case here, when all permits can be
 		 * consumed before busy counter increased.  This is safe,
@@ -344,7 +342,7 @@ static void rnbd_put_permit(struct rnbd_clt_session *sess,
 
 static struct rnbd_iu *rnbd_get_iu(struct rnbd_clt_session *sess,
 				     enum rtrs_clt_con_type con_type,
-				     int wait)
+				     enum wait_type wait)
 {
 	struct rnbd_iu *iu;
 	struct rtrs_permit *permit;
@@ -354,9 +352,7 @@ static struct rnbd_iu *rnbd_get_iu(struct rnbd_clt_session *sess,
 		return NULL;
 	}
 
-	permit = rnbd_get_permit(sess, con_type,
-				  wait ? RTRS_PERMIT_WAIT :
-				  RTRS_PERMIT_NOWAIT);
+	permit = rnbd_get_permit(sess, con_type, wait);
 	if (unlikely(!permit)) {
 		kfree(iu);
 		return NULL;
@@ -435,16 +431,11 @@ static void msg_conf(void *priv, int errno)
 	schedule_work(&iu->work);
 }
 
-enum wait_type {
-	NO_WAIT = 0,
-	WAIT    = 1
-};
-
 static int send_usr_msg(struct rtrs_clt *rtrs, int dir,
 			struct rnbd_iu *iu, struct kvec *vec,
 			size_t len, struct scatterlist *sg, unsigned int sg_len,
 			void (*conf)(struct work_struct *work),
-			int *errno, enum wait_type wait)
+			int *errno, int wait)
 {
 	int err;
 	struct rtrs_clt_req_ops req_ops;
@@ -476,7 +467,8 @@ static void msg_close_conf(struct work_struct *work)
 	rnbd_clt_put_dev(dev);
 }
 
-static int send_msg_close(struct rnbd_clt_dev *dev, u32 device_id, bool wait)
+static int send_msg_close(struct rnbd_clt_dev *dev, u32 device_id,
+			  enum wait_type wait)
 {
 	struct rnbd_clt_session *sess = dev->sess;
 	struct rnbd_msg_close msg;
@@ -530,7 +522,7 @@ static void msg_open_conf(struct work_struct *work)
 			 * If server thinks its fine, but we fail to process
 			 * then be nice and send a close to server.
 			 */
-			(void)send_msg_close(dev, device_id, NO_WAIT);
+			send_msg_close(dev, device_id, RTRS_PERMIT_NOWAIT);
 		}
 	}
 	kfree(rsp);
@@ -554,7 +546,7 @@ static void msg_sess_info_conf(struct work_struct *work)
 	rnbd_clt_put_sess(sess);
 }
 
-static int send_msg_open(struct rnbd_clt_dev *dev, bool wait)
+static int send_msg_open(struct rnbd_clt_dev *dev, enum wait_type wait)
 {
 	struct rnbd_clt_session *sess = dev->sess;
 	struct rnbd_msg_open_rsp *rsp;
@@ -601,7 +593,7 @@ static int send_msg_open(struct rnbd_clt_dev *dev, bool wait)
 	return err;
 }
 
-static int send_msg_sess_info(struct rnbd_clt_session *sess, bool wait)
+static int send_msg_sess_info(struct rnbd_clt_session *sess, enum wait_type wait)
 {
 	struct rnbd_msg_sess_info_rsp *rsp;
 	struct rnbd_msg_sess_info msg;
@@ -687,7 +679,7 @@ static void remap_devs(struct rnbd_clt_session *sess)
 	 * be asynchronous.
 	 */
 
-	err = send_msg_sess_info(sess, NO_WAIT);
+	err = send_msg_sess_info(sess, RTRS_PERMIT_NOWAIT);
 	if (err) {
 		pr_err("send_msg_sess_info(\"%s\"): %d\n", sess->sessname, err);
 		return;
@@ -711,7 +703,7 @@ static void remap_devs(struct rnbd_clt_session *sess)
 			continue;
 
 		rnbd_clt_info(dev, "session reconnected, remapping device\n");
-		err = send_msg_open(dev, NO_WAIT);
+		err = send_msg_open(dev, RTRS_PERMIT_NOWAIT);
 		if (err) {
 			rnbd_clt_err(dev, "send_msg_open(): %d\n", err);
 			break;
@@ -1242,7 +1234,7 @@ find_and_get_or_create_sess(const char *sessname,
 	if (err)
 		goto close_rtrs;
 
-	err = send_msg_sess_info(sess, WAIT);
+	err = send_msg_sess_info(sess, RTRS_PERMIT_WAIT);
 	if (err)
 		goto close_rtrs;
 
@@ -1525,7 +1517,7 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 		ret = -EEXIST;
 		goto put_dev;
 	}
-	ret = send_msg_open(dev, WAIT);
+	ret = send_msg_open(dev, RTRS_PERMIT_WAIT);
 	if (ret) {
 		rnbd_clt_err(dev,
 			      "map_device: failed, can't open remote device, err: %d\n",
@@ -1559,7 +1551,7 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 	return dev;
 
 send_close:
-	send_msg_close(dev, dev->device_id, WAIT);
+	send_msg_close(dev, dev->device_id, RTRS_PERMIT_WAIT);
 del_dev:
 	delete_dev(dev);
 put_dev:
@@ -1619,7 +1611,7 @@ int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
 	destroy_sysfs(dev, sysfs_self);
 	destroy_gen_disk(dev);
 	if (was_mapped && sess->rtrs)
-		send_msg_close(dev, dev->device_id, WAIT);
+		send_msg_close(dev, dev->device_id, RTRS_PERMIT_WAIT);
 
 	rnbd_clt_info(dev, "Device is unmapped\n");
 
@@ -1653,7 +1645,7 @@ int rnbd_clt_remap_device(struct rnbd_clt_dev *dev)
 	mutex_unlock(&dev->lock);
 	if (!err) {
 		rnbd_clt_info(dev, "Remapping device.\n");
-		err = send_msg_open(dev, WAIT);
+		err = send_msg_open(dev, RTRS_PERMIT_WAIT);
 		if (err)
 			rnbd_clt_err(dev, "remap_device: %d\n", err);
 	}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 0a08b4b742a3..7efd49bdc78c 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -103,11 +103,11 @@ static inline void __rtrs_put_permit(struct rtrs_clt *clt,
  *    up earlier.
  *
  * Context:
- *    Can sleep if @wait == RTRS_TAG_WAIT
+ *    Can sleep if @wait == RTRS_PERMIT_WAIT
  */
 struct rtrs_permit *rtrs_clt_get_permit(struct rtrs_clt *clt,
 					  enum rtrs_clt_con_type con_type,
-					  int can_wait)
+					  enum wait_type can_wait)
 {
 	struct rtrs_permit *permit;
 	DEFINE_WAIT(wait);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
index 8738e90e715a..2db1b5eb3ab0 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs.h
@@ -63,9 +63,9 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 
 void rtrs_clt_close(struct rtrs_clt *sess);
 
-enum {
+enum wait_type {
 	RTRS_PERMIT_NOWAIT = 0,
-	RTRS_PERMIT_WAIT   = 1,
+	RTRS_PERMIT_WAIT   = 1
 };
 
 /**
@@ -81,7 +81,7 @@ enum rtrs_clt_con_type {
 
 struct rtrs_permit *rtrs_clt_get_permit(struct rtrs_clt *sess,
 				    enum rtrs_clt_con_type con_type,
-				    int wait);
+				    enum wait_type wait);
 
 void rtrs_clt_put_permit(struct rtrs_clt *sess, struct rtrs_permit *permit);
 
-- 
2.25.1



* [PATCHv4 for-next 09/19] block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (7 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 08/19] block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT} Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 10/19] block/rnbd-srv: Remove force_close file after holding a lock Gioh Kim
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

We got the warning message below.
When the server tries to close a session by force, it locks the sysfs
interface and then locks the srv_sess lock.
The problem is that the client can send a close request at the same time.
On a close request, the server locks the srv_sess lock and then locks
sysfs to remove the sysfs interfaces.

The simplest way to prevent that situation is to use
mutex_trylock.
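
The lock-ordering problem and the trylock approach can be sketched in
user space with pthreads. This is an illustration under assumptions, not
the driver code: sysfs_lock and sess_lock are hypothetical stand-ins (in
the kernel the sysfs side is a kernfs active reference, not a mutex),
and the choice of which path uses trylock is an assumption of this
sketch. One path takes the two locks in one order with blocking locks;
the other path holds the first lock and only tries the second, backing
off instead of waiting, so the circular wait cannot form:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the sysfs protection and the session lock. */
static pthread_mutex_t sysfs_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sess_lock = PTHREAD_MUTEX_INITIALIZER;

/* Path A: close request from the client, order sess_lock -> sysfs_lock. */
static void demo_close_from_client(void)
{
	pthread_mutex_lock(&sess_lock);
	pthread_mutex_lock(&sysfs_lock);
	/* ... remove the sysfs entries of the session device ... */
	pthread_mutex_unlock(&sysfs_lock);
	pthread_mutex_unlock(&sess_lock);
}

/*
 * Path B: forced close triggered through sysfs, order
 * sysfs_lock -> sess_lock. Blocking on sess_lock here could deadlock
 * against path A, so this path only tries the lock and gives up when
 * it is already held.
 */
static int demo_force_close_from_sysfs(void)
{
	int ret = 0;
	int rc;

	pthread_mutex_lock(&sysfs_lock);
	rc = pthread_mutex_trylock(&sess_lock);
	assert(rc == 0 || rc == EBUSY);
	if (rc == EBUSY) {
		/* Someone else is already working on the session. */
		ret = -EBUSY;
	} else {
		/* ... close the session device ... */
		pthread_mutex_unlock(&sess_lock);
	}
	pthread_mutex_unlock(&sysfs_lock);
	return ret;
}
```

The trade-off is that the forced close may be skipped while another path
holds the session lock, which is acceptable when that other path is
closing the session anyway.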

[  234.153965] ======================================================
[  234.154093] WARNING: possible circular locking dependency detected
[  234.154219] 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Tainted: G           O
[  234.154381] ------------------------------------------------------
[  234.154531] kworker/1:1H/618 is trying to acquire lock:
[  234.154651] ffff8887a09db0a8 (kn->count#132){++++}, at: kernfs_remove_by_name_ns+0x40/0x80
[  234.154819]
               but task is already holding lock:
[  234.154965] ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.155132]
               which lock already depends on the new lock.

[  234.155311]
               the existing dependency chain (in reverse order) is:
[  234.155462]
               -> #1 (&srv_sess->lock){+.+.}:
[  234.155614]        __mutex_lock+0x134/0xcb0
[  234.155761]        rnbd_srv_sess_dev_force_close+0x36/0x50 [rnbd_server]
[  234.155889]        rnbd_srv_dev_session_force_close_store+0x69/0xc0 [rnbd_server]
[  234.156042]        kernfs_fop_write+0x13f/0x240
[  234.156162]        vfs_write+0xf3/0x280
[  234.156278]        ksys_write+0xba/0x150
[  234.156395]        do_syscall_64+0x62/0x270
[  234.156513]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  234.156632]
               -> #0 (kn->count#132){++++}:
[  234.156782]        __lock_acquire+0x129e/0x23a0
[  234.156900]        lock_acquire+0xf3/0x210
[  234.157043]        __kernfs_remove+0x42b/0x4c0
[  234.157161]        kernfs_remove_by_name_ns+0x40/0x80
[  234.157282]        remove_files+0x3f/0xa0
[  234.157399]        sysfs_remove_group+0x4a/0xb0
[  234.157519]        rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.157648]        rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.157775]        process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.157924]        __ib_process_cq+0x8c/0x100 [ib_core]
[  234.158709]        ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.158834]        process_one_work+0x4e5/0xaa0
[  234.158958]        worker_thread+0x65/0x5c0
[  234.159078]        kthread+0x1e0/0x200
[  234.159194]        ret_from_fork+0x24/0x30
[  234.159309]
               other info that might help us debug this:

[  234.159513]  Possible unsafe locking scenario:

[  234.159658]        CPU0                    CPU1
[  234.159775]        ----                    ----
[  234.159891]   lock(&srv_sess->lock);
[  234.160005]                                lock(kn->count#132);
[  234.160128]                                lock(&srv_sess->lock);
[  234.160250]   lock(kn->count#132);
[  234.160364]
                *** DEADLOCK ***

[  234.160536] 3 locks held by kworker/1:1H/618:
[  234.160677]  #0: ffff8883ca1ed528 ((wq_completion)ib-comp-wq){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.160840]  #1: ffff8883d2d5fe10 ((work_completion)(&cq->work)){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.161003]  #2: ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.161168]
               stack backtrace:
[  234.161312] CPU: 1 PID: 618 Comm: kworker/1:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
[  234.161490] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
[  234.161643] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[  234.161765] Call Trace:
[  234.161910]  dump_stack+0x96/0xe0
[  234.162028]  check_noncircular+0x29e/0x2e0
[  234.162148]  ? print_circular_bug+0x100/0x100
[  234.162267]  ? register_lock_class+0x1ad/0x8a0
[  234.162385]  ? __lock_acquire+0x68e/0x23a0
[  234.162505]  ? trace_event_raw_event_lock+0x190/0x190
[  234.162626]  __lock_acquire+0x129e/0x23a0
[  234.162746]  ? register_lock_class+0x8a0/0x8a0
[  234.162866]  lock_acquire+0xf3/0x210
[  234.162982]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163127]  __kernfs_remove+0x42b/0x4c0
[  234.163243]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163363]  ? kernfs_fop_readdir+0x3b0/0x3b0
[  234.163482]  ? strlen+0x1f/0x40
[  234.163596]  ? strcmp+0x30/0x50
[  234.163712]  kernfs_remove_by_name_ns+0x40/0x80
[  234.163832]  remove_files+0x3f/0xa0
[  234.163948]  sysfs_remove_group+0x4a/0xb0
[  234.164068]  rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.164196]  rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.164345]  ? _raw_spin_unlock_irqrestore+0x43/0x50
[  234.164466]  ? lockdep_hardirqs_on+0x1a8/0x290
[  234.164597]  ? mlx4_ib_poll_cq+0x927/0x1280 [mlx4_ib]
[  234.164732]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.164859]  process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.164982]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.165130]  __ib_process_cq+0x8c/0x100 [ib_core]
[  234.165279]  ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.165404]  process_one_work+0x4e5/0xaa0
[  234.165550]  ? pwq_dec_nr_in_flight+0x160/0x160
[  234.165675]  ? do_raw_spin_lock+0x119/0x1d0
[  234.165796]  worker_thread+0x65/0x5c0
[  234.165914]  ? process_one_work+0xaa0/0xaa0
[  234.166031]  kthread+0x1e0/0x200
[  234.166147]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  234.166268]  ret_from_fork+0x24/0x30
[  234.251591] rnbd_server L243: </dev/loop1@close_device_session>: Device closed
[  234.604221] rnbd_server L264: RTRS Session close_device_session disconnected

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
---
 drivers/block/rnbd/rnbd-srv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index a4fd9f167c18..1549a6361630 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -334,7 +334,9 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev)
 	struct rnbd_srv_session	*sess = sess_dev->sess;
 
 	sess_dev->keep_id = true;
-	mutex_lock(&sess->lock);
+	/* It is already started to close by client's close message. */
+	if (!mutex_trylock(&sess->lock))
+		return;
 	rnbd_srv_destroy_dev_session_sysfs(sess_dev);
 	mutex_unlock(&sess->lock);
 }
-- 
2.25.1


* [PATCHv4 for-next 10/19] block/rnbd-srv: Remove force_close file after holding a lock
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (8 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 09/19] block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 11/19] block/rnbd-clt: Improve find_or_create_sess() return check Gioh Kim
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

We changed rnbd_srv_sess_dev_force_close to use a try-lock
because rnbd_srv_sess_dev_force_close and process_msg_close
can deadlock against each other.

Now rnbd_srv_sess_dev_force_close does nothing
if it fails to get the lock. So the removal of the force_close
file must be moved to after the lock is taken. Otherwise the
force_close file is removed while the other files are left in place.
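
The ordering described above can be sketched in userspace as follows.
This is only an illustration under stated assumptions: struct sess_dev,
its two boolean "file" fields and the atomic_flag-based try-lock are
hypothetical stand-ins, not the rnbd-srv code, which uses
mutex_trylock() and sysfs_remove_file_self().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a session device and its sysfs entries. */
struct sess_dev {
	atomic_flag sess_lock;   /* plays the role of sess->lock */
	bool force_close_file;   /* true while the force_close file exists */
	bool other_files;        /* true while the remaining files exist */
};

/* Returns true if the lock was taken, false if it is already held. */
static bool sess_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void sess_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Bail out when the lock is contended, and remove the force_close
 * file only once the lock is held, so that either every file is
 * removed or none is. Returns true if the teardown ran.
 */
static bool force_close(struct sess_dev *d)
{
	if (!sess_trylock(&d->sess_lock))
		return false;            /* a close is already in progress */

	d->force_close_file = false;     /* remove "self" first ... */
	d->other_files = false;          /* ... then the rest of the group */
	sess_unlock(&d->sess_lock);
	return true;
}
```

With the lock already held by another path, force_close() returns
false and leaves both fields untouched, which is the state the old
ordering could not guarantee.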

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
---
 drivers/block/rnbd/rnbd-srv-sysfs.c | 5 +----
 drivers/block/rnbd/rnbd-srv.c       | 5 ++++-
 drivers/block/rnbd/rnbd-srv.h       | 3 ++-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-srv-sysfs.c b/drivers/block/rnbd/rnbd-srv-sysfs.c
index 05ffe488ddc6..acf5fced11ef 100644
--- a/drivers/block/rnbd/rnbd-srv-sysfs.c
+++ b/drivers/block/rnbd/rnbd-srv-sysfs.c
@@ -147,10 +147,7 @@ static ssize_t rnbd_srv_dev_session_force_close_store(struct kobject *kobj,
 	}
 
 	rnbd_srv_info(sess_dev, "force close requested\n");
-
-	/* first remove sysfs itself to avoid deadlock */
-	sysfs_remove_file_self(&sess_dev->kobj, &attr->attr);
-	rnbd_srv_sess_dev_force_close(sess_dev);
+	rnbd_srv_sess_dev_force_close(sess_dev, attr);
 
 	return count;
 }
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index 1549a6361630..a9bb414f7442 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -329,7 +329,8 @@ static int rnbd_srv_link_ev(struct rtrs_srv *rtrs,
 	}
 }
 
-void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev)
+void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev,
+				   struct kobj_attribute *attr)
 {
 	struct rnbd_srv_session	*sess = sess_dev->sess;
 
@@ -337,6 +338,8 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev)
 	/* It is already started to close by client's close message. */
 	if (!mutex_trylock(&sess->lock))
 		return;
+	/* first remove sysfs itself to avoid deadlock */
+	sysfs_remove_file_self(&sess_dev->kobj, &attr->attr);
 	rnbd_srv_destroy_dev_session_sysfs(sess_dev);
 	mutex_unlock(&sess->lock);
 }
diff --git a/drivers/block/rnbd/rnbd-srv.h b/drivers/block/rnbd/rnbd-srv.h
index b157371c25ed..98ddc31eb408 100644
--- a/drivers/block/rnbd/rnbd-srv.h
+++ b/drivers/block/rnbd/rnbd-srv.h
@@ -64,7 +64,8 @@ struct rnbd_srv_sess_dev {
 	enum rnbd_access_mode		access_mode;
 };
 
-void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev);
+void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev,
+				   struct kobj_attribute *attr);
 /* rnbd-srv-sysfs.c */
 
 int rnbd_srv_create_dev_sysfs(struct rnbd_srv_dev *dev,
-- 
2.25.1


* [PATCHv4 for-next 11/19] block/rnbd-clt: Improve find_or_create_sess() return check
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (9 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 10/19] block/rnbd-srv: Remove force_close file after holding a lock Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 12/19] block/rnbd-clt: Fix missing a memory free when unloading the module Gioh Kim
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Tom Rix, Gioh Kim

From: Tom Rix <trix@redhat.com>

clang static analysis reports this problem

rnbd-clt.c:1212:11: warning: Branch condition evaluates to a
  garbage value
        else if (!first)
                 ^~~~~~

This is triggered in the find_and_get_or_create_sess() call
because the variable first is not initialized and the
earlier check is specifically for

	if (sess == ERR_PTR(-ENOMEM))

This is a false positive.

But the if-check can be reduced by initializing first to
false and then returning if the call to find_or_create_sess()
does not set it to true.  When it remains false, either
sess will be valid or not.  The not case is caught by
find_and_get_or_create_sess()'s caller rnbd_clt_map_device():

	sess = find_and_get_or_create_sess(...);
	if (IS_ERR(sess))
		return ERR_CAST(sess);

Since find_and_get_or_create_sess() initializes first to false,
setting it to false in find_or_create_sess() is not needed.
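
The resulting control flow can be sketched in plain C as below. This is
a simplified model, not the driver code: the one-slot session cache, the
fail_alloc flag, the did_setup out-parameter and the use of NULL in
place of ERR_PTR(-ENOMEM) are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct session { int id; };

/* Hypothetical one-slot "session list". */
static struct session *cached;
static struct session slot;

/* Caller initializes *first to false; only the create path sets it. */
static struct session *find_or_create(bool fail_alloc, bool *first)
{
	if (cached)
		return cached;
	if (fail_alloc)
		return NULL;     /* stands in for ERR_PTR(-ENOMEM) */
	cached = &slot;
	*first = true;
	return cached;
}

static struct session *find_and_get_or_create(bool fail_alloc,
					      bool *did_setup)
{
	bool first = false;
	struct session *s = find_or_create(fail_alloc, &first);

	*did_setup = false;
	if (!first)
		return s;        /* existing session, or error passed up */
	*did_setup = true;       /* first-time setup would run here */
	return s;
}
```

A single !first test covers both the "already exists" and the
"allocation failed" cases, because neither of them sets first to true.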

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
---
 drivers/block/rnbd/rnbd-clt.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 652b41cc4492..9b44aac680d5 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -910,6 +910,7 @@ static struct rnbd_clt_session *__find_and_get_sess(const char *sessname)
 	return NULL;
 }
 
+/* caller is responsible for initializing 'first' to false */
 static struct
 rnbd_clt_session *find_or_create_sess(const char *sessname, bool *first)
 {
@@ -925,8 +926,7 @@ rnbd_clt_session *find_or_create_sess(const char *sessname, bool *first)
 		}
 		list_add(&sess->list, &sess_list);
 		*first = true;
-	} else
-		*first = false;
+	}
 	mutex_unlock(&sess_lock);
 
 	return sess;
@@ -1194,13 +1194,11 @@ find_and_get_or_create_sess(const char *sessname,
 	struct rnbd_clt_session *sess;
 	struct rtrs_attrs attrs;
 	int err;
-	bool first;
+	bool first = false;
 	struct rtrs_clt_ops rtrs_ops;
 
 	sess = find_or_create_sess(sessname, &first);
-	if (sess == ERR_PTR(-ENOMEM))
-		return ERR_PTR(-ENOMEM);
-	else if (!first)
+	if (!first)
 		return sess;
 
 	if (!path_cnt) {
-- 
2.25.1


* [PATCHv4 for-next 12/19] block/rnbd-clt: Fix missing a memory free when unloading the module
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (10 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 11/19] block/rnbd-clt: Improve find_or_create_sess() return check Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization Gioh Kim
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

When the rnbd-clt module is unloaded, the memory holding the
filename of the symbolic link to /sys/block/rnbdX is not freed.

kmemleak reports the leak as follows.

unreferenced object 0xffff9f1a83d3c740 (size 16):
  comm "bash", pid 736, jiffies 4295179665 (age 9841.310s)
  hex dump (first 16 bytes):
    21 64 65 76 21 6e 75 6c 6c 62 30 40 62 6c 61 00  !dev!nullb0@bla.
  backtrace:
    [<0000000039f0c55e>] 0xffffffffc0456c24
    [<000000001aab9513>] kernfs_fop_write+0xcf/0x1c0
    [<00000000db5aa4b3>] vfs_write+0xdb/0x1d0
    [<000000007a2e2207>] ksys_write+0x65/0xe0
    [<00000000055e280a>] do_syscall_64+0x50/0x1b0
    [<00000000c2b51831>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
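
The shape of the fix in the diff below can be modelled in userspace as
follows. This is a rough sketch, not the driver code: struct dev, the
link_present field and the got_module_ref parameter are hypothetical
stand-ins for the symlink state and the result of try_module_get().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device and its symlink state. */
struct dev {
	char *symlink_name;   /* heap-allocated name of the symlink */
	bool link_present;    /* true while the symlink exists */
};

static void remove_dev_symlink(struct dev *d, bool got_module_ref)
{
	if (!d->symlink_name)
		return;

	/* Only the link removal depends on holding a module reference. */
	if (got_module_ref)
		d->link_present = false;

	free(d->symlink_name);    /* freed on both paths */
	d->symlink_name = NULL;   /* makes a second call a no-op */
}
```

The name is released even when the module reference cannot be taken,
which is the module-unload case the kmemleak report above points at.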

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index 58c2cc0725b6..49015f428e67 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -432,10 +432,14 @@ void rnbd_clt_remove_dev_symlink(struct rnbd_clt_dev *dev)
 	 * i.e. rnbd_clt_unmap_dev_store() leading to a sysfs warning because
 	 * of sysfs link already was removed already.
 	 */
-	if (dev->blk_symlink_name && try_module_get(THIS_MODULE)) {
-		sysfs_remove_link(rnbd_devs_kobj, dev->blk_symlink_name);
+	if (dev->blk_symlink_name) {
+		if (try_module_get(THIS_MODULE)) {
+			sysfs_remove_link(rnbd_devs_kobj, dev->blk_symlink_name);
+			module_put(THIS_MODULE);
+		}
+		/* It should be freed always. */
 		kfree(dev->blk_symlink_name);
-		module_put(THIS_MODULE);
+		dev->blk_symlink_name = NULL;
 	}
 }
 
-- 
2.25.1


* [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (11 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 12/19] block/rnbd-clt: Fix missing a memory free when unloading the module Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-18  8:36   ` Leon Romanovsky
  2021-04-14 12:23 ` [PATCHv4 for-next 14/19] Documentation/ABI/rnbd-clt: Add description for nr_poll_queues Gioh Kim
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Leon Romanovsky, linux-rdma, Gioh Kim, Jason Gunthorpe

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

RNBD can create two sets of queues, one for irq-mode and one for
poll-mode. For example, on a 4-CPU system 8 request queues are created,
4 for irq-mode and 4 for poll-mode.
If an IO has the HIPRI flag, the block layer calls the .poll function
of RNBD, and the IO is sent to a poll-mode queue.
Add an optional nr_poll_queues argument to the map_device interface.

To support polling in RNBD, the RTRS client creates connections
for both irq-mode and direct-poll-mode.

For example, on a 4-CPU system it previously created 5 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq

After this patch, it can create 9 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
con[5:8] => DIRECT-POLL cq
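
The connection layout above can be expressed as a small index
calculation. The sketch below is illustrative only: enum con_kind and
both helper functions are hypothetical names, and the arithmetic simply
restates the con[0], con[1:4], con[5:8] split for arbitrary counts.

```c
#include <assert.h>
#include <stddef.h>

enum con_kind { CON_USER_MSG, CON_SOFTIRQ, CON_DIRECT_POLL };

/* One connection for user messages, one per CPU, one per poll queue. */
static size_t total_connections(size_t nr_cpus, size_t nr_poll_queues)
{
	return 1 + nr_cpus + nr_poll_queues;
}

/* Classify connection index cid according to the layout above. */
static enum con_kind con_kind_of(size_t cid, size_t nr_cpus,
				 size_t nr_poll_queues)
{
	assert(cid < total_connections(nr_cpus, nr_poll_queues));

	if (cid == 0)
		return CON_USER_MSG;      /* con[0] */
	if (cid <= nr_cpus)
		return CON_SOFTIRQ;       /* con[1:nr_cpus] */
	return CON_DIRECT_POLL;           /* remaining connections */
}
```

For 4 CPUs this gives 5 connections without polling and 9 with
nr_poll_queues=4, matching the two examples in this message.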

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c    | 56 +++++++++++++----
 drivers/block/rnbd/rnbd-clt.c          | 85 +++++++++++++++++++++++---
 drivers/block/rnbd/rnbd-clt.h          |  5 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 62 +++++++++++++++----
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |  1 +
 drivers/infiniband/ulp/rtrs/rtrs.h     |  3 +-
 6 files changed, 178 insertions(+), 34 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index 49015f428e67..bd111ebceb75 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -34,6 +34,7 @@ enum {
 	RNBD_OPT_DEV_PATH	= 1 << 2,
 	RNBD_OPT_ACCESS_MODE	= 1 << 3,
 	RNBD_OPT_SESSNAME	= 1 << 6,
+	RNBD_OPT_NR_POLL_QUEUES	= 1 << 7,
 };
 
 static const unsigned int rnbd_opt_mandatory[] = {
@@ -42,12 +43,13 @@ static const unsigned int rnbd_opt_mandatory[] = {
 };
 
 static const match_table_t rnbd_opt_tokens = {
-	{RNBD_OPT_PATH,		"path=%s"	},
-	{RNBD_OPT_DEV_PATH,	"device_path=%s"},
-	{RNBD_OPT_DEST_PORT,	"dest_port=%d"  },
-	{RNBD_OPT_ACCESS_MODE,	"access_mode=%s"},
-	{RNBD_OPT_SESSNAME,	"sessname=%s"	},
-	{RNBD_OPT_ERR,		NULL		},
+	{RNBD_OPT_PATH,			"path=%s"		},
+	{RNBD_OPT_DEV_PATH,		"device_path=%s"	},
+	{RNBD_OPT_DEST_PORT,		"dest_port=%d"		},
+	{RNBD_OPT_ACCESS_MODE,		"access_mode=%s"	},
+	{RNBD_OPT_SESSNAME,		"sessname=%s"		},
+	{RNBD_OPT_NR_POLL_QUEUES,	"nr_poll_queues=%d"	},
+	{RNBD_OPT_ERR,			NULL			},
 };
 
 struct rnbd_map_options {
@@ -57,6 +59,7 @@ struct rnbd_map_options {
 	char *pathname;
 	u16 *dest_port;
 	enum rnbd_access_mode *access_mode;
+	u32 *nr_poll_queues;
 };
 
 static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
@@ -68,7 +71,7 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
 	int opt_mask = 0;
 	int token;
 	int ret = -EINVAL;
-	int i, dest_port;
+	int i, dest_port, nr_poll_queues;
 	int p_cnt = 0;
 
 	options = kstrdup(buf, GFP_KERNEL);
@@ -178,6 +181,19 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
 			kfree(p);
 			break;
 
+		case RNBD_OPT_NR_POLL_QUEUES:
+			if (match_int(args, &nr_poll_queues) || nr_poll_queues < -1 ||
+			    nr_poll_queues > (int)nr_cpu_ids) {
+				pr_err("bad nr_poll_queues parameter '%d'\n",
+				       nr_poll_queues);
+				ret = -EINVAL;
+				goto out;
+			}
+			if (nr_poll_queues == -1)
+				nr_poll_queues = nr_cpu_ids;
+			*opt->nr_poll_queues = nr_poll_queues;
+			break;
+
 		default:
 			pr_err("map_device: Unknown parameter or missing value '%s'\n",
 			       p);
@@ -227,6 +243,20 @@ static ssize_t state_show(struct kobject *kobj,
 
 static struct kobj_attribute rnbd_clt_state_attr = __ATTR_RO(state);
 
+static ssize_t nr_poll_queues_show(struct kobject *kobj,
+				   struct kobj_attribute *attr, char *page)
+{
+	struct rnbd_clt_dev *dev;
+
+	dev = container_of(kobj, struct rnbd_clt_dev, kobj);
+
+	return snprintf(page, PAGE_SIZE, "%d\n",
+			dev->nr_poll_queues);
+}
+
+static struct kobj_attribute rnbd_clt_nr_poll_queues =
+	__ATTR_RO(nr_poll_queues);
+
 static ssize_t mapping_path_show(struct kobject *kobj,
 				 struct kobj_attribute *attr, char *page)
 {
@@ -421,6 +451,7 @@ static struct attribute *rnbd_dev_attrs[] = {
 	&rnbd_clt_state_attr.attr,
 	&rnbd_clt_session_attr.attr,
 	&rnbd_clt_access_mode.attr,
+	&rnbd_clt_nr_poll_queues.attr,
 	NULL,
 };
 
@@ -469,7 +500,7 @@ static ssize_t rnbd_clt_map_device_show(struct kobject *kobj,
 					 char *page)
 {
 	return scnprintf(page, PAGE_SIZE,
-			 "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
+			 "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>] [nr_poll_queues=<number of queues>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
 			 attr->attr.name);
 }
 
@@ -541,6 +572,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
 	char sessname[NAME_MAX];
 	enum rnbd_access_mode access_mode = RNBD_ACCESS_RW;
 	u16 port_nr = RTRS_PORT;
+	u32 nr_poll_queues = 0;
 
 	struct sockaddr_storage *addrs;
 	struct rtrs_addr paths[6];
@@ -552,6 +584,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
 	opt.pathname = pathname;
 	opt.dest_port = &port_nr;
 	opt.access_mode = &access_mode;
+	opt.nr_poll_queues = &nr_poll_queues;
 	addrs = kcalloc(ARRAY_SIZE(paths) * 2, sizeof(*addrs), GFP_KERNEL);
 	if (!addrs)
 		return -ENOMEM;
@@ -565,12 +598,13 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
 	if (ret)
 		goto out;
 
-	pr_info("Mapping device %s on session %s, (access_mode: %s)\n",
+	pr_info("Mapping device %s on session %s, (access_mode: %s, nr_poll_queues: %d)\n",
 		pathname, sessname,
-		rnbd_access_mode_str(access_mode));
+		rnbd_access_mode_str(access_mode),
+		nr_poll_queues);
 
 	dev = rnbd_clt_map_device(sessname, paths, path_cnt, port_nr, pathname,
-				  access_mode);
+				  access_mode, nr_poll_queues);
 	if (IS_ERR(dev)) {
 		ret = PTR_ERR(dev);
 		goto out;
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 9b44aac680d5..63719ec04d58 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1165,9 +1165,54 @@ static blk_status_t rnbd_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+static int rnbd_rdma_poll(struct blk_mq_hw_ctx *hctx)
+{
+	struct rnbd_queue *q = hctx->driver_data;
+	struct rnbd_clt_dev *dev = q->dev;
+	int cnt;
+
+	cnt = rtrs_clt_rdma_cq_direct(dev->sess->rtrs, hctx->queue_num);
+	return cnt;
+}
+
+static int rnbd_rdma_map_queues(struct blk_mq_tag_set *set)
+{
+	struct rnbd_clt_session *sess = set->driver_data;
+
+	/* shared read/write queues */
+	set->map[HCTX_TYPE_DEFAULT].nr_queues = num_online_cpus();
+	set->map[HCTX_TYPE_DEFAULT].queue_offset = 0;
+	set->map[HCTX_TYPE_READ].nr_queues = num_online_cpus();
+	set->map[HCTX_TYPE_READ].queue_offset = 0;
+	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
+	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);
+
+	if (sess->nr_poll_queues) {
+		/* dedicated queue for poll */
+		set->map[HCTX_TYPE_POLL].nr_queues = sess->nr_poll_queues;
+		set->map[HCTX_TYPE_POLL].queue_offset = set->map[HCTX_TYPE_READ].queue_offset +
+			set->map[HCTX_TYPE_READ].nr_queues;
+		blk_mq_map_queues(&set->map[HCTX_TYPE_POLL]);
+		pr_info("[session=%s] mapped %d/%d/%d default/read/poll queues.\n",
+			sess->sessname,
+			set->map[HCTX_TYPE_DEFAULT].nr_queues,
+			set->map[HCTX_TYPE_READ].nr_queues,
+			set->map[HCTX_TYPE_POLL].nr_queues);
+	} else {
+		pr_info("[session=%s] mapped %d/%d default/read queues.\n",
+			sess->sessname,
+			set->map[HCTX_TYPE_DEFAULT].nr_queues,
+			set->map[HCTX_TYPE_READ].nr_queues);
+	}
+
+	return 0;
+}
+
 static struct blk_mq_ops rnbd_mq_ops = {
 	.queue_rq	= rnbd_queue_rq,
 	.complete	= rnbd_softirq_done_fn,
+	.map_queues     = rnbd_rdma_map_queues,
+	.poll           = rnbd_rdma_poll,
 };
 
 static int setup_mq_tags(struct rnbd_clt_session *sess)
@@ -1181,7 +1226,15 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
 	tag_set->flags		= BLK_MQ_F_SHOULD_MERGE |
 				  BLK_MQ_F_TAG_QUEUE_SHARED;
 	tag_set->cmd_size	= sizeof(struct rnbd_iu) + RNBD_RDMA_SGL_SIZE;
-	tag_set->nr_hw_queues	= num_online_cpus();
+
+	/* for HCTX_TYPE_DEFAULT, HCTX_TYPE_READ, HCTX_TYPE_POLL */
+	tag_set->nr_maps        = sess->nr_poll_queues ? HCTX_MAX_TYPES : 2;
+	/*
+	 * HCTX_TYPE_DEFAULT and HCTX_TYPE_READ share one set of queues
+	 * others are for HCTX_TYPE_POLL
+	 */
+	tag_set->nr_hw_queues	= num_online_cpus() + sess->nr_poll_queues;
+	tag_set->driver_data    = sess;
 
 	return blk_mq_alloc_tag_set(tag_set);
 }
@@ -1189,7 +1242,7 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
 static struct rnbd_clt_session *
 find_and_get_or_create_sess(const char *sessname,
 			    const struct rtrs_addr *paths,
-			    size_t path_cnt, u16 port_nr)
+			    size_t path_cnt, u16 port_nr, u32 nr_poll_queues)
 {
 	struct rnbd_clt_session *sess;
 	struct rtrs_attrs attrs;
@@ -1198,6 +1251,17 @@ find_and_get_or_create_sess(const char *sessname,
 	struct rtrs_clt_ops rtrs_ops;
 
 	sess = find_or_create_sess(sessname, &first);
+	if (sess == ERR_PTR(-ENOMEM))
+		return ERR_PTR(-ENOMEM);
+	else if ((nr_poll_queues && !first) ||  (!nr_poll_queues && sess->nr_poll_queues)) {
+		/*
+		 * A device MUST have its own session to use the polling-mode.
+		 * It must fail to map new device with the same session.
+		 */
+		err = -EINVAL;
+		goto put_sess;
+	}
+
 	if (!first)
 		return sess;
 
@@ -1219,7 +1283,7 @@ find_and_get_or_create_sess(const char *sessname,
 				   0, /* Do not use pdu of rtrs */
 				   RECONNECT_DELAY, BMAX_SEGMENTS,
 				   BLK_MAX_SEGMENT_SIZE,
-				   MAX_RECONNECTS);
+				   MAX_RECONNECTS, nr_poll_queues);
 	if (IS_ERR(sess->rtrs)) {
 		err = PTR_ERR(sess->rtrs);
 		goto wake_up_and_put;
@@ -1227,6 +1291,7 @@ find_and_get_or_create_sess(const char *sessname,
 	rtrs_clt_query(sess->rtrs, &attrs);
 	sess->max_io_size = attrs.max_io_size;
 	sess->queue_depth = attrs.queue_depth;
+	sess->nr_poll_queues = nr_poll_queues;
 
 	err = setup_mq_tags(sess);
 	if (err)
@@ -1370,7 +1435,8 @@ static int rnbd_client_setup_device(struct rnbd_clt_dev *dev)
 
 static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
 				      enum rnbd_access_mode access_mode,
-				      const char *pathname)
+				      const char *pathname,
+				      u32 nr_poll_queues)
 {
 	struct rnbd_clt_dev *dev;
 	int ret;
@@ -1379,7 +1445,8 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
 	if (!dev)
 		return ERR_PTR(-ENOMEM);
 
-	dev->hw_queues = kcalloc(nr_cpu_ids, sizeof(*dev->hw_queues),
+	dev->hw_queues = kcalloc(nr_cpu_ids /* softirq */ + nr_poll_queues /* poll */,
+				 sizeof(*dev->hw_queues),
 				 GFP_KERNEL);
 	if (!dev->hw_queues) {
 		ret = -ENOMEM;
@@ -1405,6 +1472,7 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
 	dev->clt_device_id	= ret;
 	dev->sess		= sess;
 	dev->access_mode	= access_mode;
+	dev->nr_poll_queues	= nr_poll_queues;
 	mutex_init(&dev->lock);
 	refcount_set(&dev->refcount, 1);
 	dev->dev_state = DEV_STATE_INIT;
@@ -1491,7 +1559,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 					   struct rtrs_addr *paths,
 					   size_t path_cnt, u16 port_nr,
 					   const char *pathname,
-					   enum rnbd_access_mode access_mode)
+					   enum rnbd_access_mode access_mode,
+					   u32 nr_poll_queues)
 {
 	struct rnbd_clt_session *sess;
 	struct rnbd_clt_dev *dev;
@@ -1500,11 +1569,11 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 	if (unlikely(exists_devpath(pathname, sessname)))
 		return ERR_PTR(-EEXIST);
 
-	sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr);
+	sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr, nr_poll_queues);
 	if (IS_ERR(sess))
 		return ERR_CAST(sess);
 
-	dev = init_dev(sess, access_mode, pathname);
+	dev = init_dev(sess, access_mode, pathname, nr_poll_queues);
 	if (IS_ERR(dev)) {
 		pr_err("map_device: failed to map device '%s' from session %s, can't initialize device, err: %ld\n",
 		       pathname, sess->sessname, PTR_ERR(dev));
diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h
index 714d426b449b..451e7383738f 100644
--- a/drivers/block/rnbd/rnbd-clt.h
+++ b/drivers/block/rnbd/rnbd-clt.h
@@ -90,6 +90,7 @@ struct rnbd_clt_session {
 	int			queue_depth;
 	u32			max_io_size;
 	struct blk_mq_tag_set	tag_set;
+	u32			nr_poll_queues;
 	struct mutex		lock; /* protects state and devs_list */
 	struct list_head        devs_list; /* list of struct rnbd_clt_dev */
 	refcount_t		refcount;
@@ -118,6 +119,7 @@ struct rnbd_clt_dev {
 	enum rnbd_clt_dev_state	dev_state;
 	char			*pathname;
 	enum rnbd_access_mode	access_mode;
+	u32			nr_poll_queues;
 	bool			read_only;
 	bool			rotational;
 	bool			wc;
@@ -147,7 +149,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
 					   struct rtrs_addr *paths,
 					   size_t path_cnt, u16 port_nr,
 					   const char *pathname,
-					   enum rnbd_access_mode access_mode);
+					   enum rnbd_access_mode access_mode,
+					   u32 nr_poll_queues);
 int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
 			   const struct attribute *sysfs_self);
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 7efd49bdc78c..467d135a82cf 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -174,7 +174,7 @@ struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
 	int id = 0;
 
 	if (likely(permit->con_type == RTRS_IO_CON))
-		id = (permit->cpu_id % (sess->s.con_num - 1)) + 1;
+		id = (permit->cpu_id % (sess->s.irq_con_num - 1)) + 1;
 
 	return to_clt_con(sess->s.con[id]);
 }
@@ -1400,23 +1400,29 @@ static void rtrs_clt_close_work(struct work_struct *work);
 static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 					 const struct rtrs_addr *path,
 					 size_t con_num, u16 max_segments,
-					 size_t max_segment_size)
+					 size_t max_segment_size, u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess;
 	int err = -ENOMEM;
 	int cpu;
+	size_t total_con;
 
 	sess = kzalloc(sizeof(*sess), GFP_KERNEL);
 	if (!sess)
 		goto err;
 
-	/* Extra connection for user messages */
-	con_num += 1;
-
-	sess->s.con = kcalloc(con_num, sizeof(*sess->s.con), GFP_KERNEL);
+	/*
+	 * irqmode and poll
+	 * +1: Extra connection for user messages
+	 */
+	total_con = con_num + nr_poll_queues + 1;
+	sess->s.con = kcalloc(total_con, sizeof(*sess->s.con), GFP_KERNEL);
 	if (!sess->s.con)
 		goto err_free_sess;
 
+	sess->s.con_num = total_con;
+	sess->s.irq_con_num = con_num + 1;
+
 	sess->stats = kzalloc(sizeof(*sess->stats), GFP_KERNEL);
 	if (!sess->stats)
 		goto err_free_con;
@@ -1435,7 +1441,6 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 		memcpy(&sess->s.src_addr, path->src,
 		       rdma_addr_size((struct sockaddr *)path->src));
 	strlcpy(sess->s.sessname, clt->sessname, sizeof(sess->s.sessname));
-	sess->s.con_num = con_num;
 	sess->clt = clt;
 	sess->max_pages_per_mr = max_segments * max_segment_size >> 12;
 	init_waitqueue_head(&sess->state_wq);
@@ -1576,9 +1581,14 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 	}
 	cq_size = max_send_wr + max_recv_wr;
 	cq_vector = con->cpu % sess->s.dev->ib_dev->num_comp_vectors;
-	err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
-				 cq_vector, cq_size, max_send_wr,
-				 max_recv_wr, IB_POLL_SOFTIRQ);
+	if (con->c.cid >= sess->s.irq_con_num)
+		err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
+					cq_vector, cq_size, max_send_wr,
+					max_recv_wr, IB_POLL_DIRECT);
+	else
+		err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
+					cq_vector, cq_size, max_send_wr,
+					max_recv_wr, IB_POLL_SOFTIRQ);
 	/*
 	 * In case of error we do not bother to clean previous allocations,
 	 * since destroy_con_cq_qp() must be called.
@@ -2631,6 +2641,7 @@ static void free_clt(struct rtrs_clt *clt)
  * @max_segment_size: Max. size of one segment
  * @max_reconnect_attempts: Number of times to reconnect on error before giving
  *			    up, 0 for * disabled, -1 for forever
+ * @nr_poll_queues: number of polling mode connection using IB_POLL_DIRECT flag
  *
  * Starts session establishment with the rtrs_server. The function can block
  * up to ~2000ms before it returns.
@@ -2644,7 +2655,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
 				 u16 max_segments,
 				 size_t max_segment_size,
-				 s16 max_reconnect_attempts)
+				 s16 max_reconnect_attempts, u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess, *tmp;
 	struct rtrs_clt *clt;
@@ -2662,7 +2673,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 		struct rtrs_clt_sess *sess;
 
 		sess = alloc_sess(clt, &paths[i], nr_cpu_ids,
-				  max_segments, max_segment_size);
+				  max_segments, max_segment_size, nr_poll_queues);
 		if (IS_ERR(sess)) {
 			err = PTR_ERR(sess);
 			goto close_all_sess;
@@ -2887,6 +2898,31 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
 }
 EXPORT_SYMBOL(rtrs_clt_request);
 
+int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
+{
+	int cnt;
+	struct rtrs_con *con;
+	struct rtrs_clt_sess *sess;
+	struct path_it it;
+
+	rcu_read_lock();
+	for (path_it_init(&it, clt);
+	     (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
+		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
+			continue;
+
+		con = sess->s.con[index + 1];
+		cnt = ib_process_cq_direct(con->cq, -1);
+		if (likely(cnt))
+			break;
+	}
+	path_it_deinit(&it);
+	rcu_read_unlock();
+
+	return cnt;
+}
+EXPORT_SYMBOL(rtrs_clt_rdma_cq_direct);
+
 /**
  * rtrs_clt_query() - queries RTRS session attributes
  *@clt: session pointer
@@ -2916,7 +2952,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
 	int err;
 
 	sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments,
-			  clt->max_segment_size);
+			  clt->max_segment_size, 0);
 	if (IS_ERR(sess))
 		return PTR_ERR(sess);
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index 8caad0a2322b..00eb45053339 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -101,6 +101,7 @@ struct rtrs_sess {
 	uuid_t			uuid;
 	struct rtrs_con	**con;
 	unsigned int		con_num;
+	unsigned int		irq_con_num;
 	unsigned int		recon_cnt;
 	struct rtrs_ib_dev	*dev;
 	int			dev_ref;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
index 2db1b5eb3ab0..f891fbe7abe6 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs.h
@@ -59,7 +59,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
 				 u16 max_segments,
 				 size_t max_segment_size,
-				 s16 max_reconnect_attempts);
+				 s16 max_reconnect_attempts, u32 nr_poll_queues);
 
 void rtrs_clt_close(struct rtrs_clt *sess);
 
@@ -103,6 +103,7 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
 		     struct rtrs_clt *sess, struct rtrs_permit *permit,
 		     const struct kvec *vec, size_t nr, size_t len,
 		     struct scatterlist *sg, unsigned int sg_cnt);
+int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index);
 
 /**
  * rtrs_attrs - RTRS session attributes
-- 
2.25.1



* [PATCHv4 for-next 14/19] Documentation/ABI/rnbd-clt: Add description for nr_poll_queues
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (12 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 15/19] block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Gioh Kim
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Jack Wang

Describe how to set nr_poll_queues and enable polling.
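
For illustration only, the accepted values can be sketched in userspace C.
The range check follows the parsing code in the rnbd-clt polling patch of
this series (values from -1 up to the number of CPUs, with -1 meaning "one
poll queue per CPU"). The helper name normalize_nr_poll_queues and the
nr_cpus parameter are hypothetical and are not part of the driver.

```c
#include <errno.h>

/*
 * Hypothetical helper, not driver code: map the user-supplied
 * nr_poll_queues value to the number of poll-mode queues to create.
 */
static int normalize_nr_poll_queues(int requested, int nr_cpus)
{
	/* Anything below -1 or above the CPU count is rejected. */
	if (requested < -1 || requested > nr_cpus)
		return -EINVAL;

	/* -1 is shorthand for "as many poll queues as CPUs". */
	if (requested == -1)
		return nr_cpus;

	/* 0 keeps the default: irq-mode queues only. */
	return requested;
}
```

On a 4-CPU system this sketch maps -1 to 4, passes 0 through 4 unchanged,
and rejects 5 or -2 with -EINVAL.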

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 Documentation/ABI/testing/sysfs-block-rnbd        |  6 ++++++
 Documentation/ABI/testing/sysfs-class-rnbd-client | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block-rnbd b/Documentation/ABI/testing/sysfs-block-rnbd
index ec716e1c31a8..80b420b5d6b8 100644
--- a/Documentation/ABI/testing/sysfs-block-rnbd
+++ b/Documentation/ABI/testing/sysfs-block-rnbd
@@ -56,3 +56,9 @@ Date:		Feb 2020
 KernelVersion:	5.7
 Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
 Description:	Remap the disconnected device if the session is not destroyed yet.
+
+What:		/sys/block/rnbd<N>/rnbd/nr_poll_queues
+Date:		Feb 2020
+KernelVersion:	5.7
+Contact:	Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description:	Contains the number of poll-mode queues
diff --git a/Documentation/ABI/testing/sysfs-class-rnbd-client b/Documentation/ABI/testing/sysfs-class-rnbd-client
index 2aa05b3e348e..0b5997ab3365 100644
--- a/Documentation/ABI/testing/sysfs-class-rnbd-client
+++ b/Documentation/ABI/testing/sysfs-class-rnbd-client
@@ -85,6 +85,19 @@ Description:	Expected format is the following::
 
 		By default "rw" is used.
 
+		nr_poll_queues
+		  specifies the number of poll-mode queues. If the IO has HIPRI flag,
+		  the block-layer will send the IO via the poll-mode queue.
+		  For fast network and device the polling is faster than interrupt-base
+		  IO handling because it saves time for context switching, switching to
+		  another process, handling the interrupt and switching back to the
+		  issuing process.
+
+		  Set -1 if you want to set it as the number of CPUs
+		  By default rnbd client creates only irq-mode queues.
+
+		  NOTICE: MUST make a unique session for a device using the poll-mode queues.
+
 		Exit Codes:
 
 		If the device is already mapped it will fail with EEXIST. If the input
-- 
2.25.1



* [PATCHv4 for-next 15/19] block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (13 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 14/19] Documentation/ABI/rnbd-clt: Add description for nr_poll_queues Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:23 ` [PATCHv4 for-next 16/19] block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes Gioh Kim
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Gioh Kim,
	Leon Romanovsky, linux-rdma, Aleksei Marov, Gioh Kim,
	Chaitanya Kulkarni, Jason Gunthorpe

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so
remove it from the handler and its helpers. The rdma_ev function pointer
in rtrs_srv_ops is changed accordingly.
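
As a rough userspace sketch of the resulting callback shape: the handler
receives only the private pointer registered by the upper layer and
recovers its own session from it, so no transport handle is passed. All
names below (sketch_session, sketch_srv_ops, sketch_rdma_ev) are
hypothetical and the argument list is shortened; the real signature is the
one in the diff.

```c
#include <stddef.h>

/* Hypothetical upper-layer session, for illustration only. */
struct sketch_session {
	int handled;	/* number of messages accepted so far */
};

/* Hypothetical ops table with a priv-only event callback. */
struct sketch_srv_ops {
	int (*rdma_ev)(void *priv, int dir, const void *usr, size_t usrlen);
};

static int sketch_rdma_ev(void *priv, int dir, const void *usr, size_t usrlen)
{
	/* The private pointer is all the handler needs to find its state. */
	struct sketch_session *sess = priv;

	(void)dir;	/* direction is unused in this sketch */

	if (!usr || usrlen == 0)
		return -1;	/* reject an empty user message */

	sess->handled++;
	return 0;
}

static const struct sketch_srv_ops sketch_ops_table = {
	.rdma_ev = sketch_rdma_ev,
};
```

A caller would invoke sketch_ops_table.rdma_ev(&session, dir, msg, len),
which mirrors how the transport passes srv->priv after this patch.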

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/block/rnbd/rnbd-srv.c          | 39 ++++++++++----------------
 drivers/infiniband/ulp/rtrs/rtrs-srv.c |  4 +--
 drivers/infiniband/ulp/rtrs/rtrs.h     |  3 +-
 3 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index a9bb414f7442..abacd9ef10d6 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -114,8 +114,7 @@ rnbd_get_sess_dev(int dev_id, struct rnbd_srv_session *srv_sess)
 	return sess_dev;
 }
 
-static int process_rdma(struct rtrs_srv *sess,
-			struct rnbd_srv_session *srv_sess,
+static int process_rdma(struct rnbd_srv_session *srv_sess,
 			struct rtrs_srv_op *id, void *data, u32 datalen,
 			const void *usr, size_t usrlen)
 {
@@ -344,8 +343,7 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev,
 	mutex_unlock(&sess->lock);
 }
 
-static int process_msg_close(struct rtrs_srv *rtrs,
-			     struct rnbd_srv_session *srv_sess,
+static int process_msg_close(struct rnbd_srv_session *srv_sess,
 			     void *data, size_t datalen, const void *usr,
 			     size_t usrlen)
 {
@@ -364,20 +362,18 @@ static int process_msg_close(struct rtrs_srv *rtrs,
 	return 0;
 }
 
-static int process_msg_open(struct rtrs_srv *rtrs,
-			    struct rnbd_srv_session *srv_sess,
+static int process_msg_open(struct rnbd_srv_session *srv_sess,
 			    const void *msg, size_t len,
 			    void *data, size_t datalen);
 
-static int process_msg_sess_info(struct rtrs_srv *rtrs,
-				 struct rnbd_srv_session *srv_sess,
+static int process_msg_sess_info(struct rnbd_srv_session *srv_sess,
 				 const void *msg, size_t len,
 				 void *data, size_t datalen);
 
-static int rnbd_srv_rdma_ev(struct rtrs_srv *rtrs, void *priv,
-			     struct rtrs_srv_op *id, int dir,
-			     void *data, size_t datalen, const void *usr,
-			     size_t usrlen)
+static int rnbd_srv_rdma_ev(void *priv,
+			    struct rtrs_srv_op *id, int dir,
+			    void *data, size_t datalen, const void *usr,
+			    size_t usrlen)
 {
 	struct rnbd_srv_session *srv_sess = priv;
 	const struct rnbd_msg_hdr *hdr = usr;
@@ -391,19 +387,16 @@ static int rnbd_srv_rdma_ev(struct rtrs_srv *rtrs, void *priv,
 
 	switch (type) {
 	case RNBD_MSG_IO:
-		return process_rdma(rtrs, srv_sess, id, data, datalen, usr,
-				    usrlen);
+		return process_rdma(srv_sess, id, data, datalen, usr, usrlen);
 	case RNBD_MSG_CLOSE:
-		ret = process_msg_close(rtrs, srv_sess, data, datalen,
-					usr, usrlen);
+		ret = process_msg_close(srv_sess, data, datalen, usr, usrlen);
 		break;
 	case RNBD_MSG_OPEN:
-		ret = process_msg_open(rtrs, srv_sess, usr, usrlen,
-				       data, datalen);
+		ret = process_msg_open(srv_sess, usr, usrlen, data, datalen);
 		break;
 	case RNBD_MSG_SESS_INFO:
-		ret = process_msg_sess_info(rtrs, srv_sess, usr, usrlen,
-					    data, datalen);
+		ret = process_msg_sess_info(srv_sess, usr, usrlen, data,
+					    datalen);
 		break;
 	default:
 		pr_warn("Received unexpected message type %d with dir %d from session %s\n",
@@ -656,8 +649,7 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
 	return full_path;
 }
 
-static int process_msg_sess_info(struct rtrs_srv *rtrs,
-				 struct rnbd_srv_session *srv_sess,
+static int process_msg_sess_info(struct rnbd_srv_session *srv_sess,
 				 const void *msg, size_t len,
 				 void *data, size_t datalen)
 {
@@ -698,8 +690,7 @@ find_srv_sess_dev(struct rnbd_srv_session *srv_sess, const char *dev_name)
 	return NULL;
 }
 
-static int process_msg_open(struct rtrs_srv *rtrs,
-			    struct rnbd_srv_session *srv_sess,
+static int process_msg_open(struct rnbd_srv_session *srv_sess,
 			    const void *msg, size_t len,
 			    void *data, size_t datalen)
 {
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index d071809e3ed2..f7aa2a7e7442 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -998,7 +998,7 @@ static void process_read(struct rtrs_srv_con *con,
 	usr_len = le16_to_cpu(msg->usr_len);
 	data_len = off - usr_len;
 	data = page_address(srv->chunks[buf_id]);
-	ret = ctx->ops.rdma_ev(srv, srv->priv, id, READ, data, data_len,
+	ret = ctx->ops.rdma_ev(srv->priv, id, READ, data, data_len,
 			   data + data_len, usr_len);
 
 	if (unlikely(ret)) {
@@ -1051,7 +1051,7 @@ static void process_write(struct rtrs_srv_con *con,
 	usr_len = le16_to_cpu(req->usr_len);
 	data_len = off - usr_len;
 	data = page_address(srv->chunks[buf_id]);
-	ret = ctx->ops.rdma_ev(srv, srv->priv, id, WRITE, data, data_len,
+	ret = ctx->ops.rdma_ev(srv->priv, id, WRITE, data, data_len,
 			   data + data_len, usr_len);
 	if (unlikely(ret)) {
 		rtrs_err_rl(s,
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
index f891fbe7abe6..b0f56ffeff88 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs.h
@@ -139,7 +139,6 @@ struct rtrs_srv_ops {
 	 *			message for the data transfer will be sent to
 	 *			the client.
 
-	 *	@sess:		Session
 	 *	@priv:		Private data set by rtrs_srv_set_sess_priv()
 	 *	@id:		internal RTRS operation id
 	 *	@dir:		READ/WRITE
@@ -153,7 +152,7 @@ struct rtrs_srv_ops {
 	 *	@usr:		The extra user message sent by the client (%vec)
 	 *	@usrlen:	Size of the user message
 	 */
-	int (*rdma_ev)(struct rtrs_srv *sess, void *priv,
+	int (*rdma_ev)(void *priv,
 		       struct rtrs_srv_op *id, int dir,
 		       void *data, size_t datalen, const void *usr,
 		       size_t usrlen);
-- 
2.25.1



* [PATCHv4 for-next 16/19] block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (14 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 15/19] block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Gioh Kim
@ 2021-04-14 12:23 ` Gioh Kim
  2021-04-14 12:24 ` [PATCHv4 for-next 17/19] block/rnbd-clt: Remove max_segment_size Gioh Kim
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:23 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Md Haris Iqbal, Gioh Kim

From: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>

When an RTRS session state changes, the transport layer generates an event
to RNBD. Then RNBD will change the state of the RNBD client device
accordingly.

This commit adds a kobject_uevent when the RNBD device state changes. With
this, udev rules can be configured to react accordingly.

Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c | 1 +
 drivers/block/rnbd/rnbd-clt.c       | 9 ++++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index bd111ebceb75..5609b9cdc289 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -491,6 +491,7 @@ static int rnbd_clt_add_dev_kobj(struct rnbd_clt_dev *dev)
 			      ret);
 		kobject_put(&dev->kobj);
 	}
+	kobject_uevent(gd_kobj, KOBJ_ONLINE);
 
 	return ret;
 }
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 63719ec04d58..1fe010ed6f69 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -110,6 +110,7 @@ static int rnbd_clt_change_capacity(struct rnbd_clt_dev *dev,
 static int process_msg_open_rsp(struct rnbd_clt_dev *dev,
 				struct rnbd_msg_open_rsp *rsp)
 {
+	struct kobject *gd_kobj;
 	int err = 0;
 
 	mutex_lock(&dev->lock);
@@ -128,6 +129,8 @@ static int process_msg_open_rsp(struct rnbd_clt_dev *dev,
 		 */
 		if (dev->nsectors != nsectors)
 			rnbd_clt_change_capacity(dev, nsectors);
+		gd_kobj = &disk_to_dev(dev->gd)->kobj;
+		kobject_uevent(gd_kobj, KOBJ_ONLINE);
 		rnbd_clt_info(dev, "Device online, device remapped successfully\n");
 	}
 	err = rnbd_clt_set_dev_attr(dev, rsp);
@@ -649,14 +652,18 @@ static int send_msg_sess_info(struct rnbd_clt_session *sess, enum wait_type wait
 static void set_dev_states_to_disconnected(struct rnbd_clt_session *sess)
 {
 	struct rnbd_clt_dev *dev;
+	struct kobject *gd_kobj;
 
 	mutex_lock(&sess->lock);
 	list_for_each_entry(dev, &sess->devs_list, list) {
 		rnbd_clt_err(dev, "Device disconnected.\n");
 
 		mutex_lock(&dev->lock);
-		if (dev->dev_state == DEV_STATE_MAPPED)
+		if (dev->dev_state == DEV_STATE_MAPPED) {
 			dev->dev_state = DEV_STATE_MAPPED_DISCONNECTED;
+			gd_kobj = &disk_to_dev(dev->gd)->kobj;
+			kobject_uevent(gd_kobj, KOBJ_OFFLINE);
+		}
 		mutex_unlock(&dev->lock);
 	}
 	mutex_unlock(&sess->lock);
-- 
2.25.1



* [PATCHv4 for-next 17/19] block/rnbd-clt: Remove max_segment_size
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (15 preceding siblings ...)
  2021-04-14 12:23 ` [PATCHv4 for-next 16/19] block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes Gioh Kim
@ 2021-04-14 12:24 ` Gioh Kim
  2021-04-14 12:24 ` [PATCHv4 for-next 18/19] block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name Gioh Kim
  2021-04-14 12:24 ` [PATCHv4 for-next 19/19] block/rnbd: Use strscpy instead of strlcpy Gioh Kim
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:24 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang, Jack Wang,
	Leon Romanovsky, linux-rdma, Gioh Kim, Jason Gunthorpe

From: Jack Wang <jinpu.wang@cloud.ionos.com>

We always map with SZ_4K, so max_segment_size is not needed.
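
The arithmetic behind the change to max_pages_per_mr can be sketched as
follows. This is a userspace illustration only; the function names and
SKETCH_SZ_4K are hypothetical. It shows that when each segment is one 4K
page, the old expression (max_segments * max_segment_size >> 12) reduces to
max_segments, which is what the new code stores. With a larger segment size
the old expression would yield a proportionally larger page count.

```c
#include <stddef.h>

#define SKETCH_SZ_4K 4096u	/* hypothetical stand-in for SZ_4K */

/* Old formula from the diff: total bytes of all segments, in 4K pages. */
static unsigned int pages_per_mr_old(unsigned int max_segments,
				     size_t max_segment_size)
{
	return (unsigned int)((max_segments * max_segment_size) >> 12);
}

/* New formula from the diff: one 4K page per segment. */
static unsigned int pages_per_mr_new(unsigned int max_segments)
{
	return max_segments;
}
```

For example, pages_per_mr_old(128, SKETCH_SZ_4K) and pages_per_mr_new(128)
both give 128.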

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/block/rnbd/rnbd-clt.c          |  1 -
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 15 +++++----------
 drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 -
 drivers/infiniband/ulp/rtrs/rtrs.h     |  1 -
 4 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 1fe010ed6f69..7446660eb7f2 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1289,7 +1289,6 @@ find_and_get_or_create_sess(const char *sessname,
 				   paths, path_cnt, port_nr,
 				   0, /* Do not use pdu of rtrs */
 				   RECONNECT_DELAY, BMAX_SEGMENTS,
-				   BLK_MAX_SEGMENT_SIZE,
 				   MAX_RECONNECTS, nr_poll_queues);
 	if (IS_ERR(sess->rtrs)) {
 		err = PTR_ERR(sess->rtrs);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 467d135a82cf..1603e0c399e8 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1400,7 +1400,7 @@ static void rtrs_clt_close_work(struct work_struct *work);
 static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 					 const struct rtrs_addr *path,
 					 size_t con_num, u16 max_segments,
-					 size_t max_segment_size, u32 nr_poll_queues)
+					 u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess;
 	int err = -ENOMEM;
@@ -1442,7 +1442,7 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 		       rdma_addr_size((struct sockaddr *)path->src));
 	strlcpy(sess->s.sessname, clt->sessname, sizeof(sess->s.sessname));
 	sess->clt = clt;
-	sess->max_pages_per_mr = max_segments * max_segment_size >> 12;
+	sess->max_pages_per_mr = max_segments;
 	init_waitqueue_head(&sess->state_wq);
 	sess->state = RTRS_CLT_CONNECTING;
 	atomic_set(&sess->connected_cnt, 0);
@@ -2538,7 +2538,6 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 				  void	(*link_ev)(void *priv,
 						   enum rtrs_clt_link_ev ev),
 				  unsigned int max_segments,
-				  size_t max_segment_size,
 				  unsigned int reconnect_delay_sec,
 				  unsigned int max_reconnect_attempts)
 {
@@ -2568,7 +2567,6 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 	clt->port = port;
 	clt->pdu_sz = pdu_sz;
 	clt->max_segments = max_segments;
-	clt->max_segment_size = max_segment_size;
 	clt->reconnect_delay_sec = reconnect_delay_sec;
 	clt->max_reconnect_attempts = max_reconnect_attempts;
 	clt->priv = priv;
@@ -2638,7 +2636,6 @@ static void free_clt(struct rtrs_clt *clt)
  * @pdu_sz: Size of extra payload which can be accessed after permit allocation.
  * @reconnect_delay_sec: time between reconnect tries
  * @max_segments: Max. number of segments per IO request
- * @max_segment_size: Max. size of one segment
  * @max_reconnect_attempts: Number of times to reconnect on error before giving
  *			    up, 0 for * disabled, -1 for forever
  * @nr_poll_queues: number of polling mode connection using IB_POLL_DIRECT flag
@@ -2654,7 +2651,6 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 size_t paths_num, u16 port,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
 				 u16 max_segments,
-				 size_t max_segment_size,
 				 s16 max_reconnect_attempts, u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess, *tmp;
@@ -2663,7 +2659,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 
 	clt = alloc_clt(sessname, paths_num, port, pdu_sz, ops->priv,
 			ops->link_ev,
-			max_segments, max_segment_size, reconnect_delay_sec,
+			max_segments, reconnect_delay_sec,
 			max_reconnect_attempts);
 	if (IS_ERR(clt)) {
 		err = PTR_ERR(clt);
@@ -2673,7 +2669,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 		struct rtrs_clt_sess *sess;
 
 		sess = alloc_sess(clt, &paths[i], nr_cpu_ids,
-				  max_segments, max_segment_size, nr_poll_queues);
+				  max_segments, nr_poll_queues);
 		if (IS_ERR(sess)) {
 			err = PTR_ERR(sess);
 			goto close_all_sess;
@@ -2951,8 +2947,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
 	struct rtrs_clt_sess *sess;
 	int err;
 
-	sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments,
-			  clt->max_segment_size, 0);
+	sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments, 0);
 	if (IS_ERR(sess))
 		return PTR_ERR(sess);
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 692bc83e1f09..98ba5d0a48b8 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -166,7 +166,6 @@ struct rtrs_clt {
 	unsigned int		max_reconnect_attempts;
 	unsigned int		reconnect_delay_sec;
 	unsigned int		max_segments;
-	size_t			max_segment_size;
 	void			*permits;
 	unsigned long		*permits_map;
 	size_t			queue_depth;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
index b0f56ffeff88..bebaa94c4728 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs.h
@@ -58,7 +58,6 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 size_t path_cnt, u16 port,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
 				 u16 max_segments,
-				 size_t max_segment_size,
 				 s16 max_reconnect_attempts, u32 nr_poll_queues);
 
 void rtrs_clt_close(struct rtrs_clt *sess);
-- 
2.25.1



* [PATCHv4 for-next 18/19] block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (16 preceding siblings ...)
  2021-04-14 12:24 ` [PATCHv4 for-next 17/19] block/rnbd-clt: Remove max_segment_size Gioh Kim
@ 2021-04-14 12:24 ` Gioh Kim
  2021-04-14 12:24 ` [PATCHv4 for-next 19/19] block/rnbd: Use strscpy instead of strlcpy Gioh Kim
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:24 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Dima Stepanov, Dima Stepanov, Arnd Bergmann, Gioh Kim,
	Chaitanya Kulkarni

From: Dima Stepanov <dmitrii.stepanov@cloud.ionos.com>

cppcheck reports the following error:
  rnbd/rnbd-clt-sysfs.c:522:36: error: The variable 'buf' is used both
  as a parameter and as destination in snprintf(). The origin and
  destination buffers overlap. Quote from glibc (C-library)
  documentation
  (http://www.gnu.org/software/libc/manual/html_mono/libc.html#Formatted-Output-Functions):
  "If copying takes place between objects that overlap as a result of a
  call to sprintf() or snprintf(), the results are undefined."
  [sprintfOverlappingData]
Fix it by building the string in a single snprintf call that reads from
pathname instead of buf.
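
A minimal userspace sketch of the fixed pattern, assuming a hypothetical
helper named build_dev_name: the destination buffer never appears among
the snprintf() sources, and truncation is detected from the return value in
the same way as in the diff.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the kernel function: format
 * "<pathname>@<sessname>" into buf with a single snprintf() call.
 */
static int build_dev_name(char *buf, size_t len,
			  const char *pathname, const char *sessname)
{
	int ret;

	/* Both sources are distinct from buf, so nothing overlaps. */
	ret = snprintf(buf, len, "%s@%s", pathname, sessname);

	/* snprintf() returns the length it wanted to write. */
	if (ret < 0 || (size_t)ret >= len)
		return -ENAMETOOLONG;

	return 0;
}
```

Calling build_dev_name(buf, sizeof(buf), "dev!nullb0", "sess1") with a
large enough buffer leaves "dev!nullb0@sess1" in buf; a buffer that is too
small yields -ENAMETOOLONG.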

Fixes: 91f4acb2801c ("block/rnbd-clt: support mapping two devices")
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index 5609b9cdc289..062c52e7a468 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -515,11 +515,7 @@ static int rnbd_clt_get_path_name(struct rnbd_clt_dev *dev, char *buf,
 	while ((s = strchr(pathname, '/')))
 		s[0] = '!';
 
-	ret = snprintf(buf, len, "%s", pathname);
-	if (ret >= len)
-		return -ENAMETOOLONG;
-
-	ret = snprintf(buf, len, "%s@%s", buf, dev->sess->sessname);
+	ret = snprintf(buf, len, "%s@%s", pathname, dev->sess->sessname);
 	if (ret >= len)
 		return -ENAMETOOLONG;
 
-- 
2.25.1



* [PATCHv4 for-next 19/19] block/rnbd: Use strscpy instead of strlcpy
  2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
                   ` (17 preceding siblings ...)
  2021-04-14 12:24 ` [PATCHv4 for-next 18/19] block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name Gioh Kim
@ 2021-04-14 12:24 ` Gioh Kim
  18 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-14 12:24 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, hch, sagi, bvanassche, haris.iqbal, jinpu.wang,
	Dima Stepanov, Gioh Kim, Chaitanya Kulkarni

From: Dima Stepanov <dmitrii.stepanov@cloud.ionos.com>

While running checkpatch, the following warning was found:
  WARNING:STRLCPY: Prefer strscpy over strlcpy - see:
  https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Fix it by using strscpy calls instead of strlcpy.
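
For readers unfamiliar with the difference: strlcpy() returns the length of
the source string and therefore has to read the whole source, while
strscpy() reads at most the destination size and returns either the number
of characters copied or -E2BIG on truncation. The following is a userspace
approximation of that behaviour, not the kernel implementation;
sketch_strscpy is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's strscpy():
 * copy at most size bytes, always NUL-terminate when size > 0,
 * and report truncation with -E2BIG.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	/* Never read more than size bytes from src. */
	for (i = 0; i < size; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return (long)i;	/* characters copied, without NUL */
	}

	/* Source did not fit: terminate and signal truncation. */
	dst[size - 1] = '\0';
	return -E2BIG;
}
```

Since the call sites in this patch ignore the return value, the switch
does not change their behaviour for sources that fit the destination.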

Signed-off-by: Dima Stepanov <dmitrii.stepanov@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---
 drivers/block/rnbd/rnbd-clt-sysfs.c | 6 +++---
 drivers/block/rnbd/rnbd-clt.c       | 4 ++--
 drivers/block/rnbd/rnbd-srv.c       | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
index 062c52e7a468..66316cdc2a92 100644
--- a/drivers/block/rnbd/rnbd-clt-sysfs.c
+++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
@@ -99,7 +99,7 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
 				kfree(p);
 				goto out;
 			}
-			strlcpy(opt->sessname, p, NAME_MAX);
+			strscpy(opt->sessname, p, NAME_MAX);
 			kfree(p);
 			break;
 
@@ -142,7 +142,7 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
 				kfree(p);
 				goto out;
 			}
-			strlcpy(opt->pathname, p, NAME_MAX);
+			strscpy(opt->pathname, p, NAME_MAX);
 			kfree(p);
 			break;
 
@@ -511,7 +511,7 @@ static int rnbd_clt_get_path_name(struct rnbd_clt_dev *dev, char *buf,
 	int ret;
 	char pathname[NAME_MAX], *s;
 
-	strlcpy(pathname, dev->pathname, sizeof(pathname));
+	strscpy(pathname, dev->pathname, sizeof(pathname));
 	while ((s = strchr(pathname, '/')))
 		s[0] = '!';
 
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 7446660eb7f2..76556fd6f153 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -578,7 +578,7 @@ static int send_msg_open(struct rnbd_clt_dev *dev, enum wait_type wait)
 
 	msg.hdr.type	= cpu_to_le16(RNBD_MSG_OPEN);
 	msg.access_mode	= dev->access_mode;
-	strlcpy(msg.dev_name, dev->pathname, sizeof(msg.dev_name));
+	strscpy(msg.dev_name, dev->pathname, sizeof(msg.dev_name));
 
 	WARN_ON(!rnbd_clt_get_dev(dev));
 	err = send_usr_msg(sess->rtrs, READ, iu,
@@ -800,7 +800,7 @@ static struct rnbd_clt_session *alloc_sess(const char *sessname)
 	sess = kzalloc_node(sizeof(*sess), GFP_KERNEL, NUMA_NO_NODE);
 	if (!sess)
 		return ERR_PTR(-ENOMEM);
-	strlcpy(sess->sessname, sessname, sizeof(sess->sessname));
+	strscpy(sess->sessname, sessname, sizeof(sess->sessname));
 	atomic_set(&sess->busy, 0);
 	mutex_init(&sess->lock);
 	INIT_LIST_HEAD(&sess->devs_list);
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index abacd9ef10d6..899dd9d7c10b 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -298,7 +298,7 @@ static int create_sess(struct rtrs_srv *rtrs)
 	mutex_unlock(&sess_lock);
 
 	srv_sess->rtrs = rtrs;
-	strlcpy(srv_sess->sessname, sessname, sizeof(srv_sess->sessname));
+	strscpy(srv_sess->sessname, sessname, sizeof(srv_sess->sessname));
 
 	rtrs_srv_set_sess_priv(rtrs, srv_sess);
 
@@ -437,7 +437,7 @@ static struct rnbd_srv_dev *rnbd_srv_init_srv_dev(const char *id)
 	if (!dev)
 		return ERR_PTR(-ENOMEM);
 
-	strlcpy(dev->id, id, sizeof(dev->id));
+	strscpy(dev->id, id, sizeof(dev->id));
 	kref_init(&dev->kref);
 	INIT_LIST_HEAD(&dev->sess_dev_list);
 	mutex_init(&dev->lock);
@@ -589,7 +589,7 @@ rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
 
 	kref_init(&sdev->kref);
 
-	strlcpy(sdev->pathname, open_msg->dev_name, sizeof(sdev->pathname));
+	strscpy(sdev->pathname, open_msg->dev_name, sizeof(sdev->pathname));
 
 	sdev->rnbd_dev		= rnbd_dev;
 	sdev->sess		= srv_sess;
-- 
2.25.1



* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-14 12:23 ` [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization Gioh Kim
@ 2021-04-18  8:36   ` Leon Romanovsky
  2021-04-19  5:12     ` Gioh Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Leon Romanovsky @ 2021-04-18  8:36 UTC (permalink / raw)
  To: Gioh Kim
  Cc: linux-block, axboe, hch, sagi, bvanassche, haris.iqbal,
	jinpu.wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> 
> RNBD can make double-queues for irq-mode and poll-mode.
> For example, on 4-CPU system 8 request-queues are created,
> 4 for irq-mode and 4 for poll-mode.
> If the IO has HIPRI flag, the block-layer will call .poll function
> of RNBD. Then IO is sent to the poll-mode queue.
> Add optional nr_poll_queues argument for map_devices interface.
> 
> To support polling of RNBD, RTRS client creates connections
> for both of irq-mode and direct-poll-mode.
> 
> For example, on 4-CPU system it could've create 5 connections:
> con[0] => user message (softirq cq)
> con[1:4] => softirq cq
> 
> After this patch, it can create 9 connections:
> con[0] => user message (softirq cq)
> con[1:4] => softirq cq
> con[5:8] => DIRECT-POLL cq
> 
> Cc: Leon Romanovsky <leonro@nvidia.com>
> Cc: linux-rdma@vger.kernel.org
> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Acked-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/block/rnbd/rnbd-clt-sysfs.c    | 56 +++++++++++++----
>  drivers/block/rnbd/rnbd-clt.c          | 85 +++++++++++++++++++++++---
>  drivers/block/rnbd/rnbd-clt.h          |  5 +-
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c | 62 +++++++++++++++----
>  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  1 +
>  drivers/infiniband/ulp/rtrs/rtrs.h     |  3 +-
>  6 files changed, 178 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
> index 49015f428e67..bd111ebceb75 100644
> --- a/drivers/block/rnbd/rnbd-clt-sysfs.c
> +++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
> @@ -34,6 +34,7 @@ enum {
>  	RNBD_OPT_DEV_PATH	= 1 << 2,
>  	RNBD_OPT_ACCESS_MODE	= 1 << 3,
>  	RNBD_OPT_SESSNAME	= 1 << 6,
> +	RNBD_OPT_NR_POLL_QUEUES	= 1 << 7,
>  };
>  
>  static const unsigned int rnbd_opt_mandatory[] = {
> @@ -42,12 +43,13 @@ static const unsigned int rnbd_opt_mandatory[] = {
>  };
>  
>  static const match_table_t rnbd_opt_tokens = {
> -	{RNBD_OPT_PATH,		"path=%s"	},
> -	{RNBD_OPT_DEV_PATH,	"device_path=%s"},
> -	{RNBD_OPT_DEST_PORT,	"dest_port=%d"  },
> -	{RNBD_OPT_ACCESS_MODE,	"access_mode=%s"},
> -	{RNBD_OPT_SESSNAME,	"sessname=%s"	},
> -	{RNBD_OPT_ERR,		NULL		},
> +	{RNBD_OPT_PATH,			"path=%s"		},
> +	{RNBD_OPT_DEV_PATH,		"device_path=%s"	},
> +	{RNBD_OPT_DEST_PORT,		"dest_port=%d"		},
> +	{RNBD_OPT_ACCESS_MODE,		"access_mode=%s"	},
> +	{RNBD_OPT_SESSNAME,		"sessname=%s"		},
> +	{RNBD_OPT_NR_POLL_QUEUES,	"nr_poll_queues=%d"	},
> +	{RNBD_OPT_ERR,			NULL			},
>  };
>  
>  struct rnbd_map_options {
> @@ -57,6 +59,7 @@ struct rnbd_map_options {
>  	char *pathname;
>  	u16 *dest_port;
>  	enum rnbd_access_mode *access_mode;
> +	u32 *nr_poll_queues;
>  };
>  
>  static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
> @@ -68,7 +71,7 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
>  	int opt_mask = 0;
>  	int token;
>  	int ret = -EINVAL;
> -	int i, dest_port;
> +	int i, dest_port, nr_poll_queues;
>  	int p_cnt = 0;
>  
>  	options = kstrdup(buf, GFP_KERNEL);
> @@ -178,6 +181,19 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
>  			kfree(p);
>  			break;
>  
> +		case RNBD_OPT_NR_POLL_QUEUES:
> +			if (match_int(args, &nr_poll_queues) || nr_poll_queues < -1 ||
> +			    nr_poll_queues > (int)nr_cpu_ids) {
> +				pr_err("bad nr_poll_queues parameter '%d'\n",
> +				       nr_poll_queues);
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +			if (nr_poll_queues == -1)
> +				nr_poll_queues = nr_cpu_ids;
> +			*opt->nr_poll_queues = nr_poll_queues;
> +			break;
> +
>  		default:
>  			pr_err("map_device: Unknown parameter or missing value '%s'\n",
>  			       p);
> @@ -227,6 +243,20 @@ static ssize_t state_show(struct kobject *kobj,
>  
>  static struct kobj_attribute rnbd_clt_state_attr = __ATTR_RO(state);
>  
> +static ssize_t nr_poll_queues_show(struct kobject *kobj,
> +				   struct kobj_attribute *attr, char *page)
> +{
> +	struct rnbd_clt_dev *dev;
> +
> +	dev = container_of(kobj, struct rnbd_clt_dev, kobj);
> +
> +	return snprintf(page, PAGE_SIZE, "%d\n",
> +			dev->nr_poll_queues);
> +}

Didn't Greg ask you to use sysfs_emit() here?
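
For reference, a rough userspace sketch of what the show callback could look like after the switch. sysfs_emit_sketch() is only a stand-in that mimics the kernel helper's bounded formatting, and the cut-down device struct is hypothetical. Since nr_poll_queues is declared u32 in this patch, "%u" is probably the better format specifier:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define PAGE_SIZE 4096 /* assumed page size for this userspace sketch */

/*
 * Userspace stand-in for the kernel's sysfs_emit(): format into a
 * page-sized buffer and return the number of bytes actually written.
 */
static int sysfs_emit_sketch(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf != NULL);

	va_start(args, fmt);
	len = vsnprintf(buf, PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	/* vsnprintf reports the untruncated length, so clamp it */
	return len >= PAGE_SIZE ? PAGE_SIZE - 1 : len;
}

/* Hypothetical, cut-down device holding only the field being shown */
struct rnbd_clt_dev_sketch {
	unsigned int nr_poll_queues; /* u32 in the patch */
};

/* The show callback body reduces to a single bounded emit call */
static int nr_poll_queues_show_sketch(const struct rnbd_clt_dev_sketch *dev,
				      char *page)
{
	return sysfs_emit_sketch(page, "%u\n", dev->nr_poll_queues);
}
```

In the driver itself this would be a one-line sysfs_emit(page, ...) call with no PAGE_SIZE argument.
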

> +
> +static struct kobj_attribute rnbd_clt_nr_poll_queues =
> +	__ATTR_RO(nr_poll_queues);
> +
>  static ssize_t mapping_path_show(struct kobject *kobj,
>  				 struct kobj_attribute *attr, char *page)
>  {
> @@ -421,6 +451,7 @@ static struct attribute *rnbd_dev_attrs[] = {
>  	&rnbd_clt_state_attr.attr,
>  	&rnbd_clt_session_attr.attr,
>  	&rnbd_clt_access_mode.attr,
> +	&rnbd_clt_nr_poll_queues.attr,
>  	NULL,
>  };
>  
> @@ -469,7 +500,7 @@ static ssize_t rnbd_clt_map_device_show(struct kobject *kobj,
>  					 char *page)
>  {
>  	return scnprintf(page, PAGE_SIZE,
> -			 "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
> +			 "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>] [nr_poll_queues=<number of queues>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
>  			 attr->attr.name);
>  }
>  
> @@ -541,6 +572,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
>  	char sessname[NAME_MAX];
>  	enum rnbd_access_mode access_mode = RNBD_ACCESS_RW;
>  	u16 port_nr = RTRS_PORT;
> +	u32 nr_poll_queues = 0;
>  
>  	struct sockaddr_storage *addrs;
>  	struct rtrs_addr paths[6];
> @@ -552,6 +584,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
>  	opt.pathname = pathname;
>  	opt.dest_port = &port_nr;
>  	opt.access_mode = &access_mode;
> +	opt.nr_poll_queues = &nr_poll_queues;
>  	addrs = kcalloc(ARRAY_SIZE(paths) * 2, sizeof(*addrs), GFP_KERNEL);
>  	if (!addrs)
>  		return -ENOMEM;
> @@ -565,12 +598,13 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
>  	if (ret)
>  		goto out;
>  
> -	pr_info("Mapping device %s on session %s, (access_mode: %s)\n",
> +	pr_info("Mapping device %s on session %s, (access_mode: %s, nr_poll_queues: %d)\n",
>  		pathname, sessname,
> -		rnbd_access_mode_str(access_mode));
> +		rnbd_access_mode_str(access_mode),
> +		nr_poll_queues);
>  
>  	dev = rnbd_clt_map_device(sessname, paths, path_cnt, port_nr, pathname,
> -				  access_mode);
> +				  access_mode, nr_poll_queues);
>  	if (IS_ERR(dev)) {
>  		ret = PTR_ERR(dev);
>  		goto out;
> diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
> index 9b44aac680d5..63719ec04d58 100644
> --- a/drivers/block/rnbd/rnbd-clt.c
> +++ b/drivers/block/rnbd/rnbd-clt.c
> @@ -1165,9 +1165,54 @@ static blk_status_t rnbd_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	return ret;
>  }
>  
> +static int rnbd_rdma_poll(struct blk_mq_hw_ctx *hctx)
> +{
> +	struct rnbd_queue *q = hctx->driver_data;
> +	struct rnbd_clt_dev *dev = q->dev;
> +	int cnt;
> +
> +	cnt = rtrs_clt_rdma_cq_direct(dev->sess->rtrs, hctx->queue_num);
> +	return cnt;
> +}
> +
> +static int rnbd_rdma_map_queues(struct blk_mq_tag_set *set)
> +{
> +	struct rnbd_clt_session *sess = set->driver_data;
> +
> +	/* shared read/write queues */
> +	set->map[HCTX_TYPE_DEFAULT].nr_queues = num_online_cpus();
> +	set->map[HCTX_TYPE_DEFAULT].queue_offset = 0;
> +	set->map[HCTX_TYPE_READ].nr_queues = num_online_cpus();
> +	set->map[HCTX_TYPE_READ].queue_offset = 0;
> +	blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
> +	blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);
> +
> +	if (sess->nr_poll_queues) {
> +		/* dedicated queue for poll */
> +		set->map[HCTX_TYPE_POLL].nr_queues = sess->nr_poll_queues;
> +		set->map[HCTX_TYPE_POLL].queue_offset = set->map[HCTX_TYPE_READ].queue_offset +
> +			set->map[HCTX_TYPE_READ].nr_queues;
> +		blk_mq_map_queues(&set->map[HCTX_TYPE_POLL]);
> +		pr_info("[session=%s] mapped %d/%d/%d default/read/poll queues.\n",
> +			sess->sessname,
> +			set->map[HCTX_TYPE_DEFAULT].nr_queues,
> +			set->map[HCTX_TYPE_READ].nr_queues,
> +			set->map[HCTX_TYPE_POLL].nr_queues);
> +	} else {
> +		pr_info("[session=%s] mapped %d/%d default/read queues.\n",
> +			sess->sessname,
> +			set->map[HCTX_TYPE_DEFAULT].nr_queues,
> +			set->map[HCTX_TYPE_READ].nr_queues);
> +	}
> +
> +	return 0;
> +}
> +
>  static struct blk_mq_ops rnbd_mq_ops = {
>  	.queue_rq	= rnbd_queue_rq,
>  	.complete	= rnbd_softirq_done_fn,
> +	.map_queues     = rnbd_rdma_map_queues,
> +	.poll           = rnbd_rdma_poll,
>  };
>  
>  static int setup_mq_tags(struct rnbd_clt_session *sess)
> @@ -1181,7 +1226,15 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
>  	tag_set->flags		= BLK_MQ_F_SHOULD_MERGE |
>  				  BLK_MQ_F_TAG_QUEUE_SHARED;
>  	tag_set->cmd_size	= sizeof(struct rnbd_iu) + RNBD_RDMA_SGL_SIZE;
> -	tag_set->nr_hw_queues	= num_online_cpus();
> +
> +	/* for HCTX_TYPE_DEFAULT, HCTX_TYPE_READ, HCTX_TYPE_POLL */
> +	tag_set->nr_maps        = sess->nr_poll_queues ? HCTX_MAX_TYPES : 2;
> +	/*
> +	 * HCTX_TYPE_DEFAULT and HCTX_TYPE_READ share one set of queues
> +	 * others are for HCTX_TYPE_POLL
> +	 */
> +	tag_set->nr_hw_queues	= num_online_cpus() + sess->nr_poll_queues;
> +	tag_set->driver_data    = sess;
>  
>  	return blk_mq_alloc_tag_set(tag_set);
>  }
> @@ -1189,7 +1242,7 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
>  static struct rnbd_clt_session *
>  find_and_get_or_create_sess(const char *sessname,
>  			    const struct rtrs_addr *paths,
> -			    size_t path_cnt, u16 port_nr)
> +			    size_t path_cnt, u16 port_nr, u32 nr_poll_queues)
>  {
>  	struct rnbd_clt_session *sess;
>  	struct rtrs_attrs attrs;
> @@ -1198,6 +1251,17 @@ find_and_get_or_create_sess(const char *sessname,
>  	struct rtrs_clt_ops rtrs_ops;
>  
>  	sess = find_or_create_sess(sessname, &first);
> +	if (sess == ERR_PTR(-ENOMEM))
> +		return ERR_PTR(-ENOMEM);
> +	else if ((nr_poll_queues && !first) ||  (!nr_poll_queues && sess->nr_poll_queues)) {
> +		/*
> +		 * A device MUST have its own session to use the polling-mode.
> +		 * It must fail to map new device with the same session.
> +		 */
> +		err = -EINVAL;
> +		goto put_sess;
> +	}
> +
>  	if (!first)
>  		return sess;
>  
> @@ -1219,7 +1283,7 @@ find_and_get_or_create_sess(const char *sessname,
>  				   0, /* Do not use pdu of rtrs */
>  				   RECONNECT_DELAY, BMAX_SEGMENTS,
>  				   BLK_MAX_SEGMENT_SIZE,
> -				   MAX_RECONNECTS);
> +				   MAX_RECONNECTS, nr_poll_queues);
>  	if (IS_ERR(sess->rtrs)) {
>  		err = PTR_ERR(sess->rtrs);
>  		goto wake_up_and_put;
> @@ -1227,6 +1291,7 @@ find_and_get_or_create_sess(const char *sessname,
>  	rtrs_clt_query(sess->rtrs, &attrs);
>  	sess->max_io_size = attrs.max_io_size;
>  	sess->queue_depth = attrs.queue_depth;
> +	sess->nr_poll_queues = nr_poll_queues;
>  
>  	err = setup_mq_tags(sess);
>  	if (err)
> @@ -1370,7 +1435,8 @@ static int rnbd_client_setup_device(struct rnbd_clt_dev *dev)
>  
>  static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
>  				      enum rnbd_access_mode access_mode,
> -				      const char *pathname)
> +				      const char *pathname,
> +				      u32 nr_poll_queues)
>  {
>  	struct rnbd_clt_dev *dev;
>  	int ret;
> @@ -1379,7 +1445,8 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
>  	if (!dev)
>  		return ERR_PTR(-ENOMEM);
>  
> -	dev->hw_queues = kcalloc(nr_cpu_ids, sizeof(*dev->hw_queues),
> +	dev->hw_queues = kcalloc(nr_cpu_ids /* softirq */ + nr_poll_queues /* poll */,

Please don't add comments in the middle of a function call.
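
One possible shape, sketched in userspace with calloc() in place of kcalloc() and a hypothetical stand-in for struct rnbd_queue: compute the total in a named local and put the explanation above it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rnbd_queue */
struct rnbd_queue_sketch {
	int hctx_index;
};

/*
 * Allocate one slot per CPU for the softirq queues plus one slot per
 * poll queue. The explanation sits above the call instead of inside it.
 */
static struct rnbd_queue_sketch *
alloc_hw_queues_sketch(unsigned int nr_cpu_ids, unsigned int nr_poll_queues,
		       size_t *nr_out)
{
	/* nr_cpu_ids softirq queues followed by nr_poll_queues poll queues */
	size_t nr_hw_queues = (size_t)nr_cpu_ids + nr_poll_queues;

	assert(nr_out != NULL);
	*nr_out = nr_hw_queues;

	return calloc(nr_hw_queues, sizeof(struct rnbd_queue_sketch));
}
```
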

> +				 sizeof(*dev->hw_queues),
>  				 GFP_KERNEL);
>  	if (!dev->hw_queues) {
>  		ret = -ENOMEM;
> @@ -1405,6 +1472,7 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
>  	dev->clt_device_id	= ret;
>  	dev->sess		= sess;
>  	dev->access_mode	= access_mode;
> +	dev->nr_poll_queues	= nr_poll_queues;
>  	mutex_init(&dev->lock);
>  	refcount_set(&dev->refcount, 1);
>  	dev->dev_state = DEV_STATE_INIT;
> @@ -1491,7 +1559,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
>  					   struct rtrs_addr *paths,
>  					   size_t path_cnt, u16 port_nr,
>  					   const char *pathname,
> -					   enum rnbd_access_mode access_mode)
> +					   enum rnbd_access_mode access_mode,
> +					   u32 nr_poll_queues)
>  {
>  	struct rnbd_clt_session *sess;
>  	struct rnbd_clt_dev *dev;
> @@ -1500,11 +1569,11 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
>  	if (unlikely(exists_devpath(pathname, sessname)))
>  		return ERR_PTR(-EEXIST);
>  
> -	sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr);
> +	sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr, nr_poll_queues);
>  	if (IS_ERR(sess))
>  		return ERR_CAST(sess);
>  
> -	dev = init_dev(sess, access_mode, pathname);
> +	dev = init_dev(sess, access_mode, pathname, nr_poll_queues);
>  	if (IS_ERR(dev)) {
>  		pr_err("map_device: failed to map device '%s' from session %s, can't initialize device, err: %ld\n",
>  		       pathname, sess->sessname, PTR_ERR(dev));
> diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h
> index 714d426b449b..451e7383738f 100644
> --- a/drivers/block/rnbd/rnbd-clt.h
> +++ b/drivers/block/rnbd/rnbd-clt.h
> @@ -90,6 +90,7 @@ struct rnbd_clt_session {
>  	int			queue_depth;
>  	u32			max_io_size;
>  	struct blk_mq_tag_set	tag_set;
> +	u32			nr_poll_queues;
>  	struct mutex		lock; /* protects state and devs_list */
>  	struct list_head        devs_list; /* list of struct rnbd_clt_dev */
>  	refcount_t		refcount;
> @@ -118,6 +119,7 @@ struct rnbd_clt_dev {
>  	enum rnbd_clt_dev_state	dev_state;
>  	char			*pathname;
>  	enum rnbd_access_mode	access_mode;
> +	u32			nr_poll_queues;
>  	bool			read_only;
>  	bool			rotational;
>  	bool			wc;
> @@ -147,7 +149,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
>  					   struct rtrs_addr *paths,
>  					   size_t path_cnt, u16 port_nr,
>  					   const char *pathname,
> -					   enum rnbd_access_mode access_mode);
> +					   enum rnbd_access_mode access_mode,
> +					   u32 nr_poll_queues);
>  int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
>  			   const struct attribute *sysfs_self);
>  
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> index 7efd49bdc78c..467d135a82cf 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> @@ -174,7 +174,7 @@ struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
>  	int id = 0;
>  
>  	if (likely(permit->con_type == RTRS_IO_CON))
> -		id = (permit->cpu_id % (sess->s.con_num - 1)) + 1;
> +		id = (permit->cpu_id % (sess->s.irq_con_num - 1)) + 1;
>  
>  	return to_clt_con(sess->s.con[id]);
>  }
> @@ -1400,23 +1400,29 @@ static void rtrs_clt_close_work(struct work_struct *work);
>  static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
>  					 const struct rtrs_addr *path,
>  					 size_t con_num, u16 max_segments,
> -					 size_t max_segment_size)
> +					 size_t max_segment_size, u32 nr_poll_queues)
>  {
>  	struct rtrs_clt_sess *sess;
>  	int err = -ENOMEM;
>  	int cpu;
> +	size_t total_con;
>  
>  	sess = kzalloc(sizeof(*sess), GFP_KERNEL);
>  	if (!sess)
>  		goto err;
>  
> -	/* Extra connection for user messages */
> -	con_num += 1;
> -
> -	sess->s.con = kcalloc(con_num, sizeof(*sess->s.con), GFP_KERNEL);
> +	/*
> +	 * irqmode and poll
> +	 * +1: Extra connection for user messages
> +	 */
> +	total_con = con_num + nr_poll_queues + 1;
> +	sess->s.con = kcalloc(total_con, sizeof(*sess->s.con), GFP_KERNEL);
>  	if (!sess->s.con)
>  		goto err_free_sess;
>  
> +	sess->s.con_num = total_con;
> +	sess->s.irq_con_num = con_num + 1;
> +
>  	sess->stats = kzalloc(sizeof(*sess->stats), GFP_KERNEL);
>  	if (!sess->stats)
>  		goto err_free_con;
> @@ -1435,7 +1441,6 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
>  		memcpy(&sess->s.src_addr, path->src,
>  		       rdma_addr_size((struct sockaddr *)path->src));
>  	strlcpy(sess->s.sessname, clt->sessname, sizeof(sess->s.sessname));
> -	sess->s.con_num = con_num;
>  	sess->clt = clt;
>  	sess->max_pages_per_mr = max_segments * max_segment_size >> 12;
>  	init_waitqueue_head(&sess->state_wq);
> @@ -1576,9 +1581,14 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
>  	}
>  	cq_size = max_send_wr + max_recv_wr;
>  	cq_vector = con->cpu % sess->s.dev->ib_dev->num_comp_vectors;
> -	err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> -				 cq_vector, cq_size, max_send_wr,
> -				 max_recv_wr, IB_POLL_SOFTIRQ);
> +	if (con->c.cid >= sess->s.irq_con_num)
> +		err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> +					cq_vector, cq_size, max_send_wr,
> +					max_recv_wr, IB_POLL_DIRECT);
> +	else
> +		err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> +					cq_vector, cq_size, max_send_wr,
> +					max_recv_wr, IB_POLL_SOFTIRQ);
>  	/*
>  	 * In case of error we do not bother to clean previous allocations,
>  	 * since destroy_con_cq_qp() must be called.
> @@ -2631,6 +2641,7 @@ static void free_clt(struct rtrs_clt *clt)
>   * @max_segment_size: Max. size of one segment
>   * @max_reconnect_attempts: Number of times to reconnect on error before giving
>   *			    up, 0 for * disabled, -1 for forever
> + * @nr_poll_queues: number of polling mode connection using IB_POLL_DIRECT flag
>   *
>   * Starts session establishment with the rtrs_server. The function can block
>   * up to ~2000ms before it returns.
> @@ -2644,7 +2655,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
>  				 size_t pdu_sz, u8 reconnect_delay_sec,
>  				 u16 max_segments,
>  				 size_t max_segment_size,
> -				 s16 max_reconnect_attempts)
> +				 s16 max_reconnect_attempts, u32 nr_poll_queues)
>  {
>  	struct rtrs_clt_sess *sess, *tmp;
>  	struct rtrs_clt *clt;
> @@ -2662,7 +2673,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
>  		struct rtrs_clt_sess *sess;
>  
>  		sess = alloc_sess(clt, &paths[i], nr_cpu_ids,
> -				  max_segments, max_segment_size);
> +				  max_segments, max_segment_size, nr_poll_queues);
>  		if (IS_ERR(sess)) {
>  			err = PTR_ERR(sess);
>  			goto close_all_sess;
> @@ -2887,6 +2898,31 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
>  }
>  EXPORT_SYMBOL(rtrs_clt_request);
>  
> +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> +{
> +	int cnt;
> +	struct rtrs_con *con;
> +	struct rtrs_clt_sess *sess;
> +	struct path_it it;
> +
> +	rcu_read_lock();
> +	for (path_it_init(&it, clt);
> +	     (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> +		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))

We talked about useless likely/unlikely in your workloads.
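
Separately, as far as I can tell from this hunk, cnt is returned uninitialized when no path is in the connected state, because the loop only assigns it after the state check. A small userspace sketch of the loop without the branch hints and with cnt initialized; the path struct and the CQ helper are hypothetical stand-ins, not RTRS code:

```c
#include <assert.h>
#include <stddef.h>

enum path_state_sketch { PATH_CONNECTING, PATH_CONNECTED };

/* Hypothetical stand-in for one RTRS path and its pending completions */
struct path_sketch {
	enum path_state_sketch state;
	int pending;
};

/* Stand-in for ib_process_cq_direct(): drain and report completions */
static int process_cq_sketch(struct path_sketch *p)
{
	int done = p->pending;

	p->pending = 0;
	return done;
}

/* Poll each connected path in turn and stop at the first one with work */
static int poll_paths_sketch(struct path_sketch *paths, size_t nr_paths)
{
	int cnt = 0; /* well defined even if no path is connected */
	size_t i;

	for (i = 0; i < nr_paths; i++) {
		if (paths[i].state != PATH_CONNECTED)
			continue;

		cnt = process_cq_sketch(&paths[i]);
		if (cnt)
			break;
	}

	return cnt;
}
```
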

> +			continue;
> +
> +		con = sess->s.con[index + 1];
> +		cnt = ib_process_cq_direct(con->cq, -1);
> +		if (likely(cnt))
> +			break;
> +	}
> +	path_it_deinit(&it);
> +	rcu_read_unlock();
> +
> +	return cnt;
> +}
> +EXPORT_SYMBOL(rtrs_clt_rdma_cq_direct);
> +
>  /**
>   * rtrs_clt_query() - queries RTRS session attributes
>   *@clt: session pointer
> @@ -2916,7 +2952,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
>  	int err;
>  
>  	sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments,
> -			  clt->max_segment_size);
> +			  clt->max_segment_size, 0);
>  	if (IS_ERR(sess))
>  		return PTR_ERR(sess);
>  
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> index 8caad0a2322b..00eb45053339 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> @@ -101,6 +101,7 @@ struct rtrs_sess {
>  	uuid_t			uuid;
>  	struct rtrs_con	**con;
>  	unsigned int		con_num;
> +	unsigned int		irq_con_num;
>  	unsigned int		recon_cnt;
>  	struct rtrs_ib_dev	*dev;
>  	int			dev_ref;
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
> index 2db1b5eb3ab0..f891fbe7abe6 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs.h
> @@ -59,7 +59,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
>  				 size_t pdu_sz, u8 reconnect_delay_sec,
>  				 u16 max_segments,
>  				 size_t max_segment_size,
> -				 s16 max_reconnect_attempts);
> +				 s16 max_reconnect_attempts, u32 nr_poll_queues);
>  
>  void rtrs_clt_close(struct rtrs_clt *sess);
>  
> @@ -103,6 +103,7 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
>  		     struct rtrs_clt *sess, struct rtrs_permit *permit,
>  		     const struct kvec *vec, size_t nr, size_t len,
>  		     struct scatterlist *sg, unsigned int sg_cnt);
> +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index);
>  
>  /**
>   * rtrs_attrs - RTRS session attributes
> -- 
> 2.25.1
> 


* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-18  8:36   ` Leon Romanovsky
@ 2021-04-19  5:12     ` Gioh Kim
  2021-04-19  5:20       ` Leon Romanovsky
  0 siblings, 1 reply; 26+ messages in thread
From: Gioh Kim @ 2021-04-19  5:12 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-block, Jens Axboe, hch, sagi, Bart Van Assche, Haris Iqbal,
	Jinpu Wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Sun, Apr 18, 2021 at 10:36 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> > From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> >
> > > RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
> > > For example, on a 4-CPU system, 8 request queues are created:
> > > 4 for irq-mode and 4 for poll-mode.
> > > If the IO has the HIPRI flag, the block layer will call the .poll function
> > > of RNBD. Then the IO is sent to the poll-mode queue.
> > > Add an optional nr_poll_queues argument to the map_device interface.
> > >
> > > To support polling in RNBD, the RTRS client creates connections
> > > for both irq-mode and direct-poll-mode.
> > >
> > > For example, on a 4-CPU system it previously created 5 connections:
> > con[0] => user message (softirq cq)
> > con[1:4] => softirq cq
> >
> > After this patch, it can create 9 connections:
> > con[0] => user message (softirq cq)
> > con[1:4] => softirq cq
> > con[5:8] => DIRECT-POLL cq
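
The layout listed above can be sketched as a small index calculation. This is a userspace illustration only: the struct and helper names are hypothetical, and the arithmetic follows what alloc_sess() and create_con_cq_qp() do in this patch (irq_con_num = CPUs + 1, poll connections placed after them).

```c
#include <assert.h>

enum cq_kind_sketch { CQ_SOFTIRQ, CQ_DIRECT_POLL };

/* Hypothetical mirror of the two counters kept in struct rtrs_sess */
struct con_layout_sketch {
	unsigned int irq_con_num; /* user-message con plus one per CPU */
	unsigned int con_num;     /* irq_con_num plus the poll connections */
};

static struct con_layout_sketch make_layout_sketch(unsigned int nr_cpus,
						   unsigned int nr_poll_queues)
{
	struct con_layout_sketch l;

	l.irq_con_num = nr_cpus + 1;
	l.con_num = l.irq_con_num + nr_poll_queues;
	return l;
}

/* Connections at or beyond irq_con_num get a direct-poll CQ */
static enum cq_kind_sketch con_cq_kind_sketch(const struct con_layout_sketch *l,
					      unsigned int cid)
{
	assert(cid < l->con_num);
	return cid >= l->irq_con_num ? CQ_DIRECT_POLL : CQ_SOFTIRQ;
}
```

With 4 CPUs and nr_poll_queues=4 this gives 9 connections, where con[0] through con[4] use softirq CQs and con[5] through con[8] use direct-poll CQs.
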
> >
> > Cc: Leon Romanovsky <leonro@nvidia.com>
> > Cc: linux-rdma@vger.kernel.org
> > Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Acked-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/block/rnbd/rnbd-clt-sysfs.c    | 56 +++++++++++++----
> >  drivers/block/rnbd/rnbd-clt.c          | 85 +++++++++++++++++++++++---
> >  drivers/block/rnbd/rnbd-clt.h          |  5 +-
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.c | 62 +++++++++++++++----
> >  drivers/infiniband/ulp/rtrs/rtrs-pri.h |  1 +
> >  drivers/infiniband/ulp/rtrs/rtrs.h     |  3 +-
> >  6 files changed, 178 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/block/rnbd/rnbd-clt-sysfs.c b/drivers/block/rnbd/rnbd-clt-sysfs.c
> > index 49015f428e67..bd111ebceb75 100644
> > --- a/drivers/block/rnbd/rnbd-clt-sysfs.c
> > +++ b/drivers/block/rnbd/rnbd-clt-sysfs.c
> > @@ -34,6 +34,7 @@ enum {
> >       RNBD_OPT_DEV_PATH       = 1 << 2,
> >       RNBD_OPT_ACCESS_MODE    = 1 << 3,
> >       RNBD_OPT_SESSNAME       = 1 << 6,
> > +     RNBD_OPT_NR_POLL_QUEUES = 1 << 7,
> >  };
> >
> >  static const unsigned int rnbd_opt_mandatory[] = {
> > @@ -42,12 +43,13 @@ static const unsigned int rnbd_opt_mandatory[] = {
> >  };
> >
> >  static const match_table_t rnbd_opt_tokens = {
> > -     {RNBD_OPT_PATH,         "path=%s"       },
> > -     {RNBD_OPT_DEV_PATH,     "device_path=%s"},
> > -     {RNBD_OPT_DEST_PORT,    "dest_port=%d"  },
> > -     {RNBD_OPT_ACCESS_MODE,  "access_mode=%s"},
> > -     {RNBD_OPT_SESSNAME,     "sessname=%s"   },
> > -     {RNBD_OPT_ERR,          NULL            },
> > +     {RNBD_OPT_PATH,                 "path=%s"               },
> > +     {RNBD_OPT_DEV_PATH,             "device_path=%s"        },
> > +     {RNBD_OPT_DEST_PORT,            "dest_port=%d"          },
> > +     {RNBD_OPT_ACCESS_MODE,          "access_mode=%s"        },
> > +     {RNBD_OPT_SESSNAME,             "sessname=%s"           },
> > +     {RNBD_OPT_NR_POLL_QUEUES,       "nr_poll_queues=%d"     },
> > +     {RNBD_OPT_ERR,                  NULL                    },
> >  };
> >
> >  struct rnbd_map_options {
> > @@ -57,6 +59,7 @@ struct rnbd_map_options {
> >       char *pathname;
> >       u16 *dest_port;
> >       enum rnbd_access_mode *access_mode;
> > +     u32 *nr_poll_queues;
> >  };
> >
> >  static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
> > @@ -68,7 +71,7 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
> >       int opt_mask = 0;
> >       int token;
> >       int ret = -EINVAL;
> > -     int i, dest_port;
> > +     int i, dest_port, nr_poll_queues;
> >       int p_cnt = 0;
> >
> >       options = kstrdup(buf, GFP_KERNEL);
> > @@ -178,6 +181,19 @@ static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
> >                       kfree(p);
> >                       break;
> >
> > +             case RNBD_OPT_NR_POLL_QUEUES:
> > +                     if (match_int(args, &nr_poll_queues) || nr_poll_queues < -1 ||
> > +                         nr_poll_queues > (int)nr_cpu_ids) {
> > +                             pr_err("bad nr_poll_queues parameter '%d'\n",
> > +                                    nr_poll_queues);
> > +                             ret = -EINVAL;
> > +                             goto out;
> > +                     }
> > +                     if (nr_poll_queues == -1)
> > +                             nr_poll_queues = nr_cpu_ids;
> > +                     *opt->nr_poll_queues = nr_poll_queues;
> > +                     break;
> > +
> >               default:
> >                       pr_err("map_device: Unknown parameter or missing value '%s'\n",
> >                              p);
> > @@ -227,6 +243,20 @@ static ssize_t state_show(struct kobject *kobj,
> >
> >  static struct kobj_attribute rnbd_clt_state_attr = __ATTR_RO(state);
> >
> > +static ssize_t nr_poll_queues_show(struct kobject *kobj,
> > +                                struct kobj_attribute *attr, char *page)
> > +{
> > +     struct rnbd_clt_dev *dev;
> > +
> > +     dev = container_of(kobj, struct rnbd_clt_dev, kobj);
> > +
> > +     return snprintf(page, PAGE_SIZE, "%d\n",
> > +                     dev->nr_poll_queues);
> > +}
>
> Didn't Greg ask you to use sysfs_emit() here?

Right, I missed it.
I will fix it in the next round.


>
> > +
> > +static struct kobj_attribute rnbd_clt_nr_poll_queues =
> > +     __ATTR_RO(nr_poll_queues);
> > +
> >  static ssize_t mapping_path_show(struct kobject *kobj,
> >                                struct kobj_attribute *attr, char *page)
> >  {
> > @@ -421,6 +451,7 @@ static struct attribute *rnbd_dev_attrs[] = {
> >       &rnbd_clt_state_attr.attr,
> >       &rnbd_clt_session_attr.attr,
> >       &rnbd_clt_access_mode.attr,
> > +     &rnbd_clt_nr_poll_queues.attr,
> >       NULL,
> >  };
> >
> > @@ -469,7 +500,7 @@ static ssize_t rnbd_clt_map_device_show(struct kobject *kobj,
> >                                        char *page)
> >  {
> >       return scnprintf(page, PAGE_SIZE,
> > -                      "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
> > +                      "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>] [nr_poll_queues=<number of queues>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
> >                        attr->attr.name);
> >  }
> >
> > @@ -541,6 +572,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
> >       char sessname[NAME_MAX];
> >       enum rnbd_access_mode access_mode = RNBD_ACCESS_RW;
> >       u16 port_nr = RTRS_PORT;
> > +     u32 nr_poll_queues = 0;
> >
> >       struct sockaddr_storage *addrs;
> >       struct rtrs_addr paths[6];
> > @@ -552,6 +584,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
> >       opt.pathname = pathname;
> >       opt.dest_port = &port_nr;
> >       opt.access_mode = &access_mode;
> > +     opt.nr_poll_queues = &nr_poll_queues;
> >       addrs = kcalloc(ARRAY_SIZE(paths) * 2, sizeof(*addrs), GFP_KERNEL);
> >       if (!addrs)
> >               return -ENOMEM;
> > @@ -565,12 +598,13 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
> >       if (ret)
> >               goto out;
> >
> > -     pr_info("Mapping device %s on session %s, (access_mode: %s)\n",
> > +     pr_info("Mapping device %s on session %s, (access_mode: %s, nr_poll_queues: %d)\n",
> >               pathname, sessname,
> > -             rnbd_access_mode_str(access_mode));
> > +             rnbd_access_mode_str(access_mode),
> > +             nr_poll_queues);
> >
> >       dev = rnbd_clt_map_device(sessname, paths, path_cnt, port_nr, pathname,
> > -                               access_mode);
> > +                               access_mode, nr_poll_queues);
> >       if (IS_ERR(dev)) {
> >               ret = PTR_ERR(dev);
> >               goto out;
> > diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
> > index 9b44aac680d5..63719ec04d58 100644
> > --- a/drivers/block/rnbd/rnbd-clt.c
> > +++ b/drivers/block/rnbd/rnbd-clt.c
> > @@ -1165,9 +1165,54 @@ static blk_status_t rnbd_queue_rq(struct blk_mq_hw_ctx *hctx,
> >       return ret;
> >  }
> >
> > +static int rnbd_rdma_poll(struct blk_mq_hw_ctx *hctx)
> > +{
> > +     struct rnbd_queue *q = hctx->driver_data;
> > +     struct rnbd_clt_dev *dev = q->dev;
> > +     int cnt;
> > +
> > +     cnt = rtrs_clt_rdma_cq_direct(dev->sess->rtrs, hctx->queue_num);
> > +     return cnt;
> > +}
> > +
> > +static int rnbd_rdma_map_queues(struct blk_mq_tag_set *set)
> > +{
> > +     struct rnbd_clt_session *sess = set->driver_data;
> > +
> > +     /* shared read/write queues */
> > +     set->map[HCTX_TYPE_DEFAULT].nr_queues = num_online_cpus();
> > +     set->map[HCTX_TYPE_DEFAULT].queue_offset = 0;
> > +     set->map[HCTX_TYPE_READ].nr_queues = num_online_cpus();
> > +     set->map[HCTX_TYPE_READ].queue_offset = 0;
> > +     blk_mq_map_queues(&set->map[HCTX_TYPE_DEFAULT]);
> > +     blk_mq_map_queues(&set->map[HCTX_TYPE_READ]);
> > +
> > +     if (sess->nr_poll_queues) {
> > +             /* dedicated queue for poll */
> > +             set->map[HCTX_TYPE_POLL].nr_queues = sess->nr_poll_queues;
> > +             set->map[HCTX_TYPE_POLL].queue_offset = set->map[HCTX_TYPE_READ].queue_offset +
> > +                     set->map[HCTX_TYPE_READ].nr_queues;
> > +             blk_mq_map_queues(&set->map[HCTX_TYPE_POLL]);
> > +             pr_info("[session=%s] mapped %d/%d/%d default/read/poll queues.\n",
> > +                     sess->sessname,
> > +                     set->map[HCTX_TYPE_DEFAULT].nr_queues,
> > +                     set->map[HCTX_TYPE_READ].nr_queues,
> > +                     set->map[HCTX_TYPE_POLL].nr_queues);
> > +     } else {
> > +             pr_info("[session=%s] mapped %d/%d default/read queues.\n",
> > +                     sess->sessname,
> > +                     set->map[HCTX_TYPE_DEFAULT].nr_queues,
> > +                     set->map[HCTX_TYPE_READ].nr_queues);
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> >  static struct blk_mq_ops rnbd_mq_ops = {
> >       .queue_rq       = rnbd_queue_rq,
> >       .complete       = rnbd_softirq_done_fn,
> > +     .map_queues     = rnbd_rdma_map_queues,
> > +     .poll           = rnbd_rdma_poll,
> >  };
> >
> >  static int setup_mq_tags(struct rnbd_clt_session *sess)
> > @@ -1181,7 +1226,15 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
> >       tag_set->flags          = BLK_MQ_F_SHOULD_MERGE |
> >                                 BLK_MQ_F_TAG_QUEUE_SHARED;
> >       tag_set->cmd_size       = sizeof(struct rnbd_iu) + RNBD_RDMA_SGL_SIZE;
> > -     tag_set->nr_hw_queues   = num_online_cpus();
> > +
> > +     /* for HCTX_TYPE_DEFAULT, HCTX_TYPE_READ, HCTX_TYPE_POLL */
> > +     tag_set->nr_maps        = sess->nr_poll_queues ? HCTX_MAX_TYPES : 2;
> > +     /*
> > +      * HCTX_TYPE_DEFAULT and HCTX_TYPE_READ share one set of queues
> > +      * others are for HCTX_TYPE_POLL
> > +      */
> > +     tag_set->nr_hw_queues   = num_online_cpus() + sess->nr_poll_queues;
> > +     tag_set->driver_data    = sess;
> >
> >       return blk_mq_alloc_tag_set(tag_set);
> >  }
> > @@ -1189,7 +1242,7 @@ static int setup_mq_tags(struct rnbd_clt_session *sess)
> >  static struct rnbd_clt_session *
> >  find_and_get_or_create_sess(const char *sessname,
> >                           const struct rtrs_addr *paths,
> > -                         size_t path_cnt, u16 port_nr)
> > +                         size_t path_cnt, u16 port_nr, u32 nr_poll_queues)
> >  {
> >       struct rnbd_clt_session *sess;
> >       struct rtrs_attrs attrs;
> > @@ -1198,6 +1251,17 @@ find_and_get_or_create_sess(const char *sessname,
> >       struct rtrs_clt_ops rtrs_ops;
> >
> >       sess = find_or_create_sess(sessname, &first);
> > +     if (sess == ERR_PTR(-ENOMEM))
> > +             return ERR_PTR(-ENOMEM);
> > +     else if ((nr_poll_queues && !first) ||  (!nr_poll_queues && sess->nr_poll_queues)) {
> > +             /*
> > +              * A device MUST have its own session to use the polling-mode.
> > +              * It must fail to map new device with the same session.
> > +              */
> > +             err = -EINVAL;
> > +             goto put_sess;
> > +     }
> > +
> >       if (!first)
> >               return sess;
> >
> > @@ -1219,7 +1283,7 @@ find_and_get_or_create_sess(const char *sessname,
> >                                  0, /* Do not use pdu of rtrs */
> >                                  RECONNECT_DELAY, BMAX_SEGMENTS,
> >                                  BLK_MAX_SEGMENT_SIZE,
> > -                                MAX_RECONNECTS);
> > +                                MAX_RECONNECTS, nr_poll_queues);
> >       if (IS_ERR(sess->rtrs)) {
> >               err = PTR_ERR(sess->rtrs);
> >               goto wake_up_and_put;
> > @@ -1227,6 +1291,7 @@ find_and_get_or_create_sess(const char *sessname,
> >       rtrs_clt_query(sess->rtrs, &attrs);
> >       sess->max_io_size = attrs.max_io_size;
> >       sess->queue_depth = attrs.queue_depth;
> > +     sess->nr_poll_queues = nr_poll_queues;
> >
> >       err = setup_mq_tags(sess);
> >       if (err)
> > @@ -1370,7 +1435,8 @@ static int rnbd_client_setup_device(struct rnbd_clt_dev *dev)
> >
> >  static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
> >                                     enum rnbd_access_mode access_mode,
> > -                                   const char *pathname)
> > +                                   const char *pathname,
> > +                                   u32 nr_poll_queues)
> >  {
> >       struct rnbd_clt_dev *dev;
> >       int ret;
> > @@ -1379,7 +1445,8 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
> >       if (!dev)
> >               return ERR_PTR(-ENOMEM);
> >
> > -     dev->hw_queues = kcalloc(nr_cpu_ids, sizeof(*dev->hw_queues),
> > +     dev->hw_queues = kcalloc(nr_cpu_ids /* softirq */ + nr_poll_queues /* poll */,
>
> Please don't add comments in the middle of function call.

OK, I will fix it in the next round.
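
One way to address this comment is to compute the queue count in a local variable before the allocation, so the explanation sits in an ordinary comment rather than inside the argument list. The following is only a user-space sketch: calloc() stands in for kcalloc(), and `struct hw_queue` and `alloc_hw_queues()` are hypothetical names, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-queue state. */
struct hw_queue {
	int id;
};

/*
 * Allocate one queue per CPU for softirq completions plus
 * nr_poll_queues extra queues for polling. On success, *nr_out
 * holds the total number of queues.
 */
static struct hw_queue *alloc_hw_queues(size_t nr_cpus, size_t nr_poll_queues,
					size_t *nr_out)
{
	/* softirq queues first, then the poll queues */
	size_t nr_queues = nr_cpus + nr_poll_queues;
	struct hw_queue *queues;

	assert(nr_out != NULL);

	queues = calloc(nr_queues, sizeof(*queues));
	if (!queues)
		return NULL;

	*nr_out = nr_queues;
	return queues;
}
```

With 4 CPUs and 4 poll queues this allocates 8 zeroed entries, matching the "4 for irq-mode and 4 for poll-mode" example in the commit message.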


>
> > +                              sizeof(*dev->hw_queues),
> >                                GFP_KERNEL);
> >       if (!dev->hw_queues) {
> >               ret = -ENOMEM;
> > @@ -1405,6 +1472,7 @@ static struct rnbd_clt_dev *init_dev(struct rnbd_clt_session *sess,
> >       dev->clt_device_id      = ret;
> >       dev->sess               = sess;
> >       dev->access_mode        = access_mode;
> > +     dev->nr_poll_queues     = nr_poll_queues;
> >       mutex_init(&dev->lock);
> >       refcount_set(&dev->refcount, 1);
> >       dev->dev_state = DEV_STATE_INIT;
> > @@ -1491,7 +1559,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
> >                                          struct rtrs_addr *paths,
> >                                          size_t path_cnt, u16 port_nr,
> >                                          const char *pathname,
> > -                                        enum rnbd_access_mode access_mode)
> > +                                        enum rnbd_access_mode access_mode,
> > +                                        u32 nr_poll_queues)
> >  {
> >       struct rnbd_clt_session *sess;
> >       struct rnbd_clt_dev *dev;
> > @@ -1500,11 +1569,11 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
> >       if (unlikely(exists_devpath(pathname, sessname)))
> >               return ERR_PTR(-EEXIST);
> >
> > -     sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr);
> > +     sess = find_and_get_or_create_sess(sessname, paths, path_cnt, port_nr, nr_poll_queues);
> >       if (IS_ERR(sess))
> >               return ERR_CAST(sess);
> >
> > -     dev = init_dev(sess, access_mode, pathname);
> > +     dev = init_dev(sess, access_mode, pathname, nr_poll_queues);
> >       if (IS_ERR(dev)) {
> >               pr_err("map_device: failed to map device '%s' from session %s, can't initialize device, err: %ld\n",
> >                      pathname, sess->sessname, PTR_ERR(dev));
> > diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h
> > index 714d426b449b..451e7383738f 100644
> > --- a/drivers/block/rnbd/rnbd-clt.h
> > +++ b/drivers/block/rnbd/rnbd-clt.h
> > @@ -90,6 +90,7 @@ struct rnbd_clt_session {
> >       int                     queue_depth;
> >       u32                     max_io_size;
> >       struct blk_mq_tag_set   tag_set;
> > +     u32                     nr_poll_queues;
> >       struct mutex            lock; /* protects state and devs_list */
> >       struct list_head        devs_list; /* list of struct rnbd_clt_dev */
> >       refcount_t              refcount;
> > @@ -118,6 +119,7 @@ struct rnbd_clt_dev {
> >       enum rnbd_clt_dev_state dev_state;
> >       char                    *pathname;
> >       enum rnbd_access_mode   access_mode;
> > +     u32                     nr_poll_queues;
> >       bool                    read_only;
> >       bool                    rotational;
> >       bool                    wc;
> > @@ -147,7 +149,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
> >                                          struct rtrs_addr *paths,
> >                                          size_t path_cnt, u16 port_nr,
> >                                          const char *pathname,
> > -                                        enum rnbd_access_mode access_mode);
> > +                                        enum rnbd_access_mode access_mode,
> > +                                        u32 nr_poll_queues);
> >  int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
> >                          const struct attribute *sysfs_self);
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > index 7efd49bdc78c..467d135a82cf 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > @@ -174,7 +174,7 @@ struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
> >       int id = 0;
> >
> >       if (likely(permit->con_type == RTRS_IO_CON))
> > -             id = (permit->cpu_id % (sess->s.con_num - 1)) + 1;
> > +             id = (permit->cpu_id % (sess->s.irq_con_num - 1)) + 1;
> >
> >       return to_clt_con(sess->s.con[id]);
> >  }
> > @@ -1400,23 +1400,29 @@ static void rtrs_clt_close_work(struct work_struct *work);
> >  static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
> >                                        const struct rtrs_addr *path,
> >                                        size_t con_num, u16 max_segments,
> > -                                      size_t max_segment_size)
> > +                                      size_t max_segment_size, u32 nr_poll_queues)
> >  {
> >       struct rtrs_clt_sess *sess;
> >       int err = -ENOMEM;
> >       int cpu;
> > +     size_t total_con;
> >
> >       sess = kzalloc(sizeof(*sess), GFP_KERNEL);
> >       if (!sess)
> >               goto err;
> >
> > -     /* Extra connection for user messages */
> > -     con_num += 1;
> > -
> > -     sess->s.con = kcalloc(con_num, sizeof(*sess->s.con), GFP_KERNEL);
> > +     /*
> > +      * irqmode and poll
> > +      * +1: Extra connection for user messages
> > +      */
> > +     total_con = con_num + nr_poll_queues + 1;
> > +     sess->s.con = kcalloc(total_con, sizeof(*sess->s.con), GFP_KERNEL);
> >       if (!sess->s.con)
> >               goto err_free_sess;
> >
> > +     sess->s.con_num = total_con;
> > +     sess->s.irq_con_num = con_num + 1;
> > +
> >       sess->stats = kzalloc(sizeof(*sess->stats), GFP_KERNEL);
> >       if (!sess->stats)
> >               goto err_free_con;
> > @@ -1435,7 +1441,6 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
> >               memcpy(&sess->s.src_addr, path->src,
> >                      rdma_addr_size((struct sockaddr *)path->src));
> >       strlcpy(sess->s.sessname, clt->sessname, sizeof(sess->s.sessname));
> > -     sess->s.con_num = con_num;
> >       sess->clt = clt;
> >       sess->max_pages_per_mr = max_segments * max_segment_size >> 12;
> >       init_waitqueue_head(&sess->state_wq);
> > @@ -1576,9 +1581,14 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
> >       }
> >       cq_size = max_send_wr + max_recv_wr;
> >       cq_vector = con->cpu % sess->s.dev->ib_dev->num_comp_vectors;
> > -     err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> > -                              cq_vector, cq_size, max_send_wr,
> > -                              max_recv_wr, IB_POLL_SOFTIRQ);
> > +     if (con->c.cid >= sess->s.irq_con_num)
> > +             err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> > +                                     cq_vector, cq_size, max_send_wr,
> > +                                     max_recv_wr, IB_POLL_DIRECT);
> > +     else
> > +             err = rtrs_cq_qp_create(&sess->s, &con->c, sess->max_send_sge,
> > +                                     cq_vector, cq_size, max_send_wr,
> > +                                     max_recv_wr, IB_POLL_SOFTIRQ);
> >       /*
> >        * In case of error we do not bother to clean previous allocations,
> >        * since destroy_con_cq_qp() must be called.
> > @@ -2631,6 +2641,7 @@ static void free_clt(struct rtrs_clt *clt)
> >   * @max_segment_size: Max. size of one segment
> >   * @max_reconnect_attempts: Number of times to reconnect on error before giving
> >   *                       up, 0 for * disabled, -1 for forever
> > + * @nr_poll_queues: number of polling mode connection using IB_POLL_DIRECT flag
> >   *
> >   * Starts session establishment with the rtrs_server. The function can block
> >   * up to ~2000ms before it returns.
> > @@ -2644,7 +2655,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
> >                                size_t pdu_sz, u8 reconnect_delay_sec,
> >                                u16 max_segments,
> >                                size_t max_segment_size,
> > -                              s16 max_reconnect_attempts)
> > +                              s16 max_reconnect_attempts, u32 nr_poll_queues)
> >  {
> >       struct rtrs_clt_sess *sess, *tmp;
> >       struct rtrs_clt *clt;
> > @@ -2662,7 +2673,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
> >               struct rtrs_clt_sess *sess;
> >
> >               sess = alloc_sess(clt, &paths[i], nr_cpu_ids,
> > -                               max_segments, max_segment_size);
> > +                               max_segments, max_segment_size, nr_poll_queues);
> >               if (IS_ERR(sess)) {
> >                       err = PTR_ERR(sess);
> >                       goto close_all_sess;
> > @@ -2887,6 +2898,31 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
> >  }
> >  EXPORT_SYMBOL(rtrs_clt_request);
> >
> > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> > +{
> > +     int cnt;
> > +     struct rtrs_con *con;
> > +     struct rtrs_clt_sess *sess;
> > +     struct path_it it;
> > +
> > +     rcu_read_lock();
> > +     for (path_it_init(&it, clt);
> > +          (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> > +             if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
>
> We talked about useless likely/unlikely in your workloads.

Right, I've made a patch to remove all likely/unlikely
and will send with the next patch set.

I thought it would be easier for review to keep the patches in this set
as they are. So once this set is applied, I will send a small follow-up
patch set to remove likely/unlikely and do some cleanup.

>
> > +                     continue;
> > +
> > +             con = sess->s.con[index + 1];
> > +             cnt = ib_process_cq_direct(con->cq, -1);
> > +             if (likely(cnt))
> > +                     break;
> > +     }
> > +     path_it_deinit(&it);
> > +     rcu_read_unlock();
> > +
> > +     return cnt;
> > +}
> > +EXPORT_SYMBOL(rtrs_clt_rdma_cq_direct);
> > +
> >  /**
> >   * rtrs_clt_query() - queries RTRS session attributes
> >   *@clt: session pointer
> > @@ -2916,7 +2952,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
> >       int err;
> >
> >       sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments,
> > -                       clt->max_segment_size);
> > +                       clt->max_segment_size, 0);
> >       if (IS_ERR(sess))
> >               return PTR_ERR(sess);
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > index 8caad0a2322b..00eb45053339 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
> > @@ -101,6 +101,7 @@ struct rtrs_sess {
> >       uuid_t                  uuid;
> >       struct rtrs_con **con;
> >       unsigned int            con_num;
> > +     unsigned int            irq_con_num;
> >       unsigned int            recon_cnt;
> >       struct rtrs_ib_dev      *dev;
> >       int                     dev_ref;
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
> > index 2db1b5eb3ab0..f891fbe7abe6 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs.h
> > @@ -59,7 +59,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
> >                                size_t pdu_sz, u8 reconnect_delay_sec,
> >                                u16 max_segments,
> >                                size_t max_segment_size,
> > -                              s16 max_reconnect_attempts);
> > +                              s16 max_reconnect_attempts, u32 nr_poll_queues);
> >
> >  void rtrs_clt_close(struct rtrs_clt *sess);
> >
> > @@ -103,6 +103,7 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
> >                    struct rtrs_clt *sess, struct rtrs_permit *permit,
> >                    const struct kvec *vec, size_t nr, size_t len,
> >                    struct scatterlist *sg, unsigned int sg_cnt);
> > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index);
> >
> >  /**
> >   * rtrs_attrs - RTRS session attributes
> > --
> > 2.25.1
> >

Thank you for the review.
I will send V5 soon.
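
For readers following the quoted alloc_sess(), create_con_cq_qp() and rtrs_permit_to_clt_con() hunks, the connection accounting can be sketched in user space as below. This is an illustration under assumptions, not the driver code: `struct con_layout` and the three helpers are hypothetical names, and only the arithmetic is taken from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the two counters the patch keeps per session. */
struct con_layout {
	size_t con_num;     /* all connections: user + irq-mode + poll-mode */
	size_t irq_con_num; /* user connection + irq-mode IO connections */
};

static struct con_layout make_layout(size_t nr_cpus, size_t nr_poll_queues)
{
	struct con_layout l;

	/* +1: extra connection for user messages, stored at index 0 */
	l.irq_con_num = nr_cpus + 1;
	l.con_num = l.irq_con_num + nr_poll_queues;
	return l;
}

/* Connections at or beyond irq_con_num get a direct-poll CQ. */
static bool con_is_polled(const struct con_layout *l, size_t cid)
{
	assert(cid < l->con_num);
	return cid >= l->irq_con_num;
}

/* Map a CPU to an irq-mode IO connection, skipping con[0]. */
static size_t cpu_to_irq_con(const struct con_layout *l, size_t cpu_id)
{
	return (cpu_id % (l->irq_con_num - 1)) + 1;
}
```

For 4 CPUs and 4 poll queues this gives 9 connections in total, with con[0] for user messages, con[1] to con[4] on softirq CQs, and con[5] to con[8] on direct-poll CQs. Taking the modulo over irq_con_num - 1 rather than con_num - 1 keeps irq-mode IO from landing on a polled connection.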

* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-19  5:12     ` Gioh Kim
@ 2021-04-19  5:20       ` Leon Romanovsky
  2021-04-19  5:51         ` Gioh Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Leon Romanovsky @ 2021-04-19  5:20 UTC (permalink / raw)
  To: Gioh Kim
  Cc: linux-block, Jens Axboe, hch, sagi, Bart Van Assche, Haris Iqbal,
	Jinpu Wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Mon, Apr 19, 2021 at 07:12:09AM +0200, Gioh Kim wrote:
> On Sun, Apr 18, 2021 at 10:36 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> > > From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > >
> > > RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
> > > For example, on a 4-CPU system 8 request queues are created,
> > > 4 for irq-mode and 4 for poll-mode.
> > > If the IO has the HIPRI flag, the block layer calls the .poll function
> > > of RNBD, and the IO is sent to a poll-mode queue.
> > > Add an optional nr_poll_queues argument to the map_device interface.
> > >
> > > To support polling in RNBD, the RTRS client creates connections
> > > for both irq-mode and direct-poll-mode.
> > >
> > > For example, on a 4-CPU system it previously created 5 connections:
> > > con[0] => user message (softirq cq)
> > > con[1:4] => softirq cq
> > >
> > > After this patch, it can create 9 connections:
> > > con[0] => user message (softirq cq)
> > > con[1:4] => softirq cq
> > > con[5:8] => DIRECT-POLL cq

<...>

> > > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> > > +{
> > > +     int cnt;
> > > +     struct rtrs_con *con;
> > > +     struct rtrs_clt_sess *sess;
> > > +     struct path_it it;
> > > +
> > > +     rcu_read_lock();
> > > +     for (path_it_init(&it, clt);
> > > +          (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> > > +             if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
> >
> > We talked about useless likely/unlikely in your workloads.
> 
> Right, I've made a patch to remove all likely/unlikely
> and will send with the next patch set.

This specific line is "brand new". We don't add code that will be
removed in next patch.

Thanks
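
To make the requested change concrete, the polling loop can be written with plain conditions and no branch hints. The sketch below is a user-space mock, not the RTRS code: `struct path`, `process_cq()` and `poll_paths()` are hypothetical, and process_cq() merely stands in for ib_process_cq_direct(). The quoted hunk declares cnt without an initial value; the mock initializes it to 0 so that the case where no path is connected returns a defined value. That is an assumption about the intended behavior and was not raised in this thread.

```c
#include <assert.h>
#include <stddef.h>

enum path_state { PATH_CONNECTING, PATH_CONNECTED };

/* Hypothetical stand-in for one path and its direct-poll CQ. */
struct path {
	enum path_state state;
	int pending; /* completions waiting in the polled CQ */
};

/* Drain every pending completion; stands in for ib_process_cq_direct(). */
static int process_cq(struct path *p)
{
	int done = p->pending;

	p->pending = 0;
	return done;
}

/*
 * Poll connected paths in order and stop at the first one that
 * produced completions. Returns the number processed, 0 if none.
 */
static int poll_paths(struct path *paths, size_t nr_paths)
{
	int cnt = 0; /* stays 0 if no path is connected */
	size_t i;

	assert(paths != NULL || nr_paths == 0);

	for (i = 0; i < nr_paths; i++) {
		if (paths[i].state != PATH_CONNECTED)
			continue;

		cnt = process_cq(&paths[i]);
		if (cnt)
			break;
	}
	return cnt;
}
```

The control flow is the same as in the quoted function; likely() and unlikely() only give the compiler a layout hint, so removing them does not change what the loop returns.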

* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-19  5:20       ` Leon Romanovsky
@ 2021-04-19  5:51         ` Gioh Kim
  2021-04-19  6:09           ` Leon Romanovsky
  0 siblings, 1 reply; 26+ messages in thread
From: Gioh Kim @ 2021-04-19  5:51 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-block, Jens Axboe, hch, sagi, Bart Van Assche, Haris Iqbal,
	Jinpu Wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Mon, Apr 19, 2021 at 7:46 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Apr 19, 2021 at 07:12:09AM +0200, Gioh Kim wrote:
> > On Sun, Apr 18, 2021 at 10:36 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> > > > From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > >
> > > > RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
> > > > For example, on a 4-CPU system 8 request queues are created,
> > > > 4 for irq-mode and 4 for poll-mode.
> > > > If the IO has the HIPRI flag, the block layer calls the .poll function
> > > > of RNBD, and the IO is sent to a poll-mode queue.
> > > > Add an optional nr_poll_queues argument to the map_device interface.
> > > >
> > > > To support polling in RNBD, the RTRS client creates connections
> > > > for both irq-mode and direct-poll-mode.
> > > >
> > > > For example, on a 4-CPU system it previously created 5 connections:
> > > > con[0] => user message (softirq cq)
> > > > con[1:4] => softirq cq
> > > >
> > > > After this patch, it can create 9 connections:
> > > > con[0] => user message (softirq cq)
> > > > con[1:4] => softirq cq
> > > > con[5:8] => DIRECT-POLL cq
>
> <...>

I am sorry that I don't understand exactly.
Do I need to change them to "con<5..8>"?


>
> > > > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> > > > +{
> > > > +     int cnt;
> > > > +     struct rtrs_con *con;
> > > > +     struct rtrs_clt_sess *sess;
> > > > +     struct path_it it;
> > > > +
> > > > +     rcu_read_lock();
> > > > +     for (path_it_init(&it, clt);
> > > > +          (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> > > > +             if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
> > >
> > > We talked about useless likely/unlikely in your workloads.
> >
> > Right, I've made a patch to remove all likely/unlikely
> > and will send with the next patch set.
>
> This specific line is "brand new". We don't add code that will be
> removed in next patch.

Ah, ok. So you mean,
1. remove unlikely from that line
2. send a patch to remove all likely/unlikely for next round

Am I right?

* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-19  5:51         ` Gioh Kim
@ 2021-04-19  6:09           ` Leon Romanovsky
  2021-04-19  6:15             ` Gioh Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Leon Romanovsky @ 2021-04-19  6:09 UTC (permalink / raw)
  To: Gioh Kim
  Cc: linux-block, Jens Axboe, hch, sagi, Bart Van Assche, Haris Iqbal,
	Jinpu Wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Mon, Apr 19, 2021 at 07:51:34AM +0200, Gioh Kim wrote:
> On Mon, Apr 19, 2021 at 7:46 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Mon, Apr 19, 2021 at 07:12:09AM +0200, Gioh Kim wrote:
> > > On Sun, Apr 18, 2021 at 10:36 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> > > > > From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > >
> > > > > RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
> > > > > For example, on a 4-CPU system 8 request queues are created,
> > > > > 4 for irq-mode and 4 for poll-mode.
> > > > > If the IO has the HIPRI flag, the block layer calls the .poll function
> > > > > of RNBD, and the IO is sent to a poll-mode queue.
> > > > > Add an optional nr_poll_queues argument to the map_device interface.
> > > > >
> > > > > To support polling in RNBD, the RTRS client creates connections
> > > > > for both irq-mode and direct-poll-mode.
> > > > >
> > > > > For example, on a 4-CPU system it previously created 5 connections:
> > > > > con[0] => user message (softirq cq)
> > > > > con[1:4] => softirq cq
> > > > >
> > > > > After this patch, it can create 9 connections:
> > > > > con[0] => user message (softirq cq)
> > > > > con[1:4] => softirq cq
> > > > > con[5:8] => DIRECT-POLL cq
> >
> > <...>
> 
> I am sorry that I don't understand exactly.
> Do I need to change them to "con<5..8>"?

No, I just removed the irrelevant text and automatically replaced it
with <...> :).

> 
> 
> >
> > > > > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> > > > > +{
> > > > > +     int cnt;
> > > > > +     struct rtrs_con *con;
> > > > > +     struct rtrs_clt_sess *sess;
> > > > > +     struct path_it it;
> > > > > +
> > > > > +     rcu_read_lock();
> > > > > +     for (path_it_init(&it, clt);
> > > > > +          (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> > > > > +             if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
> > > >
> > > > We talked about useless likely/unlikely in your workloads.
> > >
> > > Right, I've made a patch to remove all likely/unlikely
> > > and will send with the next patch set.
> >
> > This specific line is "brand new". We don't add code that will be
> > removed in next patch.
> 
> Ah, ok. So you mean,
> 1. remove unlikely from that line
> 2. send a patch to remove all likely/unlikely for next round
> 
> Am I right?

Right

* Re: [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization
  2021-04-19  6:09           ` Leon Romanovsky
@ 2021-04-19  6:15             ` Gioh Kim
  0 siblings, 0 replies; 26+ messages in thread
From: Gioh Kim @ 2021-04-19  6:15 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-block, Jens Axboe, hch, sagi, Bart Van Assche, Haris Iqbal,
	Jinpu Wang, Gioh Kim, linux-rdma, Jason Gunthorpe

On Mon, Apr 19, 2021 at 8:09 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Apr 19, 2021 at 07:51:34AM +0200, Gioh Kim wrote:
> > On Mon, Apr 19, 2021 at 7:46 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Mon, Apr 19, 2021 at 07:12:09AM +0200, Gioh Kim wrote:
> > > > On Sun, Apr 18, 2021 at 10:36 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Wed, Apr 14, 2021 at 02:23:56PM +0200, Gioh Kim wrote:
> > > > > > From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > > >
> > > > > > RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
> > > > > > For example, on a 4-CPU system 8 request queues are created,
> > > > > > 4 for irq-mode and 4 for poll-mode.
> > > > > > If the IO has the HIPRI flag, the block layer calls the .poll function
> > > > > > of RNBD, and the IO is sent to a poll-mode queue.
> > > > > > Add an optional nr_poll_queues argument to the map_device interface.
> > > > > >
> > > > > > To support polling in RNBD, the RTRS client creates connections
> > > > > > for both irq-mode and direct-poll-mode.
> > > > > >
> > > > > > For example, on a 4-CPU system it previously created 5 connections:
> > > > > > con[0] => user message (softirq cq)
> > > > > > con[1:4] => softirq cq
> > > > > >
> > > > > > After this patch, it can create 9 connections:
> > > > > > con[0] => user message (softirq cq)
> > > > > > con[1:4] => softirq cq
> > > > > > con[5:8] => DIRECT-POLL cq
> > >
> > > <...>
> >
> > I am sorry that I don't understand exactly.
> > Do I need to change them to "con<5..8>"?
>
> No, I just removed the irrelevant text and automatically replaced it
> with <...> :).

Oh ;-)

>
> >
> >
> > >
> > > > > > +int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index)
> > > > > > +{
> > > > > > +     int cnt;
> > > > > > +     struct rtrs_con *con;
> > > > > > +     struct rtrs_clt_sess *sess;
> > > > > > +     struct path_it it;
> > > > > > +
> > > > > > +     rcu_read_lock();
> > > > > > +     for (path_it_init(&it, clt);
> > > > > > +          (sess = it.next_path(&it)) && it.i < it.clt->paths_num; it.i++) {
> > > > > > +             if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
> > > > >
> > > > > We talked about useless likely/unlikely in your workloads.
> > > >
> > > > Right, I've made a patch to remove all likely/unlikely
> > > > and will send with the next patch set.
> > >
> > > This specific line is "brand new". We don't add code that will be
> > > removed in next patch.
> >
> > Ah, ok. So you mean,
> > 1. remove unlikely from that line
> > 2. send a patch to remove all likely/unlikely for next round
> >
> > Am I right?
>
> Right

Thank you very much.
I will send V5 soon.

Thread overview: 26+ messages
2021-04-14 12:23 [PATCHv4 for-next 00/19] Misc update for rnbd Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 01/19] MAINTAINERS: Change maintainer for rnbd module Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 02/19] Documentation/sysfs-block-rnbd: Add descriptions for remap_device and resize Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 03/19] block/rnbd-clt: Remove some arguments from insert_dev_if_not_exists_devpath Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 04/19] block/rnbd-clt: Remove some arguments from rnbd_client_setup_device Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 05/19] block/rnbd-clt: Move add_disk(dev->gd) to rnbd_clt_setup_gen_disk Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 06/19] block/rnbd: Kill rnbd_clt_destroy_default_group Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 07/19] block/rnbd: Kill destroy_device_cb Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 08/19] block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT} Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 09/19] block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 10/19] block/rnbd-srv: Remove force_close file after holding a lock Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 11/19] block/rnbd-clt: Improve find_or_create_sess() return check Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 12/19] block/rnbd-clt: Fix missing a memory free when unloading the module Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 13/19] block/rnbd-clt: Support polling mode for IO latency optimization Gioh Kim
2021-04-18  8:36   ` Leon Romanovsky
2021-04-19  5:12     ` Gioh Kim
2021-04-19  5:20       ` Leon Romanovsky
2021-04-19  5:51         ` Gioh Kim
2021-04-19  6:09           ` Leon Romanovsky
2021-04-19  6:15             ` Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 14/19] Documentation/ABI/rnbd-clt: Add description for nr_poll_queues Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 15/19] block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Gioh Kim
2021-04-14 12:23 ` [PATCHv4 for-next 16/19] block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes Gioh Kim
2021-04-14 12:24 ` [PATCHv4 for-next 17/19] block/rnbd-clt: Remove max_segment_size Gioh Kim
2021-04-14 12:24 ` [PATCHv4 for-next 18/19] block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name Gioh Kim
2021-04-14 12:24 ` [PATCHv4 for-next 19/19] block/rnbd: Use strscpy instead of strlcpy Gioh Kim
