linux-rdma.vger.kernel.org archive mirror
* [PATCH 0/4] Enable Fault Injection for RTRS
@ 2021-04-06 11:50 Gioh Kim
  2021-04-06 11:50 ` [PATCH 1/4] RDMA/rtrs: Enable the fault-injection Gioh Kim
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 11:50 UTC (permalink / raw)
  To: linux-rdma, linux-doc
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang,
	akinobu.mita, corbet, Gioh Kim

My colleagues and I would like to use the Linux fault-injection
framework to test the error handling of the RTRS module. RTRS
consists of client and server modules that are connected via an
InfiniBand network, so it is important that the client receives
errors from the server and handles them gracefully.

When CONFIG_FAULT_INJECTION_DEBUG_FS is enabled, RTRS exports debugfs
interfaces to inject failures into the RTRS client and server.
The following fault injection points are added:
- fail request processing on the RTRS client side
- fail a heart-beat transfer (the heart-beat ACK) on the RTRS server side
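
For illustration, arming one of these injection points from a shell
might look like the sketch below. The helper name rtrs_arm_fault and the
DEBUGFS_ROOT override are hypothetical and not part of this series; the
file names (probability, times, fail-request, fail-hb-ack) are the ones
the patches create under <session-name>/fault_inject.

```shell
# Hypothetical helper, not part of the patch set.
# DEBUGFS_ROOT can be overridden, e.g. for dry runs against a scratch directory.
DEBUGFS_ROOT="${DEBUGFS_ROOT:-/sys/kernel/debug}"

rtrs_arm_fault() {
	session="$1"    # e.g. "ip:192.168.123.144@ip:192.168.123.190"
	point="$2"      # "fail-request" (client) or "fail-hb-ack" (server)
	times="${3:-1}" # how many failures to inject, defaults to 1
	dir="$DEBUGFS_ROOT/$session/fault_inject"

	# The directory only exists once a session has been established.
	[ -d "$dir" ] || { echo "no such session: $session" >&2; return 1; }

	# Fail every eligible call, at most $times times, for the chosen point.
	echo 100 > "$dir/probability" &&
	echo "$times" > "$dir/times" &&
	echo 1 > "$dir/$point"
}
```

With this sketch, running
rtrs_arm_fault 'ip:192.168.123.144@ip:192.168.123.190' fail-request 1
should make the next request on that client session fail once.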

This patch set is just a starting point. We will enable various
faults and test as many error cases as possible.

Best regards

Gioh Kim (4):
  RDMA/rtrs: Enable the fault-injection
  RDMA/rtrs-clt: Inject a fault at request processing
  RDMA/rtrs-srv: Inject a fault at heart-beat sending
  docs: fault-injection: Add fault-injection manual of RTRS

 .../fault-injection/rtrs-fault-injection.rst  | 83 +++++++++++++++++++
 drivers/infiniband/ulp/rtrs/Makefile          |  2 +
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c  | 44 ++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-clt.c        |  7 ++
 drivers/infiniband/ulp/rtrs/rtrs-clt.h        | 13 +++
 drivers/infiniband/ulp/rtrs/rtrs-fault.c      | 52 ++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-fault.h      | 28 +++++++
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c  | 44 ++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-srv.c        |  5 ++
 drivers/infiniband/ulp/rtrs/rtrs-srv.h        | 13 +++
 10 files changed, 291 insertions(+)
 create mode 100644 Documentation/fault-injection/rtrs-fault-injection.rst
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-fault.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-fault.h

-- 
2.25.1



* [PATCH 1/4] RDMA/rtrs: Enable the fault-injection
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
@ 2021-04-06 11:50 ` Gioh Kim
  2021-04-06 11:50 ` [PATCH 2/4] RDMA/rtrs-clt: Inject a fault at request processing Gioh Kim
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 11:50 UTC (permalink / raw)
  To: linux-rdma, linux-doc
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang,
	akinobu.mita, corbet, Gioh Kim, Jack Wang

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

This patch introduces helper functions to enable fault injection
for RTRS.
* rtrs_fault_inject_init() initializes the fault attributes and
creates a debugfs directory; rtrs_fault_inject_final() removes it.
* rtrs_fault_inject_add() creates a debugfs entry that enables
a fault-injection point.

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/Makefile     |  2 +
 drivers/infiniband/ulp/rtrs/rtrs-fault.c | 52 ++++++++++++++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-fault.h | 28 +++++++++++++
 3 files changed, 82 insertions(+)
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-fault.c
 create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-fault.h

diff --git a/drivers/infiniband/ulp/rtrs/Makefile b/drivers/infiniband/ulp/rtrs/Makefile
index 3898509be270..3490f66bb7f2 100644
--- a/drivers/infiniband/ulp/rtrs/Makefile
+++ b/drivers/infiniband/ulp/rtrs/Makefile
@@ -3,10 +3,12 @@
 rtrs-client-y := rtrs-clt.o \
 		  rtrs-clt-stats.o \
 		  rtrs-clt-sysfs.o
+rtrs-client-$(CONFIG_FAULT_INJECTION_DEBUG_FS) += rtrs-fault.o
 
 rtrs-server-y := rtrs-srv.o \
 		  rtrs-srv-stats.o \
 		  rtrs-srv-sysfs.o
+rtrs-server-$(CONFIG_FAULT_INJECTION_DEBUG_FS) += rtrs-fault.o
 
 rtrs-core-y := rtrs.o
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-fault.c b/drivers/infiniband/ulp/rtrs/rtrs-fault.c
new file mode 100644
index 000000000000..af475c814c29
--- /dev/null
+++ b/drivers/infiniband/ulp/rtrs/rtrs-fault.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * RDMA Transport Layer
+ *
+ * Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
+ * Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
+ * Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
+ */
+
+#include "rtrs-fault.h"
+
+static DECLARE_FAULT_ATTR(fail_default_attr);
+
+void rtrs_fault_inject_init(struct rtrs_fault_inject *fj,
+			    const char *dir_name,
+			    u32 err_status)
+{
+	struct dentry *dir, *parent;
+	struct fault_attr *attr = &fj->attr;
+
+	/* create debugfs directory and attribute */
+	parent = debugfs_create_dir(dir_name, NULL);
+	if (IS_ERR(parent)) {
+		pr_warn("%s: failed to create debugfs directory\n", dir_name);
+		return;
+	}
+
+	*attr = fail_default_attr;
+	dir = fault_create_debugfs_attr("fault_inject", parent, attr);
+	if (IS_ERR(dir)) {
+		pr_warn("%s: failed to create debugfs attr\n", dir_name);
+		debugfs_remove_recursive(parent);
+		return;
+	}
+	fj->parent = parent;
+	fj->dir = dir;
+
+	/* create debugfs for status code */
+	fj->status = err_status;
+	debugfs_create_u32("status", 0600, dir, &fj->status);
+}
+
+void rtrs_fault_inject_final(struct rtrs_fault_inject *fj)
+{
+	/* remove debugfs directories */
+	debugfs_remove_recursive(fj->parent);
+}
+
+void rtrs_fault_inject_add(struct dentry *dir, const char *fname, bool *value)
+{
+	debugfs_create_bool(fname, 0600, dir, value);
+}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-fault.h b/drivers/infiniband/ulp/rtrs/rtrs-fault.h
new file mode 100644
index 000000000000..8c1acffb2b16
--- /dev/null
+++ b/drivers/infiniband/ulp/rtrs/rtrs-fault.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * RDMA Transport Layer
+ *
+ * Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
+ * Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
+ * Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
+ */
+
+#ifndef RTRS_FAULT_H
+#define RTRS_FAULT_H
+
+#include <linux/fault-inject.h>
+
+struct rtrs_fault_inject {
+#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
+	struct fault_attr attr;
+	struct dentry *parent;
+	struct dentry *dir;
+	u32 status;
+#endif
+};
+
+void rtrs_fault_inject_init(struct rtrs_fault_inject *fj,
+			    const char *dir_name, u32 err_status);
+void rtrs_fault_inject_add(struct dentry *dir, const char *fname, bool *value);
+void rtrs_fault_inject_final(struct rtrs_fault_inject *fj);
+#endif /* RTRS_FAULT_H */
-- 
2.25.1



* [PATCH 2/4] RDMA/rtrs-clt: Inject a fault at request processing
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
  2021-04-06 11:50 ` [PATCH 1/4] RDMA/rtrs: Enable the fault-injection Gioh Kim
@ 2021-04-06 11:50 ` Gioh Kim
  2021-04-06 11:50 ` [PATCH 3/4] RDMA/rtrs-srv: Inject a fault at heart-beat sending Gioh Kim
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 11:50 UTC (permalink / raw)
  To: linux-rdma, linux-doc
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang,
	akinobu.mita, corbet, Gioh Kim, Jack Wang

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

If fault injection is enabled, the client does not send the request
to the server and returns an error.

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 44 ++++++++++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-clt.c       |  7 ++++
 drivers/infiniband/ulp/rtrs/rtrs-clt.h       | 13 ++++++
 3 files changed, 64 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
index eb92ec13cb57..c502dcbae9bb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
@@ -479,3 +479,47 @@ void rtrs_clt_destroy_sysfs_root(struct rtrs_clt *clt)
 		kobject_put(clt->kobj_paths);
 	}
 }
+
+#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
+void rtrs_clt_fault_inject_init(struct rtrs_clt_fault_inject *fault_inject,
+				struct rtrs_clt_sess *sess)
+{
+	char str[NAME_MAX];
+	int cnt;
+
+	cnt = sockaddr_to_str((struct sockaddr *)&sess->s.src_addr,
+			      str, sizeof(str));
+	cnt += scnprintf(str + cnt, sizeof(str) - cnt, "@");
+	sockaddr_to_str((struct sockaddr *)&sess->s.dst_addr,
+			str + cnt, sizeof(str) - cnt);
+
+	rtrs_fault_inject_init(&fault_inject->fj, str, -EBUSY);
+	/* injection points */
+	rtrs_fault_inject_add(fault_inject->fj.dir,
+			      "fail-request", &fault_inject->fail_request);
+}
+
+void rtrs_clt_fault_inject_final(struct rtrs_clt_fault_inject *fault_inject)
+{
+	rtrs_fault_inject_final(&fault_inject->fj);
+}
+
+int rtrs_clt_should_fail_request(struct rtrs_clt_fault_inject *fault_inject)
+{
+	if (fault_inject->fail_request && should_fail(&fault_inject->fj.attr, 1))
+		return fault_inject->fj.status;
+	return 0;
+}
+#else
+void rtrs_clt_fault_inject_init(struct rtrs_clt_fault_inject *fault_inject,
+				struct rtrs_clt_sess *sess)
+{
+}
+void rtrs_clt_fault_inject_final(struct rtrs_clt_fault_inject *fault_inject)
+{
+}
+int rtrs_clt_should_fail_request(struct rtrs_clt_fault_inject *fault_inject)
+{
+	return 0;
+}
+#endif
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 64990df81937..5062328ac577 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1469,6 +1469,7 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 
 void free_sess(struct rtrs_clt_sess *sess)
 {
+	rtrs_clt_fault_inject_final(&sess->fault_inject);
 	free_percpu(sess->mp_skip_entry);
 	mutex_destroy(&sess->init_mutex);
 	kfree(sess->s.con);
@@ -2686,6 +2687,8 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 			free_sess(sess);
 			goto close_all_sess;
 		}
+
+		rtrs_clt_fault_inject_init(&sess->fault_inject, sess);
 	}
 	err = alloc_permits(clt);
 	if (err)
@@ -2858,6 +2861,10 @@ int rtrs_clt_request(int dir, struct rtrs_clt_req_ops *ops,
 		if (unlikely(READ_ONCE(sess->state) != RTRS_CLT_CONNECTED))
 			continue;
 
+		err = rtrs_clt_should_fail_request(&sess->fault_inject);
+		if (unlikely(err))
+			continue;
+
 		if (unlikely(usr_len + hdr_len > sess->max_hdr_size)) {
 			rtrs_wrn_rl(sess->clt,
 				     "%s request failed, user message size is %zu and header length %zu, but max size is %u\n",
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 692bc83e1f09..59ea2ec44fe5 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -12,6 +12,7 @@
 
 #include <linux/device.h>
 #include "rtrs-pri.h"
+#include "rtrs-fault.h"
 
 /**
  * enum rtrs_clt_state - Client states.
@@ -122,6 +123,13 @@ struct rtrs_rbuf {
 	u32 rkey;
 };
 
+struct rtrs_clt_fault_inject {
+#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
+	struct rtrs_fault_inject fj;
+	bool fail_request;
+#endif
+};
+
 struct rtrs_clt_sess {
 	struct rtrs_sess	s;
 	struct rtrs_clt	*clt;
@@ -150,6 +158,7 @@ struct rtrs_clt_sess {
 	char                    hca_name[IB_DEVICE_NAME_MAX];
 	struct list_head __percpu
 				*mp_skip_entry;
+	struct rtrs_clt_fault_inject	fault_inject;
 };
 
 struct rtrs_clt {
@@ -250,4 +259,8 @@ int rtrs_clt_create_sess_files(struct rtrs_clt_sess *sess);
 void rtrs_clt_destroy_sess_files(struct rtrs_clt_sess *sess,
 				  const struct attribute *sysfs_self);
 
+void rtrs_clt_fault_inject_init(struct rtrs_clt_fault_inject *fault_inject,
+				struct rtrs_clt_sess *sess);
+void rtrs_clt_fault_inject_final(struct rtrs_clt_fault_inject *fault_inject);
+int rtrs_clt_should_fail_request(struct rtrs_clt_fault_inject *fault_inject);
 #endif /* RTRS_CLT_H */
-- 
2.25.1



* [PATCH 3/4] RDMA/rtrs-srv: Inject a fault at heart-beat sending
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
  2021-04-06 11:50 ` [PATCH 1/4] RDMA/rtrs: Enable the fault-injection Gioh Kim
  2021-04-06 11:50 ` [PATCH 2/4] RDMA/rtrs-clt: Inject a fault at request processing Gioh Kim
@ 2021-04-06 11:50 ` Gioh Kim
  2021-04-06 11:50 ` [PATCH 4/4] docs: fault-injection: Add fault-injection manual of RTRS Gioh Kim
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 11:50 UTC (permalink / raw)
  To: linux-rdma, linux-doc
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang,
	akinobu.mita, corbet, Gioh Kim, Jack Wang

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

If fault injection is enabled, the server does not send the heart-beat
ACK, which triggers an error on the client side.
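
One way to observe the effect from the client is to read the
reconnects counter of the affected path. This is a sketch only: the
helper name rtrs_reconnects and the SYSFS_ROOT override are
hypothetical, and it assumes that the first field of the reconnects
file counts successful reconnects.

```shell
# Hypothetical helper, not part of the patch set.
# SYSFS_ROOT can be overridden, e.g. for dry runs against a scratch directory.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/devices/virtual/rtrs-client}"

rtrs_reconnects() {
	# $1: session name, $2: path name ("ip:<src>@ip:<dst>")
	f="$SYSFS_ROOT/$1/paths/$2/stats/reconnects"

	[ -r "$f" ] || return 1
	# The file holds two numbers; print only the first one.
	read -r ok _failed < "$f" && echo "$ok"
}
```

Comparing the value before and after setting fail-hb-ack on the server
shows whether the client reconnected.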

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 44 ++++++++++++++++++++
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       |  5 +++
 drivers/infiniband/ulp/rtrs/rtrs-srv.h       | 13 ++++++
 3 files changed, 62 insertions(+)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index a9288175fbb5..57af9e7c3588 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -309,3 +309,47 @@ void rtrs_srv_destroy_sess_files(struct rtrs_srv_sess *sess)
 		rtrs_srv_destroy_once_sysfs_root_folders(sess);
 	}
 }
+
+#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
+void rtrs_srv_fault_inject_init(struct rtrs_srv_fault_inject *fault_inject,
+				struct rtrs_srv_sess *sess)
+{
+	char str[NAME_MAX];
+	int cnt;
+
+	cnt = sockaddr_to_str((struct sockaddr *)&sess->s.src_addr,
+			      str, sizeof(str));
+	cnt += scnprintf(str + cnt, sizeof(str) - cnt, "@");
+	sockaddr_to_str((struct sockaddr *)&sess->s.dst_addr,
+			str + cnt, sizeof(str) - cnt);
+
+	rtrs_fault_inject_init(&fault_inject->fj, str, -EBUSY);
+	/* injection points */
+	rtrs_fault_inject_add(fault_inject->fj.dir,
+			      "fail-hb-ack", &fault_inject->fail_hb_ack);
+}
+
+void rtrs_srv_fault_inject_final(struct rtrs_srv_fault_inject *fault_inject)
+{
+	rtrs_fault_inject_final(&fault_inject->fj);
+}
+
+int rtrs_should_fail_hb_ack(struct rtrs_srv_fault_inject *fault_inject)
+{
+	if (fault_inject->fail_hb_ack && should_fail(&fault_inject->fj.attr, 1))
+		return fault_inject->fj.status;
+	return 0;
+}
+#else
+void rtrs_srv_fault_inject_init(struct rtrs_srv_fault_inject *fault_inject,
+				struct rtrs_srv_sess *sess_name)
+{
+}
+void rtrs_srv_fault_inject_final(struct rtrs_srv_fault_inject *fault_inject)
+{
+}
+int rtrs_should_fail_hb_ack(struct rtrs_srv_fault_inject *fault_inject)
+{
+	return 0;
+}
+#endif
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 5e9bb7bf5ef3..6e53dac0d22c 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1232,6 +1232,8 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 			}
 		} else if (imm_type == RTRS_HB_MSG_IMM) {
 			WARN_ON(con->c.cid);
+			if (unlikely(rtrs_should_fail_hb_ack(&sess->fault_inject)))
+				break;
 			rtrs_send_hb_ack(&sess->s);
 		} else if (imm_type == RTRS_HB_ACK_IMM) {
 			WARN_ON(con->c.cid);
@@ -1489,6 +1491,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
 
 	sess = container_of(work, typeof(*sess), close_work);
 
+	rtrs_srv_fault_inject_final(&sess->fault_inject);
 	rtrs_srv_destroy_sess_files(sess);
 	rtrs_srv_stop_hb(sess);
 
@@ -1748,6 +1751,8 @@ static struct rtrs_srv_sess *__alloc_sess(struct rtrs_srv *srv,
 
 	__add_path_to_srv(srv, sess);
 
+	rtrs_srv_fault_inject_init(&sess->fault_inject, sess);
+
 	return sess;
 
 err_unmap_bufs:
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
index 9543ae19996c..001889e148ac 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
@@ -13,6 +13,7 @@
 #include <linux/device.h>
 #include <linux/refcount.h>
 #include "rtrs-pri.h"
+#include "rtrs-fault.h"
 
 /*
  * enum rtrs_srv_state - Server states.
@@ -73,6 +74,13 @@ struct rtrs_srv_mr {
 	struct rtrs_iu	*iu;		/* send buffer for new rkey msg */
 };
 
+struct rtrs_srv_fault_inject {
+#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
+	struct rtrs_fault_inject fj;
+	bool fail_hb_ack;
+#endif
+};
+
 struct rtrs_srv_sess {
 	struct rtrs_sess	s;
 	struct rtrs_srv	*srv;
@@ -90,6 +98,7 @@ struct rtrs_srv_sess {
 	unsigned int		mem_bits;
 	struct kobject		kobj;
 	struct rtrs_srv_stats	*stats;
+	struct rtrs_srv_fault_inject	fault_inject;
 };
 
 struct rtrs_srv {
@@ -152,4 +161,8 @@ ssize_t rtrs_srv_reset_all_help(struct rtrs_srv_stats *stats,
 int rtrs_srv_create_sess_files(struct rtrs_srv_sess *sess);
 void rtrs_srv_destroy_sess_files(struct rtrs_srv_sess *sess);
 
+void rtrs_srv_fault_inject_init(struct rtrs_srv_fault_inject *fault_inject,
+				struct rtrs_srv_sess *sess);
+void rtrs_srv_fault_inject_final(struct rtrs_srv_fault_inject *fault_inject);
+int rtrs_should_fail_hb_ack(struct rtrs_srv_fault_inject *fault_inject);
 #endif /* RTRS_SRV_H */
-- 
2.25.1



* [PATCH 4/4] docs: fault-injection: Add fault-injection manual of RTRS
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
                   ` (2 preceding siblings ...)
  2021-04-06 11:50 ` [PATCH 3/4] RDMA/rtrs-srv: Inject a fault at heart-beat sending Gioh Kim
@ 2021-04-06 11:50 ` Gioh Kim
  2021-04-06 14:20 ` [PATCH 0/4] Enable Fault Injection for RTRS Chuck Lever III
  2021-04-13 22:53 ` Jason Gunthorpe
  5 siblings, 0 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 11:50 UTC (permalink / raw)
  To: linux-rdma, linux-doc
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang,
	akinobu.mita, corbet, Gioh Kim

From: Gioh Kim <gi-oh.kim@cloud.ionos.com>

This document describes how to use fault injection in RTRS.

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 .../fault-injection/rtrs-fault-injection.rst  | 83 +++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 Documentation/fault-injection/rtrs-fault-injection.rst

diff --git a/Documentation/fault-injection/rtrs-fault-injection.rst b/Documentation/fault-injection/rtrs-fault-injection.rst
new file mode 100644
index 000000000000..463869877a85
--- /dev/null
+++ b/Documentation/fault-injection/rtrs-fault-injection.rst
@@ -0,0 +1,83 @@
+RTRS (RDMA Transport) Fault Injection
+=====================================
+This document describes how to enable and use the error injection of RTRS
+via debugfs in the /sys/kernel/debug directory. When it is enabled, users
+can enable a specific error injection point and change the default status
+code via debugfs.
+
+The following examples show how to inject an error into RTRS.
+
+First, enable the CONFIG_FAULT_INJECTION_DEBUG_FS kernel config option and
+recompile the kernel. After booting the kernel, map a target device.
+
+After mapping, the /sys/kernel/debug/<session-name> directory is created
+on both the client and the server.
+
+Example 1: Inject an error into request processing of rtrs-client
+-----------------------------------------------------------------
+
+Generate an error on one session of rtrs-client::
+
+  echo 100 > /sys/kernel/debug/ip\:192.168.123.144@ip\:192.168.123.190/fault_inject/probability
+  echo 1 > /sys/kernel/debug/ip\:192.168.123.144@ip\:192.168.123.190/fault_inject/times
+  echo 1 > /sys/kernel/debug/ip\:192.168.123.144@ip\:192.168.123.190/fault_inject/fail-request
+  dd if=/dev/rnbd0 of=./dd bs=1k count=10
+
+Expected Result::
+
+  dd succeeds but generates an IO error
+
+Message from dmesg::
+
+  FAULT_INJECTION: forcing a failure.
+  name fault_inject, interval 1, probability 100, space 0, times 1
+  CPU: 0 PID: 799 Comm: dd Tainted: G           O      5.4.77-pserver+ #169
+  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
+  Call Trace:
+    dump_stack+0x97/0xe0
+    should_fail.cold+0x5/0x11
+    rtrs_clt_should_fail_request+0x2f/0x50 [rtrs_client]
+    rtrs_clt_request+0x223/0x540 [rtrs_client]
+    rnbd_queue_rq+0x347/0x800 [rnbd_client]
+    __blk_mq_try_issue_directly+0x268/0x380
+    blk_mq_request_issue_directly+0x9a/0xe0
+    blk_mq_try_issue_list_directly+0xa3/0x170
+    blk_mq_sched_insert_requests+0x1de/0x340
+    blk_mq_flush_plug_list+0x488/0x620
+    blk_flush_plug_list+0x20f/0x250
+    blk_finish_plug+0x3c/0x54
+    read_pages+0x104/0x2b0
+    __do_page_cache_readahead+0x28b/0x2b0
+    ondemand_readahead+0x2cc/0x610
+    generic_file_read_iter+0xde0/0x11f0
+    new_sync_read+0x246/0x360
+    vfs_read+0xc1/0x1b0
+    ksys_read+0xc3/0x160
+    do_syscall_64+0x68/0x260
+    entry_SYSCALL_64_after_hwframe+0x49/0xbe
+  RIP: 0033:0x7f7ff4296461
+  Code: fe ff ff 50 48 8d 3d fe d0 09 00 e8 e9 03 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 99 62 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
+  RSP: 002b:00007fffdceca5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
+  RAX: ffffffffffffffda RBX: 000055c5eab6e3e0 RCX: 00007f7ff4296461
+  RDX: 0000000000000400 RSI: 000055c5ead27000 RDI: 0000000000000000
+  RBP: 0000000000000400 R08: 0000000000000003 R09: 00007f7ff4368260
+  R10: ffffffffffffff3b R11: 0000000000000246 R12: 000055c5ead27000
+  R13: 0000000000000000 R14: 0000000000000000 R15: 000055c5ead27000
+
+Example 2: rtrs-server does not send ACK to the heart-beat of rtrs-client
+-------------------------------------------------------------------------
+
+::
+
+  echo 100 > /sys/kernel/debug/ip\:192.168.123.190@ip\:192.168.123.144/fault_inject/probability
+  echo 5 > /sys/kernel/debug/ip\:192.168.123.190@ip\:192.168.123.144/fault_inject/times
+  echo 1 > /sys/kernel/debug/ip\:192.168.123.190@ip\:192.168.123.144/fault_inject/fail-hb-ack
+
+Expected Result::
+
+  If rtrs-server skips the ACK more than 5 times, rtrs-client tries to reconnect.
+
+Check how many times rtrs-client has reconnected::
+
+  cat /sys/devices/virtual/rtrs-client/bla/paths/ip\:192.168.122.142@ip\:192.168.122.130/stats/reconnects
+  1 0
-- 
2.25.1



* Re: [PATCH 0/4] Enable Fault Injection for RTRS
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
                   ` (3 preceding siblings ...)
  2021-04-06 11:50 ` [PATCH 4/4] docs: fault-injection: Add fault-injection manual of RTRS Gioh Kim
@ 2021-04-06 14:20 ` Chuck Lever III
  2021-04-06 15:20   ` Gioh Kim
  2021-04-13 22:53 ` Jason Gunthorpe
  5 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever III @ 2021-04-06 14:20 UTC (permalink / raw)
  To: Gioh Kim
  Cc: linux-rdma, linux-doc, bvanassche, leon, dledford, jgg,
	haris.iqbal, jinpu.wang, akinobu.mita, corbet



> On Apr 6, 2021, at 7:50 AM, Gioh Kim <gi-oh.kim@ionos.com> wrote:
> 
> My colleagues and I would like to apply the fault injection
> of the Linux to test error handling of RTRS module. RTRS module
> consists of client and server modules that are connected via
> Infiniband network. So it is important for the client to receive
> the error of the server and handle it smoothly.

I am a fan of fault injection. In fact I added a disconnect fault
injector for RPC that's in the kernel now, and it uses debugfs
as its control interface.

But that was years ago. If I were doing this today, I'd consider
kprobes, since fault injection is generally not something that
is consumed by users or administrators in a distributed kernel.

Have you considered injection via kprobes or eBPF instead of
adding permanent code?



--
Chuck Lever





* Re: [PATCH 0/4] Enable Fault Injection for RTRS
  2021-04-06 14:20 ` [PATCH 0/4] Enable Fault Injection for RTRS Chuck Lever III
@ 2021-04-06 15:20   ` Gioh Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Gioh Kim @ 2021-04-06 15:20 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: linux-rdma, linux-doc, bvanassche, leon, dledford, jgg,
	haris.iqbal, jinpu.wang, akinobu.mita, corbet

On Tue, Apr 6, 2021 at 4:20 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 6, 2021, at 7:50 AM, Gioh Kim <gi-oh.kim@ionos.com> wrote:
> >
> > My colleagues and I would like to apply the fault injection
> > of the Linux to test error handling of RTRS module. RTRS module
> > consists of client and server modules that are connected via
> > Infiniband network. So it is important for the client to receive
> > the error of the server and handle it smoothly.
>
> I am a fan of fault injection. In fact I added a disconnect fault
> injector for RPC that's in the kernel now, and it uses debugfs
> as its control interface.
>
> But that was years ago. If I were doing this today, I'd consider
> kprobes, since fault injection is generally not something that
> is consumed by users or administrators in a distributed kernel.
>
> Have you considered injection via kprobes or eBPF instead of
> adding permanent code?

I have not considered eBPF yet.
I will discuss it with my colleagues.
Thank you for the information.




* Re: [PATCH 0/4] Enable Fault Injection for RTRS
  2021-04-06 11:50 [PATCH 0/4] Enable Fault Injection for RTRS Gioh Kim
                   ` (4 preceding siblings ...)
  2021-04-06 14:20 ` [PATCH 0/4] Enable Fault Injection for RTRS Chuck Lever III
@ 2021-04-13 22:53 ` Jason Gunthorpe
  5 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2021-04-13 22:53 UTC (permalink / raw)
  To: Gioh Kim
  Cc: linux-rdma, linux-doc, bvanassche, leon, dledford, haris.iqbal,
	jinpu.wang, akinobu.mita, corbet

On Tue, Apr 06, 2021 at 01:50:45PM +0200, Gioh Kim wrote:
> My colleagues and I would like to apply the fault injection
> of the Linux to test error handling of RTRS module. RTRS module
> consists of client and server modules that are connected via
> Infiniband network. So it is important for the client to receive
> the error of the server and handle it smoothly.
> 
> When debugfs is enabled, RTRS is able to export interfaces
> to fail RTRS client and server.
> Following fault injection points are enabled:
> - fail a request processing on RTRS client side
> - fail a heart-beat transferation on RTRS server side
> 
> This patch set is just a starting point. We will enable various
> faults and test as many error cases as possible.
> 
> Best regards
> 
> Gioh Kim (4):
>   RDMA/rtrs: Enable the fault-injection
>   RDMA/rtrs-clt: Inject a fault at request processing
>   RDMA/rtrs-srv: Inject a fault at heart-beat sending
>   docs: fault-injection: Add fault-injection manual of RTRS

I'm going to drop this until you can look into ebpf kprobes, it does
seem the more modern way

Jason

