All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/4] Sending kernel pathrecord query to user cache server
@ 2015-06-11 17:03 kaike.wan-ral2JQCrhuEAvxtiuMwx3w
       [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: kaike.wan-ral2JQCrhuEAvxtiuMwx3w @ 2015-06-11 17:03 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Kaike Wan

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

A SA cache is undeniably critical for fabric scalability and performance.
In user space, the ibacm application provides a good example of pathrecord
cache for address and route resolution. With the recent implementation of
the provider architecture, ibacm offers more extensibility as a SA cache.
In kernel, ipoib implements its own small cache for pathrecords, which is
however not available for general use. Furthermore, the implementation of
a SA cache in user space offers better flexibility, larger capacity, and
more robustness for the system.

In this patch series, a mechanism is implemented to allow ib_sa to
send pathrecord query to a user application (eg ibacm) through netlink.
Potentially, this mechanism could be easily extended to other SA queries.

With a customized test implemented in rdma_cm module (not included in this
series), it was shown that the time to retrieve 1 million pathrecords
dropped from 46660 jiffies (46.66 seconds) to 16119 jiffies (or 16.119
seconds) on a two-node system, a reduction of more than 60%.

Changes since v4:
- Patch 1: rename LS_NLA_TYPE_NUM_PATH as LS_NLA_TYPE_NUMB_PATH.
- Patch 4: remove the renaming of LS_NLA_TYPE_NUM_PATH as 
           LS_NLA_TYPE_NUMB_PATH.

Changes since v3:
- Patch 1: add basic RESOLVE attribute types.
- Patch 4: change the encoding of the RESOLVE request message based on
  the new attribute types and the input comp_mask. Change the response
  handling by iterating all attributes.

Changes since v2:
- Redesigne the communication protocol between the kernel and user space
  application. Instead of the MAD packet format, the new protocol uses 
  netlink message header and attributes to exchange request and 
  response between the kernel and user space.The design was described
  here:
  http://www.spinics.net/lists/linux-rdma/msg25621.html

Changes since v1:
- Move kzalloc changes into a separate patch (Patch 3).
- Remove redundant include line (Patch 4). 
- Rename struct rdma_nl_resp_msg as structure ib_nl_resp_msg (Patch 4).

Kaike Wan (4):
  IB/netlink: Add defines for local service requests through netlink
  IB/core: Check the presence of netlink multicast group listeners
  IB/sa: Allocate SA query with kzalloc
  IB/sa: Route SA pathrecord query through netlink

 drivers/infiniband/core/netlink.c  |    8 +
 drivers/infiniband/core/sa_query.c |  523 +++++++++++++++++++++++++++++++++++-
 include/rdma/rdma_netlink.h        |    7 +
 include/uapi/rdma/rdma_netlink.h   |   82 ++++++
 4 files changed, 615 insertions(+), 5 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-06-11 17:03   ` kaike.wan-ral2JQCrhuEAvxtiuMwx3w
       [not found]     ` <1434042225-4975-2-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-06-11 17:03   ` [PATCH v5 2/4] IB/core: Check the presence of netlink multicast group listeners kaike.wan-ral2JQCrhuEAvxtiuMwx3w
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: kaike.wan-ral2JQCrhuEAvxtiuMwx3w @ 2015-06-11 17:03 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Kaike Wan, John Fleck, Ira Weiny

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This patch adds netlink defines for SA client, local service group, local
service operations, and related attributes.

Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: John Fleck <john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 include/uapi/rdma/rdma_netlink.h |   82 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index 6e4bb42..6aa9281 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -7,12 +7,14 @@ enum {
 	RDMA_NL_RDMA_CM = 1,
 	RDMA_NL_NES,
 	RDMA_NL_C4IW,
+	RDMA_NL_SA,
 	RDMA_NL_NUM_CLIENTS
 };
 
 enum {
 	RDMA_NL_GROUP_CM = 1,
 	RDMA_NL_GROUP_IWPM,
+	RDMA_NL_GROUP_LS,
 	RDMA_NL_NUM_GROUPS
 };
 
@@ -128,5 +130,85 @@ enum {
 	IWPM_NLA_ERR_MAX
 };
 
+/* Local service group opcodes */
+enum {
+	RDMA_NL_LS_OP_RESOLVE = 0,
+	RDMA_NL_LS_OP_SET_TIMEOUT,
+	RDMA_NL_LS_NUM_OPS
+};
+
+/* Local service netlink message flags */
+#define RDMA_NL_LS_F_OK		0x0100	/* Success response */
+#define RDMA_NL_LS_F_ERR	0x0200	/* Failed response */
+
+/* Local service attribute type */
+enum {
+	LS_NLA_TYPE_STATUS = 0,
+	LS_NLA_TYPE_PATH_RECORD,
+	LS_NLA_TYPE_TIMEOUT,
+	LS_NLA_TYPE_SERVICE_ID,
+	LS_NLA_TYPE_DGID,
+	LS_NLA_TYPE_SGID,
+	LS_NLA_TYPE_TCLASS,
+	LS_NLA_TYPE_REVERSIBLE,
+	LS_NLA_TYPE_NUMB_PATH,
+	LS_NLA_TYPE_PKEY,
+	LS_NLA_TYPE_QOS_CLASS,
+	LS_NLA_TYPE_MAX
+};
+
+/* Local service status attribute */
+enum {
+	LS_NLA_STATUS_SUCCESS = 0,
+	LS_NLA_STATUS_EINVAL,
+	LS_NLA_STATUS_ENODATA,
+	LS_NLA_STATUS_MAX
+};
+
+struct rdma_nla_ls_status {
+	__u32		status;
+};
+
+/* Local service pathrecord attribute: struct ib_path_rec_data */
+
+/* Local service timeout attribute */
+struct rdma_nla_ls_timeout {
+	__u32		timeout;
+};
+
+/* Local Service ServiceID attribute */
+struct rdma_nla_ls_service_id {
+	__be64		service_id;
+};
+
+/* Local Service DGID/SGID attribute: big endian */
+struct rdma_nla_ls_gid {
+	__u8		gid[16];
+};
+
+/* Local Service Traffic Class attribute */
+struct rdma_nla_ls_tclass {
+	__u8		tclass;
+};
+
+/* Local Service Reversible attribute */
+struct rdma_nla_ls_reversible {
+	__u32		reversible;
+};
+
+/* Local Service numb_path attribute */
+struct rdma_nla_ls_numb_path {
+	__u8		numb_path;
+};
+
+/* Local Service Pkey attribute*/
+struct rdma_nla_ls_pkey {
+	__be16		pkey;
+};
+
+/* Local Service Qos Class attribute */
+struct rdma_nla_ls_qos_class {
+	__be16		qos_class;
+};
 
 #endif /* _UAPI_RDMA_NETLINK_H */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 2/4] IB/core: Check the presence of netlink multicast group listeners
       [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-06-11 17:03   ` [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w
@ 2015-06-11 17:03   ` kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  2015-06-11 17:03   ` [PATCH v5 3/4] IB/sa: Allocate SA query with kzalloc kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  2015-06-11 17:03   ` [PATCH v5 4/4] IB/sa: Route SA pathrecord query through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  3 siblings, 0 replies; 18+ messages in thread
From: kaike.wan-ral2JQCrhuEAvxtiuMwx3w @ 2015-06-11 17:03 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Kaike Wan, John Fleck, Ira Weiny

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This patch adds a function to check if listeners for a netlink multicast
group are present.

Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: John Fleck <john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/netlink.c |    8 ++++++++
 include/rdma/rdma_netlink.h       |    7 +++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/netlink.c b/drivers/infiniband/core/netlink.c
index 23dd5a5..e0fc913 100644
--- a/drivers/infiniband/core/netlink.c
+++ b/drivers/infiniband/core/netlink.c
@@ -49,6 +49,14 @@ static DEFINE_MUTEX(ibnl_mutex);
 static struct sock *nls;
 static LIST_HEAD(client_list);
 
+int ibnl_chk_listeners(unsigned int group)
+{
+	if (netlink_has_listeners(nls, group) == 0)
+		return -1;
+	return 0;
+}
+EXPORT_SYMBOL(ibnl_chk_listeners);
+
 int ibnl_add_client(int index, int nops,
 		    const struct ibnl_client_cbs cb_table[])
 {
diff --git a/include/rdma/rdma_netlink.h b/include/rdma/rdma_netlink.h
index 0790882..5852661 100644
--- a/include/rdma/rdma_netlink.h
+++ b/include/rdma/rdma_netlink.h
@@ -77,4 +77,11 @@ int ibnl_unicast(struct sk_buff *skb, struct nlmsghdr *nlh,
 int ibnl_multicast(struct sk_buff *skb, struct nlmsghdr *nlh,
 			unsigned int group, gfp_t flags);
 
+/**
+ * Check if there are any listeners to the netlink group
+ * @group: the netlink group ID
+ * Returns 0 on success or a negative for no listeners.
+ */
+int ibnl_chk_listeners(unsigned int group);
+
 #endif /* _RDMA_NETLINK_H */
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 3/4] IB/sa: Allocate SA query with kzalloc
       [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-06-11 17:03   ` [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  2015-06-11 17:03   ` [PATCH v5 2/4] IB/core: Check the presence of netlink multicast group listeners kaike.wan-ral2JQCrhuEAvxtiuMwx3w
@ 2015-06-11 17:03   ` kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  2015-06-11 17:03   ` [PATCH v5 4/4] IB/sa: Route SA pathrecord query through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  3 siblings, 0 replies; 18+ messages in thread
From: kaike.wan-ral2JQCrhuEAvxtiuMwx3w @ 2015-06-11 17:03 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Kaike Wan, John Fleck, Ira Weiny

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Replace kmalloc with kzalloc so that all uninitialized fields in SA query
will be zero-ed out to avoid unintentional consequence. This prepares the
SA query structure to accept new fields in the future.

Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: John Fleck <john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/sa_query.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..17e1cf7 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -739,7 +739,7 @@ int ib_sa_path_rec_get(struct ib_sa_client *client,
 	port  = &sa_dev->port[port_num - sa_dev->start_port];
 	agent = port->agent;
 
-	query = kmalloc(sizeof *query, gfp_mask);
+	query = kzalloc(sizeof(*query), gfp_mask);
 	if (!query)
 		return -ENOMEM;
 
@@ -861,7 +861,7 @@ int ib_sa_service_rec_query(struct ib_sa_client *client,
 	    method != IB_SA_METHOD_DELETE)
 		return -EINVAL;
 
-	query = kmalloc(sizeof *query, gfp_mask);
+	query = kzalloc(sizeof(*query), gfp_mask);
 	if (!query)
 		return -ENOMEM;
 
@@ -953,7 +953,7 @@ int ib_sa_mcmember_rec_query(struct ib_sa_client *client,
 	port  = &sa_dev->port[port_num - sa_dev->start_port];
 	agent = port->agent;
 
-	query = kmalloc(sizeof *query, gfp_mask);
+	query = kzalloc(sizeof(*query), gfp_mask);
 	if (!query)
 		return -ENOMEM;
 
@@ -1050,7 +1050,7 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
 	port  = &sa_dev->port[port_num - sa_dev->start_port];
 	agent = port->agent;
 
-	query = kmalloc(sizeof *query, gfp_mask);
+	query = kzalloc(sizeof(*query), gfp_mask);
 	if (!query)
 		return -ENOMEM;
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v5 4/4] IB/sa: Route SA pathrecord query through netlink
       [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-06-11 17:03   ` [PATCH v5 3/4] IB/sa: Allocate SA query with kzalloc kaike.wan-ral2JQCrhuEAvxtiuMwx3w
@ 2015-06-11 17:03   ` kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  3 siblings, 0 replies; 18+ messages in thread
From: kaike.wan-ral2JQCrhuEAvxtiuMwx3w @ 2015-06-11 17:03 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Kaike Wan, John Fleck, Ira Weiny

From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This patch routes a SA pathrecord query to netlink first and processes the
response appropriately. If a failure is returned, the request will be sent
through IB. The decision whether to route the request to netlink first is
determined by the presence of a listener for the local service netlink
multicast group. If the user-space local service netlink multicast group
listener is not present, the request will be sent through IB, just like
what is currently being done.

Signed-off-by: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: John Fleck <john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/sa_query.c |  515 +++++++++++++++++++++++++++++++++++-
 1 files changed, 514 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 17e1cf7..e063593 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -45,12 +45,21 @@
 #include <uapi/linux/if_ether.h>
 #include <rdma/ib_pack.h>
 #include <rdma/ib_cache.h>
+#include <rdma/rdma_netlink.h>
+#include <net/netlink.h>
+#include <uapi/rdma/ib_user_sa.h>
+#include <rdma/ib_marshall.h>
 #include "sa.h"
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand subnet administration query support");
 MODULE_LICENSE("Dual BSD/GPL");
 
+#define IB_SA_LOCAL_SVC_TIMEOUT_MIN		100
+#define IB_SA_LOCAL_SVC_TIMEOUT_DEFAULT		2000
+#define IB_SA_LOCAL_SVC_TIMEOUT_MAX		200000
+static int sa_local_svc_timeout_ms = IB_SA_LOCAL_SVC_TIMEOUT_DEFAULT;
+
 struct ib_sa_sm_ah {
 	struct ib_ah        *ah;
 	struct kref          ref;
@@ -80,8 +89,24 @@ struct ib_sa_query {
 	struct ib_mad_send_buf *mad_buf;
 	struct ib_sa_sm_ah     *sm_ah;
 	int			id;
+	u32			flags;
 };
 
+#define IB_SA_ENABLE_LOCAL_SERVICE	0x00000001
+#define IB_SA_CANCEL			0x00000002
+
+#define IB_SA_LOCAL_SVC_ENABLED(query) \
+	((query)->flags & IB_SA_ENABLE_LOCAL_SERVICE)
+#define IB_SA_ENABLE_LOCAL_SVC(query) \
+	((query)->flags |= IB_SA_ENABLE_LOCAL_SERVICE)
+#define IB_SA_DISABLE_LOCAL_SVC(query) \
+	((query)->flags &= ~IB_SA_ENABLE_LOCAL_SERVICE)
+
+#define IB_SA_QUERY_CANCELLED(query) \
+	((query)->flags & IB_SA_CANCEL)
+#define IB_SA_CANCEL_QUERY(query) \
+	((query)->flags |= IB_SA_CANCEL)
+
 struct ib_sa_service_query {
 	void (*callback)(int, struct ib_sa_service_rec *, void *);
 	void *context;
@@ -106,6 +131,26 @@ struct ib_sa_mcmember_query {
 	struct ib_sa_query sa_query;
 };
 
+struct ib_nl_request_info {
+	struct list_head list;
+	u32 seq;
+	unsigned long timeout;
+	struct ib_sa_query *query;
+};
+
+struct ib_nl_attr_info {
+	u16 len;	/* Total attr len: header + payload + padding */
+	ib_sa_comp_mask comp_mask;
+	void *input;
+	void (*set_attrs)(struct sk_buff *skb, struct ib_nl_attr_info *info);
+};
+
+static LIST_HEAD(ib_nl_request_list);
+static DEFINE_SPINLOCK(ib_nl_request_lock);
+static atomic_t ib_nl_sa_request_seq;
+static struct workqueue_struct *ib_nl_wq;
+static struct delayed_work ib_nl_timed_work;
+
 static void ib_sa_add_one(struct ib_device *device);
 static void ib_sa_remove_one(struct ib_device *device);
 
@@ -381,6 +426,433 @@ static const struct ib_field guidinfo_rec_table[] = {
 	  .size_bits    = 512 },
 };
 
+static int ib_nl_send_msg(int opcode, struct ib_nl_attr_info *attrs, u32 seq)
+{
+	struct sk_buff *skb = NULL;
+	struct nlmsghdr *nlh;
+	void *data;
+	int ret = 0;
+
+	if (attrs->len <= 0)
+		return -EMSGSIZE;
+
+	skb = nlmsg_new(attrs->len, GFP_KERNEL);
+	if (!skb) {
+		pr_err("alloc failed ret=%d\n", ret);
+		return -ENOMEM;
+	}
+
+	/* Put nlmsg header only for now */
+	data = ibnl_put_msg(skb, &nlh, seq, 0, RDMA_NL_SA,
+			    opcode, GFP_KERNEL);
+	if (!data) {
+		kfree_skb(skb);
+		return -EMSGSIZE;
+	}
+
+	/* Add attributes */
+	attrs->set_attrs(skb, attrs);
+
+	/* Repair the nlmsg header length */
+	nlmsg_end(skb, nlh);
+
+	ret = ibnl_multicast(skb, nlh, RDMA_NL_GROUP_LS, GFP_KERNEL);
+	if (!ret) {
+		ret = attrs->len;
+	} else {
+		if (ret != -ESRCH)
+			pr_err("ibnl_multicast failed l=%d, r=%d\n",
+			       attrs->len, ret);
+		ret = 0;
+	}
+	return ret;
+}
+
+static struct ib_nl_request_info *
+ib_nl_alloc_request(struct ib_sa_query *query)
+{
+	struct ib_nl_request_info *rinfo;
+
+	rinfo = kzalloc(sizeof(*rinfo), GFP_ATOMIC);
+	if (rinfo == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&rinfo->list);
+	rinfo->seq = (u32)atomic_inc_return(&ib_nl_sa_request_seq);
+	rinfo->query = query;
+
+	return rinfo;
+}
+
+static void ib_nl_set_path_rec_attrs(struct sk_buff *skb,
+				     struct ib_nl_attr_info *info)
+{
+	struct ib_sa_path_rec *sa_rec = info->input;
+
+	if (info->comp_mask & IB_SA_PATH_REC_SERVICE_ID)
+		nla_put(skb, LS_NLA_TYPE_SERVICE_ID,
+			sizeof(sa_rec->service_id), &sa_rec->service_id);
+	if (info->comp_mask & IB_SA_PATH_REC_DGID)
+		nla_put(skb, LS_NLA_TYPE_DGID, sizeof(sa_rec->dgid),
+			&sa_rec->dgid);
+	if (info->comp_mask & IB_SA_PATH_REC_SGID)
+		nla_put(skb, LS_NLA_TYPE_SGID, sizeof(sa_rec->sgid),
+			&sa_rec->sgid);
+	if (info->comp_mask & IB_SA_PATH_REC_TRAFFIC_CLASS)
+		nla_put(skb, LS_NLA_TYPE_TCLASS, sizeof(sa_rec->traffic_class),
+			&sa_rec->traffic_class);
+	if (info->comp_mask & IB_SA_PATH_REC_REVERSIBLE)
+		nla_put(skb, LS_NLA_TYPE_REVERSIBLE,
+			sizeof(sa_rec->reversible), &sa_rec->reversible);
+	if (info->comp_mask & IB_SA_PATH_REC_NUMB_PATH)
+		nla_put(skb, LS_NLA_TYPE_NUMB_PATH,
+			sizeof(sa_rec->numb_path), &sa_rec->numb_path);
+	if (info->comp_mask & IB_SA_PATH_REC_PKEY)
+		nla_put(skb, LS_NLA_TYPE_PKEY, sizeof(sa_rec->pkey),
+			&sa_rec->pkey);
+	if (info->comp_mask & IB_SA_PATH_REC_QOS_CLASS)
+		nla_put(skb, LS_NLA_TYPE_QOS_CLASS,
+			sizeof(sa_rec->qos_class), &sa_rec->qos_class);
+}
+
+static int ib_nl_get_path_rec_attrs_len(struct ib_nl_attr_info *info)
+{
+	int len = 0;
+
+	if (info->comp_mask & IB_SA_PATH_REC_SERVICE_ID)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_service_id));
+	if (info->comp_mask & IB_SA_PATH_REC_DGID)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_gid));
+	if (info->comp_mask & IB_SA_PATH_REC_SGID)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_gid));
+	if (info->comp_mask & IB_SA_PATH_REC_TRAFFIC_CLASS)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_tclass));
+	if (info->comp_mask & IB_SA_PATH_REC_REVERSIBLE)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_reversible));
+	if (info->comp_mask & IB_SA_PATH_REC_NUMB_PATH)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_numb_path));
+	if (info->comp_mask & IB_SA_PATH_REC_PKEY)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_pkey));
+	if (info->comp_mask & IB_SA_PATH_REC_QOS_CLASS)
+		len += nla_total_size(sizeof(struct rdma_nla_ls_qos_class));
+
+	return len;
+}
+
+static int ib_nl_send_request(struct ib_nl_request_info *rinfo)
+{
+	struct ib_nl_attr_info info;
+	int opcode;
+	struct ib_sa_mad *mad;
+	unsigned long flags;
+	unsigned long delay;
+	int ret;
+
+	mad = rinfo->query->mad_buf->mad;
+	switch (mad->mad_hdr.attr_id) {
+	case cpu_to_be16(IB_SA_ATTR_PATH_REC):
+		opcode = RDMA_NL_LS_OP_RESOLVE;
+		mad = rinfo->query->mad_buf->mad;
+		info.comp_mask = mad->sa_hdr.comp_mask;
+		info.input = rinfo->query->mad_buf->context[1];
+		rinfo->query->mad_buf->context[1] = NULL;
+		info.len = ib_nl_get_path_rec_attrs_len(&info);
+		info.set_attrs = ib_nl_set_path_rec_attrs;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	spin_lock_irqsave(&ib_nl_request_lock, flags);
+	ret = ib_nl_send_msg(opcode, &info, rinfo->seq);
+	if (ret <= 0) {
+		ret = -EIO;
+		goto request_out;
+	} else {
+		ret = 0;
+	}
+
+	delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
+	rinfo->timeout = delay + jiffies;
+	list_add_tail(&rinfo->list, &ib_nl_request_list);
+	/* Start the timeout if this is the only request */
+	if (ib_nl_request_list.next == &rinfo->list)
+		queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
+
+request_out:
+	spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+
+	return ret;
+}
+
+static int ib_nl_make_request(struct ib_sa_query *query)
+{
+	struct ib_nl_request_info *rinfo;
+	int ret;
+
+	rinfo = ib_nl_alloc_request(query);
+	if (IS_ERR(rinfo))
+		return -ENOMEM;
+
+	ret = ib_nl_send_request(rinfo);
+	if (ret)
+		kfree(rinfo);
+
+	return ret;
+}
+
+static int ib_nl_cancel_request(struct ib_sa_query *query)
+{
+	unsigned long flags;
+	struct ib_nl_request_info *rinfo;
+	int found = 0;
+
+	spin_lock_irqsave(&ib_nl_request_lock, flags);
+	list_for_each_entry(rinfo, &ib_nl_request_list, list) {
+		/* Let the timeout to take care of the callback */
+		if (query == rinfo->query) {
+			IB_SA_CANCEL_QUERY(query);
+			rinfo->timeout = jiffies;
+			list_move(&rinfo->list, &ib_nl_request_list);
+			found = 1;
+			mod_delayed_work(ib_nl_wq, &ib_nl_timed_work, 1);
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+
+	return found;
+}
+
+static void send_handler(struct ib_mad_agent *agent,
+			 struct ib_mad_send_wc *mad_send_wc);
+
+static void ib_nl_process_good_resolve_rsp(struct ib_sa_query *query,
+					   const struct nlmsghdr *nlh)
+{
+	struct ib_mad_send_wc mad_send_wc;
+	struct ib_sa_mad *mad = NULL;
+	const struct nlattr *head, *curr;
+	struct ib_path_rec_data  *rec;
+	int len, rem;
+
+	if (query->callback) {
+		head = (const struct nlattr *) nlmsg_data(nlh);
+		len = nlmsg_len(nlh);
+		nla_for_each_attr(curr, head, len, rem) {
+			if (curr->nla_type == LS_NLA_TYPE_PATH_RECORD) {
+				rec = nla_data(curr);
+				if ((rec->flags && IB_PATH_GMP) &&
+				    (rec->flags && IB_PATH_PRIMARY)) {
+					mad = query->mad_buf->mad;
+					mad->mad_hdr.method |=
+						IB_MGMT_METHOD_RESP;
+					memcpy(mad->data, rec->path_rec,
+					       sizeof(rec->path_rec));
+					query->callback(query, 0, mad);
+					break;
+				}
+			}
+		}
+	}
+
+	mad_send_wc.send_buf = query->mad_buf;
+	mad_send_wc.status = IB_WC_SUCCESS;
+	send_handler(query->mad_buf->mad_agent, &mad_send_wc);
+}
+
+static void ib_nl_request_timeout(struct work_struct *work)
+{
+	unsigned long flags;
+	struct ib_nl_request_info *rinfo;
+	struct ib_sa_query *query;
+	unsigned long delay;
+	struct ib_mad_send_wc mad_send_wc;
+	int ret;
+
+	spin_lock_irqsave(&ib_nl_request_lock, flags);
+	while (!list_empty(&ib_nl_request_list)) {
+		rinfo = list_entry(ib_nl_request_list.next,
+				   struct ib_nl_request_info, list);
+
+		if (time_after(rinfo->timeout, jiffies)) {
+			delay = rinfo->timeout - jiffies;
+			if ((long)delay <= 0)
+				delay = 1;
+			queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
+			break;
+		}
+
+		list_del(&rinfo->list);
+		query = rinfo->query;
+		IB_SA_DISABLE_LOCAL_SVC(query);
+		/* Hold the lock to protect against query cancellation */
+		if (IB_SA_QUERY_CANCELLED(query))
+			ret = -1;
+		else
+			ret = ib_post_send_mad(query->mad_buf, NULL);
+		if (ret) {
+			mad_send_wc.send_buf = query->mad_buf;
+			mad_send_wc.status = IB_WC_WR_FLUSH_ERR;
+			spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+			send_handler(query->port->agent, &mad_send_wc);
+			spin_lock_irqsave(&ib_nl_request_lock, flags);
+		}
+		kfree(rinfo);
+	}
+	spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+}
+
+static inline int ib_nl_is_good_resolve_resp(const struct nlmsghdr *nlh)
+{
+	const struct nlattr *head, *curr;
+	struct ib_path_rec_data  *rec;
+	int len, rem;
+
+	if (nlh->nlmsg_flags & RDMA_NL_LS_F_ERR)
+		return 0;
+
+	if (!(nlh->nlmsg_flags & RDMA_NL_LS_F_OK))
+		return 0;
+
+	if (nlmsg_len(nlh) < nla_attr_size(sizeof(*rec)))
+		return 0;
+
+	head = (const struct nlattr *) nlmsg_data(nlh);
+	len = nlmsg_len(nlh);
+	nla_for_each_attr(curr, head, len, rem) {
+		if (curr->nla_type == LS_NLA_TYPE_PATH_RECORD) {
+			rec = nla_data(curr);
+			if ((rec->flags && IB_PATH_GMP) &&
+			    (rec->flags && IB_PATH_PRIMARY))
+				return 1;
+		}
+	}
+
+	return 0;
+}
+
+static int ib_nl_handle_set_timeout(struct sk_buff *skb,
+				    struct netlink_callback *cb)
+{
+	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
+	int timeout, delta, abs_delta;
+	const struct nlattr *attr;
+	struct rdma_nla_ls_timeout *to_attr;
+	unsigned long flags;
+	struct ib_nl_request_info *rinfo;
+	long delay = 0;
+
+	if (nlmsg_len(nlh) < nla_attr_size(sizeof(*to_attr)))
+		goto settimeout_out;
+
+	attr = (const struct nlattr *) nlmsg_data(nlh);
+	if (attr->nla_type != LS_NLA_TYPE_TIMEOUT ||
+	    nla_len(attr) != sizeof(*to_attr))
+		goto settimeout_out;
+
+	to_attr = (struct rdma_nla_ls_timeout *) nla_data(attr);
+	timeout = (int) to_attr->timeout;
+	if (timeout < IB_SA_LOCAL_SVC_TIMEOUT_MIN)
+		timeout = IB_SA_LOCAL_SVC_TIMEOUT_MIN;
+	if (timeout > IB_SA_LOCAL_SVC_TIMEOUT_MAX)
+		timeout = IB_SA_LOCAL_SVC_TIMEOUT_MAX;
+
+	delta = timeout - sa_local_svc_timeout_ms;
+	if (delta < 0)
+		abs_delta = -delta;
+	else
+		abs_delta = delta;
+
+	if (delta != 0) {
+		spin_lock_irqsave(&ib_nl_request_lock, flags);
+		sa_local_svc_timeout_ms = timeout;
+		list_for_each_entry(rinfo, &ib_nl_request_list, list) {
+			if (delta < 0 && abs_delta > rinfo->timeout)
+				rinfo->timeout = 0;
+			else
+				rinfo->timeout += delta;
+
+			/* Get the new delay from the first entry */
+			if (!delay) {
+				delay = rinfo->timeout - jiffies;
+				if (delay <= 0)
+					delay = 1;
+			}
+		}
+		if (delay)
+			mod_delayed_work(ib_nl_wq, &ib_nl_timed_work,
+					 (unsigned long)delay);
+		spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+	}
+
+settimeout_out:
+	return skb->len;
+}
+
+static int ib_nl_handle_resolve_resp(struct sk_buff *skb,
+				     struct netlink_callback *cb)
+{
+	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
+	unsigned long flags;
+	struct ib_nl_request_info *rinfo;
+	struct ib_sa_query *query;
+	struct ib_mad_send_buf *send_buf;
+	struct ib_mad_send_wc mad_send_wc;
+	int found = 0;
+	int ret;
+
+	spin_lock_irqsave(&ib_nl_request_lock, flags);
+	list_for_each_entry(rinfo, &ib_nl_request_list, list) {
+		/*
+		 * If the query is cancelled, let the timeout routine
+		 * take care of it.
+		 */
+		if (nlh->nlmsg_seq == rinfo->seq) {
+			found = !IB_SA_QUERY_CANCELLED(rinfo->query);
+			if (found)
+				list_del(&rinfo->list);
+			break;
+		}
+	}
+
+	if (!found) {
+		spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+		goto resp_out;
+	}
+
+	query = rinfo->query;
+	send_buf = query->mad_buf;
+
+	if (!ib_nl_is_good_resolve_resp(nlh)) {
+		/* if the result is a failure, send out the packet via IB */
+		IB_SA_DISABLE_LOCAL_SVC(query);
+		ret = ib_post_send_mad(query->mad_buf, NULL);
+		spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+		if (ret) {
+			mad_send_wc.send_buf = send_buf;
+			mad_send_wc.status = IB_WC_GENERAL_ERR;
+			send_handler(query->port->agent, &mad_send_wc);
+		}
+	} else {
+		spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+		ib_nl_process_good_resolve_rsp(query, nlh);
+	}
+
+	kfree(rinfo);
+resp_out:
+	return skb->len;
+}
+
+static struct ibnl_client_cbs ib_sa_cb_table[] = {
+	[RDMA_NL_LS_OP_RESOLVE] = {
+		.dump = ib_nl_handle_resolve_resp,
+		.module = THIS_MODULE },
+	[RDMA_NL_LS_OP_SET_TIMEOUT] = {
+		.dump = ib_nl_handle_set_timeout,
+		.module = THIS_MODULE },
+};
+
 static void free_sm_ah(struct kref *kref)
 {
 	struct ib_sa_sm_ah *sm_ah = container_of(kref, struct ib_sa_sm_ah, ref);
@@ -502,7 +974,13 @@ void ib_sa_cancel_query(int id, struct ib_sa_query *query)
 	mad_buf = query->mad_buf;
 	spin_unlock_irqrestore(&idr_lock, flags);
 
-	ib_cancel_mad(agent, mad_buf);
+	/*
+	 * If the query is still on the netlink request list, schedule
+	 * it to be cancelled by the timeout routine. Otherwise, it has been
+	 * sent to the MAD layer and has to be cancelled from there.
+	 */
+	if (!ib_nl_cancel_request(query))
+		ib_cancel_mad(agent, mad_buf);
 }
 EXPORT_SYMBOL(ib_sa_cancel_query);
 
@@ -638,6 +1116,14 @@ static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
 	query->mad_buf->context[0] = query;
 	query->id = id;
 
+	if (IB_SA_LOCAL_SVC_ENABLED(query)) {
+		if (!ibnl_chk_listeners(RDMA_NL_GROUP_LS)) {
+			if (!ib_nl_make_request(query))
+				return id;
+		}
+		IB_SA_DISABLE_LOCAL_SVC(query);
+	}
+
 	ret = ib_post_send_mad(query->mad_buf, NULL);
 	if (ret) {
 		spin_lock_irqsave(&idr_lock, flags);
@@ -766,6 +1252,9 @@ int ib_sa_path_rec_get(struct ib_sa_client *client,
 
 	*sa_query = &query->sa_query;
 
+	IB_SA_ENABLE_LOCAL_SVC(&query->sa_query);
+	query->sa_query.mad_buf->context[1] = rec;
+
 	ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
 	if (ret < 0)
 		goto err2;
@@ -1250,6 +1739,8 @@ static int __init ib_sa_init(void)
 
 	get_random_bytes(&tid, sizeof tid);
 
+	atomic_set(&ib_nl_sa_request_seq, 0);
+
 	ret = ib_register_client(&sa_client);
 	if (ret) {
 		printk(KERN_ERR "Couldn't register ib_sa client\n");
@@ -1262,7 +1753,25 @@ static int __init ib_sa_init(void)
 		goto err2;
 	}
 
+	ib_nl_wq = create_singlethread_workqueue("ib_nl_sa_wq");
+	if (!ib_nl_wq) {
+		ret = -ENOMEM;
+		goto err3;
+	}
+
+	if (ibnl_add_client(RDMA_NL_SA, RDMA_NL_LS_NUM_OPS,
+			    ib_sa_cb_table)) {
+		pr_err("Failed to add netlink callback\n");
+		ret = -EINVAL;
+		goto err4;
+	}
+	INIT_DELAYED_WORK(&ib_nl_timed_work, ib_nl_request_timeout);
+
 	return 0;
+err4:
+	destroy_workqueue(ib_nl_wq);
+err3:
+	mcast_cleanup();
 err2:
 	ib_unregister_client(&sa_client);
 err1:
@@ -1271,6 +1780,10 @@ err1:
 
 static void __exit ib_sa_cleanup(void)
 {
+	ibnl_remove_client(RDMA_NL_SA);
+	cancel_delayed_work(&ib_nl_timed_work);
+	flush_workqueue(ib_nl_wq);
+	destroy_workqueue(ib_nl_wq);
 	mcast_cleanup();
 	ib_unregister_client(&sa_client);
 	idr_destroy(&query_idr);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]     ` <1434042225-4975-2-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-06-11 17:58       ` Jason Gunthorpe
       [not found]         ` <20150611175803.GC20142-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-11 17:58 UTC (permalink / raw)
  To: kaike.wan-ral2JQCrhuEAvxtiuMwx3w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Fleck, Ira Weiny

On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> This patch adds netlink defines for SA client, local service group, local
> service operations, and related attributes.

Why are you sending this again without responding the comments I
provided on this patch?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]         ` <20150611175803.GC20142-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-06-11 18:32           ` Wan, Kaike
       [not found]             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD2CD-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Wan, Kaike @ 2015-06-11 18:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> Sent: Thursday, June 11, 2015 1:58 PM
> To: Wan, Kaike
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests
> through netlink
> 
> On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >
> > This patch adds netlink defines for SA client, local service group,
> > local service operations, and related attributes.
> 
> Why are you sending this again without responding the comments I provided
> on this patch?

Are you referring to the mandatory/option attributes comments?

Kaike
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD2CD-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-06-11 18:44               ` Jason Gunthorpe
  2015-06-11 18:51               ` Jason Gunthorpe
  1 sibling, 0 replies; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-11 18:44 UTC (permalink / raw)
  To: Wan, Kaike; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote:
> > Why are you sending this again without responding the comments I provided
> > on this patch?
> 
> Are you referring to the mandatory/option attributes comments?

Hum weird, I typed in a message yesterday but it seems to have
evaporated. I will try again. Sorry.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD2CD-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-06-11 18:44               ` Jason Gunthorpe
@ 2015-06-11 18:51               ` Jason Gunthorpe
       [not found]                 ` <20150611185136.GC20779-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  1 sibling, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-11 18:51 UTC (permalink / raw)
  To: Wan, Kaike; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote:
> > From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> > Sent: Thursday, June 11, 2015 1:58 PM
> > To: Wan, Kaike
> > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests
> > through netlink
> > 
> > On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > > From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > >
> > > This patch adds netlink defines for SA client, local service group,
> > > local service operations, and related attributes.

> +	LS_NLA_TYPE_REVERSIBLE,
> +	LS_NLA_TYPE_NUMB_PATH,

These need to be combined. It does not make sense to request
numb_paths when the API we have is designed to return APM data.

Recommend a 'QP Type' with options of:
 - RC QP: Return up to the full 6 path tuple with full APM data
 - UD QP: Return a single non-reversible path
 - GMP QP: Return up to two reversible GMP paths.

If it is hard to get accurate information then you can sketch this in
and infer RC QP and GMP QP by the request having the reversible
set bit. But ideally RDMA CM and SRP would request RC QP, IPoIB would
request UD QP.

> +/* Local Service ServiceID attribute */
> +struct rdma_nla_ls_service_id {
> +	__be64		service_id;
> +};

Do not expose BE to userspace, everything should be in cpu order.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                 ` <20150611185136.GC20779-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-06-11 19:21                   ` Wan, Kaike
       [not found]                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD33C-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Wan, Kaike @ 2015-06-11 19:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> Sent: Thursday, June 11, 2015 2:52 PM
> To: Wan, Kaike
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests
> through netlink
> 
> On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote:
> > > From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> > > Sent: Thursday, June 11, 2015 1:58 PM
> > > To: Wan, Kaike
> > > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> > > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local
> > > service requests through netlink
> > >
> > > On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > > > From: Kaike Wan <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > >
> > > > This patch adds netlink defines for SA client, local service
> > > > group, local service operations, and related attributes.
> 
> > +	LS_NLA_TYPE_REVERSIBLE,
> > +	LS_NLA_TYPE_NUMB_PATH,
> 
> These need to be combined. It does not make sense to request numb_paths
> when the API we have is designed to return APM data.
> 
> Recommend a 'QP Type' with options of:
>  - RC QP: Return up to the full 6 path tuple with full APM data
>  - UD QP: Return a single non-reversible path
>  - GMP QP: Return up to two reversible GMP paths.
> 
> If it is hard to get accurate information then you can sketch this in and infer
> RC QP and GMP QP by the request having the reversible set bit. But ideally
> RDMA CM and SRP would request RC QP, IPoIB would request UD QP.

It's quite indirect here for this implementation. (ib_sa_path_rec_get())

Caller         reversible                        numb_path             QP Type
--------------------------------------------------------------------------
cma              1                                                1                     GMP
ipoib             Not set                                    1 (?)                UD 
srp                Not set                                    1                      UD

In addition, on the user side, we need to convert QP type back into reversible_numpath again, which may not interpret the original request correctly.
Why should we do all the translation?

Is QP type definition already exported to user space?

I see  IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not GMP in verbs.h.

> 
> > +/* Local Service ServiceID attribute */ struct rdma_nla_ls_service_id
> > +{
> > +	__be64		service_id;
> > +};
> 
> Do not expose BE to userspace, everything should be in cpu order.

If we use cpu order, we need to do two conversions: from BE to cpu order in kernel and from cpu order to BE in user space. Struct ib_user_path_rec contains many __be32 fields.

> 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD33C-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-06-11 19:51                       ` Jason Gunthorpe
       [not found]                         ` <20150611195154.GB24457-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-11 19:51 UTC (permalink / raw)
  To: Wan, Kaike; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

On Thu, Jun 11, 2015 at 07:21:11PM +0000, Wan, Kaike wrote:
> > Recommend a 'QP Type' with options of:
> >  - RC QP: Return up to the full 6 path tuple with full APM data
> >  - UD QP: Return a single non-reversible path
> >  - GMP QP: Return up to two reversible GMP paths.
> > 
> > If it is hard to get accurate information then you can sketch this in and infer
> > RC QP and GMP QP by the request having the reversible set bit. But ideally
> > RDMA CM and SRP would request RC QP, IPoIB would request UD QP.
> 
> It's quite indirect here for this implementation. (ib_sa_path_rec_get())
> 
> Caller         reversible                        numb_path             QP Type
> cma              1                                                1                     GMP
> ipoib             Not set                                    1 (?)                UD 
> srp                Not set                                    1                      UD

CMA (mostly) and SRP use RC paths. IPoIB uses UD and RC. The IPoIB
path record query looks wrong for RC mode because it does not request
reversible, that should be fixed independently.

The indirection is because you don't want to modify the callers to
provide their additional information, which is probably a mistake. Add
a netlink fill callback and do it there?

> In addition, on the user side, we need to convert QP type back into
> reversible_numpath again, which may not interpret the original
> request correctly.  Why should we do all the translation?

The goal is to carry enough context information so that user space can
do something smart. We have alot more information available to drive
policy than is present in the simple IBA Path Record MAD.

In this case, the kernel does not actually care about numb paths or
reversible, except in the context of how it intends to use the
path(s).

For instance, having CM ask for reversible makes no sense, since in
the general case it wants the 3 path tuple of
GMP(reversible=1),FORWARD(reversible=0),RETURN(reversible=0). Similarly,
it doesn't make sense to ask for numbPath=1 because in the general
case it wants the 6 path APM set.

Being more specific here allows userspace to implement more detailed
policy beyond what the IB Path Record scheme allows. And for that we
need the context information from the requestor.

acm won't use it, it will always request numbPaths=1 and derive the
reversible bit from this netlink field. (or always set reversible=1)

> Is QP type definition already exported to user space?

QP Type could be a bad name, open to ideas

> I see  IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not GMP in verbs.h.

GMP is a special case of UD.

> > > +/* Local Service ServiceID attribute */ struct rdma_nla_ls_service_id
> > > +{
> > > +	__be64		service_id;
> > > +};
> > 
> > Do not expose BE to userspace, everything should be in cpu order.
> 
> If we use cpu order, we need to do two conversions: from BE to cpu
> order in kernel and from cpu order to BE in user space. Struct
> ib_user_path_rec contains many __be32 fields.

I don't see a problem with the extra conversion.

IIRC ib_user_path_rec is an image of the SA Path Record Reply MAD, so
it should be byte for byte identical, which is why it has BE elements,
that is not to be encouraged generally..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                         ` <20150611195154.GB24457-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-06-11 23:45                           ` Wan, Kaike
       [not found]                             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE5EE-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-06-11 23:56                           ` Weiny, Ira
  1 sibling, 1 reply; 18+ messages in thread
From: Wan, Kaike @ 2015-06-11 23:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> Sent: Thursday, June 11, 2015 3:52 PM
> To: Wan, Kaike
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests
> through netlink
> 
> On Thu, Jun 11, 2015 at 07:21:11PM +0000, Wan, Kaike wrote:
> > > Recommend a 'QP Type' with options of:
> > >  - RC QP: Return up to the full 6 path tuple with full APM data
> > >  - UD QP: Return a single non-reversible path
> > >  - GMP QP: Return up to two reversible GMP paths.
> > >
> > > If it is hard to get accurate information then you can sketch this
> > > in and infer RC QP and GMP QP by the request having the reversible
> > > set bit. But ideally RDMA CM and SRP would request RC QP, IPoIB would
> request UD QP.
> >
> > It's quite indirect here for this implementation.
> > (ib_sa_path_rec_get())
> >
> > Caller         reversible                        numb_path             QP Type
> > cma              1                                                1                     GMP
> > ipoib             Not set                                    1 (?)                UD
> > srp                Not set                                    1                      UD
> 
> CMA (mostly) and SRP use RC paths. IPoIB uses UD and RC. The IPoIB path
> record query looks wrong for RC mode because it does not request reversible,
> that should be fixed independently.
> 
> The indirection is because you don't want to modify the callers to provide
> their additional information, which is probably  a mistake. Add a netlink fill
> callback and do it there?

Apparently, to achieve this goal, we need the following changes:
 
1. Caller (cma, ipoib, srp):
 - Add a new callback to handle multiple pathrecords returned in the response.
- Change the calling function to supply additional context info.

The problem is that currently only one copy of pathrecord is required for these callers and we need to find the use of other pathrecords in these callers. Otherwise, we simply are overdesigning.

2. ib_sa: 
  - Add a function  to take additional context info;
 - Code this addition context into new attributes.
- New callback function to return multiple pathrecords to the caller. 

3. User space application:

Apparently, ibacm can only return one copy of pathrecord and we need to find something different to test the code. And the current implementation of ibacm can't make use of the supplied context info (QP type or whatever).

I think that these changes run well beyond the original goal of this patch series to speed up the pathrecord query performance and may warranty a subsequent patch series.

For this patch series, how about limiting the change to combining reversible and numb_path into reversible_numb_path attribute?



> 
> > In addition, on the user side, we need to convert QP type back into
> > reversible_numpath again, which may not interpret the original request
> > correctly.  Why should we do all the translation?
> 
> The goal is to carry enough context information so that user space can do
> something smart. We have alot more information available to drive policy
> than is present in the simple IBA Path Record MAD.
> 
> In this case, the kernel does not actually care about numb paths or reversible,
> except in the context of how it intends to use the path(s).
> 
> For instance, having CM ask for reversible makes no sense, since in the
> general case it wants the 3 path tuple of
> GMP(reversible=1),FORWARD(reversible=0),RETURN(reversible=0). Similarly,
> it doesn't make sense to ask for numbPath=1 because in the general case it
> wants the 6 path APM set.
> 
> Being more specific here allows userspace to implement more detailed policy
> beyond what the IB Path Record scheme allows. And for that we need the
> context information from the requestor.
> 
> acm won't use it, it will always request numbPaths=1 and derive the
> reversible bit from this netlink field. (or always set reversible=1)
> 
> > Is QP type definition already exported to user space?
> 
> QP Type could be a bad name, open to ideas
> 
> > I see  IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not
> GMP in verbs.h.
> 
> GMP is a special case of UD.
> 
> > > > +/* Local Service ServiceID attribute */ struct
> > > > +rdma_nla_ls_service_id {
> > > > +	__be64		service_id;
> > > > +};
> > >
> > > Do not expose BE to userspace, everything should be in cpu order.
> >
> > If we use cpu order, we need to do two conversions: from BE to cpu
> > order in kernel and from cpu order to BE in user space. Struct
> > ib_user_path_rec contains many __be32 fields.
> 
> I don't see a problem with the extra conversion.

That's fine. I can do the conversion on both sides. In the future, if the callers do not do the conversion, there will be only one conversion on the user space side:

caller (cpu byte order) ->ib_sa (cpu byte order) ->user space (cpu byte order ->network byte order).

Kaike

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                         ` <20150611195154.GB24457-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-06-11 23:45                           ` Wan, Kaike
@ 2015-06-11 23:56                           ` Weiny, Ira
  1 sibling, 0 replies; 18+ messages in thread
From: Weiny, Ira @ 2015-06-11 23:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Wan, Kaike
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John

> 
> > Is QP type definition already exported to user space?
> 
> QP Type could be a bad name, open to ideas

I think the name "QP Type" is bad...

I want to say "transport type" but I don't think that really communicates what we want here.

> 
> > I see  IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not
> GMP in verbs.h.
> 
> GMP is a special case of UD.

I think this is where QP Type (or "Transport type") are not equivalent to if a path needs to be reversible or not.

To properly derive the reversibility requires more than the transport.  There is nothing which precludes me from requesting a reversible path for any QP Type.

This is really a "protocol" attribute and it seems that each protocol could potentially want a reversible Path Record independent of the QP or "Transport Type".

> 
> > > > +/* Local Service ServiceID attribute */ struct
> > > > +rdma_nla_ls_service_id {
> > > > +	__be64		service_id;
> > > > +};
> > >
> > > Do not expose BE to userspace, everything should be in cpu order.
> >
> > If we use cpu order, we need to do two conversions: from BE to cpu
> > order in kernel and from cpu order to BE in user space. Struct
> > ib_user_path_rec contains many __be32 fields.
> 
> I don't see a problem with the extra conversion.

Neither do I.  All user space values should be in host order.

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE5EE-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-06-12  0:21                               ` Jason Gunthorpe
       [not found]                                 ` <20150612002133.GA17666-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-12  0:21 UTC (permalink / raw)
  To: Wan, Kaike; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote:
> Apparently, to achieve this goal, we need the following changes:

I don't expect anything more than a proper hole in the UAPI to make
this work.

I already have patches that implement full kernel support for the
6-path format, and globally enable APM. With this netlink interface it
becomes feasible to mainline them. That will require some reworking of
what you've done, but I don't expect you to do that.

The only remaining thing we need here is that tiny bit of policy
information for the kernel to specify what the intended use of the
path is:
 - For connected CM operation (6 path records)
 - For unidirectional UD (1 path record)
 - For 'misc' GMP like operation (at least 1 reversible path record)

I'm happy if you call it PATH_USE_XX

Testing the reversible bit should be enough to rough this in, if
not reversible it is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is
PATH_USE_GMP.

Recommend that userspace treat any unknown usage type as GMP (it is
the most general).

Ira: Do you like 'path use' better as a name? I think I do..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                                 ` <20150612002133.GA17666-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-06-12  0:28                                   ` Wan, Kaike
       [not found]                                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE6B3-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Wan, Kaike @ 2015-06-12  0:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John, Weiny, Ira

> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> Sent: Thursday, June 11, 2015 8:22 PM
> To: Wan, Kaike
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests
> through netlink
> 
> On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote:
> > Apparently, to achieve this goal, we need the following changes:
> 
> I don't expect anything more than a proper hole in the UAPI to make this
> work.
> 
> I already have patches that implement full kernel support for the 6-path
> format, and globally enable APM. With this netlink interface it becomes
> feasible to mainline them. That will require some reworking of what you've
> done, but I don't expect you to do that.
> 
> The only remaining thing we need here is that tiny bit of policy information
> for the kernel to specify what the intended use of the path is:
>  - For connected CM operation (6 path records)
>  - For unidirectional UD (1 path record)
>  - For 'misc' GMP like operation (at least 1 reversible path record)
> 
> I'm happy if you call it PATH_USE_XX
> 
> Testing the reversible bit should be enough to rough this in, if not reversible it
> is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is PATH_USE_GMP.
> 
> Recommend that userspace treat any unknown usage type as GMP (it is the

> most general).

That would be cool. If we add this attribute, should we remove reversible and numb_path attributes?

Kaike

> 
> Ira: Do you like 'path use' better as a name? I think I do..
> 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE6B3-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-06-12  5:24                                       ` Weiny, Ira
       [not found]                                         ` <2807E5FD2F6FDA4886F6618EAC48510E1109C1F6-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Weiny, Ira @ 2015-06-12  5:24 UTC (permalink / raw)
  To: Wan, Kaike, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John

> 
> > From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> > Sent: Thursday, June 11, 2015 8:22 PM
> > To: Wan, Kaike
> > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Fleck, John; Weiny, Ira
> > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service
> > requests through netlink
> >
> > On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote:
> > > Apparently, to achieve this goal, we need the following changes:
> >
> > I don't expect anything more than a proper hole in the UAPI to make
> > this work.
> >
> > I already have patches that implement full kernel support for the
> > 6-path format, and globally enable APM. With this netlink interface it
> > becomes feasible to mainline them. That will require some reworking of
> > what you've done, but I don't expect you to do that.
> >
> > The only remaining thing we need here is that tiny bit of policy
> > information for the kernel to specify what the intended use of the path is:
> >  - For connected CM operation (6 path records)
> >  - For unidirectional UD (1 path record)
> >  - For 'misc' GMP like operation (at least 1 reversible path record)
> >
> > I'm happy if you call it PATH_USE_XX
> >
> > Testing the reversible bit should be enough to rough this in, if not
> > reversible it is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is
> PATH_USE_GMP.
> >

PATH_USE_* is a better name.  But I think the defines should be:

PATH_USE_ANY (or ALL?)
PATH_USE_UNIDIRECTIONAL
PATH_USE_GMP

> > Recommend that userspace treat any unknown usage type as GMP (it is
> > the
> 
> > most general).

Why not treat unknown type as "ANY/ALL" and return all 6 if possible.  If not possible/supported (ie like ibacm) return GMP which is always valid.

> 
> That would be cool. If we add this attribute, should we remove reversible and
> numb_path attributes?

Yes.

> 
> Kaike
> 
> >
> > Ira: Do you like 'path use' better as a name? I think I do..

Yea that is a much better name.

Thanks,
Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                                         ` <2807E5FD2F6FDA4886F6618EAC48510E1109C1F6-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-06-15 18:06                                           ` Jason Gunthorpe
       [not found]                                             ` <20150615180648.GA1089-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Jason Gunthorpe @ 2015-06-15 18:06 UTC (permalink / raw)
  To: Weiny, Ira; +Cc: Wan, Kaike, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John

On Fri, Jun 12, 2015 at 05:24:32AM +0000, Weiny, Ira wrote:
> 
> PATH_USE_* is a better name.  But I think the defines should be:
> 
> PATH_USE_ANY (or ALL?)

How about PATH_USE_CONNECTED?

> Why not treat unknown type as "ANY/ALL" and return all 6 if
> possible.  If not possible/supported (ie like ibacm) return GMP
> which is always valid.

Make sense to me..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink
       [not found]                                             ` <20150615180648.GA1089-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-06-15 19:02                                               ` Weiny, Ira
  0 siblings, 0 replies; 18+ messages in thread
From: Weiny, Ira @ 2015-06-15 19:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Wan, Kaike, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Fleck, John

> 
> On Fri, Jun 12, 2015 at 05:24:32AM +0000, Weiny, Ira wrote:
> >
> > PATH_USE_* is a better name.  But I think the defines should be:
> >
> > PATH_USE_ANY (or ALL?)
> 
> How about PATH_USE_CONNECTED?

I don't want to bikeshed this too much as I'm not really good on names but...

How does "Connected" relate to wanting all available Path options?

Can't a connected transport use a reversible "GMP" path?

I guess that is where I am getting hung up.

Ira

> 
> > Why not treat unknown type as "ANY/ALL" and return all 6 if possible.
> > If not possible/supported (ie like ibacm) return GMP which is always
> > valid.
> 
> Make sense to me..
> 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-06-15 19:02 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-11 17:03 [PATCH v5 0/4] Sending kernel pathrecord query to user cache server kaike.wan-ral2JQCrhuEAvxtiuMwx3w
     [not found] ` <1434042225-4975-1-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-06-11 17:03   ` [PATCH v5 1/4] IB/netlink: Add defines for local service requests through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w
     [not found]     ` <1434042225-4975-2-git-send-email-kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-06-11 17:58       ` Jason Gunthorpe
     [not found]         ` <20150611175803.GC20142-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-06-11 18:32           ` Wan, Kaike
     [not found]             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD2CD-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-06-11 18:44               ` Jason Gunthorpe
2015-06-11 18:51               ` Jason Gunthorpe
     [not found]                 ` <20150611185136.GC20779-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-06-11 19:21                   ` Wan, Kaike
     [not found]                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABD33C-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-06-11 19:51                       ` Jason Gunthorpe
     [not found]                         ` <20150611195154.GB24457-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-06-11 23:45                           ` Wan, Kaike
     [not found]                             ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE5EE-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-06-12  0:21                               ` Jason Gunthorpe
     [not found]                                 ` <20150612002133.GA17666-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-06-12  0:28                                   ` Wan, Kaike
     [not found]                                     ` <3F128C9216C9B84BB6ED23EF16290AFB0CABE6B3-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-06-12  5:24                                       ` Weiny, Ira
     [not found]                                         ` <2807E5FD2F6FDA4886F6618EAC48510E1109C1F6-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-06-15 18:06                                           ` Jason Gunthorpe
     [not found]                                             ` <20150615180648.GA1089-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-06-15 19:02                                               ` Weiny, Ira
2015-06-11 23:56                           ` Weiny, Ira
2015-06-11 17:03   ` [PATCH v5 2/4] IB/core: Check the presence of netlink multicast group listeners kaike.wan-ral2JQCrhuEAvxtiuMwx3w
2015-06-11 17:03   ` [PATCH v5 3/4] IB/sa: Allocate SA query with kzalloc kaike.wan-ral2JQCrhuEAvxtiuMwx3w
2015-06-11 17:03   ` [PATCH v5 4/4] IB/sa: Route SA pathrecord query through netlink kaike.wan-ral2JQCrhuEAvxtiuMwx3w

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.