From: "Steve Wise" <swise@opengridcomputing.com> To: 'Leon Romanovsky' <leon@kernel.org> Cc: dsahern@gmail.com, stephen@networkplumber.org, netdev@vger.kernel.org, linux-rdma@vger.kernel.org Subject: RE: [PATCH v1 iproute2-next 1/4] rdma: add helper rd_sendrecv_msg() Date: Mon, 4 Mar 2019 08:13:04 -0600 [thread overview] Message-ID: <007901d4d294$62d5d280$28817780$@opengridcomputing.com> (raw) In-Reply-To: <20190303135052.GY15253@mtr-leonro.mtl.com> > > > > > > On 2/23/2019 3:26 AM, Leon Romanovsky wrote: > > > > On Thu, Feb 21, 2019 at 08:19:03AM -0800, Steve Wise wrote: > > > >> This function sends the constructed netlink message and then > > > >> receives the response, displaying any error text. > > > >> > > > >> Change 'rdma dev set' to use it. > > > >> > > > >> Signed-off-by: Steve Wise <swise@opengridcomputing.com> > > > >> --- > > > >> rdma/dev.c | 2 +- > > > >> rdma/rdma.h | 1 + > > > >> rdma/utils.c | 21 +++++++++++++++++++++ > > > >> 3 files changed, 23 insertions(+), 1 deletion(-) > > > >> > > > >> diff --git a/rdma/dev.c b/rdma/dev.c > > > >> index 60ff4b31e320..d2949c378f08 100644 > > > >> --- a/rdma/dev.c > > > >> +++ b/rdma/dev.c > > > >> @@ -273,7 +273,7 @@ static int dev_set_name(struct rd *rd) > > > >> mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_DEV_INDEX, rd- > > > >dev_idx); > > > >> mnl_attr_put_strz(rd->nlh, RDMA_NLDEV_ATTR_DEV_NAME, > > > rd_argv(rd)); > > > >> > > > >> - return rd_send_msg(rd); > > > >> + return rd_sendrecv_msg(rd, seq); > > > >> } > > > >> > > > >> static int dev_one_set(struct rd *rd) > > > >> diff --git a/rdma/rdma.h b/rdma/rdma.h > > > >> index 547bb5749a39..20be2f12c4f8 100644 > > > >> --- a/rdma/rdma.h > > > >> +++ b/rdma/rdma.h > > > >> @@ -115,6 +115,7 @@ bool rd_check_is_key_exist(struct rd *rd, const > > > char *key); > > > >> */ > > > >> int rd_send_msg(struct rd *rd); > > > >> int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, uint32_t > > > seq); > > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq); > > > >> void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq, > > uint16_t > > > flags); > > > >> int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data); > > > >> int rd_attr_cb(const struct nlattr *attr, void *data); > > > >> diff --git a/rdma/utils.c b/rdma/utils.c > > > >> index 069d44fece10..a6f2826c9605 100644 > > > >> --- a/rdma/utils.c > > > >> +++ b/rdma/utils.c > > > >> @@ -664,6 +664,27 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t > callback, > > > void *data, unsigned int seq) > > > >> return ret; > > > >> } > > > >> > > > >> +static int null_cb(const struct nlmsghdr *nlh, void *data) > > > >> +{ > > > >> + return MNL_CB_OK; > > > >> +} > > > >> + > > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq) > > > >> +{ > > > >> + int ret; > > > >> + > > > >> + ret = rd_send_msg(rd); > > > >> + if (ret) { > > > >> + perror(NULL); > > > > This is more or less already done in rd_send_msg() and that function > > > > prints something in case of execution error. So the missing piece > > > > is to update rd_recv_msg(), so all places will "magically" print errors > > > > and not only dev_set_name(). > > > > > > Yea ok. > > > > > > > dev_set_name() doesn't call rd_recv_msg(). So you're suggesting I fix up > > rd_recv_msg() to display errors and make dev_set_name() call > rd_recv_msg() > > with the null_cb function? You sure that's the way to go? > > I'm sure that we need to fix dev_set_name(), everything else I'm not sure. > > Thanks Hey Leon, adding this to rd_recv_msg(): @@ -693,10 +693,28 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, unsigned int seq) ret = mnl_cb_run(buf, ret, seq, portid, callback, data); } while (ret > 0); + if (ret < 0) + perror(NULL); + mnl_socket_close(rd->nl); return ret; } Results in unexpected errors being logged when doing a query such as: [root@stevo1 iproute2]# ./rdma/rdma res show qp lqpn 176 error: Invalid argument link mlx5_0/1 lqpn 176 type UD state RTS sq-psn 0 comm [ib_core] error: Invalid argument error: No such file or directory error: Invalid argument error: No such file or directory It appears the "invalid argument" errors are due to rdmatool sending a RDMA_NLDEV_CMD_RES_QP_GET command using the doit kernel method to allow querying for just a QP with lqpn = 176. However, rdmatool isn't passing a port index in the messages that generate the "invalid argument" error from the kernel. IE you must provide a device index and port index when issuing a doit command vs a dumpit command. I think. This error was not found because rd_recv_msg() never displayed any errors previously. Further, the RES_FUNC() massive macro has code that will retry a failed doit call with a dumpit call. I think _##name() should distinguish between failures reported by the kernel doit function vs failures because no doit function exists. Not sure how to support that. static inline int _##name(struct rd *rd) \ { \ uint32_t idx; \ int ret; \ if (id) { \ ret = rd_doit_index(rd, &idx); \ if (ret) { \ ret = _res_send_idx_msg(rd, command, \ name##_idx_parse_cb, \ idx, id); \ if (!ret) \ return ret; \ /* Fallback for old systems without .doit callbacks */ \ } \ } \ return _res_send_msg(rd, command, name##_parse_cb); \ } \ The "no such file or dir" errors are being returned because, in my setup, there are 2 other links that do not have lqpn 176. So there are 2 issues uncovered by adding generic printing of errors in rd_recv_msg() 1) the doit code in rdmatool is generating requests for a doit method in the kernel w/o providing a port index. 2) some paths in rdmatool should not print "benign" errors like filtering on a GET command causing a "does not exist" error returned by the kernel doit func. #1 is a bug, IMO. Can you propose a fix? #2 could be solved by adding an error callback func passed to rd_recv_msg(). Then the RES_FUNC() functions could parse errors like "no such file or dir" when doing a filtered query and silently drop them. And functions like dev_set_name() would display all errors returned because there are no expected errors other than "success". Steve.
WARNING: multiple messages have this Message-ID (diff)
From: "Steve Wise" <swise@opengridcomputing.com> To: "'Leon Romanovsky'" <leon@kernel.org> Cc: <dsahern@gmail.com>, <stephen@networkplumber.org>, <netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org> Subject: RE: [PATCH v1 iproute2-next 1/4] rdma: add helper rd_sendrecv_msg() Date: Mon, 4 Mar 2019 08:13:04 -0600 [thread overview] Message-ID: <007901d4d294$62d5d280$28817780$@opengridcomputing.com> (raw) In-Reply-To: <20190303135052.GY15253@mtr-leonro.mtl.com> > > > > > > On 2/23/2019 3:26 AM, Leon Romanovsky wrote: > > > > On Thu, Feb 21, 2019 at 08:19:03AM -0800, Steve Wise wrote: > > > >> This function sends the constructed netlink message and then > > > >> receives the response, displaying any error text. > > > >> > > > >> Change 'rdma dev set' to use it. > > > >> > > > >> Signed-off-by: Steve Wise <swise@opengridcomputing.com> > > > >> --- > > > >> rdma/dev.c | 2 +- > > > >> rdma/rdma.h | 1 + > > > >> rdma/utils.c | 21 +++++++++++++++++++++ > > > >> 3 files changed, 23 insertions(+), 1 deletion(-) > > > >> > > > >> diff --git a/rdma/dev.c b/rdma/dev.c > > > >> index 60ff4b31e320..d2949c378f08 100644 > > > >> --- a/rdma/dev.c > > > >> +++ b/rdma/dev.c > > > >> @@ -273,7 +273,7 @@ static int dev_set_name(struct rd *rd) > > > >> mnl_attr_put_u32(rd->nlh, RDMA_NLDEV_ATTR_DEV_INDEX, rd- > > > >dev_idx); > > > >> mnl_attr_put_strz(rd->nlh, RDMA_NLDEV_ATTR_DEV_NAME, > > > rd_argv(rd)); > > > >> > > > >> - return rd_send_msg(rd); > > > >> + return rd_sendrecv_msg(rd, seq); > > > >> } > > > >> > > > >> static int dev_one_set(struct rd *rd) > > > >> diff --git a/rdma/rdma.h b/rdma/rdma.h > > > >> index 547bb5749a39..20be2f12c4f8 100644 > > > >> --- a/rdma/rdma.h > > > >> +++ b/rdma/rdma.h > > > >> @@ -115,6 +115,7 @@ bool rd_check_is_key_exist(struct rd *rd, const > > > char *key); > > > >> */ > > > >> int rd_send_msg(struct rd *rd); > > > >> int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, uint32_t > > > seq); > > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq); > > > >> void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq, > > uint16_t > > > flags); > > > >> int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data); > > > >> int rd_attr_cb(const struct nlattr *attr, void *data); > > > >> diff --git a/rdma/utils.c b/rdma/utils.c > > > >> index 069d44fece10..a6f2826c9605 100644 > > > >> --- a/rdma/utils.c > > > >> +++ b/rdma/utils.c > > > >> @@ -664,6 +664,27 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t > callback, > > > void *data, unsigned int seq) > > > >> return ret; > > > >> } > > > >> > > > >> +static int null_cb(const struct nlmsghdr *nlh, void *data) > > > >> +{ > > > >> + return MNL_CB_OK; > > > >> +} > > > >> + > > > >> +int rd_sendrecv_msg(struct rd *rd, unsigned int seq) > > > >> +{ > > > >> + int ret; > > > >> + > > > >> + ret = rd_send_msg(rd); > > > >> + if (ret) { > > > >> + perror(NULL); > > > > This is more or less already done in rd_send_msg() and that function > > > > prints something in case of execution error. So the missing piece > > > > is to update rd_recv_msg(), so all places will "magically" print errors > > > > and not only dev_set_name(). > > > > > > Yea ok. > > > > > > > dev_set_name() doesn't call rd_recv_msg(). So you're suggesting I fix up > > rd_recv_msg() to display errors and make dev_set_name() call > rd_recv_msg() > > with the null_cb function? You sure that's the way to go? > > I'm sure that we need to fix dev_set_name(), everything else I'm not sure. > > Thanks Hey Leon, adding this to rd_recv_msg(): @@ -693,10 +693,28 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void *data, unsigned int seq) ret = mnl_cb_run(buf, ret, seq, portid, callback, data); } while (ret > 0); + if (ret < 0) + perror(NULL); + mnl_socket_close(rd->nl); return ret; } Results in unexpected errors being logged when doing a query such as: [root@stevo1 iproute2]# ./rdma/rdma res show qp lqpn 176 error: Invalid argument link mlx5_0/1 lqpn 176 type UD state RTS sq-psn 0 comm [ib_core] error: Invalid argument error: No such file or directory error: Invalid argument error: No such file or directory It appears the "invalid argument" errors are due to rdmatool sending a RDMA_NLDEV_CMD_RES_QP_GET command using the doit kernel method to allow querying for just a QP with lqpn = 176. However, rdmatool isn't passing a port index in the messages that generate the "invalid argument" error from the kernel. IE you must provide a device index and port index when issuing a doit command vs a dumpit command. I think. This error was not found because rd_recv_msg() never displayed any errors previously. Further, the RES_FUNC() massive macro has code that will retry a failed doit call with a dumpit call. I think _##name() should distinguish between failures reported by the kernel doit function vs failures because no doit function exists. Not sure how to support that. static inline int _##name(struct rd *rd) \ { \ uint32_t idx; \ int ret; \ if (id) { \ ret = rd_doit_index(rd, &idx); \ if (ret) { \ ret = _res_send_idx_msg(rd, command, \ name##_idx_parse_cb, \ idx, id); \ if (!ret) \ return ret; \ /* Fallback for old systems without .doit callbacks */ \ } \ } \ return _res_send_msg(rd, command, name##_parse_cb); \ } \ The "no such file or dir" errors are being returned because, in my setup, there are 2 other links that do not have lqpn 176. So there are 2 issues uncovered by adding generic printing of errors in rd_recv_msg() 1) the doit code in rdmatool is generating requests for a doit method in the kernel w/o providing a port index. 2) some paths in rdmatool should not print "benign" errors like filtering on a GET command causing a "does not exist" error returned by the kernel doit func. #1 is a bug, IMO. Can you propose a fix? #2 could be solved by adding an error callback func passed to rd_recv_msg(). Then the RES_FUNC() functions could parse errors like "no such file or dir" when doing a filtered query and silently drop them. And functions like dev_set_name() would display all errors returned because there are no expected errors other than "success". Steve.
next prev parent reply other threads:[~2019-03-04 14:13 UTC|newest] Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-02-21 18:22 [PATCH v1 iproute2-next 0/4] Dynamic rdma link creation Steve Wise 2019-02-21 16:19 ` [PATCH v1 iproute2-next 1/4] rdma: add helper rd_sendrecv_msg() Steve Wise 2019-02-23 9:26 ` Leon Romanovsky 2019-02-23 9:31 ` Leon Romanovsky 2019-02-26 17:19 ` Steve Wise 2019-02-26 19:16 ` Leon Romanovsky 2019-02-26 19:55 ` Steve Wise 2019-02-26 19:55 ` David Ahern 2019-02-26 17:13 ` Steve Wise 2019-02-27 20:23 ` Steve Wise 2019-02-27 20:23 ` Steve Wise 2019-03-03 13:50 ` Leon Romanovsky 2019-03-04 14:13 ` Steve Wise [this message] 2019-03-04 14:13 ` Steve Wise 2019-03-06 21:50 ` Steve Wise 2019-03-07 8:33 ` Leon Romanovsky 2019-02-28 19:41 ` Steve Wise 2019-02-28 19:56 ` Leon Romanovsky 2019-02-28 20:10 ` Steve Wise 2019-02-21 16:19 ` [PATCH v1 iproute2-next 2/4] Sync up rdma_netlink.h Steve Wise 2019-02-21 18:56 ` Stephen Hemminger 2019-02-21 22:10 ` Leon Romanovsky 2019-02-21 23:11 ` Jason Gunthorpe 2019-02-23 9:28 ` Leon Romanovsky 2019-02-26 17:15 ` Steve Wise 2019-02-21 16:19 ` [PATCH v1 iproute2-next 3/4] rdma: add 'link add/delete' commands Steve Wise 2019-02-21 23:14 ` Jason Gunthorpe 2019-02-23 9:43 ` Leon Romanovsky 2019-02-26 17:24 ` Steve Wise 2019-02-27 21:18 ` Steve Wise 2019-02-27 21:18 ` Steve Wise 2019-02-21 16:19 ` [PATCH v1 iproute2-next 4/4] rdma: man page update for link add/delete Steve Wise
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='007901d4d294$62d5d280$28817780$@opengridcomputing.com' \ --to=swise@opengridcomputing.com \ --cc=dsahern@gmail.com \ --cc=leon@kernel.org \ --cc=linux-rdma@vger.kernel.org \ --cc=netdev@vger.kernel.org \ --cc=stephen@networkplumber.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.