All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19
@ 2015-01-29 17:59 Yann Droneaud
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 17:59 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Following discussions in thread "[PATCH v3 06/17] IB/core: Add support
for extended query device caps" [1] and further comments on previous
patch in "[PATCH 3/4] IB/uverbs: ex_query_device: check request's
comp_mask" thread [2], I'm proposing an updated patchset to implement
a slighly different behavior to handle the request parameters in
ib_uverbs_ex_query_device() in order to restore the current behavior
of ib_copy_to_udata().

I think we found an agreement on the scheme to be implemented in
Haggai Eran's response [3]:

- input's comp_mask is currently not used, so it must be 0;
- the response buffer size is checked so that the base answer
  can be returned without being truncated;
- the extended feature (odp) information are returned
  if, and only if, there's enough space in the response buffer,
  allowing older programs to get only the features they would
  expect. Newer programs are expected to provide more space,
  but as the response's comp_mask will describe only the available
  fields, newer program cannot be surprized to not get information
  when run on older kernel (if we want to support this use case).

I feel even more confident this behavior would allow a better
maintainability. Additionally, it's still looking more like the
behavior already implemented by other InfiniBand/RDMA kernel <->
userspace interfaces.

I hope this would go in v3.19 before its release so that the extended
QUERY_DEVICE uverbs won't hurt us being part of an official release:
in it's current shape it won't be easy to support it.

Regards.

Changes from v0 [4]
- don't use input's comp_mask to conditionnaly build the response
- ensure input's comp_mask is set 0

[1] http://mid.gmane.org/1418733236.2779.26.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
[2] http://mid.gmane.org/063956366559d6919693aabb324bab83d676bc28.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
[3] http://mid.gmane.org/54C902E4.5010600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
[4] http://mid.gmane.org/cover.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

Yann Droneaud (5):
  IB/uverbs: ex_query_device: answer must not depend on request's
    comp_mask
  IB/uverbs: ex_query_device: check request's comp_mask
  IB/uverbs: ex_query_device: answer must depend on response buffer's
    size
  IB/uverbs: ex_query_device: no need to clear the whole structure
  IB/core: ib_copy_to_udata(): don't silently truncate response

 drivers/infiniband/core/uverbs_cmd.c | 39 +++++++++++++++++++++++++-----------
 include/rdma/ib_verbs.h              |  5 +----
 2 files changed, 28 insertions(+), 16 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 17:59   ` Yann Droneaud
       [not found]     ` <24ceb1fc5b2b6563532e5776b1b2320b1ae37543.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 17:59   ` [PATCH v1 2/5] IB/uverbs: ex_query_device: check " Yann Droneaud
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 17:59 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
during OFA International Developer Workshop 2013, the request's
comp_mask should describe the request data: it's describe the
availability of extended fields in the request.
Conversely, the response's comp_mask should describe the presence
of extended fields in the response.

So this patch changes ib_uverbs_ex_query_device() function to always
return the maximum supported features, currently only
IB_USER_VERBS_EX_QUERY_DEVICE_ODP per commit 860f10a799c8 ("IB/core:
Add flags for on demand paging support"), regardless of the value of
request's comp_mask.

[1] https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_0830_Tzahi_Oved-verbs_extensions_ofa_2013-tzahio.pdf

Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 532d8eba8b02..6ef06a9b4362 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3324,17 +3324,15 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	resp.comp_mask = 0;
 
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	if (cmd.comp_mask & IB_USER_VERBS_EX_QUERY_DEVICE_ODP) {
-		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
-		resp.odp_caps.per_transport_caps.rc_odp_caps =
-			attr.odp_caps.per_transport_caps.rc_odp_caps;
-		resp.odp_caps.per_transport_caps.uc_odp_caps =
-			attr.odp_caps.per_transport_caps.uc_odp_caps;
-		resp.odp_caps.per_transport_caps.ud_odp_caps =
-			attr.odp_caps.per_transport_caps.ud_odp_caps;
-		resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
-	}
+	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
+	resp.odp_caps.per_transport_caps.rc_odp_caps =
+		attr.odp_caps.per_transport_caps.rc_odp_caps;
+	resp.odp_caps.per_transport_caps.uc_odp_caps =
+		attr.odp_caps.per_transport_caps.uc_odp_caps;
+	resp.odp_caps.per_transport_caps.ud_odp_caps =
+		attr.odp_caps.per_transport_caps.ud_odp_caps;
 #endif
+	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
 
 	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
 	if (err)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 17:59   ` [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask Yann Droneaud
@ 2015-01-29 17:59   ` Yann Droneaud
       [not found]     ` <335da45738872e446f63db338ca766a34608c90a.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:00   ` [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size Yann Droneaud
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 17:59 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

This patch ensures the extended QUERY_DEVICE uverbs request's
comp_mask has only known and supported bits (currently none).

If userspace set unknown features bits, -EINVAL will be returned,
ensuring current programs are not allowed to set random feature
bits: such bits could enable new extended features in future kernel
versions and those features can trigger a behavior not unsupported
by the older programs or make the newer kernels return an error
for a request which was valid on older kernels.

Additionally, returning an error for unsupported feature would
allow userspace to probe/discover which extended features are
currently supported by a kernel.

Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 6ef06a9b4362..fbcc54b86795 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3312,6 +3312,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	if (err)
 		return err;
 
+	if (cmd.comp_mask)
+		return -EINVAL;
+
 	if (cmd.reserved)
 		return -EINVAL;
 
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 17:59   ` [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask Yann Droneaud
  2015-01-29 17:59   ` [PATCH v1 2/5] IB/uverbs: ex_query_device: check " Yann Droneaud
@ 2015-01-29 18:00   ` Yann Droneaud
       [not found]     ` <a7b2b5adb3b207ec2a4330067a03ce7e4c2225aa.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:00   ` [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure Yann Droneaud
  2015-01-29 18:00   ` [PATCH v1 5/5] IB/core: ib_copy_to_udata(): don't silently truncate response Yann Droneaud
  4 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 18:00 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
during OFA International Developer Workshop 2013, the request's
comp_mask should describe the request data: it's describe the
availability of extended fields in the request.
Conversely, the response's comp_mask should describe the presence
of extended fields in the response.

So instead of silently truncating extended QUERY_DEVICE uverb's
response, see commit 5a77abf9a97a ("IB/core: Add support for
extended query device caps")), this patch makes function
ib_uverbs_ex_query_device() check the available space in the
response buffer against the minimal response structure and fail
with -ENOSPC if this base structure cannot be returned to
userspace: it's required to be able to return the comp_mask's
value, at least.

For extended features, currently only IB_USER_VERBS_EX_QUERY_DEVICE_ODP
per commit 860f10a799c8 ("IB/core: Add flags for on demand paging
support"), the extension's data is returned if the needed space
is available in the response buffer: it is expected that newer
userspace program pass a bigger response buffer so that it can
retrieve extended features. The comp_mask value will match the fields
that were effectively returned to userspace.

In the end:
- userspace won't get truncated responses;
- newer kernel would be able to support older binaries and
  older binaries will work flawlessly with newer kernel;
- additionally, newer binaries would even be able to support
  older kernel, as far as they don't set new feature bit in
  request's comp_mask.

Note: as offsetof() is used to retrieve the size of the lower chunk
of the response, beware that it only works if the upper chunk
is right after, without any implicit padding. And, as the size of
the latter chunk is added to the base size, implicit padding at the
end of the structure is not taken in account. Both point must be
taken in account when extending the uverbs functionalities.

[1] https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_0830_Tzahi_Oved-verbs_extensions_ofa_2013-tzahio.pdf

Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index fbcc54b86795..81d8b5aa2eb6 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3302,6 +3302,7 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	struct ib_uverbs_ex_query_device  cmd;
 	struct ib_device_attr attr;
 	struct ib_device *device;
+	size_t resp_len;
 	int err;
 
 	device = file->device->ib_dev;
@@ -3318,6 +3319,11 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	if (cmd.reserved)
 		return -EINVAL;
 
+	resp_len = offsetof(typeof(resp), odp_caps);
+
+	if (ucore->outlen < resp_len)
+		return -ENOSPC;
+
 	err = device->query_device(device, &attr);
 	if (err)
 		return err;
@@ -3326,6 +3332,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	copy_query_dev_fields(file, &resp.base, &attr);
 	resp.comp_mask = 0;
 
+	if (ucore->outlen < resp_len + sizeof(resp.odp_caps))
+		goto end;
+
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
 	resp.odp_caps.per_transport_caps.rc_odp_caps =
@@ -3336,8 +3345,10 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 		attr.odp_caps.per_transport_caps.ud_odp_caps;
 #endif
 	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
+	resp_len += sizeof(resp.odp_caps);
 
-	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
+end:
+	err = ib_copy_to_udata(ucore, &resp, resp_len);
 	if (err)
 		return err;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-01-29 18:00   ` [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size Yann Droneaud
@ 2015-01-29 18:00   ` Yann Droneaud
       [not found]     ` <0b646f62e9272bc962a1ff6ff03ad9523b454dfe.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:00   ` [PATCH v1 5/5] IB/core: ib_copy_to_udata(): don't silently truncate response Yann Droneaud
  4 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 18:00 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

As only the requested fields are set and copied to userspace,
there's no need to clear the content of the response structure
beforehand.

Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 81d8b5aa2eb6..9520880a163f 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3328,9 +3328,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 	if (err)
 		return err;
 
-	memset(&resp, 0, sizeof(resp));
 	copy_query_dev_fields(file, &resp.base, &attr);
 	resp.comp_mask = 0;
+	resp.reserved = 0;
 
 	if (ucore->outlen < resp_len + sizeof(resp.odp_caps))
 		goto end;
@@ -3343,6 +3343,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 		attr.odp_caps.per_transport_caps.uc_odp_caps;
 	resp.odp_caps.per_transport_caps.ud_odp_caps =
 		attr.odp_caps.per_transport_caps.ud_odp_caps;
+	resp.odp_caps.reserved = 0;
+#else
+	memset(&resp.odp_caps, 0, sizeof(resp.odp_caps));
 #endif
 	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
 	resp_len += sizeof(resp.odp_caps);
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v1 5/5] IB/core: ib_copy_to_udata(): don't silently truncate response
       [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-01-29 18:00   ` [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure Yann Droneaud
@ 2015-01-29 18:00   ` Yann Droneaud
       [not found]     ` <c69af8952bf25fdbcdfc527b0636bc3177798b95.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  4 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 18:00 UTC (permalink / raw)
  To: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran, Roland Dreier
  Cc: Yann Droneaud, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

While ib_copy_to_udata() should check for the available output
space as already proposed in some other patches [1][2][3], the
changes brought by commit 5a77abf9a97a ("IB/core: Add support for
extended query device caps") are silently truncating the data to
be written to userspace if the output buffer is not large enough
to hold the response data.

Silently truncating the response is not a reliable behavior as
userspace is not given any hint about this truncation: userspace
is leaved with garbage to play with.

Not checking the response buffer size and writing past the
userspace buffer is no good either, but it's the current behavior.

So this patch revert the particular change on ib_copy_to_udata()
as a better behavior is implemented in the upper level function
ib_uverbs_ex_query_device().

[1] "[PATCH 00/22] infiniband: improve userspace input check"

http://mid.gmane.org/cover.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

[2] "[PATCH 03/22] infiniband: ib_copy_from_udata(): check input length"

http://mid.gmane.org/2bf102a41c51f61965ee09df827abe8fefb523a9.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

[3] "[PATCH 04/22] infiniband: ib_copy_to_udata(): check output length"

http://mid.gmane.org/d27716a3a1c180f832d153a7402f65ea8a75b734.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
---
 include/rdma/ib_verbs.h | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0d74f1de99aa..65994a19e840 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1707,10 +1707,7 @@ static inline int ib_copy_from_udata(void *dest, struct ib_udata *udata, size_t
 
 static inline int ib_copy_to_udata(struct ib_udata *udata, void *src, size_t len)
 {
-	size_t copy_sz;
-
-	copy_sz = min_t(size_t, len, udata->outlen);
-	return copy_to_user(udata->outbuf, src, copy_sz) ? -EFAULT : 0;
+	return copy_to_user(udata->outbuf, src, len) ? -EFAULT : 0;
 }
 
 /**
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]     ` <24ceb1fc5b2b6563532e5776b1b2320b1ae37543.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 18:28       ` Jason Gunthorpe
       [not found]         ` <20150129182800.GH11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 18:28 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 06:59:58PM +0100, Yann Droneaud wrote:
> As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> during OFA International Developer Workshop 2013, the request's
> comp_mask should describe the request data: it's describe the
> availability of extended fields in the request.
> Conversely, the response's comp_mask should describe the presence
> of extended fields in the response.

Roland: I agree with Yann, these patches need to go in, or the ODP
patches reverted.

>  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> -	if (cmd.comp_mask & IB_USER_VERBS_EX_QUERY_DEVICE_ODP) {

Absolutely, a query output should never depend on the input comp_mask.

> -		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> -		resp.odp_caps.per_transport_caps.rc_odp_caps =
> -			attr.odp_caps.per_transport_caps.rc_odp_caps;
> -		resp.odp_caps.per_transport_caps.uc_odp_caps =
> -			attr.odp_caps.per_transport_caps.uc_odp_caps;
> -		resp.odp_caps.per_transport_caps.ud_odp_caps =
> -			attr.odp_caps.per_transport_caps.ud_odp_caps;
> -		resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> -	}
> +	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> +	resp.odp_caps.per_transport_caps.rc_odp_caps =
> +		attr.odp_caps.per_transport_caps.rc_odp_caps;
> +	resp.odp_caps.per_transport_caps.uc_odp_caps =
> +		attr.odp_caps.per_transport_caps.uc_odp_caps;
> +	resp.odp_caps.per_transport_caps.ud_odp_caps =
> +		attr.odp_caps.per_transport_caps.ud_odp_caps;
>  #endif
> +	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;

Not sure about this - if 0 is a valid null answer for all the _caps
then it is fine, and the comp_mask bit should just be removed as the
size alone should be enough.

This looks wrong however:

>  	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
>  	if (err)
>           return err;
>        return 0;

Why isn't this returning the filled structure size to userspace?  This
would seem to be very urgently wrong to me? Am I missing something?

Patch 3 probably should gain:

- return 0;
+ return resp_len;

This patch looks like an improvement to me:

Reviewed-By: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found]     ` <335da45738872e446f63db338ca766a34608c90a.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 18:36       ` Jason Gunthorpe
       [not found]         ` <20150129183648.GI11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-02-01  8:12       ` Haggai Eran
  1 sibling, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 18:36 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 06:59:59PM +0100, Yann Droneaud wrote:
> This patch ensures the extended QUERY_DEVICE uverbs request's
> comp_mask has only known and supported bits (currently none).

I think I would be happy to see the input comp_mask removed
entirely. I can't see a possible use for input data to a QUERY command
that wouldn't be better served by creating a new command

But forcing the value to 0 seems reasonable as well.

Reviewed-By: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

Also, the _ex varients were supposed to be supersets of the base call,
so it is wrong that query_device_ex doesn't return all the same data
as query_device, layed out so that the original response structure is
a prefix of the extended response structure.

The other _ex calls in verbs were designed that way so it will be
surprising to the user that this one is different.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size
       [not found]     ` <a7b2b5adb3b207ec2a4330067a03ce7e4c2225aa.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 18:38       ` Jason Gunthorpe
       [not found]         ` <20150129183839.GJ11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-02-01  8:38       ` Haggai Eran
  1 sibling, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 18:38 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 07:00:00PM +0100, Yann Droneaud wrote:
> As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> during OFA International Developer Workshop 2013, the request's
> comp_mask should describe the request data: it's describe the
> availability of extended fields in the request.
> Conversely, the response's comp_mask should describe the presence
> of extended fields in the response.

This makes sense to me as well.

> -	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
> +end:
> +	err = ib_copy_to_udata(ucore, &resp, resp_len);
>  	if (err)
>  		return err;
>  

I think resp_len should be returned, not 0?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure
       [not found]     ` <0b646f62e9272bc962a1ff6ff03ad9523b454dfe.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 18:39       ` Jason Gunthorpe
  2015-02-01  8:45       ` Haggai Eran
  1 sibling, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 18:39 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 07:00:01PM +0100, Yann Droneaud wrote:
> As only the requested fields are set and copied to userspace,
> there's no need to clear the content of the response structure
> beforehand.

Agreed.

Reviewed-By: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]         ` <20150129182800.GH11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-01-29 18:43           ` Yann Droneaud
       [not found]             ` <1422557009.3133.172.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 21:59           ` Yann Droneaud
  2015-02-01  7:39           ` Haggai Eran
  2 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 18:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 11:28 -0700, Jason Gunthorpe a écrit :
> On Thu, Jan 29, 2015 at 06:59:58PM +0100, Yann Droneaud wrote:
> > As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> > during OFA International Developer Workshop 2013, the request's
> > comp_mask should describe the request data: it's describe the
> > availability of extended fields in the request.
> > Conversely, the response's comp_mask should describe the presence
> > of extended fields in the response.
> 
> Roland: I agree with Yann, these patches need to go in, or the ODP
> patches reverted.
> 
> >  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > -	if (cmd.comp_mask & IB_USER_VERBS_EX_QUERY_DEVICE_ODP) {
> 
> Absolutely, a query output should never depend on the input comp_mask.
> 

We should keep this in mind then.

> > -		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > -		resp.odp_caps.per_transport_caps.rc_odp_caps =
> > -			attr.odp_caps.per_transport_caps.rc_odp_caps;
> > -		resp.odp_caps.per_transport_caps.uc_odp_caps =
> > -			attr.odp_caps.per_transport_caps.uc_odp_caps;
> > -		resp.odp_caps.per_transport_caps.ud_odp_caps =
> > -			attr.odp_caps.per_transport_caps.ud_odp_caps;
> > -		resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> > -	}
> > +	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > +	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > +		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > +	resp.odp_caps.per_transport_caps.uc_odp_caps =
> > +		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > +	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > +		attr.odp_caps.per_transport_caps.ud_odp_caps;
> >  #endif
> > +	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> 
> Not sure about this - if 0 is a valid null answer for all the _caps
> then it is fine, and the comp_mask bit should just be removed as the
> size alone should be enough.
> 

That's difficult to say. But I hope 0 are valids answers in this case.

Anyway, the response's comp_mask is needed, as it's the sole way for 
the userspace to know the size of the response. See below.

> This looks wrong however:
> 
> >  	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
> >  	if (err)
> >           return err;
> >        return 0;
> 
> Why isn't this returning the filled structure size to userspace?  This
> would seem to be very urgently wrong to me? Am I missing something?
> 
> Patch 3 probably should gain:
> 
> - return 0;
> + return resp_len;
> 

The write() syscall must return the size buffer passed to it, or less,
but in such case it would ask for trouble as userspace would be allowed
to write() the remaining bytes.
Returning a size bigger than the one passed to write() is not acceptable
and would break any expectation.

> This patch looks like an improvement to me:
> 
> Reviewed-By: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> 

Thanks a lot.

Regards.

-- 
Yann Droneaud
OPTEYA


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]             ` <1422557009.3133.172.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 19:18               ` Jason Gunthorpe
       [not found]                 ` <20150129191801.GM11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-01-29 20:42               ` Yann Droneaud
  1 sibling, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 19:18 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 07:43:29PM +0100, Yann Droneaud wrote:

> The write() syscall must return the size buffer passed to it, or
> less, but in such case it would ask for trouble as userspace would
> be allowed to write() the remaining bytes.  Returning a size bigger
> than the one passed to write() is not acceptable and would break any
> expectation.

By that logic the 0 return is still wrong, and it should be ucore->in_len

But I think we can return less without risking anything breaking, it
already violates the invariant associated with write() - it mutates
the buffer passed in!

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found]         ` <20150129183648.GI11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-01-29 19:22           ` Yann Droneaud
  2015-01-29 20:24           ` Jason Gunthorpe
  1 sibling, 0 replies; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 19:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 11:36 -0700, Jason Gunthorpe a écrit :
> On Thu, Jan 29, 2015 at 06:59:59PM +0100, Yann Droneaud wrote:
> > This patch ensures the extended QUERY_DEVICE uverbs request's
> > comp_mask has only known and supported bits (currently none).
> 
> I think I would be happy to see the input comp_mask removed
> entirely. I can't see a possible use for input data to a QUERY command
> that wouldn't be better served by creating a new command
> 
> But forcing the value to 0 seems reasonable as well.
> 

I cannot forsee the future, but having at least one unused bit
available allow for any kind of yet unknown extension: having this
bit/these bits checked now, permit fixing mistake later.

So let it be.

> Reviewed-By: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> 

Thanks.

> Also, the _ex varients were supposed to be supersets of the base call,
> so it is wrong that query_device_ex doesn't return all the same data
> as query_device, layed out so that the original response structure is
> a prefix of the extended response structure.
> 
> The other _ex calls in verbs were designed that way so it will be
> surprising to the user that this one is different.
> 

It seems to me it has the layout you're expecting:

    224 struct ib_uverbs_ex_query_device_resp {
    225         struct ib_uverbs_query_device_resp base;
    226         __u32 comp_mask;
    227         __u32 reserved;
    228         struct ib_uverbs_odp_caps odp_caps;
    229 };

Regards.

-- 
Yann Droneaud
OPTEYA


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size
       [not found]         ` <20150129183839.GJ11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-01-29 19:25           ` Yann Droneaud
  0 siblings, 0 replies; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 19:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 11:38 -0700, Jason Gunthorpe a écrit :
> On Thu, Jan 29, 2015 at 07:00:00PM +0100, Yann Droneaud wrote:
> > As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> > during OFA International Developer Workshop 2013, the request's
> > comp_mask should describe the request data: it's describe the
> > availability of extended fields in the request.
> > Conversely, the response's comp_mask should describe the presence
> > of extended fields in the response.
> 
> This makes sense to me as well.
> 
> > -	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
> > +end:
> > +	err = ib_copy_to_udata(ucore, &resp, resp_len);
> >  	if (err)
> >  		return err;
> >  
> 
> I think resp_len should be returned, not 0?
> 

As said previously it's not possible to use the effective size of the
response as the return value of the write() syscall: the syscall
must return the size of the input buffer, not the size of the output
buffer, otherwise it would break the semantics of the write() syscall.

Regards.

-- 
Yann Droneaud
OPTEYA


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found]         ` <20150129183648.GI11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-01-29 19:22           ` Yann Droneaud
@ 2015-01-29 20:24           ` Jason Gunthorpe
  1 sibling, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 20:24 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 11:36:48AM -0700, Jason Gunthorpe wrote:

> Also, the _ex varients were supposed to be supersets of the base call,
> so it is wrong that query_device_ex doesn't return all the same data
> as query_device, layed out so that the original response structure is
> a prefix of the extended response structure.

Never mind this, I see the copy_query_dev_fields call now!

Thanks,
Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]             ` <1422557009.3133.172.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 19:18               ` Jason Gunthorpe
@ 2015-01-29 20:42               ` Yann Droneaud
  1 sibling, 0 replies; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 20:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Le jeudi 29 janvier 2015 à 19:43 +0100, Yann Droneaud a écrit :
> Le jeudi 29 janvier 2015 à 11:28 -0700, Jason Gunthorpe a écrit :
> > On Thu, Jan 29, 2015 at 06:59:58PM +0100, Yann Droneaud wrote:

> > > -		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > > -		resp.odp_caps.per_transport_caps.rc_odp_caps =
> > > -			attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > -		resp.odp_caps.per_transport_caps.uc_odp_caps =
> > > -			attr.odp_caps.per_transport_caps.uc_odp_caps;
> > > -		resp.odp_caps.per_transport_caps.ud_odp_caps =
> > > -			attr.odp_caps.per_transport_caps.ud_odp_caps;
> > > -		resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> > > -	}
> > > +	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > > +	resp.odp_caps.per_transport_caps.rc_odp_caps =
> > > +		attr.odp_caps.per_transport_caps.rc_odp_caps;
> > > +	resp.odp_caps.per_transport_caps.uc_odp_caps =
> > > +		attr.odp_caps.per_transport_caps.uc_odp_caps;
> > > +	resp.odp_caps.per_transport_caps.ud_odp_caps =
> > > +		attr.odp_caps.per_transport_caps.ud_odp_caps;
> > >  #endif
> > > +	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> > 
> > Not sure about this - if 0 is a valid null answer for all the _caps
> > then it is fine, and the comp_mask bit should just be removed as the
> > size alone should be enough.
> > 
> 
> That's difficult to say. But I hope 0 are valids answers in this case.
> 

Hum, according to include/rdma/ib_verbs.h, there's a bit set
for ODP support:

    148 enum ib_odp_general_cap_bits {
    149         IB_ODP_SUPPORT = 1 << 0,
    150 };

So it should be safe to set everything to 0 in struct
ib_uverbs_odp_caps.

Regards.

-- 
Yann Droneaud
OPTEYA

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                 ` <20150129191801.GM11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-01-29 20:50                   ` Yann Droneaud
       [not found]                     ` <1422564638.3133.198.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 20:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 12:18 -0700, Jason Gunthorpe a écrit :
> On Thu, Jan 29, 2015 at 07:43:29PM +0100, Yann Droneaud wrote:
> 
> > The write() syscall must return the size buffer passed to it, or
> > less, but in such case it would ask for trouble as userspace would
> > be allowed to write() the remaining bytes.  Returning a size bigger
> > than the one passed to write() is not acceptable and would break any
> > expectation.
> 
> By that logic the 0 return is still wrong, and it should be ucore->in_len
> 

This is handled by ib_uverbs_write() in
drivers/infiniband/core/uverbs_main.c:

    709                 if (err)
    710                         return err;
    711 
    712                 return written_count;


> But I think we can return less without risking anything breaking, it
> already violates the invariant associated with write() - it mutates
> the buffer passed in!
> 

I don't think so, as only the response buffer is written to and the
response buffer pointer is provided in the buffer given to write().

AFAIK, no uverbs ever change the content of the input buffer (eg. the
request): I've managed to declare the various input buffers "const" so
it would surprising to find it use for writing to userspace.

Anyway, I recognize that uverb way of abusing write() syscall is 
borderline (at best) regarding other Linux subsystems and Unix paradigm 
in general. But it's not enough to screw it more.

Regards.

-- 
Yann Droneaud
OPTEYA

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                     ` <1422564638.3133.198.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 21:12                       ` Jason Gunthorpe
       [not found]                         ` <20150129211256.GA22099-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2015-01-29 21:12 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:

> Anyway, I recognize that uverb way of abusing write() syscall is 
> borderline (at best) regarding other Linux subsystems and Unix paradigm 
> in general. But it's not enough to screw it more.

Then we must return the correct output size explicitly in the struct.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]         ` <20150129182800.GH11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-01-29 18:43           ` Yann Droneaud
@ 2015-01-29 21:59           ` Yann Droneaud
       [not found]             ` <1422568741.3133.247.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-02-01  7:39           ` Haggai Eran
  2 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-29 21:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 11:28 -0700, Jason Gunthorpe a écrit :
> On Thu, Jan 29, 2015 at 06:59:58PM +0100, Yann Droneaud wrote:
> > As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> > during OFA International Developer Workshop 2013, the request's
> > comp_mask should describe the request data: it's describe the
> > availability of extended fields in the request.
> > Conversely, the response's comp_mask should describe the presence
> > of extended fields in the response.
> 
> Roland: I agree with Yann, these patches need to go in, or the ODP
> patches reverted.
> 

Reverting all On Demand Paging patches seems overkill:
if something as to be reverted it should be commit 5a77abf9a97a
("IB/core: Add support for extended query device caps") and the part of
commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")
which modify ib_uverbs_ex_query_device().

But I wonder about this part of commit 860f10a799c8:

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index c7a43624c96b..f9326ccda4b5 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -953,6 +953,18 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
                goto err_free;
        }
 
+       if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
+               struct ib_device_attr attr;
+
+               ret = ib_query_device(pd->device, &attr);
+               if (ret || !(attr.device_cap_flags &
+                               IB_DEVICE_ON_DEMAND_PAGING)) {
+                       pr_debug("ODP support not available\n");
+                       ret = -EINVAL;
+                       goto err_put;
+               }
+       }
+

AFAICT (1 << 6) bit from struct ib_uverbs_reg_mr access_flags field
was not enforced to be 0 previously, as ib_check_mr_access() only check 
for some coherency between a subset of the bits (it's not a function
dedicated to check flags provided by userspace):

include/rdma/ib_verbs.h:

   1094 enum ib_access_flags {
   1095         IB_ACCESS_LOCAL_WRITE   = 1,
   1096         IB_ACCESS_REMOTE_WRITE  = (1<<1),
   1097         IB_ACCESS_REMOTE_READ   = (1<<2),
   1098         IB_ACCESS_REMOTE_ATOMIC = (1<<3),
   1099         IB_ACCESS_MW_BIND       = (1<<4),
   1100         IB_ZERO_BASED           = (1<<5),
   1101         IB_ACCESS_ON_DEMAND     = (1<<6),
   1102 };

drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()

    961         ret = ib_check_mr_access(cmd.access_flags);
    962         if (ret)
    963                 return ret;

include/rdma/ib_verbs.h:

   2643 static inline int ib_check_mr_access(int flags)
   2644 {
   2645         /*
   2646          * Local write permission is required if remote write or
   2647          * remote atomic permission is also requested.
   2648          */
   2649         if (flags & (IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_REMOTE_WRITE) &&
   2650             !(flags & IB_ACCESS_LOCAL_WRITE))
   2651                 return -EINVAL;
   2652 
   2653         return 0;
   2654 }

drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()

    990         mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
    991                                      cmd.access_flags, &udata);

reg_user_mr() functions may call ib_umem_get() and pass access_flags to
it:

drivers/infiniband/core/umem.c: ib_umem_get()

    114         /*
    115          * We ask for writable memory if any of the following
    116          * access flags are set.  "Local write" and "remote write"
    117          * obviously require write access.  "Remote atomic" can do
    118          * things like fetch and add, which will modify memory, and
    119          * "MW bind" can change permissions by binding a window.
    120          */
    121         umem->writable  = !!(access &
    122                 (IB_ACCESS_LOCAL_WRITE   | IB_ACCESS_REMOTE_WRITE |
    123                  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
    124 
    125         if (access & IB_ACCESS_ON_DEMAND) {
    126                 ret = ib_umem_odp_get(context, umem);
    127                 if (ret) {
    128                         kfree(umem);
    129                         return ERR_PTR(ret);
    130                 }
    131                 return umem;
    132         }


As you can see only a few bits in access_flags are checked in the end,
so it may exist a very unlikely possibility that an existing userspace
program is setting this IB_ACCESS_ON_DEMAND bit without the intention
of enabling on demand paging as this would be unnoticed by older kernel.

In the other hand, a newer program built with on-demand-paging in mind
will set the bit, but when run on older kernel, it will be a no-op,
allowing the program to continue, perhaps thinking on-demand-paging
is available.

That should be avoided as much as possible.

Unfortunately, I think this cannot be fixed as it's was long since 
IB_ZERO_BASED was added by commit 7083e42ee2 ("IB/core: Add "type 2" 
memory windows support").
Anyway there was no check for IB_ACCESS_REMOTE_READ, nor 
IB_ACCESS_MW_BIND in the uverb layer either.

So, just as the second argument of open() syscall (remember O_TMPFILE,
see http://lwn.net/Articles/562294/ ), we will have to live with and be
careful ...

Regards.

-- 
Yann Droneaud
OPTEYA

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]             ` <1422568741.3133.247.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-29 23:17               ` Roland Dreier
       [not found]                 ` <CAG4TOxN4DpTMMsCM-1oe6-w+5xYR4YHdJeL7p2nQpUy9gYCSjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-02-01  8:04               ` Haggai Eran
  1 sibling, 1 reply; 35+ messages in thread
From: Roland Dreier @ 2015-01-29 23:17 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Jason Gunthorpe, Sagi Grimberg, Shachar Raindel, Eli Cohen,
	Haggai Eran, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Thu, Jan 29, 2015 at 1:59 PM, Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org> wrote:
>> Roland: I agree with Yann, these patches need to go in, or the ODP
>> patches reverted.

> Reverting all On Demand Paging patches seems overkill:
> if something as to be reverted it should be commit 5a77abf9a97a
> ("IB/core: Add support for extended query device caps") and the part of
> commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")
> which modify ib_uverbs_ex_query_device().

Thank you and Jason for taking on this interface.

At this point I feel like I do about the IPoIB changes -- we should
revert the broken stuff and get it right for 3.20.

If we revert the two things you describe above, is everything else OK
to leave in 3.19 with respect to ABI?

Thanks,
  Roland

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                 ` <CAG4TOxN4DpTMMsCM-1oe6-w+5xYR4YHdJeL7p2nQpUy9gYCSjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-30 17:26                   ` Yann Droneaud
       [not found]                     ` <1422638760.3133.260.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Yann Droneaud @ 2015-01-30 17:26 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Jason Gunthorpe, Sagi Grimberg, Shachar Raindel, Eli Cohen,
	Haggai Eran, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le jeudi 29 janvier 2015 à 15:17 -0800, Roland Dreier a écrit :
> On Thu, Jan 29, 2015 at 1:59 PM, Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org> wrote:
> >> Roland: I agree with Yann, these patches need to go in, or the ODP
> >> patches reverted.
> 
> > Reverting all On Demand Paging patches seems overkill:
> > if something as to be reverted it should be commit 5a77abf9a97a
> > ("IB/core: Add support for extended query device caps") and the part of
> > commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")
> > which modify ib_uverbs_ex_query_device().
> 
> Thank you and Jason for taking on this interface.
> 
> At this point I feel like I do about the IPoIB changes -- we should
> revert the broken stuff and get it right for 3.20.
> 
> If we revert the two things you describe above, is everything else OK
> to leave in 3.19 with respect to ABI?
> 

I've tried to review every changes since v3.18 on drivers/infiniband
include/rdma and include/uapi/rdma with respect to ABI issues.

I've noticed no other issue, but I have to admit I've not well reviewed
the drivers (hw/) internal changes.

If the IB_USER_VERBS_EX_CMD_QUERY_DEVICE and ib_uverbs_ex_query_device
changes are going to be reverted for v3.19, the on-demand-paging
feature will be available (IB_DEVICE_ON_DEMAND_PAGING will be set 
device_cap_flags in response to non extended QUERY_DEVICE for mlx5 HCA
and IB_ACCESS_ON_DEMAND access flag will be effective for REG_MR 
uverbs), but its parameters won't be. I don't know if it's a no-go for 
the usage of on-demand paging by userspace: I have not the chance
of owning HCA with the support for this feature, nor the patches
libibverbs / libmlx5 ... (anyway I would not have the time to test).
I've hoped people from Mellanox would have commented on the revert 
option too.

Regards.

-- 
Yann Droneaud
OPTEYA

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                     ` <1422638760.3133.260.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-01-31 20:09                       ` Or Gerlitz
       [not found]                         ` <CAJ3xEMhDtD7MpJ+1Y3z2yMpTrb9C5SaUa94E8xpVLHN4pHe3fw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-02-01 11:25                       ` Haggai Eran
  1 sibling, 1 reply; 35+ messages in thread
From: Or Gerlitz @ 2015-01-31 20:09 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Roland Dreier, Jason Gunthorpe, Sagi Grimberg, Shachar Raindel,
	Eli Cohen, Haggai Eran, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Fri, Jan 30, 2015, Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org> wrote:
> [..] I have not the chance
> of owning HCA with the support for this feature, nor the patches
> libibverbs / libmlx5 ... (anyway I would not have the time to test). [...]

Yann, have you ever run Linux over RDMA capable HW (IB, RoCE, iWARP or alike?)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]         ` <20150129182800.GH11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-01-29 18:43           ` Yann Droneaud
  2015-01-29 21:59           ` Yann Droneaud
@ 2015-02-01  7:39           ` Haggai Eran
  2 siblings, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  7:39 UTC (permalink / raw)
  To: Jason Gunthorpe, Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 20:28, Jason Gunthorpe wrote:
> On Thu, Jan 29, 2015 at 06:59:58PM +0100, Yann Droneaud wrote:
>> > -		resp.odp_caps.general_caps = attr.odp_caps.general_caps;
>> > -		resp.odp_caps.per_transport_caps.rc_odp_caps =
>> > -			attr.odp_caps.per_transport_caps.rc_odp_caps;
>> > -		resp.odp_caps.per_transport_caps.uc_odp_caps =
>> > -			attr.odp_caps.per_transport_caps.uc_odp_caps;
>> > -		resp.odp_caps.per_transport_caps.ud_odp_caps =
>> > -			attr.odp_caps.per_transport_caps.ud_odp_caps;
>> > -		resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
>> > -	}
>> > +	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
>> > +	resp.odp_caps.per_transport_caps.rc_odp_caps =
>> > +		attr.odp_caps.per_transport_caps.rc_odp_caps;
>> > +	resp.odp_caps.per_transport_caps.uc_odp_caps =
>> > +		attr.odp_caps.per_transport_caps.uc_odp_caps;
>> > +	resp.odp_caps.per_transport_caps.ud_odp_caps =
>> > +		attr.odp_caps.per_transport_caps.ud_odp_caps;
>> >  #endif
>> > +	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> Not sure about this - if 0 is a valid null answer for all the _caps
> then it is fine, and the comp_mask bit should just be removed as the
> size alone should be enough.

Zero is indeed a valid answer. There the IB_ODP_SUPPORT bit in the
general_caps field that says whether or not ODP is supported in general.
The per transport capabilities are also default to not supported.

However, I think we should keep the comp_mask field for future
extensions. The current code doesn't report the size of the response to
user space, and in addition, comp_mask being a bit mask has the
advantage of allowing only part of the structure to be marked valid.

Regards,
Haggai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]             ` <1422568741.3133.247.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 23:17               ` Roland Dreier
@ 2015-02-01  8:04               ` Haggai Eran
  1 sibling, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  8:04 UTC (permalink / raw)
  To: Yann Droneaud, Jason Gunthorpe
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 23:59, Yann Droneaud wrote:
> But I wonder about this part of commit 860f10a799c8:
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index c7a43624c96b..f9326ccda4b5 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -953,6 +953,18 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
>                 goto err_free;
>         }
>  
> +       if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
> +               struct ib_device_attr attr;
> +
> +               ret = ib_query_device(pd->device, &attr);
> +               if (ret || !(attr.device_cap_flags &
> +                               IB_DEVICE_ON_DEMAND_PAGING)) {
> +                       pr_debug("ODP support not available\n");
> +                       ret = -EINVAL;
> +                       goto err_put;
> +               }
> +       }
> +
> 
> AFAICT (1 << 6) bit from struct ib_uverbs_reg_mr access_flags field
> was not enforced to be 0 previously, as ib_check_mr_access() only check 
> for some coherency between a subset of the bits (it's not a function
> dedicated to check flags provided by userspace):
> 
> include/rdma/ib_verbs.h:
> 
>    1094 enum ib_access_flags {
>    1095         IB_ACCESS_LOCAL_WRITE   = 1,
>    1096         IB_ACCESS_REMOTE_WRITE  = (1<<1),
>    1097         IB_ACCESS_REMOTE_READ   = (1<<2),
>    1098         IB_ACCESS_REMOTE_ATOMIC = (1<<3),
>    1099         IB_ACCESS_MW_BIND       = (1<<4),
>    1100         IB_ZERO_BASED           = (1<<5),
>    1101         IB_ACCESS_ON_DEMAND     = (1<<6),
>    1102 };
> 
> drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()
> 
>     961         ret = ib_check_mr_access(cmd.access_flags);
>     962         if (ret)
>     963                 return ret;
> 
> include/rdma/ib_verbs.h:
> 
>    2643 static inline int ib_check_mr_access(int flags)
>    2644 {
>    2645         /*
>    2646          * Local write permission is required if remote write or
>    2647          * remote atomic permission is also requested.
>    2648          */
>    2649         if (flags & (IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_REMOTE_WRITE) &&
>    2650             !(flags & IB_ACCESS_LOCAL_WRITE))
>    2651                 return -EINVAL;
>    2652 
>    2653         return 0;
>    2654 }
> 
> drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()
> 
>     990         mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
>     991                                      cmd.access_flags, &udata);
> 
> reg_user_mr() functions may call ib_umem_get() and pass access_flags to
> it:
> 
> drivers/infiniband/core/umem.c: ib_umem_get()
> 
>     114         /*
>     115          * We ask for writable memory if any of the following
>     116          * access flags are set.  "Local write" and "remote write"
>     117          * obviously require write access.  "Remote atomic" can do
>     118          * things like fetch and add, which will modify memory, and
>     119          * "MW bind" can change permissions by binding a window.
>     120          */
>     121         umem->writable  = !!(access &
>     122                 (IB_ACCESS_LOCAL_WRITE   | IB_ACCESS_REMOTE_WRITE |
>     123                  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
>     124 
>     125         if (access & IB_ACCESS_ON_DEMAND) {
>     126                 ret = ib_umem_odp_get(context, umem);
>     127                 if (ret) {
>     128                         kfree(umem);
>     129                         return ERR_PTR(ret);
>     130                 }
>     131                 return umem;
>     132         }
> 
> 
> As you can see only a few bits in access_flags are checked in the end,
> so it may exist a very unlikely possibility that an existing userspace
> program is setting this IB_ACCESS_ON_DEMAND bit without the intention
> of enabling on demand paging as this would be unnoticed by older kernel.
>
> In the other hand, a newer program built with on-demand-paging in mind
> will set the bit, but when run on older kernel, it will be a no-op,
> allowing the program to continue, perhaps thinking on-demand-paging
> is available.

This is why we added a method for userspace to check the kernel
capabilities both when adding the memory windows support and with ODP.
User-space can still send garbage to the kernel, but if it does so
without checking the kernel supports its request, it's user-space's problem.

> That should be avoided as much as possible.
> 
> Unfortunately, I think this cannot be fixed as it's was long since 
> IB_ZERO_BASED was added by commit 7083e42ee2 ("IB/core: Add "type 2" 
> memory windows support").
> Anyway there was no check for IB_ACCESS_REMOTE_READ, nor 
> IB_ACCESS_MW_BIND in the uverb layer either.

I think it would be perfectly fine to add a check today that validates
the MR access flags input is known. Because the newer feature require
user-space to check capabilities, I don't think it would matter if we
returned an error on newer kernels that do not support these features.

Haggai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found]     ` <335da45738872e446f63db338ca766a34608c90a.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:36       ` Jason Gunthorpe
@ 2015-02-01  8:12       ` Haggai Eran
       [not found]         ` <54CDDFE4.7030003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  8:12 UTC (permalink / raw)
  To: Yann Droneaud, Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 19:59, Yann Droneaud wrote:
> This patch ensures the extended QUERY_DEVICE uverbs request's
> comp_mask has only known and supported bits (currently none).
> 
> If userspace set unknown features bits, -EINVAL will be returned,
> ensuring current programs are not allowed to set random feature
> bits: such bits could enable new extended features in future kernel
> versions and those features can trigger a behavior not unsupported
> by the older programs or make the newer kernels return an error
> for a request which was valid on older kernels.
> 
> Additionally, returning an error for unsupported feature would
> allow userspace to probe/discover which extended features are
> currently supported by a kernel.

As I wrote before, I hope in the future we don't force userspace to
probe features this way, because it may be unnecessarily complex.

I agree though that we should have a way to extend this verb in the future.

Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

> 
> Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/infiniband/core/uverbs_cmd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 6ef06a9b4362..fbcc54b86795 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -3312,6 +3312,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  	if (err)
>  		return err;
>  
> +	if (cmd.comp_mask)
> +		return -EINVAL;
> +
>  	if (cmd.reserved)
>  		return -EINVAL;
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size
       [not found]     ` <a7b2b5adb3b207ec2a4330067a03ce7e4c2225aa.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:38       ` Jason Gunthorpe
@ 2015-02-01  8:38       ` Haggai Eran
  1 sibling, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  8:38 UTC (permalink / raw)
  To: Yann Droneaud, Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 20:00, Yann Droneaud wrote:
> As specified in "Extending Verbs API" presentation [1] by Tzahi Oved
> during OFA International Developer Workshop 2013, the request's
> comp_mask should describe the request data: it's describe the
> availability of extended fields in the request.
> Conversely, the response's comp_mask should describe the presence
> of extended fields in the response.
> 
> So instead of silently truncating extended QUERY_DEVICE uverb's
> response, see commit 5a77abf9a97a ("IB/core: Add support for
> extended query device caps")), this patch makes function
> ib_uverbs_ex_query_device() check the available space in the
> response buffer against the minimal response structure and fail
> with -ENOSPC if this base structure cannot be returned to
> userspace: it's required to be able to return the comp_mask's
> value, at least.
> 
> For extended features, currently only IB_USER_VERBS_EX_QUERY_DEVICE_ODP
> per commit 860f10a799c8 ("IB/core: Add flags for on demand paging
> support"), the extension's data is returned if the needed space
> is available in the response buffer: it is expected that newer
> userspace program pass a bigger response buffer so that it can
> retrieve extended features. The comp_mask value will match the fields
> that were effectively returned to userspace.
> 
> In the end:
> - userspace won't get truncated responses;
> - newer kernel would be able to support older binaries and
>   older binaries will work flawlessly with newer kernel;
> - additionally, newer binaries would even be able to support
>   older kernel, as far as they don't set new feature bit in
>   request's comp_mask.
> 
> Note: as offsetof() is used to retrieve the size of the lower chunk
> of the response, beware that it only works if the upper chunk
> is right after, without any implicit padding. And, as the size of
> the latter chunk is added to the base size, implicit padding at the
> end of the structure is not taken in account. Both point must be
> taken in account when extending the uverbs functionalities.
> 
> [1] https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_0830_Tzahi_Oved-verbs_extensions_ofa_2013-tzahio.pdf
> 
> Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

> Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/infiniband/core/uverbs_cmd.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index fbcc54b86795..81d8b5aa2eb6 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -3302,6 +3302,7 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  	struct ib_uverbs_ex_query_device  cmd;
>  	struct ib_device_attr attr;
>  	struct ib_device *device;
> +	size_t resp_len;
>  	int err;
>  
>  	device = file->device->ib_dev;
> @@ -3318,6 +3319,11 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  	if (cmd.reserved)
>  		return -EINVAL;
>  
> +	resp_len = offsetof(typeof(resp), odp_caps);
> +
> +	if (ucore->outlen < resp_len)
> +		return -ENOSPC;
> +
>  	err = device->query_device(device, &attr);
>  	if (err)
>  		return err;
> @@ -3326,6 +3332,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  	copy_query_dev_fields(file, &resp.base, &attr);
>  	resp.comp_mask = 0;
>  
> +	if (ucore->outlen < resp_len + sizeof(resp.odp_caps))
> +		goto end;
> +
>  #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
>  	resp.odp_caps.general_caps = attr.odp_caps.general_caps;
>  	resp.odp_caps.per_transport_caps.rc_odp_caps =
> @@ -3336,8 +3345,10 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  		attr.odp_caps.per_transport_caps.ud_odp_caps;
>  #endif
>  	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
> +	resp_len += sizeof(resp.odp_caps);
>  
> -	err = ib_copy_to_udata(ucore, &resp, sizeof(resp));
> +end:
> +	err = ib_copy_to_udata(ucore, &resp, resp_len);
>  	if (err)
>  		return err;
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure
       [not found]     ` <0b646f62e9272bc962a1ff6ff03ad9523b454dfe.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-29 18:39       ` Jason Gunthorpe
@ 2015-02-01  8:45       ` Haggai Eran
  1 sibling, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  8:45 UTC (permalink / raw)
  To: Yann Droneaud, Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 20:00, Yann Droneaud wrote:
> As only the requested fields are set and copied to userspace,
> there's no need to clear the content of the response structure
> beforehand.

Care must be taken to zero out all the data that is copied to userspace,
but I agree it can be done only to the parts that are unsupported as
this patch does.

Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

> Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/infiniband/core/uverbs_cmd.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 81d8b5aa2eb6..9520880a163f 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -3328,9 +3328,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  	if (err)
>  		return err;
>  
> -	memset(&resp, 0, sizeof(resp));
>  	copy_query_dev_fields(file, &resp.base, &attr);
>  	resp.comp_mask = 0;
> +	resp.reserved = 0;
>  
>  	if (ucore->outlen < resp_len + sizeof(resp.odp_caps))
>  		goto end;
> @@ -3343,6 +3343,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>  		attr.odp_caps.per_transport_caps.uc_odp_caps;
>  	resp.odp_caps.per_transport_caps.ud_odp_caps =
>  		attr.odp_caps.per_transport_caps.ud_odp_caps;
> +	resp.odp_caps.reserved = 0;
> +#else
> +	memset(&resp.odp_caps, 0, sizeof(resp.odp_caps));
>  #endif
>  	resp.comp_mask |= IB_USER_VERBS_EX_QUERY_DEVICE_ODP;
>  	resp_len += sizeof(resp.odp_caps);
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 5/5] IB/core: ib_copy_to_udata(): don't silently truncate response
       [not found]     ` <c69af8952bf25fdbcdfc527b0636bc3177798b95.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
@ 2015-02-01  8:47       ` Haggai Eran
  0 siblings, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01  8:47 UTC (permalink / raw)
  To: Yann Droneaud, Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

On 29/01/2015 20:00, Yann Droneaud wrote:
> While ib_copy_to_udata() should check for the available output
> space as already proposed in some other patches [1][2][3], the
> changes brought by commit 5a77abf9a97a ("IB/core: Add support for
> extended query device caps") are silently truncating the data to
> be written to userspace if the output buffer is not large enough
> to hold the response data.
> 
> Silently truncating the response is not a reliable behavior as
> userspace is not given any hint about this truncation: userspace
> is leaved with garbage to play with.
> 
> Not checking the response buffer size and writing past the
> userspace buffer is no good either, but it's the current behavior.
> 
> So this patch revert the particular change on ib_copy_to_udata()
> as a better behavior is implemented in the upper level function
> ib_uverbs_ex_query_device().
> 
> [1] "[PATCH 00/22] infiniband: improve userspace input check"
> 
> http://mid.gmane.org/cover.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> 
> [2] "[PATCH 03/22] infiniband: ib_copy_from_udata(): check input length"
> 
> http://mid.gmane.org/2bf102a41c51f61965ee09df827abe8fefb523a9.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> 
> [3] "[PATCH 04/22] infiniband: ib_copy_to_udata(): check output length"
> 
> http://mid.gmane.org/d27716a3a1c180f832d153a7402f65ea8a75b734.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> 
> Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
> Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

> Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> ---
>  include/rdma/ib_verbs.h | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 0d74f1de99aa..65994a19e840 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1707,10 +1707,7 @@ static inline int ib_copy_from_udata(void *dest, struct ib_udata *udata, size_t
>  
>  static inline int ib_copy_to_udata(struct ib_udata *udata, void *src, size_t len)
>  {
> -	size_t copy_sz;
> -
> -	copy_sz = min_t(size_t, len, udata->outlen);
> -	return copy_to_user(udata->outbuf, src, copy_sz) ? -EFAULT : 0;
> +	return copy_to_user(udata->outbuf, src, len) ? -EFAULT : 0;
>  }
>  
>  /**
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                     ` <1422638760.3133.260.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
  2015-01-31 20:09                       ` Or Gerlitz
@ 2015-02-01 11:25                       ` Haggai Eran
  1 sibling, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-01 11:25 UTC (permalink / raw)
  To: Yann Droneaud, Roland Dreier
  Cc: Jason Gunthorpe, Sagi Grimberg, Shachar Raindel, Eli Cohen,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 30/01/2015 19:26, Yann Droneaud wrote:
> Hi,
> 
> Le jeudi 29 janvier 2015 à 15:17 -0800, Roland Dreier a écrit :
>> On Thu, Jan 29, 2015 at 1:59 PM, Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org> wrote:
>>>> Roland: I agree with Yann, these patches need to go in, or the ODP
>>>> patches reverted.
>>
>>> Reverting all On Demand Paging patches seems overkill:
>>> if something as to be reverted it should be commit 5a77abf9a97a
>>> ("IB/core: Add support for extended query device caps") and the part of
>>> commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")
>>> which modify ib_uverbs_ex_query_device().
>>
>> Thank you and Jason for taking on this interface.
>>
>> At this point I feel like I do about the IPoIB changes -- we should
>> revert the broken stuff and get it right for 3.20.
>>
>> If we revert the two things you describe above, is everything else OK
>> to leave in 3.19 with respect to ABI?
>>
> 
> I've tried to review every changes since v3.18 on drivers/infiniband
> include/rdma and include/uapi/rdma with respect to ABI issues.
> 
> I've noticed no other issue, but I have to admit I've not well reviewed
> the drivers (hw/) internal changes.
> 
> If the IB_USER_VERBS_EX_CMD_QUERY_DEVICE and ib_uverbs_ex_query_device
> changes are going to be reverted for v3.19, the on-demand-paging
> feature will be available (IB_DEVICE_ON_DEMAND_PAGING will be set 
> device_cap_flags in response to non extended QUERY_DEVICE for mlx5 HCA
> and IB_ACCESS_ON_DEMAND access flag will be effective for REG_MR 
> uverbs), but its parameters won't be. 

For user-space to make use of on demand paging, it should verify the
specific transport and operation is supported. If they don't, they will
encounter errors when a page fault occurs.

> I don't know if it's a no-go for 
> the usage of on-demand paging by userspace: I have not the chance
> of owning HCA with the support for this feature, nor the patches
> libibverbs / libmlx5 ... (anyway I would not have the time to test).
> I've hoped people from Mellanox would have commented on the revert 
> option too.

I would prefer it if Yann's patches are accepted. I understand it is
very late, but they are quite short, and I think they provide the right
semantic for this new verb.

As a second option, in case you prefer to revert the extended query
device patch, I will send a patch shortly to do that.

Regards,
Haggai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 2/5] IB/uverbs: ex_query_device: check request's comp_mask
       [not found]         ` <54CDDFE4.7030003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-02-01 11:55           ` Yann Droneaud
  0 siblings, 0 replies; 35+ messages in thread
From: Yann Droneaud @ 2015-02-01 11:55 UTC (permalink / raw)
  To: Haggai Eran
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi,

Le dimanche 01 février 2015 à 10:12 +0200, Haggai Eran a écrit :
> On 29/01/2015 19:59, Yann Droneaud wrote:
> > This patch ensures the extended QUERY_DEVICE uverbs request's
> > comp_mask has only known and supported bits (currently none).
> > 
> > If userspace set unknown features bits, -EINVAL will be returned,
> > ensuring current programs are not allowed to set random feature
> > bits: such bits could enable new extended features in future kernel
> > versions and those features can trigger a behavior not unsupported
> > by the older programs or make the newer kernels return an error
> > for a request which was valid on older kernels.
> > 
> > Additionally, returning an error for unsupported feature would
> > allow userspace to probe/discover which extended features are
> > currently supported by a kernel.
> 
> As I wrote before, I hope in the future we don't force userspace to
> probe features this way, because it may be unnecessarily complex.
> 

I believe that most use cases won't need probing as applications are
often built according to the current kernel features in mind.

If applications need to use new features, it seems to be a small price
to pay to be prepared to get -EINVAL.

In another word: backward compatibility from application point of view:
a newer application wanting to run on older kernel must be prepared to.

> I agree though that we should have a way to extend this verb in the future.
> 
> Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 

Thanks.

> > 
> > Link: http://mid.gmane.org/cover.1422553023.git.ydroneaud-RlY5vtjFyJ1hl2p70BpVqQ@public.gmane.orgm
> > Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/infiniband/core/uverbs_cmd.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index 6ef06a9b4362..fbcc54b86795 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -3312,6 +3312,9 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
> >  	if (err)
> >  		return err;
> >  
> > +	if (cmd.comp_mask)
> > +		return -EINVAL;
> > +
> >  	if (cmd.reserved)
> >  		return -EINVAL;
> >  
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                         ` <20150129211256.GA22099-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-02-05  2:54                           ` Weiny, Ira
       [not found]                             ` <2807E5FD2F6FDA4886F6618EAC48510E0CC12C30-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Weiny, Ira @ 2015-02-05  2:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Haggai Eran,
	Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

> 
> On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:
> 
> > Anyway, I recognize that uverb way of abusing write() syscall is
> > borderline (at best) regarding other Linux subsystems and Unix
> > paradigm in general. But it's not enough to screw it more.
> 
> Then we must return the correct output size explicitly in the struct.

I was thinking this very same thing as I read through this thread.

I too would like to avoid the use of comp_masks if at all possible.  The query call seems to be a verb where the structure size is all you really need to know the set of values returned.

As Jason says, other calls may require the comp_mask where 0 is not sufficient to indicate "missing".

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                             ` <2807E5FD2F6FDA4886F6618EAC48510E0CC12C30-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-02-05  8:26                               ` Haggai Eran
       [not found]                                 ` <54D32933.8080307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Haggai Eran @ 2015-02-05  8:26 UTC (permalink / raw)
  To: Weiny, Ira, Jason Gunthorpe, Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 05/02/2015 04:54, Weiny, Ira wrote:
>>
>> On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:
>>
>>> Anyway, I recognize that uverb way of abusing write() syscall is
>>> borderline (at best) regarding other Linux subsystems and Unix
>>> paradigm in general. But it's not enough to screw it more.
>>
>> Then we must return the correct output size explicitly in the struct.
> 
> I was thinking this very same thing as I read through this thread.
> 
> I too would like to avoid the use of comp_masks if at all possible.  The query call seems to be a verb where the structure size is all you really need to know the set of values returned.
> 
> As Jason says, other calls may require the comp_mask where 0 is not sufficient to indicate "missing".

Would it be okay to return it in the ib_uverbs_cmd_hdr.out_words? That
would further abuse the write() syscall by writing to the input buffer.
However, the only other alternative I see is to add it explicitly to
every uverb response struct.

Haggai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                                 ` <54D32933.8080307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-02-07  0:52                                   ` Weiny, Ira
       [not found]                                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CC201F7-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Weiny, Ira @ 2015-02-07  0:52 UTC (permalink / raw)
  To: Haggai Eran, Jason Gunthorpe, Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

> 
> On 05/02/2015 04:54, Weiny, Ira wrote:
> >>
> >> On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:
> >>
> >>> Anyway, I recognize that uverb way of abusing write() syscall is
> >>> borderline (at best) regarding other Linux subsystems and Unix
> >>> paradigm in general. But it's not enough to screw it more.
> >>
> >> Then we must return the correct output size explicitly in the struct.
> >
> > I was thinking this very same thing as I read through this thread.
> >
> > I too would like to avoid the use of comp_masks if at all possible.  The query
> call seems to be a verb where the structure size is all you really need to know
> the set of values returned.
> >
> > As Jason says, other calls may require the comp_mask where 0 is not
> sufficient to indicate "missing".
> 
> Would it be okay to return it in the ib_uverbs_cmd_hdr.out_words? That would
> further abuse the write() syscall by writing to the input buffer.

I don't think that is such a great idea.
 
> However, the only other alternative I see is to add it explicitly to every uverb
> response struct.
> 

I think this is the best solution.  There is a 32 bit reserved field in ib_uverbs_ex_query_device_resp.  Could we use all or part of that to be the size?

For other extended commands I'm not sure what to do.

Ira

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CC201F7-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-02-08  7:27                                       ` Haggai Eran
  0 siblings, 0 replies; 35+ messages in thread
From: Haggai Eran @ 2015-02-08  7:27 UTC (permalink / raw)
  To: Weiny, Ira, Jason Gunthorpe, Yann Droneaud
  Cc: Sagi Grimberg, Shachar Raindel, Eli Cohen, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 07/02/2015 02:52, Weiny, Ira wrote:
>>
>> On 05/02/2015 04:54, Weiny, Ira wrote:
>>>>
>>>> On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:
>>>>
>>>>> Anyway, I recognize that uverb way of abusing write() syscall is
>>>>> borderline (at best) regarding other Linux subsystems and Unix
>>>>> paradigm in general. But it's not enough to screw it more.
>>>>
>>>> Then we must return the correct output size explicitly in the struct.
>>>
>>> I was thinking this very same thing as I read through this thread.
>>>
>>> I too would like to avoid the use of comp_masks if at all possible.  The query
>> call seems to be a verb where the structure size is all you really need to know
>> the set of values returned.
>>>
>>> As Jason says, other calls may require the comp_mask where 0 is not
>> sufficient to indicate "missing".
>>
...
>  
>> However, the only other alternative I see is to add it explicitly to every uverb
>> response struct.
>>
> 
> I think this is the best solution.  There is a 32 bit reserved field in ib_uverbs_ex_query_device_resp.  Could we use all or part of that to be the size?

Yes, I think 32-bit for the response length are more than enough.

I will send patches for 3.20 to re-introduce ib_uverbs_ex_query_device
with the response size instead of the reserved field, and with Yann's
changes.

Thanks,
Haggai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask
       [not found]                         ` <CAJ3xEMhDtD7MpJ+1Y3z2yMpTrb9C5SaUa94E8xpVLHN4pHe3fw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-03-02 14:08                           ` Yann Droneaud
  0 siblings, 0 replies; 35+ messages in thread
From: Yann Droneaud @ 2015-03-02 14:08 UTC (permalink / raw)
  To: Or Gerlitz, Or Gerlitz
  Cc: Roland Dreier, Jason Gunthorpe, Sagi Grimberg, Shachar Raindel,
	Eli Cohen, Haggai Eran, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi Or,

Le samedi 31 janvier 2015 à 22:09 +0200, Or Gerlitz a écrit :
> On Fri, Jan 30, 2015, Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org> wrote:
> > [..] I have not the chance
> > of owning HCA with the support for this feature, nor the patches
> > libibverbs / libmlx5 ... (anyway I would not have the time to test). [...]
> 
> Yann, have you ever run Linux over RDMA capable HW (IB, RoCE, iWARP or alike?)

That was a silly question.

So I've took a little time to consider answering it,
perhaps you were gently kidding me.

Anyway, please take the following links as an answer to your question:

"rdma_create_qp() and max_send_wr"
<http://mid.gmane.org/1303404264.2243.19.camel-H/AUWmsJYVeqvyCYKW+Xr6xOck334EZe@public.gmane.org>

"ibv_create_qp() and max_inline_data behavior"
<http://mid.gmane.org/d7685e13978f93890b2939c5ac2d5c7f-zgzEX58YAwA@public.gmane.org>

"[PATCH librdmacm 0/3] no IBV_SEND_INLINE thus memory registration
required on QLogic/Intel HCA"
<http://mid.gmane.org/cover.1376746185.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>

"[PATCH 00/22] infiniband: improve userspace input check"
<http://mid.gmane.org/cover.1376847403.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>

I hope this would be enough to join the gang^Wclub. Where're the
by-laws ?

Regards.

-- 
Yann Droneaud
OPTEYA

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-03-02 14:08 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-29 17:59 [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19 Yann Droneaud
     [not found] ` <cover.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 17:59   ` [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask Yann Droneaud
     [not found]     ` <24ceb1fc5b2b6563532e5776b1b2320b1ae37543.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 18:28       ` Jason Gunthorpe
     [not found]         ` <20150129182800.GH11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 18:43           ` Yann Droneaud
     [not found]             ` <1422557009.3133.172.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 19:18               ` Jason Gunthorpe
     [not found]                 ` <20150129191801.GM11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 20:50                   ` Yann Droneaud
     [not found]                     ` <1422564638.3133.198.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 21:12                       ` Jason Gunthorpe
     [not found]                         ` <20150129211256.GA22099-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-02-05  2:54                           ` Weiny, Ira
     [not found]                             ` <2807E5FD2F6FDA4886F6618EAC48510E0CC12C30-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-02-05  8:26                               ` Haggai Eran
     [not found]                                 ` <54D32933.8080307-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-02-07  0:52                                   ` Weiny, Ira
     [not found]                                     ` <2807E5FD2F6FDA4886F6618EAC48510E0CC201F7-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-02-08  7:27                                       ` Haggai Eran
2015-01-29 20:42               ` Yann Droneaud
2015-01-29 21:59           ` Yann Droneaud
     [not found]             ` <1422568741.3133.247.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 23:17               ` Roland Dreier
     [not found]                 ` <CAG4TOxN4DpTMMsCM-1oe6-w+5xYR4YHdJeL7p2nQpUy9gYCSjA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-30 17:26                   ` Yann Droneaud
     [not found]                     ` <1422638760.3133.260.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-31 20:09                       ` Or Gerlitz
     [not found]                         ` <CAJ3xEMhDtD7MpJ+1Y3z2yMpTrb9C5SaUa94E8xpVLHN4pHe3fw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-03-02 14:08                           ` Yann Droneaud
2015-02-01 11:25                       ` Haggai Eran
2015-02-01  8:04               ` Haggai Eran
2015-02-01  7:39           ` Haggai Eran
2015-01-29 17:59   ` [PATCH v1 2/5] IB/uverbs: ex_query_device: check " Yann Droneaud
     [not found]     ` <335da45738872e446f63db338ca766a34608c90a.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 18:36       ` Jason Gunthorpe
     [not found]         ` <20150129183648.GI11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 19:22           ` Yann Droneaud
2015-01-29 20:24           ` Jason Gunthorpe
2015-02-01  8:12       ` Haggai Eran
     [not found]         ` <54CDDFE4.7030003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-02-01 11:55           ` Yann Droneaud
2015-01-29 18:00   ` [PATCH v1 3/5] IB/uverbs: ex_query_device: answer must depend on response buffer's size Yann Droneaud
     [not found]     ` <a7b2b5adb3b207ec2a4330067a03ce7e4c2225aa.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 18:38       ` Jason Gunthorpe
     [not found]         ` <20150129183839.GJ11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 19:25           ` Yann Droneaud
2015-02-01  8:38       ` Haggai Eran
2015-01-29 18:00   ` [PATCH v1 4/5] IB/uverbs: ex_query_device: no need to clear the whole structure Yann Droneaud
     [not found]     ` <0b646f62e9272bc962a1ff6ff03ad9523b454dfe.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 18:39       ` Jason Gunthorpe
2015-02-01  8:45       ` Haggai Eran
2015-01-29 18:00   ` [PATCH v1 5/5] IB/core: ib_copy_to_udata(): don't silently truncate response Yann Droneaud
     [not found]     ` <c69af8952bf25fdbcdfc527b0636bc3177798b95.1422553023.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-02-01  8:47       ` Haggai Eran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.