All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
To: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 3/4] IB/uverbs: ex_query_device: check request's comp_mask
Date: Mon, 26 Jan 2015 12:17:09 +0100	[thread overview]
Message-ID: <1422271029.3133.68.camel@opteya.com> (raw)
In-Reply-To: <54C50A67.6040306-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Hi,

(adding linux-api@ for comments:

 We're introducing a new "uverb" in the InfiniBand subsystem:
 extended QUERY_DEVICE in v3.19:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.19-rc6#n3297
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6#n209
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6#n224

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5a77abf9a97a
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=860f10a799c8

 As you may not already know, InfiniBand subsystem use a write() 
 syscall[1] to issue ioctl()-like operations. Many operations (aka   
 verbs) are available [2], for each there's a query data structures
 and for some there's a response data structure [3]. As a result to a 
 write() operation, kernel could allocate resources on the task behalf 
 and/or write data back to userspace provided buffers whose pointers 
 were part of buffer passed to write().

 I've expressed concern on the way the new extended QUERY_DEVICE
 was trying to be itself expandable: by silently ignoring shorter
 buffer, returning truncated data, the interface seems awkward.

"Re: [PATCH v2 06/17] IB/core: Add support for extended query device caps"
http://mid.gmane.org/1418216676.11111.45.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

"Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps"
http://mid.gmane.org/1418733236.2779.26.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org
http://mid.gmane.org/1418824811.3334.3.camel@dworkin
http://mid.gmane.org/1421844612.13543.40.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

 I recognize the author's intention to provide a way for userspace
 to retrieve easily the highest supported information as something
 desirable.

 But I believe we must be more strict on the request content and
 fail for any unrecognized, unsupported, incorrect bits to make
 safer to extend the interface latter.

 I've submitted a patchset[4] to address these issues.
 But, while I'm convinced it the way to go, I'm not able to find
 how it could be made to satisfy the original author expectations.
 
 I hope linux-api@ readers could provide some insight regarding
 the way we should implement such interface

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.19-rc6#n599

[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.19-rc6#n81

[3] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6

[4] http://mid.gmane.org/cover.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

)

Le dimanche 25 janvier 2015 à 17:23 +0200, Haggai Eran a écrit :
> On 22/01/2015 15:28, Yann Droneaud wrote:
> > This patch ensures the extended QUERY_DEVICE uverbs request's
> > comp_mask has only known values. If userspace returns unknown
> > features, -EINVAL will be returned, allowing to probe/discover
> > which features are currently supported by the kernel.
> 
> This probing process will be much more cumbersome than it needs to be
> because userspace will have to call QUERY_DEVICE repetitively with
> different comp_mask fields until it finds out the exact set of supported
> bits.
> 

O(log2(N))

Or you had to add a different interface, dedicated to retrieve the exact
supported feature mask.

> > Moreover, it also ensure the requested features set in comp_mask
> > are sequentially set, not skipping intermediate features, eg. the
> > "highest" requested feature also request all the "lower" ones.
> > This way each new feature will have to be stacked on top of the
> > existing ones: this is especially important for the request and
> > response data structures where fields are added after the
> > current ones when expanded to support a new feature.
> 
> I think it is perfectly acceptable that not all drivers will implement
> all available features, and so you shouldn't enforce this requirement.

With regard to QUERY_DEVICE: the data structure layout depends on the
comp_mask value. So either you propose a way to express multipart data
structure (see CMSG or NETLINK), or we have to ensure the data structure
is ever-growing, with each new chunck stacked over the existing ones:
that's the purpose of :

	if (cmd.comp_mask & (cmd.comp_mask + 1))
		return -EINVAL;

> Also, it makes the comp_mask nothing more than a wasteful version number
> between 0 and 63.

That's what I've already explained earlier in "Re: [PATCH v3 06/17]
IB/core: Add support for extended query device caps":

http://mid.gmane.org/1421844612.13543.40.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org

> 
> In the specific case of QUERY_DEVICE you might argue that there isn't
> any need for input comp_mask, only for output, and then you may enforce
> the input comp_mask will always be zero.

The extended QUERY_DEVICE uverbs as currently merged is using comp_mask
from input to choose to report on-demand-paging related value. So it
seems it's needed.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.19-rc6#n3297

> However, you will in any case need to be able to extended the size of the response in the future.
> 

That's already the case for on demand paging.

> > 
> > Link: http://mid.gmane.org/cover.1421931555.git.ydroneaud-RlY5vtjFyJ1hl2p70BpVqQ@public.gmane.orgm
> > Cc: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Yann Droneaud <ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/infiniband/core/uverbs_cmd.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index 8668b328b7e6..80a1c90f1dbf 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -3313,6 +3313,12 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
> >  	if (err)
> >  		return err;
> >  
> > +	if (cmd.comp_mask & (cmd.comp_mask + 1))
> > +		return -EINVAL;
> > +
> > +	if (cmd.comp_mask & ~(__u32)IB_USER_VERBS_EX_QUERY_DEVICE_ODP)
> > +		return -EINVAL;
> > +

If we keep the checks on output buffer size from patch 1, returning
-ENOSPC in case of size mismatch, and if we are sure that no bit in
input comp_mask will ever trigger a call to a kernel function that can
make the uverb fail, the latter check on known value could be dropped,
allowing the userspace to set the highest mask (-1): userspace
will use -ENOSPC to probe the expected size of the response buffer
to match the highest supported comp_mask. But it's going to hurt
userspace application not ready to receive -ENOSPC on newer kernel
with extended QUERY_DEVICE ABI ... Oops.

So in this end, the safest way to ensure userspace is doing the correct
thing so that we have backward and forward compatibility is to check
for known values in comp_mask, check for response buffer size and ensure
that data structure chunk are stacked.

The tighter are the checks now, the easier the interface could be
extended latter.

> >  	if (cmd.reserved)
> >  		return -EINVAL;
> >  
> > 
> 

Regards.

-- 
Yann Droneaud
OPTEYA


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-01-26 11:17 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-22 13:28 [PATCH 0/4] IB/core: extended query device caps cleanup for v3.19 Yann Droneaud
     [not found] ` <cover.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-22 13:28   ` [PATCH 1/4] IB/uverbs: ex_query_device: fail if output buffer size does not match Yann Droneaud
     [not found]     ` <d60715123c640bc7b720ad11a9faa3af78950aa6.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-25 15:29       ` Haggai Eran
     [not found]         ` <54C50BBD.5000009-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-01-26 10:05           ` Yann Droneaud
2015-01-22 13:28   ` [PATCH 2/4] IB/core: ib_copy_to_udata(): don't silently truncate response Yann Droneaud
2015-01-22 13:28   ` [PATCH 3/4] IB/uverbs: ex_query_device: check request's comp_mask Yann Droneaud
     [not found]     ` <063956366559d6919693aabb324bab83d676bc28.1421931555.git.ydroneaud-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-25 15:23       ` Haggai Eran
     [not found]         ` <54C50A67.6040306-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-01-26 11:17           ` Yann Droneaud [this message]
     [not found]             ` <1422271029.3133.68.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-27  6:50               ` Haggai Eran
     [not found]                 ` <54C73549.5000003-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-01-28 13:19                   ` Yann Droneaud
     [not found]                     ` <1422451151.3133.130.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-28 15:40                       ` Haggai Eran
     [not found]                         ` <54C902E4.5010600-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-01-29 18:00                           ` Yann Droneaud
2015-01-29 18:09                       ` Jason Gunthorpe
     [not found]                         ` <20150129180956.GG11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 18:35                           ` Yann Droneaud
     [not found]                             ` <1422556514.3133.165.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 19:14                               ` Jason Gunthorpe
     [not found]                                 ` <20150129191423.GL11842-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-01-29 21:14                                   ` Yann Droneaud
     [not found]                                     ` <1422566069.3133.218.camel-RlY5vtjFyJ3QT0dZR+AlfA@public.gmane.org>
2015-01-29 21:34                                       ` Jason Gunthorpe
2015-01-22 13:28   ` [PATCH 4/4] IB/uverbs: ex_query_device: no need to clear the whole structure Yann Droneaud

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1422271029.3133.68.camel@opteya.com \
    --to=ydroneaud-rly5vtjfyj3qt0dzr+alfa@public.gmane.org \
    --cc=eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.