From: Yishai Hadas <yishaih-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
To: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Cc: linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
"talal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org"
<talal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: Upcoming libibverbs release
Date: Sun, 26 Jun 2016 19:54:21 +0300 [thread overview]
Message-ID: <4ec1d8e6-a908-bb49-a137-415856ec6faa@dev.mellanox.co.il> (raw)
In-Reply-To: <3b89c411-72be-ddc5-5ebf-009eeee29692-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On 6/23/2016 7:51 PM, Doug Ledford wrote:
> I'm preparing to make an official release of libibverbs soonish
> (thinking by the end of next week). Even though I took the timestamp v5
> patchset, I know there were still comments from Jason on the v4 set that
> weren't addressed in v5. Those can be addressed (and should be) with
> incremental patches prior to the official release. Yishai, can we
> please prioritize getting that done?
>
Listed below the main openings that were raised by Jason and our input
on, considering their need and priority.
Most of already answered as part of the discussion, will summarize it
all together to help evaluating the status/next steps.
> I'm still very much of the opinion that extended CQs should only allow
> compatible QP's to be added.
--------------------------------------
Application may create a CQ with large subset of flags that may fit some
of its attached QP types. Semantically, it is similar to what we have
today, it allows to put different QP types on the same CQ and by thus
allow application to have ordered completion events.
We don't see a good reason to change from current behavior and hurt
applications that expect that valid behavior.
> Add a compatibility layer
-------------------------------
A real application wants to achieve performance, to use a compatibility
layer over a vendor that doesn’t support the new API, might be a very
bad idea as of below.
1) The legacy API doesn't support the batching API that was introduced
by the new API, this will lead to have a Door-Bell/PCI write per CQE as
internally the compatibility layer will need to call ibv_poll_cq with
size of 1.
2) There will 2 copies, one from the HW CQE to ibv_wc as part of the
internal legacy API, second, from the ibv_wc to the read_xxx output,
comparing the new API.
3) The *non-required* fields will be copied anyway as part of the
internal legacy ibv_poll_cq from the HW CQE to ibv_wc and hits the
performance.
In addition,
There is *no* one to one mapping in all the cases between the new API to
the legacy one, it will lead to a *non* transparent usage of the
compatibility layer and ends by returning an error to the application in
few cases.
Specifically,
One of major benefits in the new API is the ability to say as part of CQ
creation that it's going to be used by only one thread and have lock
less flow as part of polling the CQ. This is not supported in the legacy
API which assumes that multi-threaded access might be used.
It makes sense to return an error in the compatibility layer API letting
application knows that this service is not supported when it was asked
by the application.
Applications that look for performance expect to work in that single
threaded mode and might fail as part of using the compatibility layer or
alternatively have no idea that they didn't get the service which
assumes to be bad behavior.
In addition, at the moment timestamping can’t be read via the legacy API
means that the compatibility layer can’t handle it and should supply an
error for. In the future there may be other fields/flows that can’t be
supported by the compatibility layer so application that will be written
over this layer will fail and won’t get the required benefit from using it.
To sum it up:
We don't see a real usage of such a layer, it has many limitations.
In case will be a real request from users, it can always be added in the
future, minor priority for now.
> Exact set of access flags
------------------------------
Application can ask for subset of the available bits, in addition, it
has an option to ask for all legacy ones (i.e. IBV_WC_STANDARD_FLAGS)
for a case that a CQ needs to support all legacy attribute and could be
attached to both RC and UD QPs.
We didn't find it useful at the moment to add other enums per QP type as
application can say explicitly what it wants by using the ORed bits and
even per QP type application may not be always interested in all the
fields. (e.g. CQ that used for TX only completions in RC is not
interested in few fields that are relevant for the RX).
To sum it up:
Low priority issue that can be added in the future based on users demand.
> Exact set of accessors
--------------------------------
At the moment the API introduces an option to get each field by a
separate accessor (i.e ibv_read_wc_xxx), in addition, wr_id and status
have direct access as application expect to use them in most of their flows.
During the discussion come up some ideas to add some extra accessors per
QP type (e.g. ibv_wc_read_ud, ibv_wc_read_rc, etc.).
This option makes sense when application really plans to read and use
all of the potential UD/RC fields, which *can't* be assumed as true in
the general case.
Even in a UD/RC QP the application may be interested in few of the CQE
fields while others are not needed.
From bench mark testing, we found that gathering few fields in one
ibv_wc_read_xxx has low benefit comparing the case that each field is
read by itself.
In a case that *not* all fields are really needed the extra copy to
return the redundant fields found to be more expensive comparing the
ibv_wc_read_xxx when needed.
We can think of an application that may be interested in subset of
fields per QP type or per send/receive flows to be gathered in one call,
this ability can always be added in the future as the API cleanly
enables extensions.
If you still think that adding some extra ibv_wc_read_xxx groups from
day one is really needed, let's define it and add now by an incremental
patch. Till now didn't see any user specific request for.
> Minor coding issues and micro optimizations
------------------------------------------------
Not sure what exactly that point refers to from API point of view,
assuming that it relates to few comments given on libmlx5 as of below.
> +static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex *ibcq)
> +{
> + struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
> You might want to look at having the caller pass in 'cq' from a member
> in ibv_cq_ex. That way the caller can hold the cq in a register
> instead of constantly reloading it like this - I'm assuming to_mcq is
> not just an container_of??
The assumption is *incorrect*, the internal access from ibv_cq_ex to
driver CQ is done via offsetof/container_of which is done in compile
time, no need to have some change from the clean usage of the API to
that mode.
> +static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex
> *ibcq)
> You need to go through everything you've written and make sure it is
> const-correct.
> You should also annotate with restrict.
> Doing this properly can increase performance by allowing the caller to
> avoid reloads.
From performance point of view we don't expect a change here, we
followed the created assembly code and saw no difference even when it
was annotated with 'restrict'.
In addition, please note the 'restrict' notation was only became
standard as part of C99, in previous compilers it may not work at all or
requires different key word to be defined as part of the CONFIG step.
We can go ahead and define the ibv_wc_read_xxx with const qualifier on
its input struct ibv_cq_ex *ibcq in some extra incremental patch if
makes sense.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-06-26 16:54 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-23 16:51 Upcoming libibverbs release Doug Ledford
[not found] ` <3b89c411-72be-ddc5-5ebf-009eeee29692-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-06-24 11:50 ` Yishai Hadas
2016-06-26 16:54 ` Yishai Hadas [this message]
[not found] ` <4ec1d8e6-a908-bb49-a137-415856ec6faa-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-06-27 18:17 ` Jason Gunthorpe
[not found] ` <20160627181758.GD23540-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-28 5:02 ` Leon Romanovsky
[not found] ` <20160628050246.GB3584-2ukJVAZIZ/Y@public.gmane.org>
2016-06-28 15:52 ` Knut Omang
[not found] ` <1467129133.8638.75.camel-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-06-28 16:22 ` Leon Romanovsky
2016-06-28 16:20 ` Jason Gunthorpe
[not found] ` <20160628162028.GA27518-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-28 17:05 ` Leon Romanovsky
[not found] ` <20160628170549.GE3584-2ukJVAZIZ/Y@public.gmane.org>
2016-06-28 19:12 ` Steve Wise
2016-06-28 21:14 ` Jason Gunthorpe
[not found] ` <20160628211441.GA5786-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-28 21:26 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB0663DA-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-06-29 5:19 ` Leon Romanovsky
[not found] ` <20160629051956.GG3584-2ukJVAZIZ/Y@public.gmane.org>
2016-06-29 18:30 ` Jason Gunthorpe
[not found] ` <20160629183042.GC17031-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-30 18:17 ` Liran Liss
[not found] ` <HE1PR05MB14189E07EE8EE458E0CCD4DAB1240-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2016-06-30 19:07 ` Jason Gunthorpe
2016-07-19 19:57 ` Doug Ledford
[not found] ` <babed655-b61f-e97b-2351-b1ea6692b18d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-07-19 20:09 ` Jason Gunthorpe
2016-06-28 21:18 ` Jason Gunthorpe
[not found] ` <20160628211858.GB5786-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-29 5:15 ` Leon Romanovsky
[not found] ` <20160629051507.GF3584-2ukJVAZIZ/Y@public.gmane.org>
2016-06-29 17:07 ` Knut Omang
[not found] ` <1467220072.8638.166.camel-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-06-29 18:59 ` Yishai Hadas
2016-06-29 12:09 ` Christoph Hellwig
[not found] ` <20160629120920.GA24151-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-06-29 18:34 ` Jason Gunthorpe
[not found] ` <20160629183414.GD17031-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-29 18:46 ` Steve Wise
2016-06-29 18:57 ` Jason Gunthorpe
[not found] ` <20160629185757.GA17839-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-29 19:15 ` Steve Wise
2016-06-29 19:21 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB06699A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-06-29 19:25 ` Steve Wise
2016-06-29 20:34 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB0669F5-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-06-29 20:44 ` Steve Wise
2016-06-29 20:54 ` Steve Wise
2016-06-29 21:40 ` Doug Ledford
[not found] ` <1d03eaca-142a-3912-badf-aa9b14f6b2f6-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-06-29 22:03 ` Steve Wise
2016-06-29 19:27 ` Jason Gunthorpe
[not found] ` <20160629192730.GA18394-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-29 20:40 ` Doug Ledford
2016-06-28 13:26 ` Yishai Hadas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ec1d8e6-a908-bb49-a137-415856ec6faa@dev.mellanox.co.il \
--to=yishaih-ldsdmyg8hgv8yrgs2mwiifqbs+8scbdb@public.gmane.org \
--cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=talal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.