* [PATCH V4 libibverbs 0/7] Completion timestamping
@ 2016-05-29 14:51 Yishai Hadas
From: Yishai Hadas @ 2016-05-29 14:51 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

Hi Doug,

This V4 series addresses a few notes that we got on V3;
details are below.

The kernel part was already accepted.

The series was retested successfully with the mlx5 driver (lib, kernel)
and is also available from my openfabrics Git tree at:
git://openfabrics.org/~yishaih/libibverbs.git
branch: ts_v4.

Thanks,
Yishai

To support completion timestamping, we add an extensible poll CQ mechanism.
There have been earlier attempts to extend poll CQ. One of them split the WC
into mandatory and optional fields. The user declared which optional fields
each CQ should report, and the WC was constructed dynamically to represent
all requested fields. We received comments about the complexity of that
approach and its API. Furthermore, it degraded performance in some flows.

The current approach is based on Jason's proposal. Instead of using a WC
struct, we report completion fields by request. A new ibv_cq_ex is added.
This new extended CQ contains accessor functions to the completion fields.
Each vendor assigns these function pointers in order to provide the
completion data efficiently. To create a suitable CQ while maintaining
backward and forward compatibility, the user declares which completion
attributes they need when creating the CQ. A successful creation of the CQ
guarantees that all requested attributes can be queried using the accessor
function pointers.
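
As an illustration of the accessor idea only, here is a minimal,
self-contained sketch. All sketch_* names, the flag values and the struct
layout are hypothetical and are not the ibv_cq_ex definitions from this
series; the point is just that creation rejects unknown flags and wires up
only the accessors that were requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical completion-attribute flags; not the real libibverbs values. */
enum sketch_wc_flags {
	SKETCH_WC_BYTE_LEN      = 1u << 0,
	SKETCH_WC_COMPLETION_TS = 1u << 1,
	SKETCH_WC_SUPPORTED     = SKETCH_WC_BYTE_LEN | SKETCH_WC_COMPLETION_TS,
};

struct sketch_cq {
	/* Frequently used fields are stored directly on the CQ. */
	uint64_t wr_id;
	int status;
	/* Accessors assigned at creation time; NULL when not requested. */
	uint32_t (*read_byte_len)(struct sketch_cq *cq);
	uint64_t (*read_completion_ts)(struct sketch_cq *cq);
	/* Stand-in for provider-private state of the current completion. */
	uint32_t hw_byte_len;
	uint64_t hw_ts;
};

static uint32_t hw_read_byte_len(struct sketch_cq *cq)
{
	return cq->hw_byte_len;
}

static uint64_t hw_read_ts(struct sketch_cq *cq)
{
	return cq->hw_ts;
}

/*
 * Create a CQ for the requested attributes. Returns NULL if any requested
 * flag is unknown to this sketch or if allocation fails, so a non-NULL
 * result means every requested accessor is usable.
 */
struct sketch_cq *sketch_create_cq(uint32_t wc_flags)
{
	struct sketch_cq *cq;

	if (wc_flags & ~(uint32_t)SKETCH_WC_SUPPORTED)
		return NULL;

	cq = calloc(1, sizeof(*cq));
	if (!cq)
		return NULL;

	if (wc_flags & SKETCH_WC_BYTE_LEN)
		cq->read_byte_len = hw_read_byte_len;
	if (wc_flags & SKETCH_WC_COMPLETION_TS)
		cq->read_completion_ts = hw_read_ts;
	return cq;
}
```

A caller that asked only for SKETCH_WC_COMPLETION_TS would get a CQ whose
read_completion_ts pointer is set and whose read_byte_len pointer is NULL.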

This approach avoids copying the WC fields at the cost of indirect function
calls. However, as most applications don't use most completion fields anyway,
the trade-off makes sense.

Benchmarks we ran in our test lab found that the new approach generally
performs on par with the current API and is *not* worse. As the new API also
allows the set of polled fields to be extended, we consider it a better API
overall than the legacy one.

The user creates a CQ using ibv_create_cq_ex, stating which completion
attributes may be queried later on from this CQ.
In order to decrease per-completion polling overhead, such as updating
indices in the hardware, we split the polling into batches.

A batch is started by calling ibv_start_poll_ex. If a completion is
successfully fetched, the user can query its attributes using the accessor
functions ibv_wc_read_xxx. To fetch the next completion in the batch, the
user calls ibv_next_poll_ex. The same ibv_wc_read_xxx functions are used to
query these completions as well. To end a batch, the user calls
ibv_end_poll_ex. Of course, starting a new batch incurs some overhead.

Each batch can poll zero or more completions.
Polling of each completion starts with either ibv_start_poll_ex or
ibv_next_poll_ex and ends with the following ibv_next_poll_ex or
ibv_end_poll_ex. Completion attributes can only be queried between these
calls. These attributes represent the values of the completion fetched by
the last ibv_start_poll_ex/ibv_next_poll_ex.
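
The start/next/end flow can be sketched with a mock CQ as follows. This is
not the libibverbs implementation: the sketch_* names, the ENOENT return
convention and the "doorbells" counter (standing in for the hardware index
update) are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion record and mock CQ. */
struct sketch_wc {
	uint64_t wr_id;
	uint64_t ts;
};

struct sketch_cq {
	const struct sketch_wc *ring; /* pending completions */
	size_t count;                 /* number of entries in ring */
	size_t cur;                   /* completion currently readable */
	size_t published;             /* consumer index "seen by hardware" */
	unsigned int doorbells;       /* how many times the index was updated */
};

/* Begin a batch; returns ENOENT if nothing is pending (no batch opened). */
int sketch_start_poll(struct sketch_cq *cq)
{
	if (cq->published >= cq->count)
		return ENOENT;
	cq->cur = cq->published;
	return 0;
}

/* Advance to the next completion in the batch, or ENOENT if none is left. */
int sketch_next_poll(struct sketch_cq *cq)
{
	if (cq->cur + 1 >= cq->count)
		return ENOENT;
	cq->cur++;
	return 0;
}

/* End the batch: the consumer index is published once for all entries. */
void sketch_end_poll(struct sketch_cq *cq)
{
	cq->published = cq->cur + 1;
	cq->doorbells++;
}

/* Accessor, valid only between start/next and the following next/end. */
uint64_t sketch_read_completion_ts(struct sketch_cq *cq)
{
	return cq->ring[cq->cur].ts;
}

/* Drain all pending completions in one batch; returns how many were read. */
size_t sketch_drain(struct sketch_cq *cq, uint64_t *last_ts)
{
	size_t n = 0;

	if (sketch_start_poll(cq))
		return 0;
	do {
		*last_ts = sketch_read_completion_ts(cq);
		n++;
	} while (sketch_next_poll(cq) == 0);
	sketch_end_poll(cq);
	return n;
}
```

In this sketch, draining three pending completions touches the mock index
once rather than three times, which is the overhead the batching is meant
to amortize.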

The batching API is thread-safe (assuming the CQ wasn't created with the
SINGLE_THREADED attribute) and represents a series of completions the user
would like to poll one after another.  The vendor user space driver should
guarantee this.

Completion timestamp is added on top of the extended ibv_create_cq_ex verb
by using the wc_flags field of ibv_cq_init_attr_ex. The user can query the
CQ's completion timestamp using ibv_wc_read_completion_ts. The timestamp mask
(number of supported bits) and the HCA's frequency are reported by the
ibv_query_device_ex verb.
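
As a rough sketch of how an application might combine these values, the
helpers below compute the elapsed time between two raw timestamps. The
function names are hypothetical, and two things are assumed rather than
stated in this cover letter: that the mask is a contiguous run of low bits,
and that the reported clock frequency is in kHz.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks elapsed between two raw timestamps. Masking the difference
 * tolerates a single wrap of the counter, provided the mask has the form
 * 2^n - 1 (assumption).
 */
uint64_t sketch_ts_delta(uint64_t start, uint64_t end, uint64_t mask)
{
	return (end - start) & mask;
}

/*
 * Convert ticks to nanoseconds for a clock given in kHz (assumed unit).
 * A kHz clock ticks clock_khz times per millisecond, so the whole
 * milliseconds and the remainder are converted separately to avoid
 * overflowing the intermediate product. Returns 0 for a zero clock.
 */
uint64_t sketch_ticks_to_ns(uint64_t ticks, uint64_t clock_khz)
{
	uint64_t whole_ms, rem;

	if (clock_khz == 0)
		return 0;
	whole_ms = ticks / clock_khz;
	rem = ticks % clock_khz;
	return whole_ms * 1000000u + rem * 1000000u / clock_khz;
}
```

For example, with a 16-bit mask, timestamps 0xFFF0 and 0x0010 are 32 ticks
apart even though the counter wrapped in between.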

We also give the user the ability to read the HCA's current clock.
This is done via ibv_query_rt_values_ex. This verb could be extended
in the future to report other useful information.

Changes from V3:
Addressed Jason's notes as follows:
- Reorder ibv_cq_ex fields to be memory aligned.
- Add a check in __lib_ibv_create_cq_ex that libibverbs supports all
  incoming WC flags; this should prevent future compatibility issues.
- Change ibv_wc_read_slid to return uint32_t, per another vendor's request.
- Change ibv_cq_init_attr_ex to use uint32_t instead of int for some fields.

Changes from V2:
Addressed Jason's notes as follows:
- Remove the '_ex' notation where there was no legacy counterpart.
- Use 'wr_id' and 'status' fields directly on ibv_cq_ex to improve
  performance. We ran some benchmarks and verified that this change is
  really useful.

Changes from V1:
- Moved to indirect function calls in order to poll a CQ.

Changes from V0:
- Split the series into small logical patches.
- Align naming in some places to match other verbs.
- Fix and improve the man pages.
- Add example code as part of rc_pingpong.

Matan Barak (6):
  Add support for extended creating CQ verb
  Add member functions to poll an extended CQ
  Add timestamp_mask and hca_core_clock to ibv_query_device_ex
  Add completion timestamp to poll_cq
  Create a single threaded CQ
  Add a verb that queries real time values from the HCA

Yishai Hadas (1):
  Add timestamp support in rc_pingpong

 Makefile.am                   |   3 +-
 examples/devinfo.c            |  10 ++
 examples/rc_pingpong.c        | 278 ++++++++++++++++++++++++++++++++----------
 include/infiniband/driver.h   |   9 ++
 include/infiniband/kern-abi.h |  26 ++++
 include/infiniband/verbs.h    | 243 ++++++++++++++++++++++++++++++++++++
 man/ibv_create_cq_ex.3        | 150 +++++++++++++++++++++++
 man/ibv_query_device_ex.3     |   6 +-
 man/ibv_query_rt_values_ex.3  |  50 ++++++++
 src/cmd.c                     |  69 +++++++++++
 src/device.c                  |  49 ++++++++
 src/ibverbs.h                 |   5 +
 src/libibverbs.map            |   1 +
 13 files changed, 833 insertions(+), 66 deletions(-)
 create mode 100644 man/ibv_create_cq_ex.3
 create mode 100644 man/ibv_query_rt_values_ex.3

-- 
1.8.3.1

