linux-block.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Roman Pen <roman.penyaev@profitbricks.com>
Cc: linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Bart Van Assche <bart.vanassche@sandisk.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Doug Ledford <dledford@redhat.com>,
	Swapnil Ingle <swapnil.ingle@profitbricks.com>,
	Danil Kipnis <danil.kipnis@profitbricks.com>,
	Jack Wang <jinpu.wang@profitbricks.com>
Subject: Re: [PATCH v2 00/26] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD)
Date: Tue, 22 May 2018 10:45:06 -0600
Message-ID: <20180522164506.GC3311@ziepe.ca>
In-Reply-To: <20180518130413.16997-1-roman.penyaev@profitbricks.com>

On Fri, May 18, 2018 at 03:03:47PM +0200, Roman Pen wrote:
> Hi all,
> 
> This is v2 of the series, which introduces the IBNBD/IBTRS modules.
> 
> This cover letter is split into three parts:
> 
> 1. Introduction, which almost repeats everything from previous cover
>    letters.
> 2. Changelog.
> 3. Performance measurements on linux-4.17.0-rc2 and on two different
>    Mellanox cards: ConnectX-2 and ConnectX-3 and CPUs: Intel and AMD.
> 
> 
>  Introduction
> 
> IBTRS (InfiniBand Transport) is a reliable high speed transport library
> which allows for establishing connection between client and server
> machines via RDMA. It is optimized to transfer (read/write) IO blocks
> in the sense that it follows the BIO semantics of providing the
> possibility to either write data from a scatter-gather list to the
> remote side or to request ("read") data transfer from the remote side
> into a given set of buffers.
> 
> IBTRS is multipath capable and provides I/O fail-over and load-balancing
> functionality, i.e. in IBTRS terminology, an IBTRS path is a set of RDMA
> CMs and a particular path is selected according to the load-balancing policy.
> 
> IBNBD (InfiniBand Network Block Device) is a pair of kernel modules
> (client and server) that allow for remote access of a block device on
> the server over IBTRS protocol. After being mapped, the remote block
> devices can be accessed on the client side as local block devices.
> Internally IBNBD uses IBTRS as an RDMA transport library.
> 
> Why?
> 
>    - IBNBD/IBTRS was developed in order to map thin provisioned volumes,
>      thus the internal protocol is simple.
>    - IBTRS was developed as an independent RDMA transport library, which
>      supports fail-over and load-balancing policies using multipath, thus
>      it can be used for other IO needs, not only for block
>      devices.
>    - IBNBD/IBTRS is faster than NVME over RDMA.
>      Old comparison results:
>      https://www.spinics.net/lists/linux-rdma/msg48799.html
>      New comparison results: see performance measurements section below.
> 
> Key features of IBTRS transport library and IBNBD block device:
> 
> o High throughput and low latency due to:
>    - Only two RDMA messages per IO.
>    - IMM InfiniBand messages on responses to reduce round trip latency.
>    - Simplified memory management: memory allocation happens once on
>      server side when IBTRS session is established.
> 
> o IO fail-over and load-balancing by using multipath.  According to
>   our test loads, an additional path brings ~20% of bandwidth.
> 
> o Simple configuration of IBNBD:
>    - Server side is completely passive: volumes do not need to be
>      explicitly exported.
>    - Only IB port GID and device path needed on client side to map
>      a block device.
>    - A device is remapped automatically, e.g. after a storage reboot.
> 
> Commits for kernel can be found here:
>    https://github.com/profitbricks/ibnbd/commits/linux-4.17-rc2
> 
> The out-of-tree modules are here:
>    https://github.com/profitbricks/ibnbd/
> 
> Vault 2017 presentation:
>    http://events.linuxfoundation.org/sites/events/files/slides/IBNBD-Vault-2017.pdf

I think from the RDMA side, before we accept something like this, I'd
like to hear from Christoph, Chuck or Sagi that the dataplane
implementation of this is correct, e.g. it uses the MRs properly and
invalidates at the right time, sequences with dma_ops as required,
etc.

They have all done this work on their ULPs and it was tricky, I don't
want to see another ULP implement this wrong.

I'm skeptical here already due to the performance numbers - they are
not really what I'd expect and we may find that invalidate changes
will bring the performance down further.

Jason
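
For illustration only: the multipath behaviour described in the quoted
cover letter (a path selected according to a load-balancing policy, with
fail-over when a path goes down) can be sketched in userspace C as a
round-robin selector that skips disconnected paths. This is not code from
the series and not the IBTRS API; struct path, struct session and
next_path_rr() are hypothetical names, and round-robin is only assumed
here as one possible policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one path of a session; not an IBTRS structure. */
struct path {
	int id;
	bool connected;
};

/* Hypothetical session: a fixed set of paths plus a round-robin cursor. */
struct session {
	struct path *paths;
	size_t npaths;
	size_t cursor;	/* index of the path used for the previous IO */
};

/*
 * Pick the next connected path after the cursor, wrapping around.
 * Returns NULL if every path is down, in which case the caller would
 * have to fail or queue the IO.
 */
static struct path *next_path_rr(struct session *s)
{
	assert(s->npaths > 0);

	for (size_t step = 1; step <= s->npaths; step++) {
		size_t i = (s->cursor + step) % s->npaths;

		if (s->paths[i].connected) {
			s->cursor = i;
			return &s->paths[i];
		}
	}
	return NULL;
}
```

With all paths up, successive calls cycle through them (load-balancing);
when a path is marked disconnected it is skipped and IO continues on the
remaining ones (fail-over). A kernel implementation would additionally
need RCU or locking around the path list, which this sketch leaves out.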

Thread overview: 55+ messages
2018-05-18 13:03 [PATCH v2 00/26] InfiniBand Transport (IBTRS) and Network Block Device (IBNBD) Roman Pen
2018-05-18 13:03 ` [PATCH v2 01/26] rculist: introduce list_next_or_null_rr_rcu() Roman Pen
2018-05-18 16:56   ` Linus Torvalds
2018-05-19 20:25     ` Roman Penyaev
2018-05-19 21:04       ` Linus Torvalds
2018-05-19 16:37   ` Paul E. McKenney
2018-05-19 20:20     ` Roman Penyaev
2018-05-19 20:56       ` Linus Torvalds
2018-05-20  0:43       ` Paul E. McKenney
2018-05-21 13:50         ` Roman Penyaev
2018-05-21 15:16           ` Linus Torvalds
2018-05-21 15:33             ` Paul E. McKenney
2018-05-22  9:09               ` Roman Penyaev
2018-05-22 16:36                 ` Paul E. McKenney
2018-05-22 16:38                 ` Linus Torvalds
2018-05-22 17:04                   ` Paul E. McKenney
2018-05-21 15:31           ` Paul E. McKenney
2018-05-22  9:09             ` Roman Penyaev
2018-05-22 17:03               ` Paul E. McKenney
2018-05-18 13:03 ` [PATCH v2 02/26] sysfs: export sysfs_remove_file_self() Roman Pen
2018-05-18 15:08   ` Tejun Heo
2018-05-18 13:03 ` [PATCH v2 03/26] ibtrs: public interface header to establish RDMA connections Roman Pen
2018-05-18 13:03 ` [PATCH v2 04/26] ibtrs: private headers with IBTRS protocol structs and helpers Roman Pen
2018-05-18 13:03 ` [PATCH v2 05/26] ibtrs: core: lib functions shared between client and server modules Roman Pen
2018-05-18 13:03 ` [PATCH v2 06/26] ibtrs: client: private header with client structs and functions Roman Pen
2018-05-18 13:03 ` [PATCH v2 07/26] ibtrs: client: main functionality Roman Pen
2018-05-18 13:03 ` [PATCH v2 08/26] ibtrs: client: statistics functions Roman Pen
2018-05-18 13:03 ` [PATCH v2 09/26] ibtrs: client: sysfs interface functions Roman Pen
2018-05-18 13:03 ` [PATCH v2 10/26] ibtrs: server: private header with server structs and functions Roman Pen
2018-05-18 13:03 ` [PATCH v2 11/26] ibtrs: server: main functionality Roman Pen
2018-05-18 13:03 ` [PATCH v2 12/26] ibtrs: server: statistics functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 13/26] ibtrs: server: sysfs interface functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 14/26] ibtrs: include client and server modules into kernel compilation Roman Pen
2018-05-20 22:14   ` kbuild test robot
2018-05-21  6:36   ` kbuild test robot
2018-05-22  5:05   ` Leon Romanovsky
2018-05-22  9:27     ` Roman Penyaev
2018-05-22 13:18       ` Leon Romanovsky
2018-05-22 16:12         ` Roman Penyaev
2018-05-18 13:04 ` [PATCH v2 15/26] ibtrs: a bit of documentation Roman Pen
2018-05-18 13:04 ` [PATCH v2 16/26] ibnbd: private headers with IBNBD protocol structs and helpers Roman Pen
2018-05-18 13:04 ` [PATCH v2 17/26] ibnbd: client: private header with client structs and functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 18/26] ibnbd: client: main functionality Roman Pen
2018-05-18 13:04 ` [PATCH v2 19/26] ibnbd: client: sysfs interface functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 20/26] ibnbd: server: private header with server structs and functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 21/26] ibnbd: server: main functionality Roman Pen
2018-05-18 13:04 ` [PATCH v2 22/26] ibnbd: server: functionality for IO submission to file or block dev Roman Pen
2018-05-18 13:04 ` [PATCH v2 23/26] ibnbd: server: sysfs interface functions Roman Pen
2018-05-18 13:04 ` [PATCH v2 24/26] ibnbd: include client and server modules into kernel compilation Roman Pen
2018-05-20 17:21   ` kbuild test robot
2018-05-20 22:14   ` kbuild test robot
2018-05-21  5:33   ` kbuild test robot
2018-05-18 13:04 ` [PATCH v2 25/26] ibnbd: a bit of documentation Roman Pen
2018-05-18 13:04 ` [PATCH v2 26/26] MAINTAINERS: Add maintainer for IBNBD/IBTRS modules Roman Pen
2018-05-22 16:45 ` Jason Gunthorpe [this message]
