From: Yuval Shaia <yuval.shaia@oracle.com>
To: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: dmitry.fleytman@gmail.com, jasowang@redhat.com,
	eblake@redhat.com, armbru@redhat.com, pbonzini@redhat.com,
	qemu-devel@nongnu.org, shamir.rabinovitch@oracle.com,
	cohuck@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 23/23] docs: Update pvrdma device documentation
Date: Sun, 18 Nov 2018 09:27:51 +0200
Message-ID: <20181118072750.GA3638@lap1>
In-Reply-To: <f5f448d1-2945-9ea7-1693-26c570981a4e@gmail.com>

On Sat, Nov 17, 2018 at 02:34:18PM +0200, Marcel Apfelbaum wrote:
> 
> 
> On 11/13/18 9:13 AM, Yuval Shaia wrote:
> > Interface with the device is changed with the addition of support for
> > MAD packets.
> > Adjust documentation accordingly.
> > 
> > While there fix a minor mistake which may lead to think that there is a
> > relation between using RXE on host and the compatibility with bare-metal
> > peers.
> > 
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > ---
> >   docs/pvrdma.txt | 103 +++++++++++++++++++++++++++++++++++++++---------
> >   1 file changed, 84 insertions(+), 19 deletions(-)
> > 
> > diff --git a/docs/pvrdma.txt b/docs/pvrdma.txt
> > index 5599318159..9e8d1674b7 100644
> > --- a/docs/pvrdma.txt
> > +++ b/docs/pvrdma.txt
> > @@ -9,8 +9,9 @@ It works with its Linux Kernel driver AS IS, no need for any special guest
> >   modifications.
> >   While it complies with the VMware device, it can also communicate with bare
> > -metal RDMA-enabled machines and does not require an RDMA HCA in the host, it
> > -can work with Soft-RoCE (rxe).
> > +metal RDMA-enabled machines as peers.
> > +
> > +It does not require an RDMA HCA in the host, it can work with Soft-RoCE (rxe).
> >   It does not require the whole guest RAM to be pinned allowing memory
> >   over-commit and, even if not implemented yet, migration support will be
> > @@ -78,29 +79,93 @@ the required RDMA libraries.
> >   3. Usage
> >   ========
> > +
> > +
> > +3.1 VM Memory settings
> > +======================
> >   Currently the device is working only with memory backed RAM
> >   and it must be mark as "shared":
> >      -m 1G \
> >      -object memory-backend-ram,id=mb1,size=1G,share \
> >      -numa node,memdev=mb1 \
> > -The pvrdma device is composed of two functions:
> > - - Function 0 is a vmxnet Ethernet Device which is redundant in Guest
> > -   but is required to pass the ibdevice GID using its MAC.
> > -   Examples:
> > -     For an rxe backend using eth0 interface it will use its mac:
> > -       -device vmxnet3,addr=<slot>.0,multifunction=on,mac=<eth0 MAC>
> > -     For an SRIOV VF, we take the Ethernet Interface exposed by it:
> > -       -device vmxnet3,multifunction=on,mac=<RoCE eth MAC>
> > - - Function 1 is the actual device:
> > -       -device pvrdma,addr=<slot>.1,backend-dev=<ibdevice>,backend-gid-idx=<gid>,backend-port=<port>
> > -   where the ibdevice can be rxe or RDMA VF (e.g. mlx5_4)
> > - Note: Pay special attention that the GID at backend-gid-idx matches vmxnet's MAC.
> > - The rules of conversion are part of the RoCE spec, but since manual conversion
> > - is not required, spotting problems is not hard:
> > -    Example: GID: fe80:0000:0000:0000:7efe:90ff:fecb:743a
> > -             MAC: 7c:fe:90:cb:74:3a
> > -    Note the difference between the first byte of the MAC and the GID.
> > +
> > +3.2 MAD Multiplexer
> > +===================
> > +MAD Multiplexer is a service that exposes MAD-like interface for VMs in
> > +order to overcome the limitation where only single entity can register with
> > +MAD layer to send and receive RDMA-CM MAD packets.
> > +
> > +To build rdmacm-mux run
> > +# make rdmacm-mux
> > +
> > +The program accepts 3 command line arguments and exposes a UNIX socket to
> > +be used to relay control and data messages to and from the service.
> > +-s unix-socket-path   Path to unix socket to listen on
> > +                      (default /var/run/rdmacm-mux)
> > +-d rdma-device-name   Name of RDMA device to register with
> > +                      (default rxe0)
> > +-p rdma-device-port   Port number of RDMA device to register with
> > +                      (default 1)
> > +The final UNIX socket file name is a concatenation of the 3 arguments so
> > +for example for device name mlx5_0 and port 2 the file
> > +/var/run/rdmacm-mux-mlx5_0-2 will be created.
> > +
> > +Please refer to contrib/rdmacm-mux for more details.
> > +
> > +
> > +3.3 PCI devices settings
> > +========================
> > +RoCE device exposes two functions - Ethernet and RDMA.
> > +To support it, pvrdma device is composed of two PCI functions, an Ethernet
> > +device of type vmxnet3 on PCI slot 0 and a pvrdma device on PCI slot 1. The
> > +Ethernet function can be used for other Ethernet purposes such as IP.
> > +
> > +
> > +3.4 Device parameters
> > +=====================
> > +- netdev: Specifies the Ethernet device on host. For Soft-RoCE (rxe) this
> > +  would be the Ethernet device used to create it. For any other physical
> > +  RoCE device this would be the netdev name of the device.
> 
> I didn't understand, can you please elaborate? We need the ibdev,
> this is clear, but what is the "ethernet device on host", how do
> we get it and how it is used?

The netdev is used to maintain the port's GID table.

A GID entry is added by assigning a new IPv6 address to the corresponding
Ethernet function. The reverse also holds: removing an IPv6 address from
the Ethernet function deletes the corresponding GID from the GID table.

I wish there were a way to extract the netdev from a given ibdev (by means
of an API), but since there isn't, we must take it as a parameter.
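
To illustrate the relation between addresses on the Ethernet function and
GID entries, here is a minimal Python sketch. It assumes that a GID carries
the same 128 bits as the IPv6 address it was created from, and that the
link-local address is derived from the MAC with the usual modified EUI-64
rule. The function names are made up for the example and are not part of
QEMU or of any RDMA library.

```python
import ipaddress


def ipv6_to_gid(addr: str) -> str:
    """Return the GID text form for an IPv6 address (assumed same 128 bits)."""
    return ipaddress.IPv6Address(addr).exploded


def mac_to_link_local_gid(mac: str) -> str:
    """Derive the link-local GID from a MAC using modified EUI-64."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 6-octet MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    raw = bytes([0xFE, 0x80] + [0] * 6 + interface_id)  # fe80::/64 prefix
    return ipaddress.IPv6Address(raw).exploded


if __name__ == "__main__":
    # MAC taken from the example in the old version of docs/pvrdma.txt.
    print(mac_to_link_local_gid("7c:fe:90:cb:74:3a"))
```

With the MAC from the old documentation this gives back the GID shown
there, which is why the first byte of the MAC and of the GID differ.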

> 
> Thanks,
> Marcel
> 
> > +- ibdev: The IB device name on host for example rxe0, mlx5_0 etc.
> > +- mad-chardev: The name of the MAD multiplexer char device.
> > +- ibport: In case of multi-port device (such as Mellanox's HCA) this
> > +  specify the port to use. If not set 1 will be used.
> > +- dev-caps-max-mr-size: The maximum size of MR.
> > +- dev-caps-max-qp: Maximum number of QPs.
> > +- dev-caps-max-sge: Maximum number of SGE elements in WR.
> > +- dev-caps-max-cq: Maximum number of CQs.
> > +- dev-caps-max-mr: Maximum number of MRs.
> > +- dev-caps-max-pd: Maximum number of PDs.
> > +- dev-caps-max-ah: Maximum number of AHs.
> > +
> > +Notes:
> > +- The first 3 parameters are mandatory settings, the rest have their
> > +  defaults.
> > +- The last 8 parameters (the ones that prefixed by dev-caps) defines the top
> > +  limits but the final values are adjusted by the backend device limitations.
> > +
> > +3.5 Example
> > +===========
> > +Define bridge device with vmxnet3 network backend:
> > +<interface type='bridge'>
> > +  <mac address='56:b4:44:e9:62:dc'/>
> > +  <source bridge='bridge1'/>
> > +  <model type='vmxnet3'/>
> > +  <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0' multifunction='on'/>
> > +</interface>
> > +
> > +Define pvrdma device:
> > +<qemu:commandline>
> > +  <qemu:arg value='-object'/>
> > +  <qemu:arg value='memory-backend-ram,id=mb1,size=1G,share'/>
> > +  <qemu:arg value='-numa'/>
> > +  <qemu:arg value='node,memdev=mb1'/>
> > +  <qemu:arg value='-chardev'/>
> > +  <qemu:arg value='socket,path=/var/run/rdmacm-mux-rxe0-1,id=mads'/>
> > +  <qemu:arg value='-device'/>
> > +  <qemu:arg value='pvrdma,addr=10.1,ibdev=rxe0,netdev=bridge0,mad-chardev=mads'/>
> > +</qemu:commandline>
> 
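
As a side note on the rdmacm-mux socket naming described in the quoted
section 3.2, the following Python sketch shows how the socket path and the
matching -chardev argument could be put together. The "-" separator is
inferred from the single example in the patch (/var/run/rdmacm-mux-mlx5_0-2),
and the helper names are hypothetical, so please treat this as an
illustration and refer to contrib/rdmacm-mux for the actual behavior.

```python
def mux_socket_path(base: str = "/var/run/rdmacm-mux",
                    device: str = "rxe0",
                    port: int = 1) -> str:
    """Join the three rdmacm-mux arguments into the socket file name.

    Defaults mirror the ones listed in the patch; the "-" separator is
    an assumption based on the example given there.
    """
    return f"{base}-{device}-{port}"


def chardev_arg(socket_path: str, chardev_id: str = "mads") -> str:
    """Build the value passed to QEMU's -chardev option for the MAD socket."""
    return f"socket,path={socket_path},id={chardev_id}"


if __name__ == "__main__":
    path = mux_socket_path(device="mlx5_0", port=2)
    print(path)
    print(chardev_arg(path))
```

With the defaults this produces the same path as the one used in the
-chardev line of the example in section 3.5.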
