From: Dennis Dalessandro <>
To: Jason Gunthorpe <>
Cc: Linux RDMA <>,
	Doug Ledford <>,, "Marciniszyn,
	Mike" <>
Subject: Re: [RFC] bulk zero copy transport
Date: Fri, 20 Aug 2021 08:55:22 -0400
Message-ID: <>
In-Reply-To: <>

On 8/19/21 7:01 PM, Jason Gunthorpe wrote:
> On Thu, Aug 19, 2021 at 03:09:02PM -0400, Dennis Dalessandro wrote:
>> Just wanted to float an idea we are thinking about. It builds on the basic idea
>> of what Intel submitted as their RV module [1]. This however does things a bit
>> differently and is really all about bulk zero-copy using the kernel. It is a new
>> ULP.
>> The major differences are that there will be no new cdev needed. We will make
>> use of the existing HFI1 cdev where an FD is needed. We also propose to make use
>> of IO-Uring (hence needing FD) to get requests into the kernel. The idea will be
>> to not share Uverbs objects with the kernel. The kernel will maintain
>> ownership of the qp, pd, mr, cq, etc.
> I feel a lot of reluctance to see the API surface of the HFI1 cdev
> expanded, especially to encompass an entire ULP

I share the same reluctance as far as exposing it to anything beyond HFI1. The
idea would be for the ULP here not to need to know what the user is actually
talking to. For now it's the hfi1 cdev, but it could be something else.

What I'm really thinking is this ULP would come up and register with rdmavt.
Rdmavt would call back when a HW device registers, the rings would get set up,
and the ULP would use io_uring to get requests from, and responses back to,
the user.

> As you know I think that cdev is very much the wrong way to design
> driver interfaces, and since all the work is now completed to do it
> through verbs I'm not keen on any expansion.

I agree. What this allows us to do is deprecate the writev() interface that we
have. Instead of writing the descriptors in, we will use the io_uring mechanism.
Once we have this working it should be pretty straightforward to move the rest
of the cdev functionality to verbs IOCTLs or whatever we call that interface. So
this is sort of a stepping stone vs ripping the band-aid off.

> But I'm confused how you are calling something a ULP but then talking
> about the HFI (or uverbs even) cdev? That isn't a ULP.

Just referring to HFI because that's obviously what we'll make this work with.
However, in theory it could be any underlying verbs provider.

> A ULP is something like RDS that spawns its own cdevs and interworks
> with the common RDMA stack.

Agree. I'm saying we treat rdmavt as part of the common RDMA stack. Yes I know
in reality it's HFI specific, but the intention was to be more generic.

> I suppose I don't get what you are trying to sketch. Maybe you could
> share the uAPI you envision in more detail?

It's all still very high level. We just want to start the conversation early so
we can make sure we march in the right direction from the start. I'll talk to
Mike and we'll come up with a more detailed view for the uAPI as a next step.

