* Further thoughts on uAPI
@ 2016-04-20  1:25 Jason Gunthorpe
       [not found] ` <20160420012526.GA25508-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-20  1:25 UTC (permalink / raw)
  To: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Hefty, Sean, Weiny, Ira

Thinking more deeply about what Liran presented, we can go a bit
further and multiplex on the object type as well as the action.

Specifically, let's have generic ioctl of the form:

  create_object
  query_object
  modify_object
  delete_object

Generically taking the same common struct:

  struct {
     u32 length;
     u32 object_type; // Eg QP,PD,etc
     u32 user_handle; // in/out
     u32 reserved;
     u64 data[];
  };

Where data follows some kind of structured attribute format, like the
one Liran was exploring, à la netlink.
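
A rough sketch of how data[] might carry netlink-style attributes.
The struct uapi_attr layout, the 8-byte alignment and all helper names
are assumptions made up for illustration; nothing in this thread fixes
the attribute format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Common header from the proposal; stdint types stand in for u32/u64. */
struct uapi_hdr {
	uint32_t length;      /* total bytes: header plus attributes */
	uint32_t object_type; /* e.g. QP, PD, ... */
	uint32_t user_handle; /* in/out */
	uint32_t reserved;
	uint64_t data[];      /* attribute stream starts here */
};
static_assert(sizeof(struct uapi_hdr) == 16, "header is four u32 fields");

/* Hypothetical attribute header, loosely modelled on struct nlattr. */
struct uapi_attr {
	uint16_t len;  /* attribute header + payload, excluding padding */
	uint16_t type;
};

/* Keep every attribute u64-aligned inside data[]. */
#define UAPI_ALIGN(n) (((size_t)(n) + 7) & ~(size_t)7)

/* Allocate a zeroed message buffer of cap bytes with an empty stream. */
static struct uapi_hdr *uapi_msg_alloc(size_t cap)
{
	struct uapi_hdr *hdr;

	if (cap < sizeof(*hdr))
		return NULL;
	hdr = calloc(1, cap);
	if (hdr)
		hdr->length = sizeof(*hdr);
	return hdr;
}

/* Append one attribute; returns 0 on success, -1 if it does not fit. */
static int uapi_put_attr(struct uapi_hdr *hdr, size_t cap, uint16_t type,
			 const void *payload, uint16_t payload_len)
{
	size_t len = sizeof(struct uapi_attr) + payload_len;
	size_t off = hdr->length;
	struct uapi_attr attr;

	if (len > UINT16_MAX || off + UAPI_ALIGN(len) > cap)
		return -1;
	attr.len = (uint16_t)len;
	attr.type = type;
	memset((char *)hdr + off, 0, UAPI_ALIGN(len));
	memcpy((char *)hdr + off, &attr, sizeof(attr));
	if (payload_len)
		memcpy((char *)hdr + off + sizeof(attr), payload, payload_len);
	hdr->length = (uint32_t)(off + UAPI_ALIGN(len));
	return 0;
}

/* Return the payload of the first attribute of this type, or NULL. */
static const void *uapi_find_attr(const struct uapi_hdr *hdr, uint16_t type,
				  uint16_t *payload_len)
{
	size_t off = sizeof(*hdr);

	while (off + sizeof(struct uapi_attr) <= hdr->length) {
		struct uapi_attr attr;

		memcpy(&attr, (const char *)hdr + off, sizeof(attr));
		if (attr.len < sizeof(attr) || off + attr.len > hdr->length)
			return NULL; /* malformed stream */
		if (attr.type == type) {
			*payload_len = (uint16_t)(attr.len - sizeof(attr));
			return (const char *)hdr + off + sizeof(attr);
		}
		off += UAPI_ALIGN(attr.len);
	}
	return NULL;
}
```

The point of a self-describing stream like this is that a receiver can
skip attribute types it does not know, which is what would let the
same four ioctls serve objects with very different parameters.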

From this point we can obviously capture nearly all of the verbs
objects, but this also can become a clean way for us to allow drivers
to export driver-specific objects as well. (eg object_type & (1<<31)
== driver specific)
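
As a small illustration of the driver-specific bit, assuming bit 31 is
the marker as suggested above (the macro and function names are
hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned constant: a plain 1 << 31 would overflow a signed int. */
#define UAPI_OBJ_DRIVER_FLAG (UINT32_C(1) << 31)
static_assert(UAPI_OBJ_DRIVER_FLAG == 0x80000000u, "bit 31 marks driver objects");

/* True when the object type belongs to the driver's private namespace. */
static inline bool uapi_obj_is_driver_specific(uint32_t object_type)
{
	return (object_type & UAPI_OBJ_DRIVER_FLAG) != 0;
}

/* Index of the type inside its own (common or driver) namespace. */
static inline uint32_t uapi_obj_index(uint32_t object_type)
{
	return object_type & ~UAPI_OBJ_DRIVER_FLAG;
}
```

Note the explicit parentheses around the mask: in C, == binds tighter
than &, so the test has to be written as (object_type & flag) != 0.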

With one more entry point (call_driver_on_object) this would cover a
lot of cases. Maybe we have many such entry points for performance.

From there we'd probably have to define a few other common up calls.

Study of objects:
 'device','port' - Pre-existing and read-only, query_object returns
                   information like we have today
 'pd','mr','mw','cq','srq','qp','ah' - Basic objects, not all have a modify/query
 'flow','xrc',etc - extra objects
  
Additional entry points:
 poll_cq/req_notify_cq - These are high speed, so simple ioctls, or
                         driver specific
 post recv/send - drivers implement these via call_driver_on_object
 attach/detach mcast - Could be 'modify' a qp, or could be new ioctls
 async_event - Should be an ioctl
 get_fd - Convert the object to a fd (eg async_fd, comp_channel, etc)
 
I think this would substantially address the concern that the uapi is
'verbs' or 'qp' specific. Clearly there is space here for any number
of object families related to RDMA devices.

It is also discoverable since we can have a query_object that returns
all 'object_types' the device driver supports.

Sean, is this more agreeable to you than the unstructured fd idea?

The kernel common code side is pretty straightforward, just a bunch of
tables of function pointers, templates and idrs for each object_type.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found] ` <20160420012526.GA25508-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-20  4:54   ` Hefty, Sean
  2016-04-21 12:32   ` Hefty, Sean
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-20  4:54 UTC (permalink / raw)
  To: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Weiny, Ira

> Specifically, let's have generic ioctl of the form:
> 
>   create_object
>   query_object
>   modify_object
>   delete_object
> 
> Generically taking the same common struct:
> 
>   struct {
>      u32 length;
>      u32 object_type; // Eg QP,PD,etc
>      u32 user_handle; // in/out
>      u32 reserved;
>      u64 data[];
>   };
> 
> Where data follows some kind of structured attribute format like Liran
> was exploring ala netlink.
> 
> From this point we can obviously capture nearly all of the verbs
> objects, but this also can become a clean way for us to allow drivers
> to export driver-specific objects as well. (eg object_type & (1<<31)
> == driver specific)
> 
> With one more entry point (call_driver_on_object) this would cover a
> lot of cases. Maybe we have many such entry points for performance.
> 
> From there we'd probably have to define a few other common up calls.
> 
> Study of objects:
>  'device','port' - Pre-existing and read-only, query_object returns
>                    information like we have today
>  'pd','mr','mw','cq','srq','qp','ah' - Basic objects, not all have a
> modify/query
>  'flow','xrc',etc - extra objects
> 
> Additional entry points:
>  poll_cq/req_notify_cq - These are high speed, so simple ioctls, or
>                          driver specific
>  post recv/send - drivers implement these via call_driver_on_object
>  attach/detach mcast - Could be 'modify' a qp, or could be new ioctls
>  async_event - Should be an ioctl
>  get_fd - Convert the object to a fd (eg async_fd, comp_channel, etc)
> 
> I think this would substantially address the concern that the uapi is
> 'verbs' or 'qp' specific. Clearly there is space here for any number
> of object families related to RDMA devices.
> 
> It is also discoverable since we can have a query_object that returns
> all 'object_types' the device driver supports.
> 
> Sean, is this more agreeable to you than the unstructured fd idea?

I need to give this more thought, but the general outline seems reasonable.  And I like the generic structure of the ioctl's.

The architecture I was envisioning was something along the lines of having an "rdma_uabi" module.  (And I honestly don't care what things are called; we're abusing names all over.  I'm avoiding the term uverbs because the functionality and structure is different.)  Drivers would plug _directly_ into that module.  This is separate from whatever kernel interfaces that the driver plugs into.  IOCTL dispatch would then go directly to the drivers.  A driver could set its ioctl dispatch to some common kernel function, call core functionality, or just implement what it needs.

Basically, I'm viewing the rdma_uabi as a generic mechanism to export hardware specific interfaces -- be it queue pairs, command queues, or whatever, versus exporting a kernel software interface, like umad or rdma_ucm do.  For better or worse, each hardware driver could independently export their device capabilities.

- Sean

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found] ` <20160420012526.GA25508-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-20  4:54   ` Hefty, Sean
@ 2016-04-21 12:32   ` Hefty, Sean
  2016-04-21 13:35   ` Hefty, Sean
  2016-04-25 16:29   ` Hefty, Sean
  3 siblings, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-21 12:32 UTC (permalink / raw)
  To: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Weiny, Ira

This is just to point out a thread on proposed changes to the uABI, which was occurring under a subject heading that many people may not have been following.

http://www.spinics.net/lists/linux-rdma/msg35051.html

- Sean

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found] ` <20160420012526.GA25508-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-20  4:54   ` Hefty, Sean
  2016-04-21 12:32   ` Hefty, Sean
@ 2016-04-21 13:35   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB044043-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-25 16:29   ` Hefty, Sean
  3 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2016-04-21 13:35 UTC (permalink / raw)
  To: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Weiny, Ira

> The kernel common code side is pretty straightforward, just a bunch of
> tables of function pointers, templates and idrs for each object_type.

This is the part where the intent and implementation are not clear to me.  I see value in generic user space handle validation, but I prefer they map to driver specific resources.  For example, I don't want a user space allocation of a QP to result in this:

struct ib_qp {
        struct ib_device       *device;
        struct ib_pd           *pd;
        struct ib_cq           *send_cq;
        struct ib_cq           *recv_cq;
        struct ib_srq          *srq;
        struct ib_xrcd         *xrcd; /* XRC TGT QPs only */
        struct list_head        xrcd_list;
        /* count times opened, mcast attaches, flow attaches */
        atomic_t                usecnt;
        struct list_head        open_list;
        struct ib_qp           *real_qp;
        struct ib_uobject      *uobject;
        void                  (*event_handler)(struct ib_event *, void *);
        void                   *qp_context;
        u32                     qp_num;
        enum ib_qp_type         qp_type;
};

monstrosity being allocated in the kernel.  The size and layout of this sort of structure impacts scalability and performance.  This is where I want to make sure there's alignment on the overall architecture.

More specifically, I would switch the existing uABI commands to ioctl's, leaving the code paths mostly unchanged, but marking them as legacy.  Then add the new ioctls that you suggested, with future drivers hooking directly into the ioctl framework.  Honestly, I'm not sure the rdma core should define anything beyond the common ioctl structure.  Anything else can just be placed into the driver specific portion, or maybe rdma core sub-subsystem (e.g. ib, iwarp, roce, opa, mlx, ...)  Maybe we would define some common ioctl's across all devices, but I don't think they would look anything like the current commands.  They would need to be much higher level.

- Sean

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB044043-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-21 13:54       ` Hefty, Sean
  2016-04-21 14:03       ` Leon Romanovsky
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-21 13:54 UTC (permalink / raw)
  To: Hefty, Sean, Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> sub-subsystem (e.g. ib, iwarp, roce, opa, mlx, ...)  Maybe we would define
> some common ioctl's across all devices, but I don't think they would look
> anything like the current commands.  They would need to be much higher
> level.

Before someone goes off on this statement, I was thinking about device discovery and attributes, so that a user space library can figure out which devices it can handle.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Further thoughts on uAPI
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB044043-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-21 13:54       ` Hefty, Sean
@ 2016-04-21 14:03       ` Leon Romanovsky
       [not found]         ` <20160421140347.GI26951-2ukJVAZIZ/Y@public.gmane.org>
  2016-04-21 17:24       ` Jason Gunthorpe
  2016-04-24 14:15       ` Liran Liss
  3 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2016-04-21 14:03 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Thu, Apr 21, 2016 at 01:35:53PM +0000, Hefty, Sean wrote:
> > The kernel common code side is pretty straightforward, just a bunch of
> > tables of function pointers, templates and idrs for each object_type.
> 
> This is the part where the intent and implementation are not clear to me.

The implementation and expected behavior will be easily achieved by the
following pseudo code:

.. ioctl call
.... pre rdma core hook
---- rdma core code
---- post rdma core hook

Why do you need to mark commands as legacy?

Thanks.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found]         ` <20160421140347.GI26951-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-04-21 14:35           ` Hefty, Sean
  0 siblings, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-21 14:35 UTC (permalink / raw)
  To: leon-DgEjT+Ai2ygdnm+yROfE0A
  Cc: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

> > > The kernel common code side is pretty straightforward, just a bunch of
> > > tables of function pointers, templates and idrs for each object_type.
> >
> > This is the part where the intent and implementation are not clear to
> me.
> 
> The implementation and expected behavior will be easily achieved by the
> following pseudo code:
> 
> .. ioctl call
> .... pre rdma core hook
> ---- rdma core code
> ---- post rdma core hook

Pre/post hooks are not the same behavior.  The difference is that the driver registers with the ioctl interface directly and is not required to hook into a specific kernel interface.  It's not hooking a core ioctl, it's exposing its own. 

> Why do you need to mark commands as legacy?

Because we're talking about a new ioctl/command format.  I would kill them entirely, but I want to make the path forward as smooth as possible.  A driver should not be forced to change the kernel core interfaces to export some HW capability to user space.  That's the problem we're trying to solve, and I think it's an architectural problem inherent in the RDMA stack, not a maintainer issue as some believe.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Further thoughts on uAPI
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB044043-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-21 13:54       ` Hefty, Sean
  2016-04-21 14:03       ` Leon Romanovsky
@ 2016-04-21 17:24       ` Jason Gunthorpe
       [not found]         ` <20160421172428.GA5102-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-24 14:15       ` Liran Liss
  3 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-21 17:24 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Thu, Apr 21, 2016 at 01:35:53PM +0000, Hefty, Sean wrote:
> > The kernel common code side is pretty straightforward, just a bunch of
> > tables of function pointers, templates and idrs for each object_type.
> 
> This is the part where the intent and implementation are not clear
> to me.  I see value in generic user space handle validation, but I
> prefer they map to driver specific resources.  For example, I don't
> want a user space allocation of a QP to result in this:

When we had our telephone call I did raise this as a discussion topic,
framed a little differently than you:

  Do we need any 'common' calls or should 'everything' be driver
  specific?

This is based on the observation that the current uverbs code does
very little beyond just calling a driver function pointer. So why
marshal the userspace information into common uAPI information then
marshal it again into hardware/driver information when the uAPI side
could just format it directly for hardware consumption?

** And to answer my own question: there is actually a good reason to
   continue the 'common' API - internally the uverbs/drivers use the
   kAPI to do most of the work. An incremental API transformation
   requires us to continue to provide that basic flow or face a
   daunting task of radically upgrading every single driver!

> monstrosity being allocated in the kernel.  The size and layout of
> this sort of structure impacts scalability and performance.  This is
> where I want to make sure there's alignment on the overall
> architecture.

Your observation that uAPI users don't need the kapi struct ib_qp is
along the same lines and very valid. Could the driver build a more
efficient struct?

The flip side is that the entire kAPI related to a verbs qp in the
driver becomes duplicated, one side presents the kAPI verbs and the
other side presents the driver-specific ioctl uAPI for the same
functionality.

Obviously this is a radical departure from the uapi we have
today. Today they all share code: uverbs translates the uAPI to
the kAPI and back again, so drivers only have to implement the kAPI to
get u/k verbs support.

Within the basic sketch I presented there are two basic ways to
address this:
  1) Allow the driver to override RDMA uAPI object call backs
     directly.
     A driver could replace the entire uAPI QP object with its
     own code. The default implementation would be the existing scheme
     that translates the uAPI to the kAPI.
     The driver code could do whatever it wants and isn't required to
     implement the 'struct ib_qp' to get there.
     In this case the driver must still implement the common uAPI for
     verbs QP objects.
  2) Allow the driver to provide a 'driver specific QP' object which
     does not follow the common uAPI. This is a new object that would
     only be used by the driver specific library, and like the above
     totally bypasses all of the standard kAPI design.
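
To make #1 concrete, here is a minimal sketch of a per-class ops table
where non-NULL driver entries override the common default. The enum
values, the trimmed op list and every function name are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

enum { OBJ_PD, OBJ_QP, OBJ_MAX };       /* hypothetical class ids */
enum { OP_CREATE, OP_DESTROY, OP_MAX }; /* trimmed op list */

typedef int (*uapi_cb)(void);

struct uapi_class {
	uapi_cb ops[OP_MAX];
};

/* Stand-ins for the common uAPI-to-kAPI translation code. */
static int common_pd_create(void) { return 1; }
static int common_qp_create(void) { return 2; }
static int common_destroy(void)   { return 0; }
/* Stand-in for a driver that builds its own QP representation. */
static int driver_qp_create(void) { return 3; }

static const struct uapi_class common_ops[OBJ_MAX] = {
	[OBJ_PD] = { .ops = { common_pd_create, common_destroy } },
	[OBJ_QP] = { .ops = { common_qp_create, common_destroy } },
};

/* The driver overrides QP create only; everything else stays NULL. */
static const struct uapi_class driver_ops[OBJ_MAX] = {
	[OBJ_QP] = { .ops = { [OP_CREATE] = driver_qp_create } },
};

/* Build the per-device table once: driver entries win when present. */
static void uapi_merge(struct uapi_class *out,
		       const struct uapi_class *common,
		       const struct uapi_class *driver)
{
	assert(out && common && driver);
	for (int c = 0; c < OBJ_MAX; c++)
		for (int o = 0; o < OP_MAX; o++)
			out[c].ops[o] = driver[c].ops[o] ? driver[c].ops[o]
							 : common[c].ops[o];
}
```

Merging once at registration keeps the ioctl path to a single table
lookup, and the common entries double as the reference implementation
for anything the driver does not override.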

That said - there are clearly some important services that core code
is providing, locking, idrs, hot-removal, auto-deletion, things like
comp channels, etc. These are object agnostic and even if you have a
driver specific object they still have to be cared for.

I'm guessing some hybrid is going to be best, eg simple objects like a
verbs PD could be handled through common code calling the kAPI for all
drivers and complex very driver-specific objects like MRs, AHs or QPs
are better routed directly to the driver without using the kapi.

I'm also very wary of giving driver authors 'free rein' to
design their driver-specific uAPIs, because history across the kernel
has shown that driver authors do a bad job of that kind
of work :( Thus drivers should be encouraged to stick with the #1
approach as much as possible, and all examples of #2 have to
be reviewed by the core maintainer group (eg to ensure someone doesn't
try to access EEPROM over this interface :P )

> More specifically, I would switch the existing uABI commands to
> ioctl's, leaving the code paths mostly unchanged, but marking them
> as legacy.  Then add the new ioctls that you suggested, with future
> drivers hooking directly into the ioctl framework.

In an ideal world, I would like to not retain the existing uABI in the
current format. We may decide that is just too much work, but as a
goal..

We already have to maintain it exactly as is (with write) for a long
time, adding another copy that is ioctl based and has to live forever
doesn't seem great :( Maybe the compromise is to use this scheme to
carry the existing struct as the base attribute?

The reason is a little nuanced - I see #1 as the desired path, so we
still have to define this common uAPI for the industry-standard
objects. If we migrate to it then the kernel side can be optimized at
our leisure and there is a very nice incremental transition. Further,
we have only one set of code to maintain in the kernel, and the kernel
uAPI/kAPI default implementation becomes the necessary reference for
driver authors seeking to optimize it.

I think some coding experiments are going to be needed to qualify the
various options.

As a starting point, I am thinking of something along the lines of:

struct ib_device {
   [ .. ] 
   /* During 'ib_device_register' this array is allocated. The driver
      provided ops and common ops merged into one table for
      performance. */
   struct rdma_uapi_class *class_data;
};

struct rdma_uapi_object_data {
   struct .. idr ..
   .. locking ..
   etc
};

struct rdma_uapi_context {
    struct rdma_uapi_object_data *object_data; // Indexed same as class_data
};

typedef int (*rdma_uapi_cb)(struct rdma_uapi_context *ctx,
                            const struct rdma_object_hdr *imsg,
                            struct rdma_object_hdr *omsg);
struct rdma_uapi_class {
    rdma_uapi_cb create_object;
    rdma_uapi_cb query_object;
    rdma_uapi_cb modify_object;
    rdma_uapi_cb destroy_object;
    rdma_uapi_cb object_specific[4];
    rdma_uapi_cb driver_specific[4];
    uint32_t class_id;
};

core/uapi_verbs.c:

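static const struct rdma_uapi_class verbs_uapi_ops[] = {
  {.class_id = RDMA_OBJECT_DEVICE, .query_object = driver_query},
  {.class_id = RDMA_OBJECT_PORT, .query_object = port_query},
  {.class_id = RDMA_OBJECT_QP, .create_object = qp_create,
   .query_object = qp_query, .modify_object = qp_modify,
   .destroy_object = qp_destroy},
  {.class_id = RDMA_OBJECT_PD, .create_object = pd_create,
   .query_object = pd_query, .modify_object = pd_modify,
   .destroy_object = pd_destroy},
};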

/* And perhaps if necessary have common, ib, iwarp, roce, opa
   versions of these functions for optimal performance */

core/uapi_XXXX.c:

// Future new multi-vendor uAPI
static const struct rdma_uapi_class XXX_uapi_ops[] = {
  {.class_id = RDMA_OBJECT_XXX1, },
  {.class_id = RDMA_OBJECT_XXX2, },
};

hw/driver.c:

static const struct rdma_uapi_class driver_uapi_ops[] = {
  // Override what uapi_verbs is providing
  {.class_id = RDMA_OBJECT_QP, .create_object = driver_create,
   .query_object = driver_query, .modify_object = driver_modify,
   .destroy_object = driver_destroy},
  {.class_id = RDMA_OBJECT_PD}, // NULL, use standard
};

hw/drivers/hfi1.c

// Replaces the char dev:
static const struct rdma_uapi_class hfi_uapi_ops[] = {
  // Driver directly provides its own object
  {.class_id = RDMA_OBJECT_HFI1_CTXT,
   .create_object = assign_ctxt,
   .query_object = get_ctxt_info,
   .destroy_object = hfi1_cmd_ctxt_reset,
   .object_specific = {
       [HFI1_CMD_RECV_CTRL] = ...,
       ..
   },
  },
};

core/uapi_common.c:

static const struct rdma_uapi_class common_uapi_ops[] = {
  /* Format the content of class_data so userspace can figure out what
     the kernel supports. */
  {.class_id = RDMA_OBJECT_INTERFACE, .query_object = query_classes},
};

int fops_ioctl(... )
{
	struct rdma_uapi_context *ctx = fops->private_data;
	const struct rdma_uapi_class *class;
	struct rdma_object_hdr hdr;
	rdma_uapi_cb target;

	/* copy_from_user() returns the number of bytes it could not copy */
	if (copy_from_user(&hdr, udata, sizeof(hdr)))
		return -EFAULT;

	if (hdr.class_id >= ctx->device->class_data_len ||
	    hdr.op_id >= MAX_OP_ID)
		return -EOPNOTSUPP;

	class = &ctx->device->class_data[hdr.class_id];
	/* Direct-index the labeled function pointer scheme as an array for
	   performance */
	target = ((const rdma_uapi_cb *)class)[hdr.op_id];
	if (target == NULL)
		return -EOPNOTSUPP;

	/* There should probably be a little bit more support here,
	   eg common code to access the object-specific IDR, hold
	   things like the rw sem for hot-removal, basic parse and validate
	   the message. But this is the basic idea */
	return target(ctx,....);
}
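
For anyone who wants to poke at the dispatch logic outside the kernel,
a userspace mock might look like the following. Everything here is an
assumption for illustration: struct msg_hdr, the op enum and the demo
table are invented, and the ops are held in a real array rather than
casting the struct of named pointers, so no layout assumption is
needed:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: which class and which operation to run. */
struct msg_hdr {
	uint32_t class_id;
	uint32_t op_id;
};

typedef int (*uapi_cb)(const struct msg_hdr *hdr);

enum { OP_CREATE, OP_QUERY, OP_MODIFY, OP_DESTROY, OP_MAX };

struct uapi_class {
	uapi_cb ops[OP_MAX];
};

/* Validate both indexes, then call the handler if one is registered. */
static int uapi_dispatch(const struct uapi_class *classes, size_t nclasses,
			 const struct msg_hdr *hdr)
{
	uapi_cb target;

	assert(classes != NULL && hdr != NULL);
	if (hdr->class_id >= nclasses || hdr->op_id >= OP_MAX)
		return -EOPNOTSUPP;

	target = classes[hdr->class_id].ops[hdr->op_id];
	if (target == NULL)
		return -EOPNOTSUPP;

	return target(hdr);
}

/* Demo table: class 1 supports query only. */
static int demo_query(const struct msg_hdr *hdr)
{
	(void)hdr;
	return 42;
}

static const struct uapi_class demo_classes[2] = {
	[1] = { .ops = { [OP_QUERY] = demo_query } },
};
```

Unknown classes, out-of-range ops and unimplemented ops all collapse
into the same -EOPNOTSUPP, so userspace can probe for support without
a separate capability call.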

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found]         ` <20160421172428.GA5102-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-22 16:35           ` Hefty, Sean
  2016-04-24 20:11           ` Hefty, Sean
  1 sibling, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-22 16:35 UTC (permalink / raw)
  To: 'Jason Gunthorpe'
  Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

> ** And to answer my own question: there is actually a good reason to
>    continue the 'common' API - internally the uverbs/drivers use the
>    kAPI to do most of the work. An incremental API transformation
>    requires us to continue to provide that basic flow or face a
>    daunting task of radically upgrading every single driver!

I could not make the ofvwg call this week, so I missed this conversation.  But I agree that we can't have radical driver upgrades.


> Your observation that uAPI users don't need the kapi struct ib_qp is
> along the same lines and very valid. Could the driver build a more
> efficient struct?

I would add that there are efficiencies that we can gain if we support higher level of abstractions.  For example, there's no technical reason why connection establishment couldn't be done entirely in the kernel, versus making multiple user to kernel transitions to modify the qp.  And this would provide additional security that the app doesn't modify the path record.


> I'm guessing some hybrid is going to be best, eg simple objects like a
> verbs PD could be handled through common code calling the kAPI for all
> drivers and complex very driver-specific objects like MRs, AHs or QPs
> are better routed directly to the driver without using the kapi.

I agree, and I think this may fall out naturally.


> I'm also very concerned to give driver authors a 'free reign' to
> design their driver-specific uAPIs, because historically it has been
> proven across the kernel that driver authors do a bad job of that kind
> of work :( Thus encouraging drivers to stick with the #1 approach as
> much as possible should be encouraged, and all examples of #2 has to
> be reviewed by the core maintainer group (eg to ensure someone doesn't
> try to access EEPROM over this interface :P )

I agree here too.  The main criticism I've read against ioctl's is that they allowed drivers to implement poorly designed interfaces.  Trying to fix that by using write didn't really address the issue...


> We already have to maintain it exactly as is (with write) for a long
> time, adding another copy that is ioctl based and has to live forever
> doesn't seem great :( Maybe the compromise is to use this scheme to
> carry the existing struct as the base attribute?

I was assuming that the write uABI would be replaced with the ioctl path, so I was looking at a way to minimize that transition.  If that's not necessary, I'm sure I could come up with all sorts of other ideas.  :)

It's taking me longer to work through the ioctl details because of, let's call it, significant work distractions.  But I'll comment on your proposals separately once I analyze them more.

- Sean

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB044043-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
                         ` (2 preceding siblings ...)
  2016-04-21 17:24       ` Jason Gunthorpe
@ 2016-04-24 14:15       ` Liran Liss
       [not found]         ` <AM3PR05MB141161876D5CA05B1993D20CB1610-LOZWmgKjnYgmsg45OnKo/9qRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  3 siblings, 1 reply; 25+ messages in thread
From: Liran Liss @ 2016-04-24 14:15 UTC (permalink / raw)
  To: Hefty, Sean, Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> From: ofvwg [mailto:ofvwg-bounces@lists.openfabrics.org] On Behalf Of Hefty,
> Sean

> 
> > The kernel common code side is pretty straightforward, just a bunch of
> > tables of function pointers, templates and idrs for each object_type.
> 
> This is the part where the intent and implementation are not clear to me.  I see
> value in generic user space handle validation, but I prefer they map to driver
> specific resources.  For example, I don't want a user space allocation of a QP to
> result in this:
> 
> struct ib_qp {
>         struct ib_device       *device;
>         struct ib_pd           *pd;
>         struct ib_cq           *send_cq;
>         struct ib_cq           *recv_cq;
>         struct ib_srq          *srq;
>         struct ib_xrcd         *xrcd; /* XRC TGT QPs only */
>         struct list_head        xrcd_list;
>         /* count times opened, mcast attaches, flow attaches */
>         atomic_t                usecnt;
>         struct list_head        open_list;
>         struct ib_qp           *real_qp;
>         struct ib_uobject      *uobject;
>         void                  (*event_handler)(struct ib_event *, void *);
>         void                   *qp_context;
>         u32                     qp_num;
>         enum ib_qp_type         qp_type;
> };
> 
> monstrosity being allocated in the kernel.  The size and layout of this sort of
> structure impacts scalability and performance.  This is where I want to make
> sure there's alignment on the overall architecture.
> 

How the kernel decides to represent objects is a kernel implementation issue, not an ABI issue.
As long as the ABI is expressive enough (e.g., pass only the relevant information for the typed object at hand), the kernel code could decide what to hold where.

> More specifically, I would switch the existing uABI commands to ioctl's, leaving
> the code paths mostly unchanged, but marking them as legacy.  Then add the
> new ioctls that you suggested, with future drivers hooking directly into the ioctl
> framework.  Honestly, I'm not sure the rdma core should define anything
> beyond the common ioctl structure.  Anything else can just be placed into the
> driver specific portion, or maybe rdma core sub-subsystem (e.g. ib, iwarp, roce,
> opa, mlx, ...)  Maybe we would define some common ioctl's across all devices,
> but I don't think they would look anything like the current commands.  They
> would need to be much higher level.

That's where the provider channel comes in.
For provider stuff, uverbs won't be anything more than a way to get a channel + some conventions (maybe).
There won't be any matching kAPI for it.

For generic interfaces (currently includes Verbs, Ethernet QPs, and IB management), the new scheme should map what we have today in a flexible manner.
This would enable us, for example, to pass only RoCE addressing attributes while modifying a RoCE QP (and optionally optimizing the kernel representation as well).
These interfaces have a matching kAPI.

New object types could be supported with additional/alternative attributes to existing objects, where it makes sense.
The flexibility of the ABI would allow for both optimal parameter passing and kernel representation.
For example, if I want to add a counter that counts completions, I would add it as a new type of CQ. This would allow it to interoperate with the existing object model.
The same would apply for new transport types.

If a completely new generic concept is introduced, we can add additional interfaces and extend the kAPI to support it.

> 
> - Sean


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: Further thoughts on uAPI
       [not found]         ` <20160421172428.GA5102-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-22 16:35           ` Hefty, Sean
@ 2016-04-24 20:11           ` Hefty, Sean
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045101-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2016-04-24 20:11 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

> Within the basic sketch I presented there are two basic ways to
> address this:
>   1) Allow the driver to override RDMA uAPI object call backs
>      directly.
>      A driver could replace the entire uAPI QP object with its
>      own code. The default implementation would be the existing scheme
>      that translates the uAPI to the kAPI.
>      The driver code could do whatever it wants and isn't required to
>      implement the 'struct ib_qp' to get there.
>      In this case the driver must still implement the common uAPI for
>      verbs QP objects.
>   2) Allow the driver to provide a 'driver specific QP' object which
>      does not follow the common uAPI. This is a new object that would
>      only be used by the driver specific library, and like the above
>      totally bypasses all of the standard kAPI design.
> 
> That said - there are clearly some important services that core code
> is providing, locking, idrs, hot-removal, auto-deletion, things like
> comp channels, etc. These are object agnostic and even if you have a
> driver specific object they still have to be cared for.
> 
> I'm guessing some hybrid is going to be best, eg simple objects like a
> verbs PD could be handled through common code calling the kAPI for all
> drivers and complex very driver-specific objects like MRs, AHs or QPs
> are better routed directly to the driver without using the kapi.

{snip}

> We already have to maintain it exactly as is (with write) for a long
> time, adding another copy that is ioctl based and has to live forever
> doesn't seem great :( Maybe the compromise is to use this scheme to
> carry the existing struct as the base attribute?

After fully over-analyzing things, these are my current thoughts.

I'm for merging all the rdma uABI interfaces.  This will allow us to share events, and allow closer association between objects.

We have 256 ioctl commands available.  A straightforward mapping would result in uverbs using 43, ucma 23, and ucm 18.  I'd deprecate ucm, but if it needs to be kept, it could probably drop to about 6 commands.  Uverbs could be re-structured into 10 objects -- each with create/query/modify/close routines -- plus about 6 fast path commands.

I propose splitting the ioctls into 8 command blocks, with 32 commands each.

Block 0: core objects
The uapi object create/query/modify/close routines (most of uverbs) fit here. 

Block 1: cm/mgmt
Block 2: driver fast path
Block 3: experimental?  No ABI compat guarantees
Block 4-7: reserved

A command block has a num_ops, followed by an array of calls.  Each device structure has an array of pointers to command blocks.  This allows a driver to override any call, without necessarily storing a huge function table.
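
A minimal sketch of the command-block layout described above, assuming the ioctl number is split as block = nr / 32 and op = nr % 32. The names uapi_call_fn, cmd_block, uapi_device and uapi_dispatch are hypothetical and are not taken from any existing kernel code; core_create and drv_post_send are stand-in handlers.

```c
#include <stddef.h>
#include <errno.h>

#define CMDS_PER_BLOCK 32u
#define NUM_BLOCKS      8u	/* 8 blocks * 32 commands = 256 ioctl numbers */

/* Hypothetical handler signature; a real one would take more context. */
typedef int (*uapi_call_fn)(void *arg);

/* A command block: num_ops, followed by an array of calls. */
struct cmd_block {
	unsigned int num_ops;
	const uapi_call_fn *ops;
};

/* Each device carries an array of pointers to command blocks. */
struct uapi_device {
	const struct cmd_block *blocks[NUM_BLOCKS];
};

/* Route an ioctl number to its handler, or fail with -ENOTTY. */
static int uapi_dispatch(const struct uapi_device *dev, unsigned int nr,
			 void *arg)
{
	unsigned int block = nr / CMDS_PER_BLOCK;
	unsigned int op = nr % CMDS_PER_BLOCK;
	const struct cmd_block *cb;

	if (block >= NUM_BLOCKS)
		return -ENOTTY;
	cb = dev->blocks[block];
	if (!cb || op >= cb->num_ops || !cb->ops[op])
		return -ENOTTY;
	return cb->ops[op](arg);
}

/* Stand-in handlers, used only to exercise the dispatcher. */
static int core_create(void *arg) { (void)arg; return 100; }
static int drv_post_send(void *arg) { (void)arg; return 200; }
```

A driver that only overrides the fast path would point blocks[2] at its own cmd_block and share the core's block 0, so it never stores a full 256-entry table.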

For the base ioctl command, I would also add these two fields: op_ctrl and flags.  I envision that these fields can be used to determine the format of the input/output data.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: Further thoughts on uAPI
       [not found] ` <20160420012526.GA25508-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-04-21 13:35   ` Hefty, Sean
@ 2016-04-25 16:29   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB0453F5-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  3 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2016-04-25 16:29 UTC (permalink / raw)
  To: Jason Gunthorpe, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Weiny, Ira

> Study of objects:
>  'device','port' - Pre-existing and read-only, query_object returns
>                    information like we have today
>  'pd','mr','mw','cq','srq','qp','ah' - Basic objects, not all have a
> modify/query
>  'flow','xrc',etc - extra objects

IMO, we should examine what items are objects.  E.g. an XRC RQ is treated as a QP, whereas an SRQ is not.  It may make more sense to treat each QP type as a separate object.  It would definitely make the data structures more efficient and a lot more understandable.

> Additional entry points:
>  poll_cq/req_notify_cq - These are high speed, so simple ioctls, or
>                          driver specific
>  post recv/send - drivers implement these via call_driver_on_object
>  attach/detach mcast - Could be 'modify' a qp, or could be new ioctls
>  async_event - Should be an ioctl
>  get_fd - Convert the object to a fd (eg async_fd, comp_channel, etc)

We could define a generic event_channel/event_queue object to report asynchronous events.  The EQ could then be associated with a device, port, CQ, CM objects (e.g. QP), etc.  User space could achieve current semantics by limiting which objects are associated with each channel.

* Re: Further thoughts on uAPI
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045101-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-25 18:19               ` Jason Gunthorpe
       [not found]                 ` <20160425181953.GC7675-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-25 18:19 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Sun, Apr 24, 2016 at 08:11:47PM +0000, Hefty, Sean wrote:

> After fully over-analyzing things, these are my current thoughts.
> 
> I'm for merging all the rdma uABI interfaces.  This will allow us to
> share events, and allow closer association between objects.

Yes

> We have 256 ioctl commands available.  A straightforward mapping
> would result in uverbs using 43, ucma 23, and ucm 18.  I'd deprecate
> ucm, but if it needs to be kept, it could probably drop to about 6
> commands.  Uverbs could be re-structured into 10 objects -- each
> with create/query/modify/close routines -- plus about 6 fast path
> commands.

Yes, this seems broadly right to me.

However, I had intended to use the object type carried in the ioctl arg
as the primary mux and the ioctl would just indicate the 'method'. The
method ID table would be split much like you describe:

'core common' object routines
'built-in extra' object routines
'driver-fast-path' object routines

Not sure about experimental..

~128 unique methods for every object seems like enough??

Why do you feel cm/mgmt needs dedicated routines? I was going to model
CM as more objects and use the 'built-in extra' block to make CM
object specific calls (eg bind/etc)

This still works OK for strace: it has to parse the ioctl # and then
look into the class_id uniform first dword, then it knows exactly how
to format and parse the ioctl argument.

> A command block has a num_ops, followed by an array of calls.  Each
> device structure has an array of pointers to command blocks.  This
> allows a driver to override any call, without necessarily storing a
> huge function table.

My sketch had the drivers just provide the individual things they
wanted to provide/override by number:

 static const struct rdma_uapi_class hfi_uapi_ops[] = {
  // Driver directly provides its own object
  {.class_id = RDMA_OBJECT_HFI1_CTXT,
   .create_object = assign_ctxt,

And then rely on a 'compile' phase during registration to build a
micro-optimized dispatch table.
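
One way the 'compile' phase could work, as a sketch only: start from the core defaults and overlay whatever the driver supplies, keyed by class_id. The struct layout, UAPI_MAX_CLASS, uapi_overlay and uapi_compile are all assumptions made for illustration; only class_id and create_object appear in the fragment above, and core_create, core_query and drv_create are stand-in handlers.

```c
#include <stddef.h>

#define UAPI_MAX_CLASS 8	/* hypothetical bound on class_id */

typedef int (*uapi_fn)(void *arg);

/* Assumed shape of a per-class method set. */
struct uapi_class {
	unsigned int class_id;
	uapi_fn create_object;
	uapi_fn query_object;
	uapi_fn modify_object;
	uapi_fn delete_object;
};

/* Dense table built once at registration, indexed by class_id. */
struct uapi_table {
	struct uapi_class slot[UAPI_MAX_CLASS];
};

/* Overlay src onto the table: non-NULL methods replace what is there. */
static int uapi_overlay(struct uapi_table *t, const struct uapi_class *src,
			size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct uapi_class *dst;

		if (src[i].class_id >= UAPI_MAX_CLASS)
			return -1;
		dst = &t->slot[src[i].class_id];
		dst->class_id = src[i].class_id;
		if (src[i].create_object)
			dst->create_object = src[i].create_object;
		if (src[i].query_object)
			dst->query_object = src[i].query_object;
		if (src[i].modify_object)
			dst->modify_object = src[i].modify_object;
		if (src[i].delete_object)
			dst->delete_object = src[i].delete_object;
	}
	return 0;
}

/* Core defaults first, then the driver's additions and overrides. */
static int uapi_compile(struct uapi_table *t,
			const struct uapi_class *core, size_t ncore,
			const struct uapi_class *drv, size_t ndrv)
{
	*t = (struct uapi_table){ { { 0 } } };
	if (uapi_overlay(t, core, ncore))
		return -1;
	return uapi_overlay(t, drv, ndrv);
}

/* Stand-in handlers, used only to exercise the merge. */
static int core_create(void *arg) { (void)arg; return 1; }
static int core_query(void *arg) { (void)arg; return 2; }
static int drv_create(void *arg) { (void)arg; return 3; }
```

With this shape a driver lists only the classes and methods it cares about; anything it leaves NULL falls through to the core implementation.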

> For the base ioctl command, I would also add these two fields:
> op_ctrl and flags.  I envision that these fields can be used to
> determine the format of the input/output data.

There has been a lot of talk of using a structure like netlink with a
linked list of binary attributes and an optional/mandatory flag. For
the lower speed stuff that seems reasonable, though it is certainly
over-engineered for some commands.

So, a sketch would look like this:

struct msg
{
   uint16_t length;
   uint16_t class_id;
   uint32_t object_id; // in/out
   struct qp_base_attr
   {
       uint16_t length;
       uint16_t attribute_id;

       uint16_t qpn;  //in/out
       uint16_t qp_flags;
       uint16_t max_send_wr,max_recv_wr,max_send_sge,////
   } base;
   // Option to piggy back what ibv_modify_qp does:
   struct qp_addr_ib
   {
       uint16_t length;
       uint16_t attribute_id;

       uint16_t dlid,slid,sl,pkey,etc;
   } addr;
};

msg.length = sizeof(msg);
msg.class_id = RDMA_OBJ_QP_UD;
msg.base.length = sizeof(msg.base);
msg.base.attribute_id = RDMA_ATTR_QP_BASE;
msg.base.qp_flags = XX
[..]
ioctl(fd,RDMA_CREATE_OBJECT,&msg);
[..]
ioctl(fd,RDMA_MODIFY_OBJECT,&msg2);
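
To make the attribute idea concrete, here is a hedged sketch of how a receiver might walk such a list. It assumes every attribute starts with the { length, attribute_id } header from the sketch above, that length counts the header itself, and that attributes are packed back to back with no alignment padding (netlink itself pads to 4 bytes). attr_hdr, attr_put and attr_find are hypothetical helper names.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Common header assumed at the start of every attribute. */
struct attr_hdr {
	uint16_t length;	/* bytes in this attribute, header included */
	uint16_t attribute_id;
};

/*
 * Append one attribute at offset off and return the new end offset.
 * The caller must make sure buf has room; payload must not be NULL.
 */
static size_t attr_put(uint8_t *buf, size_t off, uint16_t id,
		       const void *payload, uint16_t payload_len)
{
	struct attr_hdr h;

	h.length = (uint16_t)(sizeof(struct attr_hdr) + payload_len);
	h.attribute_id = id;
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + sizeof(h), payload, payload_len);
	return off + h.length;
}

/*
 * Return a pointer to the first attribute with the given id, or NULL
 * if it is absent or the list is malformed (bad or truncated length).
 */
static const uint8_t *attr_find(const uint8_t *buf, size_t len, uint16_t id)
{
	size_t off = 0;

	while (len - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.length < sizeof(h) || h.length > len - off)
			return NULL;
		if (h.attribute_id == id)
			return buf + off;
		off += h.length;
	}
	return NULL;
}
```

The length checks are where most of the parsing pain lives: every attribute has to be validated against the remaining buffer before its payload is touched.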

Jason

* Re: Further thoughts on uAPI
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB0453F5-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-25 18:32       ` Jason Gunthorpe
       [not found]         ` <20160425183243.GD7675-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-25 18:32 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Mon, Apr 25, 2016 at 04:29:09PM +0000, Hefty, Sean wrote:
> > Study of objects:
> >  'device','port' - Pre-existing and read-only, query_object returns
> >                    information like we have today
> >  'pd','mr','mw','cq','srq','qp','ah' - Basic objects, not all have a
> > modify/query
> >  'flow','xrc',etc - extra objects
> 
> IMO, we should examine what items are objects.  E.g. an XRC RQ is
> treated as a QP, whereas an SRQ is not.  It may make more sense to
> treat each QP type as separate objects.  It would definitely make
> the data structures more efficient and a lot more understandable.

Yes, I was thinking along those lines as well.

It does make it much more understandable if each of the different QP
types was a different object at the high level and thus had a
different set of allowed attributes and ops.

Eg should UD/RC/UC/RD/XRC/etc all be different object classes?

> > Additional entry points:
> >  poll_cq/req_notify_cq - These are high speed, so simple ioctls, or
> >                          driver specific
> >  post recv/send - drivers implement these via call_driver_on_object
> >  attach/detach mcast - Could be 'modify' a qp, or could be new ioctls
> >  async_event - Should be an ioctl
> >  get_fd - Convert the object to a fd (eg async_fd, comp_channel, etc)
> 
> We could define a generic event_channel/event_queue object to report
> asynchronous events.  The EQ could then be associated with a device,
> port, CQ, CM objects (e.g. QP), etc.  User space could achieve
> current semantics by limiting which objects are associated with each
> channel.

Yes, that is sort of where I was going with the 'get_fd' method - we
have several places that use these fds on different things for async
event delivery.

API wise, yes it is an event channel, however, there must always be an
FD, so if you want to do something to an event channel, it should be
done via ioctl on that fd, not through the common fd. So 'get_fd' is
another form of 'create_object' except that the returned object_handle
is a fd number not an object id.

Perhaps 'create_fd_object' instead ?

Jason

* RE: Further thoughts on uAPI
       [not found]         ` <20160425183243.GD7675-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-25 18:51           ` Hefty, Sean
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045460-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2016-04-25 18:51 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

> API wise, yes it is an event channel, however, there must always be an
> FD, so if you want to do something to an event channel, it should be
> done via ioctl on that fd, not through the common fd. So 'get_fd' is
> another form of 'create_object' except that the returned object_handle
> is a fd number not an object id.
> 
> Perhaps 'create_fd_object' instead ?

I think we're pretty much in agreement, just differences in details.

I was thinking that create event channel would return an fd and implicitly define its ioctl format.  Create_fd_object is more generic, but how were you thinking of defining the ioctl format?

* RE: Further thoughts on uAPI
       [not found]                 ` <20160425181953.GC7675-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-25 19:16                   ` Hefty, Sean
       [not found]                     ` <1828884A29C6694DAF28B7E6B8A82373AB04548E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-04-26 13:18                   ` Doug Ledford
  1 sibling, 1 reply; 25+ messages in thread
From: Hefty, Sean @ 2016-04-25 19:16 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

> However, I had intended to use the object type carried in the ioctl arg
> as the primary mux and the ioctl would just indicate the 'method'. The
> method ID table would be split much like you describe:
> 
> 'core common' object routines
> 'built-in extra' object routines
> 'driver-fast-path' object routines

I did understand the proposal.  My main concern was that it appeared that it would result in a very large function array, potentially with a significant number of NULL functions, associated with each driver.

> Not sure about experimental..

I wasn't sure either.

> ~128 unique methods for every object seems like enough??

Seems more than enough to me

> Why do you feel cm/mgmt needs dedicated routines? I was going to model
> CM as more objects and use the 'built-in extra' block to make CM
> object specific calls (eg bind/etc)

I separated the cm/mgmt calls because I doubt a driver will ever override them, and some of the calls are system wide, versus being bound to a driver. 

> This still works OK for strace: it has to parse the ioctl # and then
> look into the class_id uniform first dword, then it knows exactly how
> to format and parse the ioctl argument.
> 
> > A command block has a num_ops, followed by an array of calls.  Each
> > device structure has an array of pointers to command blocks.  This
> > allows a driver to override any call, without necessarily storing a
> > huge function table.
> 
> My sketch had the drivers just provide the individual things they
> wanted to provide/override by number:
> 
>  static const struct rdma_uapi_class hfi_uapi_ops[] {
>   // Driver directly provides its own object
>   {.class_id = RDMA_OBJECT_HFI1_CTXT,
>    .create_object = assign_ctxt,
> 
> And then rely on a 'compile' phase during registration to build a
> micro-optimized dispatch table.
> 
> > For the base ioctl command, I would also add these two fields:
> > op_ctrl and flags.  I envision that these fields can be used to
> > determine the format of the input/output data.
> 
> There has been a lot of talk of using a structure like netlink with a
> linked list of binary attributes and an optional/mandatory flag. For
> the lower speed stuff that seems reasonable, though it is certainly
> over-engineered for some commands.
> 
> So, a sketch would look like this:
> 
> struct msg
> {
>    uint16_t length;
>    uint16_t class_id;
>    uint32_t object_id; // in/out
>    struct qp_base_attr
>    {
>        uint16_t length;
>        uint16_t attribute_id;
> 
>        uint16_t qpn;  //in/out
>        uint16_t qp_flags;
>        uint16_t max_send_wr,max_recv_wr,max_send_sge,////
>    };
>    // Option to piggy back what ibv_modify_qp does:
>    struct qp_addr_ib
>    {
>        uint16_t length;
>        uint16_t attribute_id;
> 
>        uint16_t dlid,slid,sl,pkey,etc;
>    }
> }
> 
> msg.length = sizeof(msg);
> msg.class_id = RDMA_OBJ_QP_UD;
> msg.base.length = sizeof(msg.base);
> msg.base.attribute_id = RDMA_ATTR_QP_BASE;
> msg.base.qp_flags = XX
> [..]
> ioctl(fd,RDMA_CREATE_OBJECT,&msg);
> [..]
> ioctl(fd,RDMA_MODIFY_OBJECT,&msg2);

I had followed this, but wondered if it wouldn't be easier to just say, use structure 1 or structure 2.  A lot of the need for this complexity seems driven by treating all QPs as a single object, rather than separate objects.  Making that change might simplify things..?  Also I think we should consider reasonable optimizations for connecting QPs.  Doug and I had to debug apps that broke because the connection process was not completing quick enough.

- Sean

* Re: Further thoughts on uAPI
       [not found]             ` <1828884A29C6694DAF28B7E6B8A82373AB045460-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-25 20:22               ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-25 20:22 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Mon, Apr 25, 2016 at 06:51:59PM +0000, Hefty, Sean wrote:

> I was thinking that create event channel would return an fd and
> implicitly define its ioctl format.  Create_fd_object is more
> generic, but how were you thinking of defining the ioctl format?

Just keep going with the main format, on the new fd, except that
the object_id == 0 and the ioctl must be executed on the new fd.

Jason

* Re: Further thoughts on uAPI
       [not found]                     ` <1828884A29C6694DAF28B7E6B8A82373AB04548E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-04-25 20:53                       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-25 20:53 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Mon, Apr 25, 2016 at 07:16:12PM +0000, Hefty, Sean wrote:
> > However, I had intended to use the object type carried in the ioctl arg
> > as the primary mux and the ioctl would just indicate the 'method'. The
> > method ID table would be split much like you describe:
> > 
> > 'core common' object routines
> > 'built-in extra' object routines
> > 'driver-fast-path' object routines
> 
> I did understand the proposal.  My main concern was that it appeared
> that it would result in a very large function array, potentially
> with a significant number of NULL functions, associated with each
> driver.

Well, one way or another we need to build an efficient dispatch
between method + object_type.

I do not think there will be a lot of nulls; a major point of the
scheme was to avoid that sort of problem.
 1) Only object type ids that actually have functions would be
    allocated; unused object type ids cost 8 bytes.
 2) Each object has its own function table array, and each array can
    potentially be sized to the per-object maximum function
    ordinal. So minimal nulls here.
 3) Assign function ordinal numbers and object_types in a way that
    promotes dense packing, eg not just 'top 128 are driver-specific',
    but a demand based mixture.
 4) The table is allocated per-device and there is a small number of
    devices, so even if it is a few kB it is not a meaningful overhead.
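
The four points above can be sketched as a two-level table: a per-device array of pointers indexed by object type, each pointing at a method array sized to that object's highest ordinal. This is an illustration under assumed names (method_fn, obj_methods, dev_dispatch, lookup_method, dispatch_footprint), not the layout of any existing code; m_create and m_destroy are stand-in methods.

```c
#include <stddef.h>

typedef int (*method_fn)(void *arg);

/* Per-object method array, sized to the highest ordinal in use + 1. */
struct obj_methods {
	unsigned int num_methods;
	const method_fn *methods;
};

/* Per-device table: an unused object type costs one NULL pointer. */
struct dev_dispatch {
	unsigned int num_types;
	const struct obj_methods *const *types;
};

/* Resolve (object_type, method) to a function, or NULL if unsupported. */
static method_fn lookup_method(const struct dev_dispatch *d,
			       unsigned int object_type, unsigned int method)
{
	const struct obj_methods *om;

	if (object_type >= d->num_types)
		return NULL;
	om = d->types[object_type];
	if (!om || method >= om->num_methods)
		return NULL;
	return om->methods[method];
}

/* Bytes spent on pointer slots: type array plus each method array. */
static size_t dispatch_footprint(const struct dev_dispatch *d)
{
	size_t bytes = d->num_types * sizeof(d->types[0]);
	unsigned int i;

	for (i = 0; i < d->num_types; i++)
		if (d->types[i])
			bytes += d->types[i]->num_methods * sizeof(method_fn);
	return bytes;
}

/* Stand-in methods, used only to exercise the lookup. */
static int m_create(void *arg) { (void)arg; return 0; }
static int m_destroy(void *arg) { (void)arg; return 0; }
```

On a 64-bit build each pointer slot is 8 bytes, which is where the "unused object type ids cost 8 bytes" figure comes from in this model.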

> > Why do you feel cm/mgmt needs dedicated routines? I was going to model
> > CM as more objects and use the 'built-in extra' block to make CM
> > object specific calls (eg bind/etc)
> 
> I separated the cm/mgmt calls because I doubt a driver will ever
> override them, and some of the calls are system wide, versus being
> bound to a driver.

Right, this same scheme would be mirrored on the system-wide cdev (aka
rdma_cm) for that need. hfi1 also has a part of their uAPI that needs
this same functionality. :|

I'd probably just run it through the same basic code and flag some
objects as 'global OK' ?

> I had followed this, but wondered if it wouldn't be easier to just
> say, use structure 1 or structure 2.

I don't know for sure either.

It may be simple things use the same format with a 'fixed' layout with
the header and a single variable sized structure attribute, and works
the same as a v1/v2 scheme. A little bit of overhead for consistency.

Complex things handling addresses would probably need to be
multi-attribute.

Attributes are the natural way to pass driver specific information (eg
the udata), so I think a lot of the commands will actually turn out to
be multi-attribute naturally - I haven't done a study to see how often
this is used by drivers.

At first blush it does seem reasonable, as long as we don't go
overboard. Though, I am concerned about complexity parsing this kind
of structure - every time I've built something like this the
parsing turns out to be a royal pain. But 'comp_mask' isn't much better.

> A lot of the need for this complexity seems driven by treating all
> QPs as a single object, rather than separate objects.  Making that
> change might simplify things..?

We can certainly look at this, but we have to be careful any change
can still be made to look like the current model by libibverbs with
100% fidelity.

> Also I think we should consider reasonable optimizations for
> connecting QPs.  Doug and I had to debug apps that broke because the
> connection process was not completing quick enough.

This was discussed on the call as well..

I suspect as soon as you go to the network with any kind of packet the
small differences in API marshalling techniques are unimportant. Do you
see otherwise?

The need to create large number of AH's in a loop was brought up for
UD applications.

In any event, it is better that a driver implement a driver-specific
command for things which are truly performance sensitive. This would
let the driver wring out 100% of the possible performance.

Jason

* Re: Further thoughts on uAPI
       [not found]                 ` <20160425181953.GC7675-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-25 19:16                   ` Hefty, Sean
@ 2016-04-26 13:18                   ` Doug Ledford
       [not found]                     ` <571F6A8C.9080100-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Doug Ledford @ 2016-04-26 13:18 UTC (permalink / raw)
  To: Jason Gunthorpe, Hefty, Sean
  Cc: OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira



On 4/25/2016 2:19 PM, Jason Gunthorpe wrote:

> There has been a lot of talk of using a structure like netlink with a
> linked list of binary attributes and an optional/mandatory flag. For
> the lower speed stuff that seems reasonable, though it is certainly
> over-engineered for some commands.
> 
> So, a sketch would look like this:
> 
> struct msg
> {
>    uint16_t length;
>    uint16_t class_id;
>    uint32_t object_id; // in/out
>    struct qp_base_attr
>    {
>        uint16_t length;
>        uint16_t attribute_id;
> 
>        uint16_t qpn;  //in/out
>        uint16_t qp_flags;
>        uint16_t max_send_wr,max_recv_wr,max_send_sge,////
>    };
>    // Option to piggy back what ibv_modify_qp does:
>    struct qp_addr_ib
>    {
>        uint16_t length;
>        uint16_t attribute_id;
> 
>        uint16_t dlid,slid,sl,pkey,etc;
>    }
> }
> 
> msg.length = sizeof(msg);
> msg.class_id = RDMA_OBJ_QP_UD;
> msg.base.length = sizeof(msg.base);
> msg.base.attribute_id = RDMA_ATTR_QP_BASE;
> msg.base.qp_flags = XX
> [..]
> ioctl(fd,RDMA_CREATE_OBJECT,&msg);
> [..]
> ioctl(fd,RDMA_MODIFY_OBJECT,&msg2);

I think I would do it slightly differently.  In this example, the
class_id covers the entire list of commands.  It might be more desirable
if each command in the linked list was fully self-contained and
complete.  For the example of the cmtime program that's part of
librdmacm, when you run it with 1000s of connections as the test, being
able to pipeline 10, or 50, or 100 different commands would be useful to
the test.  Likewise, a verbs 2.0 application might want to build up a
chain of commands and pass the whole chain down in one ioctl.




* Re: Further thoughts on uAPI
       [not found]         ` <AM3PR05MB141161876D5CA05B1993D20CB1610-LOZWmgKjnYgmsg45OnKo/9qRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2016-04-26 14:19           ` Doug Ledford
       [not found]             ` <571F78F9.8010401-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Doug Ledford @ 2016-04-26 14:19 UTC (permalink / raw)
  To: Liran Liss, Hefty, Sean, Jason Gunthorpe, OFVWG,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA



On 4/24/2016 10:15 AM, Liran Liss wrote:

> For generic interfaces (currently includes Verbs, Ethernet QPs, and IB management), the new scheme should map what we have today in a flexible manner.
> This would enable us, for example, to pass only RoCE addressing attributes while modifying a RoCE QP (and optionally optimizing the kernel representation as well).
> These interfaces have a matching kAPI.

This sounds like something I was thinking as well.  Of course, abstract
ideas are sometimes less similar than you think, so putting something
concrete down can help make sure that people are actually thinking about
the same thing.

For certain operations that have lots of optional items (work requests
for one, work completions for another), the old method has been to stick
everything in one struct (which bloats it for most uses), or the extreme
opposite end of the spectrum was the recent timestamping API patches
that totally deconstructed the wc struct and rebuilt it from individual
elements and completely reordered.  Another approach to dealing with
this is tons of different structs (Christoph's work request struct rework).

Maybe we can reach a different arrangement.  I'm thinking of one base
struct that's versioned.  This base struct is the common items we always
need across the board:

struct __work_completion_common_v1 {
	union { // Only first for alignment reasons
		u64		wr_id;
		struct ib_cqe	*wr_cqe;
	};
	int	magic;	/* magic starts as the version # (= 1)
			      in the lower 8 bits, then we
			      add flags for optional struct
			      elements */
	enum ib_wc_status	status;
	enum ib_wc_opcode	opcode;
	u32			len;
};

#define rdma_wc __work_completion_common_v1

Then we create optional, additional structs.  Such as a specific struct
for each address type:

/* The common struct and option struct versions always match */
struct __work_completion_ib_addr_v1 {
	struct ib_qp		*qp;
	u32			src_qp;
	u16			pkey_index;
	u16			slid;
	u8			port_num;
	u8			reserved[3]; // preserve u64 alignment
};

#define rdma_wc_ib_addr __work_completion_ib_addr_v1

struct __work_completion_eth_addr_v1 {
	u8			smac[ETH_ALEN];
	u16			vlan_id;
};

#define rdma_wc_eth_addr __work_completion_eth_addr_v1

....

Additional optional struct items can then be defined, for things like
errors, immediate/invalidate data, timestamps, etc.

When building a wc, you start with the base struct, the magic is set to
the version, then for each optional element you add, you set a flag
field for that element in the magic item.  Optional element flags occupy
the upper 24 bits.  The length of the total struct is the length of the
base struct plus the length of all optional structs, and the order of
the optional structs matches their bit order from lowest to highest in
the magic element.  It's not quite as free form as the patches for
timestamp support were, but still allows the structs some flexibility in
what is included and what isn't.

When parsing the wc, you verify you have the right version first, then
you process what you need from the common struct, and if you have need
of it, process any additional stuff by walking the set bits in the magic
element to get to each optional struct item.
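
The build and parse rules above can be sketched as offset arithmetic on the magic value. Everything here is an assumption for illustration: the option bit numbers, the byte sizes in wc_opt_size, WC_BASE_SIZE, and the helper names wc_magic, wc_opt_offset and wc_total_len. Only the split (version in the low 8 bits, option flags in the upper 24, optional structs laid out in ascending bit order) comes from the description.

```c
#include <stdint.h>
#include <stddef.h>

#define WC_VERSION_MASK	0xffu	/* version lives in the low 8 bits */
#define WC_OPT_SHIFT	8	/* option flags occupy the upper 24 bits */
#define WC_BASE_SIZE	24u	/* hypothetical size of the common struct */

/* Hypothetical option bit positions, lowest bit is laid out first. */
enum {
	WC_OPT_IB_ADDR = 0,
	WC_OPT_ETH_ADDR = 1,
	WC_OPT_TIMESTAMP = 2,
	WC_OPT_COUNT
};

/* Hypothetical sizes in bytes of each optional struct. */
static const size_t wc_opt_size[WC_OPT_COUNT] = { 24, 8, 8 };

/* Combine a version number and a set of option flags. */
static uint32_t wc_magic(uint32_t version, uint32_t opt_flags)
{
	return (opt_flags << WC_OPT_SHIFT) | (version & WC_VERSION_MASK);
}

/* Byte offset of an optional struct, or -1 if it is not present. */
static long wc_opt_offset(uint32_t magic, unsigned int opt)
{
	uint32_t flags = magic >> WC_OPT_SHIFT;
	size_t off = WC_BASE_SIZE;
	unsigned int bit;

	if (opt >= WC_OPT_COUNT || !(flags & (1u << opt)))
		return -1;
	for (bit = 0; bit < opt; bit++)
		if (flags & (1u << bit))
			off += wc_opt_size[bit];
	return (long)off;
}

/* Total length, or 0 if an option bit this code does not know is set. */
static size_t wc_total_len(uint32_t magic)
{
	uint32_t flags = magic >> WC_OPT_SHIFT;
	size_t len = WC_BASE_SIZE;
	unsigned int bit;

	if (flags >> WC_OPT_COUNT)
		return 0;
	for (bit = 0; bit < WC_OPT_COUNT; bit++)
		if (flags & (1u << bit))
			len += wc_opt_size[bit];
	return len;
}
```

One consequence of this layout is that a parser has to know the size of every option with a lower bit number, so an unknown flag makes everything after it unreachable; the sketch reports that case as a zero length.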

Something like that can be applied to wcs, wrs, and where there is
enough variability to warrant it, other items as well.  Of course, if an
item doesn't vary all that much, then a single struct is still preferable.




* Re: Further thoughts on uAPI
       [not found]                     ` <571F6A8C.9080100-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-04-26 14:33                       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-26 14:33 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Hefty, Sean, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Weiny, Ira

On Tue, Apr 26, 2016 at 09:18:04AM -0400, Doug Ledford wrote:

> > So, a sketch would look like this:
> > 
> > struct msg
> > {
> >    uint16_t length;
> >    uint16_t class_id;
> >    uint32_t object_id; // in/out
> >    struct qp_base_attr
> >    {
> >        uint16_t length;
> >        uint16_t attribute_id;
> > 
> >        uint16_t qpn;  //in/out
> >        uint16_t qp_flags;
> >        uint16_t max_send_wr,max_recv_wr,max_send_sge,////
> >    };
> >    // Option to piggy back what ibv_modify_qp does:
> >    struct qp_addr_ib
> >    {
> >        uint16_t length;
> >        uint16_t attribute_id;
> > 
> >        uint16_t dlid,slid,sl,pkey,etc;
> >    }
> > }
> > 
> > msg.length = sizeof(msg);
> > msg.class_id = RDMA_OBJ_QP_UD;
> > msg.base.length = sizeof(msg.base);
> > msg.base.attribute_id = RDMA_ATTR_QP_BASE;
> > msg.base.qp_flags = XX
> > [..]
> > ioctl(fd,RDMA_CREATE_OBJECT,&msg);
> > [..]
> > ioctl(fd,RDMA_MODIFY_OBJECT,&msg2);
> 
> I think I would do it slightly differently.  In this example, the
> class_id covers the entire list of commands.

It isn't a list of commands, it is a list of attributes - this was
specifically exploring the idea Liran has talked about where create_qp
and modify_qp *as a special case* can be combined. This is not
chaining commands but atomically creating a qp with the full set of
attributes.

> if each command in the linked list was fully self-contained and
> complete.  For the example of the cmtime program that's part of
> librdmacm, when you run it with 1000s of connections as the test, being
> able to pipeline 10, or 50, or 100 different commands would be useful to
> the test.  Likewise, a verbs 2.0 application might want to build up a
> chain of commands and pass the whole chain down in one ioctl.

It would not be too hard to provide a linked list execution ioctl:

struct chain_hdr
{
    uint32_t command;
    struct chain_hdr *next;
    uint64_t msg[];
};
ioctl(fd,RDMA_EXECUTE_CHAINED_COMMANDS,&chain_hdr);

I think the kernel side would be fairly trivial.

Someone would have to figure out a way to use that from userspace of
course..
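As one possible answer to the userspace question, a minimal sketch of building such a chain with heap-allocated links follows. The struct matches the one above; the helpers (chain_append, chain_length, chain_free) are hypothetical names, and the sketch never issues the ioctl, since RDMA_EXECUTE_CHAINED_COMMANDS is only a proposal here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct chain_hdr {
    uint32_t command;
    struct chain_hdr *next;
    uint64_t msg[];          /* per-command payload */
};

/*
 * Allocate one link carrying msg_words 64-bit words and append it to the
 * chain. *head and *tail must both be NULL for an empty chain.
 * Returns the new link, or NULL on allocation failure.
 */
static struct chain_hdr *chain_append(struct chain_hdr **head,
                                      struct chain_hdr **tail,
                                      uint32_t command,
                                      const uint64_t *msg, size_t msg_words)
{
    struct chain_hdr *link;

    link = malloc(sizeof(*link) + msg_words * sizeof(uint64_t));
    if (!link)
        return NULL;

    link->command = command;
    link->next = NULL;
    if (msg_words)
        memcpy(link->msg, msg, msg_words * sizeof(uint64_t));

    if (*tail)
        (*tail)->next = link;
    else
        *head = link;
    *tail = link;
    return link;
}

/* Number of commands queued in the chain. */
static size_t chain_length(const struct chain_hdr *head)
{
    size_t n = 0;

    for (; head; head = head->next)
        n++;
    return n;
}

/* Release every link once the chain has been submitted. */
static void chain_free(struct chain_hdr *head)
{
    while (head) {
        struct chain_hdr *next = head->next;

        free(head);
        head = next;
    }
}
```

A caller would queue commands with chain_append() and pass the head pointer to the proposed ioctl in a single call. Note that chain_hdr carries no payload length, so in this sketch the kernel would have to infer it from the command or from a length field inside msg.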

Jason

* Re: Further thoughts on uAPI
       [not found]             ` <571F78F9.8010401-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-04-26 14:58               ` Jason Gunthorpe
       [not found]                 ` <20160426145813.GB24104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-26 14:58 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Liran Liss, Hefty, Sean, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 26, 2016 at 10:19:37AM -0400, Doug Ledford wrote:
> For certain operations that have lots of optional items (work requests
> for one, work completions for another)

FWIW, I think we had a general consensus to take a different approach.

Basically, the 'common' uAPI does not care about micro-performance.

Drivers have to implement hardware-specific driver calls to micro-optimize
their own high speed paths, and that would be done specifically with a
single hardware in mind.

This is already done by the majority of drivers for wc/wr processing
(IIRC, only qib calls to the kernel for this)

If we do provide a common wr/wc API then it can just be designed
inefficiently around the netlink attribute architecture, uncaring
about performance because nothing should use it. I'd prefer not to
implement it at all...

This same basic idea flows over to other parts, eg if a driver has
special support for a specific work load (say fast creation of IB UD
AHs) then it can have a high speed driver-specific call to do that
work completely micro-optimized using data formed *exactly* the way
the hardware needs.

> base struct plus the length of all optional structs, and the order of
> the optional structs matches their bit order from lowest to highest in
> the magic element.  It's not quite as free form as the patches for
> timestamp support were, but still allows the structs some flexibility in
> what is included and what isn't.

Mellanox has a patch series that tries to do exactly this for the wc
in libibverbs - it is quite ugly, and the benchmarks showed worse
performance compared to the current technique.

For the reasons above I would prefer to stick entirely with the
netlink attribute format or very similar as the main mechanism.

Jason

* Re: Further thoughts on uAPI
       [not found]                 ` <20160426145813.GB24104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-04-26 16:38                   ` Doug Ledford
       [not found]                     ` <571F9968.3080501-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-04-26 16:46                   ` Hefty, Sean
  1 sibling, 1 reply; 25+ messages in thread
From: Doug Ledford @ 2016-04-26 16:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Liran Liss, Hefty, Sean, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA



On 4/26/2016 10:58 AM, Jason Gunthorpe wrote:
> On Tue, Apr 26, 2016 at 10:19:37AM -0400, Doug Ledford wrote:
>> For certain operations that have lots of optional items (work requests
>> for one, work completions for another)
> 
> FWIW, I think we had a general consensus to take a different approach.
> 
> Basically, the 'common' uAPI does not care about micro-performance.
> 
> Drivers have to implement hardware-specific driver calls to micro-optimize
> their own high speed paths, and that would be done specifically with a
> single hardware in mind.
> 
> This is already done by the majority of drivers for wc/wr processing
> (IIRC, only qib calls to the kernel for this)
> 
> If we do provide a common wr/wc API then it can just be designed
> inefficiently around the netlink attribute architecture, uncaring
> about performance because nothing should use it. I'd prefer not to
> implement it at all...

We're talking about two different things.  I had the actual user space
API on my mind when I wrote what I wrote (aka, libibverbs).  If we are
going to talk about the verbs 2.0 kernel interface, then it makes sense
to me to keep the user space API firmly in mind too.  Although it would
be great if the user space verbs never changed a bit, that isn't
entirely possible.  The timestamp changes that are still waiting are an
example.  Currently, I'm not real happy with how the extension mechanism
in libibverbs has played out.  The intent was good, the reality is
clunky IMO.

> This same basic idea flows over to other parts, eg if a driver has
> special support for a specific work load (say fast creation of IB UD
> AHs) then it can have a high speed driver-specific call to do that
> work completely micro-optimized using data formed *exactly* the way
> the hardware needs.
> 
>> base struct plus the length of all optional structs, and the order of
>> the optional structs matches their bit order from lowest to highest in
>> the magic element.  It's not quite as free form as the patches for
>> timestamp support were, but still allows the structs some flexibility in
>> what is included and what isn't.
> 
> Mellanox has a patch series that tries to do exactly this for the wc
> in libibverbs - it is quite ugly, and the benchmarks showed worse
> performance compared to the current technique.

Right, but it completely rewrote the struct from scratch for each WC.
That's different than what I posted, which was more along the lines of
functional groupings.  It reduces the number of conditionals while still
reducing the overall struct size for anything other than "we have every
option turned on" case.

There are only a few options for how to expose these things to user
space (using the example I gave as a further talking point):

1) Grow struct ibv_wc for every new option.  This totally prevents any
ability to either remove items or reorder items in this struct.  It also
bloats the size of the user visible struct for the common case.

2) Do away with direct variable access and go to indirect variable
access via accessor functions.  This will work, and has the advantage
that accessor functions can work directly with the hardware specific
structs, thereby eliminating a copy from the hardware struct to the user
visible API struct.  It has the disadvantage that every item you wish to
access will require an indirect function call from a table.

3) Do like the Mellanox patches did and completely rewrite the
completion struct for each completion.  Changing ordering and everything
else based on what's there.  As you pointed out, this had performance
issues.

4) What I wrote, which was intended to be a compromise between #1 and #3
to hopefully help with performance issues.
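A minimal sketch of how option 4 could be laid out, assuming a bitmask "magic" field in the base struct and optional groups appended in bit order from lowest to highest. The group names, their sizes, and the wc_* helpers are hypothetical and chosen only for illustration; they are not from libibverbs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical optional groups; one bit each in the magic element. */
enum wc_opt {
    WC_OPT_TIMESTAMP = 1u << 0,
    WC_OPT_FLOW_TAG  = 1u << 1,
    WC_OPT_CHECKSUM  = 1u << 2,
};
#define WC_OPT_COUNT 3

/* Assumed size in bytes of each optional group, indexed by bit number. */
static const size_t wc_opt_size[WC_OPT_COUNT] = { 8, 4, 4 };

/* Fixed part that every completion carries. */
struct wc_base {
    uint32_t magic;   /* bitmask of the optional groups that follow */
    uint32_t status;
    uint64_t wr_id;
};

/* Total size: base struct plus the length of all optional groups present. */
static size_t wc_total_size(uint32_t magic)
{
    size_t size = sizeof(struct wc_base);
    unsigned bit;

    for (bit = 0; bit < WC_OPT_COUNT; bit++)
        if (magic & (1u << bit))
            size += wc_opt_size[bit];
    return size;
}

/*
 * Byte offset of one optional group from the start of the completion.
 * Groups are stored in bit order, so the offset is the base size plus the
 * sizes of all present groups with a lower bit. Returns 0 if the group is
 * absent or opt is not a known single bit.
 */
static size_t wc_opt_offset(uint32_t magic, uint32_t opt)
{
    size_t off = sizeof(struct wc_base);
    unsigned bit;

    if (!(magic & opt))
        return 0;
    for (bit = 0; bit < WC_OPT_COUNT; bit++) {
        if ((1u << bit) == opt)
            return off;
        if (magic & (1u << bit))
            off += wc_opt_size[bit];
    }
    return 0;
}
```

Since the magic value would normally be fixed when the CQ is created, the offsets can be computed once and cached rather than recomputed per completion, which is where the reduction in conditionals would come from. The sketch ignores alignment beyond what the chosen sizes happen to give.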

> For the reasons above I would prefer to stick entirely with the
> netlink attribute format or very similar as the main mechanism.

I don't intend to expose anything netlink to libibverbs users ;-)

Like I said, we're talking about different things.




* RE: Further thoughts on uAPI
       [not found]                 ` <20160426145813.GB24104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-04-26 16:38                   ` Doug Ledford
@ 2016-04-26 16:46                   ` Hefty, Sean
  1 sibling, 0 replies; 25+ messages in thread
From: Hefty, Sean @ 2016-04-26 16:46 UTC (permalink / raw)
  To: Jason Gunthorpe, Doug Ledford
  Cc: Liran Liss, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA

> If we do provide a common wr/wc API then it can just be designed
> inefficiently around the netlink attribute architecture, uncaring
> about performance because nothing should use it. I'd prefer not to
> implement it at all...

I agree with this.  WR/WC should be driver specific.  There's no need to mirror a user space API to a kernel uABI and then to a kernel API.
 

* Re: Further thoughts on uAPI
       [not found]                     ` <571F9968.3080501-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-04-26 16:54                       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2016-04-26 16:54 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Liran Liss, Hefty, Sean, OFVWG, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 26, 2016 at 12:38:00PM -0400, Doug Ledford wrote:
> We're talking about two different things.  I had the actual user space
> API on my mind when I wrote what I wrote (aka, libibverbs).

This will become very confusing if we don't focus on one thing..

How to fix libibverbs's API is a totally different problem, only very
loosely related to fixing the uAPI. Whatever new uAPI we come up with
must support current libibverbs with no loss during translation.

Tackling a verbs 2.0 along with the uAPI project is too big, IMHO.

It is pretty obvious to me we don't want to retain the current near
1:1 mapping of libibverbs calls and kernel calls.

> There are only a few options for how to expose these things to user
> space (using the example I gave as a further talking point):

AFAIK Mellanox is working on benchmarking these options, so perhaps we
will have some data someday.

Jason
