* AF_NETDEV - device specific sockets
@ 2015-02-01  4:20 Zayats, Michael
  2015-02-01  4:41 ` John Fastabend
  0 siblings, 1 reply; 5+ messages in thread
From: Zayats, Michael @ 2015-02-01  4:20 UTC (permalink / raw)
  To: netdev

Hi,

I am looking for a generic mechanism that would allow network device drivers to provide a socket interface to user-space and kernel-space clients.

Such an interface could provide access to important sub-streams of packets, along with device-specific packet metadata passed through the msg_control fields of recvmsg/sendmsg.

RX metadata might include device-specific information, such as the queuing priority applied or, for switching hardware, the potential destination interface.

On transmit, metadata might indicate hardware-specific optimizations, as well as any other transformation or accounting required for the packet.

An AF_PACKET-based mechanism doesn't allow metadata to be exchanged between the client and the device driver.
Extending it would require extending sk_buff and potentially adding per-packet operations.
Generic Netlink is not intended to pass packets.

While trying to validate the general applicability of such a mechanism, I see that the TUN driver provides a custom socket interface in order to handle user information through msg_control.
It is only usable inside the kernel, through a custom interface.

Proposed interface
------------------
Kernel side: 
(struct proto *) should be added to struct net_device.
A device driver that wants to support the socket interface would populate the pointer.

User space:
After creating an AF_NETDEV socket, the only operation that would succeed is setting the SO_BINDTODEVICE option.
Once it is set, all socket operations would be implemented by calling the functions registered in the struct proto of the bound net_device.
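
The dispatch described above can be modelled in plain user-space C. This is only a sketch of the idea, not kernel code: struct proto_model, struct netdev_model, struct netdev_sock and every function below are hypothetical stand-ins, and AF_NETDEV itself does not exist in the kernel. The model shows the two properties of the proposal: a socket is unusable until it is bound to a device, and after binding every operation is forwarded to the handlers that the device registered.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct proto: two handlers only. */
struct proto_model {
    int (*sendmsg)(void *priv, const void *buf, size_t len);
    int (*recvmsg)(void *priv, void *buf, size_t len);
};

/* Hypothetical stand-in for struct net_device with the proposed pointer. */
struct netdev_model {
    const char *name;
    const struct proto_model *sock_proto; /* NULL: no socket interface */
    void *priv;                           /* driver-private state */
};

/* Hypothetical AF_NETDEV socket: dev stays NULL until it is bound. */
struct netdev_sock {
    struct netdev_model *dev;
};

/* Models SO_BINDTODEVICE: look the device up by name and attach it. */
static int netdev_sock_bind(struct netdev_sock *sk, struct netdev_model *devs,
                            size_t ndevs, const char *name)
{
    assert(sk && devs && name);
    for (size_t i = 0; i < ndevs; i++) {
        if (strcmp(devs[i].name, name) != 0)
            continue;
        if (!devs[i].sock_proto)
            return -EOPNOTSUPP; /* driver did not populate the pointer */
        sk->dev = &devs[i];
        return 0;
    }
    return -ENODEV;
}

/* After binding, the call is forwarded to the device's own handler. */
static int netdev_sock_recvmsg(struct netdev_sock *sk, void *buf, size_t len)
{
    if (!sk->dev)
        return -ENOTCONN; /* nothing works before the bind */
    if (!sk->dev->sock_proto->recvmsg)
        return -EOPNOTSUPP;
    return sk->dev->sock_proto->recvmsg(sk->dev->priv, buf, len);
}

/* Demo driver handler: hands out the bytes stored in priv as one packet. */
static int demo_recvmsg(void *priv, void *buf, size_t len)
{
    const char *pkt = priv;
    size_t n = strlen(pkt);

    if (n > len)
        n = len;
    memcpy(buf, pkt, n);
    return (int)n;
}

static const struct proto_model demo_proto = {
    .sendmsg = NULL,
    .recvmsg = demo_recvmsg,
};
```

Binding to a device whose sock_proto is NULL fails in this model, which mirrors a driver that chose not to offer the socket interface.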

What do you think?
Do you see a better approach?
Is there some other mechanism that already exists for this purpose?

Thanks,

Michael


* Re: AF_NETDEV - device specific sockets
  2015-02-01  4:20 AF_NETDEV - device specific sockets Zayats, Michael
@ 2015-02-01  4:41 ` John Fastabend
  2015-02-01  5:04   ` Zayats, Michael
  0 siblings, 1 reply; 5+ messages in thread
From: John Fastabend @ 2015-02-01  4:41 UTC (permalink / raw)
  To: Zayats, Michael; +Cc: netdev

On 01/31/2015 08:20 PM, Zayats, Michael wrote:
> Hi,
>
> I am looking for a generic mechanism that would allow network device
> drivers to provide socket interface to user and kernel space
> clients.
>
> Such an interface might be used to provide access to important
> sub-streams of packets, alongside with device specific packet
> metadata, provided through msg_control fields of recv/sendmsg.
>
> RX Metadata might include device specific information, such as
> queuing priorities applied, potential destination interface in case
> of switching hardware etc.
>
> On the transmission, metadata might be used to indicate hardware
> specific required optimizations, as well as any other transformation
> or accounting required on the packet.
>
> AF_PACKET based mechanism doesn't allow metadata to be exchanged
> between the client and the device driver. Extending it would require
> extending of sk_buff and potentially additional per packet
> operations. Generic Netlink is not intended to pass packets.
>
> As I am trying to validate generic applicability of such a mechanism,
> I see that TUN driver is providing custom socket interface, in order
> to deal with user information through msg_control. Only usable inside
> the kernel, through custom interface.

> Proposed interface
> ------------------
> Kernel side:
> (struct proto *) should be added to struct net_device.
> Device driver that is interested to support socket interface would populate the pointer.
>

> User space: After creating AF_NETDEV socket, the only successful
> operation would be setting SO_BINDTODEVICE option. Once set, all
> socket operations would be implemented by calling functions, that are
> registered at struct proto on the appropriate net_device.
>
> What do you think?
> Would you see a better approach?
> Some other mechanism that already exists for such a purpose?

It might help to come up with specific examples, but an alternate
proposal would be to use the skb->priority field and then mqprio to
steer the traffic to a specific queue, and then bind attributes to
the queue.

For example, the NIC's offloaded QoS can be mapped onto queues and
then sockets mapped to the queues.

Another example would be to forward all traffic from one queue
to a virtual function in the SR-IOV use case. We don't have an
interface to do this, but I have been working on an API that could
be used for this.

In this case you don't need to modify the AF_PACKET interface, only
configure the device correctly. If you need per-packet control
you could use 'tc' or 'nftables' to do the steering.

.John

-- 
John Fastabend         Intel Corporation


* RE: AF_NETDEV - device specific sockets
  2015-02-01  4:41 ` John Fastabend
@ 2015-02-01  5:04   ` Zayats, Michael
  2015-02-02 18:04     ` John Fastabend
  0 siblings, 1 reply; 5+ messages in thread
From: Zayats, Michael @ 2015-02-01  5:04 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev

A more specific example would be a NIC that performs certain fast-path processing,
while punting packets to the CPU for the slow path.
The slow path would want to know the punt reason.

Another example would be a NIC that strips the S-tag in the QinQ case and would like to communicate the stripped
tag to the client.

There might be many types of custom functionality, agreed between the NIC and its clients,
that are not generic or practical enough for inclusion in the kernel.

That's why I am looking for a generic, socket-like mechanism for device<->client communication of packets plus metadata,
which wouldn't require core kernel modification.

Thanks,

Michael







* Re: AF_NETDEV - device specific sockets
  2015-02-01  5:04   ` Zayats, Michael
@ 2015-02-02 18:04     ` John Fastabend
  2015-02-04  5:51       ` Zayats, Michael
  0 siblings, 1 reply; 5+ messages in thread
From: John Fastabend @ 2015-02-02 18:04 UTC (permalink / raw)
  To: Zayats, Michael; +Cc: netdev, dborkman

On 01/31/2015 09:04 PM, Zayats, Michael wrote:
> More specific example would be when NIC performs certain fast path processing,
> while punting to the CPU for a slow path.
> Slow path would be interested to know the punt reason.
>
> Another example would be if specific NIC strips S-tag in QinQ case and would like to communicate the stripped
> Tag to the client.
>

Right, maybe we need some sort of TLV scheme to pass up the relevant
info. I'm not sure we want to necessarily bury it in the driver though.
Perhaps passing auxdata in a TLV format is worth considering.
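
To make the TLV idea concrete, here is a minimal sketch of one possible encoding. Nothing here is an existing kernel format: the type values META_PUNT_REASON and META_STRIPPED_STAG, struct tlv_hdr, the 4-byte padding rule and the helpers tlv_put and tlv_find are all assumptions chosen for illustration. Values are stored in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata types taken from the examples in this thread. */
enum { META_PUNT_REASON = 1, META_STRIPPED_STAG = 2 };

struct tlv_hdr {
    uint16_t type;
    uint16_t len; /* value bytes only, without header or padding */
};

/* Round a value length up to the next multiple of 4 bytes. */
#define TLV_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

/*
 * Append one TLV at offset off.
 * Returns the new offset, or 0 if the entry does not fit in cap bytes.
 */
static size_t tlv_put(uint8_t *buf, size_t cap, size_t off,
                      uint16_t type, const void *val, uint16_t len)
{
    struct tlv_hdr h = { type, len };
    size_t need = sizeof(h) + TLV_ALIGN(len);

    assert(buf && val);
    if (off > cap || cap - off < need)
        return 0;
    memcpy(buf + off, &h, sizeof(h));
    memcpy(buf + off + sizeof(h), val, len);
    /* Zero the padding so no stale bytes leak to the reader. */
    memset(buf + off + sizeof(h) + len, 0, need - sizeof(h) - len);
    return off + need;
}

/*
 * Find the first TLV of the given type in the first used bytes of buf.
 * Returns a pointer to the value and stores its length, or NULL.
 */
static const uint8_t *tlv_find(const uint8_t *buf, size_t used,
                               uint16_t type, uint16_t *len)
{
    size_t off = 0;

    while (used - off >= sizeof(struct tlv_hdr)) {
        struct tlv_hdr h;
        size_t total;

        memcpy(&h, buf + off, sizeof(h));
        total = sizeof(h) + TLV_ALIGN(h.len);
        if (total > used - off)
            return NULL; /* truncated entry */
        if (h.type == type) {
            *len = h.len;
            return buf + off + sizeof(h);
        }
        off += total;
    }
    return NULL;
}
```

A reader that does not recognise a type can skip it using the length alone, which is the property that lets the driver and the client evolve independently.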

Just curious, do you have NICs that are stripping or inserting more than
a single tag?

For tagging, my current scheme is to strip outer tags using this
experimental Flow API

	http://www.spinics.net/lists/netdev/msg313071.html

and then only report the inner tag to the stack. At the moment I haven't
found any use cases for which this is not sufficient.

> There might be many types of custom functionality, agreed between the NIC and the clients,
> which is not generic or not practical enough for inclusion in the kernel.
>
> That's why I am looking for a generic, socket like mechanism of device<->client, packet + metadata communication,
> which wouldn't require core kernel modification.

Hmm, the question is how the NIC and the client "agree" on the
format of the data and its meaning. If you follow the thread
above and also our af_packet direct DMA work, we are struggling
with similar questions:

	http://www.spinics.net/lists/netdev/msg311862.html

I think we need some way to "describe" the meta-data or we
need to build some kernel/uapi standard that defines them.

.John



-- 
John Fastabend         Intel Corporation


* RE: AF_NETDEV - device specific sockets
  2015-02-02 18:04     ` John Fastabend
@ 2015-02-04  5:51       ` Zayats, Michael
  0 siblings, 0 replies; 5+ messages in thread
From: Zayats, Michael @ 2015-02-04  5:51 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev, dborkman

> > More specific example would be when NIC performs certain fast path
> > processing, while punting to the CPU for a slow path.
> > Slow path would be interested to know the punt reason.
> >
> > Another example would be if specific NIC strips S-tag in QinQ case and
> > would like to communicate the stripped Tag to the client.
> >
> 
> Right, maybe we need some sort of TLV scheme to pass up the relevant
> info. I'm not sure we want to necessarily bury it in the driver though.
> Perhaps passing auxdata in a TLV format is worth considering.

The CMSG formatting in a socket's msg_control is pretty close, right?
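
For comparison, this is roughly what the two metadata items from this thread would look like as control messages, using only the standard CMSG_* macros. The level SOL_NETDEV_DEMO and the types NETDEV_PUNT_REASON and NETDEV_STRIPPED_STAG are invented for this sketch; no existing socket family defines them, so the code only builds and walks the buffer in memory and never passes it to sendmsg or recvmsg.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical cmsg level and types, not defined by any kernel header. */
#define SOL_NETDEV_DEMO      0x7fff
#define NETDEV_PUNT_REASON   1
#define NETDEV_STRIPPED_STAG 2

/*
 * Fill ctrl with two control messages and attach it to msg.
 * ctrl must be aligned for struct cmsghdr. Returns msg_controllen.
 */
static size_t build_metadata(struct msghdr *msg, void *ctrl, size_t cap,
                             uint32_t reason, uint16_t stag)
{
    size_t need = CMSG_SPACE(sizeof(reason)) + CMSG_SPACE(sizeof(stag));
    struct cmsghdr *c;

    assert(cap >= need);
    memset(ctrl, 0, cap); /* CMSG_NXTHDR expects zeroed headers */
    msg->msg_control = ctrl;
    msg->msg_controllen = need;

    c = CMSG_FIRSTHDR(msg);
    c->cmsg_level = SOL_NETDEV_DEMO;
    c->cmsg_type = NETDEV_PUNT_REASON;
    c->cmsg_len = CMSG_LEN(sizeof(reason));
    memcpy(CMSG_DATA(c), &reason, sizeof(reason));

    c = CMSG_NXTHDR(msg, c);
    assert(c != NULL);
    c->cmsg_level = SOL_NETDEV_DEMO;
    c->cmsg_type = NETDEV_STRIPPED_STAG;
    c->cmsg_len = CMSG_LEN(sizeof(stag));
    memcpy(CMSG_DATA(c), &stag, sizeof(stag));

    return need;
}

/* Copy the payload of the first matching cmsg into out; 0 on success. */
static int find_metadata(struct msghdr *msg, int type, void *out, size_t len)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level != SOL_NETDEV_DEMO || c->cmsg_type != type)
            continue;
        if (c->cmsg_len != CMSG_LEN(len))
            return -1; /* unexpected payload size */
        memcpy(out, CMSG_DATA(c), len);
        return 0;
    }
    return -1;
}
```

Each cmsghdr carries a length, a level and a type followed by padded data, so the layout is effectively a TLV list with a two-part type.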

> 
> Just curious do you have NICs that are stripping or inserting more than
> a single tag?

No, just a single tag.

> 
> For tagging my current scheme is to strip outer tags using this
> experimental Flow  API
> 
> 	http://www.spinics.net/lists/netdev/msg313071.html
> 
> and then only report the inner tag to the stack. At the moment I haven't
> found any use cases this is not sufficient.
> 
> > There might be many types of custom functionality, agreed between the
> > NIC and the clients, which is not generic or not practical enough for
> > inclusion in the kernel.
> >
> > That's why I am looking for a generic, socket like mechanism of
> > device<->client, packet + metadata communication, which wouldn't
> > require core kernel modification.
> 
> hmm the question is how do the NIC and client "agree" on the format of
> the data and its meaning? If you follow the thread above and also our
> af_packet direct DMA work we are struggling with similar questions,
> 
> 	http://www.spinics.net/lists/netdev/msg311862.html
> 
> I think we need some way to "describe" the meta-data or we need to build
> some kernel/uapi standard that defines them.
> 

Agreed, some kind of standard way to describe the metadata is needed.
However, I don't think it should preclude us from having a generic mechanism for communicating custom metadata alongside the packets, for cases where the client knows which NIC type it is talking to.
So far, all the custom drivers I have seen that needed this ended up exposing a character device and interposing a custom header between the packets. This is the same approach that TUN takes for virtio.


