[net-next,0/3] support changing steering policies in tuntap

Message ID 1506500637-13881-1-git-send-email-jasowang@redhat.com

Message

Jason Wang Sept. 27, 2017, 8:23 a.m. UTC
Hi all:

We currently use a flow-cache based flow steering policy. This works
well for connection-oriented communication such as TCP, but not for
others, e.g. connectionless unidirectional workloads that care only
about pps. This calls for the ability to change steering policies in
tuntap, which is what this series adds.

The flow steering policy is abstracted into tun_steering_ops in the first
patch. New ioctls to set or query the current policy are then introduced,
and the last patch adds a very simple policy that selects the txq based
on the processor id as an example.
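
For illustration, the abstraction and ioctls might look roughly like the
sketch below. Apart from the tun_steering_ops name, every identifier and
ioctl number here is an assumption made up for this example, not the
actual diff:

/* A minimal sketch of the steering hook; not the actual patch. */
struct tun_steering_ops {
	u16 (*select_queue)(struct tun_struct *tun, struct sk_buff *skb);
};

/* Example policy: pick the txq from the current processor id. */
static u16 tun_cpu_select_queue(struct tun_struct *tun, struct sk_buff *skb)
{
	return raw_smp_processor_id() % tun->numqueues;
}

static const struct tun_steering_ops tun_cpu_steering_ops = {
	.select_queue = tun_cpu_select_queue,
};

/* Hypothetical uapi additions (names and numbers are made up). */
#define TUNSETSTEERING	_IOW('T', 230, unsigned int)
#define TUNGETSTEERING	_IOR('T', 230, unsigned int)
#define TUN_STEERING_AUTOMQ	0	/* existing flow-cache based policy */
#define TUN_STEERING_CPU	1	/* processor-id based policy */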

Testing was done by using xdp_redirect to redirect traffic generated by
MoonGen running on a remote machine. I see a 37% improvement for the
processor-id policy compared to the automatic flow steering policy.

In the future, both simple and sophisticated policies such as RSS or other
guest-driven steering policies could be built on top.

Thanks

Jason Wang (3):
  tun: abstract flow steering logic
  tun: introduce ioctls to set and get steering policies
  tun: introduce cpu id based steering policy

 drivers/net/tun.c           | 151 +++++++++++++++++++++++++++++++++++++-------
 include/uapi/linux/if_tun.h |   8 +++
 2 files changed, 136 insertions(+), 23 deletions(-)

Comments

Michael S. Tsirkin Sept. 27, 2017, 10:13 p.m. UTC | #1
On Wed, Sep 27, 2017 at 04:23:54PM +0800, Jason Wang wrote:
> Hi all:
> 
> We currently use a flow-cache based flow steering policy. This works
> well for connection-oriented communication such as TCP, but not for
> others, e.g. connectionless unidirectional workloads that care only
> about pps. This calls for the ability to change steering policies in
> tuntap, which is what this series adds.
> 
> The flow steering policy is abstracted into tun_steering_ops in the first
> patch. New ioctls to set or query the current policy are then introduced,
> and the last patch adds a very simple policy that selects the txq based
> on the processor id as an example.
> 
> Testing was done by using xdp_redirect to redirect traffic generated by
> MoonGen running on a remote machine. I see a 37% improvement for the
> processor-id policy compared to the automatic flow steering policy.

For sure, if you don't need to figure out the flow hash then you can
save a bunch of cycles.  But I don't think the cpu policy is too
practical outside of a benchmark.

Did you generate packets and just send them to tun? If so, this is not a
typical configuration, is it? With packets coming e.g.  from a real nic
they might already have the hash pre-calculated, and you won't
see the benefit.

> In the future, both simple and sophisticated policies such as RSS or other
> guest-driven steering policies could be built on top.

IMHO there should be a more practical example before adding all this
indirection. And it would be nice to understand why this queue selection
needs to be tun specific.

> Thanks
> 
> Jason Wang (3):
>   tun: abstract flow steering logic
>   tun: introduce ioctls to set and get steering policies
>   tun: introduce cpu id based steering policy
> 
>  drivers/net/tun.c           | 151 +++++++++++++++++++++++++++++++++++++-------
>  include/uapi/linux/if_tun.h |   8 +++
>  2 files changed, 136 insertions(+), 23 deletions(-)
> 
> -- 
> 2.7.4
Willem de Bruijn Sept. 27, 2017, 11:25 p.m. UTC | #2
>> In the future, both simple and sophisticated policies such as RSS or other
>> guest-driven steering policies could be built on top.
>
> IMHO there should be a more practical example before adding all this
> indirection. And it would be nice to understand why this queue selection
> needs to be tun specific.

I was thinking the same and this reminds me of the various strategies
implemented in packet fanout. tun_cpu_select_queue is analogous to
fanout_demux_cpu though it is tun-specific in that it requires tun->numqueues.

Fanout accrued various strategies until it gained an eBPF variant. Just
supporting BPF is probably sufficient here, too.
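
For reference, the packet fanout CPU strategy is essentially a one-liner
(lightly paraphrased from net/packet/af_packet.c); a tun variant would
differ mainly in taking the queue count from tun->numqueues:

static unsigned int fanout_demux_cpu(struct packet_fanout *f,
				     struct sk_buff *skb,
				     unsigned int num)
{
	return smp_processor_id() % num;
}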
Tom Herbert Sept. 28, 2017, 5:02 a.m. UTC | #3
On Wed, Sep 27, 2017 at 4:25 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>>> In the future, both simple and sophisticated policies such as RSS or other
>>> guest-driven steering policies could be built on top.
>>
>> IMHO there should be a more practical example before adding all this
>> indirection. And it would be nice to understand why this queue selection
>> needs to be tun specific.
>
> I was thinking the same and this reminds me of the various strategies
> implemented in packet fanout. tun_cpu_select_queue is analogous to
> fanout_demux_cpu though it is tun-specific in that it requires tun->numqueues.
>
> Fanout accrued various strategies until it gained an eBPF variant. Just
> supporting BPF is probably sufficient here, too.

+1, in addition to packet fanout, we have SO_REUSEPORT with BPF, RPS,
RFS, etc. It would be nice if existing packet steering mechanisms
could be leveraged for tun.
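
As an aside, SO_REUSEPORT steering with classic BPF already lets userspace
pick the receiving socket per packet. A sketch, assuming one reuseport
socket per CPU and a libc that exposes SO_ATTACH_REUSEPORT_CBPF:

#include <linux/filter.h>
#include <sys/socket.h>

static int attach_cpu_steering(int fd)
{
	struct sock_filter code[] = {
		/* A = id of the CPU handling this packet */
		{ BPF_LD | BPF_W | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_CPU },
		/* return A as the index into the reuseport group */
		{ BPF_RET | BPF_A, 0, 0, 0 },
	};
	struct sock_fprog prog = {
		.len = sizeof(code) / sizeof(code[0]),
		.filter = code,
	};

	/* an out-of-range index falls back to the default hash selection */
	return setsockopt(fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF,
			  &prog, sizeof(prog));
}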
Jason Wang Sept. 28, 2017, 6:50 a.m. UTC | #4
On 2017/09/28 06:13, Michael S. Tsirkin wrote:
> On Wed, Sep 27, 2017 at 04:23:54PM +0800, Jason Wang wrote:
>> Hi all:
>>
>> We currently use a flow-cache based flow steering policy. This works
>> well for connection-oriented communication such as TCP, but not for
>> others, e.g. connectionless unidirectional workloads that care only
>> about pps. This calls for the ability to change steering policies in
>> tuntap, which is what this series adds.
>>
>> The flow steering policy is abstracted into tun_steering_ops in the first
>> patch. New ioctls to set or query the current policy are then introduced,
>> and the last patch adds a very simple policy that selects the txq based
>> on the processor id as an example.
>>
>> Testing was done by using xdp_redirect to redirect traffic generated by
>> MoonGen running on a remote machine. I see a 37% improvement for the
>> processor-id policy compared to the automatic flow steering policy.
> For sure, if you don't need to figure out the flow hash then you can
> save a bunch of cycles.  But I don't think the cpu policy is too
> practical outside of a benchmark.

Well, the aim of the series is to add methods for changing the steering
policy; the cpu policy is just an example. Actually, it may make sense for
some cards which guarantee that all packets belonging to a specific flow
go to a specific cpu.

>
> Did you generate packets and just send them to tun? If so, this is not a
> typical configuration, is it?

The test was done by:

- generating UDP traffic on a remote machine
- using xdp redirect to do a mac swap in the guest and forwarding the
traffic back to the remote machine (sketch below)
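
A minimal sketch of such an XDP program, assuming the egress ifindex is
known at build time (the real xdp_redirect sample is more general):

/* Sketch only; compile with clang -O2 -target bpf. */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

#define TX_IFINDEX 2	/* assumed ifindex of the device to send back out of */

SEC("xdp")
int xdp_macswap_redirect(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;
	unsigned char tmp[ETH_ALEN];

	if ((void *)(eth + 1) > data_end)
		return XDP_DROP;

	/* swap source and destination MACs so the frame goes back to the sender */
	__builtin_memcpy(tmp, eth->h_source, ETH_ALEN);
	__builtin_memcpy(eth->h_source, eth->h_dest, ETH_ALEN);
	__builtin_memcpy(eth->h_dest, tmp, ETH_ALEN);

	return bpf_redirect(TX_IFINDEX, 0);
}

char _license[] SEC("license") = "GPL";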

>   With packets coming e.g.  from a real nic
> they might already have the hash pre-calculated, and you won't
> see the benefit.

Yes, I can switch to using this as an example policy.

Thanks

>
>> In the future, both simple and sophisticated policies such as RSS or other
>> guest-driven steering policies could be built on top.
> IMHO there should be a more practical example before adding all this
> indirection. And it would be nice to understand why this queue selection
> needs to be tun specific.

Actually, we can implement a fanout policy (as pointed out) using the API
introduced in this series.

Thanks

>
>> Thanks
>>
>> Jason Wang (3):
>>    tun: abstract flow steering logic
>>    tun: introduce ioctls to set and get steering policies
>>    tun: introduce cpu id based steering policy
>>
>>   drivers/net/tun.c           | 151 +++++++++++++++++++++++++++++++++++++-------
>>   include/uapi/linux/if_tun.h |   8 +++
>>   2 files changed, 136 insertions(+), 23 deletions(-)
>>
>> -- 
>> 2.7.4
Jason Wang Sept. 28, 2017, 7:23 a.m. UTC | #5
On 2017/09/28 07:25, Willem de Bruijn wrote:
>>> In the future, both simple and sophisticated policies such as RSS or other
>>> guest-driven steering policies could be built on top.
>> IMHO there should be a more practical example before adding all this
>> indirection. And it would be nice to understand why this queue selection
>> needs to be tun specific.
> I was thinking the same and this reminds me of the various strategies
> implemented in packet fanout. tun_cpu_select_queue is analogous to
> fanout_demux_cpu though it is tun-specific in that it requires tun->numqueues.

Right, the main idea is to introduce a way to change the flow steering
policy for tun. I think a fanout policy could be implemented through the
API introduced in this series. (The current flow-cache based automatic
steering method is tun specific.)

>
> Fanout accrued various strategies until it gained an eBPF variant. Just
> supporting BPF is probably sufficient here, too.

Technically yes, but tun also serves virt use cases. We probably still
need some hard-coded policies which the guest can choose from, until we
can accept a BPF program from the guest, I think?

Thanks
Jason Wang Sept. 28, 2017, 7:53 a.m. UTC | #6
On 2017/09/28 13:02, Tom Herbert wrote:
> On Wed, Sep 27, 2017 at 4:25 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>>> In the future, both simple and sophisticated policies such as RSS or other
>>>> guest-driven steering policies could be built on top.
>>> IMHO there should be a more practical example before adding all this
>>> indirection. And it would be nice to understand why this queue selection
>>> needs to be tun specific.
>> I was thinking the same and this reminds me of the various strategies
>> implemented in packet fanout. tun_cpu_select_queue is analogous to
>> fanout_demux_cpu though it is tun-specific in that it requires tun->numqueues.
>>
>> Fanout accrued various strategies until it gained an eBPF variant. Just
>> supporting BPF is probably sufficient here, too.
> +1, in addition to packet fanout, we have SO_REUSEPORT with BPF, RPS,
> RFS, etc. It would be nice if existing packet steering mechanisms
> could be leveraged for tun.

This could be done using the API introduced in this series; I can try
this in V2.

Thanks
Willem de Bruijn Sept. 28, 2017, 4:09 p.m. UTC | #7
On Thu, Sep 28, 2017 at 3:23 AM, Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2017/09/28 07:25, Willem de Bruijn wrote:
>>>>
>>>> In the future, both simple and sophisticated policies such as RSS or
>>>> other guest-driven steering policies could be built on top.
>>>
>>> IMHO there should be a more practical example before adding all this
>>> indirection. And it would be nice to understand why this queue selection
>>> needs to be tun specific.
>>
>> I was thinking the same and this reminds me of the various strategies
>> implemented in packet fanout. tun_cpu_select_queue is analogous to
>> fanout_demux_cpu though it is tun-specific in that it requires
>> tun->numqueues.
>
>
> Right, the main idea is to introduce a way to change the flow steering policy
> for tun. I think a fanout policy could be implemented through the API
> introduced in this series. (The current flow-cache based automatic steering
> method is tun specific.)
>
>>
>> Fanout accrued various strategies until it gained an eBPF variant. Just
>> supporting BPF is probably sufficient here, too.
>
>
> Technically yes, but tun also serves virt use cases. We probably still need
> some hard-coded policies which the guest can choose from, until we can
> accept a BPF program from the guest, I think?

When would a guest choose the policy? As long as this is under control
of a host user, possibly unprivileged, allowing BPF here is moot, as any
user can run socket filter BPF already. Programming from the guest is
indeed different. I don't fully understand that use case.
Jason Wang Sept. 29, 2017, 9:41 a.m. UTC | #8
On 2017/09/29 00:09, Willem de Bruijn wrote:
> On Thu, Sep 28, 2017 at 3:23 AM, Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2017/09/28 07:25, Willem de Bruijn wrote:
>>>>> In the future, both simple and sophisticated policies such as RSS or
>>>>> other guest-driven steering policies could be built on top.
>>>> IMHO there should be a more practical example before adding all this
>>>> indirection. And it would be nice to understand why this queue selection
>>>> needs to be tun specific.
>>> I was thinking the same and this reminds me of the various strategies
>>> implemented in packet fanout. tun_cpu_select_queue is analogous to
>>> fanout_demux_cpu though it is tun-specific in that it requires
>>> tun->numqueues.
>>
>> Right, the main idea is to introduce a way to change the flow steering policy
>> for tun. I think a fanout policy could be implemented through the API
>> introduced in this series. (The current flow-cache based automatic steering
>> method is tun specific.)
>>
>>> Fanout accrued various strategies until it gained an eBPF variant. Just
>>> supporting BPF is probably sufficient here, too.
>>
>> Technically yes, but tun also serves virt use cases. We probably still need
>> some hard-coded policies which the guest can choose from, until we can
>> accept a BPF program from the guest, I think?
> When would a guest choose the policy? As long as this is under control
> of a host user, possibly unprivileged, allowing BPF here is moot, as any
> user can run socket filter BPF already. Programming from the guest is
> indeed different. I don't fully understand that use case.

The problem is that userspace (qemu) knows little about what kind of
workloads will be run by the guest, so we need a guest-controllable method
here, since the guest knows the best steering policy. Rethinking this,
instead of passing eBPF from the guest, qemu could carry some pre-defined
sets of policies. I will change the cpu-id based policy to an eBPF based
one in V2.
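
To make that concrete, a purely hypothetical eBPF steering program could
be as small as returning a txq index from the pre-computed skb hash; the
program type, section name and queue count below are all assumptions:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("socket")
int steer_by_hash(struct __sk_buff *skb)
{
	/* spread flows across an assumed 8 queues using the skb hash */
	return skb->hash % 8;
}

char _license[] SEC("license") = "GPL";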

Thanks
Michael S. Tsirkin Oct. 1, 2017, 3:28 a.m. UTC | #9
On Thu, Sep 28, 2017 at 12:09:05PM -0400, Willem de Bruijn wrote:
> Programming from the guest is
> indeed different. I don't fully understand that use case.

Generally, programming host BPF from the guest is a clear win - think DoS
protection. The guest runs logic to detect DoS attacks, then passes the
program to the host. Afterwards, the host does not need to enter the guest
if there's a DoS attack. Saves a ton of cycles.

The difficulty is making it work well, e.g. how do we handle maps?