From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752031AbdKHLNb (ORCPT <rfc822;w@1wt.eu>);
        Wed, 8 Nov 2017 06:13:31 -0500
Received: from mx1.redhat.com ([209.132.183.28]:40808 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750760AbdKHLN3 (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 8 Nov 2017 06:13:29 -0500
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 7A89425C3B
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=jasowang@redhat.com
Subject: Re: [PATCH net-next V2 3/3] tun: add eBPF based queue selection
 method
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
        Network Development <netdev@vger.kernel.org>,
        LKML <linux-kernel@vger.kernel.org>, Tom Herbert <tom@herbertland.com>,
        Aaron Conole <aconole@redhat.com>
References: <1509445938-4345-1-git-send-email-jasowang@redhat.com>
 <1509445938-4345-4-git-send-email-jasowang@redhat.com>
 <CAF=yD-L7v-KQQ5SZ9yVUiqPcaNCn5XbwcPrPHXnGO_tDv6_UgQ@mail.gmail.com>
 <CAF=yD-Jp6+0gsbRv9ivyuuAbKzPJ0ooA1Zx28uZe+a6zZpqNaQ@mail.gmail.com>
 <1e5256e3-72cf-fa6b-b00e-2661e29291b1@redhat.com>
 <20171108073717-mutt-send-email-mst@kernel.org>
From: Jason Wang <jasowang@redhat.com>
Message-ID: <4fddf39b-dd34-4f5f-3adb-8be0b867f690@redhat.com>
Date: Wed, 8 Nov 2017 20:13:15 +0900
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <20171108073717-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Wed, 08 Nov 2017 11:13:29 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


On 2017-11-08 14:43, Michael S. Tsirkin wrote:
> On Wed, Nov 08, 2017 at 02:28:53PM +0900, Jason Wang wrote:
>>
>> On 2017-11-04 08:56, Willem de Bruijn wrote:
>>> On Fri, Nov 3, 2017 at 5:56 PM, Willem de Bruijn
>>> <willemdebruijn.kernel@gmail.com> wrote:
>>>> On Tue, Oct 31, 2017 at 7:32 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>>> This patch introduces an eBPF based queue selection method based on
>>>>> the flow steering policy ops. Userspace can load an eBPF program
>>>>> through TUNSETSTEERINGEBPF. This gives much more flexibility compared
>>>>> to the simple but hard-coded policy in the kernel.
>>>>>
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>> ---
>>>>> +static int tun_set_steering_ebpf(struct tun_struct *tun, void __user *data)
>>>>> +{
>>>>> +       struct bpf_prog *prog;
>>>>> +       u32 fd;
>>>>> +
>>>>> +       if (copy_from_user(&fd, data, sizeof(fd)))
>>>>> +               return -EFAULT;
>>>>> +
>>>>> +       prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
>>>> If the idea is to allow guests to pass BPF programs down to the host,
>>>> you may want to define a new program type that is more restrictive than
>>>> socket filter.
>>>>
>>>> The external functions allowed for socket filters (sk_filter_func_proto)
>>>> are relatively few (compared to, say, clsact), but may still leak host
>>>> information to a guest. More importantly, guest security considerations
>>>> limit how we can extend socket filters later.
>>> Unless the idea is for the hypervisor to prepare the BPF based on a
>>> limited set of well defined modes that the guest can configure. Then
>>> socket filters are fine, as the BPF is prepared by a regular host process.
>> Yes, I think the idea for now is to let qemu build the BPF program.
>>
>> Passing an eBPF program from guest to host is interesting, but an obvious
>> issue is how to deal with map access.
>>
>> Thanks
> Fundamentally, I suspect the way to solve it is to allow
> the program to specify "should be offloaded to host".
>
> And then it would access the host map rather than the guest map.

This looks like a big extension.

>
> Then add some control path API for guest to poke at the host map.

Actually, as Willem said, we could even forbid map access through a new
program type, but that would lose a lot of flexibility.

>
> It's not that there's anything special about the host map -
> it's just separate from the guest - so if we wanted to
> do something that can work on bare-metal we could -
> just do something like a namespace and put all host
> maps there. But I'm not sure it's worth the complexity.
>
> Cc Aaron who wanted to look at this.
>

Maybe the first step is to allow classic BPF to be passed from the guest,
and then consider eBPF on top.
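
As a rough sketch of what a classic BPF steering program could look like
(nothing below is from the patch: build_steering_cbpf() is a made-up helper
name, and treating the filter's return value as a queue index is an
assumption, since how such a program would be attached to tun is not
defined in this thread):

```c
#include <stddef.h>
#include <linux/filter.h>

/*
 * Hypothetical helper, not part of the patch: fill insns[] with a classic
 * BPF program that computes rxhash % numqueues and returns it.
 * Returns the number of instructions written, or 0 if numqueues is 0
 * (the kernel rejects a constant modulo by zero) or the buffer is too small.
 */
static size_t build_steering_cbpf(struct sock_filter *insns, size_t max,
                                  unsigned int numqueues)
{
        const struct sock_filter prog[] = {
                /* A = skb rxhash, read through the ancillary data offset */
                BPF_STMT(BPF_LD | BPF_W | BPF_ABS, SKF_AD_OFF + SKF_AD_RXHASH),
                /* A = A % numqueues */
                BPF_STMT(BPF_ALU | BPF_MOD | BPF_K, numqueues),
                /* return A; assumed here to be read as the queue index */
                BPF_STMT(BPF_RET | BPF_A, 0),
        };
        const size_t len = sizeof(prog) / sizeof(prog[0]);
        size_t i;

        if (numqueues == 0 || max < len)
                return 0;

        for (i = 0; i < len; i++)
                insns[i] = prog[i];

        return len;
}
```

A program like this uses no maps and no helper calls, so it avoids the map
access question entirely, at the cost of only being able to express fixed
policies such as the hash-modulo one above.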

Thanks