Message-ID: <4FDF479B.9060502@linux.vnet.ibm.com>
Date: Mon, 18 Jun 2012 11:22:03 -0400
From: Corey Bryant
References: <20120613203305.GC6019@redhat.com> <20120618083335.GD28026@redhat.com>
In-Reply-To: <20120618083335.GD28026@redhat.com>
Subject: Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c
To: "Daniel P. Berrange"
Cc: Blue Swirl , qemu-devel@nongnu.org, Eduardo Otubo

On 06/18/2012 04:33 AM, Daniel P.
Berrange wrote:
> On Fri, Jun 15, 2012 at 07:04:45PM +0000, Blue Swirl wrote:
>> On Wed, Jun 13, 2012 at 8:33 PM, Daniel P. Berrange wrote:
>>> On Wed, Jun 13, 2012 at 07:56:06PM +0000, Blue Swirl wrote:
>>>> On Wed, Jun 13, 2012 at 7:20 PM, Eduardo Otubo wrote:
>>>>> I added a syscall struct using priority levels as described in the
>>>>> libseccomp man page. The priority numbers are based to the frequency
>>>>> they appear in a sample strace from a regular qemu guest run under
>>>>> libvirt.
>>>>>
>>>>> Libseccomp generates linear BPF code to filter system calls, those rules
>>>>> are read one after another. The priority system places the most common
>>>>> rules first in order to reduce the overhead when processing them.
>>>>>
>>>>> Also, since this is just a first RFC, the whitelist is a little raw. We
>>>>> might need your help to improve, test and fine tune the set of system
>>>>> calls.
>>>>>
>>>>> v2: Fixed some style issues
>>>>> Removed code from vl.c and created qemu-seccomp.[ch]
>>>>> Now using ARRAY_SIZE macro
>>>>> Added more syscalls without priority/frequency set yet
>>>>>
>>>>> Signed-off-by: Eduardo Otubo
>>>>> ---
>>>>> qemu-seccomp.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> qemu-seccomp.h | 9 +++++++
>>>>> vl.c | 7 ++++++
>>>>> 3 files changed, 89 insertions(+)
>>>>> create mode 100644 qemu-seccomp.c
>>>>> create mode 100644 qemu-seccomp.h
>>>>>
>>>>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c
>>>>> new file mode 100644
>>>>> index 0000000..048b7ba
>>>>> --- /dev/null
>>>>> +++ b/qemu-seccomp.c
>>>>> @@ -0,0 +1,73 @@
>>>>
>>>> Copyright and license info missing.
>>>>
>>>>> +#include
>>>>> +#include
>>>>> +#include "qemu-seccomp.h"
>>>>> +
>>>>> +static struct QemuSeccompSyscall seccomp_whitelist[] = {
>>>>
>>>> 'const'
>>>>
>>>>> + { SCMP_SYS(timer_settime), 255 },
>>>>> + { SCMP_SYS(timer_gettime), 254 },
>>>>> + { SCMP_SYS(futex), 253 },
>>>>> + { SCMP_SYS(select), 252 },
>>>>> + { SCMP_SYS(recvfrom), 251 },
>>>>> + { SCMP_SYS(sendto), 250 },
>>>>> + { SCMP_SYS(read), 249 },
>>>>> + { SCMP_SYS(brk), 248 },
>>>>> + { SCMP_SYS(clone), 247 },
>>>>> + { SCMP_SYS(mmap), 247 },
>>>>> + { SCMP_SYS(mprotect), 246 },
>>>>> + { SCMP_SYS(ioctl), 245 },
>>>>> + { SCMP_SYS(recvmsg), 245 },
>>>>> + { SCMP_SYS(sendmsg), 245 },
>>>>> + { SCMP_SYS(accept), 245 },
>>>>> + { SCMP_SYS(connect), 245 },
>>>>> + { SCMP_SYS(bind), 245 },
>>>>
>>>> It would be nice to avoid connect() and bind(). Perhaps seccomp init
>>>> should be postponed to after all sockets have been created?
>>>
>>> If you want to migrate your guest, you need to be able to
>>> call connect() at an arbitrary point in the QEMU process'
>>> lifecycle. So you can't avoid allowing connect(). Similarly
>>> if you want to allow hotplug of NICs (and their backends)
>>> then you need to have both bind() + connect() available.
>>
>> That's bad. Migration could conceivably be extended to use file
>> descriptor passing, but hotplug is more tricky.
>
> As with execve(), i'm reporting this on the basis that on the previous
> patch posting I was told we must whitelist any syscalls QEMU can
> conceivably use to avoid any loss in functionality.

Thanks for pointing out syscalls needed for the whitelist.

As Paul has already mentioned, it was recommended that we restrict all
of QEMU (as a single process) from the start of execution. This is as
opposed to the other options: restricting QEMU only from the time that
the vCPUs start, further restricting based on syscall parameters, or
decomposing QEMU into multiple processes that are individually
restricted with their own seccomp whitelists.
I think this approach is a good starting point that can be further tuned
in the future. And as with most security measures, defense in depth
strengthens the overall protection (e.g. combining seccomp with DAC or
MAC).

--
Regards,
Corey
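
For illustration, the priority ordering described in the quoted patch can be
modelled in a few lines. This is only a minimal sketch in plain C and does not
use libseccomp (which takes the priority as a hint when it generates the
filter). The names struct rule, order_rules() and comparisons_to_match() are
hypothetical, invented for this example; they show why checking the most
frequently used syscalls first lowers the cost of a linear filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one whitelist entry: a syscall name plus a
 * priority in the range 0-255, where a higher value means the syscall
 * is expected to be called more often. */
struct rule {
    const char *name;
    unsigned char priority;
};

/* qsort() comparator: highest priority first. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct rule *ra = a;
    const struct rule *rb = b;
    return (int)rb->priority - (int)ra->priority;
}

/* Reorder the rules so the most frequently used syscalls come first. */
void order_rules(struct rule *rules, size_t n)
{
    assert(rules != NULL || n == 0);
    qsort(rules, n, sizeof(rules[0]), by_priority_desc);
}

/* Walk the rules one after another, as a linear filter would, and
 * return how many comparisons were needed to match `name`.
 * Returns 0 if the syscall is not in the whitelist. */
size_t comparisons_to_match(const struct rule *rules, size_t n,
                            const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(rules[i].name, name) == 0) {
            return i + 1;
        }
    }
    return 0;
}
```

With the entries { "bind", 245 }, { "futex", 253 }, { "timer_settime", 255 }
and { "read", 249 }, order_rules() moves timer_settime to the front, so it
matches after one comparison while bind needs four.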