From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4193EC433B4 for ; Tue, 13 Apr 2021 10:06:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D5599613B2 for ; Tue, 13 Apr 2021 10:06:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D5599613B2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55368 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lWFwd-0002Oq-16 for qemu-devel@archiver.kernel.org; Tue, 13 Apr 2021 06:06:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41372) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lWFp5-0000cH-DT for qemu-devel@nongnu.org; Tue, 13 Apr 2021 05:59:07 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:31611) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lWFp2-0005Ij-Ux for qemu-devel@nongnu.org; Tue, 13 Apr 2021 05:59:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1618307944; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=m+6THe7kuEatUC1VEYqCfUQQ/dGBe0D5MnCNr7ARK7g=; b=eGexl5MwDLamZOVfoMEXs15qiB6Lq3BIHVByiZ3x6WX0nkIMZSHg3mf/W5q0MWPTMXiZz5 axYVjhkHDZQbb860Wp40KI0JKsWutvcZa5/mQlWDUPimp9JZtK2+gi3wPdSvcjc/bunyFO K+T4cqPHvDboaCIOvLKWo9kfHytOz6M= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-47-rdLxEiPDOauKF9_ANUkHiQ-1; Tue, 13 Apr 2021 05:59:02 -0400 X-MC-Unique: rdLxEiPDOauKF9_ANUkHiQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 521B610053ED; Tue, 13 Apr 2021 09:59:01 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-69.ams2.redhat.com [10.36.115.69]) by smtp.corp.redhat.com (Postfix) with ESMTP id ECD575FC14; Tue, 13 Apr 2021 09:58:47 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH RESEND v7 09/13] vfio: Support for RamDiscardManager in the vIOMMU case Date: Tue, 13 Apr 2021 11:55:27 +0200 Message-Id: <20210413095531.25603-10-david@redhat.com> In-Reply-To: <20210413095531.25603-1-david@redhat.com> References: <20210413095531.25603-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="US-ASCII" Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Wei Yang , "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Peter Xu , Auger Eric , Alex Williamson , teawater , Igor Mammedov , Paolo Bonzini , Marek Kedzierski Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" vIOMMU support works already with RamDiscardManager as long as guests only map populated memory. Both, populated and discarded memory is mapped into &address_space_memory, where vfio_get_xlat_addr() will find that memory, to create the vfio mapping. Sane guests will never map discarded memory (e.g., unplugged memory blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while memory is getting discarded. However, there are two cases where a malicious guests could trigger pinning of more memory than intended. One case is easy to handle: the guest trying to map discarded memory into an IOMMU. The other case is harder to handle: the guest keeping memory mapped in the IOMMU while it is getting discarded. We would have to walk over all mappings when discarding memory and identify if any mapping would be a violation. Let's keep it simple for now and print a warning, indicating that setting RLIMIT_MEMLOCK can mitigate such attacks. We have to take care of incoming migration: at the point the IOMMUs get restored and start creating mappings in vfio, RamDiscardManager implementations might not be back up and running yet: let's add runstate priorities to enforce the order when restoring. Acked-by: Alex Williamson Reviewed-by: Alex Williamson Acked-by: Michael S. Tsirkin Cc: Paolo Bonzini Cc: "Michael S. Tsirkin" Cc: Alex Williamson Cc: Dr. David Alan Gilbert Cc: Igor Mammedov Cc: Pankaj Gupta Cc: Peter Xu Cc: Auger Eric Cc: Wei Yang Cc: teawater Cc: Marek Kedzierski Signed-off-by: David Hildenbrand --- hw/vfio/common.c | 39 +++++++++++++++++++++++++++++++++++++ hw/virtio/virtio-mem.c | 1 + include/migration/vmstate.h | 1 + 3 files changed, 41 insertions(+) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index f8a2fe8441..8a9bbf2791 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -36,6 +36,7 @@ #include "qemu/range.h" #include "sysemu/kvm.h" #include "sysemu/reset.h" +#include "sysemu/runstate.h" #include "trace.h" #include "qapi/error.h" #include "migration/migration.h" @@ -569,6 +570,44 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, error_report("iommu map to non memory area %"HWADDR_PRIx"", xlat); return false; + } else if (memory_region_has_ram_discard_manager(mr)) { + RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr); + MemoryRegionSection tmp = { + .mr = mr, + .offset_within_region = xlat, + .size = int128_make64(len), + }; + + /* + * Malicious VMs can map memory into the IOMMU, which is expected + * to remain discarded. vfio will pin all pages, populating memory. + * Disallow that. vmstate priorities make sure any RamDiscardManager + * were already restored before IOMMUs are restored. + */ + if (!ram_discard_manager_is_populated(rdm, &tmp)) { + error_report("iommu map to discarded memory (e.g., unplugged via" + " virtio-mem): %"HWADDR_PRIx"", + iotlb->translated_addr); + return false; + } + + /* + * Malicious VMs might trigger discarding of IOMMU-mapped memory. The + * pages will remain pinned inside vfio until unmapped, resulting in a + * higher memory consumption than expected. If memory would get + * populated again later, there would be an inconsistency between pages + * pinned by vfio and pages seen by QEMU. This is the case until + * unmapped from the IOMMU (e.g., during device reset). + * + * With malicious guests, we really only care about pinning more memory + * than expected. RLIMIT_MEMLOCK set for the user/process can never be + * exceeded and can be used to mitigate this problem. + */ + warn_report_once("Using vfio with vIOMMUs and coordinated discarding of" + " RAM (e.g., virtio-mem) works, however, malicious" + " guests can trigger pinning of more memory than" + " intended via an IOMMU. It's possible to mitigate " + " by setting/adjusting RLIMIT_MEMLOCK."); } /* diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index e209b56057..cbd07fc9f1 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -886,6 +886,7 @@ static const VMStateDescription vmstate_virtio_mem_device = { .name = "virtio-mem-device", .minimum_version_id = 1, .version_id = 1, + .priority = MIG_PRI_VIRTIO_MEM, .post_load = virtio_mem_post_load, .fields = (VMStateField[]) { VMSTATE_WITH_TMP(VirtIOMEM, VirtIOMEMMigSanityChecks, diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index 075ee80096..3bf58ff043 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -153,6 +153,7 @@ typedef enum { MIG_PRI_DEFAULT = 0, MIG_PRI_IOMMU, /* Must happen before PCI devices */ MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ + MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */ MIG_PRI_GICV3, /* Must happen before the ITS */ MIG_PRI_MAX, -- 2.30.2