All of lore.kernel.org
 help / color / mirror / Atom feed
From: "GuoRui.Yu" <GuoRui.Yu@linux.alibaba.com>
To: linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	konrad.wilk@oracle.com, linux-coco@lists.linux.dev
Cc: GuoRui.Yu@linux.alibaba.com, robin.murphy@arm.com
Subject: [RFC] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs
Date: Sat, 28 Jan 2023 16:32:50 +0800	[thread overview]
Message-ID: <20230128083254.86012-1-GuoRui.Yu@linux.alibaba.com> (raw)

This patch series adds a new swiotlb implementation, cc-swiotlb, for
Confidential VMs (such as TDX and SEV-SNP). The new cc-swiotlb allocates
the DMA TLB buffer dynamically in runtime instead of allocating at boot
with a fixed size. Furthermore, future optimization and security
enhancement could be applied on cc-swiotlb without "infecting" the
legacy swiotlb.

Background
==========
Under COnfidential COmputing (CoCo) scenarios, the VMM cannot access
guest memory directly but requires the guest to explicitly mark the
memory as shared (decrypted). To make the streaming DMA mappings work,
the current implementation relays on legacy SWIOTLB to bounce the DMA
buffer between private (encrypted) and shared (decrypted) memory.

However, the legacy swiotlb is designed for compatibility rather than
efficiency and CoCo purpose, which will inevitably introduce some
unnecessary restrictions.  

1. Fixed immutable swiotlb size cannot accommodate to requirements of
multiple devices. And 1GiB (current maximum size) of swiotlb in our
testbed cannot afford multiple disks reads/writes simultaneously.  

2. Fixed immutable IO_TLB_SIZE (2KiB) cannot satisfy various kinds of
devices. At the moment, the minimal size of a swiotlb buffer is 2KiB,
which will waste memory on small network packets (under 256 bytes) and
decrease efficiency on a large block (up to 256KiB) size reads/writes of
disks. And it is hard to have a trade-off on legacy swiotlb to rule them
all.  

3. The legacy swiotlb cannot efficiently support larger swiotlb buffers.
In the worst case, the current implementation requires a full scan of
the entire swiotlb buffer, which can cause severe performance hits.

Changes in this patch set
=========================
Instead of keeping "infecting" the legacy swiotlb code with CoCo logic,
this patch tries to introduce a new cc-swiotlb for Confidential VMs.

Confidential VMs usually have reasonable modern devices (virtio devices,
NVME, etc.), which can access memory above 4GiB, cc-swiotlb could
allocate TLB buffers at any position dynamically. Since
set_memory_{decrypted,encrypted} is time-consuming and cannot be used in
interrupt context, a new kernel thread "kccd" has been added to populate
new TLB buffers on-demand, which solved the problem 1.

In addition, the cc-swiotlb manages TLB buffers by different sizes
(512B, 2KiB, 4KiB, 16KiB, and 512KiB). The above values come from the
following observations (boot with 8core, 32 GiB, 1 nvme disk, and 1
virtio-net): 
- Allocations of 512 bytes and below account for 3.5% of the total DMA
  cache allocations;
- Allocations of 2 KiB and below account for 57.7%;
- Allocations of 4 KiB and below account for 91.3%;
- Allocations of 16 KiB and below account for 96.0%;
- Allocations of 512 KiB and below accounted for 100%;
- At the end of booting, cc-swiotlb uses 288 MiB in total. 

For comparison, legacy swiotlb reserves memory at 6%, which requires
min(1GiB, 32GiB * 0.06) = 1GiB, and will hang when operating multiple
disks simultaneously due to no memory for the swiotlb buffer.

These patches were tested with fio (using different iodepth and block
size) on a platform with 96 cores, 384 GiB, and 20 NVME disks, and no IO
hang or error was observed.

For simplicity, the current RFC version cannot switch between legacy
implementation with cmdline but through compile options. I am open to
discussing how to integrate the cc-swiotlb into the legacy one.

Patch Organization
==================
- swiotlb: Split common code from swiotlb.{c,h}
- swiotlb: Add a new cc-swiotlb implementation for Confidential VMs
- swiotlb: Add tracepoint swiotlb_unbounced
- cc-swiotlb: Allow set swiotlb watermark from cmdline

Thanks for your time!

Have a nice day,
Guorui



             reply	other threads:[~2023-01-28  8:48 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-28  8:32 GuoRui.Yu [this message]
2023-01-28  8:32 ` [PATCH 1/4] swiotlb: Split common code from swiotlb.{c,h} GuoRui.Yu
2023-01-28  8:32 ` [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs GuoRui.Yu
2023-01-28 12:03   ` kernel test robot
2023-01-28 16:41   ` Randy Dunlap
2023-01-29  1:54     ` Guorui Yu
2023-01-29 16:58   ` Andi Kleen
2023-01-30  2:25     ` Guorui Yu
2023-01-30  6:46       ` Andi Kleen
2023-01-30 13:45         ` Guorui Yu
2023-01-31 17:16           ` Andi Kleen
2023-02-01  2:08             ` Guorui Yu
2023-01-28  8:32 ` [PATCH 3/4] swiotlb: Add tracepoint swiotlb_unbounced GuoRui.Yu
2023-01-28  8:32 ` [PATCH 4/4] cc-swiotlb: Allow set swiotlb watermark from cmdline GuoRui.Yu
2023-01-28 20:19   ` kernel test robot
2023-01-28  9:03 ` [RFC] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs Guorui Yu
2023-01-30  6:54   ` Christoph Hellwig
2023-01-30 13:03 ` Robin Murphy
2023-01-30 14:37   ` Guorui Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230128083254.86012-1-GuoRui.Yu@linux.alibaba.com \
    --to=guorui.yu@linux.alibaba.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.