Date: Thu, 26 Mar 2020 03:25:52 -0400
From: "Michael S. Tsirkin"
To: Hui Zhu
Cc: jasowang@redhat.com, akpm@linux-foundation.org, mojha@codeaurora.org,
 pagupta@redhat.com, aquini@redhat.com, namit@vmware.com, david@redhat.com,
 virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, qemu-devel@nongnu.org, Hui Zhu
Subject: Re: [PATCH for QEMU v2] virtio-balloon: Add option cont-pages to set VIRTIO_BALLOON_VQ_INFLATE_CONT
Message-ID: <20200326032101-mutt-send-email-mst@kernel.org>
References: <1584893097-12317-1-git-send-email-teawater@gmail.com> <1584893097-12317-2-git-send-email-teawater@gmail.com>
In-Reply-To: <1584893097-12317-2-git-send-email-teawater@gmail.com>

On Mon, Mar 23, 2020 at 12:04:57AM +0800, Hui Zhu wrote:
> If the guest kernel has many fragmentation pages, use virtio_balloon
> will split THP of QEMU when it calls MADV_DONTNEED madvise to release
> the balloon pages.
> Set option cont-pages to on will open flags VIRTIO_BALLOON_VQ_INFLATE_CONT
> and set continuous pages order to THP order.
> Then It will get continuous pages PFN from VQ icvq use use madvise
> MADV_DONTNEED release the THP page.
> This will handle the THP split issue.
>
> Signed-off-by: Hui Zhu
> ---
>  hw/virtio/virtio-balloon.c                      | 32 +++++++++++++++++++++----
>  include/hw/virtio/virtio-balloon.h              |  4 +++-
>  include/standard-headers/linux/virtio_balloon.h |  4 ++++
>  3 files changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index a4729f7..88bdaca 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -34,6 +34,7 @@
>  #include "hw/virtio/virtio-access.h"
>
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
> +#define CONT_PAGES_ORDER   9
>
>  typedef struct PartiallyBalloonedPage {
>      ram_addr_t base_gpa;

This doesn't look right to me. I suspect this is different between
different hosts. Fixing this would also be tricky as we might need to
migrate between two hosts with different huge page sizes.

My proposal is to instead enhance the PartiallyBalloonedPage
machinery, teaching it to handle the case where host page size
is smaller than the supported number of subpages.

> @@ -65,7 +66,8 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
>
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
>                                   MemoryRegion *mr, hwaddr mr_offset,
> -                                 PartiallyBalloonedPage *pbp)
> +                                 PartiallyBalloonedPage *pbp,
> +                                 bool is_cont_pages)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
>      ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
> @@ -76,6 +78,13 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>      /* XXX is there a better way to get to the RAMBlock than via a
>       * host address? */
>      rb = qemu_ram_block_from_host(addr, false, &rb_offset);
> +
> +    if (is_cont_pages) {
> +        ram_block_discard_range(rb, rb_offset,
> +                                BALLOON_PAGE_SIZE << CONT_PAGES_ORDER);
> +        return;
> +    }
> +
>      rb_page_size = qemu_ram_pagesize(rb);
>
>      if (rb_page_size == BALLOON_PAGE_SIZE) {
> @@ -361,9 +370,10 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              trace_virtio_balloon_handle_output(memory_region_name(section.mr),
>                                                 pa);
>              if (!qemu_balloon_is_inhibited()) {
> -                if (vq == s->ivq) {
> +                if (vq == s->ivq || vq == s->icvq) {
>                      balloon_inflate_page(s, section.mr,
> -                                         section.offset_within_region, &pbp);
> +                                         section.offset_within_region, &pbp,
> +                                         vq == s->icvq);
>                  } else if (vq == s->dvq) {
>                      balloon_deflate_page(s, section.mr, section.offset_within_region);
>                  } else {
> @@ -618,9 +628,12 @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
>      if (s->qemu_4_0_config_size) {
>          return sizeof(struct virtio_balloon_config);
>      }
> -    if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON)) {
> +    if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_CONT_PAGES)) {
>          return sizeof(struct virtio_balloon_config);
>      }
> +    if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON)) {
> +        return offsetof(struct virtio_balloon_config, pages_order);
> +    }
>      if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
>          return offsetof(struct virtio_balloon_config, poison_val);
>      }
> @@ -646,6 +659,10 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>                         cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
>      }
>
> +    if (virtio_has_feature(dev->host_features, VIRTIO_BALLOON_F_CONT_PAGES)) {
> +        config.pages_order = cpu_to_le32(CONT_PAGES_ORDER);
> +    }
> +
>      trace_virtio_balloon_get_config(config.num_pages, config.actual);
>      memcpy(config_data, &config, virtio_balloon_config_size(dev));
>  }
> @@ -816,6 +833,11 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>              virtio_error(vdev, "iothread is missing");
>          }
>      }
> +
> +    if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_CONT_PAGES)) {
> +        s->icvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
> +    }
> +
>      reset_stats(s);
>  }
>
> @@ -916,6 +938,8 @@ static Property virtio_balloon_properties[] = {
>                      VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
>      DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
>                      VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
> +    DEFINE_PROP_BIT("cont-pages", VirtIOBalloon, host_features,
> +                    VIRTIO_BALLOON_F_CONT_PAGES, false),
>      /* QEMU 4.0 accidentally changed the config size even when free-page-hint
>       * is disabled, resulting in QEMU 3.1 migration incompatibility.  This
>       * property retains this quirk for QEMU 4.1 machine types.
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index d1c968d..61d2419 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -42,7 +42,7 @@ enum virtio_balloon_free_page_report_status {
>
>  typedef struct VirtIOBalloon {
>      VirtIODevice parent_obj;
> -    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
> +    VirtQueue *ivq, *dvq, *svq, *free_page_vq, *icvq;
>      uint32_t free_page_report_status;
>      uint32_t num_pages;
>      uint32_t actual;
> @@ -70,6 +70,8 @@ typedef struct VirtIOBalloon {
>      uint32_t host_features;
>
>      bool qemu_4_0_config_size;
> +
> +    uint32_t pages_order;
>  } VirtIOBalloon;
>
>  #endif
> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> index 9375ca2..ee18be7 100644
> --- a/include/standard-headers/linux/virtio_balloon.h
> +++ b/include/standard-headers/linux/virtio_balloon.h
> @@ -36,6 +36,8 @@
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
>  #define VIRTIO_BALLOON_F_FREE_PAGE_HINT	3 /* VQ to report free pages */
>  #define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
> +#define VIRTIO_BALLOON_F_CONT_PAGES	5 /* VQ to report continuous pages */
> +
>
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> @@ -51,6 +53,8 @@ struct virtio_balloon_config {
>  	uint32_t free_page_report_cmd_id;
>  	/* Stores PAGE_POISON if page poisoning is in use */
>  	uint32_t poison_val;
> +	/* Pages order if VIRTIO_BALLOON_F_CONT_PAGES is set */
> +	uint32_t pages_order;
>  };
>
>  #define VIRTIO_BALLOON_S_SWAP_IN 0   /* Amount of memory swapped in */
> --
> 2.7.4
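
To illustrate the concern about the hard-coded CONT_PAGES_ORDER, here is a
minimal standalone sketch, not QEMU code, of deriving the order from the
backing page size instead. The helpers cont_pages_order_for() and
cont_pages_bytes() are hypothetical names made up for this example; in QEMU
the size would presumably come from something like qemu_ram_pagesize(rb),
which the patch already calls. It assumes the balloon page stays at 4 KiB
(VIRTIO_BALLOON_PFN_SHIFT of 12).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the quoted patch: balloon PFNs are expressed in 4 KiB units. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
#define BALLOON_PAGE_SIZE ((size_t)1 << VIRTIO_BALLOON_PFN_SHIFT)

/*
 * Hypothetical helper (not a QEMU function): return the order n such that
 * BALLOON_PAGE_SIZE << n == host_page_size. Returns 0 when the host page
 * size is not a power of two or is not larger than the balloon page, i.e.
 * when there is nothing to gain from reporting contiguous pages.
 */
static uint32_t cont_pages_order_for(size_t host_page_size)
{
    uint32_t order = 0;

    if (host_page_size < BALLOON_PAGE_SIZE ||
        (host_page_size & (host_page_size - 1)) != 0) {
        return 0;
    }
    while ((BALLOON_PAGE_SIZE << order) < host_page_size) {
        order++;
    }
    return order;
}

/* Hypothetical helper: number of bytes covered by one chunk of this order. */
static size_t cont_pages_bytes(uint32_t order)
{
    /* Guard against shifting past the width of size_t. */
    assert(order < sizeof(size_t) * 8 - VIRTIO_BALLOON_PFN_SHIFT);
    return BALLOON_PAGE_SIZE << order;
}
```

With 2 MiB huge pages this yields order 9, matching the constant in the
patch, but with 512 MiB huge pages (arm64 with a 64 KiB base page size, for
example) it yields 17. Two hosts can therefore disagree on the value, which
is what makes migration between them tricky.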