Subject: Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap
From: Zenghui Yu <yuzenghui@huawei.com>
To: Peter Xu
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, wanghaibin.wang@huawei.com
Date: Wed, 9 Dec 2020 10:33:41 +0800
In-Reply-To: <20201208151654.GA6432@xz-x1>
References: <20201208114013.875-1-yuzenghui@huawei.com> <20201208151654.GA6432@xz-x1>

Hi Peter,

Thanks for having a look at it.
On 2020/12/8 23:16, Peter Xu wrote:
> Hi, Zenghui,
>
> On Tue, Dec 08, 2020 at 07:40:13PM +0800, Zenghui Yu wrote:
>> The kernel KVM_CLEAR_DIRTY_LOG interface has an alignment requirement on
>> both the start and the size of the given range of pages. We have been
>> careful to handle the unaligned cases when performing CLEAR on one slot.
>> But it seems that we forgot to take the unaligned *size* case into
>> account when preparing the bitmap for the interface, and we may end up
>> clearing dirty status for pages outside of [start, start + size).
>
> Thanks for the patch, though my understanding is that this is not a bug.
>
> Please have a look at kvm_memslot_init_dirty_bitmap(), where we allocate
> the dirty bitmap aligned to 8 bytes (assuming that's the possible max of
> sizeof(unsigned long)). That exactly covers 64 pages.
>
> So here as long as start_delta==0 (so the value of "bmap_npages - size /
> psize" won't really matter a lot, imho), we'll definitely have
> KVMSlot.dirty_bmap long enough to cover the range we'd like to clear.

I agree. But I'm not actually saying that KVMSlot.dirty_bmap is not long
enough. What I had in mind is something like:

    // psize = qemu_real_host_page_size;
    // slot.start_addr = 0;
    // slot.memory_size = 64 * psize;

    kvm_log_clear_one_slot(slot, as, 0 * psize, 32 * psize);   --> [1]
    kvm_log_clear_one_slot(slot, as, 32 * psize, 32 * psize);  --> [2]

So the @size is not aligned with 64 pages. Before this patch, we would
clear the dirty status of all pages (0-63) through [1]. It looks to me
that this violates the caller's expectation, since we only want to clear
pages (0-31). (A standalone sketch of this arithmetic is appended at the
end of this mail.)

As I said, I don't think this will happen in practice -- the migration
code should always provide us with a 64-page aligned section (right?).
I'm just thinking about the correctness of the specific algorithm used
by kvm_log_clear_one_slot().

Or maybe I have missed some other obvious point ;-) ?

Thanks,
Zenghui

> Note that the size of KVMSlot.dirty_bmap can be bigger than the actual
> size of the kvm memslot; however, since kvm_memslot_init_dirty_bitmap()
> initializes it to all zeroes, the extra bits will always be zero for the
> whole lifecycle of the vm/bitmap.
>
> Thanks,
>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>  accel/kvm/kvm-all.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index bed2455ca5..05d323ba1f 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -747,7 +747,7 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
>>      assert(bmap_start % BITS_PER_LONG == 0);
>>      /* We should never do log_clear before log_sync */
>>      assert(mem->dirty_bmap);
>> -    if (start_delta) {
>> +    if (start_delta || bmap_npages - size / psize) {
>>          /* Slow path - we need to manipulate a temp bitmap */
>>          bmap_clear = bitmap_new(bmap_npages);
>>          bitmap_copy_with_src_offset(bmap_clear, mem->dirty_bmap,
>> @@ -760,7 +760,10 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
>>          bitmap_clear(bmap_clear, 0, start_delta);
>>          d.dirty_bitmap = bmap_clear;
>>      } else {
>> -        /* Fast path - start address aligns well with BITS_PER_LONG */
>> +        /*
>> +         * Fast path - both start and size align well with BITS_PER_LONG
>> +         * (or the end of memory slot)
>> +         */
>>          d.dirty_bitmap = mem->dirty_bmap + BIT_WORD(bmap_start);
>>      }
>>
>> --
>> 2.19.1
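
---

For concreteness, here is a standalone sketch (not the QEMU source itself)
that mirrors the alignment arithmetic of kvm_log_clear_one_slot() for the
two calls in the example above. PSIZE stands in for
qemu_real_host_page_size and is an assumed 4 KiB; the KVM_CLEAR_LOG_*
constants follow accel/kvm/kvm-all.c. It prints which path each call
takes before and after the patch:

/*
 * Standalone sketch modelling the alignment arithmetic of
 * kvm_log_clear_one_slot().  PSIZE is an assumed 4 KiB host page size.
 */
#include <stdio.h>
#include <stdint.h>

#define PSIZE               4096ULL   /* assumed host page size */
#define KVM_CLEAR_LOG_SHIFT 6
#define KVM_CLEAR_LOG_ALIGN (PSIZE << KVM_CLEAR_LOG_SHIFT)
#define KVM_CLEAR_LOG_MASK  (-KVM_CLEAR_LOG_ALIGN)
#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

static void model(uint64_t memory_size, uint64_t start, uint64_t size)
{
    /* Align the start down to 64 host pages (still in bytes here). */
    uint64_t bmap_start = start & KVM_CLEAR_LOG_MASK;
    uint64_t start_delta = start - bmap_start;
    uint64_t bmap_npages, end;

    bmap_start /= PSIZE;

    /* Round the size up to 64 pages, capped at the end of the slot. */
    bmap_npages = DIV_ROUND_UP(size + start_delta, KVM_CLEAR_LOG_ALIGN)
                  << KVM_CLEAR_LOG_SHIFT;
    end = memory_size / PSIZE;
    if (bmap_npages > end - bmap_start) {
        bmap_npages = end - bmap_start;
    }
    start_delta /= PSIZE;

    printf("clear pages [%llu..%llu): bmap_npages=%llu, start_delta=%llu\n",
           (unsigned long long)(start / PSIZE),
           (unsigned long long)((start + size) / PSIZE),
           (unsigned long long)bmap_npages,
           (unsigned long long)start_delta);
    /* Pre-patch condition: slow path only when the start is unaligned. */
    printf("  pre-patch : %s\n", start_delta ? "slow path" :
           "fast path - kernel is handed dirty bits for all bmap_npages pages");
    /* Post-patch condition also catches the unaligned size. */
    printf("  post-patch: %s\n",
           (start_delta || bmap_npages - size / PSIZE) ?
           "slow path - temp bitmap covers only the requested range" :
           "fast path");
}

int main(void)
{
    uint64_t slot_size = 64 * PSIZE;  /* slot.memory_size in the example */

    model(slot_size, 0 * PSIZE, 32 * PSIZE);   /* call [1] */
    model(slot_size, 32 * PSIZE, 32 * PSIZE);  /* call [2] */
    return 0;
}

Compiled with plain gcc, this reports that call [1] takes the fast path
before the patch with bmap_npages=64, so the kernel is handed the dirty
bits for pages 32-63 as well as the requested pages 0-31, while after the
patch both calls go through the slow path and only the requested bits are
passed down.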