Re: [PATCH] selftests: kvm: fix overlapping addresses in memslot_perf_test

From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Andrew Jones <drjones@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in memslot_perf_test
Date: Thu, 3 Jun 2021 15:05:50 +0200	[thread overview]
Message-ID: <de413189-b0c8-df1a-a0f5-e3fea1a329f8@oracle.com> (raw)
In-Reply-To: <DM8PR11MB5670B1AA392BF7502501D43B923C9@DM8PR11MB5670.namprd11.prod.outlook.com>

On 03.06.2021 07:26, Duan, Zhenzhong wrote:
>> -----Original Message-----
>> From: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> Sent: Thursday, June 3, 2021 7:07 AM
>> To: Paolo Bonzini <pbonzini@redhat.com>; Duan, Zhenzhong
>> <zhenzhong.duan@intel.com>
>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org; Andrew Jones
>> <drjones@redhat.com>
>> Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
>> memslot_perf_test
>>
>> On 30.05.2021 01:13, Maciej S. Szmigiero wrote:
>>> On 29.05.2021 12:20, Paolo Bonzini wrote:
>>>> On 28/05/21 21:51, Maciej S. Szmigiero wrote:
>>>>> On 28.05.2021 21:11, Paolo Bonzini wrote:
>>>>>> The memory that is allocated in vm_create is already mapped close
>>>>>> to GPA 0, because test_execute passes the requested memory to
>>>>>> prepare_vm.  This causes overlapping memory regions and the test
>>>>>> crashes.  For simplicity just move MEM_GPA higher.
>>>>>>
>>>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>>
>>>>> I am not sure that I understand the issue correctly, is
>>>>> vm_create_default() already reserving low GPAs (around 0x10000000)
>>>>> on some arches or run environments?
>>>>
>>>> It maps the number of pages you pass in the second argument, see
>>>> vm_create.
>>>>
>>>>     if (phy_pages != 0)
>>>>       vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
>>>>                                   0, 0, phy_pages, 0);
>>>>
>>>> In this case:
>>>>
>>>>     data->vm = vm_create_default(VCPU_ID, mempages, guest_code);
>>>>
>>>> called here:
>>>>
>>>>     if (!prepare_vm(data, nslots, maxslots, tdata->guest_code,
>>>>                     mem_size, slot_runtime)) {
>>>>
>>>> where mempages is mem_size, which is declared as:
>>>>
>>>>           uint64_t mem_size = tdata->mem_size ? : MEM_SIZE_PAGES;
>>>>
>>>> but actually a better fix is just to pass a small fixed value (e.g.
>>>> 1024) to vm_create_default, since all other regions are added by hand
>>>
>>> Yes, but the argument that is passed to vm_create_default() (mem_size
>>> in the case of the test) is not passed as phy_pages to vm_create().
>>> Rather, vm_create_with_vcpus() calculates some upper bound of extra
>>> memory that is needed to cover that much guest memory (including for
>>> its page tables).
>>>
>>> The biggest possible mem_size from memslot_perf_test is 512 MiB + 1
>>> page, according to my calculations this results in phy_pages of 1029
>>> (~4 MiB) in the x86-64 case and around 1540 (~6 MiB) in the s390x case
>>> (here I am not sure about the exact number, since s390x has some
>>> additional alignment requirements).
>>>
>>> Both values are well below 256 MiB (0x10000000UL), so I was wondering
>>> what kind of circumstances can make these allocations collide (maybe I
>>> am missing something in my analysis).
>>
>> I see now that there has been a patch merged last week called
>> "selftests: kvm: make allocation of extra memory take effect" by Zhenzhong
>> that now allocates also the whole memory size passed to
>> vm_create_default() (instead of just page tables for that much memory).
>>
>> The commit message of this patch says that "perf_test_util and
>> kvm_page_table_test use it to alloc extra memory currently", however both
>> kvm_page_table_test and lib/perf_test_util framework explicitly add the
>> required memory allocation by doing a vm_userspace_mem_region_add()
>> call for the same memory size that they pass to vm_create_default().
>>
>> So now they allocate this memory twice.
>>
>> @Zhenzhong: did you notice improper operation of either
>> kvm_page_table_test or perf_test_util-based tests (demand_paging_test,
>> dirty_log_perf_test,
>> memslot_modification_stress_test) before your patch?
> No
> 
>>
>> They seem to work fine for me without the patch (and I guess other people
>> would have noticed earlier, too, if they were broken).
>>
>> After this patch not only these tests allocate their memory twice but it is
>> harder to make vm_create_default() allocate the right amount of memory for
>> the page tables in cases where the test needs to explicitly use
>> vm_userspace_mem_region_add() for its allocations (because it wants the
>> allocation placed at a specific GPA or in a specific memslot).
>>
>> One has to basically open-code the page table size calculations from
>> vm_create_with_vcpus() in the particular test then, taking also into account
>> that vm_create_with_vcpus() will not only allocate the passed memory size
>> (calculated page tables size) but also behave like it was allocating space for
>> page tables for these page tables (even though the passed memory size itself
>> is supposed to cover them).
> Looks we have different understanding to the parameter extra_mem_pages of vm_create_default().
> 
> In your usage, extra_mem_pages is only used for page table calculations, real extra memory allocation
> happens in the extra call of vm_userspace_mem_region_add().
> 
> In my understanding, extra_mem_pages is used for a VM who wants a custom memory size in slot0,
> rather than the default DEFAULT_GUEST_PHY_PAGES size.
> 
> I understood your comments and do agree that my patch bring some trouble to your code, sorry for that.
> I'm fine to revert that patch and I think it's better to let the maintainers to decide what extra_mem_pages
> Is used for.

No problem, I just noticed the inconsistent behavior.
I've coded memslot_perf_test to the old one (like other tests are) and
was surprised there were guest memory allocation collisions.

> Thanks
> Zhenzhong

Thanks,
Maciej