From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-23.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B36AC56201 for ; Wed, 18 Nov 2020 18:49:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BC2B0246BF for ; Wed, 18 Nov 2020 18:49:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="n5IsvD97" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727084AbgKRSs3 (ORCPT ); Wed, 18 Nov 2020 13:48:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727049AbgKRSs2 (ORCPT ); Wed, 18 Nov 2020 13:48:28 -0500 Received: from mail-qk1-x744.google.com (mail-qk1-x744.google.com [IPv6:2607:f8b0:4864:20::744]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24284C0613D4 for ; Wed, 18 Nov 2020 10:48:28 -0800 (PST) Received: by mail-qk1-x744.google.com with SMTP id d28so2803503qka.11 for ; Wed, 18 Nov 2020 10:48:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=uMblKnV3Ocq7dC1W047IYjXjnFTwauF9Ud6IY1BWMbM=; b=n5IsvD97MqIegpRhLh3kSQ7CIogy1lj8ZrC4xoFJFjxszMzD6ehTKHGFxdXejw9FYx MNL81q/84nPw85WhKL+Nvi5R18H3RAOETOz0FovzTgK0L5poS12M9cOsWnfhntYOkvrv d4R3Z/cHWySF4GRVbWlmRqK6KqOOtNW9uSft5aKgTOz5vD4sldDUvoXpphVnLMJTgtra IU/Cn1ODMPRYuJQWUYc1GMdrO7udMIQv5rbWw23BMhH4l4BfB8Q9XE/NYk8YrO8dF/qr JPMfCjuH5TJkPI8Rf8wBDMKV2NgmmSmUwVeeyPMSQEQHdpsBq4uVX7jl6hWBzsNQfSCX uyWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=uMblKnV3Ocq7dC1W047IYjXjnFTwauF9Ud6IY1BWMbM=; b=sWJXOVW5SDCzYc/sj0PNYMsDl7O5R3f3wXMbIMjcadTZke0QN5wOWy/IDtYvA/6Ly6 tqHms1I5Zc7NNby5hVXoB2GAH7749vhDjQg0DgikYmiHYhlZMTcXi8n8IBLLF7fgWFcL KgovqvSuZu2OzFTYoPdbi/7KsabMQCiiwcmENVwpbG7k3xPGnofnxmEEC9Elh+WwGxgi wy+Jr1hgy4BG1fSJmGd0vnoQLRbE5zYkozRRyId2FeZ1KNUdG1pD5ymQGIl7SI9Eg3+T Uc4MF7vLKjiOdfVNWcflZ0y6ondjoc9WWo9IGnIUaPhxMoa2M9sDTxqhhkTROrYTa3vA BCAg== X-Gm-Message-State: AOAM531qH2vDRd+sgDY2qRCPfqMO0loatiOSgd1uyICKugtlGROSufYc VEW85/7xH0MopQkaq5xZ3n37tQ8gDBY33Rr045d5vdSKw6M= X-Google-Smtp-Source: ABdhPJzVNoG7L51Cokhm3JextBMA2Ii5kUdL1whMqtkzojJ7peUySHiYUMT2M8/d0EBErbMj2HNd0ovvsuvrIfZPDu4= X-Received: by 2002:a05:620a:62b:: with SMTP id 11mr6642788qkv.229.1605725306460; Wed, 18 Nov 2020 10:48:26 -0800 (PST) MIME-Version: 1.0 References: <20201105223823.850068-1-pshier@google.com> In-Reply-To: <20201105223823.850068-1-pshier@google.com> From: Peter Shier Date: Wed, 18 Nov 2020 10:48:14 -0800 Message-ID: Subject: Re: [PATCH] KVM: selftests: Test IPI to halted vCPU in xAPIC while backing page moves To: kvm@vger.kernel.org Cc: Paolo Bonzini , Andrew Jones , Jim Mattson , Ricardo Koller Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On Thu, Nov 5, 2020 at 2:38 PM Peter Shier wrote: > > When a guest is using xAPIC KVM allocates a backing page for the required > EPT entry for the APIC access address set in the VMCS. If mm decides to > move that page the KVM mmu notifier will update the VMCS with the new > HPA. This test induces a page move to test that APIC access continues to > work correctly. It is a directed test for > commit e649b3f0188f "KVM: x86: Fix APIC page invalidation race". > > Tested: ran for 1 hour on a skylake, migrating backing page every 1ms > > Depends on patch "selftests: kvm: Add exception handling to selftests" > from aaronlewis@google.com that has not yet been queued. > > Signed-off-by: Peter Shier > Reviewed-by: Jim Mattson > Reviewed-by: Ricardo Koller > --- > tools/testing/selftests/kvm/.gitignore | 1 + > tools/testing/selftests/kvm/Makefile | 1 + > tools/testing/selftests/kvm/include/numaif.h | 55 ++ > .../selftests/kvm/include/x86_64/processor.h | 20 + > .../selftests/kvm/x86_64/xapic_ipi_test.c | 544 ++++++++++++++++++ > 5 files changed, 621 insertions(+) > create mode 100644 tools/testing/selftests/kvm/include/numaif.h > create mode 100644 tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c > > diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore > index 307ceaadbbb9..56f8367e2ace 100644 > --- a/tools/testing/selftests/kvm/.gitignore > +++ b/tools/testing/selftests/kvm/.gitignore > @@ -19,6 +19,7 @@ > /x86_64/vmx_dirty_log_test > /x86_64/vmx_set_nested_state_test > /x86_64/vmx_tsc_adjust_test > +/x86_64/xapic_ipi_test > /x86_64/xss_msr_test > /clear_dirty_log_test > /demand_paging_test > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile > index aaaf992faf87..19283554ef5e 100644 > --- a/tools/testing/selftests/kvm/Makefile > +++ b/tools/testing/selftests/kvm/Makefile > @@ -53,6 +53,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test > TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test > TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test > TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test > +TEST_GEN_PROGS_x86_64 += x86_64/xapic_ipi_test > TEST_GEN_PROGS_x86_64 += x86_64/xss_msr_test > TEST_GEN_PROGS_x86_64 += x86_64/debug_regs > TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test > diff --git a/tools/testing/selftests/kvm/include/numaif.h b/tools/testing/selftests/kvm/include/numaif.h > new file mode 100644 > index 000000000000..b020547403fd > --- /dev/null > +++ b/tools/testing/selftests/kvm/include/numaif.h > @@ -0,0 +1,55 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * tools/testing/selftests/kvm/include/numaif.h > + * > + * Copyright (C) 2020, Google LLC. > + * > + * This work is licensed under the terms of the GNU GPL, version 2. > + * > + * Header file that provides access to NUMA API functions not explicitly > + * exported to user space. > + */ > + > +#ifndef SELFTEST_KVM_NUMAIF_H > +#define SELFTEST_KVM_NUMAIF_H > + > +#define __NR_get_mempolicy 239 > +#define __NR_migrate_pages 256 > + > +/* System calls */ > +long get_mempolicy(int *policy, const unsigned long *nmask, > + unsigned long maxnode, void *addr, int flags) > +{ > + return syscall(__NR_get_mempolicy, policy, nmask, > + maxnode, addr, flags); > +} > + > +long migrate_pages(int pid, unsigned long maxnode, > + const unsigned long *frommask, > + const unsigned long *tomask) > +{ > + return syscall(__NR_migrate_pages, pid, maxnode, frommask, tomask); > +} > + > +/* Policies */ > +#define MPOL_DEFAULT 0 > +#define MPOL_PREFERRED 1 > +#define MPOL_BIND 2 > +#define MPOL_INTERLEAVE 3 > + > +#define MPOL_MAX MPOL_INTERLEAVE > + > +/* Flags for get_mem_policy */ > +#define MPOL_F_NODE (1<<0) /* return next il node or node of address */ > + /* Warning: MPOL_F_NODE is unsupported and > + * subject to change. Don't use. > + */ > +#define MPOL_F_ADDR (1<<1) /* look up vma using address */ > +#define MPOL_F_MEMS_ALLOWED (1<<2) /* query nodes allowed in cpuset */ > + > +/* Flags for mbind */ > +#define MPOL_MF_STRICT (1<<0) /* Verify existing pages in the mapping */ > +#define MPOL_MF_MOVE (1<<1) /* Move pages owned by this process to conform to mapping */ > +#define MPOL_MF_MOVE_ALL (1<<2) /* Move every page to conform to mapping */ > + > +#endif /* SELFTEST_KVM_NUMAIF_H */ > diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h > index 02530dc6339b..313ec00b1f7c 100644 > --- a/tools/testing/selftests/kvm/include/x86_64/processor.h > +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h > @@ -377,8 +377,27 @@ void vm_handle_exception(struct kvm_vm *vm, int vector, > #define X86_CR0_CD (1UL<<30) /* Cache Disable */ > #define X86_CR0_PG (1UL<<31) /* Paging */ > > +#define APIC_DEFAULT_GPA 0xfee00000ULL > + > +/* APIC base address MSR and fields */ > +#define MSR_IA32_APICBASE 0x0000001b > +#define MSR_IA32_APICBASE_BSP (1<<8) > +#define MSR_IA32_APICBASE_EXTD (1<<10) > +#define MSR_IA32_APICBASE_ENABLE (1<<11) > +#define MSR_IA32_APICBASE_BASE (0xfffff<<12) > +#define GET_APIC_BASE(x) (((x) >> 12) << 12) > + > #define APIC_BASE_MSR 0x800 > #define X2APIC_ENABLE (1UL << 10) > +#define APIC_ID 0x20 > +#define APIC_LVR 0x30 > +#define GET_APIC_ID_FIELD(x) (((x) >> 24) & 0xFF) > +#define APIC_TASKPRI 0x80 > +#define APIC_PROCPRI 0xA0 > +#define APIC_EOI 0xB0 > +#define APIC_SPIV 0xF0 > +#define APIC_SPIV_FOCUS_DISABLED (1 << 9) > +#define APIC_SPIV_APIC_ENABLED (1 << 8) > #define APIC_ICR 0x300 > #define APIC_DEST_SELF 0x40000 > #define APIC_DEST_ALLINC 0x80000 > @@ -403,6 +422,7 @@ void vm_handle_exception(struct kvm_vm *vm, int vector, > #define APIC_DM_EXTINT 0x00700 > #define APIC_VECTOR_MASK 0x000FF > #define APIC_ICR2 0x310 > +#define SET_APIC_DEST_FIELD(x) ((x) << 24) > > /* VMX_EPT_VPID_CAP bits */ > #define VMX_EPT_VPID_CAP_AD_BITS (1ULL << 21) > diff --git a/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c b/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c > new file mode 100644 > index 000000000000..47c0ec975330 > --- /dev/null > +++ b/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c > @@ -0,0 +1,544 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * xapic_ipi_test > + * > + * Copyright (C) 2020, Google LLC. > + * > + * This work is licensed under the terms of the GNU GPL, version 2. > + * > + * Test that when the APIC is in xAPIC mode, a vCPU can send an IPI to wake > + * another vCPU that is halted when KVM's backing page for the APIC access > + * address has been moved by mm. > + * > + * The test starts two vCPUs: one that sends IPIs and one that continually > + * executes HLT. The sender checks that the halter has woken from the HLT and > + * has reentered HLT before sending the next IPI. While the vCPUs are running, > + * the host continually calls migrate_pages to move all of the process' pages > + * amongst the available numa nodes on the machine. > + * > + * Migration is a command line option. When used on non-numa machines will > + * exit with error. Test is still usefull on non-numa for testing IPIs. > + */ > + > +#define _GNU_SOURCE /* for program_invocation_short_name */ > +#include > +#include > +#include > +#include > +#include > + > +#include "kvm_util.h" > +#include "numaif.h" > +#include "processor.h" > +#include "test_util.h" > +#include "vmx.h" > + > +/* Default running time for the test */ > +#define DEFAULT_RUN_SECS 3 > + > +/* Default delay between migrate_pages calls (microseconds) */ > +#define DEFAULT_DELAY_USECS 500000 > + > +#define HALTER_VCPU_ID 0 > +#define SENDER_VCPU_ID 1 > + > +volatile uint32_t *apic_base = (volatile uint32_t *)APIC_DEFAULT_GPA; > + > +/* > + * Vector for IPI from sender vCPU to halting vCPU. > + * Value is arbitrary and was chosen for the alternating bit pattern. Any > + * value should work. > + */ > +#define IPI_VECTOR 0xa5 > + > +/* > + * Incremented in the IPI handler. Provides evidence to the sender that the IPI > + * arrived at the destination > + */ > +static volatile uint64_t ipis_rcvd; > + > +/* Data struct shared between host main thread and vCPUs */ > +struct test_data_page { > + uint32_t halter_apic_id; > + volatile uint64_t hlt_count; > + volatile uint64_t wake_count; > + uint64_t ipis_sent; > + uint64_t migrations_attempted; > + uint64_t migrations_completed; > + uint32_t icr; > + uint32_t icr2; > + uint32_t halter_tpr; > + uint32_t halter_ppr; > + > + /* > + * Record local version register as a cross-check that APIC access > + * worked. Value should match what KVM reports (APIC_VERSION in > + * arch/x86/kvm/lapic.c). If test is failing, check that values match > + * to determine whether APIC access exits are working. > + */ > + uint32_t halter_lvr; > +}; > + > +struct thread_params { > + struct test_data_page *data; > + struct kvm_vm *vm; > + uint32_t vcpu_id; > + uint64_t *pipis_rcvd; /* host address of ipis_rcvd global */ > +}; > + > +uint32_t read_apic_reg(uint reg) > +{ > + return apic_base[reg >> 2]; > +} > + > +void write_apic_reg(uint reg, uint32_t val) > +{ > + apic_base[reg >> 2] = val; > +} > + > +void disable_apic(void) > +{ > + wrmsr(MSR_IA32_APICBASE, > + rdmsr(MSR_IA32_APICBASE) & > + ~(MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD)); > +} > + > +void enable_xapic(void) > +{ > + uint64_t val = rdmsr(MSR_IA32_APICBASE); > + > + /* Per SDM: to enable xAPIC when in x2APIC must first disable APIC */ > + if (val & MSR_IA32_APICBASE_EXTD) { > + disable_apic(); > + wrmsr(MSR_IA32_APICBASE, > + rdmsr(MSR_IA32_APICBASE) | MSR_IA32_APICBASE_ENABLE); > + } else if (!(val & MSR_IA32_APICBASE_ENABLE)) { > + wrmsr(MSR_IA32_APICBASE, val | MSR_IA32_APICBASE_ENABLE); > + } > + > + /* > + * Per SDM: reset value of spurious interrupt vector register has the > + * APIC software enabled bit=0. It must be enabled in addition to the > + * enable bit in the MSR. > + */ > + val = read_apic_reg(APIC_SPIV) | APIC_SPIV_APIC_ENABLED; > + write_apic_reg(APIC_SPIV, val); > +} > + > +void verify_apic_base_addr(void) > +{ > + uint64_t msr = rdmsr(MSR_IA32_APICBASE); > + uint64_t base = GET_APIC_BASE(msr); > + > + GUEST_ASSERT(base == APIC_DEFAULT_GPA); > +} > + > +static void halter_guest_code(struct test_data_page *data) > +{ > + verify_apic_base_addr(); > + enable_xapic(); > + > + data->halter_apic_id = GET_APIC_ID_FIELD(read_apic_reg(APIC_ID)); > + data->halter_lvr = read_apic_reg(APIC_LVR); > + > + /* > + * Loop forever HLTing and recording halts & wakes. Disable interrupts > + * each time around to minimize window between signaling the pending > + * halt to the sender vCPU and executing the halt. No need to disable on > + * first run as this vCPU executes first and the host waits for it to > + * signal going into first halt before starting the sender vCPU. Record > + * TPR and PPR for diagnostic purposes in case the test fails. > + */ > + for (;;) { > + data->halter_tpr = read_apic_reg(APIC_TASKPRI); > + data->halter_ppr = read_apic_reg(APIC_PROCPRI); > + data->hlt_count++; > + asm volatile("sti; hlt; cli"); > + data->wake_count++; > + } > +} > + > +/* > + * Runs on halter vCPU when IPI arrives. Write an arbitrary non-zero value to > + * enable diagnosing errant writes to the APIC access address backing page in > + * case of test failure. > + */ > +static void guest_ipi_handler(struct ex_regs *regs) > +{ > + ipis_rcvd++; > + write_apic_reg(APIC_EOI, 77); > +} > + > +static void sender_guest_code(struct test_data_page *data) > +{ > + uint64_t last_wake_count; > + uint64_t last_hlt_count; > + uint64_t last_ipis_rcvd_count; > + uint32_t icr_val; > + uint32_t icr2_val; > + uint64_t tsc_start; > + > + verify_apic_base_addr(); > + enable_xapic(); > + > + /* > + * Init interrupt command register for sending IPIs > + * > + * Delivery mode=fixed, per SDM: > + * "Delivers the interrupt specified in the vector field to the target > + * processor." > + * > + * Destination mode=physical i.e. specify target by its local APIC > + * ID. This vCPU assumes that the halter vCPU has already started and > + * set data->halter_apic_id. > + */ > + icr_val = (APIC_DEST_PHYSICAL | APIC_DM_FIXED | IPI_VECTOR); > + icr2_val = SET_APIC_DEST_FIELD(data->halter_apic_id); > + data->icr = icr_val; > + data->icr2 = icr2_val; > + > + last_wake_count = data->wake_count; > + last_hlt_count = data->hlt_count; > + last_ipis_rcvd_count = ipis_rcvd; > + for (;;) { > + /* > + * Send IPI to halter vCPU. > + * First IPI can be sent unconditionally because halter vCPU > + * starts earlier. > + */ > + write_apic_reg(APIC_ICR2, icr2_val); > + write_apic_reg(APIC_ICR, icr_val); > + data->ipis_sent++; > + > + /* > + * Wait up to ~1 sec for halter to indicate that it has: > + * 1. Received the IPI > + * 2. Woken up from the halt > + * 3. Gone back into halt > + * Current CPUs typically run at 2.x Ghz which is ~2 > + * billion ticks per second. > + */ > + tsc_start = rdtsc(); > + while (rdtsc() - tsc_start < 2000000000) { > + if ((ipis_rcvd != last_ipis_rcvd_count) && > + (data->wake_count != last_wake_count) && > + (data->hlt_count != last_hlt_count)) > + break; > + } > + > + GUEST_ASSERT((ipis_rcvd != last_ipis_rcvd_count) && > + (data->wake_count != last_wake_count) && > + (data->hlt_count != last_hlt_count)); > + > + last_wake_count = data->wake_count; > + last_hlt_count = data->hlt_count; > + last_ipis_rcvd_count = ipis_rcvd; > + } > +} > + > +static void *vcpu_thread(void *arg) > +{ > + struct thread_params *params = (struct thread_params *)arg; > + struct ucall uc; > + int old; > + int r; > + unsigned int exit_reason; > + > + r = pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old); > + TEST_ASSERT(r == 0, > + "pthread_setcanceltype failed on vcpu_id=%u with errno=%d", > + params->vcpu_id, r); > + > + fprintf(stderr, "vCPU thread running vCPU %u\n", params->vcpu_id); > + vcpu_run(params->vm, params->vcpu_id); > + exit_reason = vcpu_state(params->vm, params->vcpu_id)->exit_reason; > + > + TEST_ASSERT(exit_reason == KVM_EXIT_IO, > + "vCPU %u exited with unexpected exit reason %u-%s, expected KVM_EXIT_IO", > + params->vcpu_id, exit_reason, exit_reason_str(exit_reason)); > + > + if (get_ucall(params->vm, params->vcpu_id, &uc) == UCALL_ABORT) { > + TEST_ASSERT(false, > + "vCPU %u exited with error: %s.\n" > + "Sending vCPU sent %lu IPIs to halting vCPU\n" > + "Halting vCPU halted %lu times, woke %lu times, received %lu IPIs.\n" > + "Halter TPR=%#x PPR=%#x LVR=%#x\n" > + "Migrations attempted: %lu\n" > + "Migrations completed: %lu\n", > + params->vcpu_id, (const char *)uc.args[0], > + params->data->ipis_sent, params->data->hlt_count, > + params->data->wake_count, > + *params->pipis_rcvd, params->data->halter_tpr, > + params->data->halter_ppr, params->data->halter_lvr, > + params->data->migrations_attempted, > + params->data->migrations_completed); > + } > + > + return NULL; > +} > + > +static void cancel_join_vcpu_thread(pthread_t thread, uint32_t vcpu_id) > +{ > + void *retval; > + int r; > + > + r = pthread_cancel(thread); > + TEST_ASSERT(r == 0, > + "pthread_cancel on vcpu_id=%d failed with errno=%d", > + vcpu_id, r); > + > + r = pthread_join(thread, &retval); > + TEST_ASSERT(r == 0, > + "pthread_join on vcpu_id=%d failed with errno=%d", > + vcpu_id, r); > + TEST_ASSERT(retval == PTHREAD_CANCELED, > + "expected retval=%p, got %p", PTHREAD_CANCELED, > + retval); > +} > + > +void do_migrations(struct test_data_page *data, int run_secs, int delay_usecs, > + uint64_t *pipis_rcvd) > +{ > + long pages_not_moved; > + unsigned long nodemask = 0; > + unsigned long nodemasks[sizeof(nodemask) * 8]; > + int nodes = 0; > + time_t start_time, last_update, now; > + time_t interval_secs = 1; > + int i, r; > + int from, to; > + unsigned long bit; > + uint64_t hlt_count; > + uint64_t wake_count; > + uint64_t ipis_sent; > + > + fprintf(stderr, "Calling migrate_pages every %d microseconds\n", > + delay_usecs); > + > + /* Get set of first 64 numa nodes available */ > + r = get_mempolicy(NULL, &nodemask, sizeof(nodemask) * 8, > + 0, MPOL_F_MEMS_ALLOWED); > + TEST_ASSERT(r == 0, "get_mempolicy failed errno=%d", errno); > + > + fprintf(stderr, "Numa nodes found amongst first %lu possible nodes " > + "(each 1-bit indicates node is present): %#lx\n", > + sizeof(nodemask) * 8, nodemask); > + > + /* Init array of masks containing a single-bit in each, one for each > + * available node. migrate_pages called below requires specifying nodes > + * as bit masks. > + */ > + for (i = 0, bit = 1; i < sizeof(nodemask) * 8; i++, bit <<= 1) { > + if (nodemask & bit) { > + nodemasks[nodes] = nodemask & bit; > + nodes++; > + } > + } > + > + TEST_ASSERT(nodes > 1, > + "Did not find at least 2 numa nodes. Can't do migration\n"); > + > + fprintf(stderr, "Migrating amongst %d nodes found\n", nodes); > + > + from = 0; > + to = 1; > + start_time = time(NULL); > + last_update = start_time; > + > + ipis_sent = data->ipis_sent; > + hlt_count = data->hlt_count; > + wake_count = data->wake_count; > + > + while ((int)(time(NULL) - start_time) < run_secs) { > + data->migrations_attempted++; > + > + /* > + * migrate_pages with PID=0 will migrate all pages of this > + * process between the nodes specified as bitmasks. The page > + * backing the APIC access address belongs to this process > + * because it is allocated by KVM in the context of the > + * KVM_CREATE_VCPU ioctl. If that assumption ever changes this > + * test may break or give a false positive signal. > + */ > + pages_not_moved = migrate_pages(0, sizeof(nodemasks[from]), > + &nodemasks[from], > + &nodemasks[to]); > + if (pages_not_moved < 0) > + fprintf(stderr, > + "migrate_pages failed, errno=%d\n", errno); > + else if (pages_not_moved > 0) > + fprintf(stderr, > + "migrate_pages could not move %ld pages\n", > + pages_not_moved); > + else > + data->migrations_completed++; > + > + from = to; > + to++; > + if (to == nodes) > + to = 0; > + > + now = time(NULL); > + if (((now - start_time) % interval_secs == 0) && > + (now != last_update)) { > + last_update = now; > + fprintf(stderr, > + "%lu seconds: Migrations attempted=%lu completed=%lu, " > + "IPIs sent=%lu received=%lu, HLTs=%lu wakes=%lu\n", > + now - start_time, data->migrations_attempted, > + data->migrations_completed, > + data->ipis_sent, *pipis_rcvd, > + data->hlt_count, data->wake_count); > + > + TEST_ASSERT(ipis_sent != data->ipis_sent && > + hlt_count != data->hlt_count && > + wake_count != data->wake_count, > + "IPI, HLT and wake count have not increased " > + "in the last %lu seconds. " > + "HLTer is likely hung.\n", interval_secs); > + > + ipis_sent = data->ipis_sent; > + hlt_count = data->hlt_count; > + wake_count = data->wake_count; > + } > + usleep(delay_usecs); > + } > +} > + > +void get_cmdline_args(int argc, char *argv[], int *run_secs, > + bool *migrate, int *delay_usecs) > +{ > + for (;;) { > + int opt = getopt(argc, argv, "s:d:m"); > + > + if (opt == -1) > + break; > + switch (opt) { > + case 's': > + *run_secs = parse_size(optarg); > + break; > + case 'm': > + *migrate = true; > + break; > + case 'd': > + *delay_usecs = parse_size(optarg); > + break; > + default: > + TEST_ASSERT(false, > + "Usage: -s . Default is %d seconds.\n" > + "-m adds calls to migrate_pages while vCPUs are running." > + " Default is no migrations.\n" > + "-d - delay between migrate_pages() calls." > + " Default is %d microseconds.\n", > + DEFAULT_RUN_SECS, DEFAULT_DELAY_USECS); > + } > + } > +} > + > +int main(int argc, char *argv[]) > +{ > + int r; > + int wait_secs; > + const int max_halter_wait = 10; > + int run_secs = 0; > + int delay_usecs = 0; > + struct test_data_page *data; > + vm_vaddr_t test_data_page_vaddr; > + bool migrate = false; > + pthread_t threads[2]; > + struct thread_params params[2]; > + struct kvm_vm *vm; > + uint64_t *pipis_rcvd; > + > + get_cmdline_args(argc, argv, &run_secs, &migrate, &delay_usecs); > + if (run_secs <= 0) > + run_secs = DEFAULT_RUN_SECS; > + if (delay_usecs <= 0) > + delay_usecs = DEFAULT_DELAY_USECS; > + > + vm = vm_create_default(HALTER_VCPU_ID, 0, halter_guest_code); > + params[0].vm = vm; > + params[1].vm = vm; > + > + vm_init_descriptor_tables(vm); > + vcpu_init_descriptor_tables(vm, HALTER_VCPU_ID); > + vm_handle_exception(vm, IPI_VECTOR, guest_ipi_handler); > + > + virt_pg_map(vm, APIC_DEFAULT_GPA, APIC_DEFAULT_GPA, 0); > + > + vm_vcpu_add_default(vm, SENDER_VCPU_ID, sender_guest_code); > + > + test_data_page_vaddr = vm_vaddr_alloc(vm, 0x1000, 0x1000, 0, 0); > + data = > + (struct test_data_page *)addr_gva2hva(vm, test_data_page_vaddr); > + memset(data, 0, sizeof(*data)); > + params[0].data = data; > + params[1].data = data; > + > + vcpu_args_set(vm, HALTER_VCPU_ID, 1, test_data_page_vaddr); > + vcpu_args_set(vm, SENDER_VCPU_ID, 1, test_data_page_vaddr); > + > + pipis_rcvd = (uint64_t *)addr_gva2hva(vm, (uint64_t)&ipis_rcvd); > + params[0].pipis_rcvd = pipis_rcvd; > + params[1].pipis_rcvd = pipis_rcvd; > + > + /* Start halter vCPU thread and wait for it to execute first HLT. */ > + params[0].vcpu_id = HALTER_VCPU_ID; > + r = pthread_create(&threads[0], NULL, vcpu_thread, ¶ms[0]); > + TEST_ASSERT(r == 0, > + "pthread_create halter failed errno=%d", errno); > + fprintf(stderr, "Halter vCPU thread started\n"); > + > + wait_secs = 0; > + while ((wait_secs < max_halter_wait) && !data->hlt_count) { > + sleep(1); > + wait_secs++; > + } > + > + TEST_ASSERT(data->hlt_count, > + "Halter vCPU did not execute first HLT within %d seconds", > + max_halter_wait); > + > + fprintf(stderr, > + "Halter vCPU thread reported its APIC ID: %u after %d seconds.\n", > + data->halter_apic_id, wait_secs); > + > + params[1].vcpu_id = SENDER_VCPU_ID; > + r = pthread_create(&threads[1], NULL, vcpu_thread, ¶ms[1]); > + TEST_ASSERT(r == 0, "pthread_create sender failed errno=%d", errno); > + > + fprintf(stderr, > + "IPI sender vCPU thread started. Letting vCPUs run for %d seconds.\n", > + run_secs); > + > + if (!migrate) > + sleep(run_secs); > + else > + do_migrations(data, run_secs, delay_usecs, pipis_rcvd); > + > + /* > + * Cancel threads and wait for them to stop. > + */ > + cancel_join_vcpu_thread(threads[0], HALTER_VCPU_ID); > + cancel_join_vcpu_thread(threads[1], SENDER_VCPU_ID); > + > + fprintf(stderr, > + "Test successful after running for %d seconds.\n" > + "Sending vCPU sent %lu IPIs to halting vCPU\n" > + "Halting vCPU halted %lu times, woke %lu times, received %lu IPIs.\n" > + "Halter APIC ID=%#x\n" > + "Sender ICR value=%#x ICR2 value=%#x\n" > + "Halter TPR=%#x PPR=%#x LVR=%#x\n" > + "Migrations attempted: %lu\n" > + "Migrations completed: %lu\n", > + run_secs, data->ipis_sent, > + data->hlt_count, data->wake_count, *pipis_rcvd, > + data->halter_apic_id, > + data->icr, data->icr2, > + data->halter_tpr, data->halter_ppr, data->halter_lvr, > + data->migrations_attempted, data->migrations_completed); > + > + kvm_vm_free(vm); > + > + return 0; > +} > -- > > Ping on this test patch. Thx