qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Roman Bolshakov <r.bolshakov@yadro.com>
To: Claudio Fontana <cfontana@suse.de>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Thomas Huth" <thuth@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	haxm-team@intel.com, "Marcelo Tosatti" <mtosatti@redhat.com>,
	qemu-devel@nongnu.org, "Cameron Esfahani" <dirty@apple.com>,
	"Colin Xu" <colin.xu@intel.com>,
	"Wenchao Wang" <wenchao.wang@intel.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Sunil Muthuswamy" <sunilmut@microsoft.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Richard Henderson" <rth@twiddle.net>
Subject: Re: [RFC v3 4/4] cpus: extract out accel-specific code to each accel
Date: Tue, 26 May 2020 22:16:00 +0300	[thread overview]
Message-ID: <20200526191600.GA4851@SPB-NB-133.local> (raw)
In-Reply-To: <20200525145440.29728-5-cfontana@suse.de>

On Mon, May 25, 2020 at 04:54:40PM +0200, Claudio Fontana wrote:
> each accelerator registers a new "CpusAccelInterface"
> on initialization, providing functions for starting a vcpu,
> kicking a vcpu, and sychronizing state.
> 
> This way the code in cpus.cc is now all general softmmu code,
> nothing (or almost nothing) accelerator-specific anymore.
> 
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  MAINTAINERS                          |   1 +
>  accel/kvm/Makefile.objs              |   2 +
>  accel/kvm/kvm-all.c                  |  15 +-
>  accel/kvm/kvm-cpus-interface.c       |  94 ++++
>  accel/kvm/kvm-cpus-interface.h       |   8 +
>  accel/qtest.c                        |  82 ++++
>  accel/stubs/kvm-stub.c               |   3 +-
>  accel/tcg/Makefile.objs              |   1 +
>  accel/tcg/tcg-all.c                  |  12 +-
>  accel/tcg/tcg-cpus-interface.c       | 523 ++++++++++++++++++++
>  accel/tcg/tcg-cpus-interface.h       |   8 +
>  hw/core/cpu.c                        |   1 +
>  include/sysemu/cpus.h                |  32 ++
>  include/sysemu/hvf.h                 |   1 -
>  include/sysemu/hw_accel.h            |  57 +--
>  include/sysemu/kvm.h                 |   2 +-
>  softmmu/cpus.c                       | 911 +++--------------------------------
>  stubs/Makefile.objs                  |   1 +
>  stubs/cpu-synchronize-state.c        |  15 +
>  target/i386/Makefile.objs            |   7 +-
>  target/i386/hax-all.c                |   6 +-
>  target/i386/hax-cpus-interface.c     |  85 ++++
>  target/i386/hax-cpus-interface.h     |   8 +
>  target/i386/hax-i386.h               |   2 +
>  target/i386/hax-posix.c              |  12 +
>  target/i386/hax-windows.c            |  20 +
>  target/i386/hvf/Makefile.objs        |   2 +-
>  target/i386/hvf/hvf-cpus-interface.c |  92 ++++
>  target/i386/hvf/hvf-cpus-interface.h |   8 +
>  target/i386/hvf/hvf.c                |   5 +-
>  target/i386/whpx-all.c               |   3 +
>  target/i386/whpx-cpus-interface.c    |  96 ++++
>  target/i386/whpx-cpus-interface.h    |   8 +
>  33 files changed, 1220 insertions(+), 903 deletions(-)
>  create mode 100644 accel/kvm/kvm-cpus-interface.c
>  create mode 100644 accel/kvm/kvm-cpus-interface.h
>  create mode 100644 accel/tcg/tcg-cpus-interface.c
>  create mode 100644 accel/tcg/tcg-cpus-interface.h
>  create mode 100644 stubs/cpu-synchronize-state.c
>  create mode 100644 target/i386/hax-cpus-interface.c
>  create mode 100644 target/i386/hax-cpus-interface.h
>  create mode 100644 target/i386/hvf/hvf-cpus-interface.c
>  create mode 100644 target/i386/hvf/hvf-cpus-interface.h
>  create mode 100644 target/i386/whpx-cpus-interface.c
>  create mode 100644 target/i386/whpx-cpus-interface.h

Hi Claudio,

Overall it looks good.

I wonder if the new structure should get singular form, i.e.
softmmu/cpu.c instead of softmmu/cpus.c

Perhaps cpus.c had plural form because it was related to implementation
of multiple CPUs/accels. After the split, each accel got it's own
implementation of accel interface.

"<accel>-cpus-interface.c" contains implementation rather than interface
it's a bit confusing. Perhaps it should be called: "<accel>-cpu.c" or
even "<accel>-accel.c".

By the way, If we use registration for each accel, does it mean that
include/sysemu/<accel>.h and accel stubs are no longer needed in shared
location?

There's an AccelClass in accel/accel.c, I wonder if it should be re-used
for accel CPU registration? I don't know but may be generic the leftover
of cpus.c also belongs to accel/ rather than softmmu/?

>  
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index 149de000a0..cae22afe4d 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -4,7 +4,39 @@
>  #include "qemu/timer.h"
>  
>  /* cpus.c */
> +
> +/* CPU execution threads */
> +
> +typedef struct CpusAccelInterface {
> +    void (*create_vcpu_thread)(CPUState *cpu);
> +    void (*kick_vcpu_thread)(CPUState *cpu);
> +
> +    void (*cpu_synchronize_post_reset)(CPUState *cpu);
> +    void (*cpu_synchronize_post_init)(CPUState *cpu);
> +    void (*cpu_synchronize_state)(CPUState *cpu);
> +    void (*cpu_synchronize_pre_loadvm)(CPUState *cpu);
> +} CpusAccelInterface;


I think plural name may be replaced to singular. Interface suffix
doesn't seem to be used in QEMU.

cpu_ and _vcpu are sort meaning the same and may be replaced to generic
cpu_ prefix. There's a CPUState, CPU<Arch>State, and IMO shorter
CPUAccel seems to match the naming. I also don't know if cpu_ prefix
should be kept.

So here's how I see the interface:

typedef struct CPUAccel {
    void (*create_thread)(CPUState *cpu);
    void (*kick_thread)(CPUState *cpu);

    void (*synchronize_post_reset)(CPUState *cpu);
    void (*synchronize_post_init)(CPUState *cpu);
    void (*synchronize_state)(CPUState *cpu);
    void (*synchronize_pre_loadvm)(CPUState *cpu);
} CPUAccel;


> +
> +/* register accel-specific interface */
> +void cpus_register_accel_interface(CpusAccelInterface *i);
> +
> +/* interface available for cpus accelerator threads */
> +
> +/* For temporary buffers for forming a name */
> +#define VCPU_THREAD_NAME_SIZE 16
> +
> +void cpus_kick_thread(CPUState *cpu);

If it's addressing a single CPU should it be singular too, i.e.
cpu_kick_thread?

> +bool cpu_thread_is_idle(CPUState *cpu);
>  bool all_cpu_threads_idle(void);
> +bool cpu_can_run(CPUState *cpu);
> +void qemu_wait_io_event_common(CPUState *cpu);
> +void qemu_wait_io_event(CPUState *cpu);
> +void cpu_thread_signal_created(CPUState *cpu);
> +void cpu_thread_signal_destroyed(CPUState *cpu);
> +void cpu_handle_guest_debug(CPUState *cpu);
> +
> +/* end interface for cpus accelerator threads */
> +
>  bool qemu_in_vcpu_thread(void);
>  void qemu_init_cpu_loop(void);
>  void resume_all_vcpus(void);
> diff --git a/target/i386/hvf/Makefile.objs b/target/i386/hvf/Makefile.objs
> index 927b86bc67..bdbc2c0227 100644
> --- a/target/i386/hvf/Makefile.objs
> +++ b/target/i386/hvf/Makefile.objs
> @@ -1,2 +1,2 @@
> -obj-y += hvf.o
> +obj-y += hvf.o hvf-cpus-interface.o
>  obj-y += x86.o x86_cpuid.o x86_decode.o x86_descr.o x86_emu.o x86_flags.o x86_mmu.o x86hvf.o x86_task.o
> diff --git a/target/i386/hvf/hvf-cpus-interface.c b/target/i386/hvf/hvf-cpus-interface.c
> new file mode 100644
> index 0000000000..54bfe307c1
> --- /dev/null
> +++ b/target/i386/hvf/hvf-cpus-interface.c
> @@ -0,0 +1,92 @@
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/main-loop.h"
> +#include "sysemu/hvf.h"
> +#include "sysemu/runstate.h"
> +#include "sysemu/cpus.h"
> +#include "qemu/guest-random.h"
> +
> +#include "hvf-cpus-interface.h"
> +
> +/*
> + * The HVF-specific vCPU thread function. This one should only run when the host
> + * CPU supports the VMX "unrestricted guest" feature.
> + */
> +static void *hvf_cpu_thread_fn(void *arg)
> +{
> +    CPUState *cpu = arg;
> +
> +    int r;
> +
> +    assert(hvf_enabled());
> +
> +    rcu_register_thread();
> +
> +    qemu_mutex_lock_iothread();
> +    qemu_thread_get_self(cpu->thread);
> +
> +    cpu->thread_id = qemu_get_thread_id();
> +    cpu->can_do_io = 1;
> +    current_cpu = cpu;
> +
> +    hvf_init_vcpu(cpu);
> +
> +    /* signal CPU creation */
> +    cpu_thread_signal_created(cpu);
> +    qemu_guest_random_seed_thread_part2(cpu->random_seed);
> +
> +    do {
> +        if (cpu_can_run(cpu)) {
> +            r = hvf_vcpu_exec(cpu);
> +            if (r == EXCP_DEBUG) {
> +                cpu_handle_guest_debug(cpu);
> +            }
> +        }
> +        qemu_wait_io_event(cpu);
> +    } while (!cpu->unplug || cpu_can_run(cpu));
> +
> +    hvf_vcpu_destroy(cpu);
> +    cpu_thread_signal_destroyed(cpu);
> +    qemu_mutex_unlock_iothread();
> +    rcu_unregister_thread();
> +    return NULL;
> +}
> +
> +static void hvf_kick_vcpu_thread(CPUState *cpu)
> +{
> +    cpus_kick_thread(cpu);


I think we should leave it empty since it's not really implemented for
HVF. Here's a WIP implementation I'm yet to send:

https://github.com/roolebo/qemu/commit/4437a4b8af31b0adaa58375f09e0b8547507f64f#diff-5e6ad1baa140ca4a1743b5a2f1a664b5R970

> +}
> +
> +static void hvf_cpu_synchronize_noop(CPUState *cpu)
> +{
> +}
> +
> +static void hvf_start_vcpu_thread(CPUState *cpu)
> +{
> +    char thread_name[VCPU_THREAD_NAME_SIZE];
> +
> +    /*
> +     * HVF currently does not support TCG, and only runs in
> +     * unrestricted-guest mode.
> +     */
> +    assert(hvf_enabled());
> +
> +    cpu->thread = g_malloc0(sizeof(QemuThread));
> +    cpu->halt_cond = g_malloc0(sizeof(QemuCond));
> +    qemu_cond_init(cpu->halt_cond);
> +
> +    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
> +             cpu->cpu_index);
> +    qemu_thread_create(cpu->thread, thread_name, hvf_cpu_thread_fn,
> +                       cpu, QEMU_THREAD_JOINABLE);
> +}
> +
> +CpusAccelInterface hvf_cpus_interface = {
> +    .create_vcpu_thread = hvf_start_vcpu_thread,

I think it was meant as hvf_create_vcpu_thread.

> +    .kick_vcpu_thread = hvf_kick_vcpu_thread,
> +
> +    .cpu_synchronize_post_reset = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_post_init = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_state = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_pre_loadvm = hvf_cpu_synchronize_noop,

I have tested the RFC with hvf accel and it seems to work although I
expected it fail. The noop functions are the reason of the behaviour. In
some sense it's not equivalent to what it was before but to be fair it's
not possible to use existing hvf_cpu_synchronize_* functions in
target/i386/hvf/hvf.c without breaking hvf.

Here's how it fails. hvf_cpu_synchronize_state() sets dirty flag to mark
that emulator registers in CPUX86State are going to be out of sync with
accel registers. If we bind .cpu_synchronize_state to
hvf_synchronize_state it'd be called from vapic_write() on I/O writes
(from kvmvapic.bin option ROM) to KVM vAPIC page (I/O port 0x7e).

But HVF uses two emulator states CPUX86State and HVFX86EmulatorState
and they may become out-of-sync. That's exactly what happens when an I/O
write to KVM vAPIC is made. The I/O write triggers cpu state
synchronization, so VMCS registers are fetched into CPUX86State and the
vcpu_dirty flag is set. Emulation code then advances RIP in the
macvm_set_rip() by modifying VMCS directly. On the next round of vcpu
run loop it checks if the cpu is dirty (it is) and pushes CPUX86State
into the VMCS, effectively overwriting what was set by macvm_set_rip().

If the RFC isn't expected to be merged for a couple of weeks I can
prepare a series that drops HVFX86EmulatorState. I need the change for
single-stepping support in HVF anyway.

Thanks,
Roman


  parent reply	other threads:[~2020-05-26 19:17 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-25 14:54 [RFC v3 0/4] QEMU cpus.c refactoring Claudio Fontana
2020-05-25 14:54 ` [RFC v3 1/4] softmmu: move softmmu only files from root Claudio Fontana
2020-05-25 15:12   ` Philippe Mathieu-Daudé
2020-05-25 15:37     ` Claudio Fontana
2020-05-25 17:32       ` Philippe Mathieu-Daudé
2020-05-25 14:54 ` [RFC v3 2/4] cpu-throttle: new module, extracted from cpus.c Claudio Fontana
2020-05-25 15:14   ` Philippe Mathieu-Daudé
2020-05-25 15:42     ` Claudio Fontana
2020-05-25 14:54 ` [RFC v3 3/4] cpu-timers, icount: new modules Claudio Fontana
2020-05-25 15:16   ` Philippe Mathieu-Daudé
2020-05-25 15:43     ` Claudio Fontana
2020-05-25 14:54 ` [RFC v3 4/4] cpus: extract out accel-specific code to each accel Claudio Fontana
2020-05-25 15:22   ` Philippe Mathieu-Daudé
2020-05-25 15:37     ` Claudio Fontana
2020-05-26 19:16   ` Roman Bolshakov [this message]
2020-05-27 13:42     ` Claudio Fontana
2020-05-25 22:58 ` [RFC v3 0/4] QEMU cpus.c refactoring no-reply
2020-05-26 12:57   ` Claudio Fontana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200526191600.GA4851@SPB-NB-133.local \
    --to=r.bolshakov@yadro.com \
    --cc=alex.bennee@linaro.org \
    --cc=cfontana@suse.de \
    --cc=colin.xu@intel.com \
    --cc=dirty@apple.com \
    --cc=ehabkost@redhat.com \
    --cc=haxm-team@intel.com \
    --cc=lvivier@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=sunilmut@microsoft.com \
    --cc=thuth@redhat.com \
    --cc=wenchao.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).