* [PATCH v11 0/9] Xen VMware tools support
@ 2015-05-22 15:50 Don Slutz
  2015-05-22 15:50 ` [PATCH v11 1/9] tools: Add vga=vmware Don Slutz
                   ` (8 more replies)
  0 siblings, 9 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Changes v10 to v11:

  Andrew Cooper & Ian Campbell (#1 "tools: Add vga=vmware"):
    Nack. Qemu-trad currently has remote code execution vulnerabilities.
      Dropped support for Qemu-trad.
    
     Also changed later patches to not need this one.

  Andrew Cooper (#2 "xen: Add support for VMware cpuid leaves"):
    Needs re-base.
      Done
    Adjust /* Disallow if vmware_hwver */
      Done
    Newline after break;
      Done 2 places.
    Allowed Reviewed-by: Andrew Cooper, if these changes are done.
      Added Reviewed-by: Andrew Cooper.

   Julien Grall (#2 "xen: Add support for VMware cpuid leaves"):
    It would be worth to add an explicit vmware_hwver = 0 in the
    libxl__arch_domain_prepare_config.
      Done -- Note: Adds a tool change to this patch.

  (#3 "tools: Add vmware_hwver support"):
    Since Qemu-trad does not support vga=vmware,
    Dropped "If non-zero then default VGA to VMware's VGA"

  Andrew Cooper (#5 "xen: Add vmware_port support"):
    You will not be getting here for a non HVM domain...
      Dropped ASSERT(is_hvm_domain(currd))
    Newline after break;
      Done 6 places.
    Allowed Reviewed-by: Andrew Cooper, if these changes are done.
      Added Reviewed-by: Andrew Cooper.

  (#7 "tools: Add vmware_port support"):
    Since Qemu-trad does not support vga=vmware,
    Dropped "If non-zero then default VGA to VMware's VGA"

Changes v9 to v10:
  Split out LIBXL_VGA_INTERFACE_TYPE_VMWARE into its own patch (#1)
  that can stand alone.  In the patch set because a later patch
  depends on it.

  Reworked to be based on:

    commit a7511905fae7ba592c5bf63cd77d8ff78087d689
    Author: Julien Grall <julien.grall@linaro.org>
    Date:   Wed Apr 1 17:21:41 2015 +0100

        xen: Extend DOMCTL createdomain to support arch configuration

  rebased onto:

    commit e13013dbf1d5997915548a3b5f1c39594d8c1d7b
    Author: Yang Hongyang <yanghy@cn.fujitsu.com>
    Date:   Thu May 14 16:55:18 2015 +0800

        libxc/restore: add checkpointed flag to the restore context


  Andrew Cooper (#2: "xen: Add support for VMware cpuid leaves"):
    Did not add "Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>"
    because of changes here to do things the new way.
  Reword comment message to reflect new way.

  Ian Campbell (#3 "tools: Add vmware_hwver support"):
    LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE &
    LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER are arriving together
    a single umbrella could be used.
      Since I split the LIBXL_VGA_INTERFACE_TYPE_VMWARE into
      its own patch, this is no longer true.
      But I did use 1 for the 2 c_info changes.
    Please use GCSPRINTF.
      Done.
  Remove vga=vmware from here.

  Ian Campbell (#3 "tools: Add vmware_hwver support"):
    For "Add IOREQ_TYPE_VMWARE_PORT"
      With those fixed the tools/* bits are:
        Acked-by: Ian Campbell <ian.campbell@citrix.com>  
    Did not add Acked-by to "tools: Add vmware_hwver support"
    because of the rework for using libxl_domain_create_info.

  Andrew Cooper (#4: "vmware: Add VMware provided include file."):
    Added "Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>"

  Andrew Cooper (#5 "xen: Add vmware_port support"):
    Probably better as EOPNOTSUPP, as it is a configuration problem.
      Done.
    vmport_ioport function looks as if it should be static.
      Done.
    Why is GETHZ the only one of these with a CPL check?
      Please see thread for detail.
    I would suggest putting vmport_register declaration in hvm.h ...
      Done.

  Jan Beulich (#5 "xen: Add vmware_port support"):
    As indicated before, I don't think this is a good use case for a
    domain creation flag.
      Switch to the new config way.
    struct domain *d => struct domain *currd
      Done
    Are you sure you don't want to zero the high halves of 64-bit ...
      Comment added.
   Then just have this handled into the default case.
      Reworked new_eax handling.
   is_hvm_domain(currd)
   And - why here rather than before the switch() or even right at the
   start of the function?
      Moved to start.
   With that, is it really correct that OUT updates the other registers
   just like IN? If so, this deserves a comment, so that readers won't
   think this is in error.
     All done in comment at start.

  Andrew Cooper (#6 "xen: Add ring 3 vmware_port support"):
    >> This looks horribly invasive.
    >>
    >> Why are emulation changes needed?  What is wrong with the normal
    >> handling with a registered ioport handler?
    > Because VMware made a bad way to provide a "hyper call".  They decided to
    > allow user access to this.  So when a #GP fault should have been
    > reported, they instead do the "hyper call".
    >
    Urgh - now I remember.

    Right.  In the case that vmport is active, we start intercepting #GP
    faults and emulating access.  That part of the patch looks ok.

    However, the rest is very invasive to the emulation infrastructure.
      Re-worked along the lines suggested.

  Jan Beulich (#6 "xen: Add ring 3 vmware_port support"):
    I hope that vmport_check will no longer be needed with the adjustments ...
    > Since this is not an architecture feature and I do not expect any real
    > CPUs to support this, I do not expect any other use.  But I am happy
    > to make it more generic.

    Let's see how this ends up looking - the hook is probably indeed
    bogus (from an architectural pov) no matter how you name it.
      Last e-mail on thread, so no change.

  Ian Campbell (#7 "tools: Add vmware_port support"):
    "If..." at the start of the sentence ...
      Used Ian's reword.
    Also, why is 7 special?
      Attempted to better explain.

  Paul Durrant & Jan Beulich (#8 "Add IOREQ_TYPE_VMWARE_PORT"):
    Now that buf is no longer a bool, could ...
    These literals should become an enum
      Added an enum.
    I don't think the invalidate type is needed.
      Dropped.
    IOREQ_TYPE_VMWARE_PORT as 3 is a re-use.
      Switch to 9.
    Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
    is unclear.
       Re-worked to a version that Jan likes better.
    Comment about "'special' range of 1" is not clear.
       Re-worded comments.

  Ian Campbell (#9 "Add xentrace to vmware_port"):
    Acked-by
  Readded dropped traces.

  Jan Beulich & Andrew Cooper (#9 "Add xentrace to vmware_port"):
    Why is cmd in this patch?
      Because the trace points use it.

  Jan Beulich (#10 "test_x86_emulator.c: Add tests for #GP usage"):
    Need more comments and simpler error checking.
      Done.  
      Dropped un-needed new routines.

  Andrew Cooper:
    That is because you broke it adding a bool_t item.
      Has now been dropped.


Changes v8 to v9:
  Overview of changes:
    s/vmware_hw/vmware_hwver/i
    Switch to x86_emulator to handle #GP
    New patch: Move MAX_INST_LEN into x86_emulate.h
    Add QEMU usage, patch #8 "Add IOREQ_TYPE_VMWARE_PORT"
    Split patch "xen: Add vmware_port support" into 2. 1st has same
    name.  New one is "xen: Add ring 3 vmware_port support".
    Added 3 new patches about test_x86_emulator.

  
  Jan Beulich (#2: "xen: Add support for VMware cpuid leaves"):
    Change -EXDEV to EOPNOTSUPP.
      Done.
    adding another subdirectory: xen/arch/x86/hvm/vmware
    Much will depend on the discussion of the subsequent patches.
      TBD.
    So for versions < 7 there's effectively no CPUID support at all?
      Changed to check at entry.
    The comment /* Params for VMware */ seems wrong...
      Changed to /* emulated VMware Hardware Version */
    Also please use d, not _d in #define is_vmware_domain()
      Changed.  Line is now > 80 characters, so split into 2.

  Andrew Cooper (#3: "tools: Add vmware_hwver support"):
      I assumed that s/vmware_hw/vmware_hwver/ is not a big enough
      change to drop the Reviewed-by.  Did a minor edit to the
      commit message to add 7 to the list of values checked.

  Jan Beulich (#4: "vmware: Add VMware provided include file"):
    Either the description is wrong, or the patch is stale.
      stale commit message -- fixed.
    I'd say a file with a single comment line in it would suffice.
      Done.

  Jan Beulich (#5: "xen: Add vmware_port support"):
    Can you explain why a HVM param isn't suitable here?
      Issue with changing QEMU on the fly.
      Andrew Cooper: My recommendation is still to use a creation flag
        So no change.
    Please move SVM's identical definition into ...
      Did this as #1.  No longer needed, but since the patch was ready
      I have included it.
    -- Lots of questions about code that is no longer part of this patch. --
    With this, is handling other than 32-bit in/out really
    meaningful/correct?
      Added comment about this.
    Since you can't get here for PV, I can't see what you need this.
      Changed to an ASSERT.
    Why version 4?
      Added comment about this.
    -- Several questions about register changes.
      Re-coded to use new_eax and set *val to this.
      Change to generally use reg->_e..
    These ei1/ei2 checks belong in the callers imo -
      Moved.
    the "port" function parameter isn't even checked
      Add check for exact match.
    If dropping the code is safe without also forbidding the
    combination of nested and VMware emulation.
      Added a check forbidding the combination of nested and VMware.
      Mostly because in the nested case it is the nested virtualization
      code that should handle VMware stuff if needed, not the root one.
      Also I am having issues testing Xen nested in Xen and using HVM.

      

Changes v7 to v8:

  Jan Beulich:
    Coding changes to vmport_ioport. Things like:
-             regs->rax = (uint32_t)~0ul;
+             regs->_eax = ~0u;
      
  Andrew Cooper (#2: "tools: Add vmware_hwver support"):
    Other than these two comments, the rest of the patch looks ok, so...
      Added Reviewed-by after addressing the "Spurious whitespace change"
      and the wording in the new docs/misc/hypervisor-cpuid.markdown.


Changes v6 to v7:
  summary of changes.

  George Dunlap:
    Any doc about this?
      Added reference to:
        https://sites.google.com/site/chitchatvmback/backdoor
      Last updated: Feb. 2008

  George Dunlap & Jan Beulich
    Too much logging and tracing.
      Dropped a lot of it.  This includes vmport_debug=

  Ian Campbell:
    Any reason RPC code cannot be done in QEMU?
      Not that I know of, so dropped all parts of RPC code.
    Default handling of hvm.vga.kind bad.
      Fixed.
    Default of vmware_port should be based on vmware_hw.
      Done. 

  Tim Deegan:
    CPL check of GETHZ needs to be fixed somewhere.
      Added check for CPL == 0 (assuming this is what VMware is
      checking).  Matches the testing.

  Ian Campbell, Andrew Cooper, George Dunlap, Boris Ostrovsky,
   & Jan Beulich
     Various minor fixes.
    
  Per patch notes:
    #1 "xen: Add support for VMware cpuid leaves":
      Prevent setting of HVM_PARAM_VIRIDIAN if HVM_PARAM_VMWARE_HW set.
    #4 "xen: Add vmware_port support":
      More on AMD in the commit message.
      Switch to only change the 32-bit part of registers, which is
        what VMware does.
    #6 "Add xentrace to vmware_port":
      Dropped some of the new traces.
      Added HVMTRACE_ND7.
    #7 "Add xen-hvm-param":
       Was a later patch.  Still optional.
       Fixed formatting.
       Adjust for drop of VMware RPC.

Comments on v3, v4, v5, v6:
  George Dunlap:
    Is there any reason not to merge 05/16 with 03/16?
      The reason I have is that v3 03/16 only contains new files. 2
      from VMware and 1 to allow use of the VMware files.  I added
      xen/arch/x86/hvm/vmware/includeCheck.h at the request of
      Konrad Wilk.

      This patch has many style issues and white space issues.  So I
      want it as a separate patch so as to be clear on what files do
      not meet the coding style.  And why and where they came from.

Changes v5 to v6:
  Boris Ostrovsky & Jan Beulich
    #4 "xen: Add vmware_port support":
    #6 "xen: Convert vmware_port to xentrace usage":
    There is an issue with reading instruction bytes more than once.
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.

Changes v4 to v5:
  Re tagged the optional patches.

  Added debug=y build checking that vmx is defining
  VM_EXIT_INTR_ERROR_CODE.

  Boris Ostrovsky:
    #1 "xen: Add support for VMware cpuid leaves":
      Given how is_viridian and is_vmware are defined I think '||' is more
      appropriate.
        Fixed.
    #4 "xen: Add is_vmware_port_enabled":
      we should make sure that svm_vmexit_gp_intercept is not executed for
      any other guest.
        Added an ASSERT on is_vmware_port_enabled.
      magic integers?
        Added #define for them.
    #6 "xen: Convert vmware_port to xentrace usage":
      exitinfo1 is used twice.
        Fixed.
    #7 "tools: Convert vmware_port to xentrace usage":
      'bytes = 0x%(2)d' or 'bytes = %(2)d' ?
        Fixed.
    #8 "xen: Add limited support of VMware's hyper-call rpc":
      PV vs. HVM vs. PVH. So probably 'if(is_hvm_vcpu)'?
        I see no reason to exclude PVH.   Will change to has_hvm_container_vcpu
    #11 "Add live migration of VMware's hyper-call":
      You ASSERTed that vg->key_len is 1 so you may not need the 'if'.
        That is an ASSERT(sizeof ...), not just an ASSERT -- not changed.
      Use real errno, not -1.
        Fixed.
      No ASSERT in vmport_load_domain_ctxt
        Added.

  Jan Beulich & Boris Ostrovsky:
    #8 "xen: Add limited support of VMware's hyper-call rpc":
      The names of all three functions are bogus.
        removed static support routines.
        Also changed in #1.

  Andrew Cooper:
    #2 "tools: Add vmware_hw support":
      Anything looking for Xen according to the Xen cpuid instructions...
        Adjusted doc to new wording.
    #4 "xen: Add is_vmware_port_enabled":
      I am fairly certain that you need some brackets here.
        Added brackets.

  Jan Beulich & Andrew Cooper:
    #1 "xen: Add support for VMware cpuid leaves":
      This hunk is unrelated, but is perhaps something better fixed.
        Added to commit message.
      include <xen/types.h> (IIRC) please.
        Done.
      At least 1 pair of brackets please, especially as the placement of
      brackets affects the result of this particular calculation.
        Switch to "1000000ull / APIC_BUS_CYCLE_NS"      


Changes v3 to v4:
  Ian Campbell:
    Report on both viridian and vmware_hw set.
    Added LIBXL_VGA_INTERFACE_TYPE_VMWARE (vga=vmware).

  Andrew Cooper:
    Add doc for hypervisor-cpuid.

  Boris Ostrovsky:
    Changing regs->error_code may not be a good idea.
      Dropped this.
    
  Jan Beulich & Boris Ostrovsky:
    Only enable vmexit for GP when vmware_port is set.
      Done.


Changes v2 to v3:

  Add optional unit test tools.
  Re-worked split of changes.

  Jan Beulich:
    for #0:
      I don't think you should be adding a new file in hvm/ _and_ a new
      subdirectory.
        Moved all files to hvm/vmware that contain code.
    for old #1 (now #1 & #2):
      Is there really a point in enabling both Viridian and VMware extensions?
        I still think so.
      hvmloader change: This needs an explanation
        Dropped as not needed now.
      Can you make vmware_hw similar to Viridian, returning success when
      setting the value to what it already is.
        Done.
      You don't seem to be using sub_idx: ...
        Dropped.
      Extra changes...
        Dropped.
    for old #2 (now #3):
      ... these guards have the (theoretical at this point) risk of clashing
      ... the patch is obviously incomplete without this header...
        Did not fix any of these issues.  I will stick with this needs
        to be a 2nd patch that changes the include files to better fit
        in Xen coding.  For now these files are in a sub directory
        which is not part of the normal include search.
        Moved the includeCheck.h file into this patch.
    for old #3 (now #4, #5, #6, #7, #8, #9, #10, #11)
      As I think was said on v1 already - this should be split into smaller
      pieces ...
        Done.
      All this would very likely better go into a separate function placed in
      vmport.c.
        Moved most of the code into vmport.c or vmport_rpc.c.
      In any event I'm rather uncomfortable about vmware_port getting
      enabled unconditionally, ...
        Added vmware_port (done in new patches #4, #5) as an xl.cfg
        option.
      You'll have to go through and fix coding style issues.
        I think I have found all these, but since they do not stand out
        for me, let me know of any left.
      "MAKE_INSTR(IN," name is ambiguous.
        Added all 4 opcodes for in and out that can access this port: INB_DX,
        INL_DX, OUTB_DX, OUTL_DX.
      A VMX-specific function shouldn't be named this way...
        Added new common routine vmport_gp_check() that is called from
        both vmx.c and svm.c which is where all the logic about checking
        for IN and OUT is done.
        Also fixed naming and added static.
      Ah, here we go (as to using HVM_DBG_LOG()): Isn't this _way_ too
      fine grained?
        I have reduced the number of bits used.  Partially by switching
        some to xentrace (new patch #6 and #7).
      Right, and zero is an indication that it wasn't found. Also I just
      noticed there's a gdprintk() in that event, which for all other ...
        Made the gdprintk() optional.

End of v3 changes.

This is a small part of the changes needed to take Linux, Windows
(and other) guests that were built on VMware and run them
unchanged on Xen.

This small part is the start of Xen support for the VMware backdoor
I/O port, which is how VMware tools (a standard addition installed
in a guest) communicate with the hypervisor.

I picked this subset to start with because it only has changes in
Xen.

Some of this code is already in QEMU and so KVM has some of this
already.  QEMU supported backdoor commands include VMware mouse
support.  A later patch set exists that links these changes, new
code and Xen changes to QEMU to provide VMware mouse support under
Xen.  The important part is that VMware mouse is an absolute
position mouse and so network delays do not affect usage of the
virtual mouse.
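
As a rough illustration of the backdoor I/O port convention mentioned
above (this is not code from the series): the guest puts a magic value
in EAX and a command number in ECX, then does an IN on a fixed port.
The constants below are the values published in VMware's
backdoor_def.h.  The struct and function names are hypothetical, and
taking the command from the low 16 bits of ECX is an assumption based
on how open-vm-tools uses the port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as published in VMware's backdoor_def.h. */
#define BDOOR_MAGIC          0x564D5868u  /* ASCII "VMXh" */
#define BDOOR_PORT           0x5658u      /* ASCII "VX" */
#define BDOOR_CMD_GETVERSION 10u

/* Hypothetical snapshot of the guest registers at the IN/OUT. */
struct guest_regs {
    uint64_t rax, rbx, rcx, rdx;
};

/* True if an access to 'port' with these registers is a backdoor call.
 * Only the low 32 bits of RAX are compared against the magic value. */
bool is_backdoor_call(const struct guest_regs *r, uint16_t port)
{
    assert(r != NULL);
    return port == BDOOR_PORT && (uint32_t)r->rax == BDOOR_MAGIC;
}

/* Assumption: the command number is carried in the low 16 bits of ECX. */
uint16_t backdoor_cmd(const struct guest_regs *r)
{
    assert(r != NULL);
    return (uint16_t)r->rcx;
}
```

A handler built this way would first call is_backdoor_call() and only
then dispatch on backdoor_cmd(); anything else is treated as an
ordinary port access.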

For example from the guest:

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
No value found
[root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel short"

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
short
[root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel long222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000joel"

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key1"
data1
[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key2"
No value found
[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key2"
data2
[root@C63-min-tools ~]# 


Most of this code has been reverse engineered by looking at
source code for Linux and open VMware tools.

http://open-vm-tools.sourceforge.net


changes RFC to v2:

Jan Beulich:
  Add xen/arch/x86/hvm/vmware.c for cpuid_vmware_leaves
  Fewer patches

Andrew Cooper:
  use the proper constant for apic_khz
  Follow 839b966e3f587bbb1a0d954230fb3904330dccb6 style changes.
  Changed HVM_PARAM_VMWARE_HW to write once (make is_vmware_domain()
    more static).
  Dropped vmport status stuff.
  Added checks for xzalloc() having failed.
  You should include backdoor_def.h ...
     Everything I tried did not work better.  So I did not
     change VMPORT_PORT and BDOOR_PORT being the same value.
     I did not try to adjust VMware's include file backdoor_def.h
     to work in other Xen source files.
  Switching to s_time_t is not valid. get_sec() is defined:
    unsigned long get_sec(void);
  and so my uses of it should be using unsigned long.  However
  since that is not a fixed width type, I used the uint64_t
  data type which is almost the same, but does allow the 32 bit
  build of libxc, libxl to do the correct thing.


Konrad Rzeszutek Wilk:
  Please don't include the address. It should be, etc
      about the Vmware provided include files.
    I went with no changes to these files.  Even if the files should
    be changed to match xen coding style, etc I still feel that the
    original ones should be added via a patch, and then adjusted in a
    2nd patch.
  Can you use XenBus?
    I would say no.  XenBus (and XenStore) is about domain to domain
    communication.  This is about VMware's hyper-call and providing
    very low speed access to VMware's guest info.

Olaf Hering:
   Dropped changing of bios-strings.  Still needs some documentation
   that this may need to be done in a tool stack or set of commands.


Boris Ostrovsky:
  Use svm_nextrip_insn_length()
    Looks like __get_instruction_length() does this, so switched to
    __get_instruction_length().
 
RFC:

See

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

for info on detecting VMware.

Linux does not follow this exactly.  It checks CPUID first.  If
that fails, it checks for SMBIOS containing "VMware" (not VMware- or
VMW).
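
The detection order just described can be sketched as below.  This is
only an illustration under assumptions, not Linux's actual code: the
function names are made up, the CPUID signature is taken to be the
12 bytes returned in EBX, ECX, EDX by leaf 0x40000000, and "SMBIOS
string" stands for whichever SMBIOS field the guest inspects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit register value as 4 characters, lowest byte first. */
static void put_le32(uint32_t v, char *dst)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((v >> (8 * i)) & 0xff);
}

/* Build the NUL-terminated hypervisor signature from EBX, ECX, EDX. */
void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
    assert(out != NULL);
    put_le32(ebx, out + 0);
    put_le32(ecx, out + 4);
    put_le32(edx, out + 8);
    out[12] = '\0';
}

/* CPUID signature first; if that does not match, fall back to looking
 * for the substring "VMware" in an SMBIOS string (may be NULL). */
bool detect_vmware(const char *cpuid_sig, const char *smbios_str)
{
    if (cpuid_sig != NULL && strcmp(cpuid_sig, "VMwareVMware") == 0)
        return true;
    return smbios_str != NULL && strstr(smbios_str, "VMware") != NULL;
}
```

Because the fallback is a plain substring match on "VMware", an SMBIOS
string starting with "VMware-" also satisfies it.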

So this patch set provides:

        SMBIOS -- Add string VMware-
        CPUID -- Add VMware's CPUID (Note: currently HyperV (viridian support) breaks this check.)
        Add the magic VMware port
            Allow VMware tools poweroff and reboot
            Enable access to VMware's guest info
            Provide the VMware tools build number


Don Slutz (9):
  tools: Add vga=vmware
  xen: Add support for VMware cpuid leaves
  tools: Add vmware_hwver support
  vmware: Add VMware provided include file.
  xen: Add vmware_port support
  xen: Add ring 3 vmware_port support
  tools: Add vmware_port support
  Add IOREQ_TYPE_VMWARE_PORT
  Add xentrace to vmware_port

 docs/man/xl.cfg.pod.5                  |  36 +++++-
 tools/libxc/xc_domain.c                |   2 +-
 tools/libxc/xc_hvm_build_x86.c         |   5 +-
 tools/libxl/libxl.h                    |  10 ++
 tools/libxl/libxl_create.c             |  13 ++-
 tools/libxl/libxl_dm.c                 |  11 ++
 tools/libxl/libxl_types.idl            |   3 +
 tools/libxl/libxl_x86.c                |   5 +-
 tools/libxl/xl_cmdimpl.c               |   5 +
 tools/xentrace/formats                 |   5 +
 xen/arch/x86/domain.c                  |   3 +
 xen/arch/x86/hvm/Makefile              |   1 +
 xen/arch/x86/hvm/emulate.c             | 132 ++++++++++++++++++---
 xen/arch/x86/hvm/hvm.c                 | 202 +++++++++++++++++++++++++++++----
 xen/arch/x86/hvm/io.c                  |  19 ++++
 xen/arch/x86/hvm/svm/svm.c             |  26 +++++
 xen/arch/x86/hvm/svm/vmcb.c            |   2 +
 xen/arch/x86/hvm/vmware/Makefile       |   2 +
 xen/arch/x86/hvm/vmware/backdoor_def.h | 167 +++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmware/cpuid.c        |  77 +++++++++++++
 xen/arch/x86/hvm/vmware/includeCheck.h |   1 +
 xen/arch/x86/hvm/vmware/vmport.c       | 170 +++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmcs.c            |   2 +
 xen/arch/x86/hvm/vmx/vmx.c             |  37 ++++++
 xen/arch/x86/traps.c                   |   8 +-
 xen/arch/x86/x86_emulate/x86_emulate.c |  13 ++-
 xen/arch/x86/x86_emulate/x86_emulate.h |   5 +
 xen/include/asm-x86/hvm/domain.h       |   9 +-
 xen/include/asm-x86/hvm/emulate.h      |   2 +
 xen/include/asm-x86/hvm/hvm.h          |  10 ++
 xen/include/asm-x86/hvm/trace.h        |  22 ++++
 xen/include/asm-x86/hvm/vmware.h       |  33 ++++++
 xen/include/public/arch-x86/xen.h      |   8 +-
 xen/include/public/hvm/hvm_op.h        |   5 +
 xen/include/public/hvm/ioreq.h         |  17 +++
 xen/include/public/hvm/params.h        |   4 +-
 xen/include/public/trace.h             |   3 +
 37 files changed, 1025 insertions(+), 50 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmware/Makefile
 create mode 100644 xen/arch/x86/hvm/vmware/backdoor_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
 create mode 100644 xen/arch/x86/hvm/vmware/includeCheck.h
 create mode 100644 xen/arch/x86/hvm/vmware/vmport.c
 create mode 100644 xen/include/asm-x86/hvm/vmware.h

-- 
1.8.4


* [PATCH v11 1/9] tools: Add vga=vmware
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-05-22 15:50 ` [PATCH v11 2/9] xen: Add support for VMware cpuid leaves Don Slutz
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This allows use of QEMU's VMware emulated video card

NOTE: vga=vmware is not supported by device_model_version=qemu-xen-traditional

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v11:
  Dropped support for Qemu-trad.
  Also changed later patches to not need this one.

v10: New at v10.

  Was part of "tools: Add vmware_hwver support"

 docs/man/xl.cfg.pod.5       | 4 +++-
 tools/libxl/libxl.h         | 5 +++++
 tools/libxl/libxl_dm.c      | 9 +++++++++
 tools/libxl/libxl_types.idl | 1 +
 tools/libxl/xl_cmdimpl.c    | 2 ++
 5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a3e0e2e..84078f6 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1389,7 +1389,7 @@ This option is deprecated, use vga="stdvga" instead.
 
 =item B<vga="STRING">
 
-Selects the emulated video card (none|stdvga|cirrus|qxl).
+Selects the emulated video card (none|stdvga|cirrus|qxl|vmware).
 The default is cirrus.
 
 In general, QXL should work with the Spice remote display protocol
@@ -1397,6 +1397,8 @@ for acceleration, and QXL driver is necessary in guest in this case.
 QXL can also work with the VNC protocol, but it will be like a standard
 VGA without acceleration.
 
+NOTE: vmware is not supported on B<device_model_version = "qemu-xen-traditional">
+
 =item B<vnc=BOOLEAN>
 
 Allow access to the display via the VNC protocol.  This enables the
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 0a7913b..86164a7 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -200,6 +200,11 @@
 #define LIBXL_HAVE_DEVICETREE_PASSTHROUGH 1
 
 /*
+ * The libxl_vga_interface_type has the type for vmware.
+ */
+#define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 3dd7c04..ce08461 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -256,6 +256,10 @@ static int libxl__build_device_model_args_old(libxl__gc *gc,
         case LIBXL_VGA_INTERFACE_TYPE_NONE:
             flexarray_append_pair(dm_args, "-vga", "none");
             break;
+        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+            LOG(ERROR, "vga=vmware is not supported by "
+                "qemu-xen-traditional");
+            return ERROR_INVAL;
         case LIBXL_VGA_INTERFACE_TYPE_QXL:
             break;
         }
@@ -647,6 +651,11 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                 GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
                 (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) );
             break;
+        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+            flexarray_append_pair(dm_args, "-device",
+                GCSPRINTF("vmware-svga,vgamem_mb=%d",
+                libxl__sizekb_to_mb(b_info->video_memkb)));
+            break;
         }
 
         if (b_info->u.hvm.boot) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..6cab732 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -184,6 +184,7 @@ libxl_vga_interface_type = Enumeration("vga_interface_type", [
     (2, "STD"),
     (3, "NONE"),
     (4, "QXL"),
+    (5, "VMWARE"),
     ], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
 
 libxl_vendor_device = Enumeration("vendor_device", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..02f5c7a 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2138,6 +2138,8 @@ skip_vfb:
                 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
             } else if (!strcmp(buf, "qxl")) {
                 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_QXL;
+            } else if (!strcmp(buf, "vmware")) {
+                b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
             } else {
                 fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
                 exit(1);
-- 
1.8.4


* [PATCH v11 2/9] xen: Add support for VMware cpuid leaves
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
  2015-05-22 15:50 ` [PATCH v11 1/9] tools: Add vga=vmware Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-05-22 15:50 ` [PATCH v11 3/9] tools: Add vmware_hwver support Don Slutz
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This is done by adding xen_arch_domainconfig vmware_hwver. It is set to
the VMware virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the
code only checks for == 0 or != 0 or >= 7.

If non-zero then
  Return VMware's cpuid leaves.  If >= 7 return data, else
  return 0.
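
A minimal sketch of the rule above, assuming a hypothetical helper that
is handed the data the VMware leaf would report.  The names are
illustrative and are not the ones used in cpuid.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four CPUID output registers. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Returns false when hwver == 0 (VMware leaves disabled, *out untouched).
 * Otherwise the leaf is handled: hwver >= 7 reports 'data', any other
 * non-zero hwver reports all zeros.
 */
bool vmware_cpuid_leaf(uint32_t hwver, const struct cpuid_regs *data,
                       struct cpuid_regs *out)
{
    assert(data != NULL && out != NULL);

    if (hwver == 0)
        return false;

    if (hwver >= 7)
        *out = *data;
    else
        *out = (struct cpuid_regs){ 0, 0, 0, 0 };

    return true;
}
```

Only three cases are distinguished (0, 1-6, 7 and up), which is why the
exact list of "good" values is not enforced here.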

The support of hypervisor cpuid leaves has not been agreed to.

Microsoft Hyper-V (AKA viridian) currently must be at 0x40000000.

VMware currently must be at 0x40000000.

KVM currently must be at 0x40000000 (from Seabios).

Xen can be found at the first otherwise unused 0x100 aligned
offset between 0x40000000 and 0x40010000.

http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

http://lwn.net/Articles/301888/
  Attempted to get this cleaned up.

So based on this, I picked the order:

Xen at 0x40000000 or
Viridian or VMware at 0x40000000 and Xen at 0x40000100

If both Viridian and VMware are selected, report an error.
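
The ordering above can be sketched as follows.  This is an illustration
only; the helper name is hypothetical and -1 is a placeholder for
whatever error the real code reports:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_CPUID_BASE   0x40000000u  /* first hypervisor leaf */
#define HV_CPUID_STRIDE 0x100u       /* alignment of each leaf block */

/*
 * Hypothetical helper: choose the base leaf for Xen's own CPUID block.
 * Returns 0 on success, or -1 (placeholder error) when both Viridian
 * and VMware are requested, since each must sit at 0x40000000.
 */
int xen_leaf_base(bool viridian, bool vmware, uint32_t *base)
{
    assert(base != NULL);

    if (viridian && vmware)
        return -1;

    *base = HV_CPUID_BASE;
    if (viridian || vmware)
        *base += HV_CPUID_STRIDE;  /* Xen moves up to 0x40000100 */

    return 0;
}
```

A guest looking for Xen then has to scan the 0x100 aligned bases rather
than assume 0x40000000.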

Since I need to change xen/arch/x86/hvm/Makefile; also add
a newline at end of file.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v11:
  Adjust /* Disallow if vmware_hwver */
  Newline after break;
  Added Reviewed-by: Andrew Cooper.
    It would be worth to add an explicit vmware_hwver = 0 in the
    libxl__arch_domain_prepare_config.
 Note: Adds a tool change to this patch.

v10:
    Did not add "Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>"
    because of changes here to do things the new way.
  Reword comment message to reflect new way.

v9:
    s/vmware_hw/vmware_hwver/i
    Change -EXDEV to EOPNOTSUPP.
      Done.
    adding another subdirectory: xen/arch/x86/hvm/vmware
    Much will depend on the discussion of the subsequent patches.
      TBD.
    So for versions < 7 there's effectively no CPUID support at all?
      Changed to check at entry.
    The comment /* Params for VMware */ seems wrong...
      Changed to /* emulated VMware Hardware Version */
    Also please use d, not _d in #define is_vmware_domain()
      Changed.  Line is now > 80 characters, so split into 2.

v7:
      Prevent setting of HVM_PARAM_VIRIDIAN if HVM_PARAM_VMWARE_HW set.
v5:
      Given how is_viridian and is_vmware are defined I think '||' is more
      appropriate.
        Fixed.
      The names of all three functions are bogus.
        removed static support routines.
      This hunk is unrelated, but is perhaps something better fixed.
        Added to commit message.
      include <xen/types.h> (IIRC) please.
        Done.
      At least 1 pair of brackets please, especially as the placement of
      brackets affects the result of this particular calculation.
        Switch to "1000000ull / APIC_BUS_CYCLE_NS"      

 tools/libxl/libxl_x86.c           |  4 +-
 xen/arch/x86/domain.c             |  1 +
 xen/arch/x86/hvm/Makefile         |  1 +
 xen/arch/x86/hvm/hvm.c            | 11 ++++++
 xen/arch/x86/hvm/vmware/Makefile  |  1 +
 xen/arch/x86/hvm/vmware/cpuid.c   | 77 +++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c              |  8 +++-
 xen/include/asm-x86/hvm/domain.h  |  3 ++
 xen/include/asm-x86/hvm/hvm.h     |  6 +++
 xen/include/asm-x86/hvm/vmware.h  | 33 +++++++++++++++++
 xen/include/public/arch-x86/xen.h |  2 +-
 11 files changed, 142 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmware/Makefile
 create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
 create mode 100644 xen/include/asm-x86/hvm/vmware.h

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index ed2bd38..651b338 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -5,8 +5,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       xc_domain_configuration_t *xc_config)
 {
-    /* No specific configuration right now */
-
+    /* Note: will be changed in a later patch */
+    xc_config->vmware_hwver = 0;
     return 0;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index db073a6..7de9dd3 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -538,6 +538,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     {
         d->arch.hvm_domain.hap_enabled =
             hvm_funcs.hap_supported && (domcr_flags & DOMCRF_hap);
+        d->arch.hvm_domain.vmware_hwver = config->vmware_hwver;
 
         rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL);
     }
diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 69af47f..284ca75 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,5 +1,6 @@
 subdir-y += svm
 subdir-y += vmx
+subdir-y += vmware
 
 obj-y += asid.o
 obj-y += emulate.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 89423fa..c464b29 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
+#include <asm/hvm/vmware.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -4273,6 +4274,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     if ( cpuid_viridian_leaves(input, eax, ebx, ecx, edx) )
         return;
 
+    if ( cpuid_vmware_leaves(input, eax, ebx, ecx, edx) )
+        return;
+
     if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
         return;
 
@@ -5676,6 +5680,13 @@ static int hvm_allow_set_param(struct domain *d,
     {
     /* The following parameters should only be changed once. */
     case HVM_PARAM_VIRIDIAN:
+        /* Disallow if vmware_hwver is in use */
+        if ( d->arch.hvm_domain.vmware_hwver )
+        {
+            rc = -EOPNOTSUPP;
+            break;
+        }
+        /* Fall through */
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
         if ( value != 0 && a->value != value )
diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
new file mode 100644
index 0000000..3fb2e0b
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/Makefile
@@ -0,0 +1 @@
+obj-y += cpuid.o
diff --git a/xen/arch/x86/hvm/vmware/cpuid.c b/xen/arch/x86/hvm/vmware/cpuid.c
new file mode 100644
index 0000000..0dff36b
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/cpuid.c
@@ -0,0 +1,77 @@
+/*
+ * arch/x86/hvm/vmware/cpuid.c
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/vmware.h>
+
+/*
+ * VMware hardware version 7 defines some of these cpuid levels,
+ * below is a brief description about those.
+ *
+ *     Leaf 0x40000000, Hypervisor CPUID information
+ * # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
+ * # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
+ *
+ *     Leaf 0x40000010, Timing information.
+ * # EAX: (Virtual) TSC frequency in kHz.
+ * # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
+ * # ECX, EDX: RESERVED
+ */
+
+int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
+                        uint32_t *ecx, uint32_t *edx)
+{
+    struct domain *d = current->domain;
+
+    if ( !is_vmware_domain(d) ||
+         d->arch.hvm_domain.vmware_hwver < 7 )
+        return 0;
+
+    switch ( idx - 0x40000000 )
+    {
+    case 0x0:
+        *eax = 0x40000010;  /* Largest leaf */
+        *ebx = 0x61774d56;  /* "VMwa" */
+        *ecx = 0x4d566572;  /* "reVM" */
+        *edx = 0x65726177;  /* "ware" */
+        break;
+
+    case 0x10:
+        /* (Virtual) TSC frequency in kHz. */
+        *eax =  d->arch.tsc_khz;
+        /* (Virtual) Bus (local apic timer) frequency in kHz. */
+        *ebx = 1000000ull / APIC_BUS_CYCLE_NS;
+        *ecx = 0;          /* Reserved */
+        *edx = 0;          /* Reserved */
+        break;
+
+    default:
+        return 0;
+    }
+
+    return 1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 91701a2..4a791c6 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -750,8 +750,12 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
                uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
 {
     struct domain *d = current->domain;
-    /* Optionally shift out of the way of Viridian architectural leaves. */
-    uint32_t base = is_viridian_domain(d) ? 0x40000100 : 0x40000000;
+    /*
+     * Optionally shift out of the way of Viridian or VMware
+     * architectural leaves.
+     */
+    uint32_t base = is_viridian_domain(d) || is_vmware_domain(d) ?
+        0x40000100 : 0x40000000;
     uint32_t limit, dummy;
 
     idx -= base;
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index bdab45d..e30fd8a 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -109,6 +109,9 @@ struct hvm_domain {
 
     uint64_t              *params;
 
+    /* emulated VMware Hardware Version */
+    uint64_t               vmware_hwver;
+
     /* Memory ranges with pinned cache attributes. */
     struct list_head       pinned_cacheattr_ranges;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 77eeac5..2965fbb 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -356,6 +356,12 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 #define has_viridian_time_ref_count(d) \
     (is_viridian_domain(d) && (viridian_feature_mask(d) & HVMPV_time_ref_count))
 
+#define vmware_feature_mask(d) \
+    ((d)->arch.hvm_domain.vmware_hwver)
+
+#define is_vmware_domain(d) \
+    (is_hvm_domain(d) && vmware_feature_mask(d))
+
 void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx,
                                uint32_t *eax, uint32_t *ebx,
                                uint32_t *ecx, uint32_t *edx);
diff --git a/xen/include/asm-x86/hvm/vmware.h b/xen/include/asm-x86/hvm/vmware.h
new file mode 100644
index 0000000..8390173
--- /dev/null
+++ b/xen/include/asm-x86/hvm/vmware.h
@@ -0,0 +1,33 @@
+/*
+ * asm-x86/hvm/vmware.h
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ASM_X86_HVM_VMWARE_H__
+#define ASM_X86_HVM_VMWARE_H__
+
+#include <xen/types.h>
+
+int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
+                        uint32_t *ecx, uint32_t *edx);
+
+#endif /* ASM_X86_HVM_VMWARE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index 2ecc9c9..f84d10d 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -268,7 +268,7 @@ typedef struct arch_shared_info arch_shared_info_t;
  * XEN_DOMCTL_INTERFACE_VERSION.
  */
 struct xen_arch_domainconfig {
-    char dummy;
+    uint64_t vmware_hwver;
 };
 #endif
 
-- 
1.8.4


* [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
  2015-05-22 15:50 ` [PATCH v11 1/9] tools: Add vga=vmware Don Slutz
  2015-05-22 15:50 ` [PATCH v11 2/9] xen: Add support for VMware cpuid leaves Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-03 14:53   ` George Dunlap
  2015-06-04 15:17   ` Ian Campbell
  2015-05-22 15:50 ` [PATCH v11 4/9] vmware: Add VMware provided include file Don Slutz
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This is used to set xen_arch_domainconfig vmware_hwver. It is set to
the emulated VMware virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the code only
checks for == 0, != 0, or < 7.
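
As a rough sketch of what a caller could do before passing a value in,
assuming two hypothetical helpers that are not part of this patch:
hwver_is_known_good() encodes the list of good values above, and
hwver_from_system_type() applies the "vmx-07" means 7 mapping that the
xl.cfg documentation added by this patch describes for .ovf files.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Values the commit message lists as good: 0, 3-4, 6-11. */
static int hwver_is_known_good(uint64_t v)
{
    return v == 0 || v == 3 || v == 4 || (v >= 6 && v <= 11);
}

/*
 * Hypothetical helper: map an OVF vssd:VirtualSystemType such as
 * "vmx-07" to a vmware_hwver value.  Returns 0 (off) if the string
 * does not have the "vmx-NN" form.
 */
static uint64_t hwver_from_system_type(const char *type)
{
    char *end;
    unsigned long v;

    assert(type != NULL);

    if ( strncmp(type, "vmx-", 4) != 0 ||
         !isdigit((unsigned char)type[4]) )
        return 0;

    v = strtoul(type + 4, &end, 10);

    return *end == '\0' ? v : 0;
}
```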

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v11:
  Dropped "If non-zero then default VGA to VMware's VGA"

v10:
    LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE &
    LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER are arriving together
    a single umbrella could be used.
      Since I split the LIBXL_VGA_INTERFACE_TYPE_VMWARE into
      its own patch, this is no longer true.
      But I did use 1 for the 2 c_info changes.
    Please use GCSPRINTF.
  Remove vga=vmware from here.

v9:
      I assumed that s/vmware_hw/vmware_hwver/ is not a big enough
      change to drop the Reviewed-by.  Did a minor edit to the
      commit message to add 7 to the list of values checked.

v7:
    Default handling of hvm.vga.kind bad.
      Fixed.
    Default of vmware_port should be based on vmware_hw.
      Done. 

v5:
      Anything looking for Xen according to the Xen cpuid instructions...
        Adjusted doc to new wording.

 docs/man/xl.cfg.pod.5       | 17 +++++++++++++++++
 tools/libxc/xc_domain.c     |  2 +-
 tools/libxl/libxl_create.c  |  4 +++-
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/libxl_x86.c     |  3 +--
 tools/libxl/xl_cmdimpl.c    |  2 ++
 6 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 84078f6..eaad4bf 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1348,6 +1348,23 @@ The viridian option can be specified as a boolean. A value of true (1)
 is equivalent to the list [ "defaults" ], and a value of false (0) is
 equivalent to an empty list.
 
+=item B<vmware_hwver=NUMBER>
+
+Turns on or off the exposure of VMware cpuid.  The number is
+VMware's hardware version number, where 0 is off.  A number >= 7
+is needed to enable exposure of VMware cpuid.
+
+The hardware version number (vmware_hwver) comes from VMware config files.
+
+=over 4
+
+In a .vmx it is virtualHW.version
+
+In a .ovf it is part of the value of vssd:VirtualSystemType.
+For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
+
+=back
+
 =back
 
 =head3 Emulated VGA Graphics Device
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 38d065f..4362d5d 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -64,7 +64,7 @@ int xc_domain_create(xc_interface *xch,
     memset(&config, 0, sizeof(config));
 
 #if defined (__i386) || defined(__x86_64__)
-    /* No arch-specific configuration for now */
+    /* No arch-specific default configuration for now */
 #elif defined (__arm__) || defined(__aarch64__)
     config.gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT;
     config.nr_spis = 0;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..895577f 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -464,7 +464,7 @@ int libxl__domain_build(libxl__gc *gc,
         vments[4] = "start_time";
         vments[5] = libxl__sprintf(gc, "%lu.%02d", start_time.tv_sec,(int)start_time.tv_usec/10000);
 
-        localents = libxl__calloc(gc, 9, sizeof(char *));
+        localents = libxl__calloc(gc, 11, sizeof(char *));
         i = 0;
         localents[i++] = "platform/acpi";
         localents[i++] = libxl_defbool_val(info->u.hvm.acpi) ? "1" : "0";
@@ -472,6 +472,8 @@ int libxl__domain_build(libxl__gc *gc,
         localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s3) ? "1" : "0";
         localents[i++] = "platform/acpi_s4";
         localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s4) ? "1" : "0";
+        localents[i++] = "platform/vmware_hwver";
+        localents[i++] = GCSPRINTF("%"PRIu64, d_config->c_info.vmware_hwver);
         if (info->u.hvm.mmio_hole_memkb) {
             uint64_t max_ram_below_4g =
                 (1ULL << 32) - (info->u.hvm.mmio_hole_memkb << 10);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6cab732..c8a1345 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -343,6 +343,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
     ("run_hotplug_scripts",libxl_defbool),
     ("pvh",          libxl_defbool),
     ("driver_domain",libxl_defbool),
+    ("vmware_hwver", uint64),
     ], dir=DIR_IN)
 
 libxl_domain_restore_params = Struct("domain_restore_params", [
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 651b338..fd7dafa 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -5,8 +5,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       xc_domain_configuration_t *xc_config)
 {
-    /* Note: will be changed in a later patch */
-    xc_config->vmware_hwver = 0;
+    xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
     return 0;
 }
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 02f5c7a..e79a9d0 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1383,6 +1383,8 @@ static void parse_config_data(const char *config_source,
     b_info->cmdline = parse_cmdline(config);
 
     xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
+    if (!xlu_cfg_get_long(config, "vmware_hwver",  &l, 1))
+        c_info->vmware_hwver = l;
 
     switch(b_info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
-- 
1.8.4


* [PATCH v11 4/9] vmware: Add VMware provided include file.
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (2 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 3/9] tools: Add vmware_hwver support Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-05-22 15:50 ` [PATCH v11 5/9] xen: Add vmware_port support Don Slutz
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This file: backdoor_def.h comes from:

http://packages.vmware.com/tools/esx/3.5latest/rhel4/SRPMS/index.html
 open-vm-tools-kmod-7.4.8-396269.423167.src.rpm
  open-vm-tools-kmod-7.4.8.tar.gz
   vmhgfs/backdoor_def.h

and is unchanged.

Added the badly named include file includeCheck.h also.  It only has
a comment and is provided so that backdoor_def.h can be used without
change.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v11:
  No change

v10:
   Add Acked-by: Andrew Cooper

v9:
    Either the description is wrong, or the patch is stale.
      stale commit message -- fixed.
    I'd say a file with a single comment line in it would suffice.
      Done.


 xen/arch/x86/hvm/vmware/backdoor_def.h | 167 +++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmware/includeCheck.h |   1 +
 2 files changed, 168 insertions(+)
 create mode 100644 xen/arch/x86/hvm/vmware/backdoor_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/includeCheck.h

diff --git a/xen/arch/x86/hvm/vmware/backdoor_def.h b/xen/arch/x86/hvm/vmware/backdoor_def.h
new file mode 100644
index 0000000..e76795f
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/backdoor_def.h
@@ -0,0 +1,167 @@
+/* **********************************************************
+ * Copyright 1998 VMware, Inc.  All rights reserved. 
+ * **********************************************************
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation version 2 and no later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ * or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
+ */
+
+/*
+ * backdoor_def.h --
+ *
+ * This contains backdoor defines that can be included from
+ * an assembly language file.
+ */
+
+
+
+#ifndef _BACKDOOR_DEF_H_
+#define _BACKDOOR_DEF_H_
+
+#define INCLUDE_ALLOW_MODULE
+#define INCLUDE_ALLOW_USERLEVEL
+#define INCLUDE_ALLOW_VMMEXT
+#define INCLUDE_ALLOW_VMCORE
+#define INCLUDE_ALLOW_VMKERNEL
+#include "includeCheck.h"
+
+/*
+ * If you want to add a new low-level backdoor call for a guest userland
+ * application, please consider using the GuestRpc mechanism instead. --hpreg
+ */
+
+#define BDOOR_MAGIC 0x564D5868
+
+/* Low-bandwidth backdoor port. --hpreg */
+
+#define BDOOR_PORT 0x5658
+
+#define BDOOR_CMD_GETMHZ      		   1
+/*
+ * BDOOR_CMD_APMFUNCTION is used by:
+ *
+ * o The FrobOS code, which instead should either program the virtual chipset
+ *   (like the new BIOS code does, matthias offered to implement that), or not
+ *   use any VM-specific code (which requires that we correctly implement
+ *   "power off on CLI HLT" for SMP VMs, boris offered to implement that)
+ *
+ * o The old BIOS code, which will soon be jettisoned
+ *
+ *  --hpreg
+ */
+#define BDOOR_CMD_APMFUNCTION 		   2
+#define BDOOR_CMD_GETDISKGEO  		   3
+#define BDOOR_CMD_GETPTRLOCATION	      4
+#define BDOOR_CMD_SETPTRLOCATION	      5
+#define BDOOR_CMD_GETSELLENGTH		   6
+#define BDOOR_CMD_GETNEXTPIECE		   7
+#define BDOOR_CMD_SETSELLENGTH		   8
+#define BDOOR_CMD_SETNEXTPIECE		   9
+#define BDOOR_CMD_GETVERSION		      10
+#define BDOOR_CMD_GETDEVICELISTELEMENT	11
+#define BDOOR_CMD_TOGGLEDEVICE		   12
+#define BDOOR_CMD_GETGUIOPTIONS		   13
+#define BDOOR_CMD_SETGUIOPTIONS		   14
+#define BDOOR_CMD_GETSCREENSIZE		   15
+#define BDOOR_CMD_MONITOR_CONTROL       16
+#define BDOOR_CMD_GETHWVERSION          17
+#define BDOOR_CMD_OSNOTFOUND            18
+#define BDOOR_CMD_GETUUID               19
+#define BDOOR_CMD_GETMEMSIZE            20
+#define BDOOR_CMD_HOSTCOPY              21 /* Devel only */
+/* BDOOR_CMD_GETOS2INTCURSOR, 22, is very old and defunct. Reuse. */
+#define BDOOR_CMD_GETTIME               23 /* Deprecated. Use GETTIMEFULL. */
+#define BDOOR_CMD_STOPCATCHUP           24
+#define BDOOR_CMD_PUTCHR	        25 /* Devel only */
+#define BDOOR_CMD_ENABLE_MSG	        26 /* Devel only */
+#define BDOOR_CMD_GOTO_TCL	        27 /* Devel only */
+#define BDOOR_CMD_INITPCIOPROM		28
+#define BDOOR_CMD_INT13			29
+#define BDOOR_CMD_MESSAGE               30
+#define BDOOR_CMD_RSVD0                 31
+#define BDOOR_CMD_RSVD1                 32
+#define BDOOR_CMD_RSVD2                 33
+#define BDOOR_CMD_ISACPIDISABLED	34
+#define BDOOR_CMD_TOE			35 /* Not in use */
+/* BDOOR_CMD_INITLSIOPROM, 36, was merged with 28. Reuse. */
+#define BDOOR_CMD_PATCH_SMBIOS_STRUCTS  37
+#define BDOOR_CMD_MAPMEM                38 /* Devel only */
+#define BDOOR_CMD_ABSPOINTER_DATA	39
+#define BDOOR_CMD_ABSPOINTER_STATUS	40
+#define BDOOR_CMD_ABSPOINTER_COMMAND	41
+#define BDOOR_CMD_TIMER_SPONGE          42
+#define BDOOR_CMD_PATCH_ACPI_TABLES	43
+/* Catch-all to allow synchronous tests */
+#define BDOOR_CMD_DEVEL_FAKEHARDWARE	44 /* Debug only - needed in beta */
+#define BDOOR_CMD_GETHZ      		45
+#define BDOOR_CMD_GETTIMEFULL           46
+#define BDOOR_CMD_STATELOGGER           47
+#define BDOOR_CMD_CHECKFORCEBIOSSETUP	48
+#define BDOOR_CMD_LAZYTIMEREMULATION    49
+#define BDOOR_CMD_BIOSBBS               50
+#define BDOOR_CMD_MAX                   51
+
+/* 
+ * IMPORTANT NOTE: When modifying the behavior of an existing backdoor command,
+ * you must adhere to the semantics expected by the oldest Tools who use that
+ * command. Specifically, do not alter the way in which the command modifies 
+ * the registers. Otherwise backwards compatibility will suffer.
+ */
+
+/* High-bandwidth backdoor port. --hpreg */
+
+#define BDOORHB_PORT 0x5659
+
+#define BDOORHB_CMD_MESSAGE 0
+#define BDOORHB_CMD_MAX 1
+
+/*
+ * There is another backdoor which allows access to certain TSC-related
+ * values using otherwise illegal PMC indices when the pseudo_perfctr
+ * control flag is set.
+ */
+
+#define BDOOR_PMC_HW_TSC      0x10000
+#define BDOOR_PMC_REAL_NS     0x10001
+#define BDOOR_PMC_APPARENT_NS 0x10002
+
+#define IS_BDOOR_PMC(index)  (((index) | 3) == 0x10003)
+#define BDOOR_CMD(ecx)       ((ecx) & 0xffff)
+
+
+#ifdef VMM
+/*
+ *----------------------------------------------------------------------
+ *
+ * Backdoor_CmdRequiresFullyValidVCPU --
+ *
+ *    A few backdoor commands require the full VCPU to be valid
+ *    (including GDTR, IDTR, TR and LDTR). The rest get read/write
+ *    access to GPRs and read access to Segment registers (selectors).
+ *
+ * Result:
+ *    True iff VECX contains a command that require the full VCPU to
+ *    be valid.
+ *
+ *----------------------------------------------------------------------
+ */
+static INLINE Bool
+Backdoor_CmdRequiresFullyValidVCPU(unsigned cmd)
+{
+   return cmd == BDOOR_CMD_RSVD0 ||
+          cmd == BDOOR_CMD_RSVD1 ||
+          cmd == BDOOR_CMD_RSVD2;
+}
+#endif
+
+#endif
diff --git a/xen/arch/x86/hvm/vmware/includeCheck.h b/xen/arch/x86/hvm/vmware/includeCheck.h
new file mode 100644
index 0000000..3b63fa4
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/includeCheck.h
@@ -0,0 +1 @@
+/* Nothing here.  Just to use backdoor_def.h without change. */
-- 
1.8.4


* [PATCH v11 5/9] xen: Add vmware_port support
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (3 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 4/9] vmware: Add VMware provided include file Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-05  9:52   ` Jan Beulich
  2015-05-22 15:50 ` [PATCH v11 6/9] xen: Add ring 3 " Don Slutz
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This includes adding is_vmware_port_enabled

This is a new xen_arch_domainconfig flag,
XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK.

This enables limited support of VMware's hyper-call.

This is both more complete and less complete than the support
currently provided by QEMU and/or KVM.  The missing part requires QEMU
changes and has been left out until the QEMU patches are accepted upstream.

VMware's hyper-call is also known as VMware Backdoor I/O Port.

Note: this support does not depend on vmware_hwver being non-zero.

In summary, VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially.  Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use.  Other operand sizes such
as "in (%dx),%al" still work, but only the AL part of EAX is
changed.  For "out %eax,(%dx)" of any length, EAX remains
unchanged.
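
The effect on EAX can be modelled with a small sketch.  This models the
description above, not vmport.c itself; the helper names eax_after_in()
and eax_after_out() are hypothetical, and 'result' stands for whatever
32-bit value the backdoor command produced.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: value of EAX after an IN of 'bytes' bytes from
 * the backdoor port.  Only the low 'bytes' bytes of EAX are replaced.
 */
static uint32_t eax_after_in(uint32_t old_eax, uint32_t result,
                             unsigned int bytes)
{
    uint32_t mask;

    assert(bytes == 1 || bytes == 2 || bytes == 4);

    /* Avoid shifting a 32-bit value by 32 for the 4-byte case. */
    mask = (bytes == 4) ? 0xffffffffu : ((1u << (8 * bytes)) - 1);

    return (old_eax & ~mask) | (result & mask);
}

/* An OUT of any length leaves EAX as it was. */
static uint32_t eax_after_out(uint32_t old_eax)
{
    return old_eax;
}
```

With old EAX 0x11223344 and a result of 0xaabbccdd, a 1-byte IN leaves
0x112233dd, which is why the 4-byte form is the useful one.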

An open source example of using this is:

http://open-vm-tools.sourceforge.net/

Which only uses "inl (%dx)".  Also

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

Some of the best info is at:

https://sites.google.com/site/chitchatvmback/backdoor

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v11:
   Dropped ASSERT(is_hvm_domain(currd))
    Newline after break;

v10:
    Probably better as EOPNOTSUPP, as it is a configuration problem.
    This function looks as if it should be static.
    I would suggest putting vmport_register declaration in hvm.h ...
    As indicated before, I don't think this is a good use case for a
    domain creation flag.
      Switch to the new config way.
    struct domain *d => struct domain *currd
    Are you sure you don't want to zero the high halves of 64-bit ...
      Comment added.
   Then just have this handled into the default case.
      Reworked new_eax handling.
   is_hvm_domain(currd)
   And - why here rather than before the switch() or even right at the
   start of the function?
      Moved to start.
   With that, is it really correct that OUT updates the other registers
   just like IN? If so, this deserves a comment, so that readers won't
   think this is in error.
     All done in comment at start.


v9:
  Switch to x86_emulator to handle #GP code moved to next patch.
    Can you explain why a HVM param isn't suitable here?
      Issue with changing QEMU on the fly.
      Andrew Cooper: My recommendation is still to use a creation flag
        So no change.
    Please move SVM's identical definition into ...
      Did this as #1.  No longer needed, but since the patch was ready
      I have included it.
    --Lots of questions about code that is no longer part of this patch. --
    With this, is handling other than 32-bit in/out really
    meaningful/correct?
      Added comment about this.
    Since you can't get here for PV, I can't see what you need this.
      Changed to an ASSERT.
    Why version 4?
      Added comment about this.
    -- Several questions about register changes.
      Re-coded to use new_eax and set *val to this.
      Change to generally use reg->_e..
    These ei1/ei2 checks belong in the callers imo -
      Moved.
    the "port" function parameter isn't even checked
      Add check for exact match.
    If dropping the code is safe without also forbidding the
    combination of nested and VMware emulation.
      Added the forbidding the combination of nested and VMware.
      Mostly because in the nested case the nested virtualization code
      is the one to handle VMware stuff if needed, not the root one.
      Also I am having issues testing Xen nested in Xen and using HVM.

v7:
      More on AMD in the commit message.
      Switch to only change 32bit part of registers, what VMware
        does.
    Too much logging and tracing.
      Dropped a lot of it.  This includes vmport_debug=

v6:
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.

v5:
      we should make sure that svm_vmexit_gp_intercept is not executed for
      any other guest.
        Added an ASSERT on is_vmware_port_enabled.
      magic integers?
        Added #define for them.
      I am fairly certain that you need some brackets here.
        Added brackets.

 xen/arch/x86/domain.c             |   2 +
 xen/arch/x86/hvm/hvm.c            |   9 +++
 xen/arch/x86/hvm/vmware/Makefile  |   1 +
 xen/arch/x86/hvm/vmware/vmport.c  | 148 ++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/domain.h  |   3 +
 xen/include/asm-x86/hvm/hvm.h     |   2 +
 xen/include/public/arch-x86/xen.h |   6 ++
 7 files changed, 171 insertions(+)
 create mode 100644 xen/arch/x86/hvm/vmware/vmport.c

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7de9dd3..a588d79 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -539,6 +539,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
         d->arch.hvm_domain.hap_enabled =
             hvm_funcs.hap_supported && (domcr_flags & DOMCRF_hap);
         d->arch.hvm_domain.vmware_hwver = config->vmware_hwver;
+        d->arch.hvm_domain.is_vmware_port_enabled =
+            !!(config->arch_flags & XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK);
 
         rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL);
     }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c464b29..2752197 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1480,6 +1480,9 @@ int hvm_domain_initialise(struct domain *d)
     else
         d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
+    if ( d->arch.hvm_domain.is_vmware_port_enabled )
+        vmport_register(d);
+
     if ( is_pvh_domain(d) )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
@@ -5805,6 +5808,12 @@ static int hvmop_set_param(
             break;
         if ( a.value > 1 )
             rc = -EINVAL;
+        /* Prevent nestedhvm with vmport */
+        if ( d->arch.hvm_domain.is_vmware_port_enabled )
+        {
+            rc = -EOPNOTSUPP;
+            break;
+        }
         /*
          * Remove the check below once we have
          * shadow-on-shadow.
diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
index 3fb2e0b..cd8815b 100644
--- a/xen/arch/x86/hvm/vmware/Makefile
+++ b/xen/arch/x86/hvm/vmware/Makefile
@@ -1 +1,2 @@
 obj-y += cpuid.o
+obj-y += vmport.o
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
new file mode 100644
index 0000000..f24d8e3
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -0,0 +1,148 @@
+/*
+ * HVM VMPORT emulation
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/lib.h>
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/support.h>
+
+#include "backdoor_def.h"
+
+static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+    /*
+     * While VMware expects only 32-bit in, they do support using
+     * other sizes and out.  However they do require only the 1 port
+     * and the correct value in eax.  Since some of the data
+     * returned in eax is smaller than 32 bits and/or you only need
+     * the other registers, the dir and bytes do not need any
+     * checking.  The caller will handle the bytes, and dir is
+     * handled below for eax.
+     */
+    if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
+    {
+        uint32_t new_eax = ~0u;
+        uint64_t value;
+        struct vcpu *curr = current;
+        struct domain *currd = curr->domain;
+
+        /*
+         * VMware changes the other (non eax) registers ignoring dir
+         * (IN vs OUT).  It also changes only the 32-bit part
+         * leaving the high 32-bits unchanged, unlike what one would
+         * expect to happen.
+         */
+        switch ( regs->_ecx & 0xffff )
+        {
+        case BDOOR_CMD_GETMHZ:
+            new_eax = currd->arch.tsc_khz / 1000;
+            break;
+
+        case BDOOR_CMD_GETVERSION:
+            /* MAGIC */
+            regs->_ebx = BDOOR_MAGIC;
+            /* VERSION_MAGIC */
+            new_eax = 6;
+            /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
+            regs->_ecx = 2;
+            break;
+
+        case BDOOR_CMD_GETHWVERSION:
+            /* vmware_hw */
+            new_eax = currd->arch.hvm_domain.vmware_hwver;
+            /*
+             * Returning zero is not the best.  VMware was not at
+             * all consistent in the handling of this command until
+             * VMware hardware version 4.  So it is better to claim
+             * 4 than 0.  This should only happen in strange configs.
+             */
+            if ( !new_eax )
+                new_eax = 4;
+            break;
+
+        case BDOOR_CMD_GETHZ:
+        {
+            struct segment_register sreg;
+
+            hvm_get_segment_register(curr, x86_seg_ss, &sreg);
+            if ( sreg.attr.fields.dpl == 0 )
+            {
+                value = currd->arch.tsc_khz * 1000ULL;
+                /* apic-frequency (bus speed) */
+                regs->_ecx = 1000000000ULL / APIC_BUS_CYCLE_NS;
+                /* High part of tsc-frequency */
+                regs->_ebx = value >> 32;
+                /* Low part of tsc-frequency */
+                new_eax = value;
+            }
+            break;
+
+        }
+        case BDOOR_CMD_GETTIME:
+            value = get_localtime_us(currd) -
+                currd->time_offset_seconds * 1000000ULL;
+            /* hostUsecs */
+            regs->_ebx = value % 1000000UL;
+            /* hostSecs */
+            new_eax = value / 1000000ULL;
+            /* maxTimeLag */
+            regs->_ecx = 1000000;
+            /* offset to GMT in minutes */
+            regs->_edx = currd->time_offset_seconds / 60;
+            break;
+
+        case BDOOR_CMD_GETTIMEFULL:
+            /* BDOOR_MAGIC */
+            new_eax = BDOOR_MAGIC;
+            value = get_localtime_us(currd) -
+                currd->time_offset_seconds * 1000000ULL;
+            /* hostUsecs */
+            regs->_ebx = value % 1000000UL;
+            /* hostSecs low 32 bits */
+            regs->_edx = value / 1000000ULL;
+            /* hostSecs high 32 bits */
+            regs->_esi = (value / 1000000ULL) >> 32;
+            /* maxTimeLag */
+            regs->_ecx = 1000000;
+            break;
+
+        default:
+            /* Let backing DM handle */
+            return X86EMUL_UNHANDLEABLE;
+        }
+        if ( dir == IOREQ_READ )
+            *val = new_eax;
+    }
+    else if ( dir == IOREQ_READ )
+        *val = ~0u;
+
+    return X86EMUL_OKAY;
+}
+
+void vmport_register(struct domain *d)
+{
+    register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index e30fd8a..b435689 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -124,6 +124,9 @@ struct hvm_domain {
     spinlock_t             uc_lock;
     bool_t                 is_in_uc_mode;
 
+    /* VMware backdoor port available */
+    bool_t                 is_vmware_port_enabled;
+
     /* Pass-through */
     struct hvm_iommu       hvm_iommu;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 2965fbb..e76f612 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -522,6 +522,8 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+void vmport_register(struct domain *d);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index f84d10d..53b84da 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -267,8 +267,14 @@ typedef struct arch_shared_info arch_shared_info_t;
  * struct xen_arch_domainconfig's ABI is covered by
  * XEN_DOMCTL_INTERFACE_VERSION.
  */
+
+/* Enable use of vmware backdoor port. */
+#define XEN_DOMCTL_CONFIG_VMWARE_PORT_BIT   0
+#define XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK  (1U << XEN_DOMCTL_CONFIG_VMWARE_PORT_BIT)
+
 struct xen_arch_domainconfig {
     uint64_t vmware_hwver;
+    uint64_t arch_flags;
 };
 #endif
 
-- 
1.8.4


* [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (4 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 5/9] xen: Add vmware_port support Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-03 15:26   ` George Dunlap
  2015-06-23 16:14   ` Jan Beulich
  2015-05-22 15:50 ` [PATCH v11 7/9] tools: Add " Don Slutz
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially.  Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
"in (%dx),%al" will still do things, only AL part of EAX will be
changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
unchanged.
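
As a rough illustration of the size rules above (this is not code
from this series; merge_in_result() is a hypothetical helper used
only to model the behaviour), the effect of an IN of 1, 2 or 4 bytes
on the guest's EAX could be sketched as:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of this series: model the guest's EAX
 * after an IN of 1, 2 or 4 bytes from the backdoor port.  Only the low
 * "bytes" bytes of EAX are replaced by the command's result; an OUT of
 * any size would leave EAX untouched.
 */
uint32_t merge_in_result(uint32_t old_eax, uint32_t result,
                         unsigned int bytes)
{
    uint32_t mask;

    assert(bytes == 1 || bytes == 2 || bytes == 4);
    mask = (bytes == 4) ? ~0u : ((1u << (bytes * 8)) - 1);

    return (old_eax & ~mask) | (result & mask);
}
```

With a 1-byte IN only AL changes, so the upper bytes of EAX keep
whatever the guest loaded before the access, which is why the
32-bit form is the one to use when the full result is wanted.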

This instruction is allowed to be used from ring 3.  To
support this the vmexit for GP needs to be enabled.  I have not
fully tested that nested HVM is doing the right thing for this.

Enable no-fault of PIO in x86_emulate for the VMware port.

Also adjust the emulation registers after doing a VMware
backdoor operation.

Add new routine hvm_emulate_one_gp() to be used by the #GP fault
handler.

Some of the best info is at:

https://sites.google.com/site/chitchatvmback/backdoor

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v11:
  No change

v10:
   Re-worked to be simpler.

v9:
   Split #GP handling (or skipping of #GP) code out of previous
   patch to help with the review process.
   Switch to x86_emulator to handle #GP
   I think the hvm_emulate_ops_gp() covers all needed ops.  Not able to validate
   all paths through _hvm_emulate_one().

 xen/arch/x86/hvm/emulate.c             | 54 ++++++++++++++++++++++++++++++++--
 xen/arch/x86/hvm/svm/svm.c             | 26 ++++++++++++++++
 xen/arch/x86/hvm/svm/vmcb.c            |  2 ++
 xen/arch/x86/hvm/vmware/vmport.c       | 11 +++++++
 xen/arch/x86/hvm/vmx/vmcs.c            |  2 ++
 xen/arch/x86/hvm/vmx/vmx.c             | 37 +++++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 13 +++++++-
 xen/arch/x86/x86_emulate/x86_emulate.h |  5 ++++
 xen/include/asm-x86/hvm/emulate.h      |  2 ++
 xen/include/asm-x86/hvm/hvm.h          |  1 +
 10 files changed, 150 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ac9c9d6..d5e6468 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -803,6 +803,27 @@ static int hvmemul_wbinvd_discard(
     return X86EMUL_OKAY;
 }
 
+static int hvmemul_write_gp(
+    unsigned int seg,
+    unsigned long offset,
+    void *p_data,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt)
+{
+    return X86EMUL_EXCEPTION;
+}
+
+static int hvmemul_cmpxchg_gp(
+    unsigned int seg,
+    unsigned long offset,
+    void *old,
+    void *new,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt)
+{
+    return X86EMUL_EXCEPTION;
+}
+
 static int hvmemul_cmpxchg(
     enum x86_segment seg,
     unsigned long offset,
@@ -1356,6 +1377,13 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmport_check(
+    unsigned int first_port,
+    struct x86_emulate_ctxt *ctxt)
+{
+    return vmport_check_port(first_port);
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1379,7 +1407,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmport_check  = hvmemul_vmport_check,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1405,7 +1434,22 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmport_check  = hvmemul_vmport_check,
+};
+
+static const struct x86_emulate_ops hvm_emulate_ops_gp = {
+    .read          = hvmemul_read,
+    .insn_fetch    = hvmemul_insn_fetch,
+    .write         = hvmemul_write_gp,
+    .cmpxchg       = hvmemul_cmpxchg_gp,
+    .read_segment  = hvmemul_read_segment,
+    .write_segment = hvmemul_write_segment,
+    .read_io       = hvmemul_read_io,
+    .write_io      = hvmemul_write_io,
+    .inject_hw_exception = hvmemul_inject_hw_exception,
+    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
+    .vmport_check  = hvmemul_vmport_check,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
@@ -1522,6 +1566,12 @@ int hvm_emulate_one(
     return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops);
 }
 
+int hvm_emulate_one_gp(
+    struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+    return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops_gp);
+}
+
 int hvm_emulate_one_no_write(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 6734fb6..62baf3c 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2119,6 +2119,28 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
     return;
 }
 
+static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
+                                    struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+    int rc;
+
+    if ( vmcb->exitinfo1 != 0 || vmcb->exitinfo2 != 0 )
+        rc = X86EMUL_EXCEPTION;
+    else
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, regs);
+        rc = hvm_emulate_one_gp(&ctxt);
+
+        if ( rc == X86EMUL_OKAY )
+            hvm_emulate_writeback(&ctxt);
+    }
+    if ( rc != X86EMUL_OKAY && rc != X86EMUL_RETRY )
+        hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
+}
+
 static void svm_vmexit_ud_intercept(struct cpu_user_regs *regs)
 {
     struct hvm_emulate_ctxt ctxt;
@@ -2484,6 +2506,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         break;
     }
 
+    case VMEXIT_EXCEPTION_GP:
+        svm_vmexit_gp_intercept(regs, v);
+        break;
+
     case VMEXIT_EXCEPTION_UD:
         svm_vmexit_ud_intercept(regs);
         break;
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 6339d2a..7683c09 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -195,6 +195,8 @@ static int construct_vmcb(struct vcpu *v)
         HVM_TRAP_MASK
         | (1U << TRAP_no_device);
 
+    if ( v->domain->arch.hvm_domain.is_vmware_port_enabled )
+        vmcb->_exception_intercepts |= 1U << TRAP_gp_fault;
     if ( paging_mode_hap(v->domain) )
     {
         vmcb->_np_enable = 1; /* enable nested paging */
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
index f24d8e3..36e3f1b 100644
--- a/xen/arch/x86/hvm/vmware/vmport.c
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -137,6 +137,17 @@ void vmport_register(struct domain *d)
     register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
 }
 
+int vmport_check_port(unsigned int port)
+{
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+
+    if ( port == BDOOR_PORT && is_hvm_domain(currd) &&
+         currd->arch.hvm_domain.is_vmware_port_enabled )
+        return 0;
+    return 1;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 877ec10..54360b0 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1151,6 +1151,8 @@ static int construct_vmcs(struct vcpu *v)
 
     v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
               | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
+              | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
+                 (1U << TRAP_gp_fault) : 0)
               | (1U << TRAP_no_device);
     vmx_update_exception_bitmap(v);
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 74f563f..fe88afe 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1312,6 +1312,8 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
                 v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
                           | (paging_mode_hap(v->domain) ?
                              0 : (1U << TRAP_page_fault))
+                          | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
+                             (1U << TRAP_gp_fault) : 0)
                           | (1U << TRAP_no_device);
                 vmx_update_exception_bitmap(v);
                 vmx_update_debug_state(v);
@@ -2671,6 +2673,38 @@ static void vmx_idtv_reinject(unsigned long idtv_info)
     }
 }
 
+static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
+                                    struct vcpu *v)
+{
+    unsigned long exit_qualification;
+    unsigned long ecode;
+    int rc;
+    unsigned long vector;
+
+    __vmread(VM_EXIT_INTR_INFO, &vector);
+    ASSERT(vector & INTR_INFO_VALID_MASK);
+    ASSERT(vector & INTR_INFO_DELIVER_CODE_MASK);
+
+    __vmread(EXIT_QUALIFICATION, &exit_qualification);
+    __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
+
+    if ( ecode != 0 || exit_qualification != 0 )
+        rc = X86EMUL_EXCEPTION;
+    else
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, regs);
+        rc = hvm_emulate_one_gp(&ctxt);
+
+        if ( rc == X86EMUL_OKAY )
+            hvm_emulate_writeback(&ctxt);
+    }
+
+    if ( rc != X86EMUL_OKAY && rc != X86EMUL_RETRY )
+        hvm_inject_hw_exception(TRAP_gp_fault, ecode);
+}
+
 static int vmx_handle_apic_write(void)
 {
     unsigned long exit_qualification;
@@ -2895,6 +2929,9 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             HVMTRACE_1D(TRAP, vector);
             vmx_fpu_dirty_intercept();
             break;
+        case TRAP_gp_fault:
+            vmx_vmexit_gp_intercept(regs, v);
+            break;
         case TRAP_page_fault:
             __vmread(EXIT_QUALIFICATION, &exit_qualification);
             __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..d3ea143 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3393,8 +3393,11 @@ x86_emulate(
         unsigned int port = ((b < 0xe8)
                              ? insn_fetch_type(uint8_t)
                              : (uint16_t)_regs.edx);
+        bool_t vmport = (ops->vmport_check && /* VMware backdoor? */
+                         (ops->vmport_check(port, ctxt) == 0));
         op_bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes;
-        if ( (rc = ioport_access_check(port, op_bytes, ctxt, ops)) != 0 )
+        if ( !vmport &&
+             (rc = ioport_access_check(port, op_bytes, ctxt, ops)) != 0 )
             goto done;
         if ( b & 2 )
         {
@@ -3413,6 +3416,14 @@ x86_emulate(
         }
         if ( rc != 0 )
             goto done;
+        if ( vmport )
+        {
+            _regs._ebx = ctxt->regs->_ebx;
+            _regs._ecx = ctxt->regs->_ecx;
+            _regs._edx = ctxt->regs->_edx;
+            _regs._esi = ctxt->regs->_esi;
+            _regs._edi = ctxt->regs->_edi;
+        }
         break;
     }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..d914b5e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,11 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmport_check */
+    int (*vmport_check)(
+        unsigned int port,
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index b3971c8..4386169 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -36,6 +36,8 @@ struct hvm_emulate_ctxt {
 
 int hvm_emulate_one(
     struct hvm_emulate_ctxt *hvmemul_ctxt);
+int hvm_emulate_one_gp(
+    struct hvm_emulate_ctxt *hvmemul_ctxt);
 int hvm_emulate_one_no_write(
     struct hvm_emulate_ctxt *hvmemul_ctxt);
 void hvm_mem_access_emulate_one(bool_t nowrite,
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index e76f612..c42f7d8 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -523,6 +523,7 @@ extern bool_t opt_hvm_fep;
 #endif
 
 void vmport_register(struct domain *d);
+int vmport_check_port(unsigned int port);
 
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
-- 
1.8.4


* [PATCH v11 7/9] tools: Add vmware_port support
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (5 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 6/9] xen: Add ring 3 " Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-03 17:06   ` George Dunlap
  2015-06-04 15:20   ` Ian Campbell
  2015-05-22 15:50 ` [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT Don Slutz
  2015-05-22 15:50 ` [PATCH v11 9/9] Add xentrace to vmware_port Don Slutz
  8 siblings, 2 replies; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This new libxl_domain_create_info field is used to set
XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK in the xc_domain_configuration_t
for x86.

In Xen this is is_vmware_port_enabled.

If is_vmware_port_enabled is set, enable limited support for
VMware's hyper-call.

VMware's hyper-call is also known as the VMware Backdoor I/O Port.

If vmware_port is not specified in the config file, let
"vmware_hwver != 0" be the default value.  This means that only
vmware_hwver = 7 needs to be specified to enable both features.

vmware_hwver = 7 is special because that is what controls the
enabling of CPUID leaves for VMware (vmware_hwver >= 7).
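
A minimal sketch of the defaulting rule described above.  This is
not the libxl code: enum tristate is a hypothetical stand-in for
libxl_defbool, and all function names here are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for libxl_defbool: unset, or explicitly off/on. */
enum tristate { TRI_DEFAULT, TRI_FALSE, TRI_TRUE };

/* Only an unset vmware_port picks up the vmware_hwver based default. */
bool resolve_vmware_port(enum tristate vmware_port, uint64_t vmware_hwver)
{
    if ( vmware_port == TRI_DEFAULT )
        return vmware_hwver != 0;

    return vmware_port == TRI_TRUE;
}

/* The VMware CPUID leaves are exposed only for vmware_hwver >= 7. */
bool vmware_cpuid_exposed(uint64_t vmware_hwver)
{
    return vmware_hwver >= 7;
}

/* vmware_port and nestedhvm are mutually exclusive. */
bool config_is_valid(bool vmware_port, bool nestedhvm)
{
    return !(vmware_port && nestedhvm);
}

/* Walk through the cases mentioned in the text. */
void check_examples(void)
{
    /* vmware_hwver = 7 alone turns on both features. */
    assert(resolve_vmware_port(TRI_DEFAULT, 7));
    assert(vmware_cpuid_exposed(7));

    /* An explicit vmware_port = 0 overrides the default. */
    assert(!resolve_vmware_port(TRI_FALSE, 7));

    /* Nothing specified: both stay off. */
    assert(!resolve_vmware_port(TRI_DEFAULT, 0));
    assert(!vmware_cpuid_exposed(0));
}
```

Note that a vmware_hwver between 1 and 6 would, under this rule,
enable the port by default without exposing the CPUID leaves.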

Note: vmware_port and nestedhvm cannot be specified at the
same time.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v11:
  Dropped "If non-zero then default VGA to VMware's VGA"

v10:
    "If..." at the start of the sentence ...
    Also, why is 7 special?


 docs/man/xl.cfg.pod.5       | 15 +++++++++++++++
 tools/libxl/libxl.h         |  5 +++++
 tools/libxl/libxl_create.c  |  9 +++++++++
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/libxl_x86.c     |  2 ++
 tools/libxl/xl_cmdimpl.c    |  1 +
 6 files changed, 33 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index eaad4bf..00aa78f 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1354,6 +1354,8 @@ Turns on or off the exposure of VMware cpuid.  The number is
 VMware's hardware version number, where 0 is off.  A number >= 7
 is needed to enable exposure of VMware cpuid.
 
+If non-zero, it changes the default for vmware_port to on.
+
 The hardware version number (vmware_hwver) come from VMware config files.
 
 =over 4
@@ -1365,6 +1367,19 @@ For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
 
 =back
 
+=item B<vmware_port=BOOLEAN>
+
+Turns on or off the exposure of the VMware port.  This is known as
+vmport in QEMU.  Also called the VMware Backdoor I/O Port.  Not all
+defined VMware backdoor commands are implemented.  All of the
+ones that the Linux kernel uses are defined.
+
+Defaults to enabled if vmware_hwver is non-zero (i.e. enabled),
+otherwise defaults to disabled.
+
+Note: vmware_port and nestedhvm cannot be specified at the
+same time.
+
 =back
 
 =head3 Emulated VGA Graphics Device
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 86164a7..fcce7c3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -205,6 +205,11 @@
 #define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
 
 /*
+ * libxl_domain_create_info has the vmware_hwver and vmware_port fields.
+ */
+#define LIBXL_HAVE_CREATEINFO_VMWARE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 895577f..ac05ecc 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -41,6 +41,7 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&c_info->hap, libxl_defbool_val(c_info->pvh));
     }
 
+    libxl_defbool_setdefault(&c_info->vmware_port, c_info->vmware_hwver != 0);
     libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true);
     libxl_defbool_setdefault(&c_info->driver_domain, false);
 
@@ -917,6 +918,14 @@ static void initiate_domain_create(libxl__egc *egc,
     ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
     if (ret) goto error_out;
 
+    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
+        libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
+        libxl_defbool_val(d_config->c_info.vmware_port)) {
+        LOG(ERROR,
+            "vmware_port and nestedhvm cannot be enabled simultaneously\n");
+        ret = ERROR_INVAL;
+        goto error_out;
+    }
     if (!sched_params_valid(gc, domid, &d_config->b_info.sched_params)) {
         LOG(ERROR, "Invalid scheduling parameters\n");
         ret = ERROR_INVAL;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index c8a1345..c7af74b 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -344,6 +344,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
     ("pvh",          libxl_defbool),
     ("driver_domain",libxl_defbool),
     ("vmware_hwver", uint64),
+    ("vmware_port",  libxl_defbool),
     ], dir=DIR_IN)
 
 libxl_domain_restore_params = Struct("domain_restore_params", [
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index fd7dafa..404904a 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -6,6 +6,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       xc_domain_configuration_t *xc_config)
 {
     xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
+    if (libxl_defbool_val(d_config->c_info.vmware_port))
+        xc_config->arch_flags |= XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK;
     return 0;
 }
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index e79a9d0..b3fe0cd 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1230,6 +1230,7 @@ static void parse_config_data(const char *config_source,
     }
 
     xlu_cfg_get_defbool(config, "oos", &c_info->oos, 0);
+    xlu_cfg_get_defbool(config, "vmware_port", &c_info->vmware_port, 0);
 
     if (!xlu_cfg_get_string (config, "pool", &buf, 0))
         xlu_cfg_replace_string(config, "pool", &c_info->pool_name, 0);
-- 
1.8.4


* [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (6 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 7/9] tools: Add " Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-03 17:09   ` George Dunlap
  2015-05-22 15:50 ` [PATCH v11 9/9] Add xentrace to vmware_port Don Slutz
  8 siblings, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This adds synchronization of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c.
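
To make the "only 32bits of them" point concrete, here is a small
sketch of the copy in both directions.  The struct layouts and
function names are hypothetical and only for illustration; the
real types are the ones defined in this patch.  EAX is assumed to
travel in the ioreq data field, so only the other five registers
are shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit guest register file (stand-in for cpu_user_regs). */
struct guest_regs_sketch {
    uint64_t rbx, rcx, rdx, rsi, rdi;
};

/* Hypothetical shared layout holding only the low 32 bits. */
struct vmport_regs_sketch {
    uint32_t ebx, ecx, edx, esi, edi;
};

/* Replace the low 32 bits of a register, keeping the high half as is. */
uint64_t set_low32(uint64_t reg, uint32_t low)
{
    return (reg & 0xffffffff00000000ULL) | low;
}

/* Xen -> QEMU direction: hand over the 32-bit parts. */
void regs_to_shared(const struct guest_regs_sketch *g,
                    struct vmport_regs_sketch *s)
{
    assert(g && s);
    s->ebx = (uint32_t)g->rbx;
    s->ecx = (uint32_t)g->rcx;
    s->edx = (uint32_t)g->rdx;
    s->esi = (uint32_t)g->rsi;
    s->edi = (uint32_t)g->rdi;
}

/* QEMU -> Xen direction: the high 32 bits of each register survive. */
void shared_to_regs(const struct vmport_regs_sketch *s,
                    struct guest_regs_sketch *g)
{
    assert(g && s);
    g->rbx = set_low32(g->rbx, s->ebx);
    g->rcx = set_low32(g->rcx, s->ecx);
    g->rdx = set_low32(g->rdx, s->edx);
    g->rsi = set_low32(g->rsi, s->esi);
    g->rdi = set_low32(g->rdi, s->edi);
}
```

Passing the registers along with the request is what removes the
need for the extra round trips to fetch and store them.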

In the tools, enable usage of QEMU's vmport code.

The currently most useful VMware port support that QEMU has is the
VMware mouse support.  Xorg includes VMware mouse support that
uses absolute mode.  This makes using a mouse in X11 much nicer.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v11:
  No change

v10:
  These literals should become an enum.
    I don't think the invalidate type is needed.
    Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
    is unclear.
    Comment about "'special' range of 1" is not clear.


v9:
  New code was presented as an RFC before this.

  Paul Durrant suggested I add support for other IOREQ types
  to HVMOP_map_io_range_to_ioreq_server.
    I have done this.

 tools/libxc/xc_hvm_build_x86.c   |   5 +-
 tools/libxl/libxl_dm.c           |   2 +
 xen/arch/x86/hvm/emulate.c       |  78 ++++++++++++++---
 xen/arch/x86/hvm/hvm.c           | 182 ++++++++++++++++++++++++++++++++++-----
 xen/arch/x86/hvm/io.c            |  16 ++++
 xen/include/asm-x86/hvm/domain.h |   3 +-
 xen/include/asm-x86/hvm/hvm.h    |   1 +
 xen/include/public/hvm/hvm_op.h  |   5 ++
 xen/include/public/hvm/ioreq.h   |  17 ++++
 xen/include/public/hvm/params.h  |   4 +-
 10 files changed, 274 insertions(+), 39 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index e45ae4a..ffe52eb 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -46,7 +46,8 @@
 #define SPECIALPAGE_IOREQ    5
 #define SPECIALPAGE_IDENT_PT 6
 #define SPECIALPAGE_CONSOLE  7
-#define NR_SPECIAL_PAGES     8
+#define SPECIALPAGE_VMPORT_REGS 8
+#define NR_SPECIAL_PAGES     9
 #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
 
 #define NR_IOREQ_SERVER_PAGES 8
@@ -569,6 +570,8 @@ static int setup_guest(xc_interface *xch,
                      special_pfn(SPECIALPAGE_BUFIOREQ));
     xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_PFN,
                      special_pfn(SPECIALPAGE_IOREQ));
+    xc_hvm_param_set(xch, dom, HVM_PARAM_VMPORT_REGS_PFN,
+                     special_pfn(SPECIALPAGE_VMPORT_REGS));
     xc_hvm_param_set(xch, dom, HVM_PARAM_CONSOLE_PFN,
                      special_pfn(SPECIALPAGE_CONSOLE));
     xc_hvm_param_set(xch, dom, HVM_PARAM_PAGING_RING_PFN,
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index ce08461..b68c651 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -814,6 +814,8 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                                             machinearg, max_ram_below_4g);
             }
         }
+        if (libxl_defbool_val(c_info->vmware_port))
+            machinearg = GCSPRINTF("%s,vmport=on", machinearg);
         flexarray_append(dm_args, machinearg);
         for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
             flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d5e6468..0a42d18 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -219,27 +219,70 @@ static int hvmemul_do_io(
             vio->io_state = HVMIO_handle_mmio_awaiting_completion;
         break;
     case X86EMUL_UNHANDLEABLE:
-    {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
-
-        /* If there is no suitable backing DM, just ignore accesses */
-        if ( !s )
+        if ( vmport_check_port(p.addr) )
         {
-            hvm_complete_assist_req(&p);
-            rc = X86EMUL_OKAY;
-            vio->io_state = HVMIO_none;
+            struct hvm_ioreq_server *s =
+                hvm_select_ioreq_server(curr->domain, &p);
+
+            /* If there is no suitable backing DM, just ignore accesses */
+            if ( !s )
+            {
+                hvm_complete_assist_req(&p);
+                rc = X86EMUL_OKAY;
+                vio->io_state = HVMIO_none;
+            }
+            else
+            {
+                rc = X86EMUL_RETRY;
+                if ( !hvm_send_assist_req(s, &p) )
+                    vio->io_state = HVMIO_none;
+                else if ( p_data == NULL )
+                    rc = X86EMUL_OKAY;
+            }
         }
         else
         {
-            rc = X86EMUL_RETRY;
-            if ( !hvm_send_assist_req(s, &p) )
-                vio->io_state = HVMIO_none;
-            else if ( p_data == NULL )
+            struct hvm_ioreq_server *s;
+            vmware_regs_t *vr;
+
+            BUILD_BUG_ON(sizeof(ioreq_t) < sizeof(vmware_regs_t));
+
+            p.type = IOREQ_TYPE_VMWARE_PORT;
+            s = hvm_select_ioreq_server(curr->domain, &p);
+            vr = get_vmport_regs_any(s, curr);
+
+            /*
+             * If there is no suitable backing DM, just ignore accesses.  If
+             * we do not have access to registers to pass to QEMU, just
+             * ignore access.
+             */
+            if ( !s || !vr )
+            {
+                hvm_complete_assist_req(&p);
                 rc = X86EMUL_OKAY;
+                vio->io_state = HVMIO_none;
+            }
+            else
+            {
+                struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+                p.data = regs->rax;
+                vr->ebx = regs->_ebx;
+                vr->ecx = regs->_ecx;
+                vr->edx = regs->_edx;
+                vr->esi = regs->_esi;
+                vr->edi = regs->_edi;
+
+                vio->io_state = HVMIO_handle_pio_awaiting_completion;
+                if ( !hvm_send_assist_req(s, &p) )
+                {
+                    rc = X86EMUL_RETRY;
+                    vio->io_state = HVMIO_none;
+                }
+                /* else leave rc as X86EMUL_UNHANDLEABLE for below. */
+            }
         }
         break;
-    }
     default:
         BUG();
     }
@@ -248,6 +291,13 @@ static int hvmemul_do_io(
     {
         if ( ram_page )
             put_page(ram_page);
+        /*
+         * If rc is still X86EMUL_UNHANDLEABLE, then this request is of
+         * type IOREQ_TYPE_VMWARE_PORT, so completion happens in
+         * hvm_io_assist() with no re-emulation required.
+         */
+        if ( rc == X86EMUL_UNHANDLEABLE )
+            rc = X86EMUL_OKAY;
         return rc;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2752197..7dd4fdb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -394,6 +394,47 @@ static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
     return &p->vcpu_ioreq[v->vcpu_id];
 }
 
+static vmware_regs_t *get_vmport_regs_one(struct hvm_ioreq_server *s,
+                                          struct vcpu *v)
+{
+    struct hvm_ioreq_vcpu *sv;
+
+    list_for_each_entry ( sv,
+                          &s->ioreq_vcpu_list,
+                          list_entry )
+    {
+        if ( sv->vcpu == v )
+        {
+            shared_vmport_iopage_t *p = s->vmport_ioreq.va;
+            if ( !p )
+                return NULL;
+            return &p->vcpu_vmport_regs[v->vcpu_id];
+        }
+    }
+    return NULL;
+}
+
+vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v)
+{
+    struct domain *d = v->domain;
+
+    ASSERT((v == current) || !vcpu_runnable(v));
+
+    if ( s )
+        return get_vmport_regs_one(s, v);
+
+    list_for_each_entry ( s,
+                          &d->arch.hvm_domain.ioreq_server.list,
+                          list_entry )
+    {
+        vmware_regs_t *ret = get_vmport_regs_one(s, v);
+
+        if ( ret )
+            return ret;
+    }
+    return NULL;
+}
+
 bool_t hvm_io_pending(struct vcpu *v)
 {
     struct domain *d = v->domain;
@@ -504,22 +545,56 @@ static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
         set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
+typedef enum {
+    IOREQ_PAGE_TYPE_IOREQ,
+    IOREQ_PAGE_TYPE_BUFIOREQ,
+    IOREQ_PAGE_TYPE_VMPORT,
+} ioreq_page_type_t;
+
+static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, ioreq_page_type_t buf)
 {
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct hvm_ioreq_page *iorp = NULL;
+
+    switch ( buf )
+    {
+    case IOREQ_PAGE_TYPE_IOREQ:
+        iorp = &s->ioreq;
+        break;
+    case IOREQ_PAGE_TYPE_BUFIOREQ:
+        iorp = &s->bufioreq;
+        break;
+    case IOREQ_PAGE_TYPE_VMPORT:
+        iorp = &s->vmport_ioreq;
+        break;
+    }
+    ASSERT(iorp);
 
     destroy_ring_for_helper(&iorp->va, iorp->page);
 }
 
 static int hvm_map_ioreq_page(
-    struct hvm_ioreq_server *s, bool_t buf, unsigned long gmfn)
+    struct hvm_ioreq_server *s, ioreq_page_type_t buf, unsigned long gmfn)
 {
     struct domain *d = s->domain;
-    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+    struct hvm_ioreq_page *iorp = NULL;
     struct page_info *page;
     void *va;
     int rc;
 
+    switch ( buf )
+    {
+    case IOREQ_PAGE_TYPE_IOREQ:
+        iorp = &s->ioreq;
+        break;
+    case IOREQ_PAGE_TYPE_BUFIOREQ:
+        iorp = &s->bufioreq;
+        break;
+    case IOREQ_PAGE_TYPE_VMPORT:
+        iorp = &s->vmport_ioreq;
+        break;
+    }
+    ASSERT(iorp);
+
     if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
         return rc;
 
@@ -736,19 +811,32 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
 
 static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
                                       unsigned long ioreq_pfn,
-                                      unsigned long bufioreq_pfn)
+                                      unsigned long bufioreq_pfn,
+                                      unsigned long vmport_ioreq_pfn)
 {
     int rc;
 
-    rc = hvm_map_ioreq_page(s, 0, ioreq_pfn);
+    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ, ioreq_pfn);
     if ( rc )
         return rc;
 
     if ( bufioreq_pfn != INVALID_GFN )
-        rc = hvm_map_ioreq_page(s, 1, bufioreq_pfn);
+        rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ, bufioreq_pfn);
 
     if ( rc )
-        hvm_unmap_ioreq_page(s, 0);
+    {
+        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
+        return rc;
+    }
+
+    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT, vmport_ioreq_pfn);
+
+    if ( rc )
+    {
+        if ( bufioreq_pfn != INVALID_GFN )
+            hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
+        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
+    }
 
     return rc;
 }
@@ -760,6 +848,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
     struct domain *d = s->domain;
     unsigned long ioreq_pfn = INVALID_GFN;
     unsigned long bufioreq_pfn = INVALID_GFN;
+    unsigned long vmport_ioreq_pfn =
+        d->arch.hvm_domain.params[HVM_PARAM_VMPORT_REGS_PFN];
     int rc;
 
     if ( is_default )
@@ -771,7 +861,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
         ASSERT(handle_bufioreq);
         return hvm_ioreq_server_map_pages(s,
                    d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN],
-                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]);
+                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN],
+                   vmport_ioreq_pfn);
     }
 
     rc = hvm_alloc_ioreq_gmfn(d, &ioreq_pfn);
@@ -780,8 +871,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
         rc = hvm_alloc_ioreq_gmfn(d, &bufioreq_pfn);
 
     if ( !rc )
-        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn);
-
+        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn,
+                                        vmport_ioreq_pfn);
     if ( rc )
     {
         hvm_free_ioreq_gmfn(d, ioreq_pfn);
@@ -796,11 +887,15 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
 {
     struct domain *d = s->domain;
     bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
+    bool_t handle_vmport_ioreq = ( s->vmport_ioreq.va != NULL );
+
+    if ( handle_vmport_ioreq )
+        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT);
 
     if ( handle_bufioreq )
-        hvm_unmap_ioreq_page(s, 1);
+        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
 
-    hvm_unmap_ioreq_page(s, 0);
+    hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
 
     if ( !is_default )
     {
@@ -835,12 +930,38 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
     for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
     {
         char *name;
+        char *type_name = NULL;
+        unsigned int limit;
 
-        rc = asprintf(&name, "ioreq_server %d %s", s->id,
-                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
-                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
-                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
-                      "");
+        switch ( i )
+        {
+        case HVMOP_IO_RANGE_PORT:
+            type_name = "port";
+            limit = MAX_NR_IO_RANGES;
+            break;
+        case HVMOP_IO_RANGE_MEMORY:
+            type_name = "memory";
+            limit = MAX_NR_IO_RANGES;
+            break;
+        case HVMOP_IO_RANGE_PCI:
+            type_name = "pci";
+            limit = MAX_NR_IO_RANGES;
+            break;
+        case HVMOP_IO_RANGE_VMWARE_PORT:
+            type_name = "VMware port";
+            limit = 1;
+            break;
+        case HVMOP_IO_RANGE_TIMEOFFSET:
+            type_name = "timeoffset";
+            limit = 1;
+            break;
+        default:
+            break;
+        }
+        if ( !type_name )
+            continue;
+
+        rc = asprintf(&name, "ioreq_server %d %s", s->id, type_name);
         if ( rc )
             goto fail;
 
@@ -853,7 +974,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
         if ( !s->range[i] )
             goto fail;
 
-        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
+        rangeset_limit(s->range[i], limit);
+
+        /* VMware port */
+        if ( i == HVMOP_IO_RANGE_VMWARE_PORT &&
+            s->domain->arch.hvm_domain.is_vmware_port_enabled )
+            rc = rangeset_add_range(s->range[i], 1, 1);
     }
 
  done:
@@ -1151,6 +1277,8 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
             case HVMOP_IO_RANGE_PORT:
             case HVMOP_IO_RANGE_MEMORY:
             case HVMOP_IO_RANGE_PCI:
+            case HVMOP_IO_RANGE_VMWARE_PORT:
+            case HVMOP_IO_RANGE_TIMEOFFSET:
                 r = s->range[type];
                 break;
 
@@ -1202,6 +1330,8 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
             case HVMOP_IO_RANGE_PORT:
             case HVMOP_IO_RANGE_MEMORY:
             case HVMOP_IO_RANGE_PCI:
+            case HVMOP_IO_RANGE_VMWARE_PORT:
+            case HVMOP_IO_RANGE_TIMEOFFSET:
                 r = s->range[type];
                 break;
 
@@ -2426,9 +2556,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
     if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
         return NULL;
 
-    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
-        return d->arch.hvm_domain.default_ioreq_server;
-
     cf8 = d->arch.hvm_domain.pci_cf8;
 
     if ( p->type == IOREQ_TYPE_PIO &&
@@ -2471,7 +2598,10 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
         BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
         BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
         BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
+        BUILD_BUG_ON(IOREQ_TYPE_VMWARE_PORT != HVMOP_IO_RANGE_VMWARE_PORT);
+        BUILD_BUG_ON(IOREQ_TYPE_TIMEOFFSET != HVMOP_IO_RANGE_TIMEOFFSET);
         r = s->range[type];
+        ASSERT(r);
 
         switch ( type )
         {
@@ -2498,6 +2628,13 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             }
 
             break;
+        case IOREQ_TYPE_VMWARE_PORT:
+        case IOREQ_TYPE_TIMEOFFSET:
+            /* The 'special' range of [1,1] is checked for being enabled */
+            if ( rangeset_contains_singleton(r, 1) )
+                return s;
+
+            break;
         }
     }
 
@@ -2657,6 +2794,7 @@ void hvm_complete_assist_req(ioreq_t *p)
     case IOREQ_TYPE_PCI_CONFIG:
         ASSERT_UNREACHABLE();
         break;
+    case IOREQ_TYPE_VMWARE_PORT:
     case IOREQ_TYPE_COPY:
     case IOREQ_TYPE_PIO:
         if ( p->dir == IOREQ_READ )
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 68fb890..7684cf0 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -192,6 +192,22 @@ void hvm_io_assist(ioreq_t *p)
         (void)handle_mmio();
         break;
     case HVMIO_handle_pio_awaiting_completion:
+        if ( p->type == IOREQ_TYPE_VMWARE_PORT )
+        {
+            vmware_regs_t *vr = get_vmport_regs_any(NULL, curr);
+
+            if ( vr )
+            {
+                struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+                /* Only change the 32bit part of the register */
+                regs->_ebx = vr->ebx;
+                regs->_ecx = vr->ecx;
+                regs->_edx = vr->edx;
+                regs->_esi = vr->esi;
+                regs->_edi = vr->edi;
+            }
+        }
         if ( vio->io_size == 4 ) /* Needs zero extension. */
             guest_cpu_user_regs()->rax = (uint32_t)p->data;
         else
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index b435689..599a688 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
     evtchn_port_t    ioreq_evtchn;
 };
 
-#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
+#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_VMWARE_PORT + 1)
 #define MAX_NR_IO_RANGES  256
 
 struct hvm_ioreq_server {
@@ -63,6 +63,7 @@ struct hvm_ioreq_server {
     ioservid_t             id;
     struct hvm_ioreq_page  ioreq;
     struct list_head       ioreq_vcpu_list;
+    struct hvm_ioreq_page  vmport_ioreq;
     struct hvm_ioreq_page  bufioreq;
 
     /* Lock to serialize access to buffered ioreq ring */
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c42f7d8..0c72ac8 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -524,6 +524,7 @@ extern bool_t opt_hvm_fep;
 
 void vmport_register(struct domain *d);
 int vmport_check_port(unsigned int port);
+vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v);
 
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index cde3571..2dcafc3 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -314,6 +314,9 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_get_ioreq_server_info_t);
  *
  * NOTE: unless an emulation request falls entirely within a range mapped
  * by a secondary emulator, it will not be passed to that emulator.
+ *
+ * NOTE: The 'special' range of [1,1] is what is checked for on
+ * TIMEOFFSET and VMWARE_PORT.
  */
 #define HVMOP_map_io_range_to_ioreq_server 19
 #define HVMOP_unmap_io_range_from_ioreq_server 20
@@ -324,6 +327,8 @@ struct xen_hvm_io_range {
 # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
 # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
 # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
+# define HVMOP_IO_RANGE_TIMEOFFSET 7 /* TIMEOFFSET special range */
+# define HVMOP_IO_RANGE_VMWARE_PORT 9 /* VMware port special range */
     uint64_aligned_t start, end; /* IN - inclusive start and end of range */
 };
 typedef struct xen_hvm_io_range xen_hvm_io_range_t;
diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
index 5b5fedf..2d9dcbe 100644
--- a/xen/include/public/hvm/ioreq.h
+++ b/xen/include/public/hvm/ioreq.h
@@ -37,6 +37,7 @@
 #define IOREQ_TYPE_PCI_CONFIG   2
 #define IOREQ_TYPE_TIMEOFFSET   7
 #define IOREQ_TYPE_INVALIDATE   8 /* mapcache */
+#define IOREQ_TYPE_VMWARE_PORT  9 /* pio + vmport registers */
 
 /*
  * VMExit dispatcher should cooperate with instruction decoder to
@@ -48,6 +49,8 @@
  * 
  * 63....48|47..40|39..35|34..32|31........0
  * SEGMENT |BUS   |DEV   |FN    |OFFSET
+ *
+ * For I/O type IOREQ_TYPE_VMWARE_PORT also use the vmware_regs.
  */
 struct ioreq {
     uint64_t addr;          /* physical address */
@@ -66,11 +69,25 @@ struct ioreq {
 };
 typedef struct ioreq ioreq_t;
 
+struct vmware_regs {
+    uint32_t esi;
+    uint32_t edi;
+    uint32_t ebx;
+    uint32_t ecx;
+    uint32_t edx;
+};
+typedef struct vmware_regs vmware_regs_t;
+
 struct shared_iopage {
     struct ioreq vcpu_ioreq[1];
 };
 typedef struct shared_iopage shared_iopage_t;
 
+struct shared_vmport_iopage {
+    struct vmware_regs vcpu_vmport_regs[1];
+};
+typedef struct shared_vmport_iopage shared_vmport_iopage_t;
+
 struct buf_ioreq {
     uint8_t  type;   /* I/O type                    */
     uint8_t  pad:1;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..130eba9 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -50,6 +50,8 @@
 #define HVM_PARAM_PAE_ENABLED  4
 
 #define HVM_PARAM_IOREQ_PFN    5
+/* Extra vmport PFN. */
+#define HVM_PARAM_VMPORT_REGS_PFN 35
 
 #define HVM_PARAM_BUFIOREQ_PFN 6
 #define HVM_PARAM_BUFIOREQ_EVTCHN 26
@@ -187,6 +189,6 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.8.4


* [PATCH v11 9/9] Add xentrace to vmware_port
  2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
                   ` (7 preceding siblings ...)
  2015-05-22 15:50 ` [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT Don Slutz
@ 2015-05-22 15:50 ` Don Slutz
  2015-06-04 11:20   ` George Dunlap
  8 siblings, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-05-22 15:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Also added missing TRAP_DEBUG & VLAPIC.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v11:
  No change

v10:
  Added Acked-by: Ian Campbell
  Added back in the trace point calls.

    Why is cmd in this patch?
      Because the trace points use it.

v9:
  Dropped unneeded VMPORT_UNHANDLED, VMPORT_DECODE.

v7:
      Dropped some of the new traces.
      Added HVMTRACE_ND7.

v6:
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.

v5:
      exitinfo1 is used twice.
        Fixed.

 tools/xentrace/formats           |  5 +++++
 xen/arch/x86/hvm/io.c            |  3 +++
 xen/arch/x86/hvm/vmware/vmport.c | 17 ++++++++++++++---
 xen/include/asm-x86/hvm/trace.h  | 22 ++++++++++++++++++++++
 xen/include/public/trace.h       |  3 +++
 5 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index 5d7b72a..eec65f4 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -79,6 +79,11 @@
 0x00082020  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  INTR_WINDOW [ value = 0x%(1)08x ]
 0x00082021  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  NPF         [ gpa = 0x%(2)08x%(1)08x mfn = 0x%(4)08x%(3)08x qual = 0x%(5)04x p2mt = 0x%(6)04x ]
 0x00082023  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP        [ vector = 0x%(1)02x ]
+0x00082024  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_DEBUG  [ exit_qualification = 0x%(1)08x ]
+0x00082025  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VLAPIC
+0x00082026  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_HANDLED   [ cmd = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
+0x00082027  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_IGNORED   [ port = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
+0x00082028  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_QEMU      [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
 
 0x0010f001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_map      [ domid = %(1)d ]
 0x0010f002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_unmap    [ domid = %(1)d ]
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 7684cf0..6a9cfb0 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -206,6 +206,9 @@ void hvm_io_assist(ioreq_t *p)
                 regs->_edx = vr->edx;
                 regs->_esi = vr->esi;
                 regs->_edi = vr->edi;
+                HVMTRACE_ND(VMPORT_QEMU, 0, 1/*cycles*/, 6,
+                            p->data, regs->_ebx, regs->_ecx,
+                            regs->_edx, regs->_esi, regs->_edi);
             }
         }
         if ( vio->io_size == 4 ) /* Needs zero extension. */
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
index 36e3f1b..3c3ccd4 100644
--- a/xen/arch/x86/hvm/vmware/vmport.c
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -16,6 +16,7 @@
 #include <xen/lib.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
+#include <asm/hvm/trace.h>
 
 #include "backdoor_def.h"
 
@@ -35,6 +36,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
     if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
     {
         uint32_t new_eax = ~0u;
+        uint16_t cmd = regs->_ecx;
         uint64_t value;
         struct vcpu *curr = current;
         struct domain *currd = curr->domain;
@@ -45,7 +47,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
          * leaving the high 32-bits unchanged, unlike what one would
          * expect to happen.
          */
-        switch ( regs->_ecx & 0xffff )
+        switch ( cmd )
         {
         case BDOOR_CMD_GETMHZ:
             new_eax = currd->arch.tsc_khz / 1000;
@@ -123,11 +125,20 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
             /* Let backing DM handle */
             return X86EMUL_UNHANDLEABLE;
         }
+        HVMTRACE_ND7(VMPORT_HANDLED, 0, 0/*cycles*/, 7,
+                     cmd, new_eax, regs->_ebx, regs->_ecx,
+                     regs->_edx, regs->_esi, regs->_edi);
         if ( dir == IOREQ_READ )
             *val = new_eax;
     }
-    else if ( dir == IOREQ_READ )
-        *val = ~0u;
+    else
+    {
+        HVMTRACE_ND7(VMPORT_IGNORED, 0, 0/*cycles*/, 7,
+                     port, regs->_eax, regs->_ebx, regs->_ecx,
+                     regs->_edx, regs->_esi, regs->_edi);
+        if ( dir == IOREQ_READ )
+            *val = ~0u;
+    }
 
     return X86EMUL_OKAY;
 }
diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
index de802a6..0ad805f 100644
--- a/xen/include/asm-x86/hvm/trace.h
+++ b/xen/include/asm-x86/hvm/trace.h
@@ -54,6 +54,9 @@
 #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
 #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
 #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
+#define DO_TRC_HVM_VMPORT_HANDLED   DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_IGNORED   DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_QEMU      DEFAULT_HVM_IO
 
 
 #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
@@ -83,6 +86,25 @@
         }                                                                 \
     } while(0)
 
+#define HVMTRACE_ND7(evt, modifier, cycles, count, d1, d2, d3, d4, d5, d6, d7) \
+    do {                                                                  \
+        if ( unlikely(tb_init_done) && DO_TRC_HVM_ ## evt )               \
+        {                                                                 \
+            struct {                                                      \
+                u32 d[7];                                                 \
+            } _d;                                                         \
+            _d.d[0]=(d1);                                                 \
+            _d.d[1]=(d2);                                                 \
+            _d.d[2]=(d3);                                                 \
+            _d.d[3]=(d4);                                                 \
+            _d.d[4]=(d5);                                                 \
+            _d.d[5]=(d6);                                                 \
+            _d.d[6]=(d7);                                                 \
+            __trace_var(TRC_HVM_ ## evt | (modifier), cycles,             \
+                        sizeof(*_d.d) * count, &_d);                      \
+        }                                                                 \
+    } while(0)
+
 #define HVMTRACE_6D(evt, d1, d2, d3, d4, d5, d6)    \
     HVMTRACE_ND(evt, 0, 0, 6, d1, d2, d3, d4, d5, d6)
 #define HVMTRACE_5D(evt, d1, d2, d3, d4, d5)        \
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index 5211ae7..16b87f9 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -227,6 +227,9 @@
 #define TRC_HVM_TRAP             (TRC_HVM_HANDLER + 0x23)
 #define TRC_HVM_TRAP_DEBUG       (TRC_HVM_HANDLER + 0x24)
 #define TRC_HVM_VLAPIC           (TRC_HVM_HANDLER + 0x25)
+#define TRC_HVM_VMPORT_HANDLED   (TRC_HVM_HANDLER + 0x26)
+#define TRC_HVM_VMPORT_IGNORED   (TRC_HVM_HANDLER + 0x27)
+#define TRC_HVM_VMPORT_QEMU      (TRC_HVM_HANDLER + 0x28)
 
 #define TRC_HVM_IOPORT_WRITE    (TRC_HVM_HANDLER + 0x216)
 #define TRC_HVM_IOMEM_WRITE     (TRC_HVM_HANDLER + 0x217)
-- 
1.8.4


* Re: [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-05-22 15:50 ` [PATCH v11 3/9] tools: Add vmware_hwver support Don Slutz
@ 2015-06-03 14:53   ` George Dunlap
  2015-06-04 15:15     ` Ian Campbell
  2015-06-04 15:17   ` Ian Campbell
  1 sibling, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 14:53 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 05/22/2015 04:50 PM, Don Slutz wrote:
> This is used to set xen_arch_domainconfig vmware_hwver. It is set to
> the emulated VMware virtual hardware version.
> 
> Currently 0, 3-4, 6-11 are good values.  However the code only
> checks for == 0, != 0, or < 7.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>

Ian,

It looks like you gave a pre-approved Ack to something almost identical
to v10.  The only thing in v10 not in your pre-approval specification
was removing the thing automatically setting vga=vmware.  Instead in v10
he added a patch to take 'vmware' as an explicit argument to vga.

Patch 2 is reviewed-by Andy Cooper (and can have an Ack from me if it's
wanted), and patches 1-3 constitute a sensible chunk of functionality --
if you could take a quick look over 1 and 3 and give them a thumbs-up, I
think it might make sense to go ahead and check those in.

(Still working through the rest of the patches.)

 -George

> ---
> v11:
>   Dropped "If non-zero then default VGA to VMware's VGA"
> 
> v10:
>     LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE &
>     LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER are arriving together
>     a single umbrella could be used.
>       Since I split the LIBXL_VGA_INTERFACE_TYPE_VMWARE into
>       its own patch, this is no longer true.
>       But I did use 1 for the 2 c_info changes.
>     Please use GCSPRINTF.
>   Remove vga=vmware from here.
> 
> v9:
>       I assumed that s/vmware_hw/vmware_hwver/ is not a big enough
>       change to drop the Reviewed-by.  Did a minor edit to the
>       commit message to add 7 to the list of values checked.
> 
> v7:
>     Default handling of hvm.vga.kind bad.
>       Fixed.
>     Default of vmware_port should be based on vmware_hw.
>       Done. 
> 
> v5:
>       Anything looking for Xen according to the Xen cpuid instructions...
>         Adjusted doc to new wording.
> 
>  docs/man/xl.cfg.pod.5       | 17 +++++++++++++++++
>  tools/libxc/xc_domain.c     |  2 +-
>  tools/libxl/libxl_create.c  |  4 +++-
>  tools/libxl/libxl_types.idl |  1 +
>  tools/libxl/libxl_x86.c     |  3 +--
>  tools/libxl/xl_cmdimpl.c    |  2 ++
>  6 files changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 84078f6..eaad4bf 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1348,6 +1348,23 @@ The viridian option can be specified as a boolean. A value of true (1)
>  is equivalent to the list [ "defaults" ], and a value of false (0) is
>  equivalent to an empty list.
>  
> +=item B<vmware_hwver=NUMBER>
> +
> +Turns on or off the exposure of VMware cpuid.  The number is
> +VMware's hardware version number, where 0 is off.  A number >= 7
> +is needed to enable exposure of VMware cpuid.
> +
> +The hardware version number (vmware_hwver) comes from VMware config files.
> +
> +=over 4
> +
> +In a .vmx it is virtualHW.version
> +
> +In a .ovf it is part of the value of vssd:VirtualSystemType.
> +For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
> +
> +=back
> +
>  =back
>  
>  =head3 Emulated VGA Graphics Device
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 38d065f..4362d5d 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -64,7 +64,7 @@ int xc_domain_create(xc_interface *xch,
>      memset(&config, 0, sizeof(config));
>  
>  #if defined (__i386) || defined(__x86_64__)
> -    /* No arch-specific configuration for now */
> +    /* No arch-specific default configuration for now */
>  #elif defined (__arm__) || defined(__aarch64__)
>      config.gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT;
>      config.nr_spis = 0;
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 86384d2..895577f 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -464,7 +464,7 @@ int libxl__domain_build(libxl__gc *gc,
>          vments[4] = "start_time";
>          vments[5] = libxl__sprintf(gc, "%lu.%02d", start_time.tv_sec,(int)start_time.tv_usec/10000);
>  
> -        localents = libxl__calloc(gc, 9, sizeof(char *));
> +        localents = libxl__calloc(gc, 11, sizeof(char *));
>          i = 0;
>          localents[i++] = "platform/acpi";
>          localents[i++] = libxl_defbool_val(info->u.hvm.acpi) ? "1" : "0";
> @@ -472,6 +472,8 @@ int libxl__domain_build(libxl__gc *gc,
>          localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s3) ? "1" : "0";
>          localents[i++] = "platform/acpi_s4";
>          localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s4) ? "1" : "0";
> +        localents[i++] = "platform/vmware_hwver";
> +        localents[i++] = GCSPRINTF("%"PRId64, d_config->c_info.vmware_hwver);
>          if (info->u.hvm.mmio_hole_memkb) {
>              uint64_t max_ram_below_4g =
>                  (1ULL << 32) - (info->u.hvm.mmio_hole_memkb << 10);
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 6cab732..c8a1345 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -343,6 +343,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
>      ("run_hotplug_scripts",libxl_defbool),
>      ("pvh",          libxl_defbool),
>      ("driver_domain",libxl_defbool),
> +    ("vmware_hwver", uint64),
>      ], dir=DIR_IN)
>  
>  libxl_domain_restore_params = Struct("domain_restore_params", [
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index 651b338..fd7dafa 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -5,8 +5,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>                                        libxl_domain_config *d_config,
>                                        xc_domain_configuration_t *xc_config)
>  {
> -    /* Note: will be changed in a later patch */
> -    xc_config->vmware_hwver = 0;
> +    xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
>      return 0;
>  }
>  
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 02f5c7a..e79a9d0 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1383,6 +1383,8 @@ static void parse_config_data(const char *config_source,
>      b_info->cmdline = parse_cmdline(config);
>  
>      xlu_cfg_get_defbool(config, "driver_domain", &c_info->driver_domain, 0);
> +    if (!xlu_cfg_get_long(config, "vmware_hwver",  &l, 1))
> +        c_info->vmware_hwver = l;
>  
>      switch(b_info->type) {
>      case LIBXL_DOMAIN_TYPE_HVM:
> 

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-05-22 15:50 ` [PATCH v11 6/9] xen: Add ring 3 " Don Slutz
@ 2015-06-03 15:26   ` George Dunlap
  2015-06-03 15:58     ` Andrew Cooper
  2015-06-23 16:14   ` Jan Beulich
  1 sibling, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 15:26 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 05/22/2015 04:50 PM, Don Slutz wrote:
> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
> to port 0x5658 specially.  Note: since many operations return data
> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
> "in (%dx),%al" will still do things, only AL part of EAX will be
> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
> unchanged.
> 
> This instruction is allowed to be used from ring 3.  To
> support this the vmexit for GP needs to be enabled.  I have not
> fully tested that nested HVM is doing the right thing for this.
> 
> Enable no-fault of pio in x86_emulate for VMware port
> 
> Also adjust the emulation registers after doing a VMware
> backdoor operation.
> 
> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
> handler.
> 
> Some of the best info is at:
> 
> https://sites.google.com/site/chitchatvmback/backdoor
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>
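
For reference, the width rule in the description above ("only AL part of
EAX will be changed") is ordinary x86 IN behaviour.  A minimal sketch in
C, assuming a 32-bit register view; merge_in_result() is a hypothetical
helper for illustration and is not part of the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: compute the new EAX
 * after an IN of the given width.  A 1-byte read replaces only AL, a
 * 2-byte read replaces only AX, and a 4-byte read replaces all of EAX.
 */
static uint32_t merge_in_result(uint32_t old_eax, uint32_t value,
                                unsigned int bytes)
{
    switch (bytes) {
    case 1:
        return (old_eax & 0xffffff00u) | (value & 0xffu);
    case 2:
        return (old_eax & 0xffff0000u) | (value & 0xffffu);
    default:
        return value;
    }
}
```

With EAX == 0xaabbccdd, a 1-byte read of 0x12345678 leaves 0xaabbcc78,
which is why the description recommends the 4-byte form when the full
result is needed.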

So let me get this straight.

VMWare allows ring3 to access the magic port regardless of whether the
guest OS has enabled access to that IO port or not.

In order to emulate this, we need to:
* Trap to Xen on #GPs rather than just letting the hardware handle it
* Emulate all instructions which cause a #GP, just to see if they might
be an IO instruction accessing the magic port.
* If it is an IO instruction, and it's accessing the magic port, then we
skip the ioport access checks (which will cause the instruction to
execute as though it had been given access).
* Under all other circumstances (we hope) the emulator in Xen will do
exactly what the hardware just did, and deliver a #GP to the guest.

In an attempt to make this more safe, emulation ops that write (such as
write and cmpxchg) are replaced with stubs which always return an error.

Is that about right?
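
The steps above can be modelled roughly as follows.  This is a
simplified sketch of the intended outcome, not the patch's code: struct
gp_event, classify_gp() and the enum are hypothetical names, and the
real patch hands the instruction to the emulator instead of taking a
pre-decoded flag.  Only BDOOR_PORT and its value 0x5658 come from the
series.

```c
#include <stdbool.h>
#include <stdint.h>

#define BDOOR_PORT 0x5658u  /* magic port named in the patch description */

enum gp_action { GP_REINJECT, GP_EMULATE_IO };

/* Hypothetical summary of one intercepted #GP (illustration only). */
struct gp_event {
    uint32_t error_code;       /* #GP error code reported by hardware */
    bool     is_io_insn;       /* faulting instruction decoded as IN/OUT */
    uint16_t port;             /* port operand of that instruction */
    bool     vmware_port_on;   /* domain has vmware_port enabled */
};

/* Decide whether the fault is a backdoor access or goes back to the guest. */
static enum gp_action classify_gp(const struct gp_event *ev)
{
    if (ev->error_code != 0)       /* patch only emulates when ecode == 0 */
        return GP_REINJECT;
    if (!ev->vmware_port_on)
        return GP_REINJECT;
    if (!ev->is_io_insn)
        return GP_REINJECT;
    if (ev->port != BDOOR_PORT)
        return GP_REINJECT;
    return GP_EMULATE_IO;          /* skip the ioport access check */
}
```

Everything except the last return is the "deliver a #GP to the guest"
case; the concern is how much emulator code runs before that decision
is reached.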

That sounds completely insane.  It opens up an almost infinite surface
of attack onto the Xen emulator.

I understand that having "VMWare compatible" is a nice tick-box to
have, but seriously, I cannot imagine that having unprivileged
user-space tools know the real clock frequency without having to involve
the OS is anywhere close to worth the risk involved.

 -George



> ---
> v11:
>   No change
> 
> v10:
>    Re-worked to be simpler.
> 
> v9:
>    Split #GP handling (or skipping of #GP) code out of previous
>    patch to help with the review process.
>    Switch to x86_emulator to handle #GP
>    I think the hvm_emulate_ops_gp() covers all needed ops.  Not able to validate
>    all paths through _hvm_emulate_one().
> 
>  xen/arch/x86/hvm/emulate.c             | 54 ++++++++++++++++++++++++++++++++--
>  xen/arch/x86/hvm/svm/svm.c             | 26 ++++++++++++++++
>  xen/arch/x86/hvm/svm/vmcb.c            |  2 ++
>  xen/arch/x86/hvm/vmware/vmport.c       | 11 +++++++
>  xen/arch/x86/hvm/vmx/vmcs.c            |  2 ++
>  xen/arch/x86/hvm/vmx/vmx.c             | 37 +++++++++++++++++++++++
>  xen/arch/x86/x86_emulate/x86_emulate.c | 13 +++++++-
>  xen/arch/x86/x86_emulate/x86_emulate.h |  5 ++++
>  xen/include/asm-x86/hvm/emulate.h      |  2 ++
>  xen/include/asm-x86/hvm/hvm.h          |  1 +
>  10 files changed, 150 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index ac9c9d6..d5e6468 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -803,6 +803,27 @@ static int hvmemul_wbinvd_discard(
>      return X86EMUL_OKAY;
>  }
>  
> +static int hvmemul_write_gp(
> +    unsigned int seg,
> +    unsigned long offset,
> +    void *p_data,
> +    unsigned int bytes,
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    return X86EMUL_EXCEPTION;
> +}
> +
> +static int hvmemul_cmpxchg_gp(
> +    unsigned int seg,
> +    unsigned long offset,
> +    void *old,
> +    void *new,
> +    unsigned int bytes,
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    return X86EMUL_EXCEPTION;
> +}
> +
>  static int hvmemul_cmpxchg(
>      enum x86_segment seg,
>      unsigned long offset,
> @@ -1356,6 +1377,13 @@ static int hvmemul_invlpg(
>      return rc;
>  }
>  
> +static int hvmemul_vmport_check(
> +    unsigned int first_port,
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    return vmport_check_port(first_port);
> +}
> +
>  static const struct x86_emulate_ops hvm_emulate_ops = {
>      .read          = hvmemul_read,
>      .insn_fetch    = hvmemul_insn_fetch,
> @@ -1379,7 +1407,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmport_check  = hvmemul_vmport_check,
>  };
>  
>  static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
> @@ -1405,7 +1434,22 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmport_check  = hvmemul_vmport_check,
> +};
> +
> +static const struct x86_emulate_ops hvm_emulate_ops_gp = {
> +    .read          = hvmemul_read,
> +    .insn_fetch    = hvmemul_insn_fetch,
> +    .write         = hvmemul_write_gp,
> +    .cmpxchg       = hvmemul_cmpxchg_gp,
> +    .read_segment  = hvmemul_read_segment,
> +    .write_segment = hvmemul_write_segment,
> +    .read_io       = hvmemul_read_io,
> +    .write_io      = hvmemul_write_io,
> +    .inject_hw_exception = hvmemul_inject_hw_exception,
> +    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
> +    .vmport_check  = hvmemul_vmport_check,
>  };
>  
>  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> @@ -1522,6 +1566,12 @@ int hvm_emulate_one(
>      return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops);
>  }
>  
> +int hvm_emulate_one_gp(
> +    struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{
> +    return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops_gp);
> +}
> +
>  int hvm_emulate_one_no_write(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 6734fb6..62baf3c 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2119,6 +2119,28 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
>      return;
>  }
>  
> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
> +                                    struct vcpu *v)
> +{
> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
> +    int rc;
> +
> +    if ( vmcb->exitinfo1 != 0 || vmcb->exitinfo2 != 0 )
> +        rc = X86EMUL_EXCEPTION;
> +    else
> +    {
> +        struct hvm_emulate_ctxt ctxt;
> +
> +        hvm_emulate_prepare(&ctxt, regs);
> +        rc = hvm_emulate_one_gp(&ctxt);
> +
> +        if ( rc == X86EMUL_OKAY )
> +            hvm_emulate_writeback(&ctxt);
> +    }
> +    if ( rc != X86EMUL_OKAY && rc != X86EMUL_RETRY )
> +        hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
> +}
> +
>  static void svm_vmexit_ud_intercept(struct cpu_user_regs *regs)
>  {
>      struct hvm_emulate_ctxt ctxt;
> @@ -2484,6 +2506,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>          break;
>      }
>  
> +    case VMEXIT_EXCEPTION_GP:
> +        svm_vmexit_gp_intercept(regs, v);
> +        break;
> +
>      case VMEXIT_EXCEPTION_UD:
>          svm_vmexit_ud_intercept(regs);
>          break;
> diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
> index 6339d2a..7683c09 100644
> --- a/xen/arch/x86/hvm/svm/vmcb.c
> +++ b/xen/arch/x86/hvm/svm/vmcb.c
> @@ -195,6 +195,8 @@ static int construct_vmcb(struct vcpu *v)
>          HVM_TRAP_MASK
>          | (1U << TRAP_no_device);
>  
> +    if ( v->domain->arch.hvm_domain.is_vmware_port_enabled )
> +        vmcb->_exception_intercepts |= 1U << TRAP_gp_fault;
>      if ( paging_mode_hap(v->domain) )
>      {
>          vmcb->_np_enable = 1; /* enable nested paging */
> diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
> index f24d8e3..36e3f1b 100644
> --- a/xen/arch/x86/hvm/vmware/vmport.c
> +++ b/xen/arch/x86/hvm/vmware/vmport.c
> @@ -137,6 +137,17 @@ void vmport_register(struct domain *d)
>      register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
>  }
>  
> +int vmport_check_port(unsigned int port)
> +{
> +    struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
> +
> +    if ( port == BDOOR_PORT && is_hvm_domain(currd) &&
> +         currd->arch.hvm_domain.is_vmware_port_enabled )
> +        return 0;
> +    return 1;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 877ec10..54360b0 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1151,6 +1151,8 @@ static int construct_vmcs(struct vcpu *v)
>  
>      v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
>                | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
> +              | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
> +                 (1U << TRAP_gp_fault) : 0)
>                | (1U << TRAP_no_device);
>      vmx_update_exception_bitmap(v);
>  
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 74f563f..fe88afe 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1312,6 +1312,8 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
>                  v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
>                            | (paging_mode_hap(v->domain) ?
>                               0 : (1U << TRAP_page_fault))
> +                          | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
> +                             (1U << TRAP_gp_fault) : 0)
>                            | (1U << TRAP_no_device);
>                  vmx_update_exception_bitmap(v);
>                  vmx_update_debug_state(v);
> @@ -2671,6 +2673,38 @@ static void vmx_idtv_reinject(unsigned long idtv_info)
>      }
>  }
>  
> +static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
> +                                    struct vcpu *v)
> +{
> +    unsigned long exit_qualification;
> +    unsigned long ecode;
> +    int rc;
> +    unsigned long vector;
> +
> +    __vmread(VM_EXIT_INTR_INFO, &vector);
> +    ASSERT(vector & INTR_INFO_VALID_MASK);
> +    ASSERT(vector & INTR_INFO_DELIVER_CODE_MASK);
> +
> +    __vmread(EXIT_QUALIFICATION, &exit_qualification);
> +    __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
> +
> +    if ( ecode != 0 || exit_qualification != 0 )
> +        rc = X86EMUL_EXCEPTION;
> +    else
> +    {
> +        struct hvm_emulate_ctxt ctxt;
> +
> +        hvm_emulate_prepare(&ctxt, regs);
> +        rc = hvm_emulate_one_gp(&ctxt);
> +
> +        if ( rc == X86EMUL_OKAY )
> +            hvm_emulate_writeback(&ctxt);
> +    }
> +
> +    if ( rc != X86EMUL_OKAY && rc != X86EMUL_RETRY )
> +        hvm_inject_hw_exception(TRAP_gp_fault, ecode);
> +}
> +
>  static int vmx_handle_apic_write(void)
>  {
>      unsigned long exit_qualification;
> @@ -2895,6 +2929,9 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              HVMTRACE_1D(TRAP, vector);
>              vmx_fpu_dirty_intercept();
>              break;
> +        case TRAP_gp_fault:
> +            vmx_vmexit_gp_intercept(regs, v);
> +            break;
>          case TRAP_page_fault:
>              __vmread(EXIT_QUALIFICATION, &exit_qualification);
>              __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index c017c69..d3ea143 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3393,8 +3393,11 @@ x86_emulate(
>          unsigned int port = ((b < 0xe8)
>                               ? insn_fetch_type(uint8_t)
>                               : (uint16_t)_regs.edx);
> +	bool_t vmport = (ops->vmport_check && /* Vmware backdoor? */
> +			 (ops->vmport_check(port, ctxt) == 0));
>          op_bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes;
> -        if ( (rc = ioport_access_check(port, op_bytes, ctxt, ops)) != 0 )
> +        if ( !vmport &&
> +	     (rc = ioport_access_check(port, op_bytes, ctxt, ops)) != 0 )
>              goto done;
>          if ( b & 2 )
>          {
> @@ -3413,6 +3416,14 @@ x86_emulate(
>          }
>          if ( rc != 0 )
>              goto done;
> +	if ( vmport )
> +	{
> +            _regs._ebx = ctxt->regs->_ebx;
> +            _regs._ecx = ctxt->regs->_ecx;
> +            _regs._edx = ctxt->regs->_edx;
> +            _regs._esi = ctxt->regs->_esi;
> +            _regs._edi = ctxt->regs->_edi;
> +	}
>          break;
>      }
>  
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 064b8f4..d914b5e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -397,6 +397,11 @@ struct x86_emulate_ops
>          enum x86_segment seg,
>          unsigned long offset,
>          struct x86_emulate_ctxt *ctxt);
> +
> +    /* vmport_check */
> +    int (*vmport_check)(
> +        unsigned int port,
> +        struct x86_emulate_ctxt *ctxt);
>  };
>  
>  struct cpu_user_regs;
> diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
> index b3971c8..4386169 100644
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -36,6 +36,8 @@ struct hvm_emulate_ctxt {
>  
>  int hvm_emulate_one(
>      struct hvm_emulate_ctxt *hvmemul_ctxt);
> +int hvm_emulate_one_gp(
> +    struct hvm_emulate_ctxt *hvmemul_ctxt);
>  int hvm_emulate_one_no_write(
>      struct hvm_emulate_ctxt *hvmemul_ctxt);
>  void hvm_mem_access_emulate_one(bool_t nowrite,
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index e76f612..c42f7d8 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -523,6 +523,7 @@ extern bool_t opt_hvm_fep;
>  #endif
>  
>  void vmport_register(struct domain *d);
> +int vmport_check_port(unsigned int port);
>  
>  #endif /* __ASM_X86_HVM_HVM_H__ */
>  
> 

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 15:26   ` George Dunlap
@ 2015-06-03 15:58     ` Andrew Cooper
  2015-06-03 16:23       ` George Dunlap
  2015-06-03 16:36       ` Don Slutz
  0 siblings, 2 replies; 48+ messages in thread
From: Andrew Cooper @ 2015-06-03 15:58 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 03/06/15 16:26, George Dunlap wrote:
> On 05/22/2015 04:50 PM, Don Slutz wrote:
>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>> to port 0x5658 specially.  Note: since many operations return data
>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>> "in (%dx),%al" will still do things, only AL part of EAX will be
>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>> unchanged.
>>
>> This instruction is allowed to be used from ring 3.  To
>> support this the vmexit for GP needs to be enabled.  I have not
>> fully tested that nested HVM is doing the right thing for this.
>>
>> Enable no-fault of pio in x86_emulate for VMware port
>>
>> Also adjust the emulation registers after doing a VMware
>> backdoor operation.
>>
>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>> handler.
>>
>> Some of the best info is at:
>>
>> https://sites.google.com/site/chitchatvmback/backdoor
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> So let me get this straight.
>
> VMWare allows ring3 to access the magic port regardless of whether the
> guest OS has enabled access to that IO port or not.
>
> In order to emulate this, we need to:
> * Trap to Xen on #GPs rather than just letting the hardware handle it
> * Emulate all instructions which cause a #GP, just to see if they might
> be an IO instruction accessing the magic port.
> * If it is an IO instruction, and it's accessing the magic port, then we
> skip the ioport access checks (which will cause the instruction to
> execute as though it had been given access).
> * Under all other circumstances (we hope) the emulator in Xen will do
> exactly what the hardware just did, and deliver a #GP to the guest.
>
> In an attempt to make this more safe, emulation ops that write (such as
> write and cmpxchg) are replaced with stubs which always return an error.
>
> Is that about right?
>
> That sounds completely insane.  It opens up an almost infinite surface
> of attack onto the Xen emulator.
>
> I understand that having the "VMWare compatible" is a nice tick-box to
> have, but seriously, I cannot imagine that having unprivileged
> user-space tools know the real clock frequency without having to involve
> the OS is anywhere close to worth the risk involved.

The attack surface sadly is not enlarged in the slightest by this change.

We already trap and emulate all #UD exceptions in an attempt to support
migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
is a good argument to be made for not trapping #UD, but that doesn't
completely close the hole)

~Andrew

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 15:58     ` Andrew Cooper
@ 2015-06-03 16:23       ` George Dunlap
  2015-06-03 16:40         ` Andrew Cooper
  2015-06-03 16:41         ` Don Slutz
  2015-06-03 16:36       ` Don Slutz
  1 sibling, 2 replies; 48+ messages in thread
From: George Dunlap @ 2015-06-03 16:23 UTC (permalink / raw)
  To: Andrew Cooper, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/2015 04:58 PM, Andrew Cooper wrote:
> On 03/06/15 16:26, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>> to port 0x5658 specially.  Note: since many operations return data
>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>> unchanged.
>>>
>>> This instruction is allowed to be used from ring 3.  To
>>> support this the vmexit for GP needs to be enabled.  I have not
>>> fully tested that nested HVM is doing the right thing for this.
>>>
>>> Enable no-fault of pio in x86_emulate for VMware port
>>>
>>> Also adjust the emulation registers after doing a VMware
>>> backdoor operation.
>>>
>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>> handler.
>>>
>>> Some of the best info is at:
>>>
>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> So let me get this straight.
>>
>> VMWare allows ring3 to access the magic port regardless of whether the
>> guest OS has enabled access to that IO port or not.
>>
>> In order to emulate this, we need to:
>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>> * Emulate all instructions which cause a #GP, just to see if they might
>> be an IO instruction accessing the magic port.
>> * If it is an IO instruction, and it's accessing the magic port, then we
>> skip the ioport access checks (which will cause the instruction to
>> execute as though it had been given access).
>> * Under all other circumstances (we hope) the emulator in Xen will do
>> exactly what the hardware just did, and deliver a #GP to the guest.
>>
>> In an attempt to make this more safe, emulation ops that write (such as
>> write and cmpxchg) are replaced with stubs which always return an error.
>>
>> Is that about right?
>>
>> That sounds completely insane.  It opens up an almost infinite surface
>> of attack onto the Xen emulator.
>>
>> I understand that having the "VMWare compatible" is a nice tick-box to
>> have, but seriously, I cannot imagine that having unprivileged
>> user-space tools know the real clock frequency without having to involve
>> the OS is anywhere close to worth the risk involved.
> 
> The attack surface sadly is not enlarged in the slightest by this change.
> 
> We already trap and emulate all #UD exceptions in an attempt to support
> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
> is a good argument to be made for not trapping #UD, but that doesn't
> completely close the hole)

So at the moment, an attacker on Intel can force the emulation of any
AMD-only instruction (and vice versa), is that right?

This would allow an attacker to force the emulation of every #GP
condition of every instruction we emulate.

Those two sets may be within an order of magnitude of each other, but
they will only overlap a little bit.  So my guess is that enabling this
would double the surface of attack (give or take).

I'd be a lot happier with this patch if we could make it so that on a
#GP the only instruction that could get emulated would be an IO instruction.
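
Something like the following pre-filter is what I have in mind.  It is
only a sketch under assumptions: insn_is_port_io() and
is_legacy_prefix() are hypothetical names, the caller is assumed to
have already fetched the instruction bytes, and only the IN/OUT opcodes
(0xe4-0xe7 and 0xec-0xef) are accepted; the string forms INS/OUTS are
deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy prefixes that may precede the opcode byte. */
static bool is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2e: case 0x36: case 0x3e: /* ES, CS, SS, DS */
    case 0x64: case 0x65:                       /* FS, GS */
    case 0x66: case 0x67:                       /* operand/address size */
    case 0xf0: case 0xf2: case 0xf3:            /* LOCK, REPNE, REP */
        return true;
    default:
        return false;
    }
}

/*
 * Hypothetical check: return true only if the bytes decode to IN or
 * OUT (immediate or DX port form).  Anything else would get its #GP
 * re-injected without going through the full emulator.
 */
static bool insn_is_port_io(const uint8_t *insn, size_t len, bool long_mode)
{
    size_t i = 0;
    uint8_t op;

    while (i < len && is_legacy_prefix(insn[i]))
        i++;
    if (long_mode && i < len && (insn[i] & 0xf0) == 0x40)
        i++;                                    /* single REX prefix */
    if (i >= len)
        return false;

    op = insn[i];
    return (op >= 0xe4 && op <= 0xe7) || (op >= 0xec && op <= 0xef);
}
```

The #GP handler would call this before hvm_emulate_one_gp() and bail
out on false, so the emulator only ever sees IN/OUT on this path.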

 -George

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 15:58     ` Andrew Cooper
  2015-06-03 16:23       ` George Dunlap
@ 2015-06-03 16:36       ` Don Slutz
  2015-06-03 16:50         ` George Dunlap
  1 sibling, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-06-03 16:36 UTC (permalink / raw)
  To: Andrew Cooper, George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/15 11:58, Andrew Cooper wrote:
> On 03/06/15 16:26, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>> to port 0x5658 specially.  Note: since many operations return data
>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>> unchanged.
>>>
>>> This instruction is allowed to be used from ring 3.  To
>>> support this the vmexit for GP needs to be enabled.  I have not
>>> fully tested that nested HVM is doing the right thing for this.
>>>
>>> Enable no-fault of pio in x86_emulate for VMware port
>>>
>>> Also adjust the emulation registers after doing a VMware
>>> backdoor operation.
>>>
>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>> handler.
>>>
>>> Some of the best info is at:
>>>
>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> So let me get this straight.
>>
>> VMWare allows ring3 to access the magic port regardless of whether the
>> guest OS has enabled access to that IO port or not.
>>
>> In order to emulate this, we need to:
>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>> * Emulate all instructions which cause a #GP, just to see if they might
>> be an IO instruction accessing the magic port.
>> * If it is an IO instruction, and it's accessing the magic port, then we
>> skip the ioport access checks (which will cause the instruction to
>> execute as though it had been given access).
>> * Under all other circumstances (we hope) the emulator in Xen will do
>> exactly what the hardware just did, and deliver a #GP to the guest.
>>
>> In an attempt to make this more safe, emulation ops that write (such as
>> write and cmpxchg) are replaced with stubs which always return an error.
>>
>> Is that about right?

Yes, however it is missing that Jan Beulich wanted the emulator in Xen
to be used.  I had started with code that did not use the emulator.

>> That sounds completely insane.  It opens up an almost infinite surface
>> of attack onto the Xen emulator.
>>
>> I understand that having the "VMWare compatible" is a nice tick-box to
>> have, but seriously, I cannot imagine that having unprivileged
>> user-space tools know the real clock frequency without having to involve
>> the OS is anywhere close to worth the risk involved.

Not sure how you moved from attack surface to "real clock frequency".
I am also not sure which of the many "clock frequencies" you are
referring to.  The only new one that leaps to mind is the emulated lapic
bus frequency (which Linux attempts to determine from other clocks).

> The attack surface sadly is not enlarged in the slightest by this change.
>
> We already trap and emulate all #UD exceptions in an attempt to support
> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
> is a good argument to be made for not trapping #UD, but that doesn't
> completely close the hole)

Ah, thanks, I did not know this fact.

   -Don Slutz

>
> ~Andrew

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:23       ` George Dunlap
@ 2015-06-03 16:40         ` Andrew Cooper
  2015-06-03 17:00           ` George Dunlap
  2015-06-03 16:41         ` Don Slutz
  1 sibling, 1 reply; 48+ messages in thread
From: Andrew Cooper @ 2015-06-03 16:40 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 03/06/15 17:23, George Dunlap wrote:
> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>> On 03/06/15 16:26, George Dunlap wrote:
>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>> to port 0x5658 specially.  Note: since many operations return data
>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>> unchanged.
>>>>
>>>> This instruction is allowed to be used from ring 3.  To
>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>> fully tested that nested HVM is doing the right thing for this.
>>>>
>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>
>>>> Also adjust the emulation registers after doing a VMware
>>>> backdoor operation.
>>>>
>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>> handler.
>>>>
>>>> Some of the best info is at:
>>>>
>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>
>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>> So let me get this straight.
>>>
>>> VMWare allows ring3 to access the magic port regardless of whether the
>>> guest OS has enabled access to that IO port or not.
>>>
>>> In order to emulate this, we need to:
>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>> * Emulate all instructions which cause a #GP, just to see if they might
>>> be an IO instruction accessing the magic port.
>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>> skip the ioport access checks (which will cause the instruction to
>>> execute as though it had been given access).
>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>
>>> In an attempt to make this more safe, emulation ops that write (such as
>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>
>>> Is that about right?
>>>
>>> That sounds completely insane.  It opens up an almost infinite surface
>>> of attack onto the Xen emulator.
>>>
>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>> have, but seriously, I cannot imagine that having unprivileged
>>> user-space tools know the real clock frequency without having to involve
>>> the OS is anywhere close to worth the risk involved.
>> The attack surface sadly is not enlarged in the slightest by this change.
>>
>> We already trap and emulate all #UD exceptions in an attempt to support
>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>> is a good argument to be made for not trapping #UD, but that doesn't
>> completely close the hole)
> So at the moment, an attacker on Intel can force the emulation of any
> AMD-only instruction (and vice versa), is that right?
>
> This would allow an attacker to force the emulation of every #GP
> condition of every instruction we emulate.
>
> Those two sets may be within an order of magnitude of each other, but
> they will only overlap a little bit.  So my guess is that enabling this
> would double the surface of attack (give or take).
>
> I'd be a lot happier with this patch if we could make it so that on a
> #GP the only instruction that could get emulated would be an IO instruction.

Any multi-threaded guest (userspace even) can force any arbitrary
instruction through the emulator.

If there is anything we currently mis-emulate #GP wise, it is already an
issue and enabling #GP intercepts doesn't make the hole any bigger.

~Andrew

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:23       ` George Dunlap
  2015-06-03 16:40         ` Andrew Cooper
@ 2015-06-03 16:41         ` Don Slutz
  2015-06-03 16:58           ` George Dunlap
  1 sibling, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-06-03 16:41 UTC (permalink / raw)
  To: George Dunlap, Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/15 12:23, George Dunlap wrote:
> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>> On 03/06/15 16:26, George Dunlap wrote:
>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>> to port 0x5658 specially.  Note: since many operations return data
>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>> unchanged.
>>>>
>>>> This instruction is allowed to be used from ring 3.  To
>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>> fully tested that nested HVM is doing the right thing for this.
>>>>
>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>
>>>> Also adjust the emulation registers after doing a VMware
>>>> backdoor operation.
>>>>
>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>> handler.
>>>>
>>>> Some of the best info is at:
>>>>
>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>
>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>> So let me get this straight.
>>>
>>> VMWare allows ring3 to access the magic port regardless of whether the
>>> guest OS has enabled access to that IO port or not.
>>>
>>> In order to emulate this, we need to:
>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>> * Emulate all instructions which cause a #GP, just to see if they might
>>> be an IO instruction accessing the magic port.
>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>> skip the ioport access checks (which will cause the instruction to
>>> execute as though it had been given access).
>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>
>>> In an attempt to make this more safe, emulation ops that write (such as
>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>
>>> Is that about right?
>>>
>>> That sounds completely insane.  It opens up an almost infinite surface
>>> of attack onto the Xen emulator.
>>>
>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>> have, but seriously, I cannot imagine that having unprivileged
>>> user-space tools know the real clock frequency without having to involve
>>> the OS is anywhere close to worth the risk involved.
>>
>> The attack surface sadly is not enlarged in the slightest by this change.
>>
>> We already trap and emulate all #UD exceptions in an attempt to support
>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>> is a good argument to be made for not trapping #UD, but that doesn't
>> completely close the hole)
> 
> So at the moment, an attacker on Intel can force the emulation of any
> AMD-only instruction (and vice versa), is that right?
> 
> This would allow an attacker to force the emulation of every #GP
> condition of every instruction we emulate.
> 
> Those two sets may be within an order of magnitude of each other, but
> they will only overlap a little bit.  So my guess is that enabling this
> would double the surface of attack (give or take).
> 
> I'd be a lot happier with this patch if we could make it so that on a
> #GP the only instruction that could get emulated would be an IO instruction.
> 

You mean like I said in:


Message-ID: <54C67D8302000078000598E4@mail.emea.novell.com>
X-Mailer: Novell GroupWise Internet Agent 14.0.1
Date: Mon, 26 Jan 2015 16:46:43 +0000
From: Jan Beulich <JBeulich@suse.com>
To: Don Slutz <dslutz@verizon.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	"George Dunlap" <George.Dunlap@eu.citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Eddie Dong <eddie.dong@intel.com>,
	"Jun Nakajima" <jun.nakajima@intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	<xen-devel@lists.xen.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Keir Fraser <keir@xen.org>,
	Tim Deegan <tim@xen.org>
Subject: Re: [PATCH for-4.5 v8 4/7] xen: Add vmware_port support
References: <1412285417-19180-1-git-send-email-dslutz@verizon.com>
 <1412285417-19180-5-git-send-email-dslutz@verizon.com>
 <542DCA92.1030701@terremark.com> <542DD44F.6030101@terremark.com>
 <54B8F1740200007800055B42@mail.emea.novell.com>
 <54BFE768.3090309@terremark.com>
 <54C0C39F0200007800057F73@mail.emea.novell.com> <54C6643B.1@terremark.com>
In-Reply-To: <54C6643B.1@terremark.com>

   -Don Slutz

>  -George
> 


* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:36       ` Don Slutz
@ 2015-06-03 16:50         ` George Dunlap
  2015-06-05  9:31           ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 16:50 UTC (permalink / raw)
  To: Don Slutz, Andrew Cooper, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/2015 05:36 PM, Don Slutz wrote:
> On 06/03/15 11:58, Andrew Cooper wrote:
>> On 03/06/15 16:26, George Dunlap wrote:
>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>> to port 0x5658 specially.  Note: since many operations return data
>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>> unchanged.
>>>>
>>>> This instruction is allowed to be used from ring 3.  To
>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>> fully tested that nested HVM is doing the right thing for this.
>>>>
>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>
>>>> Also adjust the emulation registers after doing a VMware
>>>> backdoor operation.
>>>>
>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>> handler.
>>>>
>>>> Some of the best info is at:
>>>>
>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>
>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>> So let me get this straight.
>>>
>>> VMWare allows ring3 to access the magic port regardless of whether the
>>> guest OS has enabled access to that IO port or not.
>>>
>>> In order to emulate this, we need to:
>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>> * Emulate all instructions which cause a #GP, just to see if they might
>>> be an IO instruction accessing the magic port.
>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>> skip the ioport access checks (which will cause the instruction to
>>> execute as though it had been given access).
>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>
>>> In an attempt to make this more safe, emulation ops that write (such as
>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>
>>> Is that about right?
> 
> Yes, however it is missing that Jan Beulich wanted the emulator in Xen
> to be used.  I had started with code that did not use the emulator.

I agree with him that the emulator should be used to emulate the
instructions we *want* to emulate.  I'm just not happy with using the
emulator to emulate all the instructions we *don't* want to emulate
(i.e., all the ones that really do need to #GP).
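
A minimal sketch of the kind of restriction meant here, assuming a
hypothetical pre-filter that runs before the emulator is invoked on a #GP.
The name gp_is_backdoor_pio() and its calling convention are illustrative
only and are not part of the patch or of Xen; string I/O and REP prefixes
are deliberately ignored.  Only the register forms of IN/OUT that the patch
description names, aimed at port 0x5658, are let through:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMWARE_BACKDOOR_PORT 0x5658u  /* port named in the patch description */

/*
 * Hypothetical pre-filter, not Xen code: return true only if the bytes at
 * the faulting instruction pointer decode as IN/OUT via %dx and %dx holds
 * the backdoor port.  Anything else would keep its hardware #GP.
 */
static bool gp_is_backdoor_pio(const uint8_t *insn, size_t len, uint16_t dx)
{
    size_t i = 0;

    /* Skip an optional operand-size prefix (e.g. "in (%dx),%ax"). */
    if ( i < len && insn[i] == 0x66 )
        i++;

    if ( i >= len )
        return false;

    /* 0xEC/0xED: in (%dx),%al/%eax.  0xEE/0xEF: out %al/%eax,(%dx). */
    if ( insn[i] < 0xEC || insn[i] > 0xEF )
        return false;

    return dx == VMWARE_BACKDOOR_PORT;
}
```

With a check of this shape, a #GP raised by any other instruction would be
re-injected into the guest untouched instead of being run through the
emulator.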

>>> That sounds completely insane.  It opens up an almost infinite surface
>>> of attack onto the Xen emulator.
>>>
>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>> have, but seriously, I cannot imagine that having unprivileged
>>> user-space tools know the real clock frequency without having to involve
>>> the OS is anywhere close to worth the risk involved.
> 
> Not sure how you moved from attack surface to "real clock frequency"
> (I am not sure which of the many "clock frequencies" you are
> referring to).  The only new one that leaps to mind is the emulated lapic
> bus frequency (which Linux attempts to determine from other clocks).

I'm talking about cost-benefit analysis.  What's the benefit of
accepting this patch, and is it worth the cost?

My argument here is that the cost of this change is opening up a massive
attack surface on the Xen emulation code.

The benefit of this change: Allowing guest processes access to the
VMWare backdoor without guest OS cooperation.  (Guest OSes can access
the backdoor without this patch.)

I hadn't gotten to the part of the series where Qemu was roped in to do
mouse and clipboard stuff; so at the time I wrote that, the only
functionality that it looked like was being made available to the guest
was reading the clock and a couple of other random bits.

But I see I have another e-mail from Andy with information of material
importance to this discussion.

 -George


* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:41         ` Don Slutz
@ 2015-06-03 16:58           ` George Dunlap
  2015-06-04 12:37             ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 16:58 UTC (permalink / raw)
  To: Don Slutz, Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/2015 05:41 PM, Don Slutz wrote:
> On 06/03/15 12:23, George Dunlap wrote:
>> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>>> On 03/06/15 16:26, George Dunlap wrote:
>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>> unchanged.
>>>>>
>>>>> This instruction is allowed to be used from ring 3.  To
>>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>>> fully tested that nested HVM is doing the right thing for this.
>>>>>
>>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>>
>>>>> Also adjust the emulation registers after doing a VMware
>>>>> backdoor operation.
>>>>>
>>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>>> handler.
>>>>>
>>>>> Some of the best info is at:
>>>>>
>>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>>
>>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>> So let me get this straight.
>>>>
>>>> VMWare allows ring3 to access the magic port regardless of whether the
>>>> guest OS has enabled access to that IO port or not.
>>>>
>>>> In order to emulate this, we need to:
>>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>>> * Emulate all instructions which cause a #GP, just to see if they might
>>>> be an IO instruction accessing the magic port.
>>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>>> skip the ioport access checks (which will cause the instruction to
>>>> execute as though it had been given access).
>>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>>
>>>> In an attempt to make this more safe, emulation ops that write (such as
>>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>>
>>>> Is that about right?
>>>>
>>>> That sounds completely insane.  It opens up an almost infinite surface
>>>> of attack onto the Xen emulator.
>>>>
>>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>>> have, but seriously, I cannot imagine that having unprivileged
>>>> user-space tools know the real clock frequency without having to involve
>>>> the OS is anywhere close to worth the risk involved.
>>>
>>> The attack surface sadly is not enlarged in the slightest by this change.
>>>
>>> We already trap and emulate all #UD exceptions in an attempt to support
>>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>>> is a good argument to be made for not trapping #UD, but that doesn't
>>> completely close the hole)
>>
>> So at the moment, an attacker on Intel can force the emulation of any
>> AMD-only instruction (and vice versa), is that right?
>>
>> This would allow an attacker to force the emulation of every #GP
>> condition of every instruction we emulate.
>>
>> Those two sets may be within an order of magnitude of each other, but
>> they will only overlap a little bit.  So my guess is that enabling this
>> would double the surface of attack (give or take).
>>
>> I'd be a lot happier with this patch if we could make it so that on a
>> #GP the only instruction that could get emulated would be an IO instruction.
>>
> 
> You mean like I said in:
> 
> 
> Message-ID: <54C67D8302000078000598E4@mail.emea.novell.com>

Yes, pretty much exactly.

I didn't notice that particular part of the discussion, but I did go
back and skim the comments that people had made on previous revisions,
and I certainly noticed that both Jan and Andy reviewed this patch, and
that neither one objected to the general idea.  So my "That sounds
insane" was as much directed at them as at you.

(As an aside, I think my description does a better job of alerting a
reviewer to what's going on in this patch -- you might consider stealing
part of it if you end up re-submitting this one.)

 -George


* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:40         ` Andrew Cooper
@ 2015-06-03 17:00           ` George Dunlap
  0 siblings, 0 replies; 48+ messages in thread
From: George Dunlap @ 2015-06-03 17:00 UTC (permalink / raw)
  To: Andrew Cooper, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/2015 05:40 PM, Andrew Cooper wrote:
> On 03/06/15 17:23, George Dunlap wrote:
>> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>>> On 03/06/15 16:26, George Dunlap wrote:
>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>> unchanged.
>>>>>
>>>>> This instruction is allowed to be used from ring 3.  To
>>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>>> fully tested that nested HVM is doing the right thing for this.
>>>>>
>>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>>
>>>>> Also adjust the emulation registers after doing a VMware
>>>>> backdoor operation.
>>>>>
>>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>>> handler.
>>>>>
>>>>> Some of the best info is at:
>>>>>
>>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>>
>>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>> So let me get this straight.
>>>>
>>>> VMWare allows ring3 to access the magic port regardless of whether the
>>>> guest OS has enabled access to that IO port or not.
>>>>
>>>> In order to emulate this, we need to:
>>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>>> * Emulate all instructions which cause a #GP, just to see if they might
>>>> be an IO instruction accessing the magic port.
>>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>>> skip the ioport access checks (which will cause the instruction to
>>>> execute as though it had been given access).
>>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>>
>>>> In an attempt to make this more safe, emulation ops that write (such as
>>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>>
>>>> Is that about right?
>>>>
>>>> That sounds completely insane.  It opens up an almost infinite surface
>>>> of attack onto the Xen emulator.
>>>>
>>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>>> have, but seriously, I cannot imagine that having unprivileged
>>>> user-space tools know the real clock frequency without having to involve
>>>> the OS is anywhere close to worth the risk involved.
>>> The attack surface sadly is not enlarged in the slightest by this change.
>>>
>>> We already trap and emulate all #UD exceptions in an attempt to support
>>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>>> is a good argument to be made for not trapping #UD, but that doesn't
>>> completely close the hole)
>> So at the moment, an attacker on Intel can force the emulation of any
>> AMD-only instruction (and vice versa), is that right?
>>
>> This would allow an attacker to force the emulation of every #GP
>> condition of every instruction we emulate.
>>
>> Those two sets may be within an order of magnitude of each other, but
>> they will only overlap a little bit.  So my guess is that enabling this
>> would double the surface of attack (give or take).
>>
>> I'd be a lot happier with this patch if we could make it so that on a
>> #GP the only instruction that could get emulated would be an IO instruction.
> 
> Any multi-threaded guest (userspace even) can force any arbitrary
> instruction through the emulator.

*grumble grumble*

> If there is anything we currently mis-emulate #GP wise, it is already a
> issue and enabling #GP intercepts doesn't make the hole any bigger.

I still really don't like it.  It's not just flipping a bit and then
doing what we did before.

You want to take bets as to whether this will be involved in an XSA
sometime in the next 5 years? :-)

 -George


* Re: [PATCH v11 7/9] tools: Add vmware_port support
  2015-05-22 15:50 ` [PATCH v11 7/9] tools: Add " Don Slutz
@ 2015-06-03 17:06   ` George Dunlap
  2015-06-04 15:49     ` Ian Campbell
  2015-06-04 15:20   ` Ian Campbell
  1 sibling, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 17:06 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 05/22/2015 04:50 PM, Don Slutz wrote:
> This new libxl_domain_create_info field is used to set
> XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK in the xc_domain_configuration_t
> for x86.
> 
> In xen it is is_vmware_port_enabled.
> 
> If is_vmware_port_enabled then
>   enable a limited support of VMware's hyper-call.
> 
> VMware's hyper-call is also known as VMware Backdoor I/O Port.
> 
> if vmware_port is not specified in the config file, let
> "vmware_hwver != 0" be the default value.  This means that only
> vmware_hwver = 7 needs to be specified to enable both features.
> 
> vmware_hwver = 7 is special because that is what controls the
> enable of CPUID leaves for VMware (vmware_hwver >= 7).
> 
> Note: vmware_port and nestedhvm cannot be specified at the
> same time.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>

Ian:

So I *think* it may be the case that this patch only depends on patch 5
to apply.  I also think that patches 5 and 7 together add another useful
"chunk" of functionality (core vmport functionality for guest OSes).
Patch 5 already has Andy's Reviewed-by (and can have my Ack as well),
and it seemed like previous versions of this patch were close to being
acceptable to you.

So if you wanted to give this a once-over, we could probably apply these
two without much trouble as well.

Then I think this series would also be off your plate. :-)

 -George


> ---
> v11:
>   Dropped "If non-zero then default VGA to VMware's VGA"
> 
> v10:
>     If..." at the start of the sentence ...
>     Also, why is 7 special?
> 
> 
>  docs/man/xl.cfg.pod.5       | 15 +++++++++++++++
>  tools/libxl/libxl.h         |  5 +++++
>  tools/libxl/libxl_create.c  |  9 +++++++++
>  tools/libxl/libxl_types.idl |  1 +
>  tools/libxl/libxl_x86.c     |  2 ++
>  tools/libxl/xl_cmdimpl.c    |  1 +
>  6 files changed, 33 insertions(+)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index eaad4bf..00aa78f 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1354,6 +1354,8 @@ Turns on or off the exposure of VMware cpuid.  The number is
>  VMware's hardware version number, where 0 is off.  A number >= 7
>  is needed to enable exposure of VMware cpuid.
>  
> +If not zero it changes the default for vmware_port to on.
> +
>  The hardware version number (vmware_hwver) come from VMware config files.
>  
>  =over 4
> @@ -1365,6 +1367,19 @@ For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
>  
>  =back
>  
> +=item B<vmware_port=BOOLEAN>
> +
> +Turns on or off the exposure of VMware port.  This is known as
> +vmport in QEMU.  Also called VMware Backdoor I/O Port.  Not all
> +defined VMware backdoor commands are implemented.  All of the
> +ones that Linux kernel uses are defined.
> +
> +Defaults to enabled if vmware_hwver is non-zero (i.e. enabled)
> +otherwise defaults to disabled.
> +
> +Note: vmware_port and nestedhvm cannot be specified at the
> +same time.
> +
>  =back
>  
>  =head3 Emulated VGA Graphics Device
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 86164a7..fcce7c3 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -205,6 +205,11 @@
>  #define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
>  
>  /*
> + * libxl_domain_create_info has the vmware_hwver and vmware_port field.
> + */
> +#define LIBXL_HAVE_CREATEINFO_VMWARE 1
> +
> +/*
>   * libxl ABI compatibility
>   *
>   * The only guarantee which libxl makes regarding ABI compatibility
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 895577f..ac05ecc 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -41,6 +41,7 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
>          libxl_defbool_setdefault(&c_info->hap, libxl_defbool_val(c_info->pvh));
>      }
>  
> +    libxl_defbool_setdefault(&c_info->vmware_port, c_info->vmware_hwver != 0);
>      libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true);
>      libxl_defbool_setdefault(&c_info->driver_domain, false);
>  
> @@ -917,6 +918,14 @@ static void initiate_domain_create(libxl__egc *egc,
>      ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
>      if (ret) goto error_out;
>  
> +    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
> +        libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
> +        libxl_defbool_val(d_config->c_info.vmware_port)) {
> +        LOG(ERROR,
> +            "vmware_port and nestedhvm cannot be enabled simultaneously\n");
> +        ret = ERROR_INVAL;
> +        goto error_out;
> +    }
>      if (!sched_params_valid(gc, domid, &d_config->b_info.sched_params)) {
>          LOG(ERROR, "Invalid scheduling parameters\n");
>          ret = ERROR_INVAL;
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index c8a1345..c7af74b 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -344,6 +344,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
>      ("pvh",          libxl_defbool),
>      ("driver_domain",libxl_defbool),
>      ("vmware_hwver", uint64),
> +    ("vmware_port",  libxl_defbool),
>      ], dir=DIR_IN)
>  
>  libxl_domain_restore_params = Struct("domain_restore_params", [
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index fd7dafa..404904a 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -6,6 +6,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>                                        xc_domain_configuration_t *xc_config)
>  {
>      xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
> +    if (libxl_defbool_val(d_config->c_info.vmware_port))
> +        xc_config->arch_flags |= XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK;
>      return 0;
>  }
>  
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index e79a9d0..b3fe0cd 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1230,6 +1230,7 @@ static void parse_config_data(const char *config_source,
>      }
>  
>      xlu_cfg_get_defbool(config, "oos", &c_info->oos, 0);
> +    xlu_cfg_get_defbool(config, "vmware_port", &c_info->vmware_port, 0);
>  
>      if (!xlu_cfg_get_string (config, "pool", &buf, 0))
>          xlu_cfg_replace_string(config, "pool", &c_info->pool_name, 0);
> 
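
The defaulting rule in the quoted patch (vmware_port follows
"vmware_hwver != 0" unless set explicitly, and is rejected together with
nestedhvm) can be sketched in isolation as follows.  This is a standalone
illustration under stated assumptions: defbool_t, struct create_info and
resolve_vmware_port() are hypothetical stand-ins, not the libxl types or
functions, and the HVM domain-type check from the patch is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tri-state standing in for libxl_defbool. */
typedef enum { DB_UNSET = 0, DB_FALSE = -1, DB_TRUE = 1 } defbool_t;

/* Hypothetical subset of the create-info fields named in the patch. */
struct create_info {
    uint64_t  vmware_hwver;
    defbool_t vmware_port;
};

/* Fill in a value only if the config file did not specify one. */
static void defbool_setdefault(defbool_t *db, bool val)
{
    if ( *db == DB_UNSET )
        *db = val ? DB_TRUE : DB_FALSE;
}

/*
 * Returns 0 on success, -1 if vmware_port and nestedhvm end up enabled
 * together (the combination the patch rejects).
 */
static int resolve_vmware_port(struct create_info *c, bool nested_hvm)
{
    defbool_setdefault(&c->vmware_port, c->vmware_hwver != 0);

    if ( nested_hvm && c->vmware_port == DB_TRUE )
        return -1;

    return 0;
}
```

One consequence worth noting: a config with vmware_hwver = 7 and nestedhvm
enabled is rejected unless vmware_port is explicitly turned off, because the
default is applied before the conflict check.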


* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-05-22 15:50 ` [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT Don Slutz
@ 2015-06-03 17:09   ` George Dunlap
  2015-06-04 11:28     ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-03 17:09 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 05/22/2015 04:50 PM, Don Slutz wrote:
> This adds synchronization of the 6 vcpu registers (only 32bits of
> them) that vmport.c needs between Xen and QEMU.
> 
> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
> fetch and put these 6 vcpu registers used by the code in vmport.c
> and vmmouse.c
> 
> In the tools, enable usage of QEMU's vmport code.
> 
> The currently most useful VMware port support that QEMU has is the
> VMware mouse support.  Xorg includes VMware mouse support that
> uses absolute mode.  This makes using a mouse in X11 much nicer.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Sorry for coming a bit late to this party.  On a high level I think this
is good, but there doesn't seem to be anything in here in particular
that is vmware-specific.  Would it make more sense to give this a more
generic name, and have it include all of the general-purpose registers?
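
To make the question concrete, here is a rough sketch of the hand-off as I
read it from the emulate.c hunk: five registers are truncated to 32 bits
and copied into a shared area, while RAX travels in the request's data
field.  The struct and function names below (guest_regs64, shared_regs32,
pack_regs32) are made up for illustration; the patch's own vmware_regs_t
is defined elsewhere in the series.

```c
#include <stdint.h>

/* Hypothetical 64-bit guest register file; only the fields used here. */
struct guest_regs64 {
    uint64_t rax, rbx, rcx, rdx, rsi, rdi;
};

/* Hypothetical shared layout: the low 32 bits of five registers. */
struct shared_regs32 {
    uint32_t ebx, ecx, edx, esi, edi;
};

/*
 * Copy the low halves of the five registers into the shared area and
 * return RAX, which the quoted hunk places in the request's data field.
 */
static uint64_t pack_regs32(const struct guest_regs64 *regs,
                            struct shared_regs32 *out)
{
    out->ebx = (uint32_t)regs->rbx;
    out->ecx = (uint32_t)regs->rcx;
    out->edx = (uint32_t)regs->rdx;
    out->esi = (uint32_t)regs->rsi;
    out->edi = (uint32_t)regs->rdi;

    return regs->rax;
}
```

Nothing in that shape depends on the port number or on VMware, which is
why a generic variant carrying all of the general-purpose registers looks
feasible to me.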

 -George

> ---
> v11:
>   No change
> 
> v10:
>   These literals should become an enum.
>     I don't think the invalidate type is needed.
>     Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
>     is unclear.
>     Comment about "special' range of 1" is not clear.
> 
> 
> v9:
>   New code was presented as an RFC before this.
> 
>   Paul Durrant suggested I add support for other IOREQ types
>   to HVMOP_map_io_range_to_ioreq_server.
>     I have done this.
> 
>  tools/libxc/xc_hvm_build_x86.c   |   5 +-
>  tools/libxl/libxl_dm.c           |   2 +
>  xen/arch/x86/hvm/emulate.c       |  78 ++++++++++++++---
>  xen/arch/x86/hvm/hvm.c           | 182 ++++++++++++++++++++++++++++++++++-----
>  xen/arch/x86/hvm/io.c            |  16 ++++
>  xen/include/asm-x86/hvm/domain.h |   3 +-
>  xen/include/asm-x86/hvm/hvm.h    |   1 +
>  xen/include/public/hvm/hvm_op.h  |   5 ++
>  xen/include/public/hvm/ioreq.h   |  17 ++++
>  xen/include/public/hvm/params.h  |   4 +-
>  10 files changed, 274 insertions(+), 39 deletions(-)
> 
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index e45ae4a..ffe52eb 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -46,7 +46,8 @@
>  #define SPECIALPAGE_IOREQ    5
>  #define SPECIALPAGE_IDENT_PT 6
>  #define SPECIALPAGE_CONSOLE  7
> -#define NR_SPECIAL_PAGES     8
> +#define SPECIALPAGE_VMPORT_REGS 8
> +#define NR_SPECIAL_PAGES     9
>  #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
>  
>  #define NR_IOREQ_SERVER_PAGES 8
> @@ -569,6 +570,8 @@ static int setup_guest(xc_interface *xch,
>                       special_pfn(SPECIALPAGE_BUFIOREQ));
>      xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_PFN,
>                       special_pfn(SPECIALPAGE_IOREQ));
> +    xc_hvm_param_set(xch, dom, HVM_PARAM_VMPORT_REGS_PFN,
> +                     special_pfn(SPECIALPAGE_VMPORT_REGS));
>      xc_hvm_param_set(xch, dom, HVM_PARAM_CONSOLE_PFN,
>                       special_pfn(SPECIALPAGE_CONSOLE));
>      xc_hvm_param_set(xch, dom, HVM_PARAM_PAGING_RING_PFN,
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index ce08461..b68c651 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -814,6 +814,8 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
>                                              machinearg, max_ram_below_4g);
>              }
>          }
> +        if (libxl_defbool_val(c_info->vmware_port))
> +            machinearg = GCSPRINTF("%s,vmport=on", machinearg);
>          flexarray_append(dm_args, machinearg);
>          for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
>              flexarray_append(dm_args, b_info->extra_hvm[i]);
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index d5e6468..0a42d18 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -219,27 +219,70 @@ static int hvmemul_do_io(
>              vio->io_state = HVMIO_handle_mmio_awaiting_completion;
>          break;
>      case X86EMUL_UNHANDLEABLE:
> -    {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> -
> -        /* If there is no suitable backing DM, just ignore accesses */
> -        if ( !s )
> +        if ( vmport_check_port(p.addr) )
>          {
> -            hvm_complete_assist_req(&p);
> -            rc = X86EMUL_OKAY;
> -            vio->io_state = HVMIO_none;
> +            struct hvm_ioreq_server *s =
> +                hvm_select_ioreq_server(curr->domain, &p);
> +
> +            /* If there is no suitable backing DM, just ignore accesses */
> +            if ( !s )
> +            {
> +                hvm_complete_assist_req(&p);
> +                rc = X86EMUL_OKAY;
> +                vio->io_state = HVMIO_none;
> +            }
> +            else
> +            {
> +                rc = X86EMUL_RETRY;
> +                if ( !hvm_send_assist_req(s, &p) )
> +                    vio->io_state = HVMIO_none;
> +                else if ( p_data == NULL )
> +                    rc = X86EMUL_OKAY;
> +            }
>          }
>          else
>          {
> -            rc = X86EMUL_RETRY;
> -            if ( !hvm_send_assist_req(s, &p) )
> -                vio->io_state = HVMIO_none;
> -            else if ( p_data == NULL )
> +            struct hvm_ioreq_server *s;
> +            vmware_regs_t *vr;
> +
> +            BUILD_BUG_ON(sizeof(ioreq_t) < sizeof(vmware_regs_t));
> +
> +            p.type = IOREQ_TYPE_VMWARE_PORT;
> +            s = hvm_select_ioreq_server(curr->domain, &p);
> +            vr = get_vmport_regs_any(s, curr);
> +
> +            /*
> +             * If there is no suitable backing DM, just ignore accesses.  If
> +             * we do not have access to registers to pass to QEMU, just
> +             * ignore access.
> +             */
> +            if ( !s || !vr )
> +            {
> +                hvm_complete_assist_req(&p);
>                  rc = X86EMUL_OKAY;
> +                vio->io_state = HVMIO_none;
> +            }
> +            else
> +            {
> +                struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +                p.data = regs->rax;
> +                vr->ebx = regs->_ebx;
> +                vr->ecx = regs->_ecx;
> +                vr->edx = regs->_edx;
> +                vr->esi = regs->_esi;
> +                vr->edi = regs->_edi;
> +
> +                vio->io_state = HVMIO_handle_pio_awaiting_completion;
> +                if ( !hvm_send_assist_req(s, &p) )
> +                {
> +                    rc = X86EMUL_RETRY;
> +                    vio->io_state = HVMIO_none;
> +                }
> +                /* else leave rc as X86EMUL_UNHANDLEABLE for below. */
> +            }
>          }
>          break;
> -    }
>      default:
>          BUG();
>      }
> @@ -248,6 +291,13 @@ static int hvmemul_do_io(
>      {
>          if ( ram_page )
>              put_page(ram_page);
> +        /*
> +         * If rc is still X86EMUL_UNHANDLEABLE, then were are of
> +         * type IOREQ_TYPE_VMWARE_PORT, so completion in
> +         * hvm_io_assist() with no re-emulation required
> +         */
> +        if ( rc == X86EMUL_UNHANDLEABLE )
> +            rc = X86EMUL_OKAY;
>          return rc;
>      }
>  
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 2752197..7dd4fdb 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -394,6 +394,47 @@ static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
>      return &p->vcpu_ioreq[v->vcpu_id];
>  }
>  
> +static vmware_regs_t *get_vmport_regs_one(struct hvm_ioreq_server *s,
> +                                          struct vcpu *v)
> +{
> +    struct hvm_ioreq_vcpu *sv;
> +
> +    list_for_each_entry ( sv,
> +                          &s->ioreq_vcpu_list,
> +                          list_entry )
> +    {
> +        if ( sv->vcpu == v )
> +        {
> +            shared_vmport_iopage_t *p = s->vmport_ioreq.va;
> +            if ( !p )
> +                return NULL;
> +            return &p->vcpu_vmport_regs[v->vcpu_id];
> +        }
> +    }
> +    return NULL;
> +}
> +
> +vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +
> +    ASSERT((v == current) || !vcpu_runnable(v));
> +
> +    if ( s )
> +        return get_vmport_regs_one(s, v);
> +
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        vmware_regs_t *ret = get_vmport_regs_one(s, v);
> +
> +        if ( ret )
> +            return ret;
> +    }
> +    return NULL;
> +}
> +
>  bool_t hvm_io_pending(struct vcpu *v)
>  {
>      struct domain *d = v->domain;
> @@ -504,22 +545,56 @@ static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
>          set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
>  }
>  
> -static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
> +typedef enum {
> +    IOREQ_PAGE_TYPE_IOREQ,
> +    IOREQ_PAGE_TYPE_BUFIOREQ,
> +    IOREQ_PAGE_TYPE_VMPORT,
> +} ioreq_page_type_t;
> +
> +static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, ioreq_page_type_t buf)
>  {
> -    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> +    struct hvm_ioreq_page *iorp = NULL;
> +
> +    switch ( buf )
> +    {
> +    case IOREQ_PAGE_TYPE_IOREQ:
> +        iorp = &s->ioreq;
> +        break;
> +    case IOREQ_PAGE_TYPE_BUFIOREQ:
> +        iorp = &s->bufioreq;
> +        break;
> +    case IOREQ_PAGE_TYPE_VMPORT:
> +        iorp = &s->vmport_ioreq;
> +        break;
> +    }
> +    ASSERT(iorp);
>  
>      destroy_ring_for_helper(&iorp->va, iorp->page);
>  }
>  
>  static int hvm_map_ioreq_page(
> -    struct hvm_ioreq_server *s, bool_t buf, unsigned long gmfn)
> +    struct hvm_ioreq_server *s, ioreq_page_type_t buf, unsigned long gmfn)
>  {
>      struct domain *d = s->domain;
> -    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> +    struct hvm_ioreq_page *iorp = NULL;
>      struct page_info *page;
>      void *va;
>      int rc;
>  
> +    switch ( buf )
> +    {
> +    case IOREQ_PAGE_TYPE_IOREQ:
> +        iorp = &s->ioreq;
> +        break;
> +    case IOREQ_PAGE_TYPE_BUFIOREQ:
> +        iorp = &s->bufioreq;
> +        break;
> +    case IOREQ_PAGE_TYPE_VMPORT:
> +        iorp = &s->vmport_ioreq;
> +        break;
> +    }
> +    ASSERT(iorp);
> +
>      if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
>          return rc;
>  
> @@ -736,19 +811,32 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
>  
>  static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
>                                        unsigned long ioreq_pfn,
> -                                      unsigned long bufioreq_pfn)
> +                                      unsigned long bufioreq_pfn,
> +                                      unsigned long vmport_ioreq_pfn)
>  {
>      int rc;
>  
> -    rc = hvm_map_ioreq_page(s, 0, ioreq_pfn);
> +    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ, ioreq_pfn);
>      if ( rc )
>          return rc;
>  
>      if ( bufioreq_pfn != INVALID_GFN )
> -        rc = hvm_map_ioreq_page(s, 1, bufioreq_pfn);
> +        rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ, bufioreq_pfn);
>  
>      if ( rc )
> -        hvm_unmap_ioreq_page(s, 0);
> +    {
> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
> +        return rc;
> +    }
> +
> +    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT, vmport_ioreq_pfn);
> +
> +    if ( rc )
> +    {
> +        if ( bufioreq_pfn != INVALID_GFN )
> +            hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
> +    }
>  
>      return rc;
>  }
> @@ -760,6 +848,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>      struct domain *d = s->domain;
>      unsigned long ioreq_pfn = INVALID_GFN;
>      unsigned long bufioreq_pfn = INVALID_GFN;
> +    unsigned long vmport_ioreq_pfn =
> +        d->arch.hvm_domain.params[HVM_PARAM_VMPORT_REGS_PFN];
>      int rc;
>  
>      if ( is_default )
> @@ -771,7 +861,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>          ASSERT(handle_bufioreq);
>          return hvm_ioreq_server_map_pages(s,
>                     d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN],
> -                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]);
> +                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN],
> +                   vmport_ioreq_pfn);
>      }
>  
>      rc = hvm_alloc_ioreq_gmfn(d, &ioreq_pfn);
> @@ -780,8 +871,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>          rc = hvm_alloc_ioreq_gmfn(d, &bufioreq_pfn);
>  
>      if ( !rc )
> -        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn);
> -
> +        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn,
> +                                        vmport_ioreq_pfn);
>      if ( rc )
>      {
>          hvm_free_ioreq_gmfn(d, ioreq_pfn);
> @@ -796,11 +887,15 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
>  {
>      struct domain *d = s->domain;
>      bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
> +    bool_t handle_vmport_ioreq = ( s->vmport_ioreq.va != NULL );
> +
> +    if ( handle_vmport_ioreq )
> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT);
>  
>      if ( handle_bufioreq )
> -        hvm_unmap_ioreq_page(s, 1);
> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
>  
> -    hvm_unmap_ioreq_page(s, 0);
> +    hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
>  
>      if ( !is_default )
>      {
> @@ -835,12 +930,38 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>      for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
>      {
>          char *name;
> +        char *type_name = NULL;
> +        unsigned int limit;
>  
> -        rc = asprintf(&name, "ioreq_server %d %s", s->id,
> -                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
> -                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
> -                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> -                      "");
> +        switch ( i )
> +        {
> +        case HVMOP_IO_RANGE_PORT:
> +            type_name = "port";
> +            limit = MAX_NR_IO_RANGES;
> +            break;
> +        case HVMOP_IO_RANGE_MEMORY:
> +            type_name = "memory";
> +            limit = MAX_NR_IO_RANGES;
> +            break;
> +        case HVMOP_IO_RANGE_PCI:
> +            type_name = "pci";
> +            limit = MAX_NR_IO_RANGES;
> +            break;
> +        case HVMOP_IO_RANGE_VMWARE_PORT:
> +            type_name = "VMware port";
> +            limit = 1;
> +            break;
> +        case HVMOP_IO_RANGE_TIMEOFFSET:
> +            type_name = "timeoffset";
> +            limit = 1;
> +            break;
> +        default:
> +            break;
> +        }
> +        if ( !type_name )
> +            continue;
> +
> +        rc = asprintf(&name, "ioreq_server %d %s", s->id, type_name);
>          if ( rc )
>              goto fail;
>  
> @@ -853,7 +974,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>          if ( !s->range[i] )
>              goto fail;
>  
> -        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
> +        rangeset_limit(s->range[i], limit);
> +
> +        /* VMware port */
> +        if ( i == HVMOP_IO_RANGE_VMWARE_PORT &&
> +            s->domain->arch.hvm_domain.is_vmware_port_enabled )
> +            rc = rangeset_add_range(s->range[i], 1, 1);
>      }
>  
>   done:
> @@ -1151,6 +1277,8 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>              case HVMOP_IO_RANGE_PORT:
>              case HVMOP_IO_RANGE_MEMORY:
>              case HVMOP_IO_RANGE_PCI:
> +            case HVMOP_IO_RANGE_VMWARE_PORT:
> +            case HVMOP_IO_RANGE_TIMEOFFSET:
>                  r = s->range[type];
>                  break;
>  
> @@ -1202,6 +1330,8 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>              case HVMOP_IO_RANGE_PORT:
>              case HVMOP_IO_RANGE_MEMORY:
>              case HVMOP_IO_RANGE_PCI:
> +            case HVMOP_IO_RANGE_VMWARE_PORT:
> +            case HVMOP_IO_RANGE_TIMEOFFSET:
>                  r = s->range[type];
>                  break;
>  
> @@ -2426,9 +2556,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>      if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>          return NULL;
>  
> -    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
> -        return d->arch.hvm_domain.default_ioreq_server;
> -
>      cf8 = d->arch.hvm_domain.pci_cf8;
>  
>      if ( p->type == IOREQ_TYPE_PIO &&
> @@ -2471,7 +2598,10 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>          BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>          BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>          BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
> +        BUILD_BUG_ON(IOREQ_TYPE_VMWARE_PORT != HVMOP_IO_RANGE_VMWARE_PORT);
> +        BUILD_BUG_ON(IOREQ_TYPE_TIMEOFFSET != HVMOP_IO_RANGE_TIMEOFFSET);
>          r = s->range[type];
> +        ASSERT(r);
>  
>          switch ( type )
>          {
> @@ -2498,6 +2628,13 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>              }
>  
>              break;
> +        case IOREQ_TYPE_VMWARE_PORT:
> +        case IOREQ_TYPE_TIMEOFFSET:
> +            /* The 'special' range of [1,1] is checked for being enabled */
> +            if ( rangeset_contains_singleton(r, 1) )
> +                return s;
> +
> +            break;
>          }
>      }
>  
> @@ -2657,6 +2794,7 @@ void hvm_complete_assist_req(ioreq_t *p)
>      case IOREQ_TYPE_PCI_CONFIG:
>          ASSERT_UNREACHABLE();
>          break;
> +    case IOREQ_TYPE_VMWARE_PORT:
>      case IOREQ_TYPE_COPY:
>      case IOREQ_TYPE_PIO:
>          if ( p->dir == IOREQ_READ )
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 68fb890..7684cf0 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -192,6 +192,22 @@ void hvm_io_assist(ioreq_t *p)
>          (void)handle_mmio();
>          break;
>      case HVMIO_handle_pio_awaiting_completion:
> +        if ( p->type == IOREQ_TYPE_VMWARE_PORT )
> +        {
> +            vmware_regs_t *vr = get_vmport_regs_any(NULL, curr);
> +
> +            if ( vr )
> +            {
> +                struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +                /* Only change the 32bit part of the register */
> +                regs->_ebx = vr->ebx;
> +                regs->_ecx = vr->ecx;
> +                regs->_edx = vr->edx;
> +                regs->_esi = vr->esi;
> +                regs->_edi = vr->edi;
> +            }
> +        }
>          if ( vio->io_size == 4 ) /* Needs zero extension. */
>              guest_cpu_user_regs()->rax = (uint32_t)p->data;
>          else
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index b435689..599a688 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
>      evtchn_port_t    ioreq_evtchn;
>  };
>  
> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_VMWARE_PORT + 1)
>  #define MAX_NR_IO_RANGES  256
>  
>  struct hvm_ioreq_server {
> @@ -63,6 +63,7 @@ struct hvm_ioreq_server {
>      ioservid_t             id;
>      struct hvm_ioreq_page  ioreq;
>      struct list_head       ioreq_vcpu_list;
> +    struct hvm_ioreq_page  vmport_ioreq;
>      struct hvm_ioreq_page  bufioreq;
>  
>      /* Lock to serialize access to buffered ioreq ring */
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index c42f7d8..0c72ac8 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -524,6 +524,7 @@ extern bool_t opt_hvm_fep;
>  
>  void vmport_register(struct domain *d);
>  int vmport_check_port(unsigned int port);
> +vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v);
>  
>  #endif /* __ASM_X86_HVM_HVM_H__ */
>  
> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
> index cde3571..2dcafc3 100644
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -314,6 +314,9 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_get_ioreq_server_info_t);
>   *
>   * NOTE: unless an emulation request falls entirely within a range mapped
>   * by a secondary emulator, it will not be passed to that emulator.
> + *
> + * NOTE: The 'special' range of [1,1] is what is checked for on
> + * TIMEOFFSET and VMWARE_PORT.
>   */
>  #define HVMOP_map_io_range_to_ioreq_server 19
>  #define HVMOP_unmap_io_range_from_ioreq_server 20
> @@ -324,6 +327,8 @@ struct xen_hvm_io_range {
>  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
>  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
> +# define HVMOP_IO_RANGE_TIMEOFFSET 7 /* TIMEOFFSET special range */
> +# define HVMOP_IO_RANGE_VMWARE_PORT 9 /* VMware port special range */
>      uint64_aligned_t start, end; /* IN - inclusive start and end of range */
>  };
>  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
> index 5b5fedf..2d9dcbe 100644
> --- a/xen/include/public/hvm/ioreq.h
> +++ b/xen/include/public/hvm/ioreq.h
> @@ -37,6 +37,7 @@
>  #define IOREQ_TYPE_PCI_CONFIG   2
>  #define IOREQ_TYPE_TIMEOFFSET   7
>  #define IOREQ_TYPE_INVALIDATE   8 /* mapcache */
> +#define IOREQ_TYPE_VMWARE_PORT  9 /* pio + vmport registers */
>  
>  /*
>   * VMExit dispatcher should cooperate with instruction decoder to
> @@ -48,6 +49,8 @@
>   * 
>   * 63....48|47..40|39..35|34..32|31........0
>   * SEGMENT |BUS   |DEV   |FN    |OFFSET
> + *
> + * For I/O type IOREQ_TYPE_VMWARE_PORT also use the vmware_regs.
>   */
>  struct ioreq {
>      uint64_t addr;          /* physical address */
> @@ -66,11 +69,25 @@ struct ioreq {
>  };
>  typedef struct ioreq ioreq_t;
>  
> +struct vmware_regs {
> +    uint32_t esi;
> +    uint32_t edi;
> +    uint32_t ebx;
> +    uint32_t ecx;
> +    uint32_t edx;
> +};
> +typedef struct vmware_regs vmware_regs_t;
> +
>  struct shared_iopage {
>      struct ioreq vcpu_ioreq[1];
>  };
>  typedef struct shared_iopage shared_iopage_t;
>  
> +struct shared_vmport_iopage {
> +    struct vmware_regs vcpu_vmport_regs[1];
> +};
> +typedef struct shared_vmport_iopage shared_vmport_iopage_t;
> +
>  struct buf_ioreq {
>      uint8_t  type;   /* I/O type                    */
>      uint8_t  pad:1;
> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
> index 7c73089..130eba9 100644
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -50,6 +50,8 @@
>  #define HVM_PARAM_PAE_ENABLED  4
>  
>  #define HVM_PARAM_IOREQ_PFN    5
> +/* Extra vmport PFN. */
> +#define HVM_PARAM_VMPORT_REGS_PFN 35
>  
>  #define HVM_PARAM_BUFIOREQ_PFN 6
>  #define HVM_PARAM_BUFIOREQ_EVTCHN 26
> @@ -187,6 +189,6 @@
>  /* Location of the VM Generation ID in guest physical address space. */
>  #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
>  
> -#define HVM_NR_PARAMS          35
> +#define HVM_NR_PARAMS          36
>  
>  #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
> 


* Re: [PATCH v11 9/9] Add xentrace to vmware_port
  2015-05-22 15:50 ` [PATCH v11 9/9] Add xentrace to vmware_port Don Slutz
@ 2015-06-04 11:20   ` George Dunlap
  2015-06-04 12:31     ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-04 11:20 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 05/22/2015 04:50 PM, Don Slutz wrote:
> Also added missing TRAP_DEBUG & VLAPIC.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> ---
> v11:
>   No change
> 
> v10:
>   Added Acked-by: Ian Campbell
>   Added back in the trace point calls.
> 
>     Why is cmd in this patch?
>       Because the trace points use it.
> 
> v9:
>   Dropped unneeded VMPORT_UNHANDLED, VMPORT_DECODE.
> 
> v7:
>       Dropped some of the new traces.
>       Added HVMTRACE_ND7.
> 
> v6:
>       Dropped the attempt to use svm_nextrip_insn_length via
>       __get_instruction_length (added in v2).  Just always look
>       at up to 15 bytes on AMD.
> 
> v5:
>       exitinfo1 is used twice.
>         Fixed.
> 
>  tools/xentrace/formats           |  5 +++++
>  xen/arch/x86/hvm/io.c            |  3 +++
>  xen/arch/x86/hvm/vmware/vmport.c | 17 ++++++++++++++---
>  xen/include/asm-x86/hvm/trace.h  | 22 ++++++++++++++++++++++
>  xen/include/public/trace.h       |  3 +++
>  5 files changed, 47 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/xentrace/formats b/tools/xentrace/formats
> index 5d7b72a..eec65f4 100644
> --- a/tools/xentrace/formats
> +++ b/tools/xentrace/formats
> @@ -79,6 +79,11 @@
>  0x00082020  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  INTR_WINDOW [ value = 0x%(1)08x ]
>  0x00082021  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  NPF         [ gpa = 0x%(2)08x%(1)08x mfn = 0x%(4)08x%(3)08x qual = 0x%(5)04x p2mt = 0x%(6)04x ]
>  0x00082023  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP        [ vector = 0x%(1)02x ]
> +0x00082024  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_DEBUG  [ exit_qualification = 0x%(1)08x ]
> +0x00082025  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VLAPIC
> +0x00082026  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_HANDLED   [ cmd = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
> +0x00082027  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_IGNORED   [ port = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
> +0x00082028  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_QEMU      [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
>  
>  0x0010f001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_map      [ domid = %(1)d ]
>  0x0010f002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_unmap    [ domid = %(1)d ]
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 7684cf0..6a9cfb0 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -206,6 +206,9 @@ void hvm_io_assist(ioreq_t *p)
>                  regs->_edx = vr->edx;
>                  regs->_esi = vr->esi;
>                  regs->_edi = vr->edi;
> +                HVMTRACE_ND(VMPORT_QEMU, 0, 1/*cycles*/, 6,
> +                            p->data, regs->_ebx, regs->_ecx,
> +                            regs->_edx, regs->_esi, regs->_edi);
>              }
>          }
>          if ( vio->io_size == 4 ) /* Needs zero extension. */
> diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
> index 36e3f1b..3c3ccd4 100644
> --- a/xen/arch/x86/hvm/vmware/vmport.c
> +++ b/xen/arch/x86/hvm/vmware/vmport.c
> @@ -16,6 +16,7 @@
>  #include <xen/lib.h>
>  #include <asm/hvm/hvm.h>
>  #include <asm/hvm/support.h>
> +#include <asm/hvm/trace.h>
>  
>  #include "backdoor_def.h"
>  
> @@ -35,6 +36,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>      if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
>      {
>          uint32_t new_eax = ~0u;
> +        uint16_t cmd = regs->_ecx;
>          uint64_t value;
>          struct vcpu *curr = current;
>          struct domain *currd = curr->domain;
> @@ -45,7 +47,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>           * leaving the high 32-bits unchanged, unlike what one would
>           * expect to happen.
>           */
> -        switch ( regs->_ecx & 0xffff )
> +        switch ( cmd )
>          {
>          case BDOOR_CMD_GETMHZ:
>              new_eax = currd->arch.tsc_khz / 1000;
> @@ -123,11 +125,20 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>              /* Let backing DM handle */
>              return X86EMUL_UNHANDLEABLE;
>          }
> +        HVMTRACE_ND7(VMPORT_HANDLED, 0, 0/*cycles*/, 7,
> +                     cmd, new_eax, regs->_ebx, regs->_ecx,
> +                     regs->_edx, regs->_esi, regs->_edi);

Do you need to log edi as well? It looks like it's not used.

>          if ( dir == IOREQ_READ )
>              *val = new_eax;
>      }
> -    else if ( dir == IOREQ_READ )
> -        *val = ~0u;
> +    else
> +    {
> +        HVMTRACE_ND7(VMPORT_IGNORED, 0, 0/*cycles*/, 7,
> +                     port, regs->_eax, regs->_ebx, regs->_ecx,
> +                     regs->_edx, regs->_esi, regs->_edi);

And do you need to log all the registers here?  It seems like port +
regs->_ecx would be enough to tell you why it got ignored.

> +        if ( dir == IOREQ_READ )
> +            *val = ~0u;
> +    }
>  
>      return X86EMUL_OKAY;
>  }
> diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
> index de802a6..0ad805f 100644
> --- a/xen/include/asm-x86/hvm/trace.h
> +++ b/xen/include/asm-x86/hvm/trace.h
> @@ -54,6 +54,9 @@
>  #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
>  #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
>  #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_VMPORT_HANDLED   DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_IGNORED   DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_QEMU      DEFAULT_HVM_IO
>  
>  
>  #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
> @@ -83,6 +86,25 @@
>          }                                                                 \
>      } while(0)
>  
> +#define HVMTRACE_ND7(evt, modifier, cycles, count, d1, d2, d3, d4, d5, d6, d7) \
> +    do {                                                                  \
> +        if ( unlikely(tb_init_done) && DO_TRC_HVM_ ## evt )               \
> +        {                                                                 \
> +            struct {                                                      \
> +                u32 d[7];                                                 \
> +            } _d;                                                         \
> +            _d.d[0]=(d1);                                                 \
> +            _d.d[1]=(d2);                                                 \
> +            _d.d[2]=(d3);                                                 \
> +            _d.d[3]=(d4);                                                 \
> +            _d.d[4]=(d5);                                                 \
> +            _d.d[5]=(d6);                                                 \
> +            _d.d[6]=(d7);                                                 \
> +            __trace_var(TRC_HVM_ ## evt | (modifier), cycles,             \
> +                        sizeof(*_d.d) * count, &_d);                      \
> +        }                                                                 \
> +    } while(0)

If you reduced the registers as mentioned above, you wouldn't need this
here either.
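
A minimal sketch of the reduction suggested above, written as a standalone
model rather than against the real Xen trace macros. It assumes the existing
helper carries at most six 32-bit values (the HVMTRACE_ND call in io.c passes
six) and records only `count` of them. All names ending in _model are
hypothetical; only the event number is taken from the formats hunk. With
VMPORT_IGNORED cut down to port and ecx, and VMPORT_HANDLED losing edi, both
records would fit in six slots.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Event number as listed for VMPORT_IGNORED in the xentrace formats hunk. */
#define MODEL_TRC_VMPORT_IGNORED 0x00082027u

/* Hypothetical stand-in for one trace record with six payload slots. */
struct trace_rec_model {
    uint32_t event;
    uint32_t bytes;   /* payload bytes actually recorded */
    uint32_t d[6];
};

/* Model of a six-slot trace helper: records the first `count` words. */
static void trace_nd_model(struct trace_rec_model *rec, uint32_t event,
                           unsigned int count, const uint32_t d[6])
{
    assert(count <= 6);
    memset(rec, 0, sizeof(*rec));
    rec->event = event;
    rec->bytes = (uint32_t)(sizeof(d[0]) * count);
    memcpy(rec->d, d, rec->bytes);
}

/* Reduced VMPORT_IGNORED record: port and ecx only, as suggested above. */
static void trace_vmport_ignored_model(struct trace_rec_model *rec,
                                       uint32_t port, uint32_t ecx)
{
    const uint32_t d[6] = { port, ecx, 0, 0, 0, 0 };

    trace_nd_model(rec, MODEL_TRC_VMPORT_IGNORED, 2, d);
}
```

In this model the reduced record carries 8 bytes of payload instead of the 28
that seven registers would need.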

 -George


* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-06-03 17:09   ` George Dunlap
@ 2015-06-04 11:28     ` Don Slutz
  2015-06-05  9:35       ` Jan Beulich
  2015-06-08 10:05       ` George Dunlap
  0 siblings, 2 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 11:28 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/03/15 13:09, George Dunlap wrote:
> On 05/22/2015 04:50 PM, Don Slutz wrote:
>> This adds synchronization of the 6 vcpu registers (only 32bits of
>> them) that vmport.c needs between Xen and QEMU.
>>
>> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
>> fetch and put these 6 vcpu registers used by the code in vmport.c
>> and vmmouse.c
>>
>> In the tools, enable usage of QEMU's vmport code.
>>
>> The currently most useful VMware port support that QEMU has is the
>> VMware mouse support.  Xorg included a VMware mouse support that
>> uses absolute mode.  This makes using a mouse in X11 much nicer.
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> Sorry for coming a bit late to this party.  On a high level I think this
> is good, but there doesn't seem to be anything in here in particular
> that is vmware-specific.  Would it make more sense to give this a more
> generic name, and have it include all of the general-purpose registers?

I do not know of a more general case.  The code here is very VMware "in
(%dx),%eax" specific.  The x86 architecture does not have an in/out case
where registers other than rax get used and/or changed that need to be
sent to QEMU.  There already is code to handle ins better than 1 byte at
a time.
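
To make the register exchange concrete, here is a small standalone model of
what the patch does around the "in": the low 32 bits of ebx, ecx, edx, esi
and edi are exported for QEMU, and on completion only the low 32 bits are
written back, leaving the high halves unchanged.  The struct and function
names are hypothetical stand-ins, not the Xen types, and rax is left out
because it travels in the ioreq data field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the five 32-bit registers shared with QEMU. */
struct vmport_regs_model {
    uint32_t esi, edi, ebx, ecx, edx;
};

/* Hypothetical guest register file, limited to the registers of interest. */
struct guest_regs_model {
    uint64_t rbx, rcx, rdx, rsi, rdi;
};

/* Replace the low 32 bits of a 64-bit register, keeping the high half. */
static inline uint64_t merge_low32(uint64_t reg, uint32_t low)
{
    return (reg & 0xffffffff00000000ULL) | low;
}

/* Xen -> QEMU direction: export the 32-bit halves before sending. */
static void regs_export(const struct guest_regs_model *g,
                        struct vmport_regs_model *v)
{
    assert(g && v);
    v->ebx = (uint32_t)g->rbx;
    v->ecx = (uint32_t)g->rcx;
    v->edx = (uint32_t)g->rdx;
    v->esi = (uint32_t)g->rsi;
    v->edi = (uint32_t)g->rdi;
}

/* QEMU -> Xen direction: only the 32-bit part of each register changes. */
static void regs_import(struct guest_regs_model *g,
                        const struct vmport_regs_model *v)
{
    assert(g && v);
    g->rbx = merge_low32(g->rbx, v->ebx);
    g->rcx = merge_low32(g->rcx, v->ecx);
    g->rdx = merge_low32(g->rdx, v->edx);
    g->rsi = merge_low32(g->rsi, v->esi);
    g->rdi = merge_low32(g->rdi, v->edi);
}
```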

There is also a data size issue.  The register data sent over is smaller
than the ioreq data.  Therefore the number of vCPUs that are supported
is the same.  Changing the amount of data sent would affect this
(like requiring more than 1 page).
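
A rough sketch of that arithmetic, under stated assumptions: a 4 KiB page, a
per-vCPU ioreq slot of 32 bytes (the ioreq_model struct below is a stand-in
with that size, not a copy of the public header), and a hypothetical
"all general-purpose registers" layout of 16 x 64 bits.  Only the five
32-bit fields of the vmware_regs model come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumption: 4 KiB x86 page */

/* Same fields as struct vmware_regs in the patch: 5 x 32 bits. */
struct vmware_regs_model {
    uint32_t esi, edi, ebx, ecx, edx;
};

/* Stand-in sized like one per-vCPU ioreq slot (assumed 32 bytes). */
struct ioreq_model {
    uint64_t addr;
    uint64_t data;
    uint32_t count;
    uint32_t size;
    uint32_t vp_eport;
    uint16_t pad;
    uint8_t  flags;
    uint8_t  type;
};

/* Hypothetical generic alternative: all 16 GPRs at full width. */
struct all_gprs_model {
    uint64_t gpr[16];
};

/* Number of per-vCPU entries of a given size that fit in one page. */
static inline unsigned int slots_per_page(size_t entry_size)
{
    assert(entry_size > 0);
    return (unsigned int)(MODEL_PAGE_SIZE / entry_size);
}
```

Under these assumptions one page holds 128 ioreq slots and 204 register
slots, so the register page never limits the vCPU count first.  The
hypothetical 128-byte layout would fit only 32 vCPUs per page, so matching
the ioreq page would take more than one page.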

    -Don Slutz

>  -George
>
>> ---
>> v11:
>>   No change
>>
>> v10:
>>   These literals should become an enum.
>>     I don't think the invalidate type is needed.
>>     Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
>>     is unclear.
>>     Comment about "special' range of 1" is not clear.
>>
>>
>> v9:
>>   New code was presented as an RFC before this.
>>
>>   Paul Durrant suggested I add support for other IOREQ types
>>   to HVMOP_map_io_range_to_ioreq_server.
>>     I have done this.
>>
>>  tools/libxc/xc_hvm_build_x86.c   |   5 +-
>>  tools/libxl/libxl_dm.c           |   2 +
>>  xen/arch/x86/hvm/emulate.c       |  78 ++++++++++++++---
>>  xen/arch/x86/hvm/hvm.c           | 182 ++++++++++++++++++++++++++++++++++-----
>>  xen/arch/x86/hvm/io.c            |  16 ++++
>>  xen/include/asm-x86/hvm/domain.h |   3 +-
>>  xen/include/asm-x86/hvm/hvm.h    |   1 +
>>  xen/include/public/hvm/hvm_op.h  |   5 ++
>>  xen/include/public/hvm/ioreq.h   |  17 ++++
>>  xen/include/public/hvm/params.h  |   4 +-
>>  10 files changed, 274 insertions(+), 39 deletions(-)
>>
>> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
>> index e45ae4a..ffe52eb 100644
>> --- a/tools/libxc/xc_hvm_build_x86.c
>> +++ b/tools/libxc/xc_hvm_build_x86.c
>> @@ -46,7 +46,8 @@
>>  #define SPECIALPAGE_IOREQ    5
>>  #define SPECIALPAGE_IDENT_PT 6
>>  #define SPECIALPAGE_CONSOLE  7
>> -#define NR_SPECIAL_PAGES     8
>> +#define SPECIALPAGE_VMPORT_REGS 8
>> +#define NR_SPECIAL_PAGES     9
>>  #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
>>  
>>  #define NR_IOREQ_SERVER_PAGES 8
>> @@ -569,6 +570,8 @@ static int setup_guest(xc_interface *xch,
>>                       special_pfn(SPECIALPAGE_BUFIOREQ));
>>      xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_PFN,
>>                       special_pfn(SPECIALPAGE_IOREQ));
>> +    xc_hvm_param_set(xch, dom, HVM_PARAM_VMPORT_REGS_PFN,
>> +                     special_pfn(SPECIALPAGE_VMPORT_REGS));
>>      xc_hvm_param_set(xch, dom, HVM_PARAM_CONSOLE_PFN,
>>                       special_pfn(SPECIALPAGE_CONSOLE));
>>      xc_hvm_param_set(xch, dom, HVM_PARAM_PAGING_RING_PFN,
>> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
>> index ce08461..b68c651 100644
>> --- a/tools/libxl/libxl_dm.c
>> +++ b/tools/libxl/libxl_dm.c
>> @@ -814,6 +814,8 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
>>                                              machinearg, max_ram_below_4g);
>>              }
>>          }
>> +        if (libxl_defbool_val(c_info->vmware_port))
>> +            machinearg = GCSPRINTF("%s,vmport=on", machinearg);
>>          flexarray_append(dm_args, machinearg);
>>          for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
>>              flexarray_append(dm_args, b_info->extra_hvm[i]);
>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>> index d5e6468..0a42d18 100644
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -219,27 +219,70 @@ static int hvmemul_do_io(
>>              vio->io_state = HVMIO_handle_mmio_awaiting_completion;
>>          break;
>>      case X86EMUL_UNHANDLEABLE:
>> -    {
>> -        struct hvm_ioreq_server *s =
>> -            hvm_select_ioreq_server(curr->domain, &p);
>> -
>> -        /* If there is no suitable backing DM, just ignore accesses */
>> -        if ( !s )
>> +        if ( vmport_check_port(p.addr) )
>>          {
>> -            hvm_complete_assist_req(&p);
>> -            rc = X86EMUL_OKAY;
>> -            vio->io_state = HVMIO_none;
>> +            struct hvm_ioreq_server *s =
>> +                hvm_select_ioreq_server(curr->domain, &p);
>> +
>> +            /* If there is no suitable backing DM, just ignore accesses */
>> +            if ( !s )
>> +            {
>> +                hvm_complete_assist_req(&p);
>> +                rc = X86EMUL_OKAY;
>> +                vio->io_state = HVMIO_none;
>> +            }
>> +            else
>> +            {
>> +                rc = X86EMUL_RETRY;
>> +                if ( !hvm_send_assist_req(s, &p) )
>> +                    vio->io_state = HVMIO_none;
>> +                else if ( p_data == NULL )
>> +                    rc = X86EMUL_OKAY;
>> +            }
>>          }
>>          else
>>          {
>> -            rc = X86EMUL_RETRY;
>> -            if ( !hvm_send_assist_req(s, &p) )
>> -                vio->io_state = HVMIO_none;
>> -            else if ( p_data == NULL )
>> +            struct hvm_ioreq_server *s;
>> +            vmware_regs_t *vr;
>> +
>> +            BUILD_BUG_ON(sizeof(ioreq_t) < sizeof(vmware_regs_t));
>> +
>> +            p.type = IOREQ_TYPE_VMWARE_PORT;
>> +            s = hvm_select_ioreq_server(curr->domain, &p);
>> +            vr = get_vmport_regs_any(s, curr);
>> +
>> +            /*
>> +             * If there is no suitable backing DM, just ignore accesses.  If
>> +             * we do not have access to registers to pass to QEMU, just
>> +             * ignore access.
>> +             */
>> +            if ( !s || !vr )
>> +            {
>> +                hvm_complete_assist_req(&p);
>>                  rc = X86EMUL_OKAY;
>> +                vio->io_state = HVMIO_none;
>> +            }
>> +            else
>> +            {
>> +                struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +
>> +                p.data = regs->rax;
>> +                vr->ebx = regs->_ebx;
>> +                vr->ecx = regs->_ecx;
>> +                vr->edx = regs->_edx;
>> +                vr->esi = regs->_esi;
>> +                vr->edi = regs->_edi;
>> +
>> +                vio->io_state = HVMIO_handle_pio_awaiting_completion;
>> +                if ( !hvm_send_assist_req(s, &p) )
>> +                {
>> +                    rc = X86EMUL_RETRY;
>> +                    vio->io_state = HVMIO_none;
>> +                }
>> +                /* else leave rc as X86EMUL_UNHANDLEABLE for below. */
>> +            }
>>          }
>>          break;
>> -    }
>>      default:
>>          BUG();
>>      }
>> @@ -248,6 +291,13 @@ static int hvmemul_do_io(
>>      {
>>          if ( ram_page )
>>              put_page(ram_page);
>> +        /*
>> +         * If rc is still X86EMUL_UNHANDLEABLE, then we are of
>> +         * type IOREQ_TYPE_VMWARE_PORT, so completion in
>> +         * hvm_io_assist() with no re-emulation required
>> +         */
>> +        if ( rc == X86EMUL_UNHANDLEABLE )
>> +            rc = X86EMUL_OKAY;
>>          return rc;
>>      }
>>  
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 2752197..7dd4fdb 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -394,6 +394,47 @@ static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
>>      return &p->vcpu_ioreq[v->vcpu_id];
>>  }
>>  
>> +static vmware_regs_t *get_vmport_regs_one(struct hvm_ioreq_server *s,
>> +                                          struct vcpu *v)
>> +{
>> +    struct hvm_ioreq_vcpu *sv;
>> +
>> +    list_for_each_entry ( sv,
>> +                          &s->ioreq_vcpu_list,
>> +                          list_entry )
>> +    {
>> +        if ( sv->vcpu == v )
>> +        {
>> +            shared_vmport_iopage_t *p = s->vmport_ioreq.va;
>> +            if ( !p )
>> +                return NULL;
>> +            return &p->vcpu_vmport_regs[v->vcpu_id];
>> +        }
>> +    }
>> +    return NULL;
>> +}
>> +
>> +vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v)
>> +{
>> +    struct domain *d = v->domain;
>> +
>> +    ASSERT((v == current) || !vcpu_runnable(v));
>> +
>> +    if ( s )
>> +        return get_vmport_regs_one(s, v);
>> +
>> +    list_for_each_entry ( s,
>> +                          &d->arch.hvm_domain.ioreq_server.list,
>> +                          list_entry )
>> +    {
>> +        vmware_regs_t *ret = get_vmport_regs_one(s, v);
>> +
>> +        if ( ret )
>> +            return ret;
>> +    }
>> +    return NULL;
>> +}
>> +
>>  bool_t hvm_io_pending(struct vcpu *v)
>>  {
>>      struct domain *d = v->domain;
>> @@ -504,22 +545,56 @@ static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
>>          set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
>>  }
>>  
>> -static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
>> +typedef enum {
>> +    IOREQ_PAGE_TYPE_IOREQ,
>> +    IOREQ_PAGE_TYPE_BUFIOREQ,
>> +    IOREQ_PAGE_TYPE_VMPORT,
>> +} ioreq_page_type_t;
>> +
>> +static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, ioreq_page_type_t buf)
>>  {
>> -    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
>> +    struct hvm_ioreq_page *iorp = NULL;
>> +
>> +    switch ( buf )
>> +    {
>> +    case IOREQ_PAGE_TYPE_IOREQ:
>> +        iorp = &s->ioreq;
>> +        break;
>> +    case IOREQ_PAGE_TYPE_BUFIOREQ:
>> +        iorp = &s->bufioreq;
>> +        break;
>> +    case IOREQ_PAGE_TYPE_VMPORT:
>> +        iorp = &s->vmport_ioreq;
>> +        break;
>> +    }
>> +    ASSERT(iorp);
>>  
>>      destroy_ring_for_helper(&iorp->va, iorp->page);
>>  }
>>  
>>  static int hvm_map_ioreq_page(
>> -    struct hvm_ioreq_server *s, bool_t buf, unsigned long gmfn)
>> +    struct hvm_ioreq_server *s, ioreq_page_type_t buf, unsigned long gmfn)
>>  {
>>      struct domain *d = s->domain;
>> -    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
>> +    struct hvm_ioreq_page *iorp = NULL;
>>      struct page_info *page;
>>      void *va;
>>      int rc;
>>  
>> +    switch ( buf )
>> +    {
>> +    case IOREQ_PAGE_TYPE_IOREQ:
>> +        iorp = &s->ioreq;
>> +        break;
>> +    case IOREQ_PAGE_TYPE_BUFIOREQ:
>> +        iorp = &s->bufioreq;
>> +        break;
>> +    case IOREQ_PAGE_TYPE_VMPORT:
>> +        iorp = &s->vmport_ioreq;
>> +        break;
>> +    }
>> +    ASSERT(iorp);
>> +
>>      if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
>>          return rc;
>>  
>> @@ -736,19 +811,32 @@ static void hvm_ioreq_server_remove_all_vcpus(struct hvm_ioreq_server *s)
>>  
>>  static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
>>                                        unsigned long ioreq_pfn,
>> -                                      unsigned long bufioreq_pfn)
>> +                                      unsigned long bufioreq_pfn,
>> +                                      unsigned long vmport_ioreq_pfn)
>>  {
>>      int rc;
>>  
>> -    rc = hvm_map_ioreq_page(s, 0, ioreq_pfn);
>> +    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ, ioreq_pfn);
>>      if ( rc )
>>          return rc;
>>  
>>      if ( bufioreq_pfn != INVALID_GFN )
>> -        rc = hvm_map_ioreq_page(s, 1, bufioreq_pfn);
>> +        rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ, bufioreq_pfn);
>>  
>>      if ( rc )
>> -        hvm_unmap_ioreq_page(s, 0);
>> +    {
>> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
>> +        return rc;
>> +    }
>> +
>> +    rc = hvm_map_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT, vmport_ioreq_pfn);
>> +
>> +    if ( rc )
>> +    {
>> +        if ( bufioreq_pfn != INVALID_GFN )
>> +            hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
>> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
>> +    }
>>  
>>      return rc;
>>  }
>> @@ -760,6 +848,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>>      struct domain *d = s->domain;
>>      unsigned long ioreq_pfn = INVALID_GFN;
>>      unsigned long bufioreq_pfn = INVALID_GFN;
>> +    unsigned long vmport_ioreq_pfn =
>> +        d->arch.hvm_domain.params[HVM_PARAM_VMPORT_REGS_PFN];
>>      int rc;
>>  
>>      if ( is_default )
>> @@ -771,7 +861,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>>          ASSERT(handle_bufioreq);
>>          return hvm_ioreq_server_map_pages(s,
>>                     d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN],
>> -                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN]);
>> +                   d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN],
>> +                   vmport_ioreq_pfn);
>>      }
>>  
>>      rc = hvm_alloc_ioreq_gmfn(d, &ioreq_pfn);
>> @@ -780,8 +871,8 @@ static int hvm_ioreq_server_setup_pages(struct hvm_ioreq_server *s,
>>          rc = hvm_alloc_ioreq_gmfn(d, &bufioreq_pfn);
>>  
>>      if ( !rc )
>> -        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn);
>> -
>> +        rc = hvm_ioreq_server_map_pages(s, ioreq_pfn, bufioreq_pfn,
>> +                                        vmport_ioreq_pfn);
>>      if ( rc )
>>      {
>>          hvm_free_ioreq_gmfn(d, ioreq_pfn);
>> @@ -796,11 +887,15 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s,
>>  {
>>      struct domain *d = s->domain;
>>      bool_t handle_bufioreq = ( s->bufioreq.va != NULL );
>> +    bool_t handle_vmport_ioreq = ( s->vmport_ioreq.va != NULL );
>> +
>> +    if ( handle_vmport_ioreq )
>> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_VMPORT);
>>  
>>      if ( handle_bufioreq )
>> -        hvm_unmap_ioreq_page(s, 1);
>> +        hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_BUFIOREQ);
>>  
>> -    hvm_unmap_ioreq_page(s, 0);
>> +    hvm_unmap_ioreq_page(s, IOREQ_PAGE_TYPE_IOREQ);
>>  
>>      if ( !is_default )
>>      {
>> @@ -835,12 +930,38 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>      for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
>>      {
>>          char *name;
>> +        char *type_name = NULL;
>> +        unsigned int limit;
>>  
>> -        rc = asprintf(&name, "ioreq_server %d %s", s->id,
>> -                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
>> -                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
>> -                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
>> -                      "");
>> +        switch ( i )
>> +        {
>> +        case HVMOP_IO_RANGE_PORT:
>> +            type_name = "port";
>> +            limit = MAX_NR_IO_RANGES;
>> +            break;
>> +        case HVMOP_IO_RANGE_MEMORY:
>> +            type_name = "memory";
>> +            limit = MAX_NR_IO_RANGES;
>> +            break;
>> +        case HVMOP_IO_RANGE_PCI:
>> +            type_name = "pci";
>> +            limit = MAX_NR_IO_RANGES;
>> +            break;
>> +        case HVMOP_IO_RANGE_VMWARE_PORT:
>> +            type_name = "VMware port";
>> +            limit = 1;
>> +            break;
>> +        case HVMOP_IO_RANGE_TIMEOFFSET:
>> +            type_name = "timeoffset";
>> +            limit = 1;
>> +            break;
>> +        default:
>> +            break;
>> +        }
>> +        if ( !type_name )
>> +            continue;
>> +
>> +        rc = asprintf(&name, "ioreq_server %d %s", s->id, type_name);
>>          if ( rc )
>>              goto fail;
>>  
>> @@ -853,7 +974,12 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>          if ( !s->range[i] )
>>              goto fail;
>>  
>> -        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
>> +        rangeset_limit(s->range[i], limit);
>> +
>> +        /* VMware port */
>> +        if ( i == HVMOP_IO_RANGE_VMWARE_PORT &&
>> +            s->domain->arch.hvm_domain.is_vmware_port_enabled )
>> +            rc = rangeset_add_range(s->range[i], 1, 1);
>>      }
>>  
>>   done:
>> @@ -1151,6 +1277,8 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>>              case HVMOP_IO_RANGE_PORT:
>>              case HVMOP_IO_RANGE_MEMORY:
>>              case HVMOP_IO_RANGE_PCI:
>> +            case HVMOP_IO_RANGE_VMWARE_PORT:
>> +            case HVMOP_IO_RANGE_TIMEOFFSET:
>>                  r = s->range[type];
>>                  break;
>>  
>> @@ -1202,6 +1330,8 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>              case HVMOP_IO_RANGE_PORT:
>>              case HVMOP_IO_RANGE_MEMORY:
>>              case HVMOP_IO_RANGE_PCI:
>> +            case HVMOP_IO_RANGE_VMWARE_PORT:
>> +            case HVMOP_IO_RANGE_TIMEOFFSET:
>>                  r = s->range[type];
>>                  break;
>>  
>> @@ -2426,9 +2556,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>      if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>>          return NULL;
>>  
>> -    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>> -        return d->arch.hvm_domain.default_ioreq_server;
>> -
>>      cf8 = d->arch.hvm_domain.pci_cf8;
>>  
>>      if ( p->type == IOREQ_TYPE_PIO &&
>> @@ -2471,7 +2598,10 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>          BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>>          BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>>          BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
>> +        BUILD_BUG_ON(IOREQ_TYPE_VMWARE_PORT != HVMOP_IO_RANGE_VMWARE_PORT);
>> +        BUILD_BUG_ON(IOREQ_TYPE_TIMEOFFSET != HVMOP_IO_RANGE_TIMEOFFSET);
>>          r = s->range[type];
>> +        ASSERT(r);
>>  
>>          switch ( type )
>>          {
>> @@ -2498,6 +2628,13 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>              }
>>  
>>              break;
>> +        case IOREQ_TYPE_VMWARE_PORT:
>> +        case IOREQ_TYPE_TIMEOFFSET:
>> +            /* The 'special' range of [1,1] is checked for being enabled */
>> +            if ( rangeset_contains_singleton(r, 1) )
>> +                return s;
>> +
>> +            break;
>>          }
>>      }
>>  
>> @@ -2657,6 +2794,7 @@ void hvm_complete_assist_req(ioreq_t *p)
>>      case IOREQ_TYPE_PCI_CONFIG:
>>          ASSERT_UNREACHABLE();
>>          break;
>> +    case IOREQ_TYPE_VMWARE_PORT:
>>      case IOREQ_TYPE_COPY:
>>      case IOREQ_TYPE_PIO:
>>          if ( p->dir == IOREQ_READ )
>> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
>> index 68fb890..7684cf0 100644
>> --- a/xen/arch/x86/hvm/io.c
>> +++ b/xen/arch/x86/hvm/io.c
>> @@ -192,6 +192,22 @@ void hvm_io_assist(ioreq_t *p)
>>          (void)handle_mmio();
>>          break;
>>      case HVMIO_handle_pio_awaiting_completion:
>> +        if ( p->type == IOREQ_TYPE_VMWARE_PORT )
>> +        {
>> +            vmware_regs_t *vr = get_vmport_regs_any(NULL, curr);
>> +
>> +            if ( vr )
>> +            {
>> +                struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +
>> +                /* Only change the 32bit part of the register */
>> +                regs->_ebx = vr->ebx;
>> +                regs->_ecx = vr->ecx;
>> +                regs->_edx = vr->edx;
>> +                regs->_esi = vr->esi;
>> +                regs->_edi = vr->edi;
>> +            }
>> +        }
>>          if ( vio->io_size == 4 ) /* Needs zero extension. */
>>              guest_cpu_user_regs()->rax = (uint32_t)p->data;
>>          else
>> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
>> index b435689..599a688 100644
>> --- a/xen/include/asm-x86/hvm/domain.h
>> +++ b/xen/include/asm-x86/hvm/domain.h
>> @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
>>      evtchn_port_t    ioreq_evtchn;
>>  };
>>  
>> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
>> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_VMWARE_PORT + 1)
>>  #define MAX_NR_IO_RANGES  256
>>  
>>  struct hvm_ioreq_server {
>> @@ -63,6 +63,7 @@ struct hvm_ioreq_server {
>>      ioservid_t             id;
>>      struct hvm_ioreq_page  ioreq;
>>      struct list_head       ioreq_vcpu_list;
>> +    struct hvm_ioreq_page  vmport_ioreq;
>>      struct hvm_ioreq_page  bufioreq;
>>  
>>      /* Lock to serialize access to buffered ioreq ring */
>> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
>> index c42f7d8..0c72ac8 100644
>> --- a/xen/include/asm-x86/hvm/hvm.h
>> +++ b/xen/include/asm-x86/hvm/hvm.h
>> @@ -524,6 +524,7 @@ extern bool_t opt_hvm_fep;
>>  
>>  void vmport_register(struct domain *d);
>>  int vmport_check_port(unsigned int port);
>> +vmware_regs_t *get_vmport_regs_any(struct hvm_ioreq_server *s, struct vcpu *v);
>>  
>>  #endif /* __ASM_X86_HVM_HVM_H__ */
>>  
>> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
>> index cde3571..2dcafc3 100644
>> --- a/xen/include/public/hvm/hvm_op.h
>> +++ b/xen/include/public/hvm/hvm_op.h
>> @@ -314,6 +314,9 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_get_ioreq_server_info_t);
>>   *
>>   * NOTE: unless an emulation request falls entirely within a range mapped
>>   * by a secondary emulator, it will not be passed to that emulator.
>> + *
>> + * NOTE: The 'special' range of [1,1] is what is checked for on
>> + * TIMEOFFSET and VMWARE_PORT.
>>   */
>>  #define HVMOP_map_io_range_to_ioreq_server 19
>>  #define HVMOP_unmap_io_range_from_ioreq_server 20
>> @@ -324,6 +327,8 @@ struct xen_hvm_io_range {
>>  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
>>  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>>  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
>> +# define HVMOP_IO_RANGE_TIMEOFFSET 7 /* TIMEOFFSET special range */
>> +# define HVMOP_IO_RANGE_VMWARE_PORT 9 /* VMware port special range */
>>      uint64_aligned_t start, end; /* IN - inclusive start and end of range */
>>  };
>>  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>> index 5b5fedf..2d9dcbe 100644
>> --- a/xen/include/public/hvm/ioreq.h
>> +++ b/xen/include/public/hvm/ioreq.h
>> @@ -37,6 +37,7 @@
>>  #define IOREQ_TYPE_PCI_CONFIG   2
>>  #define IOREQ_TYPE_TIMEOFFSET   7
>>  #define IOREQ_TYPE_INVALIDATE   8 /* mapcache */
>> +#define IOREQ_TYPE_VMWARE_PORT  9 /* pio + vmport registers */
>>  
>>  /*
>>   * VMExit dispatcher should cooperate with instruction decoder to
>> @@ -48,6 +49,8 @@
>>   * 
>>   * 63....48|47..40|39..35|34..32|31........0
>>   * SEGMENT |BUS   |DEV   |FN    |OFFSET
>> + *
>> + * For I/O type IOREQ_TYPE_VMWARE_PORT also use the vmware_regs.
>>   */
>>  struct ioreq {
>>      uint64_t addr;          /* physical address */
>> @@ -66,11 +69,25 @@ struct ioreq {
>>  };
>>  typedef struct ioreq ioreq_t;
>>  
>> +struct vmware_regs {
>> +    uint32_t esi;
>> +    uint32_t edi;
>> +    uint32_t ebx;
>> +    uint32_t ecx;
>> +    uint32_t edx;
>> +};
>> +typedef struct vmware_regs vmware_regs_t;
>> +
>>  struct shared_iopage {
>>      struct ioreq vcpu_ioreq[1];
>>  };
>>  typedef struct shared_iopage shared_iopage_t;
>>  
>> +struct shared_vmport_iopage {
>> +    struct vmware_regs vcpu_vmport_regs[1];
>> +};
>> +typedef struct shared_vmport_iopage shared_vmport_iopage_t;
>> +
>>  struct buf_ioreq {
>>      uint8_t  type;   /* I/O type                    */
>>      uint8_t  pad:1;
>> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
>> index 7c73089..130eba9 100644
>> --- a/xen/include/public/hvm/params.h
>> +++ b/xen/include/public/hvm/params.h
>> @@ -50,6 +50,8 @@
>>  #define HVM_PARAM_PAE_ENABLED  4
>>  
>>  #define HVM_PARAM_IOREQ_PFN    5
>> +/* Extra vmport PFN. */
>> +#define HVM_PARAM_VMPORT_REGS_PFN 35
>>  
>>  #define HVM_PARAM_BUFIOREQ_PFN 6
>>  #define HVM_PARAM_BUFIOREQ_EVTCHN 26
>> @@ -187,6 +189,6 @@
>>  /* Location of the VM Generation ID in guest physical address space. */
>>  #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
>>  
>> -#define HVM_NR_PARAMS          35
>> +#define HVM_NR_PARAMS          36
>>  
>>  #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
>>
>

* Re: [PATCH v11 9/9] Add xentrace to vmware_port
  2015-06-04 11:20   ` George Dunlap
@ 2015-06-04 12:31     ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 12:31 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/04/15 07:20, George Dunlap wrote:
> On 05/22/2015 04:50 PM, Don Slutz wrote:
>> Also added missing TRAP_DEBUG & VLAPIC.
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>> ---
>> v11:
>>   No change
>>
>> v10:
>>   Added Acked-by: Ian Campbell
>>   Added back in the trace point calls.
>>
>>     Why is cmd in this patch?
>>       Because the trace points use it.
>>
>> v9:
>>   Dropped unneeded VMPORT_UNHANDLED, VMPORT_DECODE.
>>
>> v7:
>>       Dropped some of the new traces.
>>       Added HVMTRACE_ND7.
>>
>> v6:
>>       Dropped the attempt to use svm_nextrip_insn_length via
>>       __get_instruction_length (added in v2).  Just always look
>>       at up to 15 bytes on AMD.
>>
>> v5:
>>       exitinfo1 is used twice.
>>         Fixed.
>>
>>  tools/xentrace/formats           |  5 +++++
>>  xen/arch/x86/hvm/io.c            |  3 +++
>>  xen/arch/x86/hvm/vmware/vmport.c | 17 ++++++++++++++---
>>  xen/include/asm-x86/hvm/trace.h  | 22 ++++++++++++++++++++++
>>  xen/include/public/trace.h       |  3 +++
>>  5 files changed, 47 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/xentrace/formats b/tools/xentrace/formats
>> index 5d7b72a..eec65f4 100644
>> --- a/tools/xentrace/formats
>> +++ b/tools/xentrace/formats
>> @@ -79,6 +79,11 @@
>>  0x00082020  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  INTR_WINDOW [ value = 0x%(1)08x ]
>>  0x00082021  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  NPF         [ gpa = 0x%(2)08x%(1)08x mfn = 0x%(4)08x%(3)08x qual = 0x%(5)04x p2mt = 0x%(6)04x ]
>>  0x00082023  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP        [ vector = 0x%(1)02x ]
>> +0x00082024  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_DEBUG  [ exit_qualification = 0x%(1)08x ]
>> +0x00082025  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VLAPIC
>> +0x00082026  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_HANDLED   [ cmd = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
>> +0x00082027  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_IGNORED   [ port = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
>> +0x00082028  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_QEMU      [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
>>  
>>  0x0010f001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_map      [ domid = %(1)d ]
>>  0x0010f002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_unmap    [ domid = %(1)d ]
>> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
>> index 7684cf0..6a9cfb0 100644
>> --- a/xen/arch/x86/hvm/io.c
>> +++ b/xen/arch/x86/hvm/io.c
>> @@ -206,6 +206,9 @@ void hvm_io_assist(ioreq_t *p)
>>                  regs->_edx = vr->edx;
>>                  regs->_esi = vr->esi;
>>                  regs->_edi = vr->edi;
>> +                HVMTRACE_ND(VMPORT_QEMU, 0, 1/*cycles*/, 6,
>> +                            p->data, regs->_ebx, regs->_ecx,
>> +                            regs->_edx, regs->_esi, regs->_edi);
>>              }
>>          }
>>          if ( vio->io_size == 4 ) /* Needs zero extension. */
>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
>> index 36e3f1b..3c3ccd4 100644
>> --- a/xen/arch/x86/hvm/vmware/vmport.c
>> +++ b/xen/arch/x86/hvm/vmware/vmport.c
>> @@ -16,6 +16,7 @@
>>  #include <xen/lib.h>
>>  #include <asm/hvm/hvm.h>
>>  #include <asm/hvm/support.h>
>> +#include <asm/hvm/trace.h>
>>  
>>  #include "backdoor_def.h"
>>  
>> @@ -35,6 +36,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>>      if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
>>      {
>>          uint32_t new_eax = ~0u;
>> +        uint16_t cmd = regs->_ecx;
>>          uint64_t value;
>>          struct vcpu *curr = current;
>>          struct domain *currd = curr->domain;
>> @@ -45,7 +47,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>>           * leaving the high 32-bits unchanged, unlike what one would
>>           * expect to happen.
>>           */
>> -        switch ( regs->_ecx & 0xffff )
>> +        switch ( cmd )
>>          {
>>          case BDOOR_CMD_GETMHZ:
>>              new_eax = currd->arch.tsc_khz / 1000;
>> @@ -123,11 +125,20 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>>              /* Let backing DM handle */
>>              return X86EMUL_UNHANDLEABLE;
>>          }
>> +        HVMTRACE_ND7(VMPORT_HANDLED, 0, 0/*cycles*/, 7,
>> +                     cmd, new_eax, regs->_ebx, regs->_ecx,
>> +                     regs->_edx, regs->_esi, regs->_edi);
> 
> Do you need to log edi as well? It looks like it's not used.


I guess not, but since some VMware port commands do use edi, a future
addition might need it.  I find it simpler to keep this and the QEMU
case above the same, but I will change it if you want.

> 
>>          if ( dir == IOREQ_READ )
>>              *val = new_eax;
>>      }
>> -    else if ( dir == IOREQ_READ )
>> -        *val = ~0u;
>> +    else
>> +    {
>> +        HVMTRACE_ND7(VMPORT_IGNORED, 0, 0/*cycles*/, 7,
>> +                     port, regs->_eax, regs->_ebx, regs->_ecx,
>> +                     regs->_edx, regs->_esi, regs->_edi);
> 
> And do you need to log all the registers here?  It seems like port +
> regs->_ecx would be enough to tell you why it got ignored.
> 

The minimum would be port and regs->_eax.  QEMU is the one that cares
about regs->_ecx.

Happy to make this change.  I will wait for a response to the above first.
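
To make that concrete, here is a minimal standalone sketch (not code from
this series) of the test that decides between the handled and ignored
paths.  vmport_is_backdoor() is a hypothetical helper name; BDOOR_PORT is
the 0x5658 port from the commit message, and the BDOOR_MAGIC value is
assumed to match VMware's backdoor_def.h.  Since the decision depends
only on the port and EAX, those two values are enough to explain an
ignored access:

```c
#include <stdbool.h>
#include <stdint.h>

/* Port named in the commit message; magic assumed from backdoor_def.h. */
#define BDOOR_PORT  0x5658u
#define BDOOR_MAGIC 0x564D5868u

/*
 * Hypothetical helper: mirrors the condition
 *     port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC
 * that vmport_ioport() uses to pick the handled path.
 */
static bool vmport_is_backdoor(uint32_t port, uint32_t eax)
{
    return port == BDOOR_PORT && eax == BDOOR_MAGIC;
}
```

Any access for which this returns false takes the ignored path, whatever
ECX, EDX, ESI and EDI contain.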

>> +        if ( dir == IOREQ_READ )
>> +            *val = ~0u;
>> +    }
>>  
>>      return X86EMUL_OKAY;
>>  }
>> diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
>> index de802a6..0ad805f 100644
>> --- a/xen/include/asm-x86/hvm/trace.h
>> +++ b/xen/include/asm-x86/hvm/trace.h
>> @@ -54,6 +54,9 @@
>>  #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
>>  #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
>>  #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_VMPORT_HANDLED   DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_IGNORED   DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_QEMU      DEFAULT_HVM_IO
>>  
>>  
>>  #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
>> @@ -83,6 +86,25 @@
>>          }                                                                 \
>>      } while(0)
>>  
>> +#define HVMTRACE_ND7(evt, modifier, cycles, count, d1, d2, d3, d4, d5, d6, d7) \
>> +    do {                                                                  \
>> +        if ( unlikely(tb_init_done) && DO_TRC_HVM_ ## evt )               \
>> +        {                                                                 \
>> +            struct {                                                      \
>> +                u32 d[7];                                                 \
>> +            } _d;                                                         \
>> +            _d.d[0]=(d1);                                                 \
>> +            _d.d[1]=(d2);                                                 \
>> +            _d.d[2]=(d3);                                                 \
>> +            _d.d[3]=(d4);                                                 \
>> +            _d.d[4]=(d5);                                                 \
>> +            _d.d[5]=(d6);                                                 \
>> +            _d.d[6]=(d7);                                                 \
>> +            __trace_var(TRC_HVM_ ## evt | (modifier), cycles,             \
>> +                        sizeof(*_d.d) * count, &_d);                      \
>> +        }                                                                 \
>> +    } while(0)
> 
> If you reduced the registers as mentioned above, you wouldn't need this
> here either.
> 

Yes.  However, without this it was not clear to me that a trace could
have 7 32-bit arguments.  Happy to go either way, depending on the
changes above.
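
For reference, a standalone sketch of the packing pattern that
HVMTRACE_ND7 uses.  mock_trace_var() and struct trace_rec are
hypothetical stand-ins for Xen's __trace_var() and its record buffer, so
this only illustrates how the count argument sets the payload length; it
is not the hypervisor code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_MAX_WORDS 7

/* Hypothetical record: what a consumer of the trace would see. */
struct trace_rec {
    uint32_t event;
    size_t   bytes;                 /* payload length in bytes */
    uint32_t d[TRACE_MAX_WORDS];    /* payload words, zero padded */
};

/* Hypothetical stand-in for __trace_var(): copies 'bytes' of payload. */
static void mock_trace_var(struct trace_rec *out, uint32_t event,
                           size_t bytes, const void *data)
{
    out->event = event;
    out->bytes = bytes;
    memset(out->d, 0, sizeof(out->d));
    memcpy(out->d, data, bytes);
}

/*
 * Same shape as HVMTRACE_ND7 in the patch: always fill seven words,
 * but only pass sizeof(word) * count bytes on (count must be <= 7).
 */
#define TRACE_ND7(out, evt, count, d1, d2, d3, d4, d5, d6, d7)          \
    do {                                                                \
        struct { uint32_t d[TRACE_MAX_WORDS]; } _d;                     \
        _d.d[0] = (d1);                                                 \
        _d.d[1] = (d2);                                                 \
        _d.d[2] = (d3);                                                 \
        _d.d[3] = (d4);                                                 \
        _d.d[4] = (d5);                                                 \
        _d.d[5] = (d6);                                                 \
        _d.d[6] = (d7);                                                 \
        mock_trace_var((out), (evt), sizeof(*_d.d) * (count), &_d);     \
    } while ( 0 )
```

With count set to 7 the record carries 28 bytes of payload; with fewer
registers logged, a smaller count shortens the record and the trailing
arguments are simply not emitted.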

   -Don Slutz

>  -George
> 
> 

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:58           ` George Dunlap
@ 2015-06-04 12:37             ` Don Slutz
  2015-06-04 14:14               ` George Dunlap
  0 siblings, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-06-04 12:37 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/03/15 12:58, George Dunlap wrote:
> On 06/03/2015 05:41 PM, Don Slutz wrote:
>> On 06/03/15 12:23, George Dunlap wrote:
>>> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>>>> On 03/06/15 16:26, George Dunlap wrote:
>>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>>> unchanged.
>>>>>>
>>>>>> This instruction is allowed to be used from ring 3.  To
>>>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>>>> fully tested that nested HVM is doing the right thing for this.
>>>>>>
>>>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>>>
>>>>>> Also adjust the emulation registers after doing a VMware
>>>>>> backdoor operation.
>>>>>>
>>>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>>>> handler.
>>>>>>
>>>>>> Some of the best info is at:
>>>>>>
>>>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>>>
>>>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>>> So let me get this straight.
>>>>>
>>>>> VMWare allows ring3 to access the magic port regardless of whether the
>>>>> guest OS has enabled access to that IO port or not.
>>>>>
>>>>> In order to emulate this, we need to:
>>>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>>>> * Emulate all instructions which cause a #GP, just to see if they might
>>>>> be an IO instruction accessing the magic port.
>>>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>>>> skip the ioport access checks (which will cause the instruction to
>>>>> execute as though it had been given access).
>>>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>>>
>>>>> In an attempt to make this more safe, emulation ops that write (such as
>>>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>>>
>>>>> Is that about right?
>>>>>
>>>>> That sounds completely insane.  It opens up an almost infinite surface
>>>>> of attack onto the Xen emulator.
>>>>>
>>>>> I understand that having "VMware compatible" is a nice tick-box to
>>>>> have, but seriously, I cannot imagine that having unprivileged
>>>>> user-space tools know the real clock frequency without having to involve
>>>>> the OS is anywhere close to worth the risk involved.
>>>>
>>>> The attack surface sadly is not enlarged in the slightest by this change.
>>>>
>>>> We already trap and emulate all #UD exceptions in an attempt to support
>>>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>>>> is a good argument to be made for not trapping #UD, but that doesn't
>>>> completely close the hole)
>>>
>>> So at the moment, an attacker on Intel can force the emulation of any
>>> AMD-only instruction (and vice versa), is that right?
>>>
>>> This would allow an attacker to force the emulation of every #GP
>>> condition of every instruction we emulate.
>>>
>>> Those two sets may be within an order of magnitude of each other, but
>>> they will only overlap a little bit.  So my guess is that enabling this
>>> would double the surface of attack (give or take).
>>>
>>> I'd be a lot happier with this patch if we could make it so that on a
>>> #GP the only instruction that could get emulated would be an IO instruction.
>>>
>>
>> You mean like I said in:
>>
>>
>> Message-ID: <54C67D8302000078000598E4@mail.emea.novell.com>
> 
> Yes, pretty much exactly.
> 
> I didn't notice that particular part of the discussion, but I did go
> back and skim the comments that people had made on previous revisions,
> and I certainly noticed that both Jan and Andy reviewed this patch, and
> that neither one objected to the general idea.  So my "That sounds
> insane" was as much directed at them as at you.
> 
> (As an aside, I think my description does a better job of alerting a
> reviewer to what's going on in this patch -- you might consider stealing
> part of it if you end up re-submitting this one.)
> 

I would be happy to "steal" the description part.  I normally give
credit to the author in the "what has changed".  I could also add to the
commit message:


George Dunlap summarized this change as:

VMWare allows ring3 to access the magic port regardless of whether the
guest OS has enabled access to that IO port or not.

In order to emulate this, we need to:
* Trap to Xen on #GPs rather than just letting the hardware handle it
* Emulate all instructions which cause a #GP, just to see if they might
  be an IO instruction accessing the magic port.
* If it is an IO instruction, and it's accessing the magic port, then we
  skip the ioport access checks (which will cause the instruction to
  execute as though it had been given access).
* Under all other circumstances (we hope) the emulator in Xen will do
  exactly what the hardware just did, and deliver a #GP to the guest.

In an attempt to make this more safe, emulation ops that write (such as
write and cmpxchg) are replaced with stubs which always return an error.
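
For anyone who wants something concrete to look at, the flow in that
summary can be sketched roughly as below.  This is only an illustration
and not the code from the patch: gp_action, decoded_insn, stub_write and
handle_gp are names made up for this sketch, and instruction decoding is
assumed to have already happened.

```c
#include <stdbool.h>
#include <stdint.h>

#define VMWARE_PORT 0x5658u  /* magic port named in the patch description */

/* Hypothetical outcome of handling a #GP intercept. */
enum gp_action {
    GP_EMULATE_IO,  /* perform the port access, skipping the ioport checks */
    GP_REINJECT     /* deliver the #GP to the guest, as hardware would */
};

/* Hypothetical, already-decoded view of the faulting instruction. */
struct decoded_insn {
    bool is_port_io;  /* true for the in/out family */
    uint16_t port;    /* port number taken from DX */
};

/* Stand-in for a write-type emulation hook: it always refuses. */
int stub_write(void)
{
    return -1;  /* hypothetical "cannot handle" status */
}

/* Decide what to do with a #GP, following the steps in the summary. */
enum gp_action handle_gp(const struct decoded_insn *insn,
                         bool vmware_port_enabled)
{
    if (vmware_port_enabled && insn->is_port_io &&
        insn->port == VMWARE_PORT)
        return GP_EMULATE_IO;

    return GP_REINJECT;
}
```

Only an I/O instruction aimed at the magic port, in a domain with the
feature enabled, takes the emulation path; everything else gets the #GP
back.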


   -Don Slutz

>  -George
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-04 12:37             ` Don Slutz
@ 2015-06-04 14:14               ` George Dunlap
  2015-06-04 16:17                 ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-04 14:14 UTC (permalink / raw)
  To: Don Slutz, Don Slutz, Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/04/2015 01:37 PM, Don Slutz wrote:
> On 06/03/15 12:58, George Dunlap wrote:
>> On 06/03/2015 05:41 PM, Don Slutz wrote:
>>> On 06/03/15 12:23, George Dunlap wrote:
>>>> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>>>>> On 03/06/15 16:26, George Dunlap wrote:
>>>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>>>> unchanged.
>>>>>>>
>>>>>>> This instruction is allowed to be used from ring 3.  To
>>>>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>>>>> fully tested that nested HVM is doing the right thing for this.
>>>>>>>
>>>>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>>>>
>>>>>>> Also adjust the emulation registers after doing a VMware
>>>>>>> backdoor operation.
>>>>>>>
>>>>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>>>>> handler.
>>>>>>>
>>>>>>> Some of the best info is at:
>>>>>>>
>>>>>>> https://sites.google.com/site/chitchatvmback/backdoor
>>>>>>>
>>>>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>>>> So let me get this straight.
>>>>>>
>>>>>> VMWare allows ring3 to access the magic port regardless of whether the
>>>>>> guest OS has enabled access to that IO port or not.
>>>>>>
>>>>>> In order to emulate this, we need to:
>>>>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>>>>> * Emulate all instructions which cause a #GP, just to see if they might
>>>>>> be an IO instruction accessing the magic port.
>>>>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>>>>> skip the ioport access checks (which will cause the instruction to
>>>>>> execute as though it had been given access).
>>>>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>>>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>>>>
>>>>>> In an attempt to make this more safe, emulation ops that write (such as
>>>>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>>>>
>>>>>> Is that about right?
>>>>>>
>>>>>> That sounds completely insane.  It opens up an almost infinite surface
>>>>>> of attack onto the Xen emulator.
>>>>>>
>>>>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>>>>> have, but seriously, I cannot imagine that having unprivileged
>>>>>> user-space tools know the real clock frequency without having to involve
>>>>>> the OS is anywhere close to worth the risk involved.
>>>>>
>>>>> The attack surface sadly is not enlarged in the slightest by this change.
>>>>>
>>>>> We already trap and emulate all #UD exceptions in an attempt to support
>>>>> migration of VMs between Intel and AMD hardware.  See XSA-105.  (There
>>>>> is a good argument to be made for not trapping #UD, but that doesn't
>>>>> completely close the hole)
>>>>
>>>> So at the moment, an attacker on Intel can force the emulation of any
>>>> AMD-only instruction (and vice versa), is that right?
>>>>
>>>> This would allow an attacker to force the emulation of every #GP
>>>> condition of every instruction we emulate.
>>>>
>>>> Those two sets may be within an order of magnitude of each other, but
>>>> they will only overlap a little bit.  So my guess is that enabling this
>>>> would double the surface of attack (give or take).
>>>>
>>>> I'd be a lot happier with this patch if we could make it so that on a
>>>> #GP the only instruction that could get emulated would be an IO instruction.
>>>>
>>>
>>> You mean like I said in:
>>>
>>>
>>> Message-ID: <54C67D8302000078000598E4@mail.emea.novell.com>
>>
>> Yes, pretty much exactly.
>>
>> I didn't notice that particular part of the discussion, but I did go
>> back and skim the comments that people had made on previous revisions,
>> and I certainly noticed that both Jan and Andy reviewed this patch, and
>> that neither one objected to the general idea.  So my "That sounds
>> insane" was as much directed at them as at you.
>>
>> (As an aside, I think my description does a better job of alerting a
>> reviewer to what's going on in this patch -- you might consider stealing
>> part of it if you end up re-submitting this one.)
>>
> 
> I would be happy to "steal" the description part.  I normally give
> credit to the author in the "what has changed".  I could also add to the
> commit message:
> 
> 
> George Dunlap summarized this change as:
> 
> VMWare allows ring3 to access the magic port regardless of whether the
> guest OS has enabled access to that IO port or not.
> 
> In order to emulate this, we need to:
> * Trap to Xen on #GPs rather than just letting the hardware handle it
> * Emulate all instructions which cause a #GP, just to see if they might
>   be an IO instruction accessing the magic port.
> * If it is an IO instruction, and it's accessing the magic port, then we
>   skip the ioport access checks (which will cause the instruction to
>   execute as though it had been given access).
> * Under all other circumstances (we hope) the emulator in Xen will do
>   exactly what the hardware just did, and deliver a #GP to the guest.
> 
> In an attempt to make this more safe, emulation ops that write (such as
> write and cmpxchg) are replaced with stubs which always return an error.

I would take the "(we hope)" out, since that's really part of the second
half (me doubting whether such a patch is wise).  Not a good idea to
doubt whether a patch is a good idea in the commit message. :-)

If you feel like credit is necessary, I'd just put at the bottom
somewhere in parentheses something like this:

(h/t to George Dunlap for the patch summary.)

 -George

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-06-03 14:53   ` George Dunlap
@ 2015-06-04 15:15     ` Ian Campbell
  2015-06-04 15:46       ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: Ian Campbell @ 2015-06-04 15:15 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Wed, 2015-06-03 at 15:53 +0100, George Dunlap wrote:
> On 05/22/2015 04:50 PM, Don Slutz wrote:
> > This is used to set xen_arch_domainconfig vmware_hw. It is set to
> > the emulated VMware virtual hardware version.
> > 
> > Currently 0, 3-4, 6-11 are good values.  However the code only
> > checks for == 0, != 0, or < 7.
> > 
> > Signed-off-by: Don Slutz <dslutz@verizon.com>
> 
> Ian,
> 
> It looks like you gave a pre-approved Ack to something almost identical
> to v10.

In v9 I indicated that LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE and
LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER could be covered by a single ack
(introducing vmware support generally).

In v11 this seems to have morphed into only
LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE being provided, which is
clearly not an appropriate umbrella #define.

I'm also not sure if there is more stuff later in the series, if so then
unless it is all committed together an umbrella option may not work,
unless it is added right at the end, in which case I suppose having some
"unadvertised" functionality in the midst of a dev cycle would be ok.
Releasing like that would be a mistake though.

Ian.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-05-22 15:50 ` [PATCH v11 3/9] tools: Add vmware_hwver support Don Slutz
  2015-06-03 14:53   ` George Dunlap
@ 2015-06-04 15:17   ` Ian Campbell
  2015-06-04 15:59     ` Don Slutz
  1 sibling, 1 reply; 48+ messages in thread
From: Ian Campbell @ 2015-06-04 15:17 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jan Beulich,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Fri, 2015-05-22 at 11:50 -0400, Don Slutz wrote:
> [...] 
> +=item B<vmware_hwver=NUMBER>
> +
> +Turns on or off the exposure of VMware cpuid.  The number is
> +VMware's hardware version number, where 0 is off.  A number >= 7
> +is needed to enable exposure of VMware cpuid.
> +
> +The hardware version number (vmware_hwver) come from VMware config files.

"comes"

> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 38d065f..4362d5d 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -64,7 +64,7 @@ int xc_domain_create(xc_interface *xch,
>      memset(&config, 0, sizeof(config));
>  
>  #if defined (__i386) || defined(__x86_64__)
> -    /* No arch-specific configuration for now */
> +    /* No arch-specific default configuration for now */

I'm not sure this hunk has anything to do with this patch, nor what the
semantic difference between the old and new text is supposed to be.

Ian.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 7/9] tools: Add vmware_port support
  2015-05-22 15:50 ` [PATCH v11 7/9] tools: Add " Don Slutz
  2015-06-03 17:06   ` George Dunlap
@ 2015-06-04 15:20   ` Ian Campbell
  1 sibling, 0 replies; 48+ messages in thread
From: Ian Campbell @ 2015-06-04 15:20 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jan Beulich,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Fri, 2015-05-22 at 11:50 -0400, Don Slutz wrote:
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 86164a7..fcce7c3 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -205,6 +205,11 @@
>  #define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
>  
>  /*
> + * libxl_domain_create_info has the vmware_hwver and vmware_port field.
> + */
> +#define LIBXL_HAVE_CREATEINFO_VMWARE 1

Let's just have a single one of these indicating support for vmware, it
should be added at the end of the series after all the baseline vmware
functionality is in place. I think that means hwver, vga=vmware and this
port stuff.

(Future incremental changes will of course require their own flags).

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-06-04 15:15     ` Ian Campbell
@ 2015-06-04 15:46       ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 15:46 UTC (permalink / raw)
  To: Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/04/15 11:15, Ian Campbell wrote:
> On Wed, 2015-06-03 at 15:53 +0100, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> This is used to set xen_arch_domainconfig vmware_hw. It is set to
>>> the emulated VMware virtual hardware version.
>>>
>>> Currently 0, 3-4, 6-11 are good values.  However the code only
>>> checks for == 0, != 0, or < 7.
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>
>> Ian,
>>
>> It looks like you gave a pre-approved Ack to something almost identical
>> to v10.
> 
> In v9 I indicated that LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE and
> LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER could be covered by a single ack
> (introducing vmware support generally).
> 
> In v11 this seems to have morphed into only
> LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE being provided, which is
> clearly not an appropriate umbrella #define.
> 

Only in PATCH 1/9 -- Which in v11 is now completely independent.  I only
kept it in the series since in v10 it was not fully independent.

> I'm also not sure if there is more stuff later in the series, if so then
> unless it is all committed together an umbrella option may not work,
> unless it is added right at the end, in which case I suppose having some
> "unadvertised" functionality in the midst of a dev cycle would be ok.
> Releasing like that would be a mistake though.
> 

There is one later in the series, in 7/9, to which you said (in a different
thread):

>> +#define LIBXL_HAVE_CREATEINFO_VMWARE 1
>
> Lets just have a single one of these indicating support for vmware, it
> should be added at the end of the series after all the baseline vmware
> functionality is in place. I think that means hwver, vga=vmware and this
> port stuff.
>
> (Future incremental changes will of course require their own flags).

If I am reading this correctly, you want PATCH 1/9 to not be completely
independent.

   -Don Slutz

> Ian.
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 7/9] tools: Add vmware_port support
  2015-06-03 17:06   ` George Dunlap
@ 2015-06-04 15:49     ` Ian Campbell
  2015-06-04 16:09       ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: Ian Campbell @ 2015-06-04 15:49 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Wed, 2015-06-03 at 18:06 +0100, George Dunlap wrote:
> On 05/22/2015 04:50 PM, Don Slutz wrote:
> > This new libxl_domain_create_info field is used to set
> > XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK in the xc_domain_configuration_t
> > for x86.
> > 
> > In xen it is is_vmware_port_enabled.
> > 
> > If is_vmware_port_enabled then
> >   enable a limited support of VMware's hyper-call.
> > 
> > VMware's hyper-call is also known as VMware Backdoor I/O Port.
> > 
> > if vmware_port is not specified in the config file, let
> > "vmware_hwver != 0" be the default value.  This means that only
> > vmware_hwver = 7 needs to be specified to enable both features.
> > 
> > vmware_hwver = 7 is special because that is what controls the
> > enable of CPUID leaves for VMware (vmware_hwver >= 7).
> > 
> > Note: vmware_port and nestedhvm cannot be specified at the
> > same time.
> > 
> > Signed-off-by: Don Slutz <dslutz@verizon.com>
> 
> Ian:
> 
> So I *think* it may be the case that this patch only depends on patch 5
> to apply.  I also think that patches 5 and 7 together add another useful
> "chunk" of functionality (core vmport functionality for guest OSes).

Is there any point in this chunk without 1..3?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 3/9] tools: Add vmware_hwver support
  2015-06-04 15:17   ` Ian Campbell
@ 2015-06-04 15:59     ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 15:59 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Jun Nakajima, Kevin Tian, Keir Fraser, Eddie Dong,
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim Deegan,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/04/15 11:17, Ian Campbell wrote:
> On Fri, 2015-05-22 at 11:50 -0400, Don Slutz wrote:
>> [...] 
>> +=item B<vmware_hwver=NUMBER>
>> +
>> +Turns on or off the exposure of VMware cpuid.  The number is
>> +VMware's hardware version number, where 0 is off.  A number >= 7
>> +is needed to enable exposure of VMware cpuid.
>> +
>> +The hardware version number (vmware_hwver) come from VMware config files.
> 
> "comes"
> 

Will fix.

>> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
>> index 38d065f..4362d5d 100644
>> --- a/tools/libxc/xc_domain.c
>> +++ b/tools/libxc/xc_domain.c
>> @@ -64,7 +64,7 @@ int xc_domain_create(xc_interface *xch,
>>      memset(&config, 0, sizeof(config));
>>  
>>  #if defined (__i386) || defined(__x86_64__)
>> -    /* No arch-specific configuration for now */
>> +    /* No arch-specific default configuration for now */
> 
> I'm not sure this hunk has anything to do with this patch, nor what the
> semantic difference between the old and new text is supposed to be.
> 

I do not have to make this change.  However I feel the comment is
misleading, and I changed it.  The real change is later:


diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 651b338..fd7dafa 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -5,8 +5,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       xc_domain_configuration_t *xc_config)
 {
-    /* Note: will be changed in a later patch */
-    xc_config->vmware_hwver = 0;
+    xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
     return 0;
 }

which is where "arch-specific configuration" is done.  This is where the
default "arch-specific configuration" is done.

   -Don Slutz


> Ian.
> 
> 
> 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 7/9] tools: Add vmware_port support
  2015-06-04 15:49     ` Ian Campbell
@ 2015-06-04 16:09       ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 16:09 UTC (permalink / raw)
  To: Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/04/15 11:49, Ian Campbell wrote:
> On Wed, 2015-06-03 at 18:06 +0100, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> This new libxl_domain_create_info field is used to set
>>> XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK in the xc_domain_configuration_t
>>> for x86.
>>>
>>> In xen it is is_vmware_port_enabled.
>>>
>>> If is_vmware_port_enabled then
>>>   enable a limited support of VMware's hyper-call.
>>>
>>> VMware's hyper-call is also known as VMware Backdoor I/O Port.
>>>
>>> if vmware_port is not specified in the config file, let
>>> "vmware_hwver != 0" be the default value.  This means that only
>>> vmware_hwver = 7 needs to be specified to enable both features.
>>>
>>> vmware_hwver = 7 is special because that is what controls the
>>> enable of CPUID leaves for VMware (vmware_hwver >= 7).
>>>
>>> Note: vmware_port and nestedhvm cannot be specified at the
>>> same time.
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>
>> Ian:
>>
>> So I *think* it may be the case that this patch only depends on patch 5
>> to apply.  I also think that patches 5 and 7 together add another useful
>> "chunk" of functionality (core vmport functionality for guest OSes).
> 
> Is there any point in this chunk without 1..3?
> 
> 

There may be.  However changes would need to be done if 2 and 3 are not
present.

Patch #1 is independent.

Patches 2 and 3 provide CPUID support, which is not always needed.

Patches 5, 7 and 8 on their own may be enough to use the VMware mouse
under X.  I have not tested with just this set.  Most X servers run
with IOPL set to 3 and so do not need patch #6.  My memory also says
that CPUID support is not needed (which is what you get for vmware_hwver=3).
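
To spell out how the two settings interact as described in the patch
descriptions (CPUID leaves only for vmware_hwver >= 7, and vmware_port
defaulting to "vmware_hwver != 0" when not specified), here is a small
sketch.  The function names and the use of -1 for "not specified" are
assumptions made for this example, not what libxl actually does.

```c
#include <stdbool.h>
#include <stdint.h>

/* VMware CPUID leaves are exposed only for hardware version 7 and up. */
bool vmware_cpuid_exposed(uint32_t hwver)
{
    return hwver >= 7;
}

/*
 * Effective vmware_port setting.
 * vmware_port_cfg: -1 = not specified in the config file (hypothetical
 * encoding), 0 = off, anything else = on.
 */
bool vmware_port_effective(int vmware_port_cfg, uint32_t hwver)
{
    if (vmware_port_cfg < 0)
        return hwver != 0;  /* default follows vmware_hwver */

    return vmware_port_cfg != 0;
}
```

With vmware_hwver=3 and vmware_port left unspecified this gives the port
without the CPUID leaves; with vmware_hwver=7 it gives both.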


   -Don Slutz

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-04 14:14               ` George Dunlap
@ 2015-06-04 16:17                 ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-04 16:17 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, Andrew Cooper, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 06/04/15 10:14, George Dunlap wrote:
> On 06/04/2015 01:37 PM, Don Slutz wrote:
>> On 06/03/15 12:58, George Dunlap wrote:
>>> On 06/03/2015 05:41 PM, Don Slutz wrote:
>>>> On 06/03/15 12:23, George Dunlap wrote:
>>>>> On 06/03/2015 04:58 PM, Andrew Cooper wrote:
>>>>>> On 03/06/15 16:26, George Dunlap wrote:
>>>>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>>>>> unchanged.
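
For reference, the operand-size behaviour in the summary quoted above
can be modelled as below.  This is only a model of the register update
described there (EAX treated as a plain 32-bit value), with function
names invented for the sketch; it does not touch any real port.

```c
#include <stdint.h>

/*
 * EAX after an "in" from the port: only the part of EAX matching the
 * operand size (1, 2 or 4 bytes) is replaced by the returned value.
 */
uint32_t eax_after_in(uint32_t old_eax, uint32_t value, unsigned int bytes)
{
    switch (bytes) {
    case 1:
        return (old_eax & 0xffffff00u) | (value & 0xffu);    /* AL only */
    case 2:
        return (old_eax & 0xffff0000u) | (value & 0xffffu);  /* AX only */
    default:
        return value;                                        /* all of EAX */
    }
}

/* EAX after an "out" of any length: unchanged. */
uint32_t eax_after_out(uint32_t old_eax)
{
    return old_eax;
}
```

This is why the 32-bit "in (%dx),%eax" is the form to use when the
operation returns data in EAX.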


>>> (As an aside, I think my description does a better job of alerting a
>>> reviewer to what's going on in this patch -- you might consider stealing
>>> part of it if you end up re-submitting this one.)
>>>
>>
>> I would be happy to "steal" the description part.  I normally give
>> credit to the author in the "what has changed".  I could also add to the
>> commit message:
>>
>>
>> George Dunlap summarized this change as:
>>
>> VMWare allows ring3 to access the magic port regardless of whether the
>> guest OS has enabled access to that IO port or not.
>>
>> In order to emulate this, we need to:
>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>> * Emulate all instructions which cause a #GP, just to see if they might
>>   be an IO instruction accessing the magic port.
>> * If it is an IO instruction, and it's accessing the magic port, then we
>>   skip the ioport access checks (which will cause the instruction to
>>   execute as though it had been given access).
>> * Under all other circumstances (we hope) the emulator in Xen will do
>>   exactly what the hardware just did, and deliver a #GP to the guest.
>>
>> In an attempt to make this more safe, emulation ops that write (such as
>> write and cmpxchg) are replaced with stubs which always return an error.
> 
> I would take the "(we hope)" out, since that's really part of the second
> half (me doubting whether such a patch is wise).  Not a good idea to
> doubt whether a patch is a good idea in the commit message. :-)
> 
> If you feel like credit is necessary, I'd just put at the bottom
> somewhere in parentheses something like this:
> 
> (h/t to George Dunlap for the patch summary.)
> 

Ok.  Thanks,

   -Don Slutz


>  -George
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-03 16:50         ` George Dunlap
@ 2015-06-05  9:31           ` Jan Beulich
  2015-06-05 10:54             ` Ian Campbell
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2015-06-05  9:31 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	Don Slutz, Don Slutz, xen-devel, Eddie Dong,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 03.06.15 at 18:50, <george.dunlap@eu.citrix.com> wrote:
> On 06/03/2015 05:36 PM, Don Slutz wrote:
>> On 06/03/15 11:58, Andrew Cooper wrote:
>>> On 03/06/15 16:26, George Dunlap wrote:
>>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>>>>> to port 0x5658 specially.  Note: since many operations return data
>>>>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>>>>> "in (%dx),%al" will still do things, only AL part of EAX will be
>>>>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>>>>> unchanged.
>>>>>
>>>>> This instruction is allowed to be used from ring 3.  To
>>>>> support this the vmexit for GP needs to be enabled.  I have not
>>>>> fully tested that nested HVM is doing the right thing for this.
>>>>>
>>>>> Enable no-fault of pio in x86_emulate for VMware port
>>>>>
>>>>> Also adjust the emulation registers after doing a VMware
>>>>> backdoor operation.
>>>>>
>>>>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>>>>> handler.
>>>>>
>>>>> Some of the best info is at:
>>>>>
>>>>> https://sites.google.com/site/chitchatvmback/backdoor 
>>>>>
>>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>> So let me get this straight.
>>>>
>>>> VMWare allows ring3 to access the magic port regardless of whether the
>>>> guest OS has enabled access to that IO port or not.
>>>>
>>>> In order to emulate this, we need to:
>>>> * Trap to Xen on #GPs rather than just letting the hardware handle it
>>>> * Emulate all instructions which cause a #GP, just to see if they might
>>>> be an IO instruction accessing the magic port.
>>>> * If it is an IO instruction, and it's accessing the magic port, then we
>>>> skip the ioport access checks (which will cause the instruction to
>>>> execute as though it had been given access).
>>>> * Under all other circumstances (we hope) the emulator in Xen will do
>>>> exactly what the hardware just did, and deliver a #GP to the guest.
>>>>
>>>> In an attempt to make this more safe, emulation ops that write (such as
>>>> write and cmpxchg) are replaced with stubs which always return an error.
>>>>
>>>> Is that about right?
>> 
>> Yes, however it is missing that Jan Beulich wanted the emulator in Xen
>> to be used.  I had started with code that did not use the emulator.
> 
> I agree with him that the emulator should be used to emulate the
> instructions we *want* to emulate.  I'm just not happy with using the
> emulator to emulate all the instructions we *don't* want to emulate
> (i.e., all the ones that really do need to #GP).

And hence the suggestion to stub out hooks that shouldn't get
involved. Of course, with us now running into the problem of wanting to
limit what gets emulated for the second or third time, perhaps we
should rather consider an extension to the emulator interface that
callers can use to control what kinds of operations they want
emulated (with anything else causing failure) - in the case here,
I/O instructions.
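
One possible shape for such an extension, purely as an illustration:
the caller passes a mask of instruction classes it is willing to have
emulated, and anything outside the mask fails before any emulation is
attempted.  All names here are invented for this sketch and the
classification of the instruction is assumed to be done elsewhere.

```c
#include <stdint.h>

/* Hypothetical instruction classes a caller could opt in to. */
#define EMUL_CLASS_PORT_IO    0x1u
#define EMUL_CLASS_MEM_WRITE  0x2u
#define EMUL_CLASS_OTHER      0x4u

/* Hypothetical return codes; not the real emulator's values. */
enum emul_rc { EMUL_OK = 0, EMUL_REFUSED = 1 };

struct emul_request {
    uint32_t allowed;     /* classes the caller permits */
    uint32_t insn_class;  /* class of the decoded instruction */
};

/* Refuse anything the caller did not explicitly ask for. */
enum emul_rc emulate_filtered(const struct emul_request *req)
{
    if ((req->insn_class & req->allowed) == 0)
        return EMUL_REFUSED;

    /* Actual emulation of the permitted instruction would go here. */
    return EMUL_OK;
}
```

The #GP path would then pass only the port I/O class, so a faulting
instruction of any other kind is rejected up front.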

>>>> That sounds completely insane.  It opens up an almost infinite surface
>>>> of attack onto the Xen emulator.
>>>>
>>>> I understand that having the "VMWare compatible" is a nice tick-box to
>>>> have, but seriously, I cannot imagine that having unprivileged
>>>> user-space tools know the real clock frequency without having to involve
>>>> the OS is anywhere close to worth the risk involved.
>> 
>> Not sure how you moved from attack surface to "real clock frequency"
>> (which I am not sure which of the many "clock frequency" you are
>> referring to.  The only new one that leaps to mind is the emulated lapic
>> bus frequency (which Linux attempts to determine from other clocks).
> 
> I'm talking about cost-benefits analysis.  What's the benefit of
> accepting this patch, and is it worth the cost?

The basic idea of allowing guests originally installed on VMware to
continue their lives on Xen is certainly worth accepting some cost
for. It's really hard to judge whether in the case here things go
too far (and that would equally apply to the hand-crafted
instruction decoding done in earlier versions of this series).

Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-06-04 11:28     ` Don Slutz
@ 2015-06-05  9:35       ` Jan Beulich
  2015-06-05 10:03         ` Paul Durrant
  2015-06-08 10:05       ` George Dunlap
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2015-06-05  9:35 UTC (permalink / raw)
  To: Don Slutz, Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Eddie Dong, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Kevin Tian, xen-devel, Paul Durrant, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.06.15 at 13:28, <don.slutz@gmail.com> wrote:
> On 06/03/15 13:09, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> This adds synchronization of the 6 vcpu registers (only 32bits of
>>> them) that vmport.c needs between Xen and QEMU.
>>>
>>> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
>>> fetch and put these 6 vcpu registers used by the code in vmport.c
>>> and vmmouse.c
>>>
>>> In the tools, enable usage of QEMU's vmport code.
>>>
>>> The currently most useful VMware port support that QEMU has is the
>>> VMware mouse support.  Xorg included a VMware mouse support that
>>> uses absolute mode.  This makes using a mouse in X11 much nicer.
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>> Sorry for coming a bit late to this party.  On a high level I think this
>> is good, but there doesn't seem to be anything in here in particular
>> that is vmware-specific.  Would it make more sense to give this a more
>> generic name, and have it include all of the general-purpose registers?
> 
> I do not know of a more general case.  The code here is very VMware "in
> (%dx),%eax" specific.  The x86 architecture does not have an in/out case
> where registers other than rax get used and/or changed that need to be
> sent to QEMU.  There already is code to handle ins better than 1 byte at
> a time.
> 
> There is also a data size issue.  The register data sent over is smaller
> than the ioreq data.  Therefore the number of vCPUs that are supported
> is the same.  Changing the amount of data sent would affect this
> (like requiring more than 1 page).

You may or may not have heard talk about there being an extension
to the qemu interface in the works anyway that involves larger data
items (xmm, ymm, and zmm data in particular) to be communicated to
qemu. Depending on the time frame for this to arrive (Paul?) perhaps
it would make sense to defer the changes here to then build on top of
that instead of introducing a custom mechanism?

Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 5/9] xen: Add vmware_port support
  2015-05-22 15:50 ` [PATCH v11 5/9] xen: Add vmware_port support Don Slutz
@ 2015-06-05  9:52   ` Jan Beulich
  2015-06-05 13:18     ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2015-06-05  9:52 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 22.05.15 at 17:50, <dslutz@verizon.com> wrote:
> @@ -5805,6 +5808,12 @@ static int hvmop_set_param(
>              break;
>          if ( a.value > 1 )
>              rc = -EINVAL;
> +        /* Prevent nestedhvm with vmport */
> +        if ( d->arch.hvm_domain.is_vmware_port_enabled )
> +        {
> +            rc = -EOPNOTSUPP;
> +            break;
> +        }

Surrounding code avoiding the use of "break" makes the result look
rather inconsistent. Please move this up immediately after the XSM
check, or drop the "break".

Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-06-05  9:35       ` Jan Beulich
@ 2015-06-05 10:03         ` Paul Durrant
  0 siblings, 0 replies; 48+ messages in thread
From: Paul Durrant @ 2015-06-05 10:03 UTC (permalink / raw)
  To: Jan Beulich, Don Slutz, Don Slutz
  Cc: Tim (Xen.org), Kevin Tian, Keir (Xen.org),
	Ian Campbell, Jun Nakajima, Andrew Cooper, Eddie Dong,
	George Dunlap, xen-devel, Stefano Stabellini,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Ian Jackson,
	Boris Ostrovsky

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 05 June 2015 10:36
> To: Don Slutz; Don Slutz
> Cc: Aravind Gopalakrishnan; Suravee Suthikulpanit; Andrew Cooper; Ian
> Campbell; Paul Durrant; George Dunlap; Ian Jackson; Stefano Stabellini; Eddie
> Dong; Jun Nakajima; Kevin Tian; xen-devel@lists.xen.org; Boris Ostrovsky;
> Keir (Xen.org); Tim (Xen.org)
> Subject: Re: [Xen-devel] [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
> 
> >>> On 04.06.15 at 13:28, <don.slutz@gmail.com> wrote:
> > On 06/03/15 13:09, George Dunlap wrote:
> >> On 05/22/2015 04:50 PM, Don Slutz wrote:
> >>> This adds synchronization of the 6 vcpu registers (only 32bits of
> >>> them) that vmport.c needs between Xen and QEMU.
> >>>
> >>> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
> >>> fetch and put these 6 vcpu registers used by the code in vmport.c
> >>> and vmmouse.c
> >>>
> >>> In the tools, enable usage of QEMU's vmport code.
> >>>
> >>> The currently most useful VMware port support that QEMU has is the
> >>> VMware mouse support.  Xorg included a VMware mouse support that
> >>> uses absolute mode.  This makes using a mouse in X11 much nicer.
> >>>
> >>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> >>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> >> Sorry for coming a bit late to this party.  On a high level I think this
> >> is good, but there doesn't seem to be anything in here in particular
> >> that is vmware-specific.  Would it make more sense to give this a more
> >> generic name, and have it include all of the general-purpose registers?
> >
> > I do not know of a more general case.  The code here is very VMware "in
> > (%dx),%eax" specific.  The x86 architecture does not have an in/out case
> > where registers other than rax get used and/or changed that need to be
> > sent to QEMU.  There already is code to handle ins better than 1 byte at
> > a time.
> >
> > There is also a data size issue.  The register data sent over is smaller
> > than the ioreq data.  Therefore the number of vCPUs that are supported
> > is the same.  Changing the amount of data sent would affect this
> > (like requiring more than 1 page).
> 
> You may or may not have heard talk about there being an extension
> to the qemu interface in the works anyway that involves larger data
> items (xmm, ymm, and zmm data in particular) to be communicated to
> qemu. Depending on the time frame for this to arrive (Paul?) perhaps
> it would make sense to defer the changes here to then build on top of
> that instead of introducing a custom mechanism?
> 

The idea was to 'bounce' larger accesses via a guest RAM page (another magic page within the E820 reserved region just below 4G). Theoretically this should not require modifications to QEMU because it already handles multi-rep I/O to/from guest pages. Alas this proved not to be true, as QEMU translates all guest addresses (whether they are in the ioreq addr or data field) through its own memory map and hence anything within the E820 reserved region is treated as emulated.
So, I'm now working on fixing the current 'chunking' code in hvmemul_read/write to handle accesses wider than 8 bytes as multiple round-trips to QEMU. Less efficient, but I believe it will work... and if QEMU is modified, we could try bouncing again in future.
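
The chunking approach described above can be sketched as follows.  This
is only an illustration under stated assumptions: chunked_access(), the
chunk_io_fn callback and demo_read() are hypothetical names and are not
taken from hvmemul_read/write; the 8-byte limit is the one mentioned in
the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHUNK 8u   /* widest access one request carries (from the text) */

/* Hypothetical stand-in for a single round-trip to the device model. */
typedef int (*chunk_io_fn)(uint64_t addr, void *buf, size_t len, void *ctx);

/*
 * Split an access of `size` bytes at `addr` into pieces of at most
 * MAX_CHUNK bytes.  Returns the number of round-trips made, or -1 if
 * any of them fails.
 */
static int chunked_access(uint64_t addr, void *buf, size_t size,
                          chunk_io_fn io, void *ctx)
{
    uint8_t *p = buf;
    int trips = 0;

    assert(io != NULL);

    while ( size > 0 )
    {
        size_t len = size < MAX_CHUNK ? size : MAX_CHUNK;

        if ( io(addr, p, len, ctx) != 0 )
            return -1;

        addr += len;
        p += len;
        size -= len;
        trips++;
    }

    return trips;
}

/* Demo backend: fills each byte with the low 8 bits of its address. */
static int demo_read(uint64_t addr, void *buf, size_t len, void *ctx)
{
    uint8_t *p = buf;
    size_t i;

    (void)ctx;
    for ( i = 0; i < len; i++ )
        p[i] = (uint8_t)(addr + i);

    return 0;
}
```

A 20-byte access, for example, becomes three round-trips of 8, 8 and 4
bytes, which is where the "less efficient" cost comes from.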

  Paul

> Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-05  9:31           ` Jan Beulich
@ 2015-06-05 10:54             ` Ian Campbell
  2015-06-11 22:10               ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: Ian Campbell @ 2015-06-05 10:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Eddie Dong,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Don Slutz, Don Slutz, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On Fri, 2015-06-05 at 10:31 +0100, Jan Beulich wrote:
> > I'm talking about cost-benefits analysis.  What's the benefit of
> > accepting this patch, and is it worth the cost?
> 
> The basic idea of allowing guests originally having got installed on
> VMware to continue their lives on Xen is certainly something worth
> accepting some cost. It's really hard to judge whether in the case
> here things go too far (and that would equally apply to the hand
> crafted instruction decoding done in earlier versions of this series).

I can see the benefit in having a guest which was installed on vmware be
able to boot and work on Xen.

But AIUI this userspace vmware port thing is not needed for that basic
use case but instead goes farther and enables advanced features like
clipboard integration, which TBH I think we could consider living
without (especially considering the costs discussed here).

It would be really useful to see a comprehensive list of exactly what
guest ring3 access to the vmware port actually enables i.e. a list of
specific features which require it.

Ian.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 5/9] xen: Add vmware_port support
  2015-06-05  9:52   ` Jan Beulich
@ 2015-06-05 13:18     ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-05 13:18 UTC (permalink / raw)
  To: Jan Beulich, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Andrew Cooper, Tim Deegan, xen-devel, Eddie Dong,
	Aravind Gopalakrishnan, Jun Nakajima, Suravee Suthikulpanit,
	Boris Ostrovsky, Ian Jackson

On 06/05/15 05:52, Jan Beulich wrote:
>>>> On 22.05.15 at 17:50, <dslutz@verizon.com> wrote:
>> @@ -5805,6 +5808,12 @@ static int hvmop_set_param(
>>               break;
>>           if ( a.value > 1 )
>>               rc = -EINVAL;
>> +        /* Prevent nestedhvm with vmport */
>> +        if ( d->arch.hvm_domain.is_vmware_port_enabled )
>> +        {
>> +            rc = -EOPNOTSUPP;
>> +            break;
>> +        }
> Surrounding code avoiding the use of "break" makes the result look
> rather inconsistent. Please move this up immediately after the XSM
> check, or drop the "break".

Will do.

    -Don Slutz

> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-06-04 11:28     ` Don Slutz
  2015-06-05  9:35       ` Jan Beulich
@ 2015-06-08 10:05       ` George Dunlap
  2015-06-11 21:51         ` Don Slutz
  1 sibling, 1 reply; 48+ messages in thread
From: George Dunlap @ 2015-06-08 10:05 UTC (permalink / raw)
  To: Don Slutz, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/04/2015 12:28 PM, Don Slutz wrote:
> On 06/03/15 13:09, George Dunlap wrote:
>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>> This adds synchronization of the 6 vcpu registers (only 32bits of
>>> them) that vmport.c needs between Xen and QEMU.
>>>
>>> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
>>> fetch and put these 6 vcpu registers used by the code in vmport.c
>>> and vmmouse.c
>>>
>>> In the tools, enable usage of QEMU's vmport code.
>>>
>>> The currently most useful VMware port support that QEMU has is the
>>> VMware mouse support.  Xorg included a VMware mouse support that
>>> uses absolute mode.  This makes using a mouse in X11 much nicer.
>>>
>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>> Sorry for coming a bit late to this party.  On a high level I think this
>> is good, but there doesn't seem to be anything in here in particular
>> that is vmware-specific.  Would it make more sense to give this a more
>> generic name, and have it include all of the general-purpose registers?
> 
> I do not know of a more general case.  The code here is very VMware "in
> (%dx),%eax" specific.  The x86 architecture does not have an in/out case
> where registers other than rax get used and/or changed that need to be
> sent to QEMU.  There already is code to handle ins better than 1 byte at
> a time.

"VMWare-specific" doesn't mean VMWare is *currently* the only one that
uses it; it means that the data passed is so VMWare specific that VMWare
is likely to *always* be the only user.

All this additional functionality does (as I understand it) is ship over
some registers verbatim, and restore them on completion.  You could
imagine other functionality which might be implemented in qemu (or
another ioreq server) that could use functionality like that.

For example, this functionality might potentially be of use to the XenGT
guys, who need to emulate writes to some pages to shadow the graphics
card pagetables; or to someone wanting to implement some sort of
introspection feature that is meant to work in both KVM and Xen.

The only thing vmware-specific about this at the moment is the
particular subset of registers you're copying over.

> There is also a data size issue.  The register data sent over is smaller
> than the ioreq data.  Therefore the number of vCPUs that are supported
> is the same.  Changing the amount of data sent would affect this
> (like requiring more than 1 page).

Hmm... so it looks like the ioreq struct is about the size of 8
uint32_t's, or 4 uint64_ts.

So you could easily include eax, ebx, ecx, edx, esi, edi, eip, esp.

But it's not clear that you could do general-purpose emulation without
things like ebp, eflags, and so on.  Nor is it clear that it would be
useful to do only emulation for 32-bit instructions.

Would it be terribly bad to make it 4 pages long -- long enough to get
most of the 64-bit registers in there if wanted?

Or alternately, would it be possible to allow the contents of this page
to be changed in the future, perhaps with a domctl?
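
The size arithmetic above can be checked with a short sketch.  The
4096-byte page, the 128-vCPU count and the 16-register 64-bit slot are
assumptions chosen for illustration; both structs and both helpers are
hypothetical and do not come from the Xen headers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u   /* assumed page size */

/* Eight 32-bit registers: the set listed above. */
struct regs32_slot {
    uint32_t eax, ebx, ecx, edx, esi, edi, eip, esp;
};

/* Hypothetical wider slot: sixteen 64-bit general-purpose registers. */
struct regs64_slot {
    uint64_t gpr[16];
};

/* How many per-vCPU slots of the given size fit in one page. */
static unsigned int slots_per_page(unsigned int slot_bytes)
{
    assert(slot_bytes != 0 && slot_bytes <= PAGE_BYTES);
    return PAGE_BYTES / slot_bytes;
}

/* Pages needed to give each vCPU one slot (rounded up). */
static unsigned int pages_needed(unsigned int vcpus, unsigned int slot_bytes)
{
    unsigned int per_page = slots_per_page(slot_bytes);

    return (vcpus + per_page - 1) / per_page;
}
```

Under these assumptions the 32-bit slot is 32 bytes, so one page covers
128 vCPUs, while a 128-byte slot of sixteen 64-bit registers needs 4
pages for the same vCPU count.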

 -George

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT
  2015-06-08 10:05       ` George Dunlap
@ 2015-06-11 21:51         ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-11 21:51 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 06/08/15 06:05, George Dunlap wrote:
> On 06/04/2015 12:28 PM, Don Slutz wrote:
>> On 06/03/15 13:09, George Dunlap wrote:
>>> On 05/22/2015 04:50 PM, Don Slutz wrote:
>>>> This adds synchronization of the 6 vcpu registers (only 32bits of
>>>> them) that vmport.c needs between Xen and QEMU.
>>>>
>>>> This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
>>>> fetch and put these 6 vcpu registers used by the code in vmport.c
>>>> and vmmouse.c
>>>>
>>>> In the tools, enable usage of QEMU's vmport code.
>>>>
>>>> The currently most useful VMware port support that QEMU has is the
>>>> VMware mouse support.  Xorg included a VMware mouse support that
>>>> uses absolute mode.  This makes using a mouse in X11 much nicer.
>>>>
>>>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>>>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>>> Sorry for coming a bit late to this party.  On a high level I think this
>>> is good, but there doesn't seem to be anything in here in particular
>>> that is vmware-specific.  Would it make more sense to give this a more
>>> generic name, and have it include all of the general-purpose registers?
>>
>> I do not know of a more general case.  The code here is very VMware "in
>> (%dx),%eax" specific.  The x86 architecture does not have an in/out case
>> where registers other than rax get used and/or changed that need to be
>> sent to QEMU.  There already is code to handle ins better than 1 byte at
>> a time.
> 
> "VMWare-specific" doesn't mean VMWare is *currently* the only one that
> uses it; it means that the data passed is so VMWare specific that VMWare
> is likely to *always* be the only user.
> 

Yes.

> All this additional functionality does (as I understand it) is ship over
> some registers verbatim, and restore them on completion.  You could
> imagine other functionality which might be implemented in qemu (or
> another ioreq server) that could use functionality like that.
> 

The current way does not support multiple ioreq servers.  However any
one of the ioreq servers can be the one using it.  Currently QEMU
already gets it.  QEMU can say it does not want the page, but that is
code that currently does not exist.

> For example, this functionality might potentially be of use to the XenGT
> guys, who need to emulate writes to some pages to shadow the graphics
> card pagetables; or to someone wanting to implement some sort of
> introspection feature that is meant to work in both KVM and Xen.
> 

Not sure how KVM can use Xen's ioreq, but you seem to think there is a way.

> The only thing vmware-specific about this at the moment is the
> particular subset of registers you're copying over.
> 

It is also only 32bits of the 64bit registers....

>> There is also a data size issue.  The register data sent over is smaller
>> than the ioreq data.  Therefore the number of vCPUs that are supported
>> is the same.  Changing the amount of data sent would affect this
>> (like requiring more than 1 page).
> 
> Hmm... so it looks like the ioreq struct is about the size of 8
> uint32_t's, or 4 uint64_ts.
> 

Yes.

> So you could easily include eax, ebx, ecx, edx, esi, edi, eip, esp.
> 

Not clear eax is any benefit, it is already in ioreq by a different name
(really rax) req->data. eip and esp would be much harder to get correct.

> But it's not clear that you could do general-purpose emulation without
> things like ebp, eflags, and so on.  Nor is it clear that it would be
> useful to do only emulation for 32-bit instructions.

Mostly why I was going with VMware "only" and I/O instructions only.

> 
> Would it be terribly bad to make it 4 pages long -- long enough to get
> most of the 64-bit registers in there if wanted?
> 

The switch to 4 pages is harder.  There is code in QEMU that does the
page mapping.  So you need to add a way to map 4 pages in QEMU, wait for
it to get into upstream QEMU, then change Xen to use it.  The 1 page
code is already there (QEMU 2.3 and later).

> Or alternately, would it be possible to allow the contents of this page
> to be changed in the future, perhaps with a domctl?

The actual layout is mostly controlled (a copy is in the QEMU source) by
Xen include files.  I do not think that a QEMU built without the correct
Xen include files would work, but I would not want to bet that it would
never happen.

However the code to move registers around is in QEMU and does not come
from include files.  So adding more registers does not work.  I have not
looked at switching the registers to 64 bits in QEMU; that could perhaps
be done with an include file change.

If I am reading this correctly, you would like the code in QEMU to fully
be from xen include files.  This would still need design, coding, and
acceptance by QEMU before Xen can be changed to use it.

   -Don Slutz

> 
>  -George
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-05 10:54             ` Ian Campbell
@ 2015-06-11 22:10               ` Don Slutz
  2015-06-12  6:25                 ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: Don Slutz @ 2015-06-11 22:10 UTC (permalink / raw)
  To: Ian Campbell, Jan Beulich
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Eddie Dong,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Don Slutz, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 06/05/15 06:54, Ian Campbell wrote:
> On Fri, 2015-06-05 at 10:31 +0100, Jan Beulich wrote:
>>> I'm talking about cost-benefits analysis.  What's the benefit of
>>> accepting this patch, and is it worth the cost?
>>
>> The basic idea of allowing guests originally having got installed on
>> VMware to continue their lives on Xen is certainly something worth
>> accepting some cost. It's really hard to judge whether in the case
>> here things go too far (and that would equally apply to the hand
>> crafted instruction decoding done in earlier versions of this series).
> 
> I can see the benefit in having a guest which was installed on vmware be
> able to boot and work on Xen.
> 
> But AIUI this userspace vmware port thing is not needed for that basic
> use case but instead goes farther and enables advanced features like
> clipboard integration, which TBH I think we could consider living
> without (especially considering the costs discussed here).
> 
> It would be really useful to see a comprehensive list of exactly what
> guest ring3 access to the vmware port actually enables i.e. a list of
> specific features which require it.

Ok, I have done some testing.  Here is what I know:

Without ring3 support:

1) VMware tools will not install on Linux and Windows.
2) open-vm-tools (https://github.com/vmware/open-vm-tools) will not
install on Linux (however, it is not hard to change it to do so: a call
to iopl(3) needs to be added in a few places).

However, if VMware tools did get installed on the Windows disk bits
somehow, the VMware mouse support works.  Linux gets this because Xorg
detects and uses the VMware mouse under IOPL(3).


The following are available via QEMU 2.4 (if the patches get accepted)
and a functioning open-vm-tools:

3) The ability to perform virtual machine power operations gracefully is
missing. (code to access QEMU's from Xen to do this is missing).  I.E.
get windows to shutdown when requested!

4) Execution of VMware provided or user configured scripts in guests
during various power operations.

5) Clock synchronization between guests and hosts or client desktops.

6) Access to VMware guest info variables (code to access QEMU's from Xen
to do this is missing).  This can be used to customize guest operating
systems immediately after powering on virtual machines. It can also be
used to monitor the health of a guest.


   -Don Slutz



> 
> Ian.
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-11 22:10               ` Don Slutz
@ 2015-06-12  6:25                 ` Jan Beulich
  2015-06-12 12:52                   ` Don Slutz
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2015-06-12  6:25 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Don Slutz, xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 12.06.15 at 00:10, <dslutz@verizon.com> wrote:
> On 06/05/15 06:54, Ian Campbell wrote:
>> It would be really useful to see a comprehensive list of exactly what
>> guest ring3 access to the vmware port actually enables i.e. a list of
>> specific features which require it.
> 
> Ok, I have done some testing.  Here is what I know:
> 
> Without ring3 support:
> 
> 1) VMware tools will not install on Linux and Windows.
> 2) open-vm-tools (https://github.com/vmware/open-vm-tools) will not
> install on Linux (however, it is not hard to change it to do so: a call
> to iopl(3) needs to be added in a few places).
> 
> However, if VMware tools did get installed on the Windows disk bits
> somehow, the VMware mouse support works.  Linux gets this because Xorg
> detects and uses the VMware mouse under IOPL(3).

Now that tells us that the tools may not work, but not what
implications that has on the usability of the VM once migrated
to Xen. Them not installing is a non-issue afaict, since after
having moved the VM to Xen there shouldn't be a need to
install them anymore - either they've been there, or I don't
see why they would be needed _after_ the move. And you
say that the mouse works in both cases if the tools happen
to be there.

Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-12  6:25                 ` Jan Beulich
@ 2015-06-12 12:52                   ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-12 12:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Don Slutz, xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 06/12/15 02:25, Jan Beulich wrote:
>>>> On 12.06.15 at 00:10, <dslutz@verizon.com> wrote:
>> On 06/05/15 06:54, Ian Campbell wrote:
>>> It would be really useful to see a comprehensive list of exactly what
>>> guest ring3 access to the vmware port actually enables i.e. a list of
>>> specific features which require it.
>>
>> Ok, I have done some testing.  Here is what I know:
>>
>> Without ring3 support:
>>
>> 1) VMware tools will not install on Linux and Windows.
>> 2) open-vm-tools (https://github.com/vmware/open-vm-tools) will not
>> install on Linux (however, it is not hard to change it to do so: a call
>> to iopl(3) needs to be added in a few places).
>>
>> However, if VMware tools did get installed on the Windows disk bits
>> somehow, the VMware mouse support works.  Linux gets this because Xorg
>> detects and uses the VMware mouse under IOPL(3).
> 
> Now that tells us that the tools may not work, but not what
> implications that has on the usability of the VM once migrated
> to Xen. Them not installing is a non-issue afaict, since after
> having moved the VM to Xen there shouldn't be a need to
> install them anymore - either they've been there, or I don't
> see why they would be needed _after_ the move. And you
> say that the mouse works in both cases if the tools happen
> to be there.
> 

The VMware tools service will start but does not work.  It adds to the
event log:

[warning][vmusr:vmtoolsd] The vmuser service needs to run inside a
virtual machine.

And the available features of VMware tools are disabled:

   1) The ability to perform virtual machine power operations gracefully
is missing. (code to access QEMU's from Xen to do this is  missing).
I.E. get windows to shutdown when requested!

   2) Execution of VMware provided or user configured scripts in guests
during various power operations.

   3) Clock synchronization between guests and hosts or client desktops.

   4) Access to VMware guest info variables (code to access QEMU's from
Xen to do this is missing).  This can be used to customize guest
operating systems immediately after powering on virtual machines. It can
also be used to monitor the health of a guest.


The reason to install them after is to get the VMware mouse driver on
Windows.  This mouse driver works much better on Windows when there is
higher network latency.

   -Don Slutz

> Jan
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-05-22 15:50 ` [PATCH v11 6/9] xen: Add ring 3 " Don Slutz
  2015-06-03 15:26   ` George Dunlap
@ 2015-06-23 16:14   ` Jan Beulich
  2015-06-26 14:54     ` Don Slutz
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2015-06-23 16:14 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 22.05.15 at 17:50, <dslutz@verizon.com> wrote:
> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
> to port 0x5658 specially.  Note: since many operations return data
> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
> "in (%dx),%al" will still do things, only AL part of EAX will be
> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
> unchanged.
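
The operand-size rule in the summary above can be modelled without
touching real hardware.  This is a sketch only: merge_in_result(),
eax_after_out() and is_backdoor_port() are hypothetical helper names,
and the code models the register effect rather than the patch's
emulation path.

```c
#include <assert.h>
#include <stdint.h>

#define VMWARE_BACKDOOR_PORT 0x5658u   /* port named in the summary */

/* Merge an IN result of 1, 2 or 4 bytes into the previous EAX value. */
static uint32_t merge_in_result(uint32_t old_eax, uint32_t value,
                                unsigned int bytes)
{
    uint32_t mask;

    assert(bytes == 1 || bytes == 2 || bytes == 4);

    /* Only the low `bytes` bytes of EAX are replaced. */
    mask = (bytes == 4) ? 0xffffffffu : ((1u << (bytes * 8)) - 1u);

    return (old_eax & ~mask) | (value & mask);
}

/* OUT leaves EAX unchanged, whatever the operand size. */
static uint32_t eax_after_out(uint32_t old_eax)
{
    return old_eax;
}

/* True when the port number in DX selects the special port. */
static int is_backdoor_port(uint16_t port)
{
    return port == VMWARE_BACKDOOR_PORT;
}
```

So a one-byte "in (%dx),%al" that returns 0xaabbccdd into an EAX of
0x11223344 leaves 0x112233dd, which is why the 32-bit form is the one
to use when an operation returns data in EAX.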
> 
> This instruction is allowed to be used from ring 3.  To
> support this the vmexit for GP needs to be enabled.  I have not
> fully tested that nested HVM is doing the right thing for this.
> 
> Enable no-fault of pio in x86_emulate for VMware port
> 
> Also adjust the emulation registers after doing a VMware
> backdoor operation.
> 
> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
> handler.
> 
> Some of the best info is at:
> 
> https://sites.google.com/site/chitchatvmback/backdoor 
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>

As there don't seem to be enough convincing arguments for this to
be worthwhile, I'm going to drop this and subsequent patches from
my list of things to look at. Would you mind following George's (at
least I think it was him) advice to post a shortened series with all
review comments taken care of, so that at least the ring 0 pieces
could go in for 4.6?

Thanks, Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v11 6/9] xen: Add ring 3 vmware_port support
  2015-06-23 16:14   ` Jan Beulich
@ 2015-06-26 14:54     ` Don Slutz
  0 siblings, 0 replies; 48+ messages in thread
From: Don Slutz @ 2015-06-26 14:54 UTC (permalink / raw)
  To: Jan Beulich, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Andrew Cooper, Tim Deegan, xen-devel, Eddie Dong,
	Aravind Gopalakrishnan, Jun Nakajima, Suravee Suthikulpanit,
	Boris Ostrovsky, Ian Jackson

On 06/23/15 12:14, Jan Beulich wrote:
>>>> On 22.05.15 at 17:50, <dslutz@verizon.com> wrote:
>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>> to port 0x5658 specially.  Note: since many operations return data
>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>> "in (%dx),%al" will still do things, only AL part of EAX will be
>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>> unchanged.
>>
>> This instruction is allowed to be used from ring 3.  To
>> support this the vmexit for GP needs to be enabled.  I have not
>> fully tested that nested HVM is doing the right thing for this.
>>
>> Enable no-fault of pio in x86_emulate for VMware port
>>
>> Also adjust the emulation registers after doing a VMware
>> backdoor operation.
>>
>> Add new routine hvm_emulate_one_gp() to be used by the #GP fault
>> handler.
>>
>> Some of the best info is at:
>>
>> https://sites.google.com/site/chitchatvmback/backdoor
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> As there don't seem to be enough convincing arguments for this to
> be worthwhile, I'm going to drop this and subsequent patches from
> my list of things to look at. Would you mind following George's (at
> least I think it was him) advice to post a shortened series with all
> review comments taken care of, so that at least the ring 0 pieces
> could go in for 4.6?

Sure, I was just about to post v12 (the last check was a rebase, which failed):

commit 65bb47fb732265f704d4ec6616076ec74771a6eb
Author: Paul Durrant <paul.durrant@citrix.com>
Date:   Tue Jun 23 18:08:32 2015 +0200

Needs more than a simple merge.  Will post when ready.

    -Don Slutz


> Thanks, Jan

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2015-06-26 14:54 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
2015-05-22 15:50 [PATCH v11 0/9] Xen VMware tools support Don Slutz
2015-05-22 15:50 ` [PATCH v11 1/9] tools: Add vga=vmware Don Slutz
2015-05-22 15:50 ` [PATCH v11 2/9] xen: Add support for VMware cpuid leaves Don Slutz
2015-05-22 15:50 ` [PATCH v11 3/9] tools: Add vmware_hwver support Don Slutz
2015-06-03 14:53   ` George Dunlap
2015-06-04 15:15     ` Ian Campbell
2015-06-04 15:46       ` Don Slutz
2015-06-04 15:17   ` Ian Campbell
2015-06-04 15:59     ` Don Slutz
2015-05-22 15:50 ` [PATCH v11 4/9] vmware: Add VMware provided include file Don Slutz
2015-05-22 15:50 ` [PATCH v11 5/9] xen: Add vmware_port support Don Slutz
2015-06-05  9:52   ` Jan Beulich
2015-06-05 13:18     ` Don Slutz
2015-05-22 15:50 ` [PATCH v11 6/9] xen: Add ring 3 " Don Slutz
2015-06-03 15:26   ` George Dunlap
2015-06-03 15:58     ` Andrew Cooper
2015-06-03 16:23       ` George Dunlap
2015-06-03 16:40         ` Andrew Cooper
2015-06-03 17:00           ` George Dunlap
2015-06-03 16:41         ` Don Slutz
2015-06-03 16:58           ` George Dunlap
2015-06-04 12:37             ` Don Slutz
2015-06-04 14:14               ` George Dunlap
2015-06-04 16:17                 ` Don Slutz
2015-06-03 16:36       ` Don Slutz
2015-06-03 16:50         ` George Dunlap
2015-06-05  9:31           ` Jan Beulich
2015-06-05 10:54             ` Ian Campbell
2015-06-11 22:10               ` Don Slutz
2015-06-12  6:25                 ` Jan Beulich
2015-06-12 12:52                   ` Don Slutz
2015-06-23 16:14   ` Jan Beulich
2015-06-26 14:54     ` Don Slutz
2015-05-22 15:50 ` [PATCH v11 7/9] tools: Add " Don Slutz
2015-06-03 17:06   ` George Dunlap
2015-06-04 15:49     ` Ian Campbell
2015-06-04 16:09       ` Don Slutz
2015-06-04 15:20   ` Ian Campbell
2015-05-22 15:50 ` [PATCH v11 8/9] Add IOREQ_TYPE_VMWARE_PORT Don Slutz
2015-06-03 17:09   ` George Dunlap
2015-06-04 11:28     ` Don Slutz
2015-06-05  9:35       ` Jan Beulich
2015-06-05 10:03         ` Paul Durrant
2015-06-08 10:05       ` George Dunlap
2015-06-11 21:51         ` Don Slutz
2015-05-22 15:50 ` [PATCH v11 9/9] Add xentrace to vmware_port Don Slutz
2015-06-04 11:20   ` George Dunlap
2015-06-04 12:31     ` Don Slutz
