* [PATCH for-4.5 v6 00/16] Xen VMware tools support
@ 2014-09-20 18:07 Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
                   ` (16 more replies)
  0 siblings, 17 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Comments on v3, v4, v5, v6:
  George Dunlap:
    Is there any reason not to merge 05/16 with 03/16?
      The reason I have is that v3 03/16 only contains new files. 2
      from VMware and 1 to allow use of the VMware files.  I added
      xen/arch/x86/hvm/vmware/includeCheck.h at the request of
      Konrad Wilk.

      This patch has many style and white space issues, so I want it
      as a separate patch to make clear which files do not meet the
      coding style, and why and where they came from.

Changes v5 to v6:
  Boris Ostrovsky & Jan Beulich
    #4 "xen: Add vmware_port support":
    #6 "xen: Convert vmware_port to xentrace usage":
    There is an issue with reading instruction bytes more than once.
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.
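
To illustrate the "up to 15 bytes" approach, here is a minimal
sketch (not the code in this series) that looks for one of the four
DX-addressed port I/O opcodes after any legacy prefixes.  The helper
names are hypothetical, the opcode values are the architectural
ones, and REX prefixes are deliberately ignored to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural maximum length of one x86 instruction. */
#define MAX_INSN_BYTES 15

/* One-byte opcodes for the DX-addressed port I/O forms. */
#define OPC_INB_DX  0xEC /* in  %dx, %al  */
#define OPC_INL_DX  0xED /* in  %dx, %eax */
#define OPC_OUTB_DX 0xEE /* out %al, %dx  */
#define OPC_OUTL_DX 0xEF /* out %eax, %dx */

/* Hypothetical helper: legacy prefixes that may precede the opcode. */
static int is_legacy_prefix(uint8_t b)
{
    switch ( b )
    {
    case 0x26: case 0x2E: case 0x36: case 0x3E: /* segment overrides */
    case 0x64: case 0x65:                       /* FS/GS overrides */
    case 0x66: case 0x67:                       /* operand/address size */
    case 0xF0: case 0xF2: case 0xF3:            /* LOCK/REPNE/REP */
        return 1;
    }
    return 0;
}

/*
 * Return the instruction length if the bytes decode to an IN or OUT
 * through %dx, or 0 otherwise.  Never looks past MAX_INSN_BYTES.
 */
static unsigned int port_io_insn_len(const uint8_t *bytes, size_t avail)
{
    size_t i = 0;

    assert(bytes != NULL);
    if ( avail > MAX_INSN_BYTES )
        avail = MAX_INSN_BYTES;

    /* Skip any prefixes in front of the opcode byte. */
    while ( i < avail && is_legacy_prefix(bytes[i]) )
        i++;

    if ( i < avail && bytes[i] >= OPC_INB_DX && bytes[i] <= OPC_OUTL_DX )
        return (unsigned int)(i + 1);

    return 0;
}
```

Because the bytes are fetched once into a buffer and decoded from
there, the decoder never has to read guest memory a second time.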

Changes v4 to v5:
  Re-tagged the optional patches.

  Added debug=y build checking that vmx is defining
  VM_EXIT_INTR_ERROR_CODE.

  Boris Ostrovsky:
    #1 "xen: Add support for VMware cpuid leaves":
      Given how is_viridian and is_vmware are defined I think '||' is more
      appropriate.
        Fixed.
    #4 "xen: Add is_vmware_port_enabled":
      we should make sure that svm_vmexit_gp_intercept is not executed for
      any other guest.
        Added an ASSERT on is_vmware_port_enabled.
      magic integers?
        Added #define for them.
    #6 "xen: Convert vmware_port to xentrace usage":
      exitinfo1 is used twice.
        Fixed.
    #7 "tools: Convert vmware_port to xentrace usage":
      'bytes = 0x%(2)d' or 'bytes = %(2)d' ?
        Fixed.
    #8 "xen: Add limited support of VMware's hyper-call rpc":
      PV vs. HVM vs. PVH. So probably 'if(is_hvm_vcpu)'?
        I see no reason to exclude PVH.   Will change to has_hvm_container_vcpu
    #11 "Add live migration of VMware's hyper-call":
      You ASSERTed that vg->key_len is 1 so you may not need the 'if'.
        That is an ASSERT(sizeof ...), not just an ASSERT; not changed.
      Use real errno, not -1.
        Fixed.
      No ASSERT in vmport_load_domain_ctxt
        Added.

  Jan Beulich & Boris Ostrovsky:
    #8 "xen: Add limited support of VMware's hyper-call rpc":
      The names of all three functions are bogus.
        removed static support routines.
        Also changed in #1.

  Andrew Cooper:
    #2 "tools: Add vmware_hw support":
      Anything looking for Xen according to the Xen cpuid instructions...
        Adjusted doc to new wording.
    #4 "xen: Add is_vmware_port_enabled":
      I am fairly certain that you need some brackets here.
        Added brackets.

  Jan Beulich & Andrew Cooper:
    #1 "xen: Add support for VMware cpuid leaves":
      This hunk is unrelated, but is perhaps something better fixed.
        Added to commit message.
      include <xen/types.h> (IIRC) please.
        Done.
      At least 1 pair of brackets please, especially as the placement of
      brackets affects the result of this particular calculation.
        Switch to "1000000ull / APIC_BUS_CYCLE_NS"      


Changes v3 to v4:
  Ian Campbell:
    Report on both viridian and vmware_hw set.
    Added LIBXL_VGA_INTERFACE_TYPE_VMWARE (vga=vmware).

  Andrew Cooper:
    Add doc for hypervisor-cpuid.

  Boris Ostrovsky:
    Changing regs->error_code may not be a good idea.
      Dropped this.
    
  Jan Beulich & Boris Ostrovsky:
    Only enable vmexit for GP when vmware_port is set.
      Done.


Changes v2 to v3:

  Add optional unit test tools.
  Re-worked split of changes.

  Jan Beulich:
    for #0:
      I don't think you should be adding a new file in hvm/ _and_ a new
      subdirectory.
        Moved all files to hvm/vmware that contain code.
    for old #1 (now #1 & #2):
      Is there really a point in enabling both Viridian and VMware extensions?
        I still think so.
      hvmloader change: This needs an explanation
        Dropped as not needed now.
      Can you make vmware_hw similar to Viridian, returning success when
      setting the value to what it already is.
        Done.
      You don't seem to be using sub_idx: ...
        Dropped.
      Extra changes...
        Dropped.
    for old #2 (now #3):
      ... these guards have the (theoretical at this point) risk of clashing
      ... the patch is obviously incomplete without this header...
        Did not fix any of these issues.  I will stick with this needs
        to be a 2nd patch that changes the include files to better fit
        in Xen coding.  For now these files are in a sub directory
        which is not part of the normal include search.
        Moved the includeCheck.h file into this patch.
    for old #3 (now #4, #5, #6, #7, #8, #9, #10, #11)
      As I think was said on v1 already - this should be split into smaller
      pieces ...
        Done.
      All this would very likely better go into a separate function placed in
      vmport.c.
        Moved most of the code into vmport.c or vmport_rpc.c.
      In any event I'm rather uncomfortable about vmware_port getting
      enabled unconditionally, ...
        Added vmware_port (done in new patches #4, #5) as an xl.cfg
        option.
      You'll have to go through and fix coding style issues.
        I think I have found all these, but since they do not stand out
        for me, let me know of any left.
      "MAKE_INSTR(IN," name is ambiguous.
        Added all 4 opcodes for in and out that can access this port: INB_DX,
        INL_DX, OUTB_DX, OUTL_DX.
      A VMX-specific function shouldn't be named this way...
        Added new common routine vmport_gp_check() that is called from
        both vmx.c and svm.c which is where all the logic about checking
        for IN and OUT is done.
        Also fixed naming and added static.
      Ah, here we go (as to using HVM_DBG_LOG()): Isn't this _way_ too
      fine grained?
        I have reduced the number of bits used, partially by switching
        some to xentrace (new patch #6 and #7).
      Right, and zero is an indication that it wasn't found. Also I just
      noticed there's a gdprintk() in that event, which for all other ...
        Made the gdprintk() optional.

End of v3 changes.

This is a small part of the changes needed to allow Linux, Windows
(and other) guests that were built on VMware to run unchanged on
Xen.

This small part is the start of Xen support for the VMware backdoor
I/O port, which is how VMware tools (a standard addition installed
on a guest) communicate with the hypervisor.
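
As a rough illustration of what a guest loads into its registers
before touching the backdoor port, here is a minimal sketch.  The
constants are the values I understand VMware's backdoor_def.h to
define; the struct and helper names are hypothetical, and the actual
"in" instruction is left out so the example stays runnable anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as I understand them from VMware's backdoor_def.h. */
#define BDOOR_MAGIC          0x564D5868u /* "VMXh" */
#define BDOOR_PORT           0x5658u     /* "VX" */
#define BDOOR_CMD_GETVERSION 10u

/* Hypothetical container for the registers used by a backdoor call. */
struct bdoor_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Fill in the register values for a backdoor command. */
static void bdoor_prepare(struct bdoor_regs *r, uint32_t cmd)
{
    assert(r != NULL);
    r->eax = BDOOR_MAGIC; /* identifies the access as a backdoor call */
    r->ebx = UINT32_MAX;  /* overwritten by the hypervisor on success */
    r->ecx = cmd;         /* command number */
    r->edx = BDOOR_PORT;  /* port for the IN instruction */
}

/* A GETVERSION reply is expected to hand the magic back in EBX. */
static int bdoor_version_reply_ok(const struct bdoor_regs *r)
{
    assert(r != NULL);
    return r->ebx == BDOOR_MAGIC;
}
```

A guest would follow bdoor_prepare() with an "in" from %dx; on a
hypervisor without the backdoor that access faults or returns
nothing useful, so bdoor_version_reply_ok() stays false.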

I picked this subset to start with because it only has changes in
Xen.

Some of this code is already in QEMU and so KVM has some of this
already.  QEMU supported backdoor commands include VMware mouse
support.  A later patch set exists that links these changes, new
code and Xen changes to QEMU to provide VMware mouse support under
Xen.  The important part is that VMware mouse is an absolute
position mouse and so network delays do not affect usage of the
virtual mouse.

For example from the guest:

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
No value found
[root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel short"

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
short
[root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel long222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000joel"

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key1"
data1
[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key2"
No value found
[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.key2"
data2
[root@C63-min-tools ~]# 


Most of this code has been reverse engineered by looking at
source code for Linux and open VMware tools.

http://open-vm-tools.sourceforge.net


Changes RFC to v2:

Jan Beulich:
  Add xen/arch/x86/hvm/vmware.c for cpuid_vmware_leaves
  Fewer patches

Andrew Cooper:
  use the proper constant for apic_khz
  Follow 839b966e3f587bbb1a0d954230fb3904330dccb6 style changes.
  Changed HVM_PARAM_VMWARE_HW to write once (make is_vmware_domain()
    more static).
  Dropped vmport status stuff.
  Added checks for xzalloc() having failed.
  You should include backdoor_def.h ...
     Everything I tried did not work better.  So I did not
     change VMPORT_PORT and BDOOR_PORT being the same value.
     I did not try to adjust VMware's include file backdoor_def.h
     to work in other xen source files.
  Switching to s_time_t is not valid. get_sec() is defined:
    unsigned long get_sec(void);
  and so my uses of it should be using unsigned long.  However
  since that is not a fixed width type, I used the uint64_t
  data type which is almost the same, but does allow the 32 bit
  build of libxc, libxl to do the correct thing.


Konrad Rzeszutek Wilk:
  Please don't include the address. It should be, etc
      about the VMware provided include files.
    I went with no changes to these files.  Even if the files should
    be changed to match xen coding style, etc I still feel that the
    original ones should be added via a patch, and then adjusted in a
    2nd patch.
  Can you use XenBus?
    I would say no.  XenBus (and XenStore) is about domain to domain
    communication.  This is about VMware's hyper-call and providing
    very low speed access to VMware's guest info.

Olaf Hering:
   Dropped changing of bios-strings.  Still needs some documentation
   noting that this may need to be done in a tool stack or set of
   commands.


Boris Ostrovsky:
  Use svm_nextrip_insn_length()
    Looks like __get_instruction_length() does this, so switched to
    __get_instruction_length().
 
RFC:

See

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

for info on detecting VMware.

Linux does not follow this exactly.  It checks for CPUID first.  If
that fails, it checks for SMBIOS containing "VMware" (not VMware- or
VMW).
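
The CPUID check amounts to reading leaf 0x40000000 and comparing the
12 bytes in EBX, ECX, EDX against "VMwareVMware".  A minimal sketch
of that comparison follows; the function names are hypothetical and
the register values in the usage note are the ones this series
returns for vmware_hw >= 7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Unpack the hypervisor vendor signature from EBX, ECX, EDX.
 * Each register carries four characters, lowest byte first.
 */
static void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
                         char sig[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };
    size_t i;

    assert(sig != NULL);
    for ( i = 0; i < 12; i++ )
        sig[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
    sig[12] = '\0';
}

/* Return 1 if the registers spell out the VMware signature. */
static int is_vmware_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[13];

    hv_signature(ebx, ecx, edx, sig);
    return strcmp(sig, "VMwareVMware") == 0;
}
```

With EBX = 0x61774d56, ECX = 0x4d566572 and EDX = 0x65726177 the
unpacked string is "VMwareVMware", so is_vmware_signature() returns 1.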

So this patch set provides:

        SMBIOS -- Add string VMware-
        CPUID -- Add VMware's CPUID (Note: currently HyperV (viridian support) breaks this check.)
        Add the magic VMware port
            Allow VMware tools poweroff and reboot
            Enable access to VMware's guest info
            Provide the VMware tools build number


Don Slutz (16):
  xen: Add support for VMware cpuid leaves
  tools: Add vmware_hw support
  vmware: Add VMware provided include files.
  xen: Add vmware_port support
  tools: Add vmware_port support
  xen: Convert vmware_port to xentrace usage
  tools: Convert vmware_port to xentrace usage
  xen: Add limited support of VMware's hyper-call rpc
  tools: Add limited support of VMware's hyper-call rpc
  Add VMware tool's triggers
  Add live migration of VMware's hyper-call RPC
  Add dump of HVM_SAVE_CODE(VMPORT) to xen-hvmctx.
  Add xen-hvm-param
  Add xen-vmware-guestinfo
  Add xen-list-vmware-guestinfo
  Add xen-hvm-send-trigger

 .gitignore                                   |    4 +
 docs/man/xl.cfg.pod.5                        |   28 +-
 docs/misc/hypervisor-cpuid.markdown          |   28 +
 tools/libxc/xc_domain.c                      |  115 ++
 tools/libxc/xc_domain_restore.c              |   14 +
 tools/libxc/xc_domain_save.c                 |   11 +
 tools/libxc/xenctrl.h                        |   24 +
 tools/libxc/xg_save_restore.h                |    2 +
 tools/libxl/libxl.h                          |   15 +
 tools/libxl/libxl_create.c                   |    6 +-
 tools/libxl/libxl_dm.c                       |   10 +-
 tools/libxl/libxl_dom.c                      |    2 +
 tools/libxl/libxl_types.idl                  |    3 +
 tools/libxl/xl_cmdimpl.c                     |   12 +-
 tools/misc/Makefile                          |   16 +-
 tools/misc/xen-hvm-param.c                   |  164 +++
 tools/misc/xen-hvm-send-trigger.c            |  103 ++
 tools/misc/xen-hvmctx.c                      |  229 ++++
 tools/misc/xen-list-vmware-guestinfo.c       |   88 ++
 tools/misc/xen-vmware-guestinfo.c            |   97 ++
 tools/xentrace/formats                       |   13 +
 xen/arch/x86/domain.c                        |    2 +
 xen/arch/x86/domctl.c                        |   34 +
 xen/arch/x86/hvm/Makefile                    |    3 +-
 xen/arch/x86/hvm/hvm.c                       |   79 ++
 xen/arch/x86/hvm/svm/emulate.c               |    2 +-
 xen/arch/x86/hvm/svm/svm.c                   |   55 +
 xen/arch/x86/hvm/svm/vmcb.c                  |    2 +
 xen/arch/x86/hvm/vmware/Makefile             |    3 +
 xen/arch/x86/hvm/vmware/backdoor_def.h       |  167 +++
 xen/arch/x86/hvm/vmware/cpuid.c              |   89 ++
 xen/arch/x86/hvm/vmware/guest_msg_def.h      |   87 ++
 xen/arch/x86/hvm/vmware/includeCheck.h       |   17 +
 xen/arch/x86/hvm/vmware/vmport.c             |  339 ++++++
 xen/arch/x86/hvm/vmware/vmport_rpc.c         | 1580 ++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmcs.c                  |    2 +
 xen/arch/x86/hvm/vmx/vmx.c                   |   96 +-
 xen/arch/x86/hvm/vmx/vvmx.c                  |   14 +
 xen/arch/x86/traps.c                         |    8 +-
 xen/common/domctl.c                          |    3 +
 xen/include/asm-x86/hvm/domain.h             |    7 +
 xen/include/asm-x86/hvm/hvm.h                |    3 +
 xen/include/asm-x86/hvm/io.h                 |    2 +-
 xen/include/asm-x86/hvm/svm/emulate.h        |    1 +
 xen/include/asm-x86/hvm/trace.h              |   45 +
 xen/include/asm-x86/hvm/vmport.h             |   89 ++
 xen/include/asm-x86/hvm/vmware.h             |   33 +
 xen/include/public/arch-x86/hvm/save.h       |   39 +-
 xen/include/public/arch-x86/hvm/vmporttype.h |  118 ++
 xen/include/public/domctl.h                  |    6 +
 xen/include/public/hvm/hvm_op.h              |   18 +
 xen/include/public/hvm/params.h              |    8 +-
 xen/include/public/trace.h                   |   12 +
 xen/include/xen/sched.h                      |    3 +
 54 files changed, 3933 insertions(+), 17 deletions(-)
 create mode 100644 docs/misc/hypervisor-cpuid.markdown
 create mode 100644 tools/misc/xen-hvm-param.c
 create mode 100644 tools/misc/xen-hvm-send-trigger.c
 create mode 100644 tools/misc/xen-list-vmware-guestinfo.c
 create mode 100644 tools/misc/xen-vmware-guestinfo.c
 create mode 100644 xen/arch/x86/hvm/vmware/Makefile
 create mode 100644 xen/arch/x86/hvm/vmware/backdoor_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
 create mode 100644 xen/arch/x86/hvm/vmware/guest_msg_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/includeCheck.h
 create mode 100644 xen/arch/x86/hvm/vmware/vmport.c
 create mode 100644 xen/arch/x86/hvm/vmware/vmport_rpc.c
 create mode 100644 xen/include/asm-x86/hvm/vmport.h
 create mode 100644 xen/include/asm-x86/hvm/vmware.h
 create mode 100644 xen/include/public/arch-x86/hvm/vmporttype.h

-- 
1.8.4


* [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 11:49   ` Andrew Cooper
  2014-09-24 14:33   ` George Dunlap
  2014-09-20 18:07 ` [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support Don Slutz
                   ` (15 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This is done by adding HVM_PARAM_VMWARE_HW. It is set to the VMware
virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the
code only checks for == 0 or != 0.

If non-zero then
  Return VMware's cpuid leaves.

The support of hypervisor cpuid leaves has not been agreed to.

Microsoft Hyper-V (AKA viridian) currently must be at 0x40000000.

VMware currently must be at 0x40000000.

KVM currently must be at 0x40000000 (from Seabios).

Xen can be found at the first otherwise unused 0x100 aligned
offset between 0x40000000 and 0x40010000.

http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

http://lwn.net/Articles/301888/
  Attempted to get this cleaned up.

So based on this, I picked the order:

Xen at 0x40000000 or
Viridian or VMware at 0x40000000 and Xen at 0x40000100

If both Viridian and VMware are selected, report an error.
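
From the guest side, the ordering above means Xen is found by
scanning the 0x100 aligned bases for the "XenVMMXenVMM" signature.
The sketch below shows that scan under stated assumptions: CPUID is
passed in as a function pointer so the example runs anywhere, and the
two fake_cpuid_* functions are hypothetical stand-ins that model a
VMware-enabled guest and a plain guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "XenVMMXenVMM" as it appears in EBX, ECX, EDX. */
#define XEN_SIG_EBX 0x566e6558u
#define XEN_SIG_ECX 0x65584d4du
#define XEN_SIG_EDX 0x4d4d566eu

/* regs[] holds EAX, EBX, ECX, EDX in that order. */
typedef void (*cpuid_fn)(uint32_t leaf, uint32_t regs[4]);

/* Return the first 0x100 aligned base carrying the Xen signature, or 0. */
static uint32_t find_xen_base(cpuid_fn cpuid)
{
    uint32_t base;

    assert(cpuid != NULL);
    for ( base = 0x40000000u; base < 0x40010000u; base += 0x100u )
    {
        uint32_t regs[4] = { 0, 0, 0, 0 };

        cpuid(base, regs);
        if ( regs[1] == XEN_SIG_EBX && regs[2] == XEN_SIG_ECX &&
             regs[3] == XEN_SIG_EDX )
            return base;
    }
    return 0;
}

/* Hypothetical stand-in: VMware at 0x40000000, Xen at 0x40000100. */
static void fake_cpuid_vmware_guest(uint32_t leaf, uint32_t regs[4])
{
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    if ( leaf == 0x40000000u )
    {
        regs[0] = 0x40000010u; /* largest VMware leaf */
        regs[1] = 0x61774d56u; /* "VMwa" */
        regs[2] = 0x4d566572u; /* "reVM" */
        regs[3] = 0x65726177u; /* "ware" */
    }
    else if ( leaf == 0x40000100u )
    {
        regs[0] = leaf;        /* placeholder for Xen's largest leaf */
        regs[1] = XEN_SIG_EBX;
        regs[2] = XEN_SIG_ECX;
        regs[3] = XEN_SIG_EDX;
    }
}

/* Hypothetical stand-in: neither Viridian nor VMware, Xen at the base. */
static void fake_cpuid_plain_guest(uint32_t leaf, uint32_t regs[4])
{
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    if ( leaf == 0x40000000u )
    {
        regs[0] = leaf;        /* placeholder for Xen's largest leaf */
        regs[1] = XEN_SIG_EBX;
        regs[2] = XEN_SIG_ECX;
        regs[3] = XEN_SIG_EDX;
    }
}
```

With the VMware-style stand-in the scan returns 0x40000100; with the
plain one it returns 0x40000000.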

Since I need to change xen/arch/x86/hvm/Makefile; also add
a newline at end of file.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v5:
      Given how is_viridian and is_vmware are defined I think '||' is more
      appropriate.
        Fixed.
      The names of all three functions are bogus.
        removed static support routines.
      This hunk is unrelated, but is perhaps something better fixed.
        Added to commit message.
      include <xen/types.h> (IIRC) please.
        Done.
      At least 1 pair of brackets please, especially as the placement of
      brackets affects the result of this particular calculation.
        Switch to "1000000ull / APIC_BUS_CYCLE_NS"      

 xen/arch/x86/hvm/Makefile        |  3 +-
 xen/arch/x86/hvm/hvm.c           | 32 +++++++++++++++
 xen/arch/x86/hvm/vmware/Makefile |  1 +
 xen/arch/x86/hvm/vmware/cpuid.c  | 89 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c             |  8 +++-
 xen/include/asm-x86/hvm/hvm.h    |  3 ++
 xen/include/asm-x86/hvm/vmware.h | 33 +++++++++++++++
 xen/include/public/hvm/params.h  |  5 ++-
 8 files changed, 170 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmware/Makefile
 create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
 create mode 100644 xen/include/asm-x86/hvm/vmware.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index eea5555..77598a6 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,5 +1,6 @@
 subdir-y += svm
 subdir-y += vmx
+subdir-y += vmware
 
 obj-y += asid.o
 obj-y += emulate.o
@@ -22,4 +23,4 @@ obj-y += vlapic.o
 obj-y += vmsi.o
 obj-y += vpic.o
 obj-y += vpt.o
-obj-y += vpmu.o
\ No newline at end of file
+obj-y += vpmu.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bb45593..f3cf566 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -57,6 +57,7 @@
 #include <asm/hvm/cacheattr.h>
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/vmware.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -4228,6 +4229,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
     if ( cpuid_viridian_leaves(input, eax, ebx, ecx, edx) )
         return;
 
+    if ( cpuid_vmware_leaves(input, eax, ebx, ecx, edx) )
+        return;
+
     if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
         return;
 
@@ -5555,6 +5559,11 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
                     rc = -EINVAL;
                 break;
             case HVM_PARAM_VIRIDIAN:
+                if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] )
+                {
+                    rc = -EXDEV;
+                    break;
+                }
                 if ( a.value > 1 )
                     rc = -EINVAL;
                 break;
@@ -5692,6 +5701,29 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
                 break;
             }
+            case HVM_PARAM_VMWARE_HW:
+                /*
+                 * This should only ever be set non-zero one time by
+                 * the tools and is read only by the guest.
+                 */
+                if ( d == current->domain )
+                {
+                    rc = -EPERM;
+                    break;
+                }
+                if ( d->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN] )
+                {
+                    rc = -EXDEV;
+                    break;
+                }
+                if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] &&
+                     d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] !=
+                     a.value )
+                {
+                    rc = -EEXIST;
+                    break;
+                }
+                break;
             }
 
             if ( rc == 0 ) 
diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
new file mode 100644
index 0000000..3fb2e0b
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/Makefile
@@ -0,0 +1 @@
+obj-y += cpuid.o
diff --git a/xen/arch/x86/hvm/vmware/cpuid.c b/xen/arch/x86/hvm/vmware/cpuid.c
new file mode 100644
index 0000000..29f6213
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/cpuid.c
@@ -0,0 +1,89 @@
+/*
+ * arch/x86/hvm/vmware/cpuid.c
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/vmware.h>
+
+/*
+ * VMware hardware version 7 defines some of these cpuid levels,
+ * below is a brief description about those.
+ *
+ *     Leaf 0x40000000, Hypervisor CPUID information
+ * # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
+ * # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
+ *
+ *     Leaf 0x40000010, Timing information.
+ * # EAX: (Virtual) TSC frequency in kHz.
+ * # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
+ * # ECX, EDX: RESERVED
+ */
+
+int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
+                        uint32_t *ecx, uint32_t *edx)
+{
+    struct domain *d = current->domain;
+
+    if ( !is_vmware_domain(d) )
+        return 0;
+
+    switch ( idx - 0x40000000 )
+    {
+    case 0x0:
+        if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] >= 7 )
+        {
+            *eax = 0x40000010;  /* Largest leaf */
+            *ebx = 0x61774d56;  /* "VMwa" */
+            *ecx = 0x4d566572;  /* "reVM" */
+            *edx = 0x65726177;  /* "ware" */
+            break;
+        }
+        /* fallthrough */
+    case 0x10:
+        if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] >= 7 )
+        {
+            /* (Virtual) TSC frequency in kHz. */
+            *eax =  d->arch.tsc_khz;
+            /* (Virtual) Bus (local apic timer) frequency in kHz. */
+            *ebx = 1000000ull / APIC_BUS_CYCLE_NS;
+            *ecx = 0;          /* Reserved */
+            *edx = 0;          /* Reserved */
+            break;
+        }
+        /* fallthrough */
+    case 0x1 ... 0xf:
+        *eax = 0;          /* Reserved */
+        *ebx = 0;          /* Reserved */
+        *ecx = 0;          /* Reserved */
+        *edx = 0;          /* Reserved */
+        break;
+
+    default:
+        return 0;
+    }
+
+    return 1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 10fc2ca..90542f9 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -685,8 +685,12 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
                uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
 {
     struct domain *d = current->domain;
-    /* Optionally shift out of the way of Viridian architectural leaves. */
-    uint32_t base = is_viridian_domain(d) ? 0x40000100 : 0x40000000;
+    /*
+     * Optionally shift out of the way of Viridian or VMware
+     * architectural leaves.
+     */
+    uint32_t base = is_viridian_domain(d) || is_vmware_domain(d) ?
+        0x40000100 : 0x40000000;
     uint32_t limit, dummy;
 
     idx -= base;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 121d053..3916fec 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -349,6 +349,9 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 #define is_viridian_domain(_d)                                             \
  (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN]))
 
+#define is_vmware_domain(_d)                                             \
+ (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW]))
+
 void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx,
                                uint32_t *eax, uint32_t *ebx,
                                uint32_t *ecx, uint32_t *edx);
diff --git a/xen/include/asm-x86/hvm/vmware.h b/xen/include/asm-x86/hvm/vmware.h
new file mode 100644
index 0000000..8390173
--- /dev/null
+++ b/xen/include/asm-x86/hvm/vmware.h
@@ -0,0 +1,33 @@
+/*
+ * asm-x86/hvm/vmware.h
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ASM_X86_HVM_VMWARE_H__
+#define ASM_X86_HVM_VMWARE_H__
+
+#include <xen/types.h>
+
+int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
+                        uint32_t *ecx, uint32_t *edx);
+
+#endif /* ASM_X86_HVM_VMWARE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 614ff5f..dee6d68 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -151,6 +151,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Params for VMware */
+#define HVM_PARAM_VMWARE_HW                 35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.8.4


* [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 13:34   ` Ian Campbell
  2014-09-24 14:44   ` George Dunlap
  2014-09-20 18:07 ` [PATCH for-4.5 v6 03/16] vmware: Add VMware provided include files Don Slutz
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This is used to set HVM_PARAM_VMWARE_HW. It is set to the VMware
virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the code only
checks for == 0 or != 0.

If non-zero then
  default VGA to VMware's VGA.

Also now allows vga=vmware
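
The xl.cfg text added below says that for vssd:VirtualSystemType ==
vmx-07, vmware_hw = 7.  A minimal sketch of that mapping and of the
VGA default follows.  Both helper names are hypothetical, and only
the simple "vmx-NN" form is handled, which is an assumption about
what the .ovf value looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Convert a "vmx-NN" system type into a vmware_hw number.
 * Returns -1 if the string does not have that exact form.
 */
static long vmware_hw_from_system_type(const char *type)
{
    static const char prefix[] = "vmx-";
    char *end;
    long hw;

    assert(type != NULL);
    if ( strncmp(type, prefix, sizeof(prefix) - 1) != 0 )
        return -1;

    type += sizeof(prefix) - 1;
    if ( *type < '0' || *type > '9' )
        return -1;

    hw = strtol(type, &end, 10);
    if ( *end != '\0' )
        return -1;

    return hw;
}

/* The default emulated VGA: vmware when vmware_hw is non-zero. */
static const char *default_vga(long vmware_hw)
{
    return vmware_hw != 0 ? "vmware" : "cirrus";
}
```

So "vmx-07" gives 7, and a guest configured with vmware_hw=7 and no
explicit vga= setting would get the vmware VGA.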

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v5:
      Anything looking for Xen according to the Xen cpuid instructions...
        Adjusted doc to new wording.

 docs/man/xl.cfg.pod.5               | 21 +++++++++++++++++++--
 docs/misc/hypervisor-cpuid.markdown | 28 ++++++++++++++++++++++++++++
 tools/libxc/xc_domain_restore.c     | 14 ++++++++++++++
 tools/libxc/xc_domain_save.c        | 11 +++++++++++
 tools/libxc/xg_save_restore.h       |  2 ++
 tools/libxl/libxl.h                 | 10 ++++++++++
 tools/libxl/libxl_create.c          |  4 +++-
 tools/libxl/libxl_dm.c              | 10 +++++++++-
 tools/libxl/libxl_dom.c             |  2 ++
 tools/libxl/libxl_types.idl         |  2 ++
 tools/libxl/xl_cmdimpl.c            | 11 ++++++++++-
 11 files changed, 110 insertions(+), 5 deletions(-)
 create mode 100644 docs/misc/hypervisor-cpuid.markdown

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 517ae2f..367b401 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1147,6 +1147,23 @@ some other Operating Systems and in some circumstance can prevent
 Xen's own paravirtualisation interfaces for HVM guests from being
 used.
 
+=item B<vmware_hw=NUMBER>
+
+Turns on or off the exposure of VMware cpuid.  The number is the
+VMware's hardware version number, where 0 is off.  If not zero it
+changes the default VGA to VMware's VGA.
+
+The hardware version number (vmware_hw) come from VMware config files.
+
+=over 4
+
+In a .vmx it is virtualHW.version
+
+In a .ovf it is part of the value of vssd:VirtualSystemType.
+For vssd:VirtualSystemType == vmx-07, vmware_hw = 7.
+
+=back
+
 =back
 
 =head3 Emulated VGA Graphics Device
@@ -1185,8 +1202,8 @@ This option is deprecated, use vga="stdvga" instead.
 
 =item B<vga="STRING">
 
-Selects the emulated video card (none|stdvga|cirrus).
-The default is cirrus.
+Selects the emulated video card (none|stdvga|cirrus|vmware).
+The default is cirrus (or vmware if B<vmware_hw> is not zero).
 
 =item B<vnc=BOOLEAN>
 
diff --git a/docs/misc/hypervisor-cpuid.markdown b/docs/misc/hypervisor-cpuid.markdown
new file mode 100644
index 0000000..901a4e1
--- /dev/null
+++ b/docs/misc/hypervisor-cpuid.markdown
@@ -0,0 +1,28 @@
+Hypervisor Cpuid
+================
+
+The support of hypervisor cpuid leaves has not been agreed to.
+Other then the range 0x40000000 to 0x400000ff can be used by
+hypervisors.
+
+MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
+
+VMware currently must be at 0x40000000.
+
+KVM currently must be at 0x40000000 (from Seabios).
+
+Xen can be found at the first otherwise unused 0x100 aligned
+offset between 0x40000000 and 0x40010000.
+
+http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx
+
+http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
+
+http://lwn.net/Articles/301888/
+  Attempted to get this cleaned up.
+
+So if Viridian or VMware_hw is selected, return their format for the
+range 0x40000000 to 0x400000ff. And return Xen format for the range
+0x40000100 to 0x400001ff.
+
+Otherwise return Xen format for the range 0x40000000 to 0x400000ff.
diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index b9a56d5..bc5cd57 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -743,6 +743,7 @@ typedef struct {
     uint64_t vm_generationid_addr;
     uint64_t ioreq_server_pfn;
     uint64_t nr_ioreq_server_pages;
+    uint64_t vmware_hw;
 
     struct toolstack_data_t tdata;
 } pagebuf_t;
@@ -927,6 +928,16 @@ static int pagebuf_get_one(xc_interface *xch, struct restore_ctx *ctx,
         }
         return pagebuf_get_one(xch, ctx, buf, fd, dom);
 
+    case XC_SAVE_ID_HVM_VMWARE_HW:
+        /* Skip padding 4 bytes then read the vmware hw version. */
+        if ( RDEXACT(fd, &buf->vmware_hw, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->vmware_hw, sizeof(uint64_t)) )
+        {
+            PERROR("Error when reading the vmware_hw value");
+            return -1;
+        }
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
     case XC_SAVE_ID_TOOLSTACK:
         {
             if ( RDEXACT(fd, &buf->tdata.len, sizeof(buf->tdata.len)) )
@@ -1760,6 +1771,9 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
         }
     }
 
+    if (pagebuf.vmware_hw != 0)
+        xc_set_hvm_param(xch, dom, HVM_PARAM_VMWARE_HW, pagebuf.vmware_hw);
+
     if (pagebuf.acpi_ioport_location == 1) {
         DBGPRINTF("Use new firmware ioport from the checkpoint\n");
         xc_hvm_param_set(xch, dom, HVM_PARAM_ACPI_IOPORTS_LOCATION, 1);
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 254fdb3..76dc307 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -1750,6 +1750,17 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
             PERROR("Error when writing the ioreq server gmfn count");
             goto out;
         }
+
+        chunk.id = XC_SAVE_ID_HVM_VMWARE_HW;
+        chunk.data = 0;
+        xc_hvm_param_get(xch, dom, HVM_PARAM_VMWARE_HW, &chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the vmware_hw value");
+            goto out;
+        }
     }
 
     if ( callbacks != NULL && callbacks->toolstack_save != NULL )
diff --git a/tools/libxc/xg_save_restore.h b/tools/libxc/xg_save_restore.h
index bdd9009..d185ba9 100644
--- a/tools/libxc/xg_save_restore.h
+++ b/tools/libxc/xg_save_restore.h
@@ -262,6 +262,8 @@
 /* These are a pair; it is an error for one to exist without the other */
 #define XC_SAVE_ID_HVM_IOREQ_SERVER_PFN -19
 #define XC_SAVE_ID_HVM_NR_IOREQ_SERVER_PAGES -20
+/* VMware data */
+#define XC_SAVE_ID_HVM_VMWARE_HW      -21
 
 /*
 ** We process save/restore/migrate in batches of pages; the below
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index bc68cac..14048e4 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -145,6 +145,16 @@
 #define LIBXL_HAVE_BUILDINFO_IOMEM_START_GFN 1
 
 /*
+ * The libxl_vga_interface_type has the type for vmware.
+ */
+#define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
+
+/*
+ * libxl_domain_build_info has the u.hvm.vmware_hw field.
+ */
+#define LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HW 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f7f178e..da79a18 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -428,13 +428,15 @@ int libxl__domain_build(libxl__gc *gc,
         vments[4] = "start_time";
         vments[5] = libxl__sprintf(gc, "%lu.%02d", start_time.tv_sec,(int)start_time.tv_usec/10000);
 
-        localents = libxl__calloc(gc, 7, sizeof(char *));
+        localents = libxl__calloc(gc, 9, sizeof(char *));
         localents[0] = "platform/acpi";
         localents[1] = libxl_defbool_val(info->u.hvm.acpi) ? "1" : "0";
         localents[2] = "platform/acpi_s3";
         localents[3] = libxl_defbool_val(info->u.hvm.acpi_s3) ? "1" : "0";
         localents[4] = "platform/acpi_s4";
         localents[5] = libxl_defbool_val(info->u.hvm.acpi_s4) ? "1" : "0";
+        localents[6] = "platform/vmware_hw";
+        localents[7] = libxl__sprintf(gc, "%"PRId64, info->u.hvm.vmware_hw);
 
         break;
     case LIBXL_DOMAIN_TYPE_PV:
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 103cbca..d27b364 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -243,6 +243,9 @@ static char ** libxl__build_device_model_args_old(libxl__gc *gc,
         case LIBXL_VGA_INTERFACE_TYPE_NONE:
             flexarray_append_pair(dm_args, "-vga", "none");
             break;
+        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+            flexarray_append_pair(dm_args, "-vga", "vmware");
+            break;
         }
 
         if (b_info->u.hvm.boot) {
@@ -555,7 +558,12 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
             break;
         case LIBXL_VGA_INTERFACE_TYPE_NONE:
             break;
-        }
+        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+            flexarray_append_pair(dm_args, "-device",
+                GCSPRINTF("vmware-svga,vgamem_mb=%d",
+                libxl__sizekb_to_mb(b_info->video_memkb)));
+            break;
+            }
 
         if (b_info->u.hvm.boot) {
             flexarray_vappend(dm_args, "-boot",
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index ce0c4ac..be55d5f 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -219,6 +219,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.viridian));
     xc_hvm_param_set(handle, domid, HVM_PARAM_HPET_ENABLED,
                     libxl_defbool_val(info->u.hvm.hpet));
+    xc_set_hvm_param(handle, domid, HVM_PARAM_VMWARE_HW,
+                     info->u.hvm.vmware_hw);
 #endif
     xc_hvm_param_set(handle, domid, HVM_PARAM_TIMER_MODE, timer_mode(info));
     xc_hvm_param_set(handle, domid, HVM_PARAM_VPT_ALIGN,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f1fcbc3..907572c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -172,6 +172,7 @@ libxl_vga_interface_type = Enumeration("vga_interface_type", [
     (1, "CIRRUS"),
     (2, "STD"),
     (3, "NONE"),
+    (4, "VMWARE"),
     ], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
 
 libxl_vendor_device = Enumeration("vendor_device", [
@@ -378,6 +379,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("timeoffset",       string),
                                        ("hpet",             libxl_defbool),
                                        ("vpt_align",        libxl_defbool),
+                                       ("vmware_hw",        UInt(64, init_val = 0)),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
                                        ("smbios_firmware",  string),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 698b3bc..2119bd6 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1038,6 +1038,8 @@ static void parse_config_data(const char *config_source,
         xlu_cfg_get_defbool(config, "hpet", &b_info->u.hvm.hpet, 0);
         xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
 
+        if (!xlu_cfg_get_long(config, "vmware_hw",  &l, 1))
+            b_info->u.hvm.vmware_hw = l;
         if (!xlu_cfg_get_long(config, "timer_mode", &l, 1)) {
             const char *s = libxl_timer_mode_to_string(l);
             fprintf(stderr, "WARNING: specifying \"timer_mode\" as an integer is deprecated. "
@@ -1676,13 +1678,20 @@ skip_vfb:
                 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
             } else if (!strcmp(buf, "none")) {
                 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
+            } else if (!strcmp(buf, "vmware")) {
+                b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
             } else {
                 fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
                 exit(1);
             }
         } else if (!xlu_cfg_get_long(config, "stdvga", &l, 0))
             b_info->u.hvm.vga.kind = l ? LIBXL_VGA_INTERFACE_TYPE_STD :
-                                         LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
+                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+        else
+            b_info->u.hvm.vga.kind =
+                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
+                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
 
         xlu_cfg_replace_string (config, "keymap", &b_info->u.hvm.keymap, 0);
         xlu_cfg_get_defbool (config, "spice", &b_info->u.hvm.spice.enable, 0);
-- 
1.8.4
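
As a rough illustration of the leaf layout described in
docs/misc/hypervisor-cpuid.markdown in this patch, the choice of where
the Xen-format leaves appear could be sketched as follows.  The helper
names xen_cpuid_base() and is_xen_leaf() are made up for this example
and are not functions in the Xen tree; the sketch covers only the two
cases the document describes.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_CPUID_BASE   0x40000000u  /* first hypervisor leaf */
#define HV_CPUID_STRIDE 0x100u       /* size of one hypervisor leaf block */

/*
 * Hypothetical helper: return the base leaf at which Xen-format leaves
 * are reported.  If Viridian or vmware_hw is selected, that format owns
 * 0x40000000-0x400000ff and Xen moves up to 0x40000100-0x400001ff.
 */
static uint32_t xen_cpuid_base(bool viridian, uint64_t vmware_hw)
{
    if (viridian || vmware_hw != 0)
        return HV_CPUID_BASE + HV_CPUID_STRIDE;
    return HV_CPUID_BASE;
}

/* Hypothetical helper: does 'leaf' fall inside the Xen block at 'base'? */
static bool is_xen_leaf(uint32_t leaf, uint32_t base)
{
    return leaf >= base && leaf - base < HV_CPUID_STRIDE;
}
```

With vmware_hw = 7, xen_cpuid_base(false, 7) yields 0x40000100, so a
guest probing 0x40000000 sees the VMware format and finds Xen one block
higher.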

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH for-4.5 v6 03/16] vmware: Add VMware provided include files.
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 04/16] xen: Add vmware_port support Don Slutz
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

These 2 files: backdoor_def.h and guest_msg_def.h come from:

http://packages.vmware.com/tools/esx/3.5latest/rhel4/SRPMS/index.html
 open-vm-tools-kmod-7.4.8-396269.423167.src.rpm
  open-vm-tools-kmod-7.4.8.tar.gz
   vmhgfs/backdoor_def.h
   vmhgfs/guest_msg_def.h

and are unchanged.

Also added the badly named include file includeCheck.h.  It contains
only comments and is provided so that backdoor_def.h and
guest_msg_def.h can be used without change.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 xen/arch/x86/hvm/vmware/backdoor_def.h  | 167 ++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmware/guest_msg_def.h |  87 +++++++++++++++++
 xen/arch/x86/hvm/vmware/includeCheck.h  |  17 ++++
 3 files changed, 271 insertions(+)
 create mode 100644 xen/arch/x86/hvm/vmware/backdoor_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/guest_msg_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/includeCheck.h

diff --git a/xen/arch/x86/hvm/vmware/backdoor_def.h b/xen/arch/x86/hvm/vmware/backdoor_def.h
new file mode 100644
index 0000000..e76795f
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/backdoor_def.h
@@ -0,0 +1,167 @@
+/* **********************************************************
+ * Copyright 1998 VMware, Inc.  All rights reserved. 
+ * **********************************************************
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation version 2 and no later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ * or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
+ */
+
+/*
+ * backdoor_def.h --
+ *
+ * This contains backdoor defines that can be included from
+ * an assembly language file.
+ */
+
+
+
+#ifndef _BACKDOOR_DEF_H_
+#define _BACKDOOR_DEF_H_
+
+#define INCLUDE_ALLOW_MODULE
+#define INCLUDE_ALLOW_USERLEVEL
+#define INCLUDE_ALLOW_VMMEXT
+#define INCLUDE_ALLOW_VMCORE
+#define INCLUDE_ALLOW_VMKERNEL
+#include "includeCheck.h"
+
+/*
+ * If you want to add a new low-level backdoor call for a guest userland
+ * application, please consider using the GuestRpc mechanism instead. --hpreg
+ */
+
+#define BDOOR_MAGIC 0x564D5868
+
+/* Low-bandwidth backdoor port. --hpreg */
+
+#define BDOOR_PORT 0x5658
+
+#define BDOOR_CMD_GETMHZ      		   1
+/*
+ * BDOOR_CMD_APMFUNCTION is used by:
+ *
+ * o The FrobOS code, which instead should either program the virtual chipset
+ *   (like the new BIOS code does, matthias offered to implement that), or not
+ *   use any VM-specific code (which requires that we correctly implement
+ *   "power off on CLI HLT" for SMP VMs, boris offered to implement that)
+ *
+ * o The old BIOS code, which will soon be jettisoned
+ *
+ *  --hpreg
+ */
+#define BDOOR_CMD_APMFUNCTION 		   2
+#define BDOOR_CMD_GETDISKGEO  		   3
+#define BDOOR_CMD_GETPTRLOCATION	      4
+#define BDOOR_CMD_SETPTRLOCATION	      5
+#define BDOOR_CMD_GETSELLENGTH		   6
+#define BDOOR_CMD_GETNEXTPIECE		   7
+#define BDOOR_CMD_SETSELLENGTH		   8
+#define BDOOR_CMD_SETNEXTPIECE		   9
+#define BDOOR_CMD_GETVERSION		      10
+#define BDOOR_CMD_GETDEVICELISTELEMENT	11
+#define BDOOR_CMD_TOGGLEDEVICE		   12
+#define BDOOR_CMD_GETGUIOPTIONS		   13
+#define BDOOR_CMD_SETGUIOPTIONS		   14
+#define BDOOR_CMD_GETSCREENSIZE		   15
+#define BDOOR_CMD_MONITOR_CONTROL       16
+#define BDOOR_CMD_GETHWVERSION          17
+#define BDOOR_CMD_OSNOTFOUND            18
+#define BDOOR_CMD_GETUUID               19
+#define BDOOR_CMD_GETMEMSIZE            20
+#define BDOOR_CMD_HOSTCOPY              21 /* Devel only */
+/* BDOOR_CMD_GETOS2INTCURSOR, 22, is very old and defunct. Reuse. */
+#define BDOOR_CMD_GETTIME               23 /* Deprecated. Use GETTIMEFULL. */
+#define BDOOR_CMD_STOPCATCHUP           24
+#define BDOOR_CMD_PUTCHR	        25 /* Devel only */
+#define BDOOR_CMD_ENABLE_MSG	        26 /* Devel only */
+#define BDOOR_CMD_GOTO_TCL	        27 /* Devel only */
+#define BDOOR_CMD_INITPCIOPROM		28
+#define BDOOR_CMD_INT13			29
+#define BDOOR_CMD_MESSAGE               30
+#define BDOOR_CMD_RSVD0                 31
+#define BDOOR_CMD_RSVD1                 32
+#define BDOOR_CMD_RSVD2                 33
+#define BDOOR_CMD_ISACPIDISABLED	34
+#define BDOOR_CMD_TOE			35 /* Not in use */
+/* BDOOR_CMD_INITLSIOPROM, 36, was merged with 28. Reuse. */
+#define BDOOR_CMD_PATCH_SMBIOS_STRUCTS  37
+#define BDOOR_CMD_MAPMEM                38 /* Devel only */
+#define BDOOR_CMD_ABSPOINTER_DATA	39
+#define BDOOR_CMD_ABSPOINTER_STATUS	40
+#define BDOOR_CMD_ABSPOINTER_COMMAND	41
+#define BDOOR_CMD_TIMER_SPONGE          42
+#define BDOOR_CMD_PATCH_ACPI_TABLES	43
+/* Catch-all to allow synchronous tests */
+#define BDOOR_CMD_DEVEL_FAKEHARDWARE	44 /* Debug only - needed in beta */
+#define BDOOR_CMD_GETHZ      		45
+#define BDOOR_CMD_GETTIMEFULL           46
+#define BDOOR_CMD_STATELOGGER           47
+#define BDOOR_CMD_CHECKFORCEBIOSSETUP	48
+#define BDOOR_CMD_LAZYTIMEREMULATION    49
+#define BDOOR_CMD_BIOSBBS               50
+#define BDOOR_CMD_MAX                   51
+
+/* 
+ * IMPORTANT NOTE: When modifying the behavior of an existing backdoor command,
+ * you must adhere to the semantics expected by the oldest Tools who use that
+ * command. Specifically, do not alter the way in which the command modifies 
+ * the registers. Otherwise backwards compatibility will suffer.
+ */
+
+/* High-bandwidth backdoor port. --hpreg */
+
+#define BDOORHB_PORT 0x5659
+
+#define BDOORHB_CMD_MESSAGE 0
+#define BDOORHB_CMD_MAX 1
+
+/*
+ * There is another backdoor which allows access to certain TSC-related
+ * values using otherwise illegal PMC indices when the pseudo_perfctr
+ * control flag is set.
+ */
+
+#define BDOOR_PMC_HW_TSC      0x10000
+#define BDOOR_PMC_REAL_NS     0x10001
+#define BDOOR_PMC_APPARENT_NS 0x10002
+
+#define IS_BDOOR_PMC(index)  (((index) | 3) == 0x10003)
+#define BDOOR_CMD(ecx)       ((ecx) & 0xffff)
+
+
+#ifdef VMM
+/*
+ *----------------------------------------------------------------------
+ *
+ * Backdoor_CmdRequiresFullyValidVCPU --
+ *
+ *    A few backdoor commands require the full VCPU to be valid
+ *    (including GDTR, IDTR, TR and LDTR). The rest get read/write
+ *    access to GPRs and read access to Segment registers (selectors).
+ *
+ * Result:
+ *    True iff VECX contains a command that require the full VCPU to
+ *    be valid.
+ *
+ *----------------------------------------------------------------------
+ */
+static INLINE Bool
+Backdoor_CmdRequiresFullyValidVCPU(unsigned cmd)
+{
+   return cmd == BDOOR_CMD_RSVD0 ||
+          cmd == BDOOR_CMD_RSVD1 ||
+          cmd == BDOOR_CMD_RSVD2;
+}
+#endif
+
+#endif
diff --git a/xen/arch/x86/hvm/vmware/guest_msg_def.h b/xen/arch/x86/hvm/vmware/guest_msg_def.h
new file mode 100644
index 0000000..44ae0fa
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/guest_msg_def.h
@@ -0,0 +1,87 @@
+/* **********************************************************
+ * Copyright 1998 VMware, Inc.  All rights reserved. 
+ * **********************************************************
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation version 2 and no later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ * or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
+ */
+
+/*
+ * guest_msg_def.h --
+ *
+ *    Second layer of the internal communication channel between guest
+ *    applications and vmware
+ *
+ */
+
+#ifndef _GUEST_MSG_DEF_H_
+#define _GUEST_MSG_DEF_H_
+
+#define INCLUDE_ALLOW_MODULE
+#define INCLUDE_ALLOW_USERLEVEL
+#define INCLUDE_ALLOW_VMMEXT
+#include "includeCheck.h"
+
+
+/* Basic request types */
+typedef enum {
+   MESSAGE_TYPE_OPEN,
+   MESSAGE_TYPE_SENDSIZE,
+   MESSAGE_TYPE_SENDPAYLOAD,
+   MESSAGE_TYPE_RECVSIZE,
+   MESSAGE_TYPE_RECVPAYLOAD,
+   MESSAGE_TYPE_RECVSTATUS,
+   MESSAGE_TYPE_CLOSE,
+} MessageType;
+
+
+/* Reply statuses */
+/*  The basic request succeeded */
+#define MESSAGE_STATUS_SUCCESS  0x0001
+/*  vmware has a message available for its party */
+#define MESSAGE_STATUS_DORECV   0x0002
+/*  The channel has been closed */
+#define MESSAGE_STATUS_CLOSED   0x0004
+/*  vmware removed the message before the party fetched it */
+#define MESSAGE_STATUS_UNSENT   0x0008
+/*  A checkpoint occurred */
+#define MESSAGE_STATUS_CPT      0x0010
+/*  An underlying device is powering off */
+#define MESSAGE_STATUS_POWEROFF 0x0020
+/*  vmware has detected a timeout on the channel */
+#define MESSAGE_STATUS_TIMEOUT  0x0040
+/*  vmware supports high-bandwidth for sending and receiving the payload */
+#define MESSAGE_STATUS_HB       0x0080
+
+/*
+ * This mask defines the status bits that the guest is allowed to set;
+ * we use this to mask out all other bits when receiving the status
+ * from the guest. Otherwise, the guest can manipulate VMX state by
+ * setting status bits that are only supposed to be changed by the
+ * VMX. See bug 45385.
+ */
+#define MESSAGE_STATUS_GUEST_MASK    MESSAGE_STATUS_SUCCESS
+
+/*
+ * Max number of channels.
+ * Unfortunately this has to be public because the monitor part
+ * of the backdoor needs it for its trivial-case optimization. [greg]
+ */
+#define GUESTMSG_MAX_CHANNEL 8
+
+/* Flags to open a channel. --hpreg */
+#define GUESTMSG_FLAG_COOKIE 0x80000000
+#define GUESTMSG_FLAG_ALL GUESTMSG_FLAG_COOKIE
+
+
+#endif /* _GUEST_MSG_DEF_H_ */
diff --git a/xen/arch/x86/hvm/vmware/includeCheck.h b/xen/arch/x86/hvm/vmware/includeCheck.h
new file mode 100644
index 0000000..26e0d59
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/includeCheck.h
@@ -0,0 +1,17 @@
+/*
+ * includeCheck.h
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+/*
+ * Nothing here.  Just to use backdoor_def.h without change.
+ */
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (2 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 03/16] vmware: Add VMware provided include files Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-23 17:16   ` Boris Ostrovsky
  2014-09-24 16:01   ` George Dunlap
  2014-09-20 18:07 ` [PATCH for-4.5 v6 05/16] tools: " Don Slutz
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This includes adding is_vmware_port_enabled.

This is a new domain_create() flag, DOMCRF_vmware_port.  It is
passed to domctl as XEN_DOMCTL_CDF_vmware_port.

This enables limited support of VMware's hyper-call.

This support is in some respects more complete than what is
currently provided by QEMU and/or KVM, and in others less.  The
missing part requires QEMU changes and has been left out until the
QEMU patches are accepted upstream.

VMware's hyper-call is also known as the VMware Backdoor I/O Port.

Note: this support does not depend on vmware_hw being non-zero.

In summary, VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially.  Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use.  The other lengths, like
"in (%dx),%al", still work, but only the AL part of EAX is
changed.  For "out %eax,(%dx)" of any length, EAX remains
unchanged.
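
The register behaviour described above can be sketched in plain C.
This is only an illustration, not the code in vmport.c: the constants
are the ones from backdoor_def.h, while is_backdoor_call() and
merge_in_result() are hypothetical names invented for the example.

```c
#include <stdint.h>

/* Values as defined in VMware's backdoor_def.h. */
#define BDOOR_MAGIC    0x564D5868u
#define BDOOR_PORT     0x5658u
#define BDOOR_CMD(ecx) ((ecx) & 0xffffu)

/* Hypothetical: a port access is a backdoor call only if DX holds the
 * backdoor port and EAX holds the magic value. */
static int is_backdoor_call(uint32_t eax, uint16_t dx)
{
    return dx == BDOOR_PORT && eax == BDOOR_MAGIC;
}

/*
 * Hypothetical: merge a command result into EAX for an "in" of the
 * given width.  A 1-byte access only replaces AL, a 2-byte access only
 * AX; a 4-byte access replaces all of EAX.
 */
static uint32_t merge_in_result(uint32_t old_eax, uint32_t result,
                                unsigned int bytes)
{
    switch (bytes) {
    case 1:
        return (old_eax & 0xffffff00u) | (result & 0xffu);
    case 2:
        return (old_eax & 0xffff0000u) | (result & 0xffffu);
    default:
        return result;
    }
}
```

For example, a result of 6 delivered through "in (%dx),%al" leaves the
upper three bytes of the magic value in EAX, which is why the 4-byte
form is the one to use when the full result matters.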

This instruction may also be used from ring 3.  To support this,
the vmexit for #GP needs to be enabled.  I have not
fully tested that nested HVM is doing the right thing for this.
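
Handling the #GP exit means recognising the faulting instruction from
its bytes.  The following is a minimal sketch of such a check, not the
vmport_gp_check() routine from this patch (which is not reproduced
here).  It assumes a code segment with a 32-bit default operand size
and only understands the 0x66 operand-size prefix; decode_dx_portio()
and struct portio_insn are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INST_LEN 15  /* architectural limit on x86 instruction length */

struct portio_insn {
    int is_in;           /* 1 for "in", 0 for "out" */
    unsigned int bytes;  /* access width: 1, 2 or 4 */
    unsigned int len;    /* instruction length including prefixes */
};

/*
 * Hypothetical decoder: returns 0 and fills *out if buf starts with
 * in/out using DX as the port (opcodes 0xEC-0xEF), else returns -1.
 */
static int decode_dx_portio(const uint8_t *buf, size_t avail,
                            struct portio_insn *out)
{
    unsigned int bytes = 4;  /* assumed default operand size */
    size_t i = 0;

    assert(buf != NULL && out != NULL);
    if (avail > MAX_INST_LEN)
        avail = MAX_INST_LEN;

    /* Operand-size override selects the 16-bit form. */
    while (i < avail && buf[i] == 0x66) {
        bytes = 2;
        i++;
    }
    if (i >= avail)
        return -1;

    switch (buf[i]) {
    case 0xEC: out->is_in = 1; out->bytes = 1;     break; /* in  (%dx),%al   */
    case 0xED: out->is_in = 1; out->bytes = bytes; break; /* in  (%dx),%eax  */
    case 0xEE: out->is_in = 0; out->bytes = 1;     break; /* out %al,(%dx)   */
    case 0xEF: out->is_in = 0; out->bytes = bytes; break; /* out %eax,(%dx)  */
    default:
        return -1;
    }
    out->len = (unsigned int)(i + 1);
    return 0;
}
```

A caller would pass up to 15 bytes fetched from the guest instruction
pointer and, on success, advance the guest EIP by the returned len;
anything else would be reinjected as a normal #GP.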

An open source example of using this is:

http://open-vm-tools.sourceforge.net/

It only uses "inl (%dx)".  See also:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

The support included is enough to allow VMware tools to install in an
HVM domU.

For a debug=y build there is a new command line option
vmport_debug=.  It enables output to the console of various
stages of handling the "in (%dx)" instruction.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v6:
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.

v5:
      we should make sure that svm_vmexit_gp_intercept is not executed for
      any other guest.
        Added an ASSERT on is_vmware_port_enabled.
      magic integers?
        Added #define for them.
      I am fairly certain that you need some brackets here.
        Added brackets.

 xen/arch/x86/domain.c                 |   2 +
 xen/arch/x86/hvm/hvm.c                |   4 +
 xen/arch/x86/hvm/svm/emulate.c        |   2 +-
 xen/arch/x86/hvm/svm/svm.c            |  41 +++++
 xen/arch/x86/hvm/svm/vmcb.c           |   2 +
 xen/arch/x86/hvm/vmware/Makefile      |   1 +
 xen/arch/x86/hvm/vmware/vmport.c      | 326 ++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmcs.c           |   2 +
 xen/arch/x86/hvm/vmx/vmx.c            |  84 ++++++++-
 xen/arch/x86/hvm/vmx/vvmx.c           |  14 ++
 xen/common/domctl.c                   |   3 +
 xen/include/asm-x86/hvm/domain.h      |   3 +
 xen/include/asm-x86/hvm/io.h          |   2 +-
 xen/include/asm-x86/hvm/svm/emulate.h |   1 +
 xen/include/asm-x86/hvm/vmport.h      |  77 ++++++++
 xen/include/public/domctl.h           |   3 +
 xen/include/xen/sched.h               |   3 +
 17 files changed, 565 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmware/vmport.c
 create mode 100644 xen/include/asm-x86/hvm/vmport.h

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7b1dfe6..e2e4aad 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -510,6 +510,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
     d->arch.hvm_domain.mem_sharing_enabled = 0;
 
     d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
+    d->arch.hvm_domain.is_vmware_port_enabled =
+        (domcr_flags & DOMCRF_vmware_port);
 
     INIT_LIST_HEAD(&d->arch.pdev_list);
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f3cf566..c583179 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -58,6 +58,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/vmware.h>
+#include <asm/hvm/vmport.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -1498,6 +1499,9 @@ int hvm_domain_initialise(struct domain *d)
         goto fail1;
     d->arch.hvm_domain.io_handler->num_slot = 0;
 
+    if ( d->arch.hvm_domain.is_vmware_port_enabled )
+        vmport_register(d);
+
     if ( is_pvh_domain(d) )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
index 37a1ece..cfad9ab 100644
--- a/xen/arch/x86/hvm/svm/emulate.c
+++ b/xen/arch/x86/hvm/svm/emulate.c
@@ -50,7 +50,7 @@ static unsigned int is_prefix(u8 opc)
     return 0;
 }
 
-static unsigned long svm_rip2pointer(struct vcpu *v)
+unsigned long svm_rip2pointer(struct vcpu *v)
 {
     struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     unsigned long p = vmcb->cs.base + guest_cpu_user_regs()->eip;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 5d404ce..ea99dfb 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -59,6 +59,7 @@
 #include <public/sched.h>
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/trace.h>
+#include <asm/hvm/vmport.h>
 #include <asm/hap.h>
 #include <asm/apic.h>
 #include <asm/debugger.h>
@@ -2064,6 +2065,42 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
     return;
 }
 
+static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
+                                    struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+    /*
+     * Just use 15 for the instruction length; vmport_gp_check will
+     * adjust it.  This is because
+     * __get_instruction_length_from_list() has issues, and may
+     * require a double read of the instruction bytes.  At some
+     * point a new routine could be added that is based on the code
+     * in vmport_gp_check with extensions to make it more general.
+     * Since that routine is the only user of this code this can be
+     * done later.
+     */
+    unsigned long inst_len = 15;
+    unsigned long inst_addr = svm_rip2pointer(v);
+    int rc;
+
+    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
+                         vmcb->exitinfo1, vmcb->exitinfo2);
+    if ( !rc )
+        __update_guest_eip(regs, inst_len);
+    else
+    {
+        VMPORT_DBG_LOG(VMPORT_LOG_GP_UNKNOWN,
+                       "gp: rc=%d ei1=0x%lx ei2=0x%lx ec=0x%x ip=%"PRIx64
+                       " (0x%lx,%ld) ax=%"PRIx64" bx=%"PRIx64" cx=%"PRIx64
+                       " dx=%"PRIx64" si=%"PRIx64" di=%"PRIx64, rc,
+                       (unsigned long)vmcb->exitinfo1,
+                       (unsigned long)vmcb->exitinfo2, regs->error_code,
+                       regs->rip, inst_addr, inst_len, regs->rax, regs->rbx,
+                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+        hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
+    }
+}
+
 static void svm_vmexit_ud_intercept(struct cpu_user_regs *regs)
 {
     struct hvm_emulate_ctxt ctxt;
@@ -2411,6 +2448,10 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         break;
     }
 
+    case VMEXIT_EXCEPTION_GP:
+        svm_vmexit_gp_intercept(regs, v);
+        break;
+
     case VMEXIT_EXCEPTION_UD:
         svm_vmexit_ud_intercept(regs);
         break;
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 21292bb..45ead61 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -195,6 +195,8 @@ static int construct_vmcb(struct vcpu *v)
         HVM_TRAP_MASK
         | (1U << TRAP_no_device);
 
+    if ( v->domain->arch.hvm_domain.is_vmware_port_enabled )
+        vmcb->_exception_intercepts |= 1U << TRAP_gp_fault;
     if ( paging_mode_hap(v->domain) )
     {
         vmcb->_np_enable = 1; /* enable nested paging */
diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
index 3fb2e0b..cd8815b 100644
--- a/xen/arch/x86/hvm/vmware/Makefile
+++ b/xen/arch/x86/hvm/vmware/Makefile
@@ -1 +1,2 @@
 obj-y += cpuid.o
+obj-y += vmport.o
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
new file mode 100644
index 0000000..811c303
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -0,0 +1,326 @@
+/*
+ * HVM VMPORT emulation
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/support.h>
+#include <asm/hvm/vmport.h>
+
+#include "backdoor_def.h"
+#include "guest_msg_def.h"
+
+#define MAX_INST_LEN 15
+
+#ifndef NDEBUG
+unsigned int opt_vmport_debug __read_mostly;
+integer_param("vmport_debug", opt_vmport_debug);
+#endif
+
+/* More VMware defines */
+
+#define VMWARE_GUI_AUTO_GRAB              0x001
+#define VMWARE_GUI_AUTO_UNGRAB            0x002
+#define VMWARE_GUI_AUTO_SCROLL            0x004
+#define VMWARE_GUI_AUTO_RAISE             0x008
+#define VMWARE_GUI_EXCHANGE_SELECTIONS    0x010
+#define VMWARE_GUI_WARP_CURSOR_ON_UNGRAB  0x020
+#define VMWARE_GUI_FULL_SCREEN            0x040
+
+#define VMWARE_GUI_TO_FULL_SCREEN         0x080
+#define VMWARE_GUI_TO_WINDOW              0x100
+
+#define VMWARE_GUI_AUTO_RAISE_DISABLED    0x200
+
+#define VMWARE_GUI_SYNC_TIME              0x400
+
+/* When set, toolboxes should not show the cursor options page. */
+#define VMWARE_DISABLE_CURSOR_OPTIONS     0x800
+
+void vmport_register(struct domain *d)
+{
+    register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
+}
+
+int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    uint32_t cmd = regs->rcx & 0xffff;
+    uint32_t magic = regs->rax;
+    int rc = X86EMUL_OKAY;
+
+    if ( magic == BDOOR_MAGIC )
+    {
+        uint64_t saved_rax = regs->rax;
+        uint64_t value;
+
+        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
+                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" cmd=%d ax=%"
+                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"
+                       PRIx64" di=%"PRIx64"\n", dir, bytes,
+                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
+                       regs->rdx, regs->rsi, regs->rdi);
+        switch ( cmd )
+        {
+        case BDOOR_CMD_GETMHZ:
+            /* ... */
+            regs->rbx = BDOOR_MAGIC;
+            regs->rax = current->domain->arch.tsc_khz / 1000;
+            break;
+        case BDOOR_CMD_GETVERSION:
+            /* ... */
+            regs->rbx = BDOOR_MAGIC;
+            /* VERSION_MAGIC */
+            regs->rax = 6;
+            /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
+            regs->rcx = 2;
+            break;
+        case BDOOR_CMD_GETSCREENSIZE:
+            /* We have no screen size */
+            regs->rax = 0;
+            break;
+        case BDOOR_CMD_GETHWVERSION:
+            /* vmware_hw */
+            regs->rax = 0;
+            if ( is_hvm_vcpu(current) )
+            {
+                struct hvm_domain *hd = &current->domain->arch.hvm_domain;
+
+                regs->rax = hd->params[HVM_PARAM_VMWARE_HW];
+            }
+            if ( !regs->rax )
+                regs->rax = 4;  /* Act like version 4 */
+            break;
+        case BDOOR_CMD_GETHZ:
+            value = current->domain->arch.tsc_khz * 1000;
+            /* apic-frequency (bus speed) */
+            regs->rcx = (uint32_t)(1000000000ULL / APIC_BUS_CYCLE_NS);
+            /* High part of tsc-frequency */
+            regs->rbx = (uint32_t)(value >> 32);
+            /* Low part of tsc-frequency */
+            regs->rax = value;
+            break;
+        case BDOOR_CMD_GETTIME:
+            value = get_localtime_us(current->domain);
+            /* hostUsecs */
+            regs->rbx = (uint32_t)(value % 1000000UL);
+            /* hostSecs */
+            regs->rax = value / 1000000ULL;
+            /* maxTimeLag */
+            regs->rcx = 0;
+            break;
+        case BDOOR_CMD_GETTIMEFULL:
+            value = get_localtime_us(current->domain);
+            /* ... */
+            regs->rax = BDOOR_MAGIC;
+            /* hostUsecs */
+            regs->rbx = (uint32_t)(value % 1000000UL);
+            /* High part of hostSecs */
+            regs->rsi = (uint32_t)((value / 1000000ULL) >> 32);
+            /* Low part of hostSecs */
+            regs->rdx = (uint32_t)(value / 1000000ULL);
+            /* maxTimeLag */
+            regs->rcx = 0;
+            break;
+        case BDOOR_CMD_GETGUIOPTIONS:
+            regs->rax = VMWARE_GUI_AUTO_GRAB | VMWARE_GUI_AUTO_UNGRAB |
+                VMWARE_GUI_AUTO_RAISE_DISABLED | VMWARE_GUI_SYNC_TIME |
+                VMWARE_DISABLE_CURSOR_OPTIONS;
+            break;
+        case BDOOR_CMD_SETGUIOPTIONS:
+            regs->rax = 0x0;
+            break;
+        default:
+            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+                           "VMware bytes=%d dir=%d cmd=%d",
+                           bytes, dir, cmd);
+            break;
+        }
+        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
+                       "VMware after ip=%"PRIx64" cmd=%d ax=%"PRIx64" bx=%"
+                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64" di=%"
+                       PRIx64"\n",
+                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
+                       regs->rdx, regs->rsi, regs->rdi);
+        if ( dir == IOREQ_READ )
+        {
+            switch ( bytes )
+            {
+            case 1:
+                regs->rax = (saved_rax & 0xffffff00) | (regs->rax & 0xff);
+                break;
+            case 2:
+                regs->rax = (saved_rax & 0xffff0000) | (regs->rax & 0xffff);
+                break;
+            case 4:
+                regs->rax = (uint32_t)regs->rax;
+                break;
+            }
+            *val = regs->rax;
+        }
+        else
+            regs->rax = saved_rax;
+    }
+    else
+    {
+        rc = X86EMUL_UNHANDLEABLE;
+        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
+                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64
+                       " di=%"PRIx64"",
+                       magic, BDOOR_MAGIC, regs->rip, regs->rax, regs->rbx,
+                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+    }
+
+    return rc;
+}
+
+int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
+                    unsigned long *inst_len, unsigned long inst_addr,
+                    unsigned long ei1, unsigned long ei2)
+{
+    ASSERT(v->domain->arch.hvm_domain.is_vmware_port_enabled);
+
+    if ( *inst_len && *inst_len <= MAX_INST_LEN && (regs->rdx & 0xffff) == BDOOR_PORT &&
+         ei1 == 0 && ei2 == 0 && (uint32_t)regs->rax == BDOOR_MAGIC )
+    {
+        int i = 0;
+        uint32_t val;
+        uint32_t byte_cnt = hvm_guest_x86_mode(v);
+        unsigned char bytes[MAX_INST_LEN];
+        unsigned int fetch_len;
+        int frc;
+        int rc;
+
+        /* in or out are limited to 32bits */
+        if ( byte_cnt > 4 )
+            byte_cnt = 4;
+
+        /*
+         * Fetch up to the next page break; we'll fetch from the
+         * next page later if we have to.
+         */
+        fetch_len = min_t(unsigned int, *inst_len,
+                          PAGE_SIZE - (inst_addr  & ~PAGE_MASK));
+        frc = hvm_fetch_from_guest_virt_nofault(bytes, inst_addr, fetch_len,
+                                                PFEC_page_present);
+        if ( frc != HVMCOPY_okay )
+        {
+            gdprintk(XENLOG_WARNING,
+                     "Bad instruction fetch at %#lx (frc=%d il=%lu fl=%u)\n",
+                     (unsigned long) inst_addr, frc, *inst_len, fetch_len);
+            return X86EMUL_VMPORT_FETCH_ERROR_BYTE1;
+        }
+
+        /* Check for operand size prefix */
+        while ( (i < MAX_INST_LEN) && (bytes[i] == 0x66) )
+        {
+            i++;
+            if ( i >= fetch_len )
+            {
+                frc = hvm_fetch_from_guest_virt_nofault(&bytes[fetch_len],
+                                                        inst_addr + fetch_len,
+                                                        MAX_INST_LEN - fetch_len,
+                                                        PFEC_page_present);
+                if ( frc != HVMCOPY_okay )
+                {
+                    gdprintk(XENLOG_WARNING,
+                             "Bad instruction fetch at %#lx + %#x (frc=%d)\n",
+                             inst_addr, fetch_len, frc);
+                    return X86EMUL_VMPORT_FETCH_ERROR_BYTE2;
+                }
+                fetch_len = MAX_INST_LEN;
+            }
+        }
+        *inst_len = i + 1;
+
+        /* Only adjust byte_cnt 1 time */
+        if ( bytes[0] == 0x66 )     /* operand size prefix */
+        {
+            if ( byte_cnt == 4 )
+                byte_cnt = 2;
+            else
+                byte_cnt = 4;
+        }
+        if ( bytes[i] == 0xed )     /* in (%dx),%eax or in (%dx),%ax */
+        {
+            rc = vmport_ioport(IOREQ_READ, BDOOR_PORT, byte_cnt, &val);
+            VMPORT_DBG_LOG(VMPORT_LOG_GP_VMWARE_AFTER,
+                           "gp: VMwareIn  rc=%d ip=%"PRIx64" byte_cnt=%d ax=%"
+                           PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64, rc,
+                           inst_addr, byte_cnt, regs->rax, regs->rbx,
+                           regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+            return rc;
+        }
+        else if ( bytes[i] == 0xec )     /* in (%dx),%al */
+        {
+            rc = vmport_ioport(IOREQ_READ, BDOOR_PORT, 1, &val);
+            VMPORT_DBG_LOG(VMPORT_LOG_GP_VMWARE_AFTER,
+                           "gp: VMwareIn  rc=%d ip=%"PRIx64" byte_cnt=1 ax=%"
+                           PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64, rc,
+                           inst_addr, regs->rax, regs->rbx, regs->rcx,
+                           regs->rdx, regs->rsi, regs->rdi);
+            return rc;
+        }
+        else if ( bytes[i] == 0xef )     /* out %eax,(%dx) or out %ax,(%dx) */
+        {
+            rc = vmport_ioport(IOREQ_WRITE, BDOOR_PORT, byte_cnt, &val);
+            VMPORT_DBG_LOG(VMPORT_LOG_GP_VMWARE_AFTER,
+                           "gp: VMwareOut rc=%d ip=%"PRIx64" byte_cnt=%d ax=%"
+                           PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64, rc,
+                           inst_addr, byte_cnt, regs->rax, regs->rbx,
+                           regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+            return rc;
+        }
+        else if ( bytes[i] == 0xee )     /* out %al,(%dx) */
+        {
+            rc = vmport_ioport(IOREQ_WRITE, BDOOR_PORT, 1, &val);
+            VMPORT_DBG_LOG(VMPORT_LOG_GP_VMWARE_AFTER,
+                           "gp: VMwareOut rc=%d ip=%"PRIx64" byte_cnt=1 ax=%"
+                           PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64, rc,
+                           inst_addr, regs->rax, regs->rbx, regs->rcx,
+                           regs->rdx, regs->rsi, regs->rdi);
+            return rc;
+        }
+        else
+        {
+            VMPORT_DBG_LOG(VMPORT_LOG_GP_FAIL_RD_INST,
+                           "gp: VMware? lip=%"PRIx64"[%d]=>0x%x(%lu) ax=%"
+                           PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64,
+                           inst_addr, i, bytes[i], *inst_len, regs->rax,
+                           regs->rbx, regs->rcx, regs->rdx, regs->rsi,
+                           regs->rdi);
+            *inst_len = 0; /* This is unknown. */
+            return X86EMUL_VMPORT_BAD_OPCODE;
+        }
+    }
+    *inst_len = 0; /* This is unknown. */
+    return X86EMUL_VMPORT_BAD_STATE;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index fc1f882..6fe9389 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1102,6 +1102,8 @@ static int construct_vmcs(struct vcpu *v)
 
     v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
               | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
+              | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
+                 (1U << TRAP_gp_fault) : 0)
               | (1U << TRAP_no_device);
     vmx_update_exception_bitmap(v);
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 84119ed..73f55f2 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -44,6 +44,7 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/vmx/vmx.h>
 #include <asm/hvm/vmx/vmcs.h>
+#include <asm/hvm/vmport.h>
 #include <public/sched.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/vpic.h>
@@ -1276,9 +1277,11 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
                         vmx_set_segment_register(
                             v, s, &v->arch.hvm_vmx.vm86_saved_seg[s]);
                 v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
-                          | (paging_mode_hap(v->domain) ?
-                             0 : (1U << TRAP_page_fault))
-                          | (1U << TRAP_no_device);
+                    | (paging_mode_hap(v->domain) ?
+                       0 : (1U << TRAP_page_fault))
+                    | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
+                       (1U << TRAP_gp_fault) : 0)
+                    | (1U << TRAP_no_device);
                 vmx_update_exception_bitmap(v);
                 vmx_update_debug_state(v);
             }
@@ -2576,6 +2579,67 @@ static void vmx_idtv_reinject(unsigned long idtv_info)
     }
 }
 
+static unsigned long vmx_rip2pointer(struct cpu_user_regs *regs,
+                                     struct vcpu *v)
+{
+    struct segment_register cs;
+    unsigned long p;
+
+    vmx_get_segment_register(v, x86_seg_cs, &cs);
+    p = cs.base + regs->rip;
+    if ( !(cs.attr.fields.l && hvm_long_mode_enabled(v)) )
+        return (uint32_t)p; /* mask to 32 bits */
+    return p;
+}
+
+static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
+                                    struct vcpu *v)
+{
+    unsigned long exit_qualification;
+    unsigned long inst_len;
+    unsigned long inst_addr = vmx_rip2pointer(regs, v);
+    unsigned long ecode;
+    int rc;
+#ifndef NDEBUG
+    unsigned long orig_inst_len;
+    unsigned long vector;
+
+    __vmread(VM_EXIT_INTR_INFO, &vector);
+    BUG_ON(!(vector & INTR_INFO_VALID_MASK));
+    BUG_ON(!(vector & INTR_INFO_DELIVER_CODE_MASK));
+#endif
+
+    __vmread(EXIT_QUALIFICATION, &exit_qualification);
+    __vmread(VM_EXIT_INSTRUCTION_LEN, &inst_len);
+    __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
+
+#ifndef NDEBUG
+    orig_inst_len = inst_len;
+#endif
+    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
+                         ecode, exit_qualification);
+#ifndef NDEBUG
+    if ( inst_len && orig_inst_len != inst_len )
+        gdprintk(XENLOG_WARNING,
+                 "Unexpected instruction length difference: %lu vs %lu\n",
+                 orig_inst_len, inst_len);
+#endif
+    if ( !rc )
+        update_guest_eip();
+    else
+    {
+        VMPORT_DBG_LOG(VMPORT_LOG_GP_UNKNOWN,
+                       "gp: rc=%d ecode=0x%lx eq=0x%lx ec=0x%x ip=%"PRIx64
+                       " (0x%lx,%lu=>%lu) ax=%"PRIx64" bx=%"PRIx64
+                       " cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64" di=%"PRIx64,
+                       rc, ecode, exit_qualification, regs->error_code,
+                       regs->rip, inst_addr, orig_inst_len, inst_len,
+                       regs->rax, regs->rbx, regs->rcx, regs->rdx, regs->rsi,
+                       regs->rdi);
+        hvm_inject_hw_exception(TRAP_gp_fault, ecode);
+    }
+}
+
 static int vmx_handle_apic_write(void)
 {
     unsigned long exit_qualification;
@@ -2686,6 +2750,17 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
                  && vector != TRAP_nmi 
                  && vector != TRAP_machine_check ) 
             {
+#ifndef NDEBUG
+                if ( vector == TRAP_gp_fault )
+                {
+                    VMPORT_DBG_LOG(VMPORT_LOG_REALMODE_GP,
+                                   "realmode gp: ip=%"PRIx64" ax=%"PRIx64
+                                   " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                                   " si=%"PRIx64" di=%"PRIx64,
+                                   regs->rip, regs->rax, regs->rbx, regs->rcx,
+                                   regs->rdx, regs->rsi, regs->rdi);
+                }
+#endif
                 perfc_incr(realmode_exits);
                 v->arch.hvm_vmx.vmx_emulate = 1;
                 HVMTRACE_0D(REALMODE_EMULATE);
@@ -2801,6 +2876,9 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             HVMTRACE_1D(TRAP, vector);
             vmx_fpu_dirty_intercept();
             break;
+        case TRAP_gp_fault:
+            vmx_vmexit_gp_intercept(regs, v);
+            break;
         case TRAP_page_fault:
             __vmread(EXIT_QUALIFICATION, &exit_qualification);
             __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 9ccc03f..51d2336 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -24,6 +24,7 @@
 #include <asm/types.h>
 #include <asm/mtrr.h>
 #include <asm/p2m.h>
+#include <asm/hvm/vmport.h>
 #include <asm/hvm/vmx/vmx.h>
 #include <asm/hvm/vmx/vvmx.h>
 #include <asm/hvm/nestedhvm.h>
@@ -2182,6 +2183,19 @@ int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
             if ( v->fpu_dirtied )
                 nvcpu->nv_vmexit_pending = 1;
         }
+        else if ( vector == TRAP_gp_fault )
+        {
+#ifndef NDEBUG
+            struct cpu_user_regs *ur = guest_cpu_user_regs();
+            VMPORT_DBG_LOG(VMPORT_LOG_VGP_UNKNOWN,
+                           "Unexpected gp: ip=%"PRIx64" ax=%"PRIx64
+                           " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64
+                           " si=%"PRIx64" di=%"PRIx64,
+                           ur->rip, ur->rax, ur->rbx, ur->rcx, ur->rdx,
+                           ur->rsi, ur->rdi);
+#endif
+            nvcpu->nv_vmexit_pending = 1;
+        }
         else if ( (intr_info & valid_mask) == valid_mask )
         {
             exec_bitmap =__get_vvmcs(nvcpu->nv_vvmcx, EXCEPTION_BITMAP);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 8907aac..1307be0 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -541,6 +541,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
              ~(XEN_DOMCTL_CDF_hvm_guest
                | XEN_DOMCTL_CDF_pvh_guest
                | XEN_DOMCTL_CDF_hap
+               | XEN_DOMCTL_CDF_vmware_port
                | XEN_DOMCTL_CDF_s3_integrity
                | XEN_DOMCTL_CDF_oos_off)) )
             break;
@@ -584,6 +585,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             domcr_flags |= DOMCRF_s3_integrity;
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off )
             domcr_flags |= DOMCRF_oos_off;
+        if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_vmware_port )
+            domcr_flags |= DOMCRF_vmware_port;
 
         d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref);
         if ( IS_ERR(d) )
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 30d4aa3..e68a3ae 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -121,6 +121,9 @@ struct hvm_domain {
     spinlock_t             uc_lock;
     bool_t                 is_in_uc_mode;
 
+    /* VMware backdoor port available */
+    bool_t                 is_vmware_port_enabled;
+
     /* Pass-through */
     struct hvm_iommu       hvm_iommu;
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 886a9d6..d257161 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -25,7 +25,7 @@
 #include <public/hvm/ioreq.h>
 #include <public/event_channel.h>
 
-#define MAX_IO_HANDLER             16
+#define MAX_IO_HANDLER             17
 
 #define HVM_PORTIO                  0
 #define HVM_BUFFERED_IO             2
diff --git a/xen/include/asm-x86/hvm/svm/emulate.h b/xen/include/asm-x86/hvm/svm/emulate.h
index ccc2d3c..d9a9dc5 100644
--- a/xen/include/asm-x86/hvm/svm/emulate.h
+++ b/xen/include/asm-x86/hvm/svm/emulate.h
@@ -44,6 +44,7 @@ enum instruction_index {
 
 struct vcpu;
 
+unsigned long svm_rip2pointer(struct vcpu *v);
 int __get_instruction_length_from_list(
     struct vcpu *, const enum instruction_index *, unsigned int list_count);
 
diff --git a/xen/include/asm-x86/hvm/vmport.h b/xen/include/asm-x86/hvm/vmport.h
new file mode 100644
index 0000000..c4f3926
--- /dev/null
+++ b/xen/include/asm-x86/hvm/vmport.h
@@ -0,0 +1,77 @@
+/*
+ * asm/hvm/vmport.h: HVM VMPORT emulation
+ *
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ASM_X86_HVM_VMPORT_H__
+#define ASM_X86_HVM_VMPORT_H__
+
+#ifndef NDEBUG
+
+#define VMPORT_LOG_GP_UNKNOWN      (1 << 0)
+#define VMPORT_LOG_GP_VMWARE_AFTER (1 << 1)
+#define VMPORT_LOG_GP_FAIL_RD_INST (1 << 2)
+#define VMPORT_LOG_VGP_UNKNOWN     (1 << 3)
+#define VMPORT_LOG_REALMODE_GP     (1 << 4)
+
+#define VMPORT_LOG_GP_NOT_VMWARE   (1 << 9)
+
+#define VMPORT_LOG_TRACE           (1 << 16)
+#define VMPORT_LOG_ERROR           (1 << 17)
+#define VMPORT_LOG_VMWARE_AFTER    (1 << 18)
+
+extern unsigned int opt_vmport_debug;
+#define VMPORT_DBG_LOG(level, _f, _a...)                                \
+    do {                                                                \
+        if ( unlikely((level) & opt_vmport_debug) )                     \
+            printk("[HVM:%d.%d] <%s> " _f "\n",                         \
+                   current->domain->domain_id, current->vcpu_id, __func__, \
+                   ## _a);                                              \
+    } while ( 0 )
+#else
+#define VMPORT_DBG_LOG(level, _f, _a...) do {} while ( 0 )
+#endif
+
+void vmport_register(struct domain *d);
+int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val);
+int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
+                    unsigned long *inst_len, unsigned long inst_addr,
+                    unsigned long ei1, unsigned long ei2);
+/*
+ * Additional return values from vmport_gp_check.
+ *
+ * Note: return values include:
+ *   X86EMUL_OKAY
+ *   X86EMUL_UNHANDLEABLE
+ *   X86EMUL_EXCEPTION
+ *   X86EMUL_RETRY
+ *   X86EMUL_CMPXCHG_FAILED
+ *
+ * The additional values do not overlap any of the above.
+ */
+#define X86EMUL_VMPORT_FETCH_ERROR_BYTE1        11
+#define X86EMUL_VMPORT_FETCH_ERROR_BYTE2        12
+#define X86EMUL_VMPORT_BAD_OPCODE               13
+#define X86EMUL_VMPORT_BAD_STATE                14
+
+#endif /* ASM_X86_HVM_VMPORT_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index cfa39b3..92b50ef 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -63,6 +63,9 @@ struct xen_domctl_createdomain {
  /* Is this a PVH guest (as opposed to an HVM or PV guest)? */
 #define _XEN_DOMCTL_CDF_pvh_guest     4
 #define XEN_DOMCTL_CDF_pvh_guest      (1U<<_XEN_DOMCTL_CDF_pvh_guest)
+ /* Is VMware backdoor port available? */
+#define _XEN_DOMCTL_CDF_vmware_port   5
+#define XEN_DOMCTL_CDF_vmware_port    (1U<<_XEN_DOMCTL_CDF_vmware_port)
     uint32_t flags;
 };
 typedef struct xen_domctl_createdomain xen_domctl_createdomain_t;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index c5157e6..d741978 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -546,6 +546,9 @@ struct domain *domain_create(
  /* DOMCRF_pvh: Create PV domain in HVM container. */
 #define _DOMCRF_pvh             5
 #define DOMCRF_pvh              (1U<<_DOMCRF_pvh)
+ /* DOMCRF_vmware_port: Enable use of vmware backdoor port. */
+#define _DOMCRF_vmware_port     6
+#define DOMCRF_vmware_port      (1U<<_DOMCRF_vmware_port)
 
 /*
  * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
-- 
1.8.4


* [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (3 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 04/16] xen: Add vmware_port support Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 13:41   ` Ian Campbell
  2014-09-20 18:07 ` [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Don Slutz
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Add a vmware_port field to libxl_domain_create_info.  When it is set,
libxl passes XEN_DOMCTL_CDF_vmware_port to the xc_domain_create()
routine.

Inside Xen the flag is recorded as is_vmware_port_enabled.  When
is_vmware_port_enabled is set, Xen enables limited support for
VMware's hyper-call.

VMware's hyper-call is also known as the VMware Backdoor I/O Port.
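
For readers unfamiliar with the backdoor convention, the following is a
minimal user-space sketch of how a guest "in" on the backdoor port is
answered.  It is not the hypervisor code: struct guest_regs and
backdoor_in() are hypothetical names, the constants are quoted from
VMware's backdoor_def.h and should be treated as assumptions here, and
only the GETVERSION command is modelled.  The width handling mirrors
the read path of vmport_ioport() in patch 04.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, as found in VMware's backdoor_def.h. */
#define BDOOR_MAGIC          0x564D5868u
#define BDOOR_PORT           0x5658u
#define BDOOR_CMD_GETVERSION 10u

/* Hypothetical stand-in for the guest registers the backdoor uses. */
struct guest_regs {
    uint64_t rax, rbx, rcx, rdx;
};

/*
 * Model an "in" of 1, 2 or 4 bytes from the backdoor port.
 * Returns 0 if the access was handled, -1 if magic or port do not match.
 */
int backdoor_in(struct guest_regs *r, unsigned int bytes)
{
    uint64_t saved_rax = r->rax;
    uint32_t cmd = r->rcx & 0xffff;

    assert(bytes == 1 || bytes == 2 || bytes == 4);

    if ( (uint32_t)r->rax != BDOOR_MAGIC || (r->rdx & 0xffff) != BDOOR_PORT )
        return -1;

    switch ( cmd )
    {
    case BDOOR_CMD_GETVERSION:
        r->rbx = BDOOR_MAGIC;
        r->rax = 6;             /* version reported by the patch */
        r->rcx = 2;             /* product type reported by the patch */
        break;
    default:
        break;                  /* unknown command: result left as is */
    }

    /* Only the bytes actually read by the instruction land in rax. */
    switch ( bytes )
    {
    case 1:
        r->rax = (saved_rax & 0xffffff00) | (r->rax & 0xff);
        break;
    case 2:
        r->rax = (saved_rax & 0xffff0000) | (r->rax & 0xffff);
        break;
    case 4:
        r->rax = (uint32_t)r->rax;
        break;
    }
    return 0;
}
```

A 4-byte read therefore returns the full result in eax, while 1- and
2-byte reads keep the upper part of the magic value that the guest
loaded before the access.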

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 docs/man/xl.cfg.pod.5       | 7 +++++++
 tools/libxl/libxl.h         | 5 +++++
 tools/libxl/libxl_create.c  | 2 ++
 tools/libxl/libxl_types.idl | 1 +
 tools/libxl/xl_cmdimpl.c    | 1 +
 5 files changed, 16 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 367b401..ab645d8 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1164,6 +1164,13 @@ For vssd:VirtualSystemType == vmx-07, vmware_hw = 7.
 
 =back
 
+=item B<vmware_port=BOOLEAN>
+
+Turns the VMware port on or off for the guest.  QEMU calls this
+vmport; it is also known as the VMware Backdoor I/O Port.  Not all
+defined VMware backdoor commands are implemented, but all of the
+ones that the Linux kernel uses are.
+
 =back
 
 =head3 Emulated VGA Graphics Device
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 14048e4..9958355 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -155,6 +155,11 @@
 #define LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HW 1
 
 /*
+ * libxl_domain_create_info has the vmware_port field.
+ */
+#define LIBXL_HAVE_CREATEINFO_VMWARE_PORT 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index da79a18..0fc6830 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -38,6 +38,7 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&c_info->hap, libxl_defbool_val(c_info->pvh));
     }
 
+    libxl_defbool_setdefault(&c_info->vmware_port, false);
     libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true);
     libxl_defbool_setdefault(&c_info->driver_domain, false);
 
@@ -501,6 +502,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_create_info *info,
         flags |= XEN_DOMCTL_CDF_hvm_guest;
         flags |= libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0;
         flags |= libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off;
+        flags |= libxl_defbool_val(info->vmware_port)? XEN_DOMCTL_CDF_vmware_port : 0;
     } else if (libxl_defbool_val(info->pvh)) {
         flags |= XEN_DOMCTL_CDF_pvh_guest;
         if (!libxl_defbool_val(info->hap)) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 907572c..608b64d 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -294,6 +294,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
     ("type",         libxl_domain_type),
     ("hap",          libxl_defbool),
     ("oos",          libxl_defbool),
+    ("vmware_port",  libxl_defbool),
     ("ssidref",      uint32),
     ("ssid_label",   string),
     ("name",         string),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 2119bd6..3eb4494 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -868,6 +868,7 @@ static void parse_config_data(const char *config_source,
     }
 
     xlu_cfg_get_defbool(config, "oos", &c_info->oos, 0);
+    xlu_cfg_get_defbool(config, "vmware_port", &c_info->vmware_port, 0);
 
     if (!xlu_cfg_get_string (config, "pool", &buf, 0))
         xlu_cfg_replace_string(config, "pool", &c_info->pool_name, 0);
-- 
1.8.4


* [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (4 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 05/16] tools: " Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-24 17:27   ` George Dunlap
  2014-09-20 18:07 ` [PATCH for-4.5 v6 07/16] tools: " Don Slutz
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Reduce the VMPORT_DBG_LOG calls.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v6:
      Dropped the attempt to use svm_nextrip_insn_length via
      __get_instruction_length (added in v2).  Just always look
      at up to 15 bytes on AMD.
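
      To illustrate what looking at up to 15 bytes means, here is a
      stand-alone sketch of the decode step.  It is not the code in
      vmport_gp_check(): decode_dx_io(), struct io_decode and enum
      io_kind are hypothetical names, and the sketch assumes the
      bytes have already been fetched into a local buffer.  It skips
      0x66 operand size prefixes, never reads past the bytes it was
      given, and recognises only the four in/out (%dx) opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INST_LEN 15   /* architectural limit on x86 instruction length */

enum io_kind { IO_NONE, IO_IN, IO_OUT };

struct io_decode {
    enum io_kind kind;      /* IO_NONE if no in/out (%dx) was found */
    unsigned int bytes;     /* access width: 1, 2 or 4 */
    unsigned int inst_len;  /* prefixes plus opcode; 0 if unknown */
};

/*
 * buf/avail: fetched instruction bytes (at most MAX_INST_LEN are used).
 * default_bytes: operand size of the current mode, 2 or 4.
 */
struct io_decode decode_dx_io(const uint8_t *buf, size_t avail,
                              unsigned int default_bytes)
{
    struct io_decode d = { IO_NONE, 0, 0 };
    unsigned int bytes = default_bytes;
    size_t i = 0;

    assert(buf != NULL);

    if ( avail > MAX_INST_LEN )
        avail = MAX_INST_LEN;

    /* Skip operand size prefixes without running off the buffer. */
    while ( i < avail && buf[i] == 0x66 )
        i++;
    if ( i == avail )
        return d;           /* only prefixes (or nothing) available */

    /* A prefix toggles the operand size once, however often repeated. */
    if ( i > 0 )
        bytes = (default_bytes == 4) ? 2 : 4;

    switch ( buf[i] )
    {
    case 0xec: d.kind = IO_IN;  d.bytes = 1;     break; /* in (%dx),%al */
    case 0xed: d.kind = IO_IN;  d.bytes = bytes; break; /* in (%dx),%eax/%ax */
    case 0xee: d.kind = IO_OUT; d.bytes = 1;     break; /* out %al,(%dx) */
    case 0xef: d.kind = IO_OUT; d.bytes = bytes; break; /* out %eax/%ax,(%dx) */
    default:
        return d;           /* not a backdoor candidate */
    }

    d.inst_len = (unsigned int)i + 1;
    return d;
}
```

      The returned inst_len is what a caller would use to advance
      the guest instruction pointer once the access is emulated.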

v5:
      exitinfo1 is used twice.
        Fixed.

 xen/arch/x86/hvm/svm/svm.c       | 20 ++++++++++++++---
 xen/arch/x86/hvm/vmware/vmport.c | 48 ++++++++++++++++++++++------------------
 xen/arch/x86/hvm/vmx/vmx.c       | 12 ++++++++++
 xen/include/asm-x86/hvm/trace.h  | 45 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmport.h |  6 -----
 xen/include/public/trace.h       | 12 ++++++++++
 6 files changed, 113 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index ea99dfb..716dda1 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2081,10 +2081,18 @@ static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
      */
     unsigned long inst_len = 15;
     unsigned long inst_addr = svm_rip2pointer(v);
-    int rc;
+    uint32_t starting_rdx = regs->rdx;
+    int rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
+                             vmcb->exitinfo1, vmcb->exitinfo2);
+
+    if ( hvm_long_mode_enabled(v) )
+        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, starting_rdx,
+                           TRC_PAR_LONG(vmcb->exitinfo1),
+                           TRC_PAR_LONG(vmcb->exitinfo2));
+    else
+        HVMTRACE_C4D(TRAP_GP, inst_len, starting_rdx, vmcb->exitinfo1,
+                     vmcb->exitinfo2);
 
-    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
-                         vmcb->exitinfo1, vmcb->exitinfo2);
     if ( !rc )
         __update_guest_eip(regs, inst_len);
     else
@@ -2097,6 +2105,12 @@ static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
                        (unsigned long)vmcb->exitinfo2, regs->error_code,
                        regs->rip, inst_addr, inst_len, regs->rax, regs->rbx,
                        regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+        if ( hvm_long_mode_enabled(v) )
+            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
+                              TRC_PAR_LONG(inst_addr));
+        else
+            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
+                         inst_addr);
         hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
     }
 }
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
index 811c303..962ee32 100644
--- a/xen/arch/x86/hvm/vmware/vmport.c
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -18,6 +18,7 @@
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/vmport.h>
+#include <asm/hvm/trace.h>
 
 #include "backdoor_def.h"
 #include "guest_msg_def.h"
@@ -66,12 +67,15 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
         uint64_t saved_rax = regs->rax;
         uint64_t value;
 
-        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
-                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" cmd=%d ax=%"
-                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"
-                       PRIx64" di=%"PRIx64"\n", dir, bytes,
-                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
-                       regs->rdx, regs->rsi, regs->rdi);
+        if ( dir == IOREQ_READ )
+            HVMTRACE_ND(VMPORT_READ_BEFORE, 0, 1/*cycles*/, 6,
+                        regs->rax, regs->rbx, regs->rcx,
+                        regs->rdx, regs->rsi, regs->rdi);
+        else
+            HVMTRACE_ND(VMPORT_WRITE_AFTER_BEFORE, 0, 1/*cycles*/, 6,
+                        regs->rax, regs->rbx, regs->rcx,
+                        regs->rdx, regs->rsi, regs->rdi);
+
         switch ( cmd )
         {
         case BDOOR_CMD_GETMHZ:
@@ -143,19 +147,17 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
             regs->rax = 0x0;
             break;
         default:
-            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
-                           "VMware bytes=%d dir=%d cmd=%d",
-                           bytes, dir, cmd);
+            HVMTRACE_ND(VMPORT_UNKNOWN, 0, 1/*cycles*/, 6,
+                        (bytes << 8) + dir, cmd, regs->rbx,
+                        regs->rcx, regs->rsi, regs->rdi);
             break;
         }
-        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
-                       "VMware after ip=%"PRIx64" cmd=%d ax=%"PRIx64" bx=%"
-                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64" di=%"
-                       PRIx64"\n",
-                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
-                       regs->rdx, regs->rsi, regs->rdi);
+
         if ( dir == IOREQ_READ )
         {
+            HVMTRACE_ND(VMPORT_READ_AFTER, 0, 1/*cycles*/, 6,
+                        regs->rax, regs->rbx, regs->rcx,
+                        regs->rdx, regs->rsi, regs->rdi);
             switch ( bytes )
             {
             case 1:
@@ -171,17 +173,21 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
             *val = regs->rax;
         }
         else
+        {
+            HVMTRACE_ND(VMPORT_WRITE_AFTER, 0, 1/*cycles*/, 6,
+                        regs->rax, regs->rbx, regs->rcx,
+                        regs->rdx, regs->rsi, regs->rdi);
             regs->rax = saved_rax;
+        }
     }
     else
     {
+        if ( hvm_long_mode_enabled(current) )
+            HVMTRACE_LONG_C4D(VMPORT_BAD, dir, bytes, regs->rax,
+                              TRC_PAR_LONG(regs->rip));
+        else
+            HVMTRACE_C4D(VMPORT_BAD, dir, bytes, regs->rax, regs->rip);
         rc = X86EMUL_UNHANDLEABLE;
-        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
-                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
-                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64
-                       " di=%"PRIx64"",
-                       magic, BDOOR_MAGIC, regs->rip, regs->rax, regs->rbx,
-                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
     }
 
     return rc;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 73f55f2..5395028 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2613,6 +2613,12 @@ static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
     __vmread(VM_EXIT_INSTRUCTION_LEN, &inst_len);
     __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
 
+    if ( hvm_long_mode_enabled(v) )
+        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, regs->rdx, TRC_PAR_LONG(ecode),
+                           TRC_PAR_LONG(exit_qualification));
+    else
+        HVMTRACE_C4D(TRAP_GP, inst_len, regs->rdx, ecode, exit_qualification);
+
 #ifndef NDEBUG
     orig_inst_len = inst_len;
 #endif
@@ -2636,6 +2642,12 @@ static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
                        regs->rip, inst_addr, orig_inst_len, inst_len,
                        regs->rax, regs->rbx, regs->rcx, regs->rdx, regs->rsi,
                        regs->rdi);
+        if ( hvm_long_mode_enabled(v) )
+            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
+                              TRC_PAR_LONG(inst_addr));
+        else
+            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
+                         inst_addr);
         hvm_inject_hw_exception(TRAP_gp_fault, ecode);
     }
 }
diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
index de802a6..8af2d6a 100644
--- a/xen/include/asm-x86/hvm/trace.h
+++ b/xen/include/asm-x86/hvm/trace.h
@@ -52,8 +52,20 @@
 #define DO_TRC_HVM_LMSW64      DEFAULT_HVM_MISC
 #define DO_TRC_HVM_REALMODE_EMULATE DEFAULT_HVM_MISC 
 #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
+#define DO_TRC_HVM_TRAP64           DEFAULT_HVM_MISC
 #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
 #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
+#define DO_TRC_HVM_TRAP_GP          DEFAULT_HVM_MISC
+#define DO_TRC_HVM_TRAP_GP64        DEFAULT_HVM_MISC
+#define DO_TRC_HVM_TRAP_GP_UNKNOWN  DEFAULT_HVM_MISC
+#define DO_TRC_HVM_TRAP_GP_UNKNOWN64 DEFAULT_HVM_MISC
+#define DO_TRC_HVM_VMPORT_READ_BEFORE DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_WRITE_AFTER_BEFORE DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_READ_AFTER DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_WRITE_AFTER DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_BAD         DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_BAD64       DEFAULT_HVM_IO
+#define DO_TRC_HVM_VMPORT_UNKNOWN     DEFAULT_HVM_IO
 
 
 #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
@@ -98,6 +110,21 @@
 #define HVMTRACE_0D(evt)                            \
     HVMTRACE_ND(evt, 0, 0, 0,  0,  0,  0,  0,  0,  0)
 
+#define HVMTRACE_C6D(evt, d1, d2, d3, d4, d5, d6)    \
+    HVMTRACE_ND(evt, 0, 1, 6, d1, d2, d3, d4, d5, d6)
+#define HVMTRACE_C5D(evt, d1, d2, d3, d4, d5)        \
+    HVMTRACE_ND(evt, 0, 1, 5, d1, d2, d3, d4, d5,  0)
+#define HVMTRACE_C4D(evt, d1, d2, d3, d4)            \
+    HVMTRACE_ND(evt, 0, 1, 4, d1, d2, d3, d4,  0,  0)
+#define HVMTRACE_C3D(evt, d1, d2, d3)                \
+    HVMTRACE_ND(evt, 0, 1, 3, d1, d2, d3,  0,  0,  0)
+#define HVMTRACE_C2D(evt, d1, d2)                    \
+    HVMTRACE_ND(evt, 0, 1, 2, d1, d2,  0,  0,  0,  0)
+#define HVMTRACE_C1D(evt, d1)                        \
+    HVMTRACE_ND(evt, 0, 1, 1, d1,  0,  0,  0,  0,  0)
+#define HVMTRACE_C0D(evt)                            \
+    HVMTRACE_ND(evt, 0, 1, 0,  0,  0,  0,  0,  0,  0)
+
 #define HVMTRACE_LONG_1D(evt, d1)                  \
                    HVMTRACE_2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) >> 32)
 #define HVMTRACE_LONG_2D(evt, d1, d2, ...)              \
@@ -107,6 +134,24 @@
 #define HVMTRACE_LONG_4D(evt, d1, d2, d3, d4, ...)  \
                    HVMTRACE_5D(evt ## 64, d1, d2, d3, d4)
 
+#define HVMTRACE_LONG_C1D(evt, d1)                  \
+                   HVMTRACE_C2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) >> 32)
+#define HVMTRACE_LONG_C2D(evt, d1, d2, ...)              \
+                   HVMTRACE_C3D(evt ## 64, d1, d2)
+#define HVMTRACE_LONG_C3D(evt, d1, d2, d3, ...)      \
+                   HVMTRACE_C4D(evt ## 64, d1, d2, d3)
+#define HVMTRACE_LONG_C4D(evt, d1, d2, d3, d4, ...)  \
+                   HVMTRACE_C5D(evt ## 64, d1, d2, d3, d4)
+#define HVMTRACE_LONG_C5D(evt, d1, d2, d3, d4, d5, ...) \
+                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4, d5)
+
+#define HVMTRACE_LONG2_C2D(evt, d1, d2, ...)              \
+                   HVMTRACE_C4D(evt ## 64, d1, d2)
+#define HVMTRACE_LONG2_C3D(evt, d1, d2, d3, ...)      \
+                   HVMTRACE_C5D(evt ## 64, d1, d2, d3)
+#define HVMTRACE_LONG2_C4D(evt, d1, d2, d3, d4, ...)  \
+                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4)
+
 #endif /* __ASM_X86_HVM_TRACE_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vmport.h b/xen/include/asm-x86/hvm/vmport.h
index c4f3926..401cbf4 100644
--- a/xen/include/asm-x86/hvm/vmport.h
+++ b/xen/include/asm-x86/hvm/vmport.h
@@ -25,12 +25,6 @@
 #define VMPORT_LOG_VGP_UNKNOWN     (1 << 3)
 #define VMPORT_LOG_REALMODE_GP     (1 << 4)
 
-#define VMPORT_LOG_GP_NOT_VMWARE   (1 << 9)
-
-#define VMPORT_LOG_TRACE           (1 << 16)
-#define VMPORT_LOG_ERROR           (1 << 17)
-#define VMPORT_LOG_VMWARE_AFTER    (1 << 18)
-
 extern unsigned int opt_vmport_debug;
 #define VMPORT_DBG_LOG(level, _f, _a...)                                \
     do {                                                                \
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index cfcf4aa..ae3613c 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -224,11 +224,23 @@
 #define TRC_HVM_NPF             (TRC_HVM_HANDLER + 0x21)
 #define TRC_HVM_REALMODE_EMULATE (TRC_HVM_HANDLER + 0x22)
 #define TRC_HVM_TRAP             (TRC_HVM_HANDLER + 0x23)
+#define TRC_HVM_TRAP64           (TRC_HVM_HANDLER + TRC_64_FLAG + 0x23)
 #define TRC_HVM_TRAP_DEBUG       (TRC_HVM_HANDLER + 0x24)
 #define TRC_HVM_VLAPIC           (TRC_HVM_HANDLER + 0x25)
+#define TRC_HVM_TRAP_GP          (TRC_HVM_HANDLER + 0x26)
+#define TRC_HVM_TRAP_GP64        (TRC_HVM_HANDLER + TRC_64_FLAG + 0x26)
+#define TRC_HVM_TRAP_GP_UNKNOWN  (TRC_HVM_HANDLER + 0x27)
+#define TRC_HVM_TRAP_GP_UNKNOWN64 (TRC_HVM_HANDLER + TRC_64_FLAG + 0x27)
+#define TRC_HVM_VMPORT_READ_BEFORE (TRC_HVM_HANDLER + 0x28)
+#define TRC_HVM_VMPORT_READ_AFTER (TRC_HVM_HANDLER + 0x29)
+#define TRC_HVM_VMPORT_BAD       (TRC_HVM_HANDLER + 0x2a)
+#define TRC_HVM_VMPORT_BAD64     (TRC_HVM_HANDLER + TRC_64_FLAG + 0x2a)
+#define TRC_HVM_VMPORT_UNKNOWN   (TRC_HVM_HANDLER + 0x2b)
 
 #define TRC_HVM_IOPORT_WRITE    (TRC_HVM_HANDLER + 0x216)
 #define TRC_HVM_IOMEM_WRITE     (TRC_HVM_HANDLER + 0x217)
+#define TRC_HVM_VMPORT_WRITE_AFTER_BEFORE (TRC_HVM_HANDLER + 0x228)
+#define TRC_HVM_VMPORT_WRITE_AFTER (TRC_HVM_HANDLER + 0x229)
 
 /* Trace events for emulated devices */
 #define TRC_HVM_EMUL_HPET_START_TIMER  (TRC_HVM_EMUL + 0x1)
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH for-4.5 v6 07/16] tools: Convert vmware_port to xentrace usage
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (5 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-25 15:18   ` George Dunlap
  2014-09-20 18:07 ` [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc Don Slutz
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

Also added missing TRAP_DEBUG & VLAPIC.
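
The 64-bit entries added to the formats file (for example TRAP_GP
at 0x00082126) print each long value as two 32-bit words, high word
first, because TRC_PAR_LONG() in the hypervisor patch emits the low
word first.  A minimal sketch of that round trip follows.  The
sketch_* helpers are hypothetical names used only for illustration
and are not part of this series:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Split a 64-bit value the way TRC_PAR_LONG() does: low word, then high. */
static void sketch_split64(uint64_t v, uint32_t *lo, uint32_t *hi)
{
    *lo = (uint32_t)(v & 0xFFFFFFFFu);
    *hi = (uint32_t)(v >> 32);
}

/*
 * Print the pair the way a "0x%(4)08x%(3)08x" format entry would:
 * the later (high) word comes first so the digits read as one number.
 */
static int sketch_format64(char *buf, size_t len, uint32_t lo, uint32_t hi)
{
    return snprintf(buf, len, "0x%08" PRIx32 "%08" PRIx32, hi, lo);
}
```

This is why the 64-bit TRAP_GP line names fields (4) and (6) before
(3) and (5).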

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v5:
      'bytes = 0x%(2)d' or 'bytes = %(2)d' ?
        Fixed.

 tools/xentrace/formats | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index da658bf..7b21b22 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -79,6 +79,19 @@
 0x00082020  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  INTR_WINDOW [ value = 0x%(1)08x ]
 0x00082021  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  NPF         [ gpa = 0x%(2)08x%(1)08x mfn = 0x%(4)08x%(3)08x qual = 0x%(5)04x p2mt = 0x%(6)04x ]
 0x00082023  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP        [ vector = 0x%(1)02x ]
+0x00082024  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_DEBUG  [ exit_qualification = 0x%(1)08x ]
+0x00082025  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VLAPIC
+0x00082026  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_GP     [ inst_len = %(1)d edx = 0x%(2)08x exitinfo1 = 0x%(3)08x exitinfo2 = 0x%(4)08x ]
+0x00082126  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_GP     [ inst_len = %(1)d edx = 0x%(2)08x exitinfo1 = 0x%(4)08x%(3)08x exitinfo2 = 0x%(6)08x%(5)08x ]
+0x00082027  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_GP_UNKNOWN [ rc = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x inst_addr = 0x%(5)08x ]
+0x00082127  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_GP_UNKNOWN [ rc = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x inst_addr = 0x%(6)08x%(5)08x ]
+0x00082028  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_READ_BEFORE  [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
+0x00082228  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_WRITE_BEFORE [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
+0x00082029  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_READ_AFTER  [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
+0x00082229  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_WRITE_AFTER [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
+0x0008202a  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_BAD [ dir = %(1)d bytes = %(2)d eax = 0x%(3)08x eip = 0x%(4)08x ]
+0x0008212a  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_BAD [ dir = %(1)d bytes = %(2)d eax = 0x%(3)08x rip = 0x%(5)08x%(4)08x ]
+0x0008202b  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_UNKNOWN [ bytes << 8 + dir = 0x%(1)03x cmd = 0x%(2)x cmd = %(2)d ebx = 0x%(3)08x ecx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
 
 0x0010f001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_map      [ domid = %(1)d ]
 0x0010f002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_unmap    [ domid = %(1)d ]
-- 
1.8.4

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (6 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 07/16] tools: " Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 13:47   ` Ian Campbell
  2014-09-20 18:07 ` [PATCH for-4.5 v6 09/16] tools: " Don Slutz
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

The support included is enough to allow VMware tools to install in a
HVM domU and provide guestinfo support.  guestinfo support is
provided by what is known as VMware RPC support.  This guestinfo
support is provided via libxc.  libxl support has not been written.

Note: VMware RPC support is only available on HVM domU.

This interface is an extension of __HYPERVISOR_HVM_op.  It was
picked because xc_get_hvm_param() also uses it and VMware guest
info is a lot like a hvm param.

The HVMOP_get_vmport_guest_info is used by two libxc functions,
xc_get_vmport_guest_info and xc_fetch_all_vmport_guest_info.
xc_fetch_all_vmport_guest_info is designed to be used to fetch all
currently set guestinfo values.

To save on hypervisor heap memory, the guestinfo support is done in
two sizes, normal and jumbo.  Normal is used to handle up to 128
byte values and jumbo is used to handle up to 4096 byte values.
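
As a rough illustration of the two sizes, the choice could be
sketched as below.  The SKETCH_* limits and the sketch_pick_slot()
helper are hypothetical; they only mirror the 128 and 4096 byte
limits quoted above and are not the names used in the patch:

```c
#include <stddef.h>

/* Hypothetical limits, taken from the sizes quoted in the text. */
#define SKETCH_NORMAL_MAX 128
#define SKETCH_JUMBO_MAX  4096

enum sketch_slot { SKETCH_NORMAL, SKETCH_JUMBO, SKETCH_TOO_LONG };

/* Pick the smallest slot class that can hold a value of val_len bytes. */
static enum sketch_slot sketch_pick_slot(size_t val_len)
{
    if ( val_len <= SKETCH_NORMAL_MAX )
        return SKETCH_NORMAL;
    if ( val_len <= SKETCH_JUMBO_MAX )
        return SKETCH_JUMBO;
    return SKETCH_TOO_LONG;
}
```

Keeping most values in the small class is what saves heap: only the
rare large value pays for a 4096 byte buffer.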

Since all this work is done while the guest is executing a single
instruction, it was designed not to allocate memory from the
hypervisor heap at that time.  Instead a few buffers are allocated
at domain creation time and during Xen's hyper-call to get or set
them.  This was picked because a tool stack that is using the VMware
guest info support should be using either or both of get and set.
So in this case the guest should only see an out of memory error
when the compile-time maximum amount of hypervisor heap memory is
in use.

Doing it this way does lead to a lot of pointer use and many
sub structures.

If the domU is running VMware tools, then the "build version" of
the tools is also available via xc_get_hvm_param().  This also
enables the use of new triggers that will use the VMware hyper-call
to do some limited control of the domU.  The most useful are
poweroff and reboot.  Since a guest process needs to be running
for these to work, a tool stack should check that the build version
is non-zero before assuming these will work.

The 2 hvm params HVM_PARAM_VMPORT_BUILD_NUMBER_TIME and
HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE are how "build version" is
accessed.  These 2 params are only allowed to be set to zero.  The
HVM_PARAM_VMPORT_BUILD_NUMBER_TIME can be used to track the last
time the VMware tools in the guest responded.  One such use would
be to monitor the health of the tools in the guest.  The hvm param
HVM_PARAM_VMPORT_RESET_TIME controls how often to request them, in
seconds minus 1.  The minus 1 is to handle the 0 case, i.e. the
fastest that can be selected is every second.  The default is 4
times a minute.
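
The "seconds minus 1" encoding can be sketched as follows.  The
helper name is hypothetical, and the value 14 used in the note after
the code is only inferred from "4 times a minute" (one request every
15 seconds); it is not quoted from the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert the stored param ("seconds minus 1")
 * into the interval in seconds.  Saturates instead of wrapping to 0.
 */
static uint64_t sketch_reset_interval_secs(uint64_t param)
{
    if ( param == UINT64_MAX )
        return UINT64_MAX;
    return param + 1;
}
```

Under this encoding a param of 0 means every second, and a param of
14 would give 15 seconds, i.e. 4 requests a minute.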

The VMware RPC support includes the notion of channels that are
opened, active and closed.  All RPC messages sent via a channel
start with normal ASCII text.  A message sometimes also includes
binary data.

Currently there are 2 protocols defined for VMware RPC.  They
determine the direction for data flow, domU to tool stack or
tool stack to domU.
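
The two protocol numbers defined in vmport_rpc.c are four ASCII
characters packed into a 32-bit value.  A small sketch that decodes
them, least significant byte first, is below.  sketch_proto_tag() is
a hypothetical helper for illustration only; the two constants are
copied from the patch:

```c
#include <stdint.h>

#define VMWARE_PROTO_TO_GUEST        0x4f4c4354
#define VMWARE_PROTO_FROM_GUEST      0x49435052

/* Unpack the four bytes of proto, least significant first, into tag. */
static void sketch_proto_tag(uint32_t proto, char tag[5])
{
    unsigned int i;

    for ( i = 0; i < 4; i++ )
        tag[i] = (char)((proto >> (8 * i)) & 0xff);
    tag[4] = '\0';
}
```

Decoded this way, VMWARE_PROTO_TO_GUEST reads "TCLO" and
VMWARE_PROTO_FROM_GUEST reads "RPCI".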

There is no provided interrupt for VMware RPC.

For a debug=y build there is a new command line option
vmport_debug=.  It enables output to the console of various
stages of handling the "IN EAX, DX" instruction.  Most uses
are the summary ones that show complete RPC actions.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v5:
      PV vs. HVM vs. PVH. So probably 'if(is_hvm_vcpu)'?
        I see no reason to exclude PVH.   Will change to has_hvm_container_vcpu
      The names of all three functions are bogus.
        removed static support routines.

 xen/arch/x86/hvm/hvm.c                       |   43 +
 xen/arch/x86/hvm/vmware/Makefile             |    1 +
 xen/arch/x86/hvm/vmware/vmport.c             |    7 +
 xen/arch/x86/hvm/vmware/vmport_rpc.c         | 1273 ++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/domain.h             |    4 +
 xen/include/asm-x86/hvm/vmport.h             |   17 +
 xen/include/public/arch-x86/hvm/vmporttype.h |  118 +++
 xen/include/public/hvm/hvm_op.h              |   18 +
 xen/include/public/hvm/params.h              |    5 +-
 9 files changed, 1485 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/x86/hvm/vmware/vmport_rpc.c
 create mode 100644 xen/include/public/arch-x86/hvm/vmporttype.h

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c583179..cda158e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1500,7 +1500,12 @@ int hvm_domain_initialise(struct domain *d)
     d->arch.hvm_domain.io_handler->num_slot = 0;
 
     if ( d->arch.hvm_domain.is_vmware_port_enabled )
+    {
         vmport_register(d);
+        rc = vmport_rpc_init(d);
+        if ( rc != 0 )
+            goto fail1;
+    }
 
     if ( is_pvh_domain(d) )
     {
@@ -1537,6 +1542,7 @@ int hvm_domain_initialise(struct domain *d)
     stdvga_deinit(d);
     vioapic_deinit(d);
  fail1:
+    vmport_rpc_deinit(d);
     xfree(d->arch.hvm_domain.io_handler);
     xfree(d->arch.hvm_domain.params);
  fail0:
@@ -6142,6 +6148,43 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case HVMOP_get_vmport_guest_info:
+    case HVMOP_set_vmport_guest_info:
+    {
+        struct xen_hvm_vmport_guest_info a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        rc = vmport_rpc_hvmop_precheck(op, &a);
+        if ( rc )
+            return rc;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( !is_hvm_domain(d) )
+            goto param_fail9;
+
+        rc = xsm_hvm_param(XSM_TARGET, d, op);
+        if ( rc )
+            goto param_fail9;
+
+        rc = vmport_rpc_hvmop_do(d, op, &a);
+        if ( rc )
+            goto param_fail9;
+
+        if ( op == HVMOP_get_vmport_guest_info )
+            rc = copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+
+    param_fail9:
+        rcu_unlock_domain(d);
+        break;
+    }
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
index cd8815b..4a14124 100644
--- a/xen/arch/x86/hvm/vmware/Makefile
+++ b/xen/arch/x86/hvm/vmware/Makefile
@@ -1,2 +1,3 @@
 obj-y += cpuid.o
 obj-y += vmport.o
+obj-y += vmport_rpc.o
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
index 962ee32..da7f752 100644
--- a/xen/arch/x86/hvm/vmware/vmport.c
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -138,6 +138,13 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
             /* maxTimeLag */
             regs->rcx = 0;
             break;
+        case BDOOR_CMD_MESSAGE:
+            if ( has_hvm_container_vcpu(current) )
+            {
+                /* Only supported for non pv domains */
+                vmport_rpc(&current->domain->arch.hvm_domain, regs);
+            }
+            break;
         case BDOOR_CMD_GETGUIOPTIONS:
             regs->rax = VMWARE_GUI_AUTO_GRAB | VMWARE_GUI_AUTO_UNGRAB |
                 VMWARE_GUI_AUTO_RAISE_DISABLED | VMWARE_GUI_SYNC_TIME |
diff --git a/xen/arch/x86/hvm/vmware/vmport_rpc.c b/xen/arch/x86/hvm/vmware/vmport_rpc.c
new file mode 100644
index 0000000..ed779b4
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/vmport_rpc.c
@@ -0,0 +1,1273 @@
+/*
+ * HVM VMPORT RPC emulation
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * VMware Tools running in a DOMU will do "info-get" and "info-set"
+ * guestinfo commands to get and set keys and values. Inside the VM,
+ * vmtools at its lower level will feed the command string 4 bytes
+ * at a time into the VMWARE magic port using the IN
+ * instruction. Each 4 byte mini-rpc will get handled
+ * vmport_io()-->vmport_rpc()-->vmport_process_packet()-->
+ * vmport_process_send_payload()-->vmport_send() and the command
+ * string will get accumulated into a channels send_buffer.  When
+ * the full length of the string has been accumulated, then this
+ * code copies the send_buffer into a free
+ * vmport_state->channel-->receive_bucket.buffer
+ * VMware tools then does RECVSIZE and RECVPAYLOAD messages, the
+ * latter then reads 4 bytes at a time using the IN instruction (for
+ * the info-get case).  Then a final RECVSTATUS message is sent to
+ * finish up
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/cper.h>
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/support.h>
+#include <asm/hvm/vmport.h>
+#include <asm/hvm/trace.h>
+
+#include "public/arch-x86/hvm/vmporttype.h"
+
+#include "backdoor_def.h"
+#include "guest_msg_def.h"
+
+
+#define VMWARE_PROTO_TO_GUEST        0x4f4c4354
+#define VMWARE_PROTO_FROM_GUEST      0x49435052
+
+#define GUESTINFO_NOTFOUND      500
+#define GUESTINFO_VALTOOLONG    1
+#define GUESTINFO_KEYTOOLONG    2
+#define GUESTINFO_TOOMANYKEYS   3
+
+
+static inline void set_status(struct cpu_user_regs *ur, uint16_t val)
+{
+    /* VMware defines this to be only 32 bits */
+    ur->rcx = (val << 16) | (ur->rcx & 0xffff);
+}
+
+#ifndef NDEBUG
+static void vmport_safe_print(char *prefix, int len, const char *msg)
+{
+    unsigned char c;
+    unsigned int end = len;
+    unsigned int i, k;
+    char out[4 * (VMPORT_MAX_SEND_BUF + 1) * 3 + 6];
+
+    if ( end > (sizeof(out) / 3 - 6) )
+        end = sizeof(out) / 3 - 6;
+    out[0] = '<';
+    k = 1;
+    for ( i = 0; i < end; i++ )
+    {
+        c = msg[i];
+        if ( (c == '^') || (c == '\\') || (c == '>') )
+        {
+            out[k++] = '\\';
+            out[k++] = c;
+        }
+        else if ( (c >= ' ') && (c <= '~') )
+            out[k++] = c;
+        else if ( c < ' ' )
+        {
+            out[k++] = '^';
+            out[k++] = c ^ 0x40;
+        }
+        else
+        {
+            snprintf(&out[k], sizeof(out) - k, "\\%02x", c);
+            k += 3;
+        }
+    }
+    out[k++] = '>';
+    if ( len > end )
+    {
+        out[k++] = '.';
+        out[k++] = '.';
+        out[k++] = '.';
+    }
+    out[k++] = 0;
+    gdprintk(XENLOG_DEBUG, "%s%d(%d,%d,%zu)%s\n", prefix, end, len, k,
+             sizeof(out), out);
+}
+#endif
+
+/*
+ * Copy message into a jumbo bucket buffer which vmtools will use to
+ * read from 4 bytes at a time until done with it
+ */
+static void vmport_send_jumbo(struct hvm_domain *hd, vmport_channel_t *c,
+                              const char *msg)
+{
+    unsigned int cur_recv_len = strlen(msg) + 1;
+    vmport_jumbo_bucket_t *b = &(c->jumbo_recv_bkt);
+
+    b->ctl.recv_len = cur_recv_len;
+    b->ctl.recv_idx = 0;
+
+    memset(b->recv_buf, 0, sizeof(b->recv_buf));
+
+    if ( cur_recv_len >= (sizeof(b->recv_buf) - 1) )
+    {
+        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+                       "VMware jumbo recv_len=%d >= %ld",
+                       cur_recv_len, sizeof(b->recv_buf) - 1);
+        cur_recv_len = sizeof(b->recv_buf) - 1;
+    }
+
+    memcpy(b->recv_buf, msg, cur_recv_len);
+
+    c->ctl.jumbo = 1;
+}
+
+/*
+ * Copy message into a free receive bucket buffer which vmtools will use to
+ * read from 4 bytes at a time until done with it
+ */
+static void vmport_send_normal(struct hvm_domain *hd, vmport_channel_t *c,
+                               const char *msg)
+{
+    unsigned int cur_recv_len = strlen(msg) + 1;
+    unsigned int my_bkt = c->ctl.recv_write;
+    unsigned int next_bkt = my_bkt + 1;
+    vmport_bucket_t *b;
+
+    if ( next_bkt >= VMPORT_MAX_BKTS )
+        next_bkt = 0;
+
+    if ( next_bkt == c->ctl.recv_read )
+    {
+#ifndef NDEBUG
+        if ( opt_vmport_debug & VMPORT_LOG_SKIP_SEND )
+        {
+            char prefix[30];
+
+            snprintf(prefix, sizeof(prefix),
+                     "VMware _send skipped %d (%d, %d) ",
+                     c->ctl.chan_id, my_bkt, c->ctl.recv_read);
+            prefix[sizeof(prefix) - 1] = 0;
+            vmport_safe_print(prefix, cur_recv_len, msg);
+        }
+#endif
+        return;
+    }
+
+    c->ctl.recv_write = next_bkt;
+    b = &c->recv_bkt[my_bkt];
+#ifndef NDEBUG
+    if ( opt_vmport_debug & VMPORT_LOG_SEND )
+    {
+        char prefix[30];
+
+        snprintf(prefix, sizeof(prefix), "VMware _send %d (%d) ",
+                 c->ctl.chan_id, my_bkt);
+        prefix[sizeof(prefix) - 1] = 0;
+        vmport_safe_print(prefix, cur_recv_len, msg);
+    }
+#endif
+
+    b->ctl.recv_len = cur_recv_len;
+    b->ctl.recv_idx = 0;
+    memset(b->recv_buf, 0, sizeof(b->recv_buf));
+    if ( cur_recv_len >= (sizeof(b->recv_buf) - 1) )
+    {
+        VMPORT_DBG_LOG(VMPORT_LOG_ERROR, "VMware recv_len=%d >= %zd",
+                       cur_recv_len, sizeof(b->recv_buf) - 1);
+        cur_recv_len = sizeof(b->recv_buf) - 1;
+    }
+    memcpy(b->recv_buf, msg, cur_recv_len);
+}
+
+static void vmport_send(struct hvm_domain *hd, vmport_channel_t *c,
+                        const char *msg)
+{
+    unsigned int cur_recv_len = strlen(msg) + 1;
+
+    if ( cur_recv_len > VMPORT_MAX_VAL_LEN )
+        vmport_send_jumbo(hd, c, msg);
+    else
+        vmport_send_normal(hd, c, msg);
+}
+
+void vmport_ctrl_send(struct hvm_domain *hd, char *msg)
+{
+    struct vmport_state *vs = hd->vmport_data;
+    unsigned int i;
+
+    if ( !hd->vmport_data )
+        return;
+    hd->vmport_data->ping_time = get_sec();
+    spin_lock(&hd->vmport_lock);
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        if ( vs->chans[i].ctl.proto_num == VMWARE_PROTO_TO_GUEST )
+            vmport_send(hd, &vs->chans[i], msg);
+    }
+    spin_unlock(&hd->vmport_lock);
+}
+
+static void vmport_flush(struct hvm_domain *hd)
+{
+    spin_lock(&hd->vmport_lock);
+    memset(&hd->vmport_data->chans, 0, sizeof(hd->vmport_data->chans));
+    spin_unlock(&hd->vmport_lock);
+}
+
+static void vmport_sweep(struct hvm_domain *hd, unsigned long now_time)
+{
+    struct vmport_state *vs = hd->vmport_data;
+    unsigned int i;
+
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        if ( vs->chans[i].ctl.proto_num )
+        {
+            vmport_channel_t *c = &vs->chans[i];
+            long delta = now_time - c->ctl.active_time;
+
+            if ( delta >= 80 )
+            {
+                VMPORT_DBG_LOG(VMPORT_LOG_SWEEP, "VMware flush %d. delta=%ld",
+                               c->ctl.chan_id, delta);
+                /* Return channel to free pool */
+                c->ctl.proto_num = 0;
+            }
+        }
+    }
+}
+
+static vmport_channel_t *vmport_new_chan(struct vmport_state *vs,
+                                         unsigned long now_time)
+{
+    unsigned int i;
+
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        if ( !vs->chans[i].ctl.proto_num )
+        {
+            vmport_channel_t *c = &vs->chans[i];
+
+            c->ctl.chan_id = i;
+            c->ctl.cookie = vs->open_cookie++;
+            c->ctl.active_time = now_time;
+            c->ctl.send_len = 0;
+            c->ctl.send_idx = 0;
+            c->ctl.recv_read = 0;
+            c->ctl.recv_write = 0;
+            return c;
+        }
+    }
+    return NULL;
+}
+
+static void vmport_process_send_size(struct hvm_domain *hd, vmport_channel_t *c,
+                                     struct cpu_user_regs *ur)
+{
+    /* vmware tools often send a 0 byte request size. */
+    c->ctl.send_len = ur->rbx;
+    c->ctl.send_idx = 0;
+
+    set_status(ur, MESSAGE_STATUS_SUCCESS);
+}
+
+/* ret_buffer is in/out param */
+static int vmport_get_guestinfo(struct hvm_domain *hd, struct vmport_state *vs,
+                                char *a_info_key, unsigned int a_key_len,
+                                char *ret_buffer, unsigned int ret_buffer_len)
+{
+    unsigned int i;
+
+    for ( i = 0; i < vs->used_guestinfo; i++ )
+    {
+        if ( vs->guestinfo[i] &&
+             (vs->guestinfo[i]->key_len == a_key_len) &&
+             (memcmp(a_info_key, vs->guestinfo[i]->key_data,
+                     vs->guestinfo[i]->key_len) == 0) )
+        {
+            snprintf(ret_buffer, ret_buffer_len - 1, "1 %.*s",
+                     (int)vs->guestinfo[i]->val_len,
+                     vs->guestinfo[i]->val_data);
+            return i;
+        }
+    }
+
+    for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+    {
+        if ( vs->guestinfo_jumbo[i] &&
+             (vs->guestinfo_jumbo[i]->key_len == a_key_len) &&
+             (memcmp(a_info_key, vs->guestinfo_jumbo[i]->key_data,
+                     vs->guestinfo_jumbo[i]->key_len) == 0) )
+        {
+            snprintf(ret_buffer, ret_buffer_len - 1, "1 %.*s",
+                     (int)vs->guestinfo_jumbo[i]->val_len,
+                     vs->guestinfo_jumbo[i]->val_data);
+            return i;
+        }
+    }
+    return GUESTINFO_NOTFOUND;
+}
+
+static void hvm_del_guestinfo_jumbo(struct vmport_state *vs, char *key,
+                                    uint8_t len)
+{
+    int i;
+
+    for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+    {
+        if ( !vs->guestinfo_jumbo[i] )
+        {
+#ifndef NDEBUG
+            gdprintk(XENLOG_WARNING,
+                     "i=%d not allocated used_guestinfo_jumbo=%d\n",
+                     i, vs->used_guestinfo_jumbo);
+#endif
+        }
+        else if ( (vs->guestinfo_jumbo[i]->key_len == len) &&
+                  (memcmp(key, vs->guestinfo_jumbo[i]->key_data, len) == 0) )
+        {
+            vs->guestinfo_jumbo[i]->key_len = 0;
+            vs->guestinfo_jumbo[i]->val_len = 0;
+            break;
+        }
+    }
+}
+
+static void hvm_del_guestinfo(struct vmport_state *vs, char *key, uint8_t len)
+{
+    int i;
+
+    for ( i = 0; i < vs->used_guestinfo; i++ )
+    {
+        if ( !vs->guestinfo[i] )
+        {
+#ifndef NDEBUG
+            gdprintk(XENLOG_WARNING,
+                     "i=%d not allocated, but used_guestinfo=%d\n",
+                     i, vs->used_guestinfo);
+#endif
+        }
+        else if ( (vs->guestinfo[i]->key_len == len) &&
+                  (memcmp(key, vs->guestinfo[i]->key_data, len) == 0) )
+        {
+            vs->guestinfo[i]->key_len = 0;
+            vs->guestinfo[i]->val_len = 0;
+            break;
+        }
+    }
+}
+
+static int vmport_set_guestinfo(struct vmport_state *vs, int a_key_len,
+                                unsigned int a_val_len, char *a_info_key, char *val)
+{
+    unsigned int i;
+    int free_i = -1, rc = 0;
+
+#ifndef NDEBUG
+    gdprintk(XENLOG_WARNING, "vmport_set_guestinfo a_val_len=%d\n", a_val_len);
+#endif
+
+    if ( a_key_len <= VMPORT_MAX_KEY_LEN )
+    {
+        if ( a_val_len <= VMPORT_MAX_VAL_LEN )
+        {
+            for ( i = 0; i < vs->used_guestinfo; i++ )
+            {
+                if ( !vs->guestinfo[i] )
+                {
+#ifndef NDEBUG
+                    gdprintk(XENLOG_WARNING,
+                             "i=%d not allocated, but used_guestinfo=%d\n",
+                             i, vs->used_guestinfo);
+#endif
+                }
+                else if ( (vs->guestinfo[i]->key_len == a_key_len) &&
+                          (memcmp(a_info_key, vs->guestinfo[i]->key_data,
+                                  vs->guestinfo[i]->key_len) == 0) )
+                {
+                    vs->guestinfo[i]->val_len = a_val_len;
+                    memcpy(vs->guestinfo[i]->val_data, val, a_val_len);
+                    break;
+                }
+                else if ( (vs->guestinfo[i]->key_len == 0) &&
+                          (free_i == -1) )
+                    free_i = i;
+            }
+            if ( i >= vs->used_guestinfo )
+            {
+                if ( free_i == -1 )
+                    rc = GUESTINFO_TOOMANYKEYS;
+                else
+                {
+                    vs->guestinfo[free_i]->key_len = a_key_len;
+                    memcpy(vs->guestinfo[free_i]->key_data,
+                           a_info_key, a_key_len);
+                    vs->guestinfo[free_i]->val_len = a_val_len;
+                    memcpy(vs->guestinfo[free_i]->val_data,
+                           val, a_val_len);
+                }
+            }
+        }
+        else
+            rc = GUESTINFO_VALTOOLONG;
+    }
+    else
+        rc = GUESTINFO_KEYTOOLONG;
+    if ( !rc )
+        hvm_del_guestinfo_jumbo(vs, a_info_key, a_key_len);
+    return rc;
+}
+
+static int vmport_set_guestinfo_jumbo(struct vmport_state *vs, int a_key_len,
+                                      int a_val_len, char *a_info_key, char *val)
+{
+    unsigned int i;
+    int free_i = -1, rc = 0;
+
+#ifndef NDEBUG
+    gdprintk(XENLOG_WARNING, "vmport_set_guestinfo_jumbo a_val_len=%d\n",
+             a_val_len);
+#endif
+
+    if ( a_key_len <= VMPORT_MAX_KEY_LEN )
+    {
+        if ( a_val_len <= VMPORT_MAX_VAL_JUMBO_LEN )
+        {
+            for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+            {
+                if ( !vs->guestinfo_jumbo[i] )
+                {
+#ifndef NDEBUG
+                    gdprintk(XENLOG_WARNING,
+                             "i=%d not allocated; used_guestinfo_jumbo=%d\n",
+                             i, vs->used_guestinfo_jumbo);
+#endif
+                }
+                else if ( (vs->guestinfo_jumbo[i]->key_len == a_key_len) &&
+                          (memcmp(a_info_key,
+                                  vs->guestinfo_jumbo[i]->key_data,
+                                  vs->guestinfo_jumbo[i]->key_len) == 0) )
+                {
+
+                    vs->guestinfo_jumbo[i]->val_len = a_val_len;
+                    memcpy(vs->guestinfo_jumbo[i]->val_data, val, a_val_len);
+                    break;
+                }
+                else if ( (vs->guestinfo_jumbo[i]->key_len == 0) &&
+                          (free_i == -1) )
+                    free_i = i;
+            }
+            if ( i >= vs->used_guestinfo_jumbo )
+            {
+                if ( free_i == -1 )
+                    rc = GUESTINFO_TOOMANYKEYS;
+                else
+                {
+                    vs->guestinfo_jumbo[free_i]->key_len = a_key_len;
+                    memcpy(vs->guestinfo_jumbo[free_i]->key_data,
+                           a_info_key, a_key_len);
+                    vs->guestinfo_jumbo[free_i]->val_len = a_val_len;
+                    memcpy(vs->guestinfo_jumbo[free_i]->val_data,
+                           val, a_val_len);
+                }
+            }
+        }
+        else
+            rc = GUESTINFO_VALTOOLONG;
+    }
+    else
+        rc = GUESTINFO_KEYTOOLONG;
+    if ( !rc )
+        hvm_del_guestinfo(vs, a_info_key, a_key_len);
+    return rc;
+}
+
+static void vmport_process_send_payload(struct hvm_domain *hd,
+                                        vmport_channel_t *c,
+                                        struct cpu_user_regs *ur,
+                                        unsigned long now_time)
+{
+    /* Accumulate 4 bytes of payload into send_buf using offset */
+    if ( c->ctl.send_idx < VMPORT_MAX_SEND_BUF )
+        c->send_buf[c->ctl.send_idx] = ur->rbx;
+
+    c->ctl.send_idx++;
+    set_status(ur, MESSAGE_STATUS_SUCCESS);
+
+    if ( c->ctl.send_idx * 4 >= c->ctl.send_len )
+    {
+
+        /* We are done accumulating so handle the command */
+
+        if ( c->ctl.send_idx < VMPORT_MAX_SEND_BUF )
+            ((char *)c->send_buf)[c->ctl.send_len] = 0;
+#ifndef NDEBUG
+        if ( opt_vmport_debug & VMPORT_LOG_RECV )
+        {
+            char prefix[30];
+
+            snprintf(prefix, sizeof(prefix),
+                     "VMware RECV %d (%d) ", c->ctl.chan_id, c->ctl.recv_read);
+            prefix[sizeof(prefix) - 1] = 0;
+            vmport_safe_print(prefix, c->ctl.send_len, (char *)c->send_buf);
+        }
+#endif
+        if ( c->ctl.proto_num == VMWARE_PROTO_FROM_GUEST )
+        {
+            /*
+             * Examples of messages:
+             *
+             *   log toolbox: Version: build-341836
+             *   SetGuestInfo  4 build-341836
+             *   info-get guestinfo.ip
+             *   info-set guestinfo.ip joe
+             *
+             */
+
+            char *build = NULL;
+            char *info_key = NULL;
+            char *ret_msg = "1 ";
+            char ret_buffer[2 + VMPORT_MAX_VAL_JUMBO_LEN + 2];
+
+            if ( strncmp((char *)c->send_buf, "log toolbox: Version: build-",
+                         strlen("log toolbox: Version: build-")) == 0 )
+
+                build = (char *)c->send_buf +
+                    strlen("log toolbox: Version: build-");
+
+            else if ( strncmp((char *)c->send_buf, "SetGuestInfo  4 build-",
+                              strlen("SetGuestInfo  4 build-")) == 0 )
+
+                build = (char *)c->send_buf + strlen("SetGuestInfo  4 build-");
+
+            else if ( strncmp((char *)c->send_buf, "info-get guestinfo.",
+                              strlen("info-get guestinfo.")) == 0 )
+            {
+
+                unsigned int a_key_len = c->ctl.send_len -
+                    strlen("info-get guestinfo.");
+                int rc;
+                struct vmport_state *vs = hd->vmport_data;
+
+                info_key = (char *)c->send_buf + strlen("info-get guestinfo.");
+                if ( a_key_len <= VMPORT_MAX_KEY_LEN )
+                {
+
+                    rc = vmport_get_guestinfo(hd, vs, info_key, a_key_len,
+                                              ret_buffer, sizeof(ret_buffer));
+                    if ( rc == GUESTINFO_NOTFOUND )
+                        ret_msg = "0 No value found";
+                    else
+                        ret_msg = ret_buffer;
+                }
+                else
+                    ret_msg = "0 Key is too long";
+
+            }
+            else if ( strncmp((char *)c->send_buf, "info-set guestinfo.",
+                              strlen("info-set guestinfo.")) == 0 )
+            {
+                char *val;
+                unsigned int rest_len = c->ctl.send_len -
+                    strlen("info-set guestinfo.");
+
+                info_key = (char *)c->send_buf + strlen("info-set guestinfo.");
+                val = strstr(info_key, " ");
+                if ( val )
+                {
+                    unsigned int a_key_len = val - info_key;
+                    unsigned int a_val_len = rest_len - a_key_len - 1;
+                    int rc;
+                    struct vmport_state *vs = hd->vmport_data;
+
+                    val++;
+                    if ( a_val_len > VMPORT_MAX_VAL_LEN )
+                        rc = vmport_set_guestinfo_jumbo(vs, a_key_len,
+                                                        a_val_len,
+                                                        info_key, val);
+                    else
+                        rc = vmport_set_guestinfo(vs, a_key_len, a_val_len,
+                                                  info_key, val);
+                    if ( rc == 0 )
+                        ret_msg = "1 ";
+                    if ( rc == GUESTINFO_VALTOOLONG )
+                        ret_msg = "0 Value too long";
+                    if ( rc == GUESTINFO_KEYTOOLONG )
+                        ret_msg = "0 Key is too long";
+                    if ( rc == GUESTINFO_TOOMANYKEYS )
+                        ret_msg = "0 Too many keys";
+
+
+                }
+                else
+                    ret_msg = "0 Two and exactly two arguments expected";
+            }
+
+            vmport_send(hd, c, ret_msg);
+            if ( build )
+            {
+                long val = 0;
+                char *p = build;
+
+                while ( *p )
+                {
+                    if ( *p < '0' || *p > '9' )
+                        break;
+                    val = val * 10 + *p - '0';
+                    p++;
+                }
+
+                hd->params[HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE] = val;
+                hd->params[HVM_PARAM_VMPORT_BUILD_NUMBER_TIME] = now_time;
+            }
+        }
+        else
+        {
+            unsigned int my_bkt = c->ctl.recv_read - 1;
+            vmport_bucket_t *b;
+
+            if ( my_bkt >= VMPORT_MAX_BKTS )
+                my_bkt = VMPORT_MAX_BKTS - 1;
+            b = &c->recv_bkt[my_bkt];
+            b->ctl.recv_len = 0;
+        }
+    }
+}
+
+static void vmport_process_recv_size(struct hvm_domain *hd, vmport_channel_t *c,
+                                     struct cpu_user_regs *ur)
+{
+    vmport_bucket_t *b;
+    vmport_jumbo_bucket_t *jb;
+    int16_t recv_len;
+
+    if ( c->ctl.jumbo )
+    {
+        jb = &c->jumbo_recv_bkt;
+        recv_len = jb->ctl.recv_len;
+    }
+    else
+    {
+        b = &c->recv_bkt[c->ctl.recv_read];
+        recv_len = b->ctl.recv_len;
+    }
+    if ( recv_len )
+    {
+        set_status(ur, MESSAGE_STATUS_DORECV | MESSAGE_STATUS_SUCCESS);
+        ur->rdx = (ur->rdx & 0xffff) | (MESSAGE_TYPE_SENDSIZE << 16);
+        ur->rbx = recv_len;
+    }
+    else
+        set_status(ur, MESSAGE_STATUS_SUCCESS);
+}
+
+static void vmport_process_recv_payload(struct hvm_domain *hd,
+                                        vmport_channel_t *c,
+                                        struct cpu_user_regs *ur)
+{
+    vmport_bucket_t *b;
+    vmport_jumbo_bucket_t *jb;
+
+    if ( c->ctl.jumbo )
+    {
+        jb = &c->jumbo_recv_bkt;
+        ur->rbx = jb->recv_buf[jb->ctl.recv_idx++];
+    }
+    else
+    {
+        b = &c->recv_bkt[c->ctl.recv_read];
+        if ( b->ctl.recv_idx < VMPORT_MAX_RECV_BUF )
+            ur->rbx = b->recv_buf[b->ctl.recv_idx++];
+        else
+            ur->rbx = 0;
+    }
+
+    set_status(ur, MESSAGE_STATUS_SUCCESS);
+    ur->rdx = (ur->rdx & 0xffff) | (MESSAGE_TYPE_SENDPAYLOAD << 16);
+}
+
+static void vmport_process_recv_status(struct hvm_domain *hd,
+                                       vmport_channel_t *c,
+                                       struct cpu_user_regs *ur)
+{
+    vmport_bucket_t *b;
+    vmport_jumbo_bucket_t *jb;
+
+    set_status(ur, MESSAGE_STATUS_SUCCESS);
+
+    if ( c->ctl.jumbo )
+    {
+        c->ctl.jumbo = 0;
+        /* add debug here */
+        jb = &c->jumbo_recv_bkt;
+        return;
+    }
+
+    b = &c->recv_bkt[c->ctl.recv_read];
+
+    c->ctl.recv_read++;
+    if ( c->ctl.recv_read >= VMPORT_MAX_BKTS )
+        c->ctl.recv_read = 0;
+}
+
+static void vmport_process_close(struct hvm_domain *hd, vmport_channel_t *c,
+                                 struct cpu_user_regs *ur)
+{
+    /* Return channel to free pool */
+    c->ctl.proto_num = 0;
+    set_status(ur, MESSAGE_STATUS_SUCCESS);
+}
+
+static void vmport_process_packet(struct hvm_domain *hd, vmport_channel_t *c,
+                                  struct cpu_user_regs *ur, unsigned int sub_cmd,
+                                  unsigned long now_time)
+{
+    c->ctl.active_time = now_time;
+
+    switch ( sub_cmd )
+    {
+    case MESSAGE_TYPE_SENDSIZE:
+        vmport_process_send_size(hd, c, ur);
+        break;
+
+    case MESSAGE_TYPE_SENDPAYLOAD:
+        vmport_process_send_payload(hd, c, ur, now_time);
+        break;
+
+    case MESSAGE_TYPE_RECVSIZE:
+        vmport_process_recv_size(hd, c, ur);
+        break;
+
+    case MESSAGE_TYPE_RECVPAYLOAD:
+        vmport_process_recv_payload(hd, c, ur);
+        break;
+
+    case MESSAGE_TYPE_RECVSTATUS:
+        vmport_process_recv_status(hd, c, ur);
+        break;
+
+    case MESSAGE_TYPE_CLOSE:
+        vmport_process_close(hd, c, ur);
+        break;
+
+    default:
+        ur->rcx = 0;
+        break;
+    }
+}
+
+void vmport_rpc(struct hvm_domain *hd, struct cpu_user_regs *ur)
+{
+    unsigned int sub_cmd = (ur->rcx >> 16) & 0xffff;
+    vmport_channel_t *c = NULL;
+    uint16_t msg_id;
+    uint32_t msg_cookie;
+    unsigned long now_time = get_sec();
+    long delta = hd->vmport_data ? now_time - hd->vmport_data->ping_time : 0;
+
+    if ( !hd->vmport_data )
+        return;
+    if ( delta > hd->params[HVM_PARAM_VMPORT_RESET_TIME] )
+    {
+        VMPORT_DBG_LOG(VMPORT_LOG_PING, "VMware ping. delta=%ld",
+                       delta);
+        vmport_ctrl_send(hd, "reset");
+    }
+    spin_lock(&hd->vmport_lock);
+    vmport_sweep(hd, now_time);
+    do {
+        /* Check to see if a new open request is happening... */
+        if ( MESSAGE_TYPE_OPEN == sub_cmd )
+        {
+            c = vmport_new_chan(hd->vmport_data, now_time);
+            if ( NULL == c )
+            {
+                VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+                               "VMware failed to find a free channel");
+                break;
+            }
+
+            /* Attach the appropriate protocol to the channel */
+            c->ctl.proto_num = ur->rbx & ~GUESTMSG_FLAG_COOKIE;
+            set_status(ur, MESSAGE_STATUS_SUCCESS);
+            ur->rdx = (ur->rdx & 0xffff) | (c->ctl.chan_id << 16);
+            ur->rdi = c->ctl.cookie & 0xffff;
+            ur->rsi = (c->ctl.cookie >> 16) & 0xffff;
+            if ( c->ctl.proto_num == VMWARE_PROTO_TO_GUEST )
+                vmport_send(hd, c, "reset");
+            break;
+        }
+
+        msg_id = (ur->rdx >> 16) & 0xffff;
+        msg_cookie = (ur->rdi & 0xffff) | (ur->rsi << 16);
+        if ( msg_id >= VMPORT_MAX_CHANS )
+        {
+            VMPORT_DBG_LOG(VMPORT_LOG_ERROR, "VMware chan id err %d >= %d",
+                           msg_id, VMPORT_MAX_CHANS);
+            break;
+        }
+        c = &hd->vmport_data->chans[msg_id];
+        if ( !c->ctl.proto_num )
+        {
+            VMPORT_DBG_LOG(VMPORT_LOG_ERROR, "VMware chan %d not open",
+                           msg_id);
+            break;
+        }
+
+        /* We check the cookie here since it's possible that the
+         * connection timed out on us and another channel was opened.
+         * If this happens, return an error and the um tool will
+         * need to reopen the connection.
+         */
+        if ( msg_cookie != c->ctl.cookie )
+        {
+            VMPORT_DBG_LOG(VMPORT_LOG_ERROR, "VMware ctl.cookie err %x vs %x",
+                           msg_cookie, c->ctl.cookie);
+            break;
+        }
+        vmport_process_packet(hd, c, ur, sub_cmd, now_time);
+    } while ( 0 );
+
+    if ( NULL == c )
+        set_status(ur, 0);
+
+    spin_unlock(&hd->vmport_lock);
+}
+
+static int hvm_set_guestinfo(struct vmport_state *vs,
+                             struct xen_hvm_vmport_guest_info *a,
+                             char *key, char *value)
+{
+    int idx;
+    int free_idx = -1;
+    int rc = 0;
+
+    for ( idx = 0; idx < vs->used_guestinfo; idx++ )
+    {
+        if ( !vs->guestinfo[idx] )
+        {
+#ifndef NDEBUG
+            gdprintk(XENLOG_WARNING,
+                     "idx=%d not allocated, but used_guestinfo=%d\n",
+                     idx, vs->used_guestinfo);
+#endif
+        }
+        else if ( (vs->guestinfo[idx]->key_len == a->key_length) &&
+                  (memcmp(key,
+                          vs->guestinfo[idx]->key_data,
+                          vs->guestinfo[idx]->key_len) == 0) )
+        {
+            vs->guestinfo[idx]->val_len = a->value_length;
+            memcpy(vs->guestinfo[idx]->val_data, value, a->value_length);
+            break;
+
+        }
+        else if ( (vs->guestinfo[idx]->key_len == 0) &&
+                  (free_idx == -1) )
+            free_idx = idx;
+    }
+
+    if ( idx >= vs->used_guestinfo )
+    {
+        if ( free_idx == -1 )
+            rc = -EBUSY;
+        else
+        {
+            vs->guestinfo[free_idx]->key_len = a->key_length;
+            memcpy(vs->guestinfo[free_idx]->key_data, key, a->key_length);
+            vs->guestinfo[free_idx]->val_len = a->value_length;
+            memcpy(vs->guestinfo[free_idx]->val_data, value, a->value_length);
+        }
+    }
+
+    /* Delete any duplicate entry */
+    if ( rc == 0 )
+        hvm_del_guestinfo_jumbo(vs, key, a->key_length);
+
+    return rc;
+}
+
+static int hvm_set_guestinfo_jumbo(struct vmport_state *vs,
+                                   struct xen_hvm_vmport_guest_info *a,
+                                   char *key, char *value)
+{
+    int idx;
+    int free_idx = -1;
+    int rc = 0;
+
+    for ( idx = 0; idx < vs->used_guestinfo_jumbo; idx++ )
+    {
+
+        if ( !vs->guestinfo_jumbo[idx] )
+        {
+#ifndef NDEBUG
+            gdprintk(XENLOG_WARNING,
+                     "idx=%d not allocated, but used_guestinfo_jumbo=%d\n",
+                     idx, vs->used_guestinfo_jumbo);
+#endif
+        }
+        else if ( (vs->guestinfo_jumbo[idx]->key_len == a->key_length) &&
+                  (memcmp(key, vs->guestinfo_jumbo[idx]->key_data,
+                          vs->guestinfo_jumbo[idx]->key_len) == 0) )
+        {
+            vs->guestinfo_jumbo[idx]->val_len = a->value_length;
+            memcpy(vs->guestinfo_jumbo[idx]->val_data, value, a->value_length);
+            break;
+
+        }
+        else if ( (vs->guestinfo_jumbo[idx]->key_len == 0) &&
+                  (free_idx == -1) )
+            free_idx = idx;
+    }
+
+    if ( idx >= vs->used_guestinfo_jumbo )
+    {
+        if ( free_idx == -1 )
+            rc = -EBUSY;
+        else
+        {
+            vs->guestinfo_jumbo[free_idx]->key_len = a->key_length;
+            memcpy(vs->guestinfo_jumbo[free_idx]->key_data,
+                   key, a->key_length);
+            vs->guestinfo_jumbo[free_idx]->val_len = a->value_length;
+            memcpy(vs->guestinfo_jumbo[free_idx]->val_data,
+                   value, a->value_length);
+        }
+    }
+
+    /* Delete any duplicate entry */
+    if ( rc == 0 )
+        hvm_del_guestinfo(vs, key, a->key_length);
+
+    return rc;
+}
+
+static int hvm_get_guestinfo(struct vmport_state *vs,
+                             struct xen_hvm_vmport_guest_info *a,
+                             char *key, char *value)
+{
+    int idx;
+    int rc = 0;
+
+    if ( a->key_length == 0 )
+    {
+        /*
+         * Here we are iterating on getting all guestinfo entries
+         * using index
+         */
+        idx = a->value_length;
+        if ( idx >= vs->used_guestinfo ||
+             !vs->guestinfo[idx] )
+            rc = -ENOENT;
+        else
+        {
+            a->key_length = vs->guestinfo[idx]->key_len;
+            memcpy(a->data, vs->guestinfo[idx]->key_data, a->key_length);
+            a->value_length = vs->guestinfo[idx]->val_len;
+            memcpy(&a->data[a->key_length], vs->guestinfo[idx]->val_data,
+                   a->value_length);
+            rc = 0;
+        }
+    }
+    else
+    {
+        for ( idx = 0; idx < vs->used_guestinfo; idx++ )
+        {
+            if ( vs->guestinfo[idx] &&
+                 (vs->guestinfo[idx]->key_len == a->key_length) &&
+                 (memcmp(key, vs->guestinfo[idx]->key_data,
+                         vs->guestinfo[idx]->key_len) == 0) )
+            {
+                a->value_length = vs->guestinfo[idx]->val_len;
+                memcpy(value, vs->guestinfo[idx]->val_data,
+                       a->value_length);
+                rc = 0;
+                break;
+            }
+        }
+        if ( idx >= vs->used_guestinfo )
+            rc = -ENOENT;
+    }
+    return rc;
+}
+
+static int hvm_get_guestinfo_jumbo(struct vmport_state *vs,
+                                   struct xen_hvm_vmport_guest_info *a,
+                                   char *key, char *value)
+{
+    int idx, total_entries;
+    int rc = 0;
+
+    if ( a->key_length == 0 )
+    {
+        /*
+         * Here we are iterating on getting all guestinfo entries
+         * using index
+         */
+        total_entries = vs->used_guestinfo + vs->used_guestinfo_jumbo;
+
+        /* Input index is in a->value_length */
+        if ( a->value_length >= total_entries )
+        {
+            rc = -ENOENT;
+            return rc;
+        }
+        idx = a->value_length - vs->used_guestinfo;
+        if ( idx >= vs->used_guestinfo_jumbo ||
+             !vs->guestinfo_jumbo[idx] )
+            rc = -ENOENT;
+        else
+        {
+            a->key_length = vs->guestinfo_jumbo[idx]->key_len;
+            memcpy(a->data, vs->guestinfo_jumbo[idx]->key_data, a->key_length);
+            a->value_length = vs->guestinfo_jumbo[idx]->val_len;
+            memcpy(&a->data[a->key_length],
+                   vs->guestinfo_jumbo[idx]->val_data, a->value_length);
+            rc = 0;
+        }
+    }
+    else
+    {
+        for ( idx = 0; idx < vs->used_guestinfo_jumbo; idx++ )
+        {
+            if ( vs->guestinfo_jumbo[idx] &&
+                 (vs->guestinfo_jumbo[idx]->key_len == a->key_length) &&
+                 (memcmp(key, vs->guestinfo_jumbo[idx]->key_data,
+                         vs->guestinfo_jumbo[idx]->key_len) == 0) )
+            {
+                a->value_length = vs->guestinfo_jumbo[idx]->val_len;
+                memcpy(value, vs->guestinfo_jumbo[idx]->val_data,
+                       a->value_length);
+                rc = 0;
+                break;
+            }
+        }
+        if ( idx >= vs->used_guestinfo_jumbo )
+            rc = -ENOENT;
+    }
+    return rc;
+}
+
+int vmport_rpc_hvmop_precheck(unsigned long op,
+                              struct xen_hvm_vmport_guest_info *a)
+{
+    int new_key_length = a->key_length;
+
+    if ( new_key_length > strlen("guestinfo.") )
+    {
+        if ( (size_t)new_key_length + (size_t)a->value_length >
+             sizeof(a->data) )
+            return -EINVAL;
+        if ( memcmp(a->data, "guestinfo.", strlen("guestinfo.")) == 0 )
+            new_key_length -= strlen("guestinfo.");
+        if ( new_key_length > VMPORT_MAX_KEY_LEN )
+        {
+            gdprintk(XENLOG_ERR, "bad key len %d\n", new_key_length);
+            return -EINVAL;
+        }
+        if ( a->value_length > VMPORT_MAX_VAL_JUMBO_LEN )
+        {
+            gdprintk(XENLOG_ERR, "bad val len %d\n", a->value_length);
+            return -EINVAL;
+        }
+    }
+    else if ( new_key_length > 0 )
+    {
+        if ( (size_t)new_key_length + (size_t)a->value_length >
+             sizeof(a->data) )
+            return -EINVAL;
+        if ( new_key_length > VMPORT_MAX_KEY_LEN )
+        {
+            gdprintk(XENLOG_ERR, "bad key len %d\n", new_key_length);
+            return -EINVAL;
+        }
+        if ( a->value_length > VMPORT_MAX_VAL_JUMBO_LEN )
+        {
+            gdprintk(XENLOG_ERR, "bad val len %d\n", a->value_length);
+            return -EINVAL;
+        }
+    }
+    else if ( (new_key_length == 0) && (op == HVMOP_set_vmport_guest_info) )
+        return -EINVAL;
+
+    return 0;
+}
+
+int vmport_rpc_hvmop_do(struct domain *d, unsigned long op,
+                        struct xen_hvm_vmport_guest_info *a)
+{
+    char *key = NULL;
+    char *value = NULL;
+    struct vmport_state *vs = d->arch.hvm_domain.vmport_data;
+    int i, total_entries;
+    vmport_guestinfo_t *add_slots[5];
+    vmport_guestinfo_jumbo_t *add_slots_jumbo[2];
+    int num_slots = 0, num_slots_jumbo = 0, num_free_slots = 0;
+    int rc = 0;
+
+    if ( !vs )
+        return -ENXIO;
+
+    if ( a->key_length > strlen("guestinfo.") )
+    {
+        if ( memcmp(a->data, "guestinfo.", strlen("guestinfo.")) == 0 )
+        {
+            key = &a->data[strlen("guestinfo.")];
+            a->key_length -= strlen("guestinfo.");
+        }
+        else
+            key = &a->data[0];
+        value = key + a->key_length;
+    }
+    else if ( a->key_length > 0 )
+    {
+        key = &a->data[0];
+        value = key + a->key_length;
+    }
+
+    total_entries = vs->used_guestinfo + vs->used_guestinfo_jumbo;
+
+    if ( (a->key_length == 0) && (a->value_length >= total_entries) )
+    {
+        /*
+         * When key length is zero, we are iterating on
+         * get-guest-info hypercalls to retrieve all guestinfo
+         * entries using index passed in a->value_length
+         */
+        return -E2BIG;
+    }
+
+    num_free_slots = 0;
+    for ( i = 0; i < vs->used_guestinfo; i++ )
+    {
+        if ( vs->guestinfo[i] &&
+             (vs->guestinfo[i]->key_len == 0) )
+            num_free_slots++;
+    }
+    if ( num_free_slots < 5 )
+    {
+        num_slots = 5 - num_free_slots;
+        if ( vs->used_guestinfo + num_slots > VMPORT_MAX_NUM_KEY )
+            num_slots = VMPORT_MAX_NUM_KEY - vs->used_guestinfo;
+        for ( i = 0; i < num_slots; i++ )
+            add_slots[i] = xzalloc(vmport_guestinfo_t);
+    }
+
+    num_free_slots = 0;
+    for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+    {
+        if ( vs->guestinfo_jumbo[i] &&
+             (vs->guestinfo_jumbo[i]->key_len == 0) )
+            num_free_slots++;
+    }
+    if ( num_free_slots < 1 )
+    {
+        num_slots_jumbo = 1 - num_free_slots;
+        if ( vs->used_guestinfo_jumbo + num_slots_jumbo >
+             VMPORT_MAX_NUM_JUMBO_KEY )
+            num_slots_jumbo = VMPORT_MAX_NUM_JUMBO_KEY -
+                vs->used_guestinfo_jumbo;
+        for ( i = 0; i < num_slots_jumbo; i++ )
+            add_slots_jumbo[i] = xzalloc(vmport_guestinfo_jumbo_t);
+    }
+
+    spin_lock(&d->arch.hvm_domain.vmport_lock);
+
+    for ( i = 0; i < num_slots; i++ )
+        vs->guestinfo[vs->used_guestinfo + i] = add_slots[i];
+    vs->used_guestinfo += num_slots;
+
+    for ( i = 0; i < num_slots_jumbo; i++ )
+        vs->guestinfo_jumbo[vs->used_guestinfo_jumbo + i] =
+            add_slots_jumbo[i];
+    vs->used_guestinfo_jumbo += num_slots_jumbo;
+
+    if ( op == HVMOP_set_vmport_guest_info )
+    {
+        if ( a->value_length > VMPORT_MAX_VAL_LEN )
+            rc = hvm_set_guestinfo_jumbo(vs, a, key, value);
+        else
+            rc = hvm_set_guestinfo(vs, a, key, value);
+    }
+    else
+    {
+        /* Get Guest Info */
+        rc = hvm_get_guestinfo(vs, a, key, value);
+        if ( rc != 0 )
+            rc = hvm_get_guestinfo_jumbo(vs, a, key, value);
+    }
+    spin_unlock(&d->arch.hvm_domain.vmport_lock);
+
+    return rc;
+}
+
+int vmport_rpc_init(struct domain *d)
+{
+    struct vmport_state *vs = xzalloc(struct vmport_state);
+    int i;
+
+    spin_lock_init(&d->arch.hvm_domain.vmport_lock);
+    d->arch.hvm_domain.vmport_data = vs;
+
+    if ( !vs )
+        return -ENOMEM;
+
+    /*
+     * Any value is fine here. In fact a random number may be better.
+     * It is used to help validate that both sides are talking
+     * about the same channel.
+     */
+    vs->open_cookie = 435;
+
+    vs->used_guestinfo = 10;
+    for ( i = 0; i < vs->used_guestinfo; i++ )
+        vs->guestinfo[i] = xzalloc(vmport_guestinfo_t);
+
+    vs->used_guestinfo_jumbo = 2;
+    for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+        vs->guestinfo_jumbo[i] = xzalloc(vmport_guestinfo_jumbo_t);
+
+    vmport_flush(&d->arch.hvm_domain);
+
+    d->arch.hvm_domain.params[HVM_PARAM_VMPORT_RESET_TIME] = 14;
+
+    return 0;
+}
+
+void vmport_rpc_deinit(struct domain *d)
+{
+    struct vmport_state *vs = d->arch.hvm_domain.vmport_data;
+    int i;
+
+    if ( !vs )
+        return;
+
+    for ( i = 0; i < vs->used_guestinfo; i++ )
+        xfree(vs->guestinfo[i]);
+    for ( i = 0; i < vs->used_guestinfo_jumbo; i++ )
+        xfree(vs->guestinfo_jumbo[i]);
+    xfree(vs);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index e68a3ae..cecda24 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -107,6 +107,10 @@ struct hvm_domain {
     /* emulated irq to pirq */
     struct radix_tree_root emuirq_pirq;
 
+    /* VMware special port RPC */
+    spinlock_t             vmport_lock;
+    struct vmport_state   *vmport_data;
+
     uint64_t              *params;
 
     /* Memory ranges with pinned cache attributes. */
diff --git a/xen/include/asm-x86/hvm/vmport.h b/xen/include/asm-x86/hvm/vmport.h
index 401cbf4..f14bdd2 100644
--- a/xen/include/asm-x86/hvm/vmport.h
+++ b/xen/include/asm-x86/hvm/vmport.h
@@ -25,6 +25,15 @@
 #define VMPORT_LOG_VGP_UNKNOWN     (1 << 3)
 #define VMPORT_LOG_REALMODE_GP     (1 << 4)
 
+#define VMPORT_LOG_RECV            (1 << 8)
+#define VMPORT_LOG_SEND            (1 << 9)
+#define VMPORT_LOG_SKIP_SEND       (1 << 10)
+#define VMPORT_LOG_ERROR           (1 << 11)
+
+#define VMPORT_LOG_SWEEP           (1 << 12)
+#define VMPORT_LOG_PING            (1 << 13)
+
+
 extern unsigned int opt_vmport_debug;
 #define VMPORT_DBG_LOG(level, _f, _a...)                                \
     do {                                                                \
@@ -59,6 +68,14 @@ int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
 #define X86EMUL_VMPORT_BAD_OPCODE               13
 #define X86EMUL_VMPORT_BAD_STATE                14
 
+int vmport_rpc_init(struct domain *d);
+void vmport_rpc_deinit(struct domain *d);
+void vmport_rpc(struct hvm_domain *hd, struct cpu_user_regs *ur);
+int vmport_rpc_hvmop_precheck(unsigned long op,
+                              struct xen_hvm_vmport_guest_info *a);
+int vmport_rpc_hvmop_do(struct domain *d, unsigned long op,
+                        struct xen_hvm_vmport_guest_info *a);
+
 #endif /* ASM_X86_HVM_VMPORT_H__ */
 
 /*
diff --git a/xen/include/public/arch-x86/hvm/vmporttype.h b/xen/include/public/arch-x86/hvm/vmporttype.h
new file mode 100644
index 0000000..98875d2
--- /dev/null
+++ b/xen/include/public/arch-x86/hvm/vmporttype.h
@@ -0,0 +1,118 @@
+/*
+ * vmporttype.h: HVM VMPORT structure definitions
+ *
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef ASM_X86_HVM_VMPORTTYPE_H__
+#define ASM_X86_HVM_VMPORTTYPE_H__
+
+#define VMPORT_MAX_KEY_LEN 30
+#define VMPORT_MAX_VAL_LEN 128
+#define VMPORT_MAX_NUM_KEY  64
+#define VMPORT_MAX_NUM_JUMBO_KEY 4
+#define VMPORT_MAX_VAL_JUMBO_LEN 4096
+
+#define VMPORT_MAX_SEND_BUF ((22 + VMPORT_MAX_KEY_LEN +         \
+                              VMPORT_MAX_VAL_JUMBO_LEN + 3)/4)
+#define VMPORT_MAX_RECV_BUF ((2 + VMPORT_MAX_VAL_LEN + 3)/4)
+#define VMPORT_MAX_RECV_JUMBO_BUF ((2 + VMPORT_MAX_VAL_JUMBO_LEN + 3)/4)
+#define VMPORT_MAX_CHANS    6
+#define VMPORT_MAX_BKTS     8
+
+#define VMPORT_SAVE_VERSION 0xabcd0001
+
+typedef struct
+{
+    uint8_t key_len;
+    uint8_t val_len;
+    char key_data[VMPORT_MAX_KEY_LEN];
+    char val_data[VMPORT_MAX_VAL_LEN];
+} vmport_guestinfo_t;
+
+typedef struct
+{
+    uint16_t val_len;
+    uint8_t  key_len;
+    char     key_data[VMPORT_MAX_KEY_LEN];
+    char     val_data[VMPORT_MAX_VAL_JUMBO_LEN];
+} vmport_guestinfo_jumbo_t;
+
+typedef struct __attribute__((packed))
+{
+    uint16_t recv_len;
+    uint16_t recv_idx;
+}
+vmport_bucket_control_t;
+
+typedef struct __attribute__((packed))
+{
+    vmport_bucket_control_t ctl;
+    uint32_t recv_buf[VMPORT_MAX_RECV_BUF + 1];
+}
+vmport_bucket_t;
+
+typedef struct __attribute__((packed))
+{
+    vmport_bucket_control_t ctl;
+    uint32_t recv_buf[VMPORT_MAX_RECV_JUMBO_BUF + 1];
+}
+vmport_jumbo_bucket_t;
+
+typedef struct __attribute__((packed))
+{
+    uint64_t active_time;
+    uint32_t chan_id;
+    uint32_t cookie;
+    uint32_t proto_num;
+    uint16_t send_len;
+    uint16_t send_idx;
+    uint8_t jumbo;
+    uint8_t recv_read;
+    uint8_t recv_write;
+    uint8_t recv_chan_pad[1];
+}
+vmport_channel_control_t;
+
+typedef struct __attribute__((packed))
+{
+    vmport_channel_control_t ctl;
+    vmport_bucket_t recv_bkt[VMPORT_MAX_BKTS];
+    vmport_jumbo_bucket_t jumbo_recv_bkt;
+    uint32_t send_buf[VMPORT_MAX_SEND_BUF + 1];
+}
+vmport_channel_t;
+
+struct vmport_state
+{
+    uint64_t ping_time;
+    uint32_t open_cookie;
+    uint32_t used_guestinfo;
+    uint32_t used_guestinfo_jumbo;
+    uint8_t  max_chans;
+    uint8_t  state_pad[3];
+    vmport_channel_t chans[VMPORT_MAX_CHANS];
+    vmport_guestinfo_t *guestinfo[VMPORT_MAX_NUM_KEY];
+    vmport_guestinfo_jumbo_t *guestinfo_jumbo[VMPORT_MAX_NUM_JUMBO_KEY];
+};
+
+#endif /* ASM_X86_HVM_VMPORTTYPE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index eeb0a60..8e1e072 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -369,6 +369,24 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_set_ioreq_server_state_t);
 
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
+/* Get/set vmport subcommands */
+#define HVMOP_get_vmport_guest_info 23
+#define HVMOP_set_vmport_guest_info 24
+#define VMPORT_GUEST_INFO_KEY_MAX   40
+#define VMPORT_GUEST_INFO_VAL_MAX   4096
+struct xen_hvm_vmport_guest_info {
+    /* Domain to be accessed */
+    domid_t   domid;
+    /* key length */
+    uint16_t   key_length;
+    /* value length */
+    uint16_t   value_length;
+    /* key and value data */
+    char      data[VMPORT_GUEST_INFO_KEY_MAX + VMPORT_GUEST_INFO_VAL_MAX];
+};
+typedef struct xen_hvm_vmport_guest_info xen_hvm_vmport_guest_info_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_vmport_guest_info_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index dee6d68..722e30b 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -153,7 +153,10 @@
 
 /* Params for VMware */
 #define HVM_PARAM_VMWARE_HW                 35
+#define HVM_PARAM_VMPORT_BUILD_NUMBER_TIME  36
+#define HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE 37
+#define HVM_PARAM_VMPORT_RESET_TIME         38
 
-#define HVM_NR_PARAMS          36
+#define HVM_NR_PARAMS          39
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.8.4


* [PATCH for-4.5 v6 09/16] tools: Add limited support of VMware's hyper-call rpc
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (7 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 13:52   ` Ian Campbell
  2014-09-20 18:07 ` [PATCH for-4.5 v6 10/16] Add VMware tool's triggers Don Slutz
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

This guestinfo support is provided via libxc.  libxl support has
not been written.

Note: VMware RPC support is only available on HVM domU.

This interface is an extension of __HYPERVISOR_hvm_op.  It was
picked because xc_get_hvm_param() also uses it and VMware guest
info is a lot like an hvm param.

The HVMOP_get_vmport_guest_info is used by two libxc functions,
xc_get_vmport_guest_info and xc_fetch_all_vmport_guest_info.
xc_fetch_all_vmport_guest_info is designed to be used to fetch all
currently set guestinfo values.
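
As an illustration of the buffer layout these calls rely on, here is a
minimal standalone sketch.  It does not call libxc or Xen;
struct guest_info_model, model_set() and model_get_value() are
hypothetical names that only mirror how the patch places the key at the
start of data[] with the value directly after it, and how the get path
reports the full value length while truncating the copy to val_max.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_MAX 40    /* mirrors VMPORT_GUEST_INFO_KEY_MAX */
#define VAL_MAX 4096  /* mirrors VMPORT_GUEST_INFO_VAL_MAX */

/* Hypothetical stand-in for struct xen_hvm_vmport_guest_info (no domid). */
struct guest_info_model {
    uint16_t key_length;
    uint16_t value_length;
    char     data[KEY_MAX + VAL_MAX];   /* key bytes, then value bytes */
};

/* Same length checks and copies as the set path: key first, value after. */
static int model_set(struct guest_info_model *arg,
                     const char *key, unsigned int key_len,
                     const char *val, unsigned int val_len)
{
    assert(arg != NULL);
    if ( key_len < 1 || key_len > KEY_MAX || val_len > VAL_MAX )
        return -1;
    arg->key_length = (uint16_t)key_len;
    arg->value_length = (uint16_t)val_len;
    memcpy(arg->data, key, key_len);
    memcpy(&arg->data[key_len], val, val_len);
    return 0;
}

/* Report the full value length, but copy at most val_max bytes. */
static void model_get_value(const struct guest_info_model *arg,
                            char *val, unsigned int val_max,
                            unsigned int *val_len)
{
    unsigned int n = arg->value_length;

    *val_len = n;
    if ( n > val_max )
        n = val_max;
    memcpy(val, &arg->data[arg->key_length], n);
}
```

A caller that sees *val_len larger than val_max knows the value was
truncated and can retry with a bigger buffer.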

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 tools/libxc/xc_domain.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h   |  24 ++++++++++
 2 files changed, 139 insertions(+)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 1a6f90a..ce24dad 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1577,6 +1577,121 @@ int xc_hvm_set_ioreq_server_state(xc_interface *xch,
     return rc;
 }
 
+int xc_set_vmport_guest_info(xc_interface *handle,
+                             domid_t dom,
+                             unsigned int key_len,
+                             char *key,
+                             unsigned int val_len,
+                             char *val)
+{
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_vmport_guest_info_t, arg);
+    int rc;
+
+    if ( (key_len < 1) ||
+        (key_len > VMPORT_GUEST_INFO_KEY_MAX) ||
+        (val_len > VMPORT_GUEST_INFO_VAL_MAX) )
+        return -1;
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_set_vmport_guest_info;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+    arg->domid = dom;
+    arg->key_length = key_len;
+    arg->value_length = val_len;
+    memcpy(arg->data, key, key_len);
+    memcpy(&arg->data[key_len], val, val_len);
+    rc = do_xen_hypercall(handle, &hypercall);
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_get_vmport_guest_info(xc_interface *handle,
+                             domid_t dom,
+                             unsigned int key_len,
+                             char *key,
+                             unsigned int val_max,
+                             unsigned int *val_len,
+                             char *val)
+{
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_vmport_guest_info_t, arg);
+    int rc;
+
+    if ( (key_len < 1) ||
+        (key_len > VMPORT_GUEST_INFO_KEY_MAX) )
+        return -1;
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_get_vmport_guest_info;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+    arg->domid = dom;
+    arg->key_length = key_len;
+    arg->value_length = 0;
+    *val_len = 0;
+    memcpy(arg->data, key, key_len);
+    rc = do_xen_hypercall(handle, &hypercall);
+    if ( rc == 0 )
+    {
+        *val_len = arg->value_length;
+        if ( arg->value_length > val_max )
+            arg->value_length = val_max;
+        memcpy(val, &arg->data[key_len], arg->value_length);
+    }
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_fetch_all_vmport_guest_info(xc_interface *handle,
+                                   domid_t dom,
+                                   unsigned int idx,
+                                   unsigned int key_max,
+                                   unsigned int *key_len,
+                                   char *key,
+                                   unsigned int val_max,
+                                   unsigned int *val_len,
+                                   char *val)
+{
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_vmport_guest_info_t, arg);
+    int rc;
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_get_vmport_guest_info;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+    arg->domid = dom;
+    arg->key_length = 0;
+    arg->value_length = idx;
+    *key_len = 0;
+    *val_len = 0;
+    rc = do_xen_hypercall(handle, &hypercall);
+    if ( rc == 0 )
+    {
+        uint16_t val_off = arg->key_length;
+
+        *key_len = arg->key_length;
+        if ( arg->key_length > key_max )
+            arg->key_length = key_max;
+        memcpy(key, arg->data, arg->key_length);
+        *val_len = arg->value_length;
+        if ( arg->value_length > val_max )
+            arg->value_length = val_max;
+        memcpy(val,
+               &arg->data[val_off],
+               arg->value_length);
+    }
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
 int xc_domain_setdebugging(xc_interface *xch,
                            uint32_t domid,
                            unsigned int enable)
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 514b241..baac464 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2019,6 +2019,30 @@ int xc_hvm_destroy_ioreq_server(xc_interface *xch,
                                 domid_t domid,
                                 ioservid_t id);
 
+int xc_set_vmport_guest_info(xc_interface *handle,
+                             domid_t dom,
+                             unsigned int key_len,
+                             char *key,
+                             unsigned int val_len,
+                             char *val);
+int xc_get_vmport_guest_info(xc_interface *handle,
+                             domid_t dom,
+                             unsigned int key_len,
+                             char *key,
+                             unsigned int val_max,
+                             unsigned int *val_len,
+                             char *val);
+int xc_fetch_all_vmport_guest_info(xc_interface *handle,
+                                   domid_t dom,
+                                   unsigned int idx,
+                                   unsigned int key_max,
+                                   unsigned int *key_len,
+                                   char *key,
+                                   unsigned int val_max,
+                                   unsigned int *val_len,
+                                   char *val);
+
+
 /* HVM guest pass-through */
 int xc_assign_device(xc_interface *xch,
                      uint32_t domid,
-- 
1.8.4


* [PATCH for-4.5 v6 10/16] Add VMware tool's triggers
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (8 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 09/16] tools: " Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 11/16] Add live migration of VMware's hyper-call RPC Don Slutz
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

These are not the same as the PV control interface; they
are more like the ACPI power event.

For anything to happen when they are used, the domU must
be running a VMware tools daemon that is polling for triggers.

If the domU is running VMware tools, then the "build version" of
the tools is also available via xc_get_hvm_param().  This also
enables the use of new triggers that will use the VMware hyper-call
to do some limited control of the domU.  The most useful are
poweroff and reboot.  Since a guest process needs to be running
for these to work, a tool stack should check that the build version
is non-zero before assuming these will work.

The two hvm params HVM_PARAM_VMPORT_BUILD_NUMBER_TIME and
HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE are how the "build version" is
accessed.  These two params may only be set to zero.
HVM_PARAM_VMPORT_BUILD_NUMBER_TIME can be used to track the last
time the VMware tools in the guest responded, for example to
monitor the health of the tools in the guest.  The hvm param
HVM_PARAM_VMPORT_RESET_TIME controls how often to request them, in
seconds minus 1.  The minus 1 is to handle the 0 case, i.e. the
fastest rate that can be selected is once every second.  The
default is 4 times a minute.

XEN_DOMCTL_SENDTRIGGER_VTPING performs the same request that is
made periodically under HVM_PARAM_VMPORT_RESET_TIME.  This trigger
allows the tool stack to request an earlier update of
HVM_PARAM_VMPORT_BUILD_NUMBER_TIME and
HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE.
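
The "seconds minus 1" encoding and the non-zero build version check can
be sketched as below.  This is only an illustration: the helper names
are hypothetical and nothing here calls Xen.  If "4 times a minute"
means a 15 second interval, the stored value would be 14; that is an
inference from the text above, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Stored value N means "every N + 1 seconds", so 0 is once a second. */
static uint64_t reset_interval_secs(uint64_t param)
{
    /* Saturate instead of wrapping around to 0. */
    return param == UINT64_MAX ? UINT64_MAX : param + 1;
}

/* Inverse mapping; an interval of 0 seconds cannot be expressed. */
static uint64_t reset_param_for_interval(uint64_t secs)
{
    assert(secs >= 1);
    return secs - 1;
}

/* A tool stack would only expect the VT* triggers to work if this holds. */
static int vmware_tools_responded(uint64_t build_number_value)
{
    return build_number_value != 0;
}
```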

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 xen/arch/x86/domctl.c            | 34 ++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmport.h |  1 +
 xen/include/public/domctl.h      |  3 +++
 3 files changed, 38 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 7a5de43..50596a6 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -23,6 +23,7 @@
 #include <xen/paging.h>
 #include <asm/irq.h>
 #include <asm/hvm/hvm.h>
+#include <asm/hvm/vmport.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/cacheattr.h>
 #include <asm/processor.h>
@@ -579,6 +580,39 @@ long arch_do_domctl(
         }
         break;
 
+        case XEN_DOMCTL_SENDTRIGGER_VTPOWER:
+        {
+            ret = -EINVAL;
+            if ( is_hvm_domain(d) )
+            {
+                ret = 0;
+                vmport_ctrl_send(&d->arch.hvm_domain, "OS_Halt");
+            }
+        }
+        break;
+
+        case XEN_DOMCTL_SENDTRIGGER_VTREBOOT:
+        {
+            ret = -EINVAL;
+            if ( is_hvm_domain(d) )
+            {
+                ret = 0;
+                vmport_ctrl_send(&d->arch.hvm_domain, "OS_Reboot");
+            }
+        }
+        break;
+
+        case XEN_DOMCTL_SENDTRIGGER_VTPING:
+        {
+            ret = -EINVAL;
+            if ( is_hvm_domain(d) )
+            {
+                ret = 0;
+                vmport_ctrl_send(&d->arch.hvm_domain, "ping");
+            }
+        }
+        break;
+
         default:
             ret = -ENOSYS;
         }
diff --git a/xen/include/asm-x86/hvm/vmport.h b/xen/include/asm-x86/hvm/vmport.h
index f14bdd2..20f7883 100644
--- a/xen/include/asm-x86/hvm/vmport.h
+++ b/xen/include/asm-x86/hvm/vmport.h
@@ -75,6 +75,7 @@ int vmport_rpc_hvmop_precheck(unsigned long op,
                               struct xen_hvm_vmport_guest_info *a);
 int vmport_rpc_hvmop_do(struct domain *d, unsigned long op,
                         struct xen_hvm_vmport_guest_info *a);
+void vmport_ctrl_send(struct hvm_domain *hd, char *msg);
 
 #endif /* ASM_X86_HVM_VMPORT_H__ */
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 92b50ef..bce8925 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -462,6 +462,9 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_address_size_t);
 #define XEN_DOMCTL_SENDTRIGGER_INIT   2
 #define XEN_DOMCTL_SENDTRIGGER_POWER  3
 #define XEN_DOMCTL_SENDTRIGGER_SLEEP  4
+#define XEN_DOMCTL_SENDTRIGGER_VTPOWER  5
+#define XEN_DOMCTL_SENDTRIGGER_VTREBOOT 6
+#define XEN_DOMCTL_SENDTRIGGER_VTPING   7
 struct xen_domctl_sendtrigger {
     uint32_t  trigger;  /* IN */
     uint32_t  vcpu;     /* IN */
-- 
1.8.4


* [PATCH for-4.5 v6 11/16] Add live migration of VMware's hyper-call RPC
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (9 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 10/16] Add VMware tool's triggers Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [PATCH for-4.5 v6 12/16] Add dump of HVM_SAVE_CODE(VMPORT) to xen-hvmctx Don Slutz
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

The VMware hyper-call state is included in live migration and
save/restore.  Because the maximum size of the VMware guestinfo is
large, the data is compressed in vmport_save_domain_ctxt and
expanded in vmport_load_domain_ctxt.
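
The compression amounts to writing only the bytes in use.  A minimal
standalone sketch of the per-entry encoding for a regular (non-jumbo)
guestinfo entry follows: one byte of key length, one byte of value
length, then the key and value bytes.  pack_entry() and unpack_entry()
are hypothetical helpers, not functions from this patch, and unlike the
patch they bounds-check every length before copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_KEY_LEN 30   /* mirrors VMPORT_MAX_KEY_LEN */
#define MAX_VAL_LEN 128  /* mirrors VMPORT_MAX_VAL_LEN */

/* Append one entry; returns bytes written, or 0 if invalid or no room. */
static size_t pack_entry(uint8_t *out, size_t out_left,
                         const char *key, uint8_t key_len,
                         const char *val, uint8_t val_len)
{
    size_t need = 2u + key_len + val_len;

    assert(out != NULL);
    if ( key_len == 0 || key_len > MAX_KEY_LEN ||
         val_len > MAX_VAL_LEN || need > out_left )
        return 0;
    out[0] = key_len;
    out[1] = val_len;
    memcpy(out + 2, key, key_len);
    memcpy(out + 2 + key_len, val, val_len);
    return need;
}

/* Read one entry; returns bytes consumed, or 0 if the input is bad. */
static size_t unpack_entry(const uint8_t *in, size_t in_left,
                           char *key, uint8_t *key_len,
                           char *val, uint8_t *val_len)
{
    uint8_t kl, vl;
    size_t need;

    if ( in_left < 2 )
        return 0;
    kl = in[0];
    vl = in[1];
    need = 2u + kl + vl;
    if ( kl == 0 || kl > MAX_KEY_LEN || vl > MAX_VAL_LEN || need > in_left )
        return 0;
    memcpy(key, in + 2, kl);
    memcpy(val, in + 2 + kl, vl);
    *key_len = kl;
    *val_len = vl;
    return need;
}
```

An entry with a 2 byte key and an 8 byte value takes 12 bytes here,
against the 160 bytes of a full vmport_guestinfo_t.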

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
v5:
      You ASSERTed that vg->key_len is 1 so you may not need the 'if'.
        That is a ASSERT(sizeof, not just ASSERT -- not changed.
      Use real errno, not -1.
        Fixed.
      No ASSERT in vmport_load_domain_ctxt
        Added.

 xen/arch/x86/hvm/vmware/vmport_rpc.c   | 309 ++++++++++++++++++++++++++++++++-
 xen/include/public/arch-x86/hvm/save.h |  39 ++++-
 2 files changed, 346 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmware/vmport_rpc.c b/xen/arch/x86/hvm/vmware/vmport_rpc.c
index ed779b4..f795928 100644
--- a/xen/arch/x86/hvm/vmware/vmport_rpc.c
+++ b/xen/arch/x86/hvm/vmware/vmport_rpc.c
@@ -39,7 +39,8 @@
 #include <asm/hvm/vmport.h>
 #include <asm/hvm/trace.h>
 
-#include "public/arch-x86/hvm/vmporttype.h"
+#include <public/hvm/save.h>
+#include <public/arch-x86/hvm/save.h>
 
 #include "backdoor_def.h"
 #include "guest_msg_def.h"
@@ -1262,6 +1263,312 @@ void vmport_rpc_deinit(struct domain *d)
     xfree(vs);
 }
 
+/* save and restore functions */
+
+static int vmport_save_domain_ctxt(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_vmport_context *ctxt;
+    struct hvm_save_descriptor *desc;
+    struct hvm_domain *hd = &d->arch.hvm_domain;
+    struct vmport_state *vs = hd->vmport_data;
+    char *p;
+    unsigned int guestinfo_size = 0;
+    unsigned int used_guestinfo = 0;
+    unsigned int used_guestinfo_jumbo = 0;
+    unsigned int chans_size;
+    unsigned int i;
+
+    /* Customized handling for entry since our entry is of variable length */
+    desc = (struct hvm_save_descriptor *)&h->data[h->cur];
+    if ( _hvm_init_entry(h, HVM_SAVE_CODE(VMPORT), 0,
+                         HVM_SAVE_LENGTH(VMPORT)) )
+        return 1;
+    ctxt = (struct hvm_vmport_context *)&h->data[h->cur];
+
+    spin_lock(&hd->vmport_lock);
+
+    ctxt->version = VMPORT_SAVE_VERSION;
+    ctxt->ping_time = vs->ping_time;
+    ctxt->open_cookie = vs->open_cookie;
+    ctxt->used_guestinfo = vs->used_guestinfo;
+    ctxt->used_guestinfo_jumbo = vs->used_guestinfo_jumbo;
+
+    p = ctxt->u.packed.packed_data;
+
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        unsigned int j;
+        unsigned int buf_max;
+
+        ctxt->u.packed.chan_ctl[i].chan = vs->chans[i].ctl;
+        buf_max = vs->chans[i].ctl.send_len;
+        if ( buf_max > VMPORT_MAX_SEND_BUF * 4 )
+            buf_max = VMPORT_MAX_SEND_BUF * 4;
+        memcpy(p, vs->chans[i].send_buf, buf_max);
+        p += buf_max;
+        for ( j = 0; j < VMPORT_MAX_BKTS; j++ )
+        {
+            ctxt->u.packed.chan_ctl[i].recv[j] = vs->chans[i].recv_bkt[j].ctl;
+            buf_max = vs->chans[i].recv_bkt[j].ctl.recv_len;
+            if ( buf_max > VMPORT_MAX_RECV_BUF * 4 )
+                buf_max = VMPORT_MAX_RECV_BUF * 4;
+            memcpy(p, vs->chans[i].recv_bkt[j].recv_buf, buf_max);
+            p += buf_max;
+        }
+        ctxt->u.packed.chan_ctl[i].jumbo = vs->chans[i].jumbo_recv_bkt.ctl;
+        buf_max = vs->chans[i].jumbo_recv_bkt.ctl.recv_len;
+        if ( buf_max > VMPORT_MAX_RECV_JUMBO_BUF * 4 )
+            buf_max = VMPORT_MAX_RECV_JUMBO_BUF * 4;
+        memcpy(p, vs->chans[i].jumbo_recv_bkt.recv_buf, buf_max);
+        p += buf_max;
+    }
+
+    chans_size = p - ctxt->u.packed.packed_data;
+
+    for ( i = 0; i < ctxt->used_guestinfo; i++ )
+    {
+        vmport_guestinfo_t *vg = vs->guestinfo[i];
+
+        if ( vg && vg->key_len )
+        {
+            guestinfo_size += sizeof(vg->key_len) + sizeof(vg->val_len) +
+                vg->key_len + vg->val_len;
+            used_guestinfo++;
+            ASSERT(sizeof(vg->key_len) == 1);
+            *p++ = (char) vg->key_len;
+            ASSERT(sizeof(vg->val_len) == 1);
+            *p++ = (char) vg->val_len;
+            if ( vg->key_len )
+            {
+                memcpy(p, vg->key_data, vg->key_len);
+                p += vg->key_len;
+                if ( vg->val_len )
+                {
+                    memcpy(p, vg->val_data, vg->val_len);
+                    p += vg->val_len;
+                }
+            }
+        }
+    }
+    ctxt->used_guestinfo = used_guestinfo;
+
+    for ( i = 0; i < ctxt->used_guestinfo_jumbo; i++ )
+    {
+        vmport_guestinfo_jumbo_t *vgj =
+            vs->guestinfo_jumbo[i];
+        if ( vgj && vgj->key_len )
+        {
+            guestinfo_size += sizeof(vgj->key_len) + sizeof(vgj->val_len) +
+                vgj->key_len + vgj->val_len;
+            used_guestinfo_jumbo++;
+            ASSERT(sizeof(vgj->key_len) == 1);
+            *p++ = (char) vgj->key_len;
+            /* This is so migration does not fail */
+            ASSERT(sizeof(vgj->val_len) == 2);
+            memcpy(p, &vgj->val_len, sizeof(vgj->val_len));
+            p += sizeof(vgj->val_len);
+            if ( vgj->key_len )
+            {
+                memcpy(p, vgj->key_data, vgj->key_len);
+                p += vgj->key_len;
+                if ( vgj->val_len )
+                {
+                    memcpy(p, vgj->val_data, vgj->val_len);
+                    p += vgj->val_len;
+                }
+            }
+        }
+    }
+    ctxt->used_guestinfo_jumbo = used_guestinfo_jumbo;
+
+    ctxt->used_guestsize = guestinfo_size;
+
+    spin_unlock(&hd->vmport_lock);
+
+#ifndef NDEBUG
+    gdprintk(XENLOG_WARNING, "chans_size=%d guestinfo_size=%d, used=%ld\n",
+             chans_size, guestinfo_size,
+             p - ctxt->u.packed.packed_data);
+#endif
+    ASSERT(p - ctxt->u.packed.packed_data == chans_size + guestinfo_size);
+    ASSERT(desc->length >= p - (char *)ctxt);
+    desc->length = p - (char *)ctxt; /* Fixup length to be right */
+    h->cur += desc->length; /* Do _hvm_write_entry */
+    ASSERT(guestinfo_size < desc->length);
+
+    return 0;
+}
+
+static int vmport_load_domain_ctxt(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_vmport_context *ctxt;
+    struct hvm_save_descriptor *desc;
+    struct hvm_domain *hd = &d->arch.hvm_domain;
+    struct vmport_state *vs = hd->vmport_data;
+    unsigned int i;
+    uint8_t key_len;
+    uint16_t val_len;
+    char *p;
+    vmport_guestinfo_t *vg;
+    vmport_guestinfo_jumbo_t *vgj;
+    unsigned int loop_cnt;
+    unsigned int guestinfo_size;
+    unsigned int used_guestinfo;
+    unsigned int used_guestinfo_jumbo;
+
+    if ( !vs )
+        return -ENOMEM;
+
+    /* Customized checking for entry since our entry is of variable length */
+    desc = (struct hvm_save_descriptor *)&h->data[h->cur];
+    if ( sizeof(*desc) > h->size - h->cur )
+    {
+        printk(XENLOG_G_WARNING
+               "HVM%d restore: not enough data left to read descriptor "
+               "for type %lu\n", d->domain_id,
+               HVM_SAVE_CODE(VMPORT));
+        return -E2BIG;
+    }
+    if ( desc->length + sizeof(*desc) > h->size - h->cur )
+    {
+        printk(XENLOG_G_WARNING
+               "HVM%d restore: not enough data left to read %u bytes "
+               "for type %lu\n", d->domain_id, desc->length,
+               HVM_SAVE_CODE(VMPORT));
+        return -EFBIG;
+    }
+    if ( HVM_SAVE_CODE(VMPORT) != desc->typecode ||
+         (desc->length > HVM_SAVE_LENGTH(VMPORT)) )
+    {
+        printk(XENLOG_G_WARNING
+               "HVM%d restore mismatch: expected type %lu with max length %lu, "
+               "saw type %u length %u\n", d->domain_id, HVM_SAVE_CODE(VMPORT),
+               HVM_SAVE_LENGTH(VMPORT),
+               desc->typecode, desc->length);
+        return -ESRCH;
+    }
+    h->cur += sizeof(*desc);
+    /* Checking finished */
+
+    ctxt = (struct hvm_vmport_context *)&h->data[h->cur];
+    h->cur += desc->length;
+
+    if ( ctxt->version != VMPORT_SAVE_VERSION )
+        return -EINVAL;
+
+    spin_lock(&hd->vmport_lock);
+
+    vs->ping_time = ctxt->ping_time;
+    vs->open_cookie = ctxt->open_cookie;
+    vs->used_guestinfo = ctxt->used_guestinfo;
+    vs->used_guestinfo_jumbo = ctxt->used_guestinfo_jumbo;
+    guestinfo_size = ctxt->used_guestsize;
+    used_guestinfo = ctxt->used_guestinfo;
+    used_guestinfo_jumbo = ctxt->used_guestinfo_jumbo;
+
+    p = ctxt->u.packed.packed_data;
+
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        unsigned int j;
+
+        vs->chans[i].ctl = ctxt->u.packed.chan_ctl[i].chan;
+        memcpy(vs->chans[i].send_buf, p, vs->chans[i].ctl.send_len);
+        p += vs->chans[i].ctl.send_len;
+        for ( j = 0; j < VMPORT_MAX_BKTS; j++ )
+        {
+            vs->chans[i].recv_bkt[j].ctl = ctxt->u.packed.chan_ctl[i].recv[j];
+            memcpy(vs->chans[i].recv_bkt[j].recv_buf, p,
+                   vs->chans[i].recv_bkt[j].ctl.recv_len);
+            p += vs->chans[i].recv_bkt[j].ctl.recv_len;
+        }
+        vs->chans[i].jumbo_recv_bkt.ctl = ctxt->u.packed.chan_ctl[i].jumbo;
+        memcpy(vs->chans[i].jumbo_recv_bkt.recv_buf, p,
+               vs->chans[i].jumbo_recv_bkt.ctl.recv_len);
+        p += vs->chans[i].jumbo_recv_bkt.ctl.recv_len;
+    }
+
+
+    /* keep at least 10 total and 5 empty entries */
+    loop_cnt = (vs->used_guestinfo + 5) > 10 ?
+        (vs->used_guestinfo + 5) : 10;
+    for ( i = 0; i < loop_cnt; i++ )
+    {
+        if ( !vs->guestinfo[i] )
+        {
+            vs->guestinfo[i] = xzalloc(vmport_guestinfo_t);
+        }
+        if ( i < vs->used_guestinfo
+             && guestinfo_size > 0 )
+        {
+            ASSERT(sizeof(vg->key_len) == 1);
+            key_len = (uint8_t)*p++;
+            ASSERT(sizeof(vg->val_len) == 1);
+            val_len = (uint8_t)*p++;
+            guestinfo_size -= 2;
+            if ( guestinfo_size >= key_len + val_len )
+            {
+                vg = vs->guestinfo[i];
+                if ( key_len )
+                {
+                    vg->key_len = key_len;
+                    vg->val_len = val_len;
+                    memcpy(vg->key_data, p, key_len);
+                    p += key_len;
+                    memcpy(vg->val_data, p, val_len);
+                    p += val_len;
+                    guestinfo_size -= key_len + val_len;
+                }
+            }
+        }
+    }
+    vs->used_guestinfo = loop_cnt;
+
+    /* keep at least 2 total and 1 empty entries */
+    loop_cnt = (vs->used_guestinfo_jumbo + 1) > 2 ?
+        (vs->used_guestinfo_jumbo + 1) : 2;
+    for ( i = 0; i < loop_cnt; i++ )
+    {
+        if ( !vs->guestinfo_jumbo[i] )
+        {
+            vs->guestinfo_jumbo[i] = xzalloc(vmport_guestinfo_jumbo_t);
+        }
+        if ( i < vs->used_guestinfo_jumbo
+             && guestinfo_size > 0 )
+        {
+            ASSERT(sizeof(vgj->key_len) == 1);
+            key_len = (uint8_t)*p++;
+            /* This is so migration does not fail */
+            ASSERT(sizeof(vgj->val_len) == 2);
+            memcpy(&val_len, p, 2);
+            p += 2;
+            guestinfo_size -= 3;
+            if ( guestinfo_size >= key_len + val_len )
+            {
+                vgj = vs->guestinfo_jumbo[i];
+                if ( key_len )
+                {
+                    vgj->key_len = key_len;
+                    vgj->val_len = val_len;
+                    memcpy(vgj->key_data, p, key_len);
+                    p += key_len;
+                    memcpy(vgj->val_data, p, val_len);
+                    p += val_len;
+                    guestinfo_size -= key_len + val_len;
+                }
+            }
+        }
+    }
+    vs->used_guestinfo_jumbo = loop_cnt;
+
+    spin_unlock(&hd->vmport_lock);
+
+    return 0;
+}
+
+HVM_REGISTER_SAVE_RESTORE(VMPORT, vmport_save_domain_ctxt,
+                          vmport_load_domain_ctxt, 1, HVMSR_PER_DOM);
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 16d85a3..7cdcb8f 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -26,6 +26,8 @@
 #ifndef __XEN_PUBLIC_HVM_SAVE_X86_H__
 #define __XEN_PUBLIC_HVM_SAVE_X86_H__
 
+#include "vmporttype.h"
+
 /* 
  * Save/restore header: general info about the save file. 
  */
@@ -610,9 +612,44 @@ struct hvm_msr {
 
 #define CPU_MSR_CODE  20
 
+/*
+ * VMware context.
+ */
+struct hvm_vmport_context {
+    uint32_t version;
+    uint32_t used_guestsize;
+    uint64_t ping_time;
+    uint32_t open_cookie;
+    uint32_t used_guestinfo;
+    uint32_t used_guestinfo_jumbo;
+    union {
+        struct {
+            vmport_channel_t chans[VMPORT_MAX_CHANS];
+            vmport_guestinfo_t guestinfo[VMPORT_MAX_NUM_KEY];
+            vmport_guestinfo_jumbo_t guestinfo_jumbo[VMPORT_MAX_NUM_JUMBO_KEY];
+        } full;
+        struct {
+            struct {
+                vmport_channel_control_t chan;
+                vmport_bucket_control_t recv[VMPORT_MAX_BKTS];
+                vmport_bucket_control_t jumbo;
+            } chan_ctl[VMPORT_MAX_CHANS];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+            char packed_data[];
+#elif defined(__GNUC__)
+            char packed_data[0];
+#else
+            char packed_data[1 /* Variable length */];
+#endif
+        } packed;
+    } u;
+};
+
+DECLARE_HVM_SAVE_TYPE(VMPORT, 21, struct hvm_vmport_context);
+
 /* 
  * Largest type-code in use
  */
-#define HVM_SAVE_CODE_MAX 20
+#define HVM_SAVE_CODE_MAX 21
 
 #endif /* __XEN_PUBLIC_HVM_SAVE_X86_H__ */
-- 
1.8.4
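
Note (illustration only, not part of the patch above): the "packed" arm
of struct hvm_vmport_context ends in a variable-length packed_data
member, so the size of a save record depends on how much payload it
carries.  A minimal sketch of sizing such a record follows.  It assumes
a simplified stand-in layout; demo_ctx and demo_record_size are
hypothetical names, not anything defined by this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a save record with a fixed header and a
 * variable-length tail; this is NOT the layout from the patch. */
struct demo_ctx {
    uint32_t version;
    uint32_t used_size;     /* number of valid bytes in packed_data */
    char packed_data[];     /* C99 flexible array member */
};

/* Total bytes needed for a record carrying payload_len bytes. */
static size_t demo_record_size(size_t payload_len)
{
    return offsetof(struct demo_ctx, packed_data) + payload_len;
}
```

The fixed part is measured with offsetof() rather than sizeof() so that
the result does not depend on how a compiler sizes a struct that ends in
a zero- or one-element array, which are the fallbacks the header uses
for pre-C99 compilers.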

* [PATCH for-4.5 v6 12/16] Add dump of HVM_SAVE_CODE(VMPORT) to xen-hvmctx.
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (10 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 11/16] Add live migration of VMware's hyper-call RPC Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 13/16] Add xen-hvm-param Don Slutz
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

The dump also does some validation of the packed data.  It currently
expects all guestinfo keys and values to be printable strings.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 tools/misc/xen-hvmctx.c | 229 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)
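
Note (illustration only, not part of this patch): the small guestinfo
entries are read below as a one-byte key length, a one-byte value
length, then the key and value bytes.  A minimal sketch of decoding one
such record with bounds checks follows.  DEMO_MAX_KEY, DEMO_MAX_VAL and
demo_parse_record are hypothetical names, and the limits are made up
rather than taken from the VMPORT_MAX_* values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_KEY 30   /* hypothetical limits, not the VMPORT_* values */
#define DEMO_MAX_VAL 128

/*
 * Decode one record laid out as [key_len:1][val_len:1][key][val].
 * Returns the number of bytes consumed, or 0 if the record is malformed
 * or would run past the end of the buffer.
 */
static size_t demo_parse_record(const uint8_t *buf, size_t avail,
                                char key[DEMO_MAX_KEY + 1],
                                char val[DEMO_MAX_VAL + 1])
{
    size_t key_len, val_len;

    if ( avail < 2 )
        return 0;
    key_len = buf[0];
    val_len = buf[1];
    if ( key_len == 0 || key_len > DEMO_MAX_KEY || val_len > DEMO_MAX_VAL )
        return 0;
    if ( avail - 2 < key_len + val_len )
        return 0;
    memcpy(key, buf + 2, key_len);
    key[key_len] = '\0';
    memcpy(val, buf + 2 + key_len, val_len);
    val[val_len] = '\0';
    return 2 + key_len + val_len;
}
```

Unlike dump_vmport(), which clamps an oversized length and keeps going
so that as much as possible gets printed, this sketch rejects the record
outright.  Which behaviour is preferable depends on whether the caller
is a diagnostic dump or a consumer of the data.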

diff --git a/tools/misc/xen-hvmctx.c b/tools/misc/xen-hvmctx.c
index 5a69245..d076091 100644
--- a/tools/misc/xen-hvmctx.c
+++ b/tools/misc/xen-hvmctx.c
@@ -399,6 +399,225 @@ static void dump_tsc_adjust(void)
     printf("    TSC_ADJUST: tsc_adjust %" PRIx64 "\n", p.tsc_adjust);
 }
 
+static void dump_vmport(int vmport_size)
+{
+    int i;
+    HVM_SAVE_TYPE(VMPORT) *vp;
+    int64_t vmport_guestsize;
+    uint32_t vmport_used_guestinfo;
+    uint32_t vmport_used_guestinfo_jumbo;
+    uint8_t pb[vmport_size];
+    char *p;
+    int chans_size;
+
+    READ(pb);
+    vp = (void *)&pb;
+
+    p = vp->u.packed.packed_data;
+
+    vmport_guestsize = vp->used_guestsize;
+    chans_size = vmport_size - vmport_guestsize - (p - (char *)pb);
+    if ( chans_size < 0 )
+    {
+        fprintf(stderr, "*** VMPORT: bogus chans_size=%d should be >= 0\n"
+                "   vmport_size=%d vmport_guestsize=%d fixed_size=%td\n",
+                chans_size, vmport_size, (int)vmport_guestsize,
+                p - (char *)pb);
+        chans_size = 0;
+    }
+
+    printf("    VMPORT: ping_time %" PRIu64 "\n", vp->ping_time);
+    printf("    VMPORT: open_cookie %" PRIx32 "\n", vp->open_cookie);
+    for ( i = 0; i < VMPORT_MAX_CHANS; i++ )
+    {
+        int j;
+        vmport_channel_control_t *vc = &vp->u.packed.chan_ctl[i].chan;
+        vmport_bucket_control_t *jb =  &vp->u.packed.chan_ctl[i].jumbo;
+
+        printf("    VMPORT: chan[%d] chan_id %d\n", i, vc->chan_id);
+        printf("    VMPORT: chan[%d] active_time %" PRIx64 "\n",
+               i, vc->active_time);
+        printf("    VMPORT: chan[%d] proto_num %" PRIx32 "\n",
+               i, vc->proto_num);
+        printf("    VMPORT: chan[%d] recv_read %d\n", i, vc->recv_read);
+        printf("    VMPORT: chan[%d] recv_write %d\n", i, vc->recv_write);
+        printf("    VMPORT: chan[%d] jumbo %d\n", i, vc->jumbo);
+        printf("    VMPORT: chan[%d] send_len %d\n", i, vc->send_len);
+        if ( vc->send_len > VMPORT_MAX_SEND_BUF * 4 )
+        {
+            printf("--- VMPORT: truncated send_len=%d > %d\n",
+                   vc->send_len, VMPORT_MAX_SEND_BUF * 4);
+            vc->send_len = VMPORT_MAX_SEND_BUF * 4;
+        }
+        if ( vc->send_len > chans_size )
+        {
+            fprintf(stderr, "*** VMPORT: bogus send_len=%d > %d\n",
+                    vc->send_len, chans_size);
+            if ( chans_size >= 0 )
+                vc->send_len = chans_size;
+            else
+                vc->send_len = 0;
+        }
+        p += vc->send_len;
+        chans_size -= vc->send_len;
+        for ( j = 0; j < VMPORT_MAX_BKTS; j++ )
+        {
+            vmport_bucket_control_t *b = &vp->u.packed.chan_ctl[i].recv[j];
+
+            printf("    VMPORT: chan[%d] bucket[%d] recv_len %d\n",
+                   i, j, b->recv_len);
+            if ( b->recv_len > VMPORT_MAX_RECV_BUF * 4 )
+            {
+                printf("--- VMPORT: truncated recv_len=%d > %d\n",
+                       b->recv_len, VMPORT_MAX_RECV_BUF * 4);
+                b->recv_len = VMPORT_MAX_RECV_BUF * 4;
+            }
+            if ( b->recv_len > chans_size )
+            {
+                fprintf(stderr, "*** VMPORT: bogus recv_len=%d > %d\n",
+                        b->recv_len, chans_size);
+                if ( chans_size >= 0 )
+                    b->recv_len = chans_size;
+                else
+                    b->recv_len = 0;
+            }
+            p += b->recv_len;
+            chans_size -= b->recv_len;
+        }
+        printf("    VMPORT: chan[%d] jumbo_bkt recv_len %d\n", i, jb->recv_len);
+        if ( jb->recv_len > VMPORT_MAX_RECV_JUMBO_BUF * 4 )
+        {
+            printf("--- VMPORT: truncated recv_len=%d > %d\n",
+                   jb->recv_len, VMPORT_MAX_RECV_JUMBO_BUF * 4);
+            jb->recv_len = VMPORT_MAX_RECV_JUMBO_BUF * 4;
+        }
+        if ( jb->recv_len > chans_size )
+        {
+            fprintf(stderr, "*** VMPORT: bogus recv_len=%d > %d\n",
+                    jb->recv_len, chans_size);
+            if ( chans_size >= 0 )
+                jb->recv_len = chans_size;
+            else
+                jb->recv_len = 0;
+        }
+        p += jb->recv_len;
+        chans_size -= jb->recv_len;
+    }
+
+    if ( chans_size != 0 )
+        fprintf(stderr, "*** VMPORT: bogus chans_size=%d should be 0\n",
+                chans_size);
+
+    vmport_used_guestinfo = vp->used_guestinfo;
+    vmport_used_guestinfo_jumbo = vp->used_guestinfo_jumbo;
+
+    if ( vmport_used_guestinfo == 0 )
+        printf("    VMPORT: no small data\n");
+    for ( i = 0; i < vmport_used_guestinfo; i++ )
+    {
+        if ( vmport_guestsize > 0 )
+        {
+            uint8_t key_len = (uint8_t)(*p++);
+            uint8_t val_len = (uint8_t)(*p++);
+            if ( key_len )
+            {
+                char key[VMPORT_MAX_KEY_LEN + 1];
+                char val[VMPORT_MAX_VAL_LEN + 1];
+
+                if ( key_len > VMPORT_MAX_KEY_LEN )
+                {
+                    fprintf(stderr,
+                            "*** VMPORT: bogus key_len=%d > %d for guestinfo[%d]\n",
+                            key_len, VMPORT_MAX_KEY_LEN, i);
+                    key_len = VMPORT_MAX_KEY_LEN;
+                }
+                memcpy(key, p, key_len);
+                p += key_len;
+                key[key_len] = '\0';
+                if ( val_len > VMPORT_MAX_VAL_LEN )
+                {
+                    fprintf(stderr,
+                            "*** VMPORT: bogus val_len=%d > %d for guestinfo[%d]\n",
+                            val_len, VMPORT_MAX_VAL_LEN, i);
+                    val_len = VMPORT_MAX_VAL_LEN;
+                }
+                memcpy(val, p, val_len);
+                p += val_len;
+                val[val_len] = '\0';
+                vmport_guestsize -= 2 + key_len + val_len;
+                printf("    VMPORT: guestinfo[%d](%s) = \"%s\"\n",
+                       i, key, val);
+            }
+            else
+            {
+                fprintf(stderr,
+                        "*** VMPORT: bogus len for guestinfo[%d]\n",
+                        i);
+                vmport_guestsize -= 2;
+            }
+            if ( vmport_guestsize < 0 )
+                printf("    VMPORT: data length skew at guestinfo[%d]\n"
+                       "         remaining datasize=%" PRId64 "\n",
+                       i, vmport_guestsize);
+        }
+    }
+
+    if ( vmport_guestsize == 0 )
+        printf("    VMPORT: no jumbo data\n");
+    for ( i = 0; i < vmport_used_guestinfo_jumbo; i++ )
+    {
+        if ( vmport_guestsize > 0 )
+        {
+            uint8_t key_len = (uint8_t)(*p++);
+            uint16_t val_len;
+
+            memcpy(&val_len, p, 2);
+            p += 2;
+            if ( key_len )
+            {
+                char key[VMPORT_MAX_KEY_LEN + 1];
+                char val[VMPORT_MAX_VAL_JUMBO_LEN + 1];
+
+                if ( key_len > VMPORT_MAX_KEY_LEN )
+                {
+                    fprintf(stderr,
+                            "*** VMPORT: bogus key_len=%d > %d for guestinfo[%d]\n",
+                            key_len, VMPORT_MAX_KEY_LEN, i);
+                    key_len = VMPORT_MAX_KEY_LEN;
+                }
+                memcpy(key, p, key_len);
+                p += key_len;
+                key[key_len] = '\0';
+                if ( val_len > VMPORT_MAX_VAL_JUMBO_LEN )
+                {
+                    fprintf(stderr,
+                            "*** VMPORT: bogus val_len=%d > %d for guestinfo[%d]\n",
+                            val_len, VMPORT_MAX_VAL_JUMBO_LEN, i);
+                    val_len = VMPORT_MAX_VAL_JUMBO_LEN;
+                }
+                memcpy(val, p, val_len);
+                p += val_len;
+                val[val_len] = '\0';
+                vmport_guestsize -= 2 + key_len + val_len;
+                printf("    VMPORT: guestinfo_jumbo[%d](%s) = \"%s\"\n",
+                       i, key, val);
+            }
+            else
+            {
+                printf("    VMPORT: bogus len for guestinfo_jumbo[%d]\n", i);
+                vmport_guestsize -= 2;
+            }
+            if ( vmport_guestsize < 0 )
+                printf("    VMPORT: data length skew at guestinfo_jumbo[%d]\n"
+                       "         remaining datasize=%" PRId64 "\n", i,
+                       vmport_guestsize);
+        }
+    }
+
+    if ( vmport_guestsize )
+        printf("    VMPORT: %" PRId64 " bytes leftover data\n", vmport_guestsize);
+}
+
 int main(int argc, char **argv)
 {
     int entry, domid;
@@ -467,6 +686,7 @@ int main(int argc, char **argv)
         case HVM_SAVE_CODE(VIRIDIAN_VCPU): dump_viridian_vcpu(); break;
         case HVM_SAVE_CODE(VMCE_VCPU): dump_vmce_vcpu(); break;
         case HVM_SAVE_CODE(TSC_ADJUST): dump_tsc_adjust(); break;
+        case HVM_SAVE_CODE(VMPORT): dump_vmport(desc.length); break;
         case HVM_SAVE_CODE(END): break;
         default:
             printf(" ** Don't understand type %u: skipping\n",
@@ -477,3 +697,12 @@ int main(int argc, char **argv)
 
     return 0;
 } 
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.4

* [OPTIONAL][PATCH for-4.5 v6 13/16] Add xen-hvm-param
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (11 preceding siblings ...)
  2014-09-20 18:07 ` [PATCH for-4.5 v6 12/16] Add dump of HVM_SAVE_CODE(VMPORT) to xen-hvmctx Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 14/16] Add xen-vmware-guestinfo Don Slutz
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

A tool to get and set HVM params.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 .gitignore                 |   1 +
 tools/misc/Makefile        |   7 +-
 tools/misc/xen-hvm-param.c | 164 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+), 2 deletions(-)
 create mode 100644 tools/misc/xen-hvm-param.c
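
Note (illustration only, not part of this patch): the tool below fills
its name table at run time with one snprintf() call per parameter.  One
possible alternative is a constant table built with C99 designated
initializers, sketched here.  The DEMO_* indices and demo_lookup() are
hypothetical stand-ins; the real tool would index by the HVM_PARAM_*
constants.

```c
#include <stddef.h>

/* Hypothetical indices standing in for the HVM_PARAM_* constants. */
enum {
    DEMO_PARAM_CALLBACK_IRQ = 0,
    DEMO_PARAM_STORE_PFN    = 1,
    DEMO_PARAM_VIRIDIAN     = 9,
    DEMO_NR_PARAMS          = 12
};

/* Slots without an initializer are left as NULL. */
static const char *const demo_param_name[DEMO_NR_PARAMS] = {
    [DEMO_PARAM_CALLBACK_IRQ] = "Callback_Irq",
    [DEMO_PARAM_STORE_PFN]    = "Store_Pfn",
    [DEMO_PARAM_VIRIDIAN]     = "Viridian",
};

/* Returns the name for param, or NULL when unnamed or out of range. */
static const char *demo_lookup(int param)
{
    if ( param < 0 || param >= DEMO_NR_PARAMS )
        return NULL;
    return demo_param_name[param];
}
```

A caller would print the numeric index when demo_lookup() returns NULL,
which covers both the "Unknown %d" default and the out-of-range case
that the tool currently handles separately.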

diff --git a/.gitignore b/.gitignore
index bb928a5..aff5758 100644
--- a/.gitignore
+++ b/.gitignore
@@ -180,6 +180,7 @@ tools/misc/xen-tmem-list-parse
 tools/misc/xenperf
 tools/misc/xenpm
 tools/misc/xen-hvmctx
+tools/misc/xen-hvm-param
 tools/misc/gtraceview
 tools/misc/gtracestat
 tools/misc/xenlockprof
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 69b1817..b8d4579 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,7 +10,7 @@ CFLAGS += $(CFLAGS_libxenstore)
 HDRS     = $(wildcard *.h)
 
 TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
-TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
+TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-hvmcrash xen-lowmemd xen-mfndump
 TARGETS-$(CONFIG_MIGRATE) += xen-hptool
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
 
 INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
 	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
-INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-hvmcrash xen-lowmemd xen-mfndump
 INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -57,6 +57,9 @@ clean:
 xen-hvmctx: xen-hvmctx.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-hvm-param: xen-hvm-param.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-hvmcrash: xen-hvmcrash.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-hvm-param.c b/tools/misc/xen-hvm-param.c
new file mode 100644
index 0000000..0c62fa2
--- /dev/null
+++ b/tools/misc/xen-hvm-param.c
@@ -0,0 +1,164 @@
+/*
+ * tools/misc/xen-hvm-param.c
+ *
+ * Copyright (C) 2014 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <err.h>
+
+#include <xenctrl.h>
+
+
+int
+main(int argc, char **argv)
+{
+    xc_interface *xch;
+    int domid;
+    int start_param = 0;
+    int end_param = HVM_NR_PARAMS;
+    int param;
+    int ret = 0;
+    int i;
+    char hvm_param_name[HVM_NR_PARAMS][80];
+
+    unsigned long hvm_param = -1;
+
+    if ( (argc < 2) || (argc > 4) )
+        errx(1, "usage: %s domid [param [new]]", argv[0]);
+
+    for ( i = 0; i < HVM_NR_PARAMS; i++ )
+        snprintf(hvm_param_name[i], sizeof(hvm_param_name[i]), "Unknown %d", i);
+
+    snprintf(hvm_param_name[HVM_PARAM_CALLBACK_IRQ],
+             sizeof(hvm_param_name[HVM_PARAM_CALLBACK_IRQ]), "Callback_Irq");
+    snprintf(hvm_param_name[HVM_PARAM_STORE_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_STORE_PFN]), "Store_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_STORE_EVTCHN],
+             sizeof(hvm_param_name[HVM_PARAM_STORE_EVTCHN]), "Store_Evtchn");
+    snprintf(hvm_param_name[HVM_PARAM_PAE_ENABLED],
+             sizeof(hvm_param_name[HVM_PARAM_PAE_ENABLED]), "Pae_Enabled");
+    snprintf(hvm_param_name[HVM_PARAM_IOREQ_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_IOREQ_PFN]), "Ioreq_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_BUFIOREQ_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_BUFIOREQ_PFN]), "Bufioreq_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_VIRIDIAN],
+             sizeof(hvm_param_name[HVM_PARAM_VIRIDIAN]), "Viridian");
+    snprintf(hvm_param_name[HVM_PARAM_TIMER_MODE],
+             sizeof(hvm_param_name[HVM_PARAM_TIMER_MODE]), "Timer_Mode");
+    snprintf(hvm_param_name[HVM_PARAM_HPET_ENABLED],
+             sizeof(hvm_param_name[HVM_PARAM_HPET_ENABLED]), "Hpet_Enabled");
+    snprintf(hvm_param_name[HVM_PARAM_IDENT_PT],
+             sizeof(hvm_param_name[HVM_PARAM_IDENT_PT]), "Ident_Pt");
+    snprintf(hvm_param_name[HVM_PARAM_DM_DOMAIN],
+             sizeof(hvm_param_name[HVM_PARAM_DM_DOMAIN]), "Dm_Domain");
+    snprintf(hvm_param_name[HVM_PARAM_ACPI_S_STATE],
+             sizeof(hvm_param_name[HVM_PARAM_ACPI_S_STATE]), "Acpi_S_State");
+    snprintf(hvm_param_name[HVM_PARAM_VM86_TSS],
+             sizeof(hvm_param_name[HVM_PARAM_VM86_TSS]), "Vm86_Tss");
+    snprintf(hvm_param_name[HVM_PARAM_VPT_ALIGN],
+             sizeof(hvm_param_name[HVM_PARAM_VPT_ALIGN]), "Vpt_Align");
+    snprintf(hvm_param_name[HVM_PARAM_CONSOLE_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_CONSOLE_PFN]), "Console_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_CONSOLE_EVTCHN],
+             sizeof(hvm_param_name[HVM_PARAM_CONSOLE_EVTCHN]), "Console_Evtchn");
+    snprintf(hvm_param_name[HVM_PARAM_ACPI_IOPORTS_LOCATION],
+             sizeof(hvm_param_name[HVM_PARAM_ACPI_IOPORTS_LOCATION]),
+             "Acpi_Ioports_Location");
+    snprintf(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR0],
+             sizeof(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR0]), "Memory_Event_Cr0");
+    snprintf(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR3],
+             sizeof(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR3]), "Memory_Event_Cr3");
+    snprintf(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR4],
+             sizeof(hvm_param_name[HVM_PARAM_MEMORY_EVENT_CR4]), "Memory_Event_Cr4");
+    snprintf(hvm_param_name[HVM_PARAM_MEMORY_EVENT_INT3],
+             sizeof(hvm_param_name[HVM_PARAM_MEMORY_EVENT_INT3]), "Memory_Event_Int3");
+    snprintf(hvm_param_name[HVM_PARAM_NESTEDHVM],
+             sizeof(hvm_param_name[HVM_PARAM_NESTEDHVM]), "Nestedhvm");
+    snprintf(hvm_param_name[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP],
+             sizeof(hvm_param_name[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP]),
+             "Memory_Event_Single_Step");
+    snprintf(hvm_param_name[HVM_PARAM_BUFIOREQ_EVTCHN],
+             sizeof(hvm_param_name[HVM_PARAM_BUFIOREQ_EVTCHN]), "Bufioreq_Evtchn");
+    snprintf(hvm_param_name[HVM_PARAM_PAGING_RING_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_PAGING_RING_PFN]), "Paging_Ring_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_ACCESS_RING_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_ACCESS_RING_PFN]), "Access_Ring_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_SHARING_RING_PFN],
+             sizeof(hvm_param_name[HVM_PARAM_SHARING_RING_PFN]), "Sharing_Ring_Pfn");
+    snprintf(hvm_param_name[HVM_PARAM_VMWARE_HW],
+             sizeof(hvm_param_name[HVM_PARAM_VMWARE_HW]), "Vmware_Hw");
+    snprintf(hvm_param_name[HVM_PARAM_VMPORT_BUILD_NUMBER_TIME],
+             sizeof(hvm_param_name[HVM_PARAM_VMPORT_BUILD_NUMBER_TIME]),
+             "Vmport_Build_Number_Time");
+    snprintf(hvm_param_name[HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE],
+             sizeof(hvm_param_name[HVM_PARAM_VMPORT_BUILD_NUMBER_VALUE]),
+             "Vmport_Build_Number_Value");
+    snprintf(hvm_param_name[HVM_PARAM_VMPORT_RESET_TIME],
+             sizeof(hvm_param_name[HVM_PARAM_VMPORT_RESET_TIME]), "Vmport_Reset_Time");
+
+    xch = xc_interface_open(0, 0, 0);
+    if ( !xch )
+        err(1, "failed to open control interface");
+
+    domid = atoi(argv[1]);
+    if ( argc > 2 )
+    {
+        start_param = strtol(argv[2], NULL, 0);
+        end_param = start_param + 1;
+    }
+
+    for ( param = start_param; param < end_param; param++ )
+    {
+        ret = xc_get_hvm_param(xch, domid, param, &hvm_param);
+        if ( ret )
+            err(1, "failed to get hvm param %d for domid %d", param, domid);
+        else
+        {
+            if ( argc == 4 )
+            {
+                long new = strtol(argv[3], NULL, 0);
+
+                ret = xc_set_hvm_param(xch, domid, param, new);
+                if ( ret )
+                    err(1, "failed to set hvm param %d for domid %d", param, domid);
+                else if ( (param >= 0) && (param < HVM_NR_PARAMS) )
+                    printf("hvm_param(%s)=0x%lx(%ld) was 0x%lx(%ld)\n",
+                           hvm_param_name[param], new, new, hvm_param, hvm_param);
+                else
+                    printf("hvm_param(%d)=0x%lx(%ld) was 0x%lx(%ld)\n",
+                           param, new, new, hvm_param, hvm_param);
+            }
+            else
+            {
+                if ( (param >= 0) && (param < HVM_NR_PARAMS) )
+                    printf("hvm_param(%s)=0x%lx(%ld)\n", hvm_param_name[param], hvm_param,
+                           hvm_param);
+                else
+                    printf("hvm_param(%d)=0x%lx(%ld)\n", param, hvm_param, hvm_param);
+            }
+        }
+    }
+    xc_interface_close(xch);
+
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.4

* [OPTIONAL][PATCH for-4.5 v6 14/16] Add xen-vmware-guestinfo
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (12 preceding siblings ...)
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 13/16] Add xen-hvm-param Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 15/16] Add xen-list-vmware-guestinfo Don Slutz
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

A tool to get and set VMware guestinfo.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 .gitignore                        |  1 +
 tools/misc/Makefile               |  7 ++-
 tools/misc/xen-vmware-guestinfo.c | 97 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 103 insertions(+), 2 deletions(-)
 create mode 100644 tools/misc/xen-vmware-guestinfo.c
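
Note (illustration only, not part of this patch): the tool receives the
value as a length-counted buffer and has to turn it into a C string
before printing, flagging the case where the last byte is sacrificed for
the terminator.  That step could be factored into a helper along these
lines; demo_terminate() is a hypothetical name.

```c
#include <stddef.h>

/*
 * NUL-terminate buf, which holds len valid bytes in a buffer of cap
 * bytes (cap must be at least 1).  Returns "..." when the data had to
 * be cut short to make room for the terminator, "" otherwise.
 */
static const char *demo_terminate(char *buf, size_t cap, size_t len)
{
    if ( len < cap )
    {
        buf[len] = '\0';
        return "";
    }
    buf[cap - 1] = '\0';
    return "...";
}
```

The returned marker can be passed straight to printf() after the value,
in the same way the tool uses its vals variable.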

diff --git a/.gitignore b/.gitignore
index aff5758..07f20b9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -181,6 +181,7 @@ tools/misc/xenperf
 tools/misc/xenpm
 tools/misc/xen-hvmctx
 tools/misc/xen-hvm-param
+tools/misc/xen-vmware-guestinfo
 tools/misc/gtraceview
 tools/misc/gtracestat
 tools/misc/xenlockprof
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index b8d4579..f2ffe1a 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,7 +10,7 @@ CFLAGS += $(CFLAGS_libxenstore)
 HDRS     = $(wildcard *.h)
 
 TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
-TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-hvmcrash xen-lowmemd xen-mfndump
+TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
 TARGETS-$(CONFIG_MIGRATE) += xen-hptool
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
 
 INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
 	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
-INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-hvmcrash xen-lowmemd xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
 INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -60,6 +60,9 @@ xen-hvmctx: xen-hvmctx.o
 xen-hvm-param: xen-hvm-param.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-vmware-guestinfo: xen-vmware-guestinfo.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-hvmcrash: xen-hvmcrash.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-vmware-guestinfo.c b/tools/misc/xen-vmware-guestinfo.c
new file mode 100644
index 0000000..e6b288c
--- /dev/null
+++ b/tools/misc/xen-vmware-guestinfo.c
@@ -0,0 +1,97 @@
+/*
+ * tools/misc/xen-vmware-guestinfo.c
+ *
+ * Copyright (C) 2014 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <err.h>
+
+#include <xenctrl.h>
+
+
+int
+main(int argc, char **argv)
+{
+    xc_interface *xch;
+    int domid;
+    int ret = 0;
+
+    char value[8192];
+    unsigned int val_len;
+    char *vals = "";
+
+    if ( (argc < 3) || (argc > 4) )
+        errx(1, "usage: %s domid guestinfo.<name> [new]", argv[0]);
+
+    xch = xc_interface_open(0, 0, 0);
+    if ( !xch )
+        err(1, "failed to open control interface");
+
+    domid = atoi(argv[1]);
+
+    ret = xc_get_vmport_guest_info(xch, domid, strlen(argv[2]), argv[2],
+                                   sizeof(value), &val_len, value);
+    if ( !ret )
+    {
+        /* Make sure this is a c-string */
+        if ( val_len < sizeof(value) )
+            value[val_len] = 0;
+        else
+        {
+            value[sizeof(value) - 1] = 0;
+            vals = "...";
+        }
+    }
+
+    if ( argc == 4 )
+    {
+        int ret1;
+
+        if ( ret )
+            warn("failed to get VMware guestinfo '%s' for domid %d", argv[2], domid);
+        ret1 = xc_set_vmport_guest_info(xch, domid, strlen(argv[2]), argv[2],
+                                        strlen(argv[3]), argv[3]);
+        if ( ret1 )
+            err(1, "failed to set VMware guestinfo '%s' for domid %d", argv[2], domid);
+        else if ( ret )
+            printf("VMware guestinfo '%s'='%s'\n",
+                   argv[2], argv[3]);
+        else
+            printf("VMware guestinfo '%s' was '%s'%s now '%s'\n",
+                   argv[2], value, vals, argv[3]);
+    }
+    else
+    {
+        if ( ret )
+            err(1, "failed to get VMware guestinfo '%s' for domid %d", argv[2], domid);
+        else
+        {
+            printf("VMware guestinfo '%s'='%s'%s\n",
+                   argv[2], value, vals);
+        }
+    }
+    xc_interface_close(xch);
+
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.4

* [OPTIONAL][PATCH for-4.5 v6 15/16] Add xen-list-vmware-guestinfo
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (13 preceding siblings ...)
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 14/16] Add xen-vmware-guestinfo Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 16/16] Add xen-hvm-send-trigger Don Slutz
  2014-09-22 13:56 ` [PATCH for-4.5 v6 00/16] Xen VMware tools support Ian Campbell
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

A tool to list the currently set VMware guestinfo.

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 .gitignore                             |  1 +
 tools/misc/Makefile                    |  7 ++-
 tools/misc/xen-list-vmware-guestinfo.c | 88 ++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+), 2 deletions(-)
 create mode 100644 tools/misc/xen-list-vmware-guestinfo.c
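
Note (illustration only, not part of this patch): the tool enumerates
entries by index, calling the fetch routine with idx = 0, 1, 2, ... until
it reports failure, and skipping slots whose key length is zero.  The
loop shape is sketched below against an in-memory stand-in; demo_keys,
demo_fetch() and demo_count_set() are hypothetical and do not model the
libxc call or its arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory store; an empty key marks an unused slot. */
static const char *const demo_keys[] = { "guestinfo.ip", "", "guestinfo.os" };

/* Stand-in for the per-index fetch: returns 0 on success and non-zero
 * once idx is past the last slot. */
static int demo_fetch(unsigned int idx, const char **key, size_t *key_len)
{
    if ( idx >= sizeof(demo_keys) / sizeof(demo_keys[0]) )
        return -1;
    *key = demo_keys[idx];
    *key_len = strlen(demo_keys[idx]);
    return 0;
}

/* Walk every slot and count the ones that hold a key. */
static unsigned int demo_count_set(void)
{
    unsigned int idx = 0, used = 0;
    const char *key;
    size_t key_len;

    while ( !demo_fetch(idx, &key, &key_len) )
    {
        if ( key_len )
            used++;
        idx++;
    }
    return used;
}
```

Note that a loop of this shape cannot tell "end of list" apart from a
genuine error on the fetch call, since both simply end the iteration.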

diff --git a/.gitignore b/.gitignore
index 07f20b9..606c703 100644
--- a/.gitignore
+++ b/.gitignore
@@ -182,6 +182,7 @@ tools/misc/xenpm
 tools/misc/xen-hvmctx
 tools/misc/xen-hvm-param
 tools/misc/xen-vmware-guestinfo
+tools/misc/xen-list-vmware-guestinfo
 tools/misc/gtraceview
 tools/misc/gtracestat
 tools/misc/xenlockprof
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index f2ffe1a..3e7d216 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,7 +10,7 @@ CFLAGS += $(CFLAGS_libxenstore)
 HDRS     = $(wildcard *.h)
 
 TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
-TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
+TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
 TARGETS-$(CONFIG_MIGRATE) += xen-hptool
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
 
 INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
 	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
-INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
 INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -63,6 +63,9 @@ xen-hvm-param: xen-hvm-param.o
 xen-vmware-guestinfo: xen-vmware-guestinfo.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-list-vmware-guestinfo: xen-list-vmware-guestinfo.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-hvmcrash: xen-hvmcrash.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-list-vmware-guestinfo.c b/tools/misc/xen-list-vmware-guestinfo.c
new file mode 100644
index 0000000..2122fcc
--- /dev/null
+++ b/tools/misc/xen-list-vmware-guestinfo.c
@@ -0,0 +1,88 @@
+/*
+ * tools/misc/xen-list-vmware-guestinfo.c
+ *
+ * Copyright (C) 2014 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <err.h>
+
+#include <xenctrl.h>
+
+
+int
+main(int argc, char **argv)
+{
+    xc_interface *xch;
+    int domid;
+    int ret = 0;
+
+    unsigned int idx = 0;
+    char key[128];
+    unsigned int key_len;
+    char value[8192];
+    unsigned int value_len;
+
+    if ( argc != 2 )
+        errx(1, "usage: %s domid", argv[0]);
+
+    xch = xc_interface_open(0, 0, 0);
+    if ( !xch )
+        err(1, "failed to open control interface");
+
+    domid = atoi(argv[1]);
+
+    while ( !xc_fetch_all_vmport_guest_info(xch, domid, idx, sizeof(key),
+                                            &key_len, key, sizeof(value),
+                                            &value_len, value) )
+    {
+        if ( key_len )
+        {
+            char *keys = "";
+            char *vals = "";
+
+            /* Make sure this is a c-string */
+            if ( key_len < sizeof(key) )
+                key[key_len] = 0;
+            else
+            {
+                key[sizeof(key) - 1] = 0;
+                keys = "...";
+            }
+            /* Make sure this is a c-string */
+            if ( value_len < sizeof(value) )
+                value[value_len] = 0;
+            else
+            {
+                value[sizeof(value) - 1] = 0;
+                vals = "...";
+            }
+            printf("VMware guestinfo '%s'%s='%s'%s\n",
+                   key, keys, value, vals);
+        }
+        idx++;
+    }
+    xc_interface_close(xch);
+
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.4


* [OPTIONAL][PATCH for-4.5 v6 16/16] Add xen-hvm-send-trigger
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (14 preceding siblings ...)
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 15/16] Add xen-list-vmware-guestinfo Don Slutz
@ 2014-09-20 18:07 ` Don Slutz
  2014-09-22 13:56 ` [PATCH for-4.5 v6 00/16] Xen VMware tools support Ian Campbell
  16 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-20 18:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Don Slutz, Tim Deegan,
	George Dunlap, Aravind Gopalakrishnan, Jan Beulich,
	Andrew Cooper, Boris Ostrovsky, Suravee Suthikulpanit

A tool to send a trigger to a domU via xc_domain_send_trigger

Signed-off-by: Don Slutz <dslutz@verizon.com>
---
 .gitignore                        |   1 +
 tools/misc/Makefile               |   7 ++-
 tools/misc/xen-hvm-send-trigger.c | 103 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 tools/misc/xen-hvm-send-trigger.c

diff --git a/.gitignore b/.gitignore
index 606c703..d66c5f5 100644
--- a/.gitignore
+++ b/.gitignore
@@ -183,6 +183,7 @@ tools/misc/xen-hvmctx
 tools/misc/xen-hvm-param
 tools/misc/xen-vmware-guestinfo
 tools/misc/xen-list-vmware-guestinfo
+tools/misc/xen-hvm-send-trigger
 tools/misc/gtraceview
 tools/misc/gtracestat
 tools/misc/xenlockprof
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 3e7d216..9c5e988 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,7 +10,7 @@ CFLAGS += $(CFLAGS_libxenstore)
 HDRS     = $(wildcard *.h)
 
 TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat xenlockprof xenwatchdogd xencov
-TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
+TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvm-send-trigger xen-hvmcrash xen-lowmemd xen-mfndump
 TARGETS-$(CONFIG_MIGRATE) += xen-hptool
 TARGETS := $(TARGETS-y)
 
@@ -22,7 +22,7 @@ INSTALL_BIN := $(INSTALL_BIN-y)
 
 INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse gtraceview \
 	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
-INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvmcrash xen-lowmemd xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvm-param xen-vmware-guestinfo xen-list-vmware-guestinfo xen-hvm-send-trigger xen-hvmcrash xen-lowmemd xen-mfndump
 INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
@@ -66,6 +66,9 @@ xen-vmware-guestinfo: xen-vmware-guestinfo.o
 xen-list-vmware-guestinfo: xen-list-vmware-guestinfo.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-hvm-send-trigger: xen-hvm-send-trigger.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-hvmcrash: xen-hvmcrash.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-hvm-send-trigger.c b/tools/misc/xen-hvm-send-trigger.c
new file mode 100644
index 0000000..b822f9c
--- /dev/null
+++ b/tools/misc/xen-hvm-send-trigger.c
@@ -0,0 +1,103 @@
+/*
+ * tools/misc/xen-hvm-send-trigger.c
+ *
+ * Copyright (C) 2014 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <err.h>
+
+#include <xenctrl.h>
+#include <xen/domctl.h>
+
+#define HVM_NR_SENDTRIGGERS 8
+
+int
+main(int argc, char **argv)
+{
+    xc_interface *xch;
+    int domid;
+    int sendtrigger;
+    uint32_t vcpuid = 0;
+    int ret = 0;
+    int i;
+    char hvm_sendtrigger_name[HVM_NR_SENDTRIGGERS][80];
+
+    if ( (argc < 3) || (argc > 4) )
+        errx(1, "usage: %s domid trigger [vcpuid]", argv[0]);
+
+    for ( i = 0; i < HVM_NR_SENDTRIGGERS; i++ )
+        snprintf(hvm_sendtrigger_name[i], sizeof(hvm_sendtrigger_name[i]), "Unknown %d",
+                 i);
+
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_NMI],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_NMI]),
+             "Trigger_Nmi");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_RESET],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_RESET]),
+             "Trigger_Reset");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_INIT],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_INIT]),
+             "Trigger_Init");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_POWER],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_POWER]),
+             "Trigger_Power");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_SLEEP],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_SLEEP]),
+             "Trigger_Sleep");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTPOWER],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTPOWER]),
+             "Trigger_VTPower");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTREBOOT],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTREBOOT]),
+             "Trigger_VTReboot");
+    snprintf(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTPING],
+             sizeof(hvm_sendtrigger_name[XEN_DOMCTL_SENDTRIGGER_VTPING]),
+             "Trigger_VTPing");
+
+
+    xch = xc_interface_open(0, 0, 0);
+    if ( !xch )
+        err(1, "failed to open control interface");
+
+    domid = atoi(argv[1]);
+    sendtrigger = strtol(argv[2], NULL, 0);
+    if ( argc > 3 )
+        vcpuid = strtol(argv[3], NULL, 0);
+
+    if ( sendtrigger >= 0 && sendtrigger < HVM_NR_SENDTRIGGERS )
+        printf("Sending trigger(%s)=%d to domid %d vcpuid %u\n",
+               hvm_sendtrigger_name[sendtrigger], sendtrigger,
+               domid, vcpuid);
+    else
+        printf("Sending trigger %d to domid %d vcpuid %u\n",
+               sendtrigger, domid, vcpuid);
+
+    ret = xc_domain_send_trigger(xch, domid, sendtrigger, vcpuid);
+    if ( ret )
+        err(1, "failed to send sendtrigger %d for domid %d", sendtrigger,
+            domid);
+
+    xc_interface_close(xch);
+
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.8.4


* Re: [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves
  2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
@ 2014-09-22 11:49   ` Andrew Cooper
  2014-09-22 16:53     ` Don Slutz
  2014-09-24 14:33   ` George Dunlap
  1 sibling, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2014-09-22 11:49 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan, George Dunlap,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 20/09/14 19:07, Don Slutz wrote:
> This is done by adding HVM_PARAM_VMWARE_HW. It is set to the VMware
> virtual hardware version.
>
> Currently 0, 3-4, 6-11 are good values.  However the
> code only checks for == 0 or != 0.
>
> If non-zero then
>   Return VMware's cpuid leaves.
>
> The support of hypervisor cpuid leaves has not been agreed to.
>
> MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
>
> VMware currently must be at 0x40000000.
>
> KVM currently must be at 0x40000000 (from Seabios).
>
> Xen can be found at the first otherwise unused 0x100 aligned
> offset between 0x40000000 and 0x40010000.
>
> http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
>
> http://lwn.net/Articles/301888/
>   Attempted to get this cleaned up.
>
> So based on this, I picked the order:
>
> Xen at 0x40000000 or
> Viridian or VMware at 0x40000000 and Xen at 0x40000100
>
> If both Viridian and VMware selected, report an error.
>
> Since I need to change xen/arch/x86/hvm/Makefile; also add
> a newline at end of file.
>
> Signed-off-by: Don Slutz <dslutz@verizon.com>
> ---
> v5:
>       Given how is_viridian and is_vmware are defined I think '||' is more
>       appropriate.
>         Fixed.
>       The names of all three functions are bogus.
>         removed static support routines.
>       This hunk is unrelated, but is perhaps something better fixed.
>         Added to commit message.
>       include <xen/types.h> (IIRC) please.
>         Done.
>       At least 1 pair of brackets please, especially as the placement of
>       brackets affects the result of this particular calculation.
>         Switch to "1000000ull / APIC_BUS_CYCLE_NS"      
>
>  xen/arch/x86/hvm/Makefile        |  3 +-
>  xen/arch/x86/hvm/hvm.c           | 32 +++++++++++++++
>  xen/arch/x86/hvm/vmware/Makefile |  1 +
>  xen/arch/x86/hvm/vmware/cpuid.c  | 89 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/traps.c             |  8 +++-
>  xen/include/asm-x86/hvm/hvm.h    |  3 ++
>  xen/include/asm-x86/hvm/vmware.h | 33 +++++++++++++++
>  xen/include/public/hvm/params.h  |  5 ++-
>  8 files changed, 170 insertions(+), 4 deletions(-)
>  create mode 100644 xen/arch/x86/hvm/vmware/Makefile
>  create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
>  create mode 100644 xen/include/asm-x86/hvm/vmware.h
>
> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
> index eea5555..77598a6 100644
> --- a/xen/arch/x86/hvm/Makefile
> +++ b/xen/arch/x86/hvm/Makefile
> @@ -1,5 +1,6 @@
>  subdir-y += svm
>  subdir-y += vmx
> +subdir-y += vmware
>  
>  obj-y += asid.o
>  obj-y += emulate.o
> @@ -22,4 +23,4 @@ obj-y += vlapic.o
>  obj-y += vmsi.o
>  obj-y += vpic.o
>  obj-y += vpt.o
> -obj-y += vpmu.o
> \ No newline at end of file
> +obj-y += vpmu.o
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index bb45593..f3cf566 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -57,6 +57,7 @@
>  #include <asm/hvm/cacheattr.h>
>  #include <asm/hvm/trace.h>
>  #include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/vmware.h>
>  #include <asm/mtrr.h>
>  #include <asm/apic.h>
>  #include <public/sched.h>
> @@ -4228,6 +4229,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>      if ( cpuid_viridian_leaves(input, eax, ebx, ecx, edx) )
>          return;
>  
> +    if ( cpuid_vmware_leaves(input, eax, ebx, ecx, edx) )
> +        return;
> +
>      if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
>          return;
>  
> @@ -5555,6 +5559,11 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>                      rc = -EINVAL;
>                  break;
>              case HVM_PARAM_VIRIDIAN:
> +                if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] )
> +                {
> +                    rc = -EXDEV;
> +                    break;
> +                }
>                  if ( a.value > 1 )
>                      rc = -EINVAL;
>                  break;
> @@ -5692,6 +5701,29 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>                  break;
>              }
> +            case HVM_PARAM_VMWARE_HW:
> +                /*
> +                 * This should only ever be set non-zero one time by
> +                 * the tools and is read only by the guest.
> +                 */
> +                if ( d == current->domain )

curr_d instead of current->domain

> +                {
> +                    rc = -EPERM;
> +                    break;
> +                }
> +                if ( d->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN] )
> +                {
> +                    rc = -EXDEV;
> +                    break;
> +                }
> +                if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] &&

This check is redundant.  Repeatedly setting 0 is still permitted
because of the break after this if().
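
For reference, a minimal stand-alone model of the rule the quoted
condition encodes.  The function name and its two parameters are made up
for illustration and are not part of the patch; only the condition itself
is taken from the hunk above.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the quoted HVM_PARAM_VMWARE_HW check.
 * cur is the value already stored for the domain, new_val is the
 * value the caller is trying to set.
 * Returns 0 if the write is accepted, -EEXIST otherwise.
 */
int vmware_hw_set_check(uint64_t cur, uint64_t new_val)
{
    if ( cur && cur != new_val )
        return -EEXIST;

    return 0;
}
```

With this model a stored value of 0 accepts any new value, while a
non-zero stored value only accepts being set to itself again.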

> +                     d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] !=
> +                     a.value )
> +                {
> +                    rc = -EEXIST;
> +                    break;
> +                }
> +                break;
>              }
>  
>              if ( rc == 0 ) 
> diff --git a/xen/arch/x86/hvm/vmware/Makefile b/xen/arch/x86/hvm/vmware/Makefile
> new file mode 100644
> index 0000000..3fb2e0b
> --- /dev/null
> +++ b/xen/arch/x86/hvm/vmware/Makefile
> @@ -0,0 +1 @@
> +obj-y += cpuid.o
> diff --git a/xen/arch/x86/hvm/vmware/cpuid.c b/xen/arch/x86/hvm/vmware/cpuid.c
> new file mode 100644
> index 0000000..29f6213
> --- /dev/null
> +++ b/xen/arch/x86/hvm/vmware/cpuid.c
> @@ -0,0 +1,89 @@
> +/*
> + * arch/x86/hvm/vmware/cpuid.c
> + *
> + * Copyright (C) 2012 Verizon Corporation
> + *
> + * This file is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License Version 2 (GPLv2)
> + * as published by the Free Software Foundation.
> + *
> + * This file is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details. <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/sched.h>
> +
> +#include <asm/hvm/hvm.h>
> +#include <asm/hvm/vmware.h>
> +
> +/*
> + * VMware hardware version 7 defines some of these cpuid levels,
> + * below is a brief description about those.
> + *
> + *     Leaf 0x40000000, Hypervisor CPUID information
> + * # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
> + * # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
> + *
> + *     Leaf 0x40000010, Timing information.
> + * # EAX: (Virtual) TSC frequency in kHz.
> + * # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
> + * # ECX, EDX: RESERVED
> + */
> +
> +int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
> +                        uint32_t *ecx, uint32_t *edx)
> +{
> +    struct domain *d = current->domain;
> +
> +    if ( !is_vmware_domain(d) )
> +        return 0;
> +
> +    switch ( idx - 0x40000000 )
> +    {
> +    case 0x0:
> +        if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] >= 7 )

You are going to need some reference about VMware versions.  The comment
for this function is fine for describing what "version 7" constitutes,
but how did you come about it?  What is supposed to be visible in the
leaves if a version less than 7 is selected by the toolstack?  I doubt
it's all zeroes, including the root leaf.

~Andrew
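
When checking what a guest sees in the root leaf, it may help to decode
the three signature registers back into text.  A minimal sketch follows;
the helper name is an assumption for illustration, while the register
values in the usage note are the ones from the quoted patch.

```c
#include <stdint.h>

/*
 * Unpack the 12-byte hypervisor vendor signature that CPUID leaf
 * 0x40000000 returns in EBX, ECX and EDX (least significant byte
 * first).  sig must have room for 13 bytes; the result is NUL
 * terminated.  hv_signature() is a hypothetical helper, not Xen code.
 */
void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx, char sig[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };
    unsigned int i;

    for ( i = 0; i < 12; i++ )
        sig[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
    sig[12] = '\0';
}
```

Calling hv_signature(0x61774d56, 0x4d566572, 0x65726177, sig) with the
values from the patch yields "VMwareVMware".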

> +        {
> +            *eax = 0x40000010;  /* Largest leaf */
> +            *ebx = 0x61774d56;  /* "VMwa" */
> +            *ecx = 0x4d566572;  /* "reVM" */
> +            *edx = 0x65726177;  /* "ware" */
> +            break;
> +        }
> +        /* fallthrough */
> +    case 0x10:
> +        if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] >= 7 )
> +        {
> +            /* (Virtual) TSC frequency in kHz. */
> +            *eax =  d->arch.tsc_khz;
> +            /* (Virtual) Bus (local apic timer) frequency in kHz. */
> +            *ebx = 1000000ull / APIC_BUS_CYCLE_NS;
> +            *ecx = 0;          /* Reserved */
> +            *edx = 0;          /* Reserved */
> +            break;
> +        }
> +        /* fallthrough */
> +    case 0x1 ... 0xf:
> +        *eax = 0;          /* Reserved */
> +        *ebx = 0;          /* Reserved */
> +        *ecx = 0;          /* Reserved */
> +        *edx = 0;          /* Reserved */
> +        break;
> +
> +    default:
> +        return 0;
> +    }
> +
> +    return 1;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 10fc2ca..90542f9 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -685,8 +685,12 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
>                 uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
>  {
>      struct domain *d = current->domain;
> -    /* Optionally shift out of the way of Viridian architectural leaves. */
> -    uint32_t base = is_viridian_domain(d) ? 0x40000100 : 0x40000000;
> +    /*
> +     * Optionally shift out of the way of Viridian or VMware
> +     * architectural leaves.
> +     */
> +    uint32_t base = is_viridian_domain(d) || is_vmware_domain(d) ?
> +        0x40000100 : 0x40000000;
>      uint32_t limit, dummy;
>  
>      idx -= base;
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 121d053..3916fec 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -349,6 +349,9 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
>  #define is_viridian_domain(_d)                                             \
>   (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN]))
>  
> +#define is_vmware_domain(_d)                                             \
> + (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW]))
> +
>  void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx,
>                                 uint32_t *eax, uint32_t *ebx,
>                                 uint32_t *ecx, uint32_t *edx);
> diff --git a/xen/include/asm-x86/hvm/vmware.h b/xen/include/asm-x86/hvm/vmware.h
> new file mode 100644
> index 0000000..8390173
> --- /dev/null
> +++ b/xen/include/asm-x86/hvm/vmware.h
> @@ -0,0 +1,33 @@
> +/*
> + * asm-x86/hvm/vmware.h
> + *
> + * Copyright (C) 2012 Verizon Corporation
> + *
> + * This file is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License Version 2 (GPLv2)
> + * as published by the Free Software Foundation.
> + *
> + * This file is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details. <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef ASM_X86_HVM_VMWARE_H__
> +#define ASM_X86_HVM_VMWARE_H__
> +
> +#include <xen/types.h>
> +
> +int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
> +                        uint32_t *ecx, uint32_t *edx);
> +
> +#endif /* ASM_X86_HVM_VMWARE_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
> index 614ff5f..dee6d68 100644
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -151,6 +151,9 @@
>  /* Location of the VM Generation ID in guest physical address space. */
>  #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
>  
> -#define HVM_NR_PARAMS          35
> +/* Params for VMware */
> +#define HVM_PARAM_VMWARE_HW                 35
> +
> +#define HVM_NR_PARAMS          36
>  
>  #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */


* Re: [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support
  2014-09-20 18:07 ` [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support Don Slutz
@ 2014-09-22 13:34   ` Ian Campbell
  2014-09-22 22:08     ` Don Slutz
  2014-09-24 14:44   ` George Dunlap
  1 sibling, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 13:34 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> This is used to set HVM_PARAM_VMWARE_HW. It is set to the VMware
> virtual hardware version.
> 
> Currently 0, 3-4, 6-11 are good values.  However the code only
> checks for == 0 or != 0.
> 
> If non-zero then
>   default VGA to VMware's VGA.
> 
> Also now allows vga=vmware
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>
> ---
> v5:
>       Anything looking for Xen according to the Xen cpuid instructions...
>         Adjusted doc to new wording.
> 
>  docs/man/xl.cfg.pod.5               | 21 +++++++++++++++++++--
>  docs/misc/hypervisor-cpuid.markdown | 28 ++++++++++++++++++++++++++++
>  tools/libxc/xc_domain_restore.c     | 14 ++++++++++++++
>  tools/libxc/xc_domain_save.c        | 11 +++++++++++
>  tools/libxc/xg_save_restore.h       |  2 ++
>  tools/libxl/libxl.h                 | 10 ++++++++++
>  tools/libxl/libxl_create.c          |  4 +++-
>  tools/libxl/libxl_dm.c              | 10 +++++++++-
>  tools/libxl/libxl_dom.c             |  2 ++
>  tools/libxl/libxl_types.idl         |  2 ++
>  tools/libxl/xl_cmdimpl.c            | 11 ++++++++++-
>  11 files changed, 110 insertions(+), 5 deletions(-)
>  create mode 100644 docs/misc/hypervisor-cpuid.markdown
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 517ae2f..367b401 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1147,6 +1147,23 @@ some other Operating Systems and in some circumstance can prevent
>  Xen's own paravirtualisation interfaces for HVM guests from being
>  used.
>  
> +=item B<vmware_hw=NUMBER>
> +
> +Turns on or off the exposure of VMware cpuid.  The number is the
> +VMware's hardware version number, where 0 is off.  If not zero it
> +changes the default VGA to VMware's VGA.

"is the VMware's" => "is VMware's".

> @@ -1185,8 +1202,8 @@ This option is deprecated, use vga="stdvga" instead.
>  
>  =item B<vga="STRING">
>  
> -Selects the emulated video card (none|stdvga|cirrus).
> -The default is cirrus.
> +Selects the emulated video card (none|stdvga|cirrus|vmware).
> +The default is cirrus (or vmware if B<vmware_hw> is not zero).

"The default is cirrus unless B<vmware_hw> is non-zero in which case it
is vmware." ?

>  
>  =item B<vnc=BOOLEAN>
>  
> diff --git a/docs/misc/hypervisor-cpuid.markdown b/docs/misc/hypervisor-cpuid.markdown
> new file mode 100644
> index 0000000..901a4e1
> --- /dev/null
> +++ b/docs/misc/hypervisor-cpuid.markdown
> @@ -0,0 +1,28 @@
> +Hypervisor Cpuid
> +================
> +
> +The support of hypervisor cpuid leaves has not been agreed to.

by....

"the general hypervisor community" perhaps?

Perhaps a better way of putting this would be "There is no agreed
standard for the use of hypervisor cpuid leaves" or some such.

> +Other then the range 0x40000000 to 0x400000ff can be used by
> +hypervisors.

s/then/than/ I think. 

> +
> +MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
> +
> +VMware currently must be at 0x40000000.
> +
> +KVM currently must be at 0x40000000 (from Seabios).
> +
> +Xen can be found at the first otherwise unused 0x100 aligned
> +offset between 0x40000000 and 0x40010000.

I think you should add " leaves" after each hypervisor name.
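
As an illustration of the "first otherwise unused 0x100 aligned offset"
wording, here is a rough sketch of how a guest could locate the Xen
leaves.  The cpuid_fn callback and fake_cpuid() are assumptions made so
the example runs without real hardware; a real guest would execute the
CPUID instruction and would typically also sanity check the maximum leaf
value in EAX.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callback standing in for the CPUID instruction. */
typedef void (*cpuid_fn)(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                         uint32_t *ecx, uint32_t *edx);

/*
 * Scan the 0x100-aligned bases from 0x40000000 up to (not including)
 * 0x40010000 for the "XenVMMXenVMM" signature.  Returns the base of
 * the Xen leaves, or 0 if none is found.
 */
uint32_t find_xen_base(cpuid_fn cpuid)
{
    uint32_t base;

    for ( base = 0x40000000; base < 0x40010000; base += 0x100 )
    {
        uint32_t eax, regs[3];
        char sig[13];
        unsigned int i;

        cpuid(base, &eax, &regs[0], &regs[1], &regs[2]);
        for ( i = 0; i < 12; i++ )
            sig[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
        sig[12] = '\0';

        if ( !strcmp(sig, "XenVMMXenVMM") )
            return base;
    }

    return 0;
}

/*
 * Illustrative model of a domain with vmware_hw set: VMware leaves at
 * 0x40000000 and Xen leaves shifted to 0x40000100.  The EAX value for
 * the Xen leaf is a placeholder.
 */
void fake_cpuid(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                uint32_t *ecx, uint32_t *edx)
{
    *eax = *ebx = *ecx = *edx = 0;

    if ( leaf == 0x40000000 )
    {
        *eax = 0x40000010;
        *ebx = 0x61774d56;  /* "VMwa" */
        *ecx = 0x4d566572;  /* "reVM" */
        *edx = 0x65726177;  /* "ware" */
    }
    else if ( leaf == 0x40000100 )
    {
        *eax = 0x40000102;
        *ebx = 0x566e6558;  /* "XenV" */
        *ecx = 0x65584d4d;  /* "MMXe" */
        *edx = 0x4d4d566e;  /* "nVMM" */
    }
}
```

With the fake above, find_xen_base(fake_cpuid) returns 0x40000100, which
matches the layout described for a Viridian or VMware domain.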

> @@ -378,6 +379,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                         ("timeoffset",       string),
>                                         ("hpet",             libxl_defbool),
>                                         ("vpt_align",        libxl_defbool),
> +                                       ("vmware_hw",        UInt(64, init_val = 0)),

There is no need for an explicitly 0 init_val, it's the default default.

>                                         ("timer_mode",       libxl_timer_mode),
>                                         ("nested_hvm",       libxl_defbool),
>                                         ("smbios_firmware",  string),
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 698b3bc..2119bd6 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1038,6 +1038,8 @@ static void parse_config_data(const char *config_source,
>          xlu_cfg_get_defbool(config, "hpet", &b_info->u.hvm.hpet, 0);
>          xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
>  
> +        if (!xlu_cfg_get_long(config, "vmware_hw",  &l, 1))
> +            b_info->u.hvm.vmware_hw = l;
>          if (!xlu_cfg_get_long(config, "timer_mode", &l, 1)) {
>              const char *s = libxl_timer_mode_to_string(l);
>              fprintf(stderr, "WARNING: specifying \"timer_mode\" as an integer is deprecated. "
> @@ -1676,13 +1678,20 @@ skip_vfb:
>                  b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
>              } else if (!strcmp(buf, "none")) {
>                  b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
> +            } else if (!strcmp(buf, "vmware")) {
> +                b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
>              } else {
>                  fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
>                  exit(1);
>              }
>          } else if (!xlu_cfg_get_long(config, "stdvga", &l, 0))
>              b_info->u.hvm.vga.kind = l ? LIBXL_VGA_INTERFACE_TYPE_STD :
> -                                         LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> +                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
> +                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;

I don't think this is a good idea. stdvga = 1 in the config file should
still mean stdvga, not conditionally vmware. Likewise stdvga = 0 should
always be cirrus.

Someone who wants to force vmware should use vga=vmware and not specify
stdvga at all.

(NB: stdvga is a deprecated synonym; the man page already advises using
vga=)

> +        else
> +            b_info->u.hvm.vga.kind =
> +                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
> +                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;

This else clause shouldn't be here, update
libxl__domain_build_info_setdefault instead where it currently says:
        if (!b_info->u.hvm.vga.kind)
            b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;

note that this code should only set vga.kind if it is currently zero
(which indicates to libxl "pick a default")
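
A stand-alone sketch of that defaulting rule, for illustration only.  The
enum, struct and function names below are simplified stand-ins invented
for this example, not the real libxl types; only the "set it when it is
zero, and pick vmware when vmware_hw is non-zero" logic comes from the
discussion above.

```c
/* Hypothetical stand-ins for the libxl types touched by the patch. */
enum vga_kind {
    VGA_KIND_UNSET = 0,   /* zero means "let libxl pick a default" */
    VGA_KIND_CIRRUS,
    VGA_KIND_STD,
    VGA_KIND_NONE,
    VGA_KIND_VMWARE
};

struct hvm_build_info {
    enum vga_kind vga_kind;
    unsigned long long vmware_hw;
};

/* Only fill in vga_kind when the caller left it unset. */
void vga_setdefault(struct hvm_build_info *b)
{
    if (!b->vga_kind)
        b->vga_kind = b->vmware_hw ? VGA_KIND_VMWARE : VGA_KIND_CIRRUS;
}
```

An explicit vga= or stdvga= setting is left alone because vga_kind is
already non-zero by the time the default is applied.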

>  
>          xlu_cfg_replace_string (config, "keymap", &b_info->u.hvm.keymap, 0);
>          xlu_cfg_get_defbool (config, "spice", &b_info->u.hvm.spice.enable, 0);


* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-20 18:07 ` [PATCH for-4.5 v6 05/16] tools: " Don Slutz
@ 2014-09-22 13:41   ` Ian Campbell
  2014-09-22 16:34     ` Andrew Cooper
  2014-09-22 16:42     ` Don Slutz
  0 siblings, 2 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 13:41 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> This new libxl_domain_create_info field is used to set
> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.

Does this really need to be a CDF, rather than a domctl/hvm param?

The latter would allow moving to buildinfo.u.hvm, which would be nicer
from the libxl PoV, I think.

> In xen it is is_vmware_port_enabled.
> 
> If is_vmware_port_enabled then
>   enable a limited support of VMware's hyper-call.
> 
> VMware's hyper-call is also known as VMware Backdoor I/O Port.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>

Is it useful to be able to configure this independently of the vmware_hw
version?

If yes then I still think you would want to set the default based on
vmware-hw, wouldn't you?

Ian.


* Re: [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-20 18:07 ` [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc Don Slutz
@ 2014-09-22 13:47   ` Ian Campbell
  2014-09-22 21:18     ` Don Slutz
  2014-09-25 16:28     ` George Dunlap
  0 siblings, 2 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 13:47 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> This interface is an extension of __HYPERVISOR_HVM_op.  It was
> picked because xc_get_hvm_param() also uses it and VMware guest
> info is a lot like a hvm param.

Sorry if this has been discussed before, but did you consider doing all
this in qemu rather than Xen?

Unless these things are accessed frequently, qemu would be the default
best place for this sort of thing, especially since, as you've observed,
there is some pretty complex memory management and string handling which
it would generally be better to avoid in the hypervisor.

Your description of HVM_PARAM_VMPORT_RESET_TIME suggests they aren't
typically accessed very frequently.

Ian.


* Re: [PATCH for-4.5 v6 09/16] tools: Add limited support of VMware's hyper-call rpc
  2014-09-20 18:07 ` [PATCH for-4.5 v6 09/16] tools: " Don Slutz
@ 2014-09-22 13:52   ` Ian Campbell
  2014-09-22 21:32     ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 13:52 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>   This guestinfo support is provided via libxc.  libxl support has
> not be written.
> 
> Note: VMware RPC support is only available on HVM domU.
> 
> This interface is an extension of __HYPERVISOR_HVM_op.  It was
> picked because xc_get_hvm_param() also uses it and VMware guest
> info is a lot like a hvm param.
> 
> The HVMOP_get_vmport_guest_info is used by two libxc functions,
> xc_get_vmport_guest_info and xc_fetch_all_vmport_guest_info.
> xc_fetch_all_vmport_guest_info is designed to be used to fetch all
> currently set guestinfo values.
> 
> Signed-off-by: Don Slutz <dslutz@verizon.com>

These look to be correct implementations of accessors for the hypercalls
as defined, although apart from my query about whether this belongs in
Xen at all I do have concerns about an hvm param argument struct of >4K.
AIUI more normal would be to have two GUEST_HANDLE* fields and to copy
in/out explicitly.

Ultimately that's up to the hypervisor maintainers though.

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
                   ` (15 preceding siblings ...)
  2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 16/16] Add xen-hvm-send-trigger Don Slutz
@ 2014-09-22 13:56 ` Ian Campbell
  2014-09-22 15:19   ` George Dunlap
  16 siblings, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 13:56 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:

> I picked this subset to start with because it only has changes in
> Xen.
> 
> Some of this code is already in QEMU

As I suggested in my reply to one of the rpc port patches, it's not clear
that any of this needs to be in Xen rather than qemu in the first place.

I came to think this even more once I saw the save/restore support...

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 13:56 ` [PATCH for-4.5 v6 00/16] Xen VMware tools support Ian Campbell
@ 2014-09-22 15:19   ` George Dunlap
  2014-09-22 15:34     ` Ian Campbell
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-22 15:19 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/22/2014 02:56 PM, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>
>> I picked this subset to start with because it only has changes in
>> Xen.
>>
>> Some of this code is already in QEMU
> As I suggest in my reply to one for the rpc port patches it's not clear
> that any of this needs to be in Xen rather than qemu in the first place.
>
> I came to think this even more once I saw the save/restore support...

I don't think qemu can get notified on either cpuid or #GP faults, can it?

A big chunk of the functionality here is to allow a userspace process to 
transparently make the "hypercalls" without the OS needing to explicitly 
give it access to the IO space, by trapping the resulting #GP faults and 
checking to see if they are IO instructions.  If that's functionality
we think is important, then it will have to be done in Xen, I think.

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:19   ` George Dunlap
@ 2014-09-22 15:34     ` Ian Campbell
  2014-09-22 15:38       ` George Dunlap
  2014-09-22 15:52       ` Andrew Cooper
  0 siblings, 2 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 15:34 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
> On 09/22/2014 02:56 PM, Ian Campbell wrote:
> > On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> >
> >> I picked this subset to start with because it only has changes in
> >> Xen.
> >>
> >> Some of this code is already in QEMU
> > As I suggest in my reply to one for the rpc port patches it's not clear
> > that any of this needs to be in Xen rather than qemu in the first place.
> >
> > I came to think this even more once I saw the save/restore support...
> 
> I don't think qemu can get notified on either cpuid or #GP faults, can it?

I understand the need for the cpuid bits, I should have made that clear.

> A big chunk of the functionality here is to allow a userspace process to 
> transparently make the "hypercalls" without the OS needing to explicitly 
> give it access to the IO space, by trapping the resulting #GP faults and 
> checking to see if they are IO instructions .  If that's functionality 
> we think is important, then it will have to be done in Xen, I think.

Ah, the need to #GP was what I had missed, I was thinking it was just a
regular I/O port access.

Having trapped the #GP and decoded it into an IO access, is there
anything stopping us forwarding that to qemu for consideration?

(I confess I'm not sure why this is a #GP thing and not a VT-x/SVM I/O
access trap, just like if userspace mmaps /dev/ioports, but I'll trust
that's just my lack of x86 hw virt knowledge)

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:34     ` Ian Campbell
@ 2014-09-22 15:38       ` George Dunlap
  2014-09-22 15:50         ` Ian Campbell
  2014-09-22 16:18         ` Jan Beulich
  2014-09-22 15:52       ` Andrew Cooper
  1 sibling, 2 replies; 93+ messages in thread
From: George Dunlap @ 2014-09-22 15:38 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/22/2014 04:34 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>
>>>> I picked this subset to start with because it only has changes in
>>>> Xen.
>>>>
>>>> Some of this code is already in QEMU
>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>
>>> I came to think this even more once I saw the save/restore support...
>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
> I understand the need for the cpuid bits, I should have made that clear.
>
>> A big chunk of the functionality here is to allow a userspace process to
>> transparently make the "hypercalls" without the OS needing to explicitly
>> give it access to the IO space, by trapping the resulting #GP faults and
>> checking to see if they are IO instructions .  If that's functionality
>> we think is important, then it will have to be done in Xen, I think.
> Ah, the need to #GP was what I had missed, I was thinking it was just a
> regular I/O port access.
>
> Having trapped the #GP and decoded it into an IO access, is there
> anything stopping us forwarding that to qemu for consideration?
>
> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
> that's just my lack of x86 hw virt knowledge)

I'm not 100% sure of this, but my understanding was that it *would* be a 
normal IO trap *if* the guest OS gave access to that IO range to the 
guest (via IOPL, maybe?).  But if the userspace program is not 
explicitly given access by the OS to those ports, it will generate a #GP 
instead.  The idea is to allow the "hypercall" to happen *without 
cooperation* from the guest OS.
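
The permission check described above can be sketched in isolation. This is a simplified model of the x86 protected-mode rule (compare CPL against IOPL, then consult the I/O permission bitmap), not code from this series. The function name and the flat bitmap argument are assumptions for illustration; real hardware reads the bitmap from the TSS and treats bits beyond the TSS limit as denied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when an IN/OUT of `width` bytes at
 * `port` would raise #GP instead of reaching the I/O port.
 */
static bool io_access_raises_gp(unsigned int cpl, unsigned int iopl,
                                const uint8_t *io_bitmap, size_t bitmap_bytes,
                                uint16_t port, unsigned int width)
{
    /* Code at or above the I/O privilege level skips the bitmap check. */
    if (cpl <= iopl)
        return false;

    /* Otherwise every byte of the access needs a clear bit in the bitmap. */
    for (unsigned int i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        size_t byte = p / 8;

        if (byte >= bitmap_bytes)
            return true;                    /* outside the bitmap: denied */
        if (io_bitmap[byte] & (1u << (p % 8)))
            return true;                    /* bit set: denied */
    }
    return false;
}
```

With CPL 3, IOPL 0 and a bitmap that denies the port, the access faults, which is the case the #GP intercept in this series is meant to catch.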

Again, that's my understanding, someone please correct me if I'm wrong...

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:38       ` George Dunlap
@ 2014-09-22 15:50         ` Ian Campbell
  2014-09-22 15:55           ` George Dunlap
  2014-09-22 16:18         ` Jan Beulich
  1 sibling, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-22 15:50 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 16:38 +0100, George Dunlap wrote:
> On 09/22/2014 04:34 PM, Ian Campbell wrote:
> > On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
> >> On 09/22/2014 02:56 PM, Ian Campbell wrote:
> >>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> >>>
> >>>> I picked this subset to start with because it only has changes in
> >>>> Xen.
> >>>>
> >>>> Some of this code is already in QEMU
> >>> As I suggest in my reply to one for the rpc port patches it's not clear
> >>> that any of this needs to be in Xen rather than qemu in the first place.
> >>>
> >>> I came to think this even more once I saw the save/restore support...
> >> I don't think qemu can get notified on either cpuid or #GP faults, can it?
> > I understand the need for the cpuid bits, I should have made that clear.
> >
> >> A big chunk of the functionality here is to allow a userspace process to
> >> transparently make the "hypercalls" without the OS needing to explicitly
> >> give it access to the IO space, by trapping the resulting #GP faults and
> >> checking to see if they are IO instructions .  If that's functionality
> >> we think is important, then it will have to be done in Xen, I think.
> > Ah, the need to #GP was what I had missed, I was thinking it was just a
> > regular I/O port access.
> >
> > Having trapped the #GP and decoded it into an IO access, is there
> > anything stopping us forwarding that to qemu for consideration?
> >
> > (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
> > access trap, just like if userspace mmaps /dev/ioports, but I'll trust
> > that's just my lack of x86 hw virt knowledge)
> 
> I'm not 100% sure of this, but my understanding was that it *would* be a 
> normal IO trap *if* the guest OS gave access to that IO range to the 
> guest (via IOPL, maybe?).  But if the userspace program is not 
> explicitly given access by the OS to those ports, it will generate a #GP 
> instead.  The idea is to allow the "hypercall" to happen *without 
> cooperation* from the guest OS.
> 
> Again, that's my understanding, someone please correct me if I'm wrong...

It sounds plausible, for sure.

Even so, why can't the result of that #GP be a calldown into qemu for
further processing?

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:34     ` Ian Campbell
  2014-09-22 15:38       ` George Dunlap
@ 2014-09-22 15:52       ` Andrew Cooper
  2014-09-22 18:39         ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2014-09-22 15:52 UTC (permalink / raw)
  To: Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 22/09/14 16:34, Ian Campbell wrote:
> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>
>>>> I picked this subset to start with because it only has changes in
>>>> Xen.
>>>>
>>>> Some of this code is already in QEMU
>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>
>>> I came to think this even more once I saw the save/restore support...
>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
> I understand the need for the cpuid bits, I should have made that clear.
>
>> A big chunk of the functionality here is to allow a userspace process to 
>> transparently make the "hypercalls" without the OS needing to explicitly 
>> give it access to the IO space, by trapping the resulting #GP faults and 
>> checking to see if they are IO instructions .  If that's functionality 
>> we think is important, then it will have to be done in Xen, I think.
> Ah, the need to #GP was what I had missed, I was thinking it was just a
> regular I/O port access.
>
> Having trapped the #GP and decoded it into an IO access, is there
> anything stopping us forwarding that to qemu for consideration?
>
> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
> that's just my lack of x86 hw virt knowledge)

I am fairly sure (reading the VMX/SVM manuals) that Xen can force a trap
of a specific IO port as an IO access trap even if it would otherwise
cause a #GP fault due to lack of IO permissions (which I guess is
exactly for purposes like this).

I am also entirely certain that this is a far better position to be in
than fully enabling #GP intercepts, assuming I have interpreted the
manuals correctly.

~Andrew

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:50         ` Ian Campbell
@ 2014-09-22 15:55           ` George Dunlap
  2014-09-22 17:19             ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-22 15:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/22/2014 04:50 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 16:38 +0100, George Dunlap wrote:
>> On 09/22/2014 04:34 PM, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>>
>>>>>> I picked this subset to start with because it only has changes in
>>>>>> Xen.
>>>>>>
>>>>>> Some of this code is already in QEMU
>>>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>>>
>>>>> I came to think this even more once I saw the save/restore support...
>>>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
>>> I understand the need for the cpuid bits, I should have made that clear.
>>>
>>>> A big chunk of the functionality here is to allow a userspace process to
>>>> transparently make the "hypercalls" without the OS needing to explicitly
>>>> give it access to the IO space, by trapping the resulting #GP faults and
>>>> checking to see if they are IO instructions .  If that's functionality
>>>> we think is important, then it will have to be done in Xen, I think.
>>> Ah, the need to #GP was what I had missed, I was thinking it was just a
>>> regular I/O port access.
>>>
>>> Having trapped the #GP and decoded it into an IO access, is there
>>> anything stopping us forwarding that to qemu for consideration?
>>>
>>> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
>>> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
>>> that's just my lack of x86 hw virt knowledge)
>> I'm not 100% sure of this, but my understanding was that it *would* be a
>> normal IO trap *if* the guest OS gave access to that IO range to the
>> guest (via IOPL, maybe?).  But if the userspace program is not
>> explicitly given access by the OS to those ports, it will generate a #GP
>> instead.  The idea is to allow the "hypercall" to happen *without
>> cooperation* from the guest OS.
>>
>> Again, that's my understanding, someone please correct me if I'm wrong...
> It sounds plausible, for sure.
>
> Even so, why can't the result of that #GP be a calldown into qemu for
> further processing?

I was only responding to the part of your comment in parentheses. :-)

I suppose in large part it would depend on what the hypercalls were 
actually doing; I'd have to go back and look at them to say if they need 
to be in Xen or whether they could be passed on to qemu.

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:38       ` George Dunlap
  2014-09-22 15:50         ` Ian Campbell
@ 2014-09-22 16:18         ` Jan Beulich
  2014-09-22 18:32           ` Don Slutz
  2014-09-25 10:37           ` Tim Deegan
  1 sibling, 2 replies; 93+ messages in thread
From: Jan Beulich @ 2014-09-22 16:18 UTC (permalink / raw)
  To: Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Andrew Cooper, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky, Ian Jackson

>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
> On 09/22/2014 04:34 PM, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>
>>>>> I picked this subset to start with because it only has changes in
>>>>> Xen.
>>>>>
>>>>> Some of this code is already in QEMU
>>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>>
>>>> I came to think this even more once I saw the save/restore support...
>>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
>> I understand the need for the cpuid bits, I should have made that clear.
>>
>>> A big chunk of the functionality here is to allow a userspace process to
>>> transparently make the "hypercalls" without the OS needing to explicitly
>>> give it access to the IO space, by trapping the resulting #GP faults and
>>> checking to see if they are IO instructions .  If that's functionality
>>> we think is important, then it will have to be done in Xen, I think.
>> Ah, the need to #GP was what I had missed, I was thinking it was just a
>> regular I/O port access.
>>
>> Having trapped the #GP and decoded it into an IO access, is there
>> anything stopping us forwarding that to qemu for consideration?
>>
>> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
>> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
>> that's just my lack of x86 hw virt knowledge)
> 
> I'm not 100% sure of this, but my understanding was that it *would* be a 
> normal IO trap *if* the guest OS gave access to that IO range to the 
> guest (via IOPL, maybe?).  But if the userspace program is not 
> explicitly given access by the OS to those ports, it will generate a #GP 
> instead.  The idea is to allow the "hypercall" to happen *without 
> cooperation* from the guest OS.
> 
> Again, that's my understanding, someone please correct me if I'm wrong...

That's indeed what was said so far. I wonder though whether opening
this up without guest OS consent isn't going to introduce a security
issue inside the guest (depending on the exact functionality of these
hypercalls).

Jan

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-22 13:41   ` Ian Campbell
@ 2014-09-22 16:34     ` Andrew Cooper
  2014-09-22 21:22       ` Don Slutz
  2014-09-22 16:42     ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: Andrew Cooper @ 2014-09-22 16:34 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 22/09/14 14:41, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>> This new libxl_domain_create_info field is used to set
>> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.
> Does this really need to be a CDF, rather than a domctl/hvm param?

I have made the argument that many things which are currently HVM Params
should be CDF, as they absolutely should be set and immutable for the
entire lifetime of the domain.

From recollection, we have had several XSAs in the past which are
directly attributable to the toolstack or guest being able to play with
an (insufficiently locked down) HVM param after boot.

Using a CDF avoids potential issues along these lines.

>
> The latter would allow moving to buildinfo.u.hvm, which would be nicer
> from the libxl PoV, I think.
>

Whatever the decision regarding CDF/hvmparam/other is, getting it right
in the hypervisor is a much higher priority than being nice in libxl.

It is unfortunate that libxl exposes the internal implementation details
of createinfo vs buildinfo in its API.  With hindsight, it was a poor
design decision.

~Andrew

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-22 13:41   ` Ian Campbell
  2014-09-22 16:34     ` Andrew Cooper
@ 2014-09-22 16:42     ` Don Slutz
  2014-09-23 12:20       ` Ian Campbell
  1 sibling, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-22 16:42 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/22/14 09:41, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>> This new libxl_domain_create_info field is used to set
>> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.
> Does this really need to be a CDF, rather than a domctl/hvm param?

It makes the setup of v->arch.hvm_vmx.exception_bitmap happen at the right
time.  A domctl or hvm param is only set later in the start of a domain.

> The latter would allow moving to buildinfo.u.hvm, which would be nicer
> from the libxl PoV, I think.

I could not find "buildinfo.u.hvm":


dcs-xen-54:~/xen>git grep buildinfo.u.hvm
dcs-xen-54:~/xen>


So unable to comment.


>> In xen it is is_vmware_port_enabled.
>>
>> If is_vmware_port_enabled then
>>    enable a limited support of VMware's hyper-call.
>>
>> VMware's hyper-call is also known as VMware Backdoor I/O Port.
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> Is it useful to be able to configure this independently of the vmware_hw
> version?

Yes.

> If yes then I still think you would want to set the default based on
> vmware-hw, wouldn't you?

I guess so, since this is a boolean.  Currently I do not know of a way to
say "set vmware_hw to 7 if vmware_port is true and vmware_hw is not
specified", which would be the inverse.  I lean toward not basing the
default of vmware_port on vmware_hw.

    -Don Slutz


> Ian.
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves
  2014-09-22 11:49   ` Andrew Cooper
@ 2014-09-22 16:53     ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-22 16:53 UTC (permalink / raw)
  To: Andrew Cooper, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan, George Dunlap,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit


On 09/22/14 07:49, Andrew Cooper wrote:
> On 20/09/14 19:07, Don Slutz wrote:
>> This is done by adding HVM_PARAM_VMWARE_HW. It is set to the VMware
>> virtual hardware version.
>>
>> Currently 0, 3-4, 6-11 are good values.  However the
>> code only checks for == 0 or != 0.
>>

Sigh, need to fix this statement, 7 is now checked for.

...
>> @@ -5692,6 +5701,29 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>>   
>>                   break;
>>               }
>> +            case HVM_PARAM_VMWARE_HW:
>> +                /*
>> +                 * This should only ever be set non-zero one time by
>> +                 * the tools and is read only by the guest.
>> +                 */
>> +                if ( d == current->domain )
> curr_d instead of current->domain

Will do.

>> +                {
>> +                    rc = -EPERM;
>> +                    break;
>> +                }
>> +                if ( d->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN] )
>> +                {
>> +                    rc = -EXDEV;
>> +                    break;
>> +                }
>> +                if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] &&
> This check is redundant.  Repeatedly setting 0 is still permitted
> because of the break after this if().

Not at all sure this is right.  If it is set to 7, setting it to 0 is not
allowed, which is what this if is checking for.  This also allows setting
it to 7 multiple times without error.
But setting it to 7 and then 8 is not allowed.
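
The set-once rule described here can be modeled outside Xen. The sketch below mirrors the three checks in the quoted hunk (caller is the target domain, Viridian already enabled, existing non-zero value differs) as a pure function. The function name and parameters are hypothetical and only stand in for the state that do_hvm_op() consults.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the HVM_PARAM_VMWARE_HW set checks.
 * Returns 0 when the new value may be stored, else a negative errno.
 */
static int vmware_hw_set_check(bool caller_is_target, uint64_t viridian,
                               uint64_t cur_hw, uint64_t new_hw)
{
    if (caller_is_target)
        return -EPERM;      /* the guest may only read its own value */
    if (viridian)
        return -EXDEV;      /* mutually exclusive with Viridian */
    if (cur_hw && cur_hw != new_hw)
        return -EEXIST;     /* once non-zero, only the same value again */
    return 0;
}
```

Under this model 0 then 7 succeeds, 7 then 7 succeeds, and both 7 then 0 and 7 then 8 fail with -EEXIST, which matches the behaviour described in the reply.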

>> +                     d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] !=
>> +                     a.value )
>> +                {
>> +                    rc = -EEXIST;
>> +                    break;
>> +                }
>> +                break;
>>               }
>>   


>> +/*
>> + * VMware hardware version 7 defines some of these cpuid levels,
>> + * below is a brief description about those.
>> + *
>> + *     Leaf 0x40000000, Hypervisor CPUID information
>> + * # EAX: The maximum input value for hypervisor CPUID info (0x40000010).
>> + * # EBX, ECX, EDX: Hypervisor vendor ID signature. E.g. "VMwareVMware"
>> + *
>> + *     Leaf 0x40000010, Timing information.
>> + * # EAX: (Virtual) TSC frequency in kHz.
>> + * # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
>> + * # ECX, EDX: RESERVED
>> + */
>> +
>> +int cpuid_vmware_leaves(uint32_t idx, uint32_t *eax, uint32_t *ebx,
>> +                        uint32_t *ecx, uint32_t *edx)
>> +{
>> +    struct domain *d = current->domain;
>> +
>> +    if ( !is_vmware_domain(d) )
>> +        return 0;
>> +
>> +    switch ( idx - 0x40000000 )
>> +    {
>> +    case 0x0:
>> +        if ( d->arch.hvm_domain.params[HVM_PARAM_VMWARE_HW] >= 7 )
> You are going to need some reference about VMWare versions.  The comment
> for this function is fine for describing what "version 7" constitutes,
> but how did you come about it?  What is supposed to be visible in the
> leaves of a version less than 7 is selected by the toolstack, because I
> doubt its all zeroes including the root leaf?

I was unable to find any clear statements on the VMware web site.  This
statement (from 2008) can be found via Google.  The best statement is from:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458 


(which is listed in the commit message).  I have included part of that
page here:

...


    Testing the CPUID hypervisor present bit

Intel and AMD CPUs have reserved bit 31 of ECX of CPUID leaf 0x1 as the 
hypervisor present bit. This bit allows hypervisors to indicate their 
presence to the guest operating system. Hypervisors set this bit and 
physical CPUs (all existing and future CPUs) set this bit to zero. Guest 
operating systems can test bit 31 to detect if they are running inside a 
virtual machine.
Intel and AMD have also reserved CPUID leaves 0x40000000 - 0x400000FF 
for software use. Hypervisors can use these leaves to provide an 
interface to pass information from the hypervisor to the guest operating 
system running inside a virtual machine. The hypervisor bit indicates 
the presence of a hypervisor and that it is safe to test these 
additional software leaves. VMware defines the 0x40000000 leaf as the 
hypervisor CPUID information leaf. Code running on a VMware hypervisor 
can test the CPUID information leaf for the hypervisor signature. VMware 
stores the string "VMwareVMware" in EBX, ECX, EDX of CPUID leaf 0x40000000.

*Sample code*

int cpuid_check()
{
         unsigned int eax, ebx, ecx, edx;
         char hyper_vendor_id[13];

         cpuid(0x1, &eax, &ebx, &ecx, &edx);
         if  (bit 31 of ecx is set) {
                 cpuid(0x40000000, &eax, &ebx, &ecx, &edx);
                 memcpy(hyper_vendor_id + 0, &ebx, 4);
                 memcpy(hyper_vendor_id + 4, &ecx, 4);
                 memcpy(hyper_vendor_id + 8, &edx, 4);
                 hyper_vendor_id[12] = '\0';
                 if (!strcmp(hyper_vendor_id, "VMwareVMware"))
                         return 1;               // Success - running under VMware
         }
         return 0;
}


    Testing the virtual BIOS DMI information and the hypervisor port

Apart from the CPUID-based method for VMware virtual machine detection, 
VMware also provides a fallback mechanism for the following reasons:

  * This CPUID-based technique will not work for guest code running at
    CPL3 when VT/AMD-V is not available or not enabled.
  * The hypervisor present bit and hypervisor information leaf are only
    defined for products based on VMware hardware version 7.


      Virtual BIOS DMI information

The VMware virtual BIOS has many VMware-specific identifiers which 
programs can use to detect hypervisors. For the DMI string check, use 
the BIOS serial number and check for either string "VMware-" or "VMW" 
(for Mac OS X guests running on Fusion)

...
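
The sample code in the excerpt above writes the bit test as pseudocode ("bit 31 of ecx is set") and assumes a cpuid() helper. A compilable sketch of the same check follows; it is not part of the KB text. It assumes a GCC or Clang toolchain, whose <cpuid.h> provides the __cpuid() macro, and it splits the register tests into pure helpers (names are hypothetical) so they can be exercised without executing CPUID. On other compilers or architectures cpuid_check() simply reports 0.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>          /* GCC/Clang header providing __cpuid() */
#define HAVE_X86_CPUID 1
#endif

/* Bit 31 of ECX from CPUID leaf 0x1 is the hypervisor present bit. */
static bool hypervisor_bit_set(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & (UINT32_C(1) << 31)) != 0;
}

/* EBX, ECX, EDX of leaf 0x40000000 spell the 12-byte vendor signature. */
static bool is_vmware_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char id[13];

    memcpy(id + 0, &ebx, 4);
    memcpy(id + 4, &ecx, 4);
    memcpy(id + 8, &edx, 4);
    id[12] = '\0';
    return strcmp(id, "VMwareVMware") == 0;
}

/* Returns 1 when the VMware signature is visible, 0 otherwise. */
static int cpuid_check(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    __cpuid(0x1, eax, ebx, ecx, edx);
    if (hypervisor_bit_set(ecx)) {
        __cpuid(0x40000000, eax, ebx, ecx, edx);
        return is_vmware_signature(ebx, ecx, edx);
    }
#endif
    return 0;
}
```

The raw __cpuid() macro is used rather than __get_cpuid() because the latter validates the leaf against the maximum basic leaf and would reject 0x40000000.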




I tested it for 3, 4, 7, and 8 (the versions that I have easy access to).
For 3 & 4 the leaves are all zero (including the root leaf), and 7 & 8
follow the above.

I just noticed that bit 31 of ECX of CPUID leaf 0x1 should also be zero
(not yet tested) for vmware_hw < 7.  Not at all sure what happens if I do
this.


The best web pages I have found are:

http://pubs.vmware.com/vsphere-55/index.jsp?topic=%2Fcom.vmware.vsphere.vm_admin.doc%2FGUID-789C3913-1053-4850-A0F0-E29C3D32B6DA.html

Which is all about other features like "Max number of cores", etc.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007240

Which is about what VMware products can use what versions.

(I can add these to the commit message; they do give some clues about
vmware_hw, but none of them apply to the code in Xen.)


    -Don Slutz


> ~Andrew
>
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:55           ` George Dunlap
@ 2014-09-22 17:19             ` Don Slutz
  2014-09-22 22:00               ` Tian, Kevin
  2014-09-23 12:30               ` Ian Campbell
  0 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-22 17:19 UTC (permalink / raw)
  To: George Dunlap, Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/22/14 11:55, George Dunlap wrote:
> On 09/22/2014 04:50 PM, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 16:38 +0100, George Dunlap wrote:
>>> On 09/22/2014 04:34 PM, Ian Campbell wrote:
>>>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>>>
>>>>>>> I picked this subset to start with because it only has changes in
>>>>>>> Xen.
>>>>>>>
>>>>>>> Some of this code is already in QEMU
>>>>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>>>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>>>>
>>>>>> I came to think this even more once I saw the save/restore support...
>>>>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
>>>> I understand the need for the cpuid bits, I should have made that clear.
>>>>
>>>>> A big chunk of the functionality here is to allow a userspace process to
>>>>> transparently make the "hypercalls" without the OS needing to explicitly
>>>>> give it access to the IO space, by trapping the resulting #GP faults and
>>>>> checking to see if they are IO instructions .  If that's functionality
>>>>> we think is important, then it will have to be done in Xen, I think.
>>>> Ah, the need to #GP was what I had missed, I was thinking it was just a
>>>> regular I/O port access.
>>>>
>>>> Having trapped the #GP and decoded it into an IO access, is there
>>>> anything stopping us forwarding that to qemu for consideration?
>>>>
>>>> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
>>>> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
>>>> that's just my lack of x86 hw virt knowledge)
>>> I'm not 100% sure of this, but my understanding was that it *would* be a
>>> normal IO trap *if* the guest OS gave access to that IO range to the
>>> guest (via IOPL, maybe?).  But if the userspace program is not
>>> explicitly given access by the OS to those ports, it will generate a #GP
>>> instead.  The idea is to allow the "hypercall" to happen *without
>>> cooperation* from the guest OS.
>>>

Yes, this is why the port in question is added (via vmport_register and
register_portio_handler) to the hypervisor ports that get handled by
hypervisor code.

>>> Again, that's my understanding, someone please correct me if I'm wrong...

Looks like a good statement, should I add it to a commit message?

>> It sounds plausible, for sure.
>>
>> Even so, why can't the result of that #GP be a calldown into qemu for
>> further processing?
>

This is not simple in that QEMU does not have access to the VCPU 
registers.  Unlike a normal
I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.
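
To make the register point concrete, here is a minimal sketch of how one
backdoor command uses the registers.  The magic, port and command numbers
are the values published in VMware's open-vm-tools backdoor definitions as
far as I know; struct guest_regs and vmport_sketch() are hypothetical names
for illustration only and are not the code in this series.

```c
#include <stdint.h>
#include <stdbool.h>

/* Values as published in VMware's open-vm-tools backdoor definitions. */
#define BDOOR_MAGIC          0x564D5868u  /* "VMXh", expected in EAX */
#define BDOOR_PORT           0x5658u      /* "VX", expected in DX */
#define BDOOR_CMD_GETVERSION 10u          /* command number, passed in ECX */

/* Hypothetical stand-in for the guest register state; not a Xen type. */
struct guest_regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Hypothetical handler for "inl (%dx), %eax" on the VMware port.
 * A plain port read would only produce a value for EAX; here EAX, ECX
 * and EDX are inputs, and EAX, EBX and ECX are outputs.
 */
static bool vmport_sketch(struct guest_regs *r)
{
    if ((r->edx & 0xffffu) != BDOOR_PORT || r->eax != BDOOR_MAGIC)
        return false;              /* not a backdoor call */

    switch (r->ecx & 0xffffu) {
    case BDOOR_CMD_GETVERSION:
        r->eax = 6;                /* illustrative version number */
        r->ebx = BDOOR_MAGIC;      /* magic echoed back in EBX */
        r->ecx = 0;                /* illustrative product type */
        return true;
    default:
        r->eax = 0xffffffffu;      /* illustrative "unknown command" reply */
        return true;
    }
}
```

An ordinary port read only ever needs one value handed back, which is why
the existing request format has no room for the other registers.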

> I was only responding to the part of your comment in parentheses. :-)
>
> I suppose in large part it would depend on what the hypercalls were 
> actually doing; I'd have to go back and look at them to say if they 
> need to be in Xen or whether they could be passed on to qemu.
>

Clearly it is possible to pass the VCPU registers to QEMU, but that is
currently not done.  So a new version of QEMU would also be needed to go
this way.  None of the proposed features need any data from QEMU, so I do
not think this makes sense.

    -Don Slutz

>  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 16:18         ` Jan Beulich
@ 2014-09-22 18:32           ` Don Slutz
  2014-09-25 10:37           ` Tim Deegan
  1 sibling, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-22 18:32 UTC (permalink / raw)
  To: Jan Beulich, Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Andrew Cooper, Eddie Dong, Don Slutz,
	xen-devel, AravindGopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky, Ian Jackson

On 09/22/14 12:18, Jan Beulich wrote:
>>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
>> On 09/22/2014 04:34 PM, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>>
>>>>>> I picked this subset to start with because it only has changes in
>>>>>> Xen.
>>>>>>
>>>>>> Some of this code is already in QEMU
>>>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>>>
>>>>> I came to think this even more once I saw the save/restore support...
>>>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
>>> I understand the need for the cpuid bits, I should have made that clear.
>>>
>>>> A big chunk of the functionality here is to allow a userspace process to
>>>> transparently make the "hypercalls" without the OS needing to explicitly
>>>> give it access to the IO space, by trapping the resulting #GP faults and
>>>> checking to see if they are IO instructions .  If that's functionality
>>>> we think is important, then it will have to be done in Xen, I think.
>>> Ah, the need to #GP was what I had missed, I was thinking it was just a
>>> regular I/O port access.
>>>
>>> Having trapped the #GP and decoded it into an IO access, is there
>>> anything stopping us forwarding that to qemu for consideration?
>>>
>>> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
>>> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
>>> that's just my lack of x86 hw virt knowledge)
>> I'm not 100% sure of this, but my understanding was that it *would* be a
>> normal IO trap *if* the guest OS gave access to that IO range to the
>> guest (via IOPL, maybe?).  But if the userspace program is not
>> explicitly given access by the OS to those ports, it will generate a #GP
>> instead.  The idea is to allow the "hypercall" to happen *without
>> cooperation* from the guest OS.
>>
>> Again, that's my understanding, someone please correct me if I'm wrong...
> That's indeed what was said so far. I wonder though whether opening
> this up without guest OS consent isn't gong to introduce a security
> issue inside the guest (depending on the exact functionality of these
> hypercalls).

Since this is only opened when vmware_port=1, there is no change when
vmware_port=0.

I do not know of any security issue inside the guest with the subset that
is supported (and there has been more than one set of eyes looking for
them).

     -Don Slutz

> Jan
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 15:52       ` Andrew Cooper
@ 2014-09-22 18:39         ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-22 18:39 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, Don Slutz,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit


On 09/22/14 11:52, Andrew Cooper wrote:
> On 22/09/14 16:34, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>
>>>>> I picked this subset to start with because it only has changes in
>>>>> Xen.
>>>>>
>>>>> Some of this code is already in QEMU
>>>> As I suggest in my reply to one for the rpc port patches it's not clear
>>>> that any of this needs to be in Xen rather than qemu in the first place.
>>>>
>>>> I came to think this even more once I saw the save/restore support...
>>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
>> I understand the need for the cpuid bits, I should have made that clear.
>>
>>> A big chunk of the functionality here is to allow a userspace process to
>>> transparently make the "hypercalls" without the OS needing to explicitly
>>> give it access to the IO space, by trapping the resulting #GP faults and
>>> checking to see if they are IO instructions .  If that's functionality
>>> we think is important, then it will have to be done in Xen, I think.
>> Ah, the need to #GP was what I had missed, I was thinking it was just a
>> regular I/O port access.
>>
>> Having trapped the #GP and decoded it into an IO access, is there
>> anything stopping us forwarding that to qemu for consideration?
>>
>> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
>> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
>> that's just my lack of x86 hw virt knowledge)
> I am fairly sure (reading the VMX/SVM manuals) that Xen can force a trap
> of a specific IO port as an IO access trap even if it would otherwise
> cause a #GP fault due to lack of IO permissions (which I guess is
> exactly for purposes like this).

Please direct me to this.  I see things like:

24593—Rev. 3.18—May 2011

15.10.2 IN and OUT Behavior
If the IOIO_PROT intercept bit is set, the IOPM controls port access. 
For IN/OUT instructions that
access more than a single byte, the permission bits for all bytes are 
checked; if any bit is set to 1, the
I/O operation is intercepted.


Exceptions related to virtual x86 mode, IOPL, or the TSS-bitmap are 
checked before the SVM
intercept check. All other exceptions are checked after the SVM 
intercept check.

I/O Intercept Information. When an IOIO intercept triggers, the 
following information (describing
the intercepted operation in order to facilitate emulation) is saved in 
the VMCB’s EXITINFO1 field:


This to me says that IOPL (aka generate a #GP) is checked before the SVM 
check.
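
As a minimal sketch of that ordering (my reading of the quoted text only,
protected mode only, nothing here is Xen code and all names are
hypothetical):

```c
#include <stdbool.h>

enum io_outcome { IO_GP_FAULT, IO_SVM_INTERCEPT, IO_EXECUTED };

/* Hypothetical summary of the state that decides what an IN/OUT does. */
struct io_ctx {
    unsigned cpl;            /* current privilege level of the guest code */
    unsigned iopl;           /* IOPL field of the guest's EFLAGS */
    bool tss_bitmap_denies;  /* TSS I/O bitmap has a 1 bit for the port */
    bool ioio_prot;          /* IOIO_PROT intercept bit is set */
    bool iopm_bit_set;       /* IOPM has a 1 bit for the port */
};

static enum io_outcome classify_io(const struct io_ctx *c)
{
    /* Step 1: IOPL / TSS-bitmap exceptions are checked first. */
    if (c->cpl > c->iopl && c->tss_bitmap_denies)
        return IO_GP_FAULT;

    /* Step 2: the SVM intercept check is reached only without a #GP. */
    if (c->ioio_prot && c->iopm_bit_set)
        return IO_SVM_INTERCEPT;

    return IO_EXECUTED;
}
```

With cpl=3, iopl=0 and the TSS bitmap denying the port, the result is
IO_GP_FAULT even though the IOPM bit is set, which is the case a ring 3
tool hits.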

     -Don Slutz

> I am also entirely certain that this is a far better position to be in
> than fully enabling #GP intercepts, assuming I have interpreted the
> manuals correctly.
>
> ~Andrew


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-22 13:47   ` Ian Campbell
@ 2014-09-22 21:18     ` Don Slutz
  2014-09-23 12:34       ` Ian Campbell
  2014-09-25 16:28     ` George Dunlap
  1 sibling, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-22 21:18 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/22/14 09:47, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>> This interface is an extension of __HYPERVISOR_HVM_op.  It was
>> picked because xc_get_hvm_param() also uses it and VMware guest
>> info is a lot like a hvm param.
> Sorry if this has been discussed before, but did you consider doing all
> this in qemu rather than Xen?


Yes, but QEMU does not have access to the VCPU registers.  Also, QEMU
does not have this code, so it would need to be added there as well.

A second part would be adding a way for a tool stack to
communicate with QEMU to access this data.  xenstore/xenbus might
be one way to do this.

> Unless there are frequent accesses to these things then qemu would be
> the default best place for this sort of thing, especially since as
> you've observed there is some pretty complex memory management and
> string handling which it would generally be better to avoid in the
> hypervisor.

The more complex memory management is around not using the
heap during a single guest instruction (inl (%dx)).  Handling
migration of a fully variable sized hunk is not that simple.

Also preventing the guest from accessing outside of the defined
memory does increase the complexity.


> Your description of HVM_PARAM_VMPORT_RESET_TIME suggests they aren't
> typically accessed very frequently.

The rate I have seen is a low number per minute.  However there
is nothing that prevents a guest from doing a very high rate.

     -Don Slutz

> Ian.
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-22 16:34     ` Andrew Cooper
@ 2014-09-22 21:22       ` Don Slutz
  2014-09-24 16:24         ` George Dunlap
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-22 21:22 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 09/22/14 12:34, Andrew Cooper wrote:
> On 22/09/14 14:41, Ian Campbell wrote:
>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>> This new libxl_domain_create_info field is used to set
>>> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.
>> Does this really need to be a CDF, rather than a domctl/hvm param?
> I have made the argument that many things which are currently HVM Params
> should be CDF, as they absolutely should be set and immutable for the
> entire lifetime of the domain.
>
>  From recollection, we have had several XSAs in the past which are
> directly attributable to the toolstack or guest being able to play with
> an (insufficiently locked down) HVM param after boot.
>
> Using a CDF avoids potential issues along these lines.

It also allows setting up v->arch.hvm_vmx.exception_bitmap at
the right time.  domctl/hvm params are set up much later in
the life of a domain.

>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
>> from the libxl PoV, I think.
>>
> Whatever the decision regarding CDF/hvmparam/other is, getting it right
> in the hypervisor is a much higher priority than being nice in libxl.
>
> It is unfortunate that libxl exposes the internal implementation details
> of createinfo vs buildinfo in its API.  With hindsight, it was a poor
> design decision.

     -Don Slutz

> ~Andrew
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 09/16] tools: Add limited support of VMware's hyper-call rpc
  2014-09-22 13:52   ` Ian Campbell
@ 2014-09-22 21:32     ` Don Slutz
  2014-09-23 12:35       ` Ian Campbell
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-22 21:32 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/22/14 09:52, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>    This guestinfo support is provided via libxc.  libxl support has
>> not be written.
>>
>> Note: VMware RPC support is only available on HVM domU.
>>
>> This interface is an extension of __HYPERVISOR_HVM_op.  It was
>> picked because xc_get_hvm_param() also uses it and VMware guest
>> info is a lot like a hvm param.
>>
>> The HVMOP_get_vmport_guest_info is used by two libxc functions,
>> xc_get_vmport_guest_info and xc_fetch_all_vmport_guest_info.
>> xc_fetch_all_vmport_guest_info is designed to be used to fetch all
>> currently set guestinfo values.
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> These look to be correct implementations of accessors for the hypercalls
> as defined, although apart from my query about whether this belongs in
> Xen at all I do have concerns about an hvm param argument struct of >4K.
> AIUI more normal would be to have two GUEST_HANDLE* fields and to copy
> in/out explicitly.

All my tests show this working.  I am not aware of a >4K issue with
a GUEST_HANDLE*.  Is this more a case of the 1st use?

> Ultimately that's up to the hypervisor maintainers though.

Ok.

> Ian.
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 17:19             ` Don Slutz
@ 2014-09-22 22:00               ` Tian, Kevin
  2014-09-23 12:30               ` Ian Campbell
  1 sibling, 0 replies; 93+ messages in thread
From: Tian, Kevin @ 2014-09-22 22:00 UTC (permalink / raw)
  To: Don Slutz, George Dunlap, Ian Campbell
  Cc: Tim Deegan, Keir Fraser, Nakajima, Jun, Stefano Stabellini,
	Ian Jackson, Dong, Eddie, xen-devel, Aravind Gopalakrishnan,
	Jan Beulich, Andrew Cooper, Boris Ostrovsky,
	Suravee Suthikulpanit

> From: Don Slutz [mailto:dslutz@verizon.com]
> Sent: Monday, September 22, 2014 10:19 AM
> >> It sounds plausible, for sure.
> >>
> >> Even so, why can't the result of that #GP be a calldown into qemu for
> >> further processing?
> >
> 
> This is not simple in that QEMU does not have access to the VCPU
> registers.  Unlike a normal
> I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.
> 
> > I was only responding to the part of your comment in parentheses. :-)
> >
> > I suppose in large part it would depend on what the hypercalls were
> > actually doing; I'd have to go back and look at them to say if they
> > need to be in Xen or whether they could be passed on to qemu.
> >
> 
> Clearly it is possible to pass the VCPU registers to QEMU, but that is
> currently not done.  So a new version of QEMU would also be needed to go
> this way.  None of the proposed features need any data from QEMU, so I do
> not think this makes sense.
> 

It looks a bit dirty to have CPU virtualization in Qemu, which also conflicts
with existing Qemu CPU virtualization logic, so if we want to do that there
may be objections from the Qemu community.

Regardless of the security concern from Jan, if we really need to do anything,
Xen seems the right place.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support
  2014-09-22 13:34   ` Ian Campbell
@ 2014-09-22 22:08     ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-22 22:08 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/22/14 09:34, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>> This is used to set HVM_PARAM_VMWARE_HW. It is set to the VMware
>> virtual hardware version.
>>
>> Currently 0, 3-4, 6-11 are good values.  However the code only
>> checks for == 0 or != 0.
>>
>> If non-zero then
>>    default VGA to VMware's VGA.
>>
>> Also now allows vga=vmware
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> ---
>> v5:
>>        Anything looking for Xen according to the Xen cpuid instructions...
>>          Adjusted doc to new wording.
>>
>>   docs/man/xl.cfg.pod.5               | 21 +++++++++++++++++++--
>>   docs/misc/hypervisor-cpuid.markdown | 28 ++++++++++++++++++++++++++++
>>   tools/libxc/xc_domain_restore.c     | 14 ++++++++++++++
>>   tools/libxc/xc_domain_save.c        | 11 +++++++++++
>>   tools/libxc/xg_save_restore.h       |  2 ++
>>   tools/libxl/libxl.h                 | 10 ++++++++++
>>   tools/libxl/libxl_create.c          |  4 +++-
>>   tools/libxl/libxl_dm.c              | 10 +++++++++-
>>   tools/libxl/libxl_dom.c             |  2 ++
>>   tools/libxl/libxl_types.idl         |  2 ++
>>   tools/libxl/xl_cmdimpl.c            | 11 ++++++++++-
>>   11 files changed, 110 insertions(+), 5 deletions(-)
>>   create mode 100644 docs/misc/hypervisor-cpuid.markdown
>>
>> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
>> index 517ae2f..367b401 100644
>> --- a/docs/man/xl.cfg.pod.5
>> +++ b/docs/man/xl.cfg.pod.5
>> @@ -1147,6 +1147,23 @@ some other Operating Systems and in some circumstance can prevent
>>   Xen's own paravirtualisation interfaces for HVM guests from being
>>   used.
>>   
>> +=item B<vmware_hw=NUMBER>
>> +
>> +Turns on or off the exposure of VMware cpuid.  The number is the
>> +VMware's hardware version number, where 0 is off.  If not zero it
>> +changes the default VGA to VMware's VGA.
> "is the VMware's" => "is VMware's".

Will do.

>> @@ -1185,8 +1202,8 @@ This option is deprecated, use vga="stdvga" instead.
>>   
>>   =item B<vga="STRING">
>>   
>> -Selects the emulated video card (none|stdvga|cirrus).
>> -The default is cirrus.
>> +Selects the emulated video card (none|stdvga|cirrus|vmware).
>> +The default is cirrus (or vmware if B<vmware_hw> is not zero).
> "The default is cirrus unless B<vmware_hw> is non-zero in which case it
> is vmware." ?

Sure.

>>   
>>   =item B<vnc=BOOLEAN>
>>   
>> diff --git a/docs/misc/hypervisor-cpuid.markdown b/docs/misc/hypervisor-cpuid.markdown
>> new file mode 100644
>> index 0000000..901a4e1
>> --- /dev/null
>> +++ b/docs/misc/hypervisor-cpuid.markdown
>> @@ -0,0 +1,28 @@
>> +Hypervisor Cpuid
>> +================
>> +
>> +The support of hypervisor cpuid leaves has not been agreed to.
> by....
>
> "the general hypervisor community" perhaps?
>
> Perhaps a better way of putting this would be "There is no agreed
> standard for the use of hypervisor cpuid leaves" or some such.
>

Ok.

>> +Other then the range 0x40000000 to 0x400000ff can be used by
>> +hypervisors.
> s/then/than/ I think.

I am not sure, so I will change it.

>> +
>> +MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
>> +
>> +VMware currently must be at 0x40000000.
>> +
>> +KVM currently must be at 0x40000000 (from Seabios).
>> +
>> +Xen can be found at the first otherwise unused 0x100 aligned
>> +offset between 0x40000000 and 0x40010000.
> I think you should add " leaves" after each hypervisor name.

Sure.
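
For reference, a minimal sketch of the scan that the last quoted sentence
describes, assuming the 12-byte signature comes back in EBX, ECX, EDX (as
with "XenVMMXenVMM" and "VMwareVMware").  find_hypervisor_base() and
fake_cpuid() are hypothetical names; fake_cpuid() only stands in for the
real instruction so that the loop can be tried anywhere.

```c
#include <stdint.h>
#include <string.h>

typedef void (*cpuid_fn)(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                         uint32_t *ecx, uint32_t *edx);

/*
 * Walk the 0x100-aligned offsets in [0x40000000, 0x40010000) and return
 * the first base whose signature matches sig, or 0 if none does.
 */
static uint32_t find_hypervisor_base(cpuid_fn cpuid, const char *sig)
{
    uint32_t base, eax, regs[3];
    char found[13];

    for (base = 0x40000000u; base < 0x40010000u; base += 0x100u) {
        cpuid(base, &eax, &regs[0], &regs[1], &regs[2]);
        memcpy(found, regs, 12);   /* EBX, ECX, EDX hold the signature */
        found[12] = '\0';
        if (strcmp(found, sig) == 0)
            return base;
    }
    return 0;
}

/*
 * Stand-in for CPUID: pretends VMware leaves sit at 0x40000000 and Xen
 * leaves at the next 0x100-aligned offset, as with vmware_hw != 0.
 */
static void fake_cpuid(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                       uint32_t *ecx, uint32_t *edx)
{
    const char *sig = leaf == 0x40000000u ? "VMwareVMware" :
                      leaf == 0x40000100u ? "XenVMMXenVMM" : NULL;
    uint32_t r[3] = { 0, 0, 0 };

    if (sig)
        memcpy(r, sig, 12);
    *eax = sig ? leaf : 0;         /* illustrative "max leaf" value */
    *ebx = r[0];
    *ecx = r[1];
    *edx = r[2];
}
```

A guest that only looks at 0x40000000 sees VMware here, while anything
that scans for the Xen signature still finds it at 0x40000100.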

>> @@ -378,6 +379,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>                                          ("timeoffset",       string),
>>                                          ("hpet",             libxl_defbool),
>>                                          ("vpt_align",        libxl_defbool),
>> +                                       ("vmware_hw",        UInt(64, init_val = 0)),
> There is no need for an explicitly 0 init_val, it's the default default.
>

Will switch to uint64.

>>                                          ("timer_mode",       libxl_timer_mode),
>>                                          ("nested_hvm",       libxl_defbool),
>>                                          ("smbios_firmware",  string),
>> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
>> index 698b3bc..2119bd6 100644
>> --- a/tools/libxl/xl_cmdimpl.c
>> +++ b/tools/libxl/xl_cmdimpl.c
>> @@ -1038,6 +1038,8 @@ static void parse_config_data(const char *config_source,
>>           xlu_cfg_get_defbool(config, "hpet", &b_info->u.hvm.hpet, 0);
>>           xlu_cfg_get_defbool(config, "vpt_align", &b_info->u.hvm.vpt_align, 0);
>>   
>> +        if (!xlu_cfg_get_long(config, "vmware_hw",  &l, 1))
>> +            b_info->u.hvm.vmware_hw = l;
>>           if (!xlu_cfg_get_long(config, "timer_mode", &l, 1)) {
>>               const char *s = libxl_timer_mode_to_string(l);
>>               fprintf(stderr, "WARNING: specifying \"timer_mode\" as an integer is deprecated. "
>> @@ -1676,13 +1678,20 @@ skip_vfb:
>>                   b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
>>               } else if (!strcmp(buf, "none")) {
>>                   b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
>> +            } else if (!strcmp(buf, "vmware")) {
>> +                b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
>>               } else {
>>                   fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
>>                   exit(1);
>>               }
>>           } else if (!xlu_cfg_get_long(config, "stdvga", &l, 0))
>>               b_info->u.hvm.vga.kind = l ? LIBXL_VGA_INTERFACE_TYPE_STD :
>> -                                         LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
>> +                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
>> +                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> I don't think this is a good idea. stdvga = 1 in the config file should
> still mean stdvga, not conditionally vmware. Likewise stdvga = 0 should
> always be cirrus.

I think you are misreading this.  For "stdvga = 1", l == 1, so it
is not conditionally vmware.  However, for "stdvga = 0" it is conditionally
vmware.  I will drop the conditional vmware for "stdvga = 0".

And add to the xl.cfg.pod.5 the additional statement that
the deprecated "stdvga=0" prevents the usage of vmware by default.


> Someone who wants to force vmware should use vga=vmware and not specify
> stdvga at all.

This should work.  If VGA is specified, stdvga is ignored.

> (NB: stdvga is deprecated synonym, the man page advises using vga=
> already)
>
>> +        else
>> +            b_info->u.hvm.vga.kind =
>> +                b_info->u.hvm.vmware_hw ? LIBXL_VGA_INTERFACE_TYPE_VMWARE :
>> +                                          LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> This else clause shouldn't be here, update
> libxl__domain_build_info_setdefault instead where it currently says:
>          if (!b_info->u.hvm.vga.kind)
>              b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
>
> note that this code should only set vga.kind if it is currently zero
> (which indicates to libxl "pick a default")

Will do.
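
Roughly what I understand is being asked for, as a minimal sketch.  The
enum, struct and function names below are hypothetical stand-ins, not the
libxl types; the only point is that the default is picked in one place and
only when nothing was set.

```c
/* Hypothetical stand-ins for the libxl types; 0 means "pick a default". */
enum vga_kind { VGA_UNSET = 0, VGA_CIRRUS, VGA_STD, VGA_VMWARE, VGA_NONE };

struct hvm_cfg {
    unsigned long vmware_hw;   /* 0 = VMware cpuid leaves off */
    enum vga_kind vga;
};

/*
 * Set the VGA default only when the caller left it unset.  An explicit
 * vga= (or the deprecated stdvga=) choice is never overridden.
 */
static void vga_setdefault(struct hvm_cfg *c)
{
    if (c->vga != VGA_UNSET)
        return;
    c->vga = c->vmware_hw ? VGA_VMWARE : VGA_CIRRUS;
}
```

With this shape, "stdvga = 0" would have to store an explicit cirrus
choice before the setdefault step runs for it to mean cirrus regardless of
vmware_hw.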

    -Don Slutz

>>   
>>           xlu_cfg_replace_string (config, "keymap", &b_info->u.hvm.keymap, 0);
>>           xlu_cfg_get_defbool (config, "spice", &b_info->u.hvm.spice.enable, 0);
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-22 16:42     ` Don Slutz
@ 2014-09-23 12:20       ` Ian Campbell
  2014-09-24 16:31         ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-23 12:20 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
> > The latter would allow moving to buildinfo.u.hvm, which would be nicer
> > from the libxl PoV, I think.
> 
> I could not find "buildinfo.u.hvm":
> 
> 
> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
> dcs-xen-54:~/xen>
> 
> 
> So unable to comment.

It's in the idl, next to createinfo.
> 
> > If yes then I still think you would want to set the default based on
> > vmware-hw, wouldn't you?
> 
> I guess so since this is a BOOLEAN.

defbool I hope.

>   Currently I do not know of a way to 
> say "set vmware_hw to 7
> if vmware_port is true and vmware_hw is not specified".

That's an error case, isn't it? Or at least a vmware_port is ignored
case.

What I suggested was "if vmware_hw is non-zero then set vmware_port".

>   Which would be 
> the inverse.  I lean to
> not having the default of vmware_port based on vmware_hw.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 17:19             ` Don Slutz
  2014-09-22 22:00               ` Tian, Kevin
@ 2014-09-23 12:30               ` Ian Campbell
  2014-09-23 12:35                 ` George Dunlap
                                   ` (2 more replies)
  1 sibling, 3 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-23 12:30 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
> >> It sounds plausible, for sure.
> >>
> >> Even so, why can't the result of that #GP be a calldown into qemu for
> >> further processing?
> >
> 
> This is not simple in that QEMU does not have access to the VCPU 
> registers.  Unlike a normal
> I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.

Are you saying that emulating a normal in or out instruction doesn't
require accessing vcpu registers? Are you sure? Surely it needs to
either read the source or write the destination register somehow.

> 
> > I was only responding to the part of your comment in parentheses. :-)
> >
> > I suppose in large part it would depend on what the hypercalls were 
> > actually doing; I'd have to go back and look at them to say if they 
> > need to be in Xen or whether they could be passed on to qemu.
> >
> 
> Clearly it is possible to pass the VCPU registers to QEMU, but that is 
> currently not done.

I think there's an existing hypercall to get/set the state for a vcpu,
perhaps it is too heavy weight to be used here though.

An alternative would be a semantically higher level I/O req which took a
guest pointer to a key and a guest pointer to the buffer etc, without
needing the registers themselves.

> So a new version of QEMU would also be needed to go this way.  None of
> the proposed features need any data from QEMU, so I do not think this
> makes sense.

The concern is that it is adding a load of complex looking string and
pointer manipulation stuff to the hypervisor, the sort of thing which
often leads to security vulnerabilities.

So that would be better done outside of Xen itself if possible, if a
qemu update is the price for that then it doesn't seem so bad to me.

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-22 21:18     ` Don Slutz
@ 2014-09-23 12:34       ` Ian Campbell
  2014-09-23 22:03         ` Slutz, Donald Christopher
  0 siblings, 1 reply; 93+ messages in thread
From: Ian Campbell @ 2014-09-23 12:34 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 17:18 -0400, Don Slutz wrote:
> On 09/22/14 09:47, Ian Campbell wrote:
> > On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> >> This interface is an extension of __HYPERVISOR_HVM_op.  It was
> >> picked because xc_get_hvm_param() also uses it and VMware guest
> >> info is a lot like a hvm param.
> > Sorry if this has been discussed before, but did you consider doing all
> > this in qemu rather than Xen?
> 
> 
> Yes, but QEMU does not have access to the VCPU registers.  Also QEMU 
> does not have this code; so would also need it to be added.

I thought you said KVM had this functionality already? If so then the
bulk of the code should be there already.

> A second part would be adding a way for a tool stack to
> communicate with QEMU to access this data.  xenstore/xenbus might
> be one way to do this.

qmp would be the more modern way I think.

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 09/16] tools: Add limited support of VMware's hyper-call rpc
  2014-09-22 21:32     ` Don Slutz
@ 2014-09-23 12:35       ` Ian Campbell
  0 siblings, 0 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-23 12:35 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Mon, 2014-09-22 at 17:32 -0400, Don Slutz wrote:
> On 09/22/14 09:52, Ian Campbell wrote:
> > On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> >>    This guestinfo support is provided via libxc.  libxl support has
> >> not be written.
> >>
> >> Note: VMware RPC support is only available on HVM domU.
> >>
> >> This interface is an extension of __HYPERVISOR_HVM_op.  It was
> >> picked because xc_get_hvm_param() also uses it and VMware guest
> >> info is a lot like a hvm param.
> >>
> >> The HVMOP_get_vmport_guest_info is used by two libxc functions,
> >> xc_get_vmport_guest_info and xc_fetch_all_vmport_guest_info.
> >> xc_fetch_all_vmport_guest_info is designed to be used to fetch all
> >> currently set guestinfo values.
> >>
> >> Signed-off-by: Don Slutz <dslutz@verizon.com>
> > These look to be correct implementations of accessors for the hypercalls
> > as defined, although apart from my query about whether this belongs in
> > Xen at all I do have concerns about an hvm param argument struct of >4K.
> > AIUI more normal would be to have two GUEST_HANDLE* fields and to copy
> > in/out explicitly.
> 
> All my tests show this working.  I am not aware of a >4K issue with
> a GUEST_HANDLE*.  Is this more a case of the 1st use?

It's a couple of orders of magnitude bigger than any existing things, I
think. It's not a problem per-se but I wanted to be sure that it had
been properly considered.

Ian


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-23 12:30               ` Ian Campbell
@ 2014-09-23 12:35                 ` George Dunlap
  2014-09-23 12:40                   ` Ian Campbell
  2014-09-24 15:52                 ` George Dunlap
  2014-09-24 17:19                 ` Don Slutz
  2 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-23 12:35 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Don Slutz, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, xen-devel, Suravee Suthikulpanit

On Tue, Sep 23, 2014 at 1:30 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
>> >> It sounds plausible, for sure.
>> >>
>> >> Even so, why can't the result of that #GP be a calldown into qemu for
>> >> further processing?
>> >
>>
>> This is not simple in that QEMU does not have access to the VCPU
>> registers.  Unlike a normal
>> I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.
>
> Are you saying that emulating a normal in or out instruction doesn't
> require accessing vcpu registers? Are you sure? Surely it needs to
> either read the source or write the destination register somehow.

At the moment Xen does all of the decoding of instructions; what it
sends to qemu is just "Read / Write X bytes at address Y" (along with
data when appropriate).  Qemu doesn't require access to the registers.
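
To make the shape of such a request concrete, here is a minimal sketch in C.
The struct, enum and function names are illustrative assumptions and are not
Xen's real ioreq layout; the only point is that the request carries an
address, a size, a direction and data, and no register state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an I/O request (not Xen's ioreq_t). */
enum io_dir { IO_WRITE = 0, IO_READ = 1 };

struct io_request {
    uint64_t addr;    /* port or MMIO address ("address Y") */
    uint32_t size;    /* access width in bytes ("X bytes") */
    enum io_dir dir;  /* read or write */
    uint64_t data;    /* value for writes; filled in by the device model on reads */
};

/*
 * Build the request for an "out" to a port after the hypervisor has already
 * decoded the instruction.  Only the low 'size' bytes of EAX are forwarded.
 */
static struct io_request make_out_request(uint16_t port, uint32_t eax,
                                          uint32_t size)
{
    struct io_request req;

    assert(size == 1 || size == 2 || size == 4);
    req.addr = port;
    req.size = size;
    req.dir = IO_WRITE;
    req.data = (size == 4) ? eax : (eax & ((1u << (8 * size)) - 1u));
    return req;
}
```

A device model receiving this struct has everything it needs for an ordinary
port write, which is why a backdoor call that also wants EBX, ECX, ESI and
EDI does not fit the existing request without extension.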

 -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-23 12:35                 ` George Dunlap
@ 2014-09-23 12:40                   ` Ian Campbell
  0 siblings, 0 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-23 12:40 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Don Slutz, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, xen-devel, Suravee Suthikulpanit

On Tue, 2014-09-23 at 13:35 +0100, George Dunlap wrote:
> On Tue, Sep 23, 2014 at 1:30 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
> >> >> It sounds plausible, for sure.
> >> >>
> >> >> Even so, why can't the result of that #GP be a calldown into qemu for
> >> >> further processing?
> >> >
> >>
> >> This is not simple in that QEMU does not have access to the VCPU
> >> registers.  Unlike a normal
> >> I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.
> >
> > Are you saying that emulating a normal in or out instruction doesn't
> > require accessing vcpu registers? Are you sure? Surely it needs to
> > either read the source or write the destination register somehow.
> 
> At the moment Xen does all of the decoding of instructions; what it
> sends to qemu is just "Read / Write X bytes at address Y" (along with
> data when appropriate).  Qemu doesn't require access to the registers.

I figured that out later and then forgot to delete the paragraph, sorry.

Ian


* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-20 18:07 ` [PATCH for-4.5 v6 04/16] xen: Add vmware_port support Don Slutz
@ 2014-09-23 17:16   ` Boris Ostrovsky
  2014-09-24  8:28     ` Jan Beulich
  2014-09-26 19:09     ` Don Slutz
  2014-09-24 16:01   ` George Dunlap
  1 sibling, 2 replies; 93+ messages in thread
From: Boris Ostrovsky @ 2014-09-23 17:16 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan, George Dunlap,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Suravee Suthikulpanit

On 09/20/2014 02:07 PM, Don Slutz wrote:
> @@ -2064,6 +2065,42 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
>       return;
>   }
>   
> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
> +                                    struct vcpu *v)
> +{
> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
> +    /*
> +     * Just use 15 for the instruction length; vmport_gp_check will
> +     * adjust it.  This is because
> +     * __get_instruction_length_from_list() has issues, and may
> +     * require a double read of the instruction bytes.  At some
> +     * point a new routine could be added that is based on the code
> +     * in vmport_gp_check with extensions to make it more general.
> +     * Since that routine is the only user of this code this can be
> +     * done later.
> +     */
> +    unsigned long inst_len = 15;

Can you add a comment describing why you chose 15?

Also, saying that __get_instruction_length_from_list() has issues I 
think requires a bit more details (e.g. that when called from #GP 
handler NRIP is not available, or that NRIP may not be available at all 
on a particular HW, leading to the need to read the instruction twice --- 
once in __get_instruction_length_from_list() and then again in 
vmport_gp_check(). Which is bad because memory may change between the 
reads. Or something like that.).
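
A minimal sketch of the "read once" approach, in C.  This is not the code
from the patch: snapshot_insn() and decode_port_io() are hypothetical names,
and the decoder only recognises an optional 0x66 operand-size prefix followed
by the IN/OUT-via-DX opcodes 0xEC to 0xEF.  The idea it illustrates is that
guest bytes are copied a single time and every later decision is made on the
private copy, so the guest cannot change them between two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INST_LEN 15  /* architectural maximum length of an x86 instruction */

/* Copy at most MAX_INST_LEN guest bytes once; returns the number copied. */
static size_t snapshot_insn(uint8_t *dst, const volatile uint8_t *guest,
                            size_t avail)
{
    size_t n = avail < MAX_INST_LEN ? avail : MAX_INST_LEN;

    for (size_t i = 0; i < n; i++)
        dst[i] = guest[i];
    return n;
}

/*
 * Decode from the snapshot only.  Returns the instruction length for
 * IN/OUT using DX as the port, or 0 if the bytes are not recognised.
 */
static size_t decode_port_io(const uint8_t *buf, size_t n)
{
    size_t i = 0;

    assert(n <= MAX_INST_LEN);
    if (i < n && buf[i] == 0x66)                 /* operand-size prefix */
        i++;
    if (i < n && buf[i] >= 0xEC && buf[i] <= 0xEF)
        return i + 1;
    return 0;
}
```

Starting from a length of 15 and letting the decoder shrink it, as the patch
comment describes, corresponds here to snapshotting up to MAX_INST_LEN bytes
and using the return value of decode_port_io() as the real length.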

-boris

> +    unsigned long inst_addr = svm_rip2pointer(v);
> +    int rc;
> +
> +    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
> +                         vmcb->exitinfo1, vmcb->exitinfo2);
> +    if ( !rc )
> +        __update_guest_eip(regs, inst_len);
> +    else
> +    {
> +        VMPORT_DBG_LOG(VMPORT_LOG_GP_UNKNOWN,
> +                       "gp: rc=%d ei1=0x%lx ei2=0x%lx ec=0x%x ip=%"PRIx64
> +                       " (0x%lx,%ld) ax=%"PRIx64" bx=%"PRIx64" cx=%"PRIx64
> +                       " dx=%"PRIx64" si=%"PRIx64" di=%"PRIx64, rc,
> +                       (unsigned long)vmcb->exitinfo1,
> +                       (unsigned long)vmcb->exitinfo2, regs->error_code,
> +                       regs->rip, inst_addr, inst_len, regs->rax, regs->rbx,
> +                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
> +        hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
> +    }
> +}
> +


* Re: [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-23 12:34       ` Ian Campbell
@ 2014-09-23 22:03         ` Slutz, Donald Christopher
  0 siblings, 0 replies; 93+ messages in thread
From: Slutz, Donald Christopher @ 2014-09-23 22:03 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/23/14 08:34, Ian Campbell wrote:
> On Mon, 2014-09-22 at 17:18 -0400, Don Slutz wrote:
>> On 09/22/14 09:47, Ian Campbell wrote:
>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>> This interface is an extension of __HYPERVISOR_HVM_op.  It was
>>>> picked because xc_get_hvm_param() also uses it and VMware guest
>>>> info is a lot like a hvm param.
>>> Sorry if this has been discussed before, but did you consider doing all
>>> this in qemu rather than Xen?
>>
>> Yes, but QEMU does not have access to the VCPU registers.  Also QEMU
>> does not have this code; so would also need it to be added.
> I thought you said KVM had this functionality already? If so then the
> bulk of the code should be there already.

QEMU (aka KVM) only has part of #4; it has none of the code for RPC
(I.E. this patch).  So it would all be new code.

>> A second part would be adding a way for a tool stack to
>> communicate with QEMU to access this data.  xenstore/xenbus might
>> be one way to do this.
> qmp would be the more modern way I think.

Ok.

    -Don Slutz

> Ian.
>


* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-23 17:16   ` Boris Ostrovsky
@ 2014-09-24  8:28     ` Jan Beulich
  2014-09-26 19:09     ` Don Slutz
  1 sibling, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2014-09-24  8:28 UTC (permalink / raw)
  To: xen-devel, Boris Ostrovsky, Don Slutz
  Cc: Jun Nakajima, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

>>> On 23.09.14 at 19:16, <boris.ostrovsky@oracle.com> wrote:
> On 09/20/2014 02:07 PM, Don Slutz wrote:
>> @@ -2064,6 +2065,42 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
>>       return;
>>   }
>>   
>> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>> +                                    struct vcpu *v)
>> +{
>> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>> +    /*
>> +     * Just use 15 for the instruction length; vmport_gp_check will
>> +     * adjust it.  This is because
>> +     * __get_instruction_length_from_list() has issues, and may
>> +     * require a double read of the instruction bytes.  At some
>> +     * point a new routine could be added that is based on the code
>> +     * in vmport_gp_check with extensions to make it more general.
>> +     * Since that routine is the only user of this code this can be
>> +     * done later.
>> +     */
>> +    unsigned long inst_len = 15;
> 
> Can you add a comment describing why you chose 15?

I think this much of architecture knowledge can be assumed as given.

Jan


* Re: [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves
  2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
  2014-09-22 11:49   ` Andrew Cooper
@ 2014-09-24 14:33   ` George Dunlap
  1 sibling, 0 replies; 93+ messages in thread
From: George Dunlap @ 2014-09-24 14:33 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, Sep 20, 2014 at 7:07 PM, Don Slutz <dslutz@verizon.com> wrote:
> This is done by adding HVM_PARAM_VMWARE_HW. It is set to the VMware
> virtual hardware version.
>
> Currently 0, 3-4, 6-11 are good values.  However the
> code only checks for == 0 or != 0.
>
> If non-zero then
>   Return VMware's cpuid leaves.
>
> The support of hypervisor cpuid leaves has not been agreed to.
>
> MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
>
> VMware currently must be at 0x40000000.
>
> KVM currently must be at 0x40000000 (from Seabios).
>
> Xen can be found at the first otherwise unused 0x100 aligned
> offset between 0x40000000 and 0x40010000.
>
> http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
>
> http://lwn.net/Articles/301888/
>   Attempted to get this cleaned up.
>
> So based on this, I picked the order:
>
> Xen at 0x40000000 or
> Viridian or VMware at 0x40000000 and Xen at 0x40000100
>
> If both Viridian and VMware selected, report an error.
>
> Since I need to change xen/arch/x86/hvm/Makefile; also add
> a newline at end of file.
>
> Signed-off-by: Don Slutz <dslutz@verizon.com>

FYI I've taken a look and this all seems sensible, but I'll hold off
on my reviewed-by until I've taken a look at more of the series.

 -George


* Re: [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support
  2014-09-20 18:07 ` [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support Don Slutz
  2014-09-22 13:34   ` Ian Campbell
@ 2014-09-24 14:44   ` George Dunlap
  2014-09-24 21:06     ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 14:44 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, Sep 20, 2014 at 7:07 PM, Don Slutz <dslutz@verizon.com> wrote:
> This is used to set HVM_PARAM_VMWARE_HW. It is set to the VMware
> virtual hardware version.
>
> Currently 0, 3-4, 6-11 are good values.  However the code only
> checks for == 0 or != 0.
>
> If non-zero then
>   default VGA to VMware's VGA.
>
> Also now allows vga=vmware
>
> Signed-off-by: Don Slutz <dslutz@verizon.com>
[snip]
> diff --git a/docs/misc/hypervisor-cpuid.markdown b/docs/misc/hypervisor-cpuid.markdown
> new file mode 100644
> index 0000000..901a4e1
> --- /dev/null
> +++ b/docs/misc/hypervisor-cpuid.markdown
> @@ -0,0 +1,28 @@
> +Hypervisor Cpuid
> +================
> +
> +The support of hypervisor cpuid leaves has not been agreed to.
> +Other then the range 0x40000000 to 0x400000ff can be used by
> +hypervisors.
> +
> +MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
> +
> +VMware currently must be at 0x40000000.
> +
> +KVM currently must be at 0x40000000 (from Seabios).
> +
> +Xen can be found at the first otherwise unused 0x100 aligned
> +offset between 0x40000000 and 0x40010000.

So Xen is the only kid on the block who plays nice, huh?

> @@ -555,7 +558,12 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
>              break;
>          case LIBXL_VGA_INTERFACE_TYPE_NONE:
>              break;
> -        }
> +        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
> +            flexarray_append_pair(dm_args, "-device",
> +                GCSPRINTF("vmware-svga,vgamem_mb=%d",
> +                libxl__sizekb_to_mb(b_info->video_memkb)));
> +            break;
> +            }

Nit: You screwed up the indentation here.

Other than that, looks good (with IanC's suggestions).

 -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-23 12:30               ` Ian Campbell
  2014-09-23 12:35                 ` George Dunlap
@ 2014-09-24 15:52                 ` George Dunlap
  2014-09-24 18:09                   ` Don Slutz
  2014-09-24 17:19                 ` Don Slutz
  2 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 15:52 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/23/2014 01:30 PM, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
>>>> It sounds plausible, for sure.
>>>>
>>>> Even so, why can't the result of that #GP be a calldown into qemu for
>>>> further processing?
>> This is not simple in that QEMU does not have access to the VCPU
>> registers.  Unlike a normal
>> I/O request, vmware_port (aka vmport) both reads and writes VCPU registers.
> Are you saying that emulating a normal in or out instruction doesn't
> require accessing vcpu registers? Are you sure? Surely it needs to
> either read the source or write the destination register somehow.
>
>>> I was only responding to the part of your comment in parentheses. :-)
>>>
>>> I suppose in large part it would depend on what the hypercalls were
>>> actually doing; I'd have to go back and look at them to say if they
>>> need to be in Xen or whether they could be passed on to qemu.
>>>
>> Clearly it is possible to pass the VCPU registers to QEMU, but that is
>> currently not done.
> I think there's an existing hypercall to get/set the state for a vcpu,
> perhaps it is too heavy weight to be used here though.
>
> An alternative would be a semantically higher level I/O req which took a
> guest pointer to a key and a guest pointer to the buffer etc, without
> needing the registers themselves.
>
>>    So a new
>> version of QEMU would also be needed to go this way.  None of the
>> proposed features need any data from QEMU, so I do not think this
>> makes sense.
> The concern is that it is adding a load of complex looking string and
> pointer manipulation stuff to the hypervisor, the sort of thing which
> often leads to security vulnerabilities.

Do you mean the instruction decoding in vmport_gp_check()?

I was wondering how hard it would be to use the generic emulation code.  
We already have to emulate IO instructions anyway.  This is very 
complicated code, and having it duplicated in two places seems like it's 
just asking for someone to update the one and forget to update the 
other, opening up a bug / security vulnerability.

The other question would be whether doing it in qemu would be fast 
enough, or if there would be information needed by the hypercall that's 
not available; things like GETTIME / GETTIMEFULL / GETHZ.

On the other hand, things like GETSCREENSIZE and GETGUIOPTIONS probably 
*are* better handled by qemu.

  -George


* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-20 18:07 ` [PATCH for-4.5 v6 04/16] xen: Add vmware_port support Don Slutz
  2014-09-23 17:16   ` Boris Ostrovsky
@ 2014-09-24 16:01   ` George Dunlap
  2014-09-24 16:48     ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 16:01 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/20/2014 07:07 PM, Don Slutz wrote:
> This includes adding is_vmware_port_enabled
>
> This is a new domain_create() flag, DOMCRF_vmware_port.  It is
> passed to domctl as XEN_DOMCTL_CDF_vmware_port.
>
> This enables limited support of VMware's hyper-call.
>
> This support is in some ways more complete than what QEMU and/or KVM
> currently provide, and in other ways less.  The missing part requires
> QEMU changes and has been left out until the QEMU patches are accepted
> upstream.
>
> VMware's hyper-call is also known as VMware Backdoor I/O Port.
>
> Note: this support does not depend on vmware_hw being non-zero.
>
> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
> to port 0x5658 specially.  Note: since many operations return data
> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
> "in (%dx),%al" will still do things, only AL part of EAX will be
> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
> unchanged.
>
> Also this instruction is allowed to be used from ring 3.  To
> support this the vmexit for GP needs to be enabled.  I have not
> fully tested that nested HVM is doing the right thing for this.
>
> An open source example of using this is:
>
> http://open-vm-tools.sourceforge.net/
>
> Which only uses "inl (%dx)".  Also
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
>
> The support included is enough to allow VMware tools to install in a
> HVM domU.
>
> For a debug=y build there is a new command line option
> vmport_debug=.  It enables console output for the various
> stages of handling the "in (%dx)" instruction.
>
> Signed-off-by: Don Slutz <dslutz@verizon.com>

[snip]
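
The partial-width rule from the commit message quoted above ("only AL part
of EAX will be changed") can be sketched as a small C helper.  The name
merge_in_result() is hypothetical and the sketch works on 32-bit EAX only;
it mirrors the 1/2/4-byte cases described in the commit message rather than
reproducing the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the EAX value the guest had before the "in" with the value the
 * backdoor command produced, honouring the access width:
 *   1 byte  -> only AL is replaced
 *   2 bytes -> only AX is replaced
 *   4 bytes -> all of EAX is replaced
 */
static uint32_t merge_in_result(uint32_t saved_eax, uint32_t result,
                                unsigned int bytes)
{
    switch (bytes) {
    case 1:
        return (saved_eax & 0xffffff00u) | (result & 0xffu);
    case 2:
        return (saved_eax & 0xffff0000u) | (result & 0xffffu);
    default:
        assert(bytes == 4);
        return result;
    }
}
```

For an "out" of any width the commit message says EAX is left unchanged, so
a caller would simply keep saved_eax in that direction.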

> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 7b1dfe6..e2e4aad 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -510,6 +510,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>       d->arch.hvm_domain.mem_sharing_enabled = 0;
>   
>       d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
> +    d->arch.hvm_domain.is_vmware_port_enabled =
> +        (domcr_flags & DOMCRF_vmware_port);

Should this be "!!(domcr..."?
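
A small C sketch of why the "!!" can matter.  The bit position of the flag
and the 8-bit destination type are assumptions made for illustration; the
excerpt does not show the real value of DOMCRF_vmware_port or the type of
is_vmware_port_enabled.  If the destination is a C99 _Bool the conversion
already normalises to 0 or 1, so the concern applies to narrower integer
types and bitfields.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_VMWARE_PORT (1u << 8)  /* hypothetical bit position */

/* The demonstration only works if the flag does not fit in 8 bits. */
static_assert(FLAG_VMWARE_PORT > UINT8_MAX, "flag must be above bit 7");

/* Storing the masked value directly: 0x100 truncates to 0 in a uint8_t. */
static uint8_t store_raw(unsigned int flags)
{
    return flags & FLAG_VMWARE_PORT;
}

/* Normalising with !! first: the result is always exactly 0 or 1. */
static uint8_t store_normalised(unsigned int flags)
{
    return !!(flags & FLAG_VMWARE_PORT);
}
```

With these assumptions store_raw() silently reports "disabled" for a domain
that asked for the feature, while store_normalised() does not.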

> diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
> new file mode 100644
> index 0000000..811c303
> --- /dev/null
> +++ b/xen/arch/x86/hvm/vmware/vmport.c
> @@ -0,0 +1,326 @@
> +/*
> + * HVM VMPORT emulation
> + *
> + * Copyright (C) 2012 Verizon Corporation
> + *
> + * This file is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License Version 2 (GPLv2)
> + * as published by the Free Software Foundation.
> + *
> + * This file is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details. <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/config.h>
> +#include <xen/lib.h>
> +#include <asm/hvm/hvm.h>
> +#include <asm/hvm/support.h>
> +#include <asm/hvm/vmport.h>
> +
> +#include "backdoor_def.h"
> +#include "guest_msg_def.h"
> +
> +#define MAX_INST_LEN 15
> +
> +#ifndef NDEBUG
> +unsigned int opt_vmport_debug __read_mostly;
> +integer_param("vmport_debug", opt_vmport_debug);
> +#endif
> +
> +/* More VMware defines */
> +
> +#define VMWARE_GUI_AUTO_GRAB              0x001
> +#define VMWARE_GUI_AUTO_UNGRAB            0x002
> +#define VMWARE_GUI_AUTO_SCROLL            0x004
> +#define VMWARE_GUI_AUTO_RAISE             0x008
> +#define VMWARE_GUI_EXCHANGE_SELECTIONS    0x010
> +#define VMWARE_GUI_WARP_CURSOR_ON_UNGRAB  0x020
> +#define VMWARE_GUI_FULL_SCREEN            0x040
> +
> +#define VMWARE_GUI_TO_FULL_SCREEN         0x080
> +#define VMWARE_GUI_TO_WINDOW              0x100
> +
> +#define VMWARE_GUI_AUTO_RAISE_DISABLED    0x200
> +
> +#define VMWARE_GUI_SYNC_TIME              0x400
> +
> +/* When set, toolboxes should not show the cursor options page. */
> +#define VMWARE_DISABLE_CURSOR_OPTIONS     0x800
> +
> +void vmport_register(struct domain *d)
> +{
> +    register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
> +}
> +
> +int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
> +{
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    uint32_t cmd = regs->rcx & 0xffff;
> +    uint32_t magic = regs->rax;
> +    int rc = X86EMUL_OKAY;
> +
> +    if ( magic == BDOOR_MAGIC )
> +    {
> +        uint64_t saved_rax = regs->rax;
> +        uint64_t value;
> +
> +        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
> +                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" cmd=%d ax=%"
> +                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"
> +                       PRIx64" di=%"PRIx64"\n", dir, bytes,
> +                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
> +                       regs->rdx, regs->rsi, regs->rdi);
> +        switch ( cmd )
> +        {
> +        case BDOOR_CMD_GETMHZ:
> +            /* ... */
> +            regs->rbx = BDOOR_MAGIC;
> +            regs->rax = current->domain->arch.tsc_khz / 1000;
> +            break;
> +        case BDOOR_CMD_GETVERSION:
> +            /* ... */
> +            regs->rbx = BDOOR_MAGIC;
> +            /* VERSION_MAGIC */
> +            regs->rax = 6;
> +            /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
> +            regs->rcx = 2;
> +            break;
> +        case BDOOR_CMD_GETSCREENSIZE:
> +            /* We have no screen size */
> +            regs->rax = 0;
> +            break;
> +        case BDOOR_CMD_GETHWVERSION:
> +            /* vmware_hw */
> +            regs->rax = 0;
> +            if ( is_hvm_vcpu(current) )
> +            {
> +                struct hvm_domain *hd = &current->domain->arch.hvm_domain;
> +
> +                regs->rax = hd->params[HVM_PARAM_VMWARE_HW];
> +            }
> +            if ( !regs->rax )
> +                regs->rax = 4;  /* Act like version 4 */
> +            break;
> +        case BDOOR_CMD_GETHZ:
> +            value = current->domain->arch.tsc_khz * 1000;
> +            /* apic-frequency (bus speed) */
> +            regs->rcx = (uint32_t)(1000000000ULL / APIC_BUS_CYCLE_NS);
> +            /* High part of tsc-frequency */
> +            regs->rbx = (uint32_t)(value >> 32);
> +            /* Low part of tsc-frequency */
> +            regs->rax = value;

Either the comment or the code here is wrong -- this is clearly not the 
lower 32 bits, at least on 64-bit guests. :-)

If the code is right -- that is, if a 32-bit guest finds this truncated 
automatically, but a 64-bit guest finds all 64 bits here (and thus does 
not have to reconstruct it) -- you should make the comment more 
informative; for example:
  /* On 32-bit systems this will be the lower 32 bits.  64-bit systems
     can just use the full value from rax. */

(Word-wrapped, of course.)

Hmm -- looks like regs->rax will be clipped to 32 bits for a 4-byte IO 
read?  In which case the comment here should reflect this, but you have 
the same basic issue for BDOOR_CMD_GETTIMEFULL regs->rdx (which will not 
be clipped, I don't think).
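
One way to make the comments and the code agree is to truncate explicitly
on both halves.  The following is a sketch in C with hypothetical helper
names (split_u64, join_u64), not a proposed change to the patch itself.

```c
#include <assert.h>
#include <stdint.h>

struct split64 {
    uint32_t hi;  /* bits 63..32 */
    uint32_t lo;  /* bits 31..0  */
};

/* Split a 64-bit value so that each half really is 32 bits wide. */
static struct split64 split_u64(uint64_t value)
{
    struct split64 s;

    s.hi = (uint32_t)(value >> 32);
    s.lo = (uint32_t)value;  /* explicit truncation to the low half */
    return s;
}

/* What a guest would do to reconstruct the value from two registers. */
static uint64_t join_u64(struct split64 s)
{
    return ((uint64_t)s.hi << 32) | s.lo;
}

/* Round-trip check: splitting and joining must give back the input. */
static void check_round_trip(uint64_t value)
{
    assert(join_u64(split_u64(value)) == value);
}
```

Assigning s.lo to a 64-bit register field then gives the same result for
32-bit and 64-bit guests, and a comment saying "Low part" is accurate.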

> +            break;
> +        case BDOOR_CMD_GETTIME:
> +            value = get_localtime_us(current->domain);
> +            /* hostUsecs */
> +            regs->rbx = (uint32_t)(value % 1000000UL);
> +            /* hostSecs */
> +            regs->rax = value / 1000000ULL;
> +            /* maxTimeLag */
> +            regs->rcx = 0;
> +            break;
> +        case BDOOR_CMD_GETTIMEFULL:
> +            value = get_localtime_us(current->domain);
> +            /* ... */
> +            regs->rax = BDOOR_MAGIC;
> +            /* hostUsecs */
> +            regs->rbx = (uint32_t)(value % 1000000UL);
> +            /* High part of hostSecs */
> +            regs->rsi = (uint32_t)((value / 1000000ULL) >> 32);
> +            /* Low part of hostSecs */
> +            regs->rdx = (uint32_t)(value / 1000000ULL);

Same here.

> +            /* maxTimeLag */
> +            regs->rcx = 0;
> +            break;
> +        case BDOOR_CMD_GETGUIOPTIONS:
> +            regs->rax = VMWARE_GUI_AUTO_GRAB | VMWARE_GUI_AUTO_UNGRAB |
> +                VMWARE_GUI_AUTO_RAISE_DISABLED | VMWARE_GUI_SYNC_TIME |
> +                VMWARE_DISABLE_CURSOR_OPTIONS;
> +            break;
> +        case BDOOR_CMD_SETGUIOPTIONS:
> +            regs->rax = 0x0;
> +            break;
> +        default:
> +            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
> +                           "VMware bytes=%d dir=%d cmd=%d",
> +                           bytes, dir, cmd);
> +            break;
> +        }
> +        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
> +                       "VMware after ip=%"PRIx64" cmd=%d ax=%"PRIx64" bx=%"
> +                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64" di=%"
> +                       PRIx64"\n",
> +                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
> +                       regs->rdx, regs->rsi, regs->rdi);
> +        if ( dir == IOREQ_READ )
> +        {
> +            switch ( bytes )
> +            {
> +            case 1:
> +                regs->rax = (saved_rax & 0xffffff00) | (regs->rax & 0xff);
> +                break;
> +            case 2:
> +                regs->rax = (saved_rax & 0xffff0000) | (regs->rax & 0xffff);
> +                break;
> +            case 4:
> +                regs->rax = (uint32_t)regs->rax;
> +                break;
> +            }
> +            *val = regs->rax;
> +        }
> +        else
> +            regs->rax = saved_rax;
> +    }
> +    else
> +    {
> +        rc = X86EMUL_UNHANDLEABLE;
> +        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
> +                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
> +                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64
> +                       " di=%"PRIx64"",
> +                       magic, BDOOR_MAGIC, regs->rip, regs->rax, regs->rbx,
> +                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
> +    }
> +
> +    return rc;
> +}
> +
> +int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
> +                    unsigned long *inst_len, unsigned long inst_addr,
> +                    unsigned long ei1, unsigned long ei2)

I've wondered, in another e-mail, whether it would make more sense to 
try to re-use the normal Xen emulation code, instead of duplicating its 
IO instruction decoding stuff here.

I think I probably wouldn't make that a blocker for acceptance, though.  
However...

> +{
> +    ASSERT(v->domain->arch.hvm_domain.is_vmware_port_enabled);

At the moment I think this ASSERT is misplaced; there are no checks for 
this anywhere in the handler path to this point.  If at any time in the 
future (or for any other reason) #GP exiting gets enabled (for example, 
if the introspection stuff wants to be notified on #GPs), you'll end up 
going through this path whether or not is_vmware_port_enabled is true.

I think you should instead "if(!is_vmware_port_enabled) return" here.  
That would effectively isolate these new changes from being able to 
introduce security issues for VMs which don't enable vmware_port, making 
it less risky to accept as-is.
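
A sketch of the difference in C.  The types and names are hypothetical
stand-ins, and standard assert() is used in place of Xen's ASSERT(); as far
as I know both are compiled out of non-debug builds, which is why a runtime
check is the stronger guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-domain setting. */
struct domain_cfg {
    bool vmware_port_enabled;
};

enum gp_result { GP_NOT_HANDLED = -1, GP_HANDLED = 0 };

/*
 * Early-return guard: domains that did not enable the feature never reach
 * the decoding logic, in debug and non-debug builds alike.
 */
static int gp_check(const struct domain_cfg *d, bool looks_like_backdoor_io)
{
    assert(d != NULL);  /* programming-error check only */

    if (!d->vmware_port_enabled)
        return GP_NOT_HANDLED;  /* caller re-injects the #GP as usual */

    return looks_like_backdoor_io ? GP_HANDLED : GP_NOT_HANDLED;
}
```

The caller treats GP_NOT_HANDLED the same way as any other unrecognised #GP,
so enabling the intercept for another purpose later does not change guest
behaviour for domains without vmware_port.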

[snip]

> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index fc1f882..6fe9389 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1102,6 +1102,8 @@ static int construct_vmcs(struct vcpu *v)
>   
>       v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
>                 | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
> +              | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
> +                 (1U << TRAP_gp_fault) : 0)
>                 | (1U << TRAP_no_device);

Probably not mandatory, but it might be nice to pull this bitmap logic 
into a function, so that you don't have the code duplication (here and 
in vmx_update_guest_cr()).
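
Something along these lines, as a rough C sketch.  The helper name and the
boolean parameters are hypothetical, and BASE_TRAP_MASK is only a
placeholder for HVM_TRAP_MASK, whose value is not shown in this thread; the
trap numbers are the architectural x86 vectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRAP_no_device    7   /* #NM */
#define TRAP_gp_fault    13   /* #GP */
#define TRAP_page_fault  14   /* #PF */
#define BASE_TRAP_MASK   (1u << 6)  /* placeholder for HVM_TRAP_MASK */

/* The placeholder must not collide with the bits computed below. */
static_assert((BASE_TRAP_MASK & ((1u << TRAP_no_device) |
                                 (1u << TRAP_gp_fault) |
                                 (1u << TRAP_page_fault))) == 0,
              "base mask overlaps conditional bits");

/* Single place that decides which exceptions cause a VM exit. */
static uint32_t exception_bitmap(bool hap, bool vmware_port)
{
    return BASE_TRAP_MASK
           | (hap ? 0 : (1u << TRAP_page_fault))
           | (vmware_port ? (1u << TRAP_gp_fault) : 0)
           | (1u << TRAP_no_device);
}
```

Both call sites would then ask the helper for the bitmap instead of
repeating the conditional expression.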

  -George


* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-22 21:22       ` Don Slutz
@ 2014-09-24 16:24         ` George Dunlap
  2014-09-24 18:25           ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 16:24 UTC (permalink / raw)
  To: Don Slutz, Andrew Cooper, Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit

On 09/22/2014 10:22 PM, Don Slutz wrote:
> On 09/22/14 12:34, Andrew Cooper wrote:
>> On 22/09/14 14:41, Ian Campbell wrote:
>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>> This new libxl_domain_create_info field is used to set
>>>> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.
>>> Does this really need to be a CDF, rather than a domctl/hvm param?
>> I have made the argument that many things which are currently HVM Params
>> should be CDF, as they absolutely should be set and immutable for the
>> entire lifetime of the domain.
>>
>>  From recollection, we have had several XSAs in the past which are
>> directly attributable to the toolstack or guest being able to play with
>> an (insufficiently locked down) HVM param after boot.
>>
>> Using a CDF avoids potential issues along these lines.
>
> It also allows setting up v->arch.hvm_vmx.exception_bitmap at
> the right time.  domctl/hvm params are set up much later in
> the life of a domain.

Isn't that already modified on a cr change (a la vmx_update_guest_cr())?

Or did you mean the SVM side?

I'm not making an argument either way (although at the moment I'm more 
sympathetic to Andy's view), just questioning whether setting the exit 
flag is that much of an argument one way or another.

  -George


* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-23 12:20       ` Ian Campbell
@ 2014-09-24 16:31         ` Don Slutz
  2014-09-24 16:44           ` George Dunlap
  2014-09-25 11:24           ` Ian Campbell
  0 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 16:31 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, George Dunlap, Ian Jackson, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/23/14 08:20, Ian Campbell wrote:
> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
>>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
>>> from the libxl PoV, I think.
>> I could not find "buildinfo.u.hvm":
>>
>>
>> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
>> dcs-xen-54:~/xen>
>>
>>
>> So unable to comment.
> It's in the idl, next to createinfo.

I take that to mean:


libxl_domain_config = Struct("domain_config", [
     ("c_info", libxl_domain_create_info),
     ("b_info", libxl_domain_build_info),
...

I.E.

b_info->u.hvm


>>> If yes then I still think you would want to set the default based on
>>> vmware-hw, wouldn't you?
>> I guess so since this is a BOOLEAN.
> defbool I hope.

Yes.

>>    Currently I do not know of a way to
>> say "set vmware_hw to 7
>> if vmware_port is true and vmware_hw is not specified".
> That's an error case, isn't it? Or at least a vmware_port is ignored
> case.

Nope.  But I will agree that I have not done a lot with 3 (at least)
state booleans.  The 3 states being true, false, and not specified.

And vmware_port is not ignored.

> What I suggested was "if vmware_hw is non-zero then set vmware_port".
>

I am reading that as "set vmware_port if not specified".  To avoid
complexity, I am treating vmware_hw as a boolean.  Using this
I get the following table:

_hw   _port
  0     0        Just like today
  1     0        Only cpuid leaves change -- very unlikely
  1     1        Full VMware mode
  0     1        VMware hyper call mode.

Adding U for unspecified:

_hw   _port
  U     U        ==> _hw=0 _port=0
  0     U        ==> _hw=0 _port=0
  1     U        The case in question.
  U     0        ==> _hw=0 _port=0
  U     1        What I was talking about.
  0     0        Just like today
  1     0        Only cpuid leaves change -- very unlikely
  1     1        Full VMware mode
  0     1        VMware hyper call mode.

The problem here is that vmware_hw is not a boolean and there is
currently no value that lets you know it has not been specified.

So I think it is just more confusing to have vmware_hw change
the default of vmware_port when the inverse is not true.

    -Don Slutz

>>    Which would be
>> the inverse.  I lean to
>> not having the default of vmware_port based on vmware_hw.


* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-24 16:31         ` Don Slutz
@ 2014-09-24 16:44           ` George Dunlap
  2014-09-24 18:29             ` Don Slutz
  2014-09-25 11:24           ` Ian Campbell
  1 sibling, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 16:44 UTC (permalink / raw)
  To: Don Slutz, Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/24/2014 05:31 PM, Don Slutz wrote:
> On 09/23/14 08:20, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
>>>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
>>>> from the libxl PoV, I think.
>>> I could not find "buildinfo.u.hvm":
>>>
>>>
>>> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
>>> dcs-xen-54:~/xen>
>>>
>>>
>>> So unable to comment.
>> It's in the idl, next to createinfo.
>
> I take that to mean:
>
>
> libxl_domain_config = Struct("domain_config", [
>     ("c_info", libxl_domain_create_info),
>     ("b_info", libxl_domain_build_info),
> ...
>
> I.E.
>
> b_info->u.hvm
>
>
>>>> If yes then I still think you would want to set the default based on
>>>> vmware-hw, wouldn't you?
>>> I guess so since this is a BOOLEAN.
>> defbool I hope.
>
> Yes.
>
>>>    Currently I do not know of a way to
>>> say "set vmware_hw to 7
>>> if vmware_port is true and vmware_hw is not specified".
>> That's an error case, isn't it? Or at least a vmware_port is ignored
>> case.
>
> Nope.  But I will agree that I have not done a lot with 3 (at least)
> state booleans.  The 3 states being true, false, and not specified.
>
> And vmware_port is not ignored.
>
>> What I suggested was "if vmware_hw is non-zero then set vmware_port".
>>
>
> I am reading that as "set vmware_port if not specified".  To avoid
> complexity, I am treating vmware_hw as a boolean.  Using this
> I get the following table:
>
> _hw   _port
>  0     0        Just like today
>  1     0        Only cpuid leaves change -- very unlikely
>  1     1        Full VMware mode
>  0     1        VMware hyper call mode.
>
> Adding U for unspecified:
>
> _hw   _port
>  U     U        ==> _hw=0 _port=0
>  0     U        ==> _hw=0 _port=0
>  1     U        The case in question.
>  U     0        ==> _hw=0 _port=0
>  U     1        What I was talking about.
>  0     0        Just like today
>  1     0        Only cpuid leaves change -- very unlikely
>  1     1        Full VMware mode
>  0     1        VMware hyper call mode.
>
> The problem here is that vmware_hw is not a boolean and there is
> currently not a value that lets you know it has not been specified.
>
> So I think it is just more confusing to have vmware_hw change
> the default of vmware_port but the inverse is not true.

So is it the case that if you specify vmware_hw with a value that your 
guest isn't expecting, it may not work?

I think the main thing Ian wants is probably a simple way for people to 
just turn everything on.  Having vmware_hw!=0 => vmware_port defaults to 
1 seems like a reasonable way to do that.

We could almost think of vmware_port as an "advanced" option that most 
people don't need to set: i.e., you only need to set it if you want one 
of the "unusual" modes (like CPUID-only or hypercall-only).
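
To make that rule concrete, here is a minimal sketch in C.  The enum and
function names are hypothetical (this is not libxl's defbool API); it only
illustrates "an explicit vmware_port setting wins, otherwise follow
vmware_hw".

```c
#include <stdbool.h>

/* Hypothetical tri-state mirroring the "true / false / not specified"
 * states discussed in this thread; not libxl's actual defbool type. */
enum tristate { TS_UNSPECIFIED, TS_FALSE, TS_TRUE };

/* Resolve the effective vmware_port setting.
 * An explicit value always wins; if unspecified, default to enabled
 * exactly when vmware_hw is non-zero. */
static bool resolve_vmware_port(unsigned int vmware_hw, enum tristate port)
{
    if (port != TS_UNSPECIFIED)
        return port == TS_TRUE;
    return vmware_hw != 0;
}
```

With this rule, "vmware_hw=7" alone gives full VMware mode, while the
CPUID-only and hypercall-only modes still need vmware_port set explicitly.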

  -George


* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-24 16:01   ` George Dunlap
@ 2014-09-24 16:48     ` Don Slutz
  2014-09-24 17:42       ` Andrew Cooper
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-24 16:48 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/24/14 12:01, George Dunlap wrote:
> On 09/20/2014 07:07 PM, Don Slutz wrote:
>> This includes adding is_vmware_port_enabled
>>
>> This is a new domain_create() flag, DOMCRF_vmware_port.  It is
>> passed to domctl as XEN_DOMCTL_CDF_vmware_port.
>>
>> This enables limited support of VMware's hyper-call.
>>
>> This is both a more complete support than is currently provided by
>> QEMU and/or KVM and less.  The missing part requires QEMU changes
>> and has been left out until the QEMU patches are accepted upstream.
>>
>> VMware's hyper-call is also known as VMware Backdoor I/O Port.
>>
>> Note: this support does not depend on vmware_hw being non-zero.
>>
>> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
>> to port 0x5658 specially.  Note: since many operations return data
>> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
>> "in (%dx),%al" will still do things, only AL part of EAX will be
>> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
>> unchanged.
>>
>> Also this instruction is allowed to be used from ring 3.  To
>> support this the vmexit for GP needs to be enabled.  I have not
>> fully tested that nested HVM is doing the right thing for this.
>>
>> An open source example of using this is:
>>
>> http://open-vm-tools.sourceforge.net/
>>
>> Which only uses "inl (%dx)".  Also
>>
>> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458 
>>
>>
>> The support included is enough to allow VMware tools to install in a
>> HVM domU.
>>
>> For a debug=y build there is a new command line option
>> vmport_debug=.  It enables output to the console of various
>> stages of handling the "in (%dx)" instruction.
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>
> [snip]
>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index 7b1dfe6..e2e4aad 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -510,6 +510,8 @@ int arch_domain_create(struct domain *d, unsigned 
>> int domcr_flags)
>>       d->arch.hvm_domain.mem_sharing_enabled = 0;
>>         d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
>> +    d->arch.hvm_domain.is_vmware_port_enabled =
>> +        (domcr_flags & DOMCRF_vmware_port);
>
> Should this be "!!(domcr..."?
>

I do not think it is needed, but happy to change to that.
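
For reference, a small sketch of when the "!!" makes a difference.  The
flag value and the 8-bit storage type below are assumptions chosen for
illustration; neither the real value of DOMCRF_vmware_port nor the type
of is_vmware_port_enabled is shown in this thread, so this may not apply
to the actual field.

```c
#include <stdint.h>

/* Hypothetical flag bit above bit 7, for illustration only. */
#define EXAMPLE_FLAG (1u << 9)

/* Storing the masked value directly into an 8-bit field truncates it:
 * bit 9 does not fit, so the result is 0 even when the flag is set. */
static uint8_t store_raw(unsigned int flags)
{
    return (uint8_t)(flags & EXAMPLE_FLAG);
}

/* "!!" first collapses any non-zero value to 1, which always fits. */
static uint8_t store_normalized(unsigned int flags)
{
    return !!(flags & EXAMPLE_FLAG);
}
```

If the destination is a C99 _Bool the conversion already normalizes, in
which case the "!!" is only a matter of style.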

>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c 
>> b/xen/arch/x86/hvm/vmware/vmport.c
>> new file mode 100644
>> index 0000000..811c303
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/vmware/vmport.c
>> @@ -0,0 +1,326 @@
>> +/*
>> + * HVM VMPORT emulation
>> + *
>> + * Copyright (C) 2012 Verizon Corporation
>> + *
>> + * This file is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License Version 2 (GPLv2)
>> + * as published by the Free Software Foundation.
>> + *
>> + * This file is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details. 
>> <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/config.h>
>> +#include <xen/lib.h>
>> +#include <asm/hvm/hvm.h>
>> +#include <asm/hvm/support.h>
>> +#include <asm/hvm/vmport.h>
>> +
>> +#include "backdoor_def.h"
>> +#include "guest_msg_def.h"
>> +
>> +#define MAX_INST_LEN 15
>> +
>> +#ifndef NDEBUG
>> +unsigned int opt_vmport_debug __read_mostly;
>> +integer_param("vmport_debug", opt_vmport_debug);
>> +#endif
>> +
>> +/* More VMware defines */
>> +
>> +#define VMWARE_GUI_AUTO_GRAB              0x001
>> +#define VMWARE_GUI_AUTO_UNGRAB            0x002
>> +#define VMWARE_GUI_AUTO_SCROLL            0x004
>> +#define VMWARE_GUI_AUTO_RAISE             0x008
>> +#define VMWARE_GUI_EXCHANGE_SELECTIONS    0x010
>> +#define VMWARE_GUI_WARP_CURSOR_ON_UNGRAB  0x020
>> +#define VMWARE_GUI_FULL_SCREEN            0x040
>> +
>> +#define VMWARE_GUI_TO_FULL_SCREEN         0x080
>> +#define VMWARE_GUI_TO_WINDOW              0x100
>> +
>> +#define VMWARE_GUI_AUTO_RAISE_DISABLED    0x200
>> +
>> +#define VMWARE_GUI_SYNC_TIME              0x400
>> +
>> +/* When set, toolboxes should not show the cursor options page. */
>> +#define VMWARE_DISABLE_CURSOR_OPTIONS     0x800
>> +
>> +void vmport_register(struct domain *d)
>> +{
>> +    register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
>> +}
>> +
>> +int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t 
>> *val)
>> +{
>> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +    uint32_t cmd = regs->rcx & 0xffff;
>> +    uint32_t magic = regs->rax;
>> +    int rc = X86EMUL_OKAY;
>> +
>> +    if ( magic == BDOOR_MAGIC )
>> +    {
>> +        uint64_t saved_rax = regs->rax;
>> +        uint64_t value;
>> +
>> +        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
>> +                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" 
>> cmd=%d ax=%"
>> +                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" 
>> dx=%"PRIx64" si=%"
>> +                       PRIx64" di=%"PRIx64"\n", dir, bytes,
>> +                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
>> +                       regs->rdx, regs->rsi, regs->rdi);
>> +        switch ( cmd )
>> +        {
>> +        case BDOOR_CMD_GETMHZ:
>> +            /* ... */
>> +            regs->rbx = BDOOR_MAGIC;
>> +            regs->rax = current->domain->arch.tsc_khz / 1000;
>> +            break;
>> +        case BDOOR_CMD_GETVERSION:
>> +            /* ... */
>> +            regs->rbx = BDOOR_MAGIC;
>> +            /* VERSION_MAGIC */
>> +            regs->rax = 6;
>> +            /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
>> +            regs->rcx = 2;
>> +            break;
>> +        case BDOOR_CMD_GETSCREENSIZE:
>> +            /* We have no screen size */
>> +            regs->rax = 0;
>> +            break;
>> +        case BDOOR_CMD_GETHWVERSION:
>> +            /* vmware_hw */
>> +            regs->rax = 0;
>> +            if ( is_hvm_vcpu(current) )
>> +            {
>> +                struct hvm_domain *hd = 
>> &current->domain->arch.hvm_domain;
>> +
>> +                regs->rax = hd->params[HVM_PARAM_VMWARE_HW];
>> +            }
>> +            if ( !regs->rax )
>> +                regs->rax = 4;  /* Act like version 4 */
>> +            break;
>> +        case BDOOR_CMD_GETHZ:
>> +            value = current->domain->arch.tsc_khz * 1000;
>> +            /* apic-frequency (bus speed) */
>> +            regs->rcx = (uint32_t)(1000000000ULL / APIC_BUS_CYCLE_NS);
>> +            /* High part of tsc-frequency */
>> +            regs->rbx = (uint32_t)(value >> 32);
>> +            /* Low part of tsc-frequency */
>> +            regs->rax = value;
>
> Either the comment or the code here is wrong -- this is clearly not 
> the lower 32 bits, at least on 64-bit guests. :-)
>

Oops, it should have included the (uint32_t) cast also.  Will fix.
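
As a minimal sketch of the intended split (the helper name is
illustrative and not part of the patch), both halves get an explicit
32-bit truncation:

```c
#include <stdint.h>

/* Split a 64-bit value into the two 32-bit halves that a guest reads
 * from separate registers.  Illustrative helper, not from the patch. */
static void split_u64(uint64_t value, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(value >> 32);  /* high part */
    *lo = (uint32_t)value;          /* low part, explicitly truncated */
}
```

For example, a frequency of 5000000000 Hz (0x12A05F200) splits into a
high part of 1 and a low part of 0x2A05F200.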

> If the code is right -- that is, if a 32-bit guest find this truncated 
> automatically, but a 64-bit guest find all 64 bits here (and thus not 
> have to reconstruct it) -- you should make the comment more 
> informative; for example:
>  /* On 32-bit systems this will be the lower 32 bits.  64-bit systems 
> can just use the full value from rax. */
>
> (Word-wrapped, of course.)
>
> Hmm -- looks like regs->rax will be clipped to 32 bits for a 4-byte IO 
> read?  In which case the comment here should reflect this, but you 
> have the same basic issue for BDOOR_CMD_GETTIMEFUL regs->rdx (which 
> will not be clipped, I don't think).
>

It will also be "adjusted" for 2- or 1-byte IO.  rdx does not get clipped
later, but it is already clipped to 32 bits by the cast (see below).
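
The adjustment for IOREQ_READ can be sketched as a standalone function.
This mirrors the switch on "bytes" in the quoted patch; the function name
and the default case (leave the value alone for other sizes) are my
assumptions for the sketch.

```c
#include <stdint.h>

/* Merge a command result into the guest's previous rax according to
 * the access size of the IN instruction, as the quoted patch does. */
static uint64_t merge_in_result(uint64_t saved_rax, uint64_t result,
                                unsigned int bytes)
{
    switch ( bytes )
    {
    case 1:  /* only AL changes; the 32-bit mask also clears bits 63:32 */
        return (saved_rax & 0xffffff00u) | (result & 0xffu);
    case 2:  /* only AX changes */
        return (saved_rax & 0xffff0000u) | (result & 0xffffu);
    case 4:  /* all of EAX comes from the result */
        return (uint32_t)result;
    default: /* assumption: other sizes are passed through unchanged */
        return result;
    }
}
```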

>> +            break;
>> +        case BDOOR_CMD_GETTIME:
>> +            value = get_localtime_us(current->domain);
>> +            /* hostUsecs */
>> +            regs->rbx = (uint32_t)(value % 1000000UL);
>> +            /* hostSecs */
>> +            regs->rax = value / 1000000ULL;
>> +            /* maxTimeLag */
>> +            regs->rcx = 0;
>> +            break;
>> +        case BDOOR_CMD_GETTIMEFULL:
>> +            value = get_localtime_us(current->domain);
>> +            /* ... */
>> +            regs->rax = BDOOR_MAGIC;
>> +            /* hostUsecs */
>> +            regs->rbx = (uint32_t)(value % 1000000UL);
>> +            /* High part of hostSecs */
>> +            regs->rsi = (uint32_t)((value / 1000000ULL) >> 32);
>> +            /* Low part of hostSecs */
>> +            regs->rdx = (uint32_t)(value / 1000000ULL);
>
> Same here.
>

But the (uint32_t) cast does make it just 32 bits.
>> +            /* maxTimeLag */
>> +            regs->rcx = 0;
>> +            break;
>> +        case BDOOR_CMD_GETGUIOPTIONS:
>> +            regs->rax = VMWARE_GUI_AUTO_GRAB | VMWARE_GUI_AUTO_UNGRAB |
>> +                VMWARE_GUI_AUTO_RAISE_DISABLED | VMWARE_GUI_SYNC_TIME |
>> +                VMWARE_DISABLE_CURSOR_OPTIONS;
>> +            break;
>> +        case BDOOR_CMD_SETGUIOPTIONS:
>> +            regs->rax = 0x0;
>> +            break;
>> +        default:
>> +            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
>> +                           "VMware bytes=%d dir=%d cmd=%d",
>> +                           bytes, dir, cmd);
>> +            break;
>> +        }
>> +        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
>> +                       "VMware after ip=%"PRIx64" cmd=%d 
>> ax=%"PRIx64" bx=%"
>> +                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" 
>> si=%"PRIx64" di=%"
>> +                       PRIx64"\n",
>> +                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
>> +                       regs->rdx, regs->rsi, regs->rdi);
>> +        if ( dir == IOREQ_READ )
>> +        {
>> +            switch ( bytes )
>> +            {
>> +            case 1:
>> +                regs->rax = (saved_rax & 0xffffff00) | (regs->rax & 
>> 0xff);
>> +                break;
>> +            case 2:
>> +                regs->rax = (saved_rax & 0xffff0000) | (regs->rax & 
>> 0xffff);
>> +                break;
>> +            case 4:
>> +                regs->rax = (uint32_t)regs->rax;
>> +                break;
>> +            }
>> +            *val = regs->rax;
>> +        }
>> +        else
>> +            regs->rax = saved_rax;
>> +    }
>> +    else
>> +    {
>> +        rc = X86EMUL_UNHANDLEABLE;
>> +        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
>> +                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
>> +                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" 
>> si=%"PRIx64
>> +                       " di=%"PRIx64"",
>> +                       magic, BDOOR_MAGIC, regs->rip, regs->rax, 
>> regs->rbx,
>> +                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>> +int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
>> +                    unsigned long *inst_len, unsigned long inst_addr,
>> +                    unsigned long ei1, unsigned long ei2)
>
> I've wondered, in another e-mail, whether it would make more sense to 
> try to re-use the normal Xen emulation code, instead of duplicating 
> its IO instruction decoding stuff here.
>

Since this is only called on a #GP, I do not want to attempt to emulate
the instruction in the normal way.  In fact the normal Xen emulation
should say "do a #GP", not do the VMware hypercall.

I will say that I had not looked into getting the normal Xen emulation
"fixed" for this case.  In all my testing, I have not seen this issue.


With the patch:

Subject: [Xen-devel] [PATCH 5/6] x86/hvm: Forced Emulation Prefix for debug
	builds of Xen
Message-ID: <1411484611-31027-6-git-send-email-andrew.cooper3@citrix.com>


I need to look into this.




> I think I probably wouldn't make that a blocker for acceptance, 
> though.  However...
>

Thanks.

>> +{
>> + ASSERT(v->domain->arch.hvm_domain.is_vmware_port_enabled);
>
> At the moment I think this ASSERT is misplaced; there are no checks 
> for this anywhere in the handler path to this point.  If at any time 
> in the future (or for any other reason) #GP exiting gets enabled (for 
> example, if the introspection stuff wants to be notifed on #GPs), 
> you'll end up going through this path whether or not 
> is_vmware_port_enabled is true.
>
> I think you should instead "if(!is_vmware_port_enabled) return" here.  
> That would effectively isolate these new changes from being able to 
> introduce security issues for VMs which don't enable vmware_port, 
> making it less risky to accept as-is.
>

Ok, it was a return in an older version.  Happy to switch back.

> [snip]
>
>> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
>> index fc1f882..6fe9389 100644
>> --- a/xen/arch/x86/hvm/vmx/vmcs.c
>> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
>> @@ -1102,6 +1102,8 @@ static int construct_vmcs(struct vcpu *v)
>>         v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
>>                 | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
>> +              | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
>> +                 (1U << TRAP_gp_fault) : 0)
>>                 | (1U << TRAP_no_device);
>
> Probably not mandatory, but it might be nice to pull this bitmap logic 
> into a function, so that you don't have the code duplication (here and 
> in vmx_update_guest_cr()).
>

Ok, will keep it in mind.

    -Don Slutz

>  -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-23 12:30               ` Ian Campbell
  2014-09-23 12:35                 ` George Dunlap
  2014-09-24 15:52                 ` George Dunlap
@ 2014-09-24 17:19                 ` Don Slutz
  2014-09-24 20:21                   ` Konrad Rzeszutek Wilk
  2014-09-25 11:35                   ` Ian Campbell
  2 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 17:19 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/23/14 08:30, Ian Campbell wrote:
> On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:

[snip]

>>> I was only responding to the part of your comment in parentheses. :-)
>>>
>>> I suppose in large part it would depend on what the hypercalls were
>>> actually doing; I'd have to go back and look at them to say if they
>>> need to be in Xen or whether they could be passed on to qemu.
>>>
>> Clearly it is possible to pass the VCPU registers to QEMU, but that is
>> currently not done.
> I think there's an existing hypercall to get/set the state for a vcpu,
> perhaps it is too heavy weight to be used here though.

Yes, very heavy weight

> An alternative would be a semantically higher level I/O req which took a
> guest pointer to a key and a guest pointer to the buffer etc, without
> needing the registers themselves.

I am looking at adding a new I/O req type for this.  It turns out that
for vmware_port you need to pass six 32-bit values both ways.  And
I can overlap the .addr, .data, .count and .size for this.  The other
option is to increase the size of struct ioreq, which I am assuming
is not the way to go since it would reduce the max number of vcpus
as long as "struct shared_iopage" is limited to 1 page.

"guest pointer to a key and a guest pointer to the buffer" is not how
this works.  The data is all passed up to 4 bytes at each IN.  A string
(which is what guestinfo access looks like) is passed as a length, and
then each 4 bytes of the string. (I am not trying to say this is good.)
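
A rough sketch of that chunking, to show why there is no single guest
buffer to point at.  The helper names and the little-endian packing of
bytes into each 32-bit word are assumptions for illustration; this is
not the actual RPC code.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 4-byte transfers needed for a len-byte string. */
static size_t chunk_count(size_t len)
{
    return (len + 3) / 4;
}

/* Return the idx-th 32-bit chunk of buf, zero-padded past len.
 * Assumes the first byte goes in the low-order byte of the word. */
static uint32_t chunk_at(const char *buf, size_t len, size_t idx)
{
    uint32_t word = 0;
    size_t off = idx * 4;
    size_t i;

    for ( i = 0; i < 4 && off + i < len; i++ )
        word |= (uint32_t)(unsigned char)buf[off + i] << (8 * i);

    return word;
}
```

So a 5-byte string costs one transfer for the length plus two more for
the data, each one a separate IN with its own vmexit.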



>>    So a new
>> version of QEMU would also be needed to go this way.  None of the
>> proposed features need
>> any data from QEMU, so I do not think this makes sense.
> The concern is that it is adding a load of complex looking string and
> pointer manipulation stuff to the hypervisor, the sort of thing which
> often leads to security vulnerabilities.
>
> So that would be better done outside of Xen itself if possible, if a
> qemu update is the price for that then it doesn't seem so bad to me.

I have yet to come up with a good reason not to move the
VMware port RPC code into QEMU.  I will be looking to do that for
Xen 4.6 & QEMU 2.3.


Related to that, the code to connect Xen to QEMU so that Xen can
use any VMware support in QEMU is not that complex.  So adding
the Xen part in place of patches 8, 9, 10, 11, 12, 14, 15 and 16
looks doable.  This would allow X to use the VMware mouse
code (which is in both qemu-xen and qemu-xen-traditional).  I have
found this to be a great improvement in using a GUI in a guest
where the network speeds are not that fast.  I had planned
on adjusting the Xen to QEMU connector code for 4.6.

Also there is a good chance that the QEMU part could be upstreamed
to QEMU 2.2 (and backported to Xen's QEMU tree) for 4.5.

Now since I did not include this code sooner, would I need a release
exception to include the Xen to QEMU connector code?


One thing related to this: should I also change qemu-xen-traditional
to handle the new I/O req type, or only send it when using qemu-xen?

It is simple to allow a new QEMU to build with pre-4.5 Xen and post-4.5
Xen.  I have no idea of a good way to check that a QEMU binary has this
support.  However, I can say in the docs that enabling vmware_port
requires a QEMU with this support.


     -Don Slutz

> Ian.
>


* Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage
  2014-09-20 18:07 ` [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Don Slutz
@ 2014-09-24 17:27   ` George Dunlap
  2014-09-24 19:07     ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-24 17:27 UTC (permalink / raw)
  To: Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/20/2014 07:07 PM, Don Slutz wrote:
> Reduce the VMPORT_DBG_LOG calls.

You should also have mentioned that you added new HVMTRACE macros which 
will log the TSC value.

The reason the HVMTRACE macros don't log the TSC values is that for the 
most part you can get all the timing information you need from the TSC 
on the vmexit and vmenter.  Looking at where you've added the TSC 
values, I don't really see how it adds anything except bloat to the 
log.  Is there a reason you need to know exactly when these different 
things happened, instead of just being able to bracket them between 
VMENTER and VMEXITs?

>
> Signed-off-by: Don Slutz <dslutz@verizon.com>
> ---
> v6:
>        Dropped the attempt to use svm_nextrip_insn_length via
>        __get_instruction_length (added in v2).  Just always look
>        at upto 15 bytes on AMD.
>
> v5:
>        exitinfo1 is used twice.
>          Fixed.
>
>   xen/arch/x86/hvm/svm/svm.c       | 20 ++++++++++++++---
>   xen/arch/x86/hvm/vmware/vmport.c | 48 ++++++++++++++++++++++------------------
>   xen/arch/x86/hvm/vmx/vmx.c       | 12 ++++++++++
>   xen/include/asm-x86/hvm/trace.h  | 45 +++++++++++++++++++++++++++++++++++++
>   xen/include/asm-x86/hvm/vmport.h |  6 -----
>   xen/include/public/trace.h       | 12 ++++++++++
>   6 files changed, 113 insertions(+), 30 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index ea99dfb..716dda1 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2081,10 +2081,18 @@ static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>        */
>       unsigned long inst_len = 15;
>       unsigned long inst_addr = svm_rip2pointer(v);
> -    int rc;
> +    uint32_t starting_rdx = regs->rdx;
> +    int rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
> +                             vmcb->exitinfo1, vmcb->exitinfo2);
> +
> +    if ( hvm_long_mode_enabled(v) )
> +        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, starting_rdx,
> +                           TRC_PAR_LONG(vmcb->exitinfo1),
> +                           TRC_PAR_LONG(vmcb->exitinfo2));
> +    else
> +        HVMTRACE_C4D(TRAP_GP, inst_len, starting_rdx, vmcb->exitinfo1,
> +                     vmcb->exitinfo2);
>   
> -    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
> -                         vmcb->exitinfo1, vmcb->exitinfo2);
>       if ( !rc )
>           __update_guest_eip(regs, inst_len);
>       else
> @@ -2097,6 +2105,12 @@ static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>                          (unsigned long)vmcb->exitinfo2, regs->error_code,
>                          regs->rip, inst_addr, inst_len, regs->rax, regs->rbx,
>                          regs->rcx, regs->rdx, regs->rsi, regs->rdi);
> +        if ( hvm_long_mode_enabled(v) )
> +            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
> +                              TRC_PAR_LONG(inst_addr));
> +        else
> +            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
> +                         inst_addr);
>           hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
>       }
>   }
> diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
> index 811c303..962ee32 100644
> --- a/xen/arch/x86/hvm/vmware/vmport.c
> +++ b/xen/arch/x86/hvm/vmware/vmport.c
> @@ -18,6 +18,7 @@
>   #include <asm/hvm/hvm.h>
>   #include <asm/hvm/support.h>
>   #include <asm/hvm/vmport.h>
> +#include <asm/hvm/trace.h>
>   
>   #include "backdoor_def.h"
>   #include "guest_msg_def.h"
> @@ -66,12 +67,15 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>           uint64_t saved_rax = regs->rax;
>           uint64_t value;
>   
> -        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
> -                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" cmd=%d ax=%"
> -                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"
> -                       PRIx64" di=%"PRIx64"\n", dir, bytes,
> -                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
> -                       regs->rdx, regs->rsi, regs->rdi);
> +        if ( dir == IOREQ_READ )
> +            HVMTRACE_ND(VMPORT_READ_BEFORE, 0, 1/*cycles*/, 6,
> +                        regs->rax, regs->rbx, regs->rcx,
> +                        regs->rdx, regs->rsi, regs->rdi);
> +        else
> +            HVMTRACE_ND(VMPORT_WRITE_AFTER_BEFORE, 0, 1/*cycles*/, 6,
> +                        regs->rax, regs->rbx, regs->rcx,
> +                        regs->rdx, regs->rsi, regs->rdi);

Adding trace points in a separate patch is one thing, but adding code 
like this and then removing it in a later patch is really poor form; it 
could potentially make bisection difficult too, if (for example) the 
output is so verbose in that short window as to make it unusable between 
those changesets.

I think you should go back to the previous patches and remove all the 
VMPORT_DBG_LOG()s that don't survive until the end of the series.

Unless, that is, you think that you might be making the case to accept 
patches 1-5 for 4.5 without this patch; in which case it may make sense 
to leave it the way it is.

We normally don't log both BEFORE and AFTER states of things like 
hypercalls -- just logging the outcome of what the hypervisor did should 
be sufficient, shouldn't it?  Do you really need to know the value of 
things that got clobbered?  You've got tracing in the error paths for 
when things don't go as you expected.

Also, same comment with the cycles: I don't see any value in logging how 
long it took to get from the VMEXIT to here or from here to anywhere 
else; it just makes the log really bloated.

> +
>           switch ( cmd )
>           {
>           case BDOOR_CMD_GETMHZ:
> @@ -143,19 +147,17 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>               regs->rax = 0x0;
>               break;
>           default:
> -            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
> -                           "VMware bytes=%d dir=%d cmd=%d",
> -                           bytes, dir, cmd);
> +            HVMTRACE_ND(VMPORT_UNKNOWN, 0, 1/*cycles*/, 6,
> +                        (bytes << 8) + dir, cmd, regs->rbx,
> +                        regs->rcx, regs->rsi, regs->rdi);

You do realize the maximum number of bytes you can log is 7, not 6, 
right?  The macro stops at 6, but that's just where Keir got tired, I 
think; if you want to log more registers here you can extend it to 7.

Also, I think for clarity you should (bytes << 8) | dir rather than +dir.
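
Something along these lines (helper names made up for illustration, and
assuming dir fits in the low 8 bits) makes it obvious that two separate
fields are being packed rather than added:

```c
#include <stdint.h>

/* Pack the access size and direction into one 32-bit trace word:
 * dir in bits 7:0, bytes in the bits above. */
static uint32_t pack_bytes_dir(uint32_t bytes, uint32_t dir)
{
    return (bytes << 8) | (dir & 0xffu);
}

/* Recover the two fields when reading the trace back. */
static uint32_t unpack_bytes(uint32_t word) { return word >> 8; }
static uint32_t unpack_dir(uint32_t word)   { return word & 0xffu; }
```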

>               break;
>           }
> -        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
> -                       "VMware after ip=%"PRIx64" cmd=%d ax=%"PRIx64" bx=%"
> -                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64" di=%"
> -                       PRIx64"\n",
> -                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
> -                       regs->rdx, regs->rsi, regs->rdi);
> +
>           if ( dir == IOREQ_READ )
>           {
> +            HVMTRACE_ND(VMPORT_READ_AFTER, 0, 1/*cycles*/, 6,
> +                        regs->rax, regs->rbx, regs->rcx,
> +                        regs->rdx, regs->rsi, regs->rdi);
>               switch ( bytes )
>               {
>               case 1:
> @@ -171,17 +173,21 @@ int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
>               *val = regs->rax;
>           }
>           else
> +        {
> +            HVMTRACE_ND(VMPORT_WRITE_AFTER, 0, 1/*cycles*/, 6,
> +                        regs->rax, regs->rbx, regs->rcx,
> +                        regs->rdx, regs->rsi, regs->rdi);
>               regs->rax = saved_rax;
> +        }
>       }
>       else
>       {
> +        if ( hvm_long_mode_enabled(current) )
> +            HVMTRACE_LONG_C4D(VMPORT_BAD, dir, bytes, regs->rax,
> +                              TRC_PAR_LONG(regs->rip));
> +        else
> +            HVMTRACE_C4D(VMPORT_BAD, dir, bytes, regs->rax, regs->rip);
>           rc = X86EMUL_UNHANDLEABLE;
> -        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
> -                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
> -                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" si=%"PRIx64
> -                       " di=%"PRIx64"",
> -                       magic, BDOOR_MAGIC, regs->rip, regs->rax, regs->rbx,
> -                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
>       }
>   
>       return rc;
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 73f55f2..5395028 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2613,6 +2613,12 @@ static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
>       __vmread(VM_EXIT_INSTRUCTION_LEN, &inst_len);
>       __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
>   
> +    if ( hvm_long_mode_enabled(v) )
> +        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, regs->rdx, TRC_PAR_LONG(ecode),
> +                           TRC_PAR_LONG(exit_qualification));
> +    else
> +        HVMTRACE_C4D(TRAP_GP, inst_len, regs->rdx, ecode, exit_qualification);

Do you think anyone will need this 2 years from now?  That is, will this 
actually be useful in understanding guest behavior, or is this mostly to 
help you debug the hypervisor as you're developing it?

I'd like to say more about my general theory for traces, but my brain 
has about shut down... I'll send this so you can have the comments I've 
got so far, and I'll come back to it tomorrow.

Just one more thing...

> +
>   #ifndef NDEBUG
>       orig_inst_len = inst_len;
>   #endif
> @@ -2636,6 +2642,12 @@ static void vmx_vmexit_gp_intercept(struct cpu_user_regs *regs,
>                          regs->rip, inst_addr, orig_inst_len, inst_len,
>                          regs->rax, regs->rbx, regs->rcx, regs->rdx, regs->rsi,
>                          regs->rdi);
> +        if ( hvm_long_mode_enabled(v) )
> +            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
> +                              TRC_PAR_LONG(inst_addr));
> +        else
> +            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, regs->rcx,
> +                         inst_addr);
>           hvm_inject_hw_exception(TRAP_gp_fault, ecode);
>       }
>   }
> diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
> index de802a6..8af2d6a 100644
> --- a/xen/include/asm-x86/hvm/trace.h
> +++ b/xen/include/asm-x86/hvm/trace.h
> @@ -52,8 +52,20 @@
>   #define DO_TRC_HVM_LMSW64      DEFAULT_HVM_MISC
>   #define DO_TRC_HVM_REALMODE_EMULATE DEFAULT_HVM_MISC
>   #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_TRAP64           DEFAULT_HVM_MISC
>   #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
>   #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_TRAP_GP          DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_TRAP_GP64        DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_TRAP_GP_UNKNOWN  DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_TRAP_GP_UNKNOWN64 DEFAULT_HVM_MISC
> +#define DO_TRC_HVM_VMPORT_READ_BEFORE DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_WRITE_AFTER_BEFORE DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_READ_AFTER DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_WRITE_AFTER DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_BAD         DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_BAD64       DEFAULT_HVM_IO
> +#define DO_TRC_HVM_VMPORT_UNKNOWN     DEFAULT_HVM_IO
>   
>   
>   #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
> @@ -98,6 +110,21 @@
>   #define HVMTRACE_0D(evt)                            \
>       HVMTRACE_ND(evt, 0, 0, 0,  0,  0,  0,  0,  0,  0)
>   
> +#define HVMTRACE_C6D(evt, d1, d2, d3, d4, d5, d6)    \
> +    HVMTRACE_ND(evt, 0, 1, 6, d1, d2, d3, d4, d5, d6)
> +#define HVMTRACE_C5D(evt, d1, d2, d3, d4, d5)        \
> +    HVMTRACE_ND(evt, 0, 1, 5, d1, d2, d3, d4, d5,  0)
> +#define HVMTRACE_C4D(evt, d1, d2, d3, d4)            \
> +    HVMTRACE_ND(evt, 0, 1, 4, d1, d2, d3, d4,  0,  0)
> +#define HVMTRACE_C3D(evt, d1, d2, d3)                \
> +    HVMTRACE_ND(evt, 0, 1, 3, d1, d2, d3,  0,  0,  0)
> +#define HVMTRACE_C2D(evt, d1, d2)                    \
> +    HVMTRACE_ND(evt, 0, 1, 2, d1, d2,  0,  0,  0,  0)
> +#define HVMTRACE_C1D(evt, d1)                        \
> +    HVMTRACE_ND(evt, 0, 1, 1, d1,  0,  0,  0,  0,  0)
> +#define HVMTRACE_C0D(evt)                            \
> +    HVMTRACE_ND(evt, 0, 1, 0,  0,  0,  0,  0,  0,  0)
> +
>   #define HVMTRACE_LONG_1D(evt, d1)                  \
>                      HVMTRACE_2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) >> 32)
>   #define HVMTRACE_LONG_2D(evt, d1, d2, ...)              \
> @@ -107,6 +134,24 @@
>   #define HVMTRACE_LONG_4D(evt, d1, d2, d3, d4, ...)  \
>                      HVMTRACE_5D(evt ## 64, d1, d2, d3, d4)
>   
> +#define HVMTRACE_LONG_C1D(evt, d1)                  \
> +                   HVMTRACE_C2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) >> 32)
> +#define HVMTRACE_LONG_C2D(evt, d1, d2, ...)              \
> +                   HVMTRACE_C3D(evt ## 64, d1, d2)
> +#define HVMTRACE_LONG_C3D(evt, d1, d2, d3, ...)      \
> +                   HVMTRACE_C4D(evt ## 64, d1, d2, d3)
> +#define HVMTRACE_LONG_C4D(evt, d1, d2, d3, d4, ...)  \
> +                   HVMTRACE_C5D(evt ## 64, d1, d2, d3, d4)
> +#define HVMTRACE_LONG_C5D(evt, d1, d2, d3, d4, d5, ...) \
> +                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4, d5)
> +
> +#define HVMTRACE_LONG2_C2D(evt, d1, d2, ...)              \
> +                   HVMTRACE_C4D(evt ## 64, d1, d2)
> +#define HVMTRACE_LONG2_C3D(evt, d1, d2, d3, ...)      \
> +                   HVMTRACE_C5D(evt ## 64, d1, d2, d3)
> +#define HVMTRACE_LONG2_C4D(evt, d1, d2, d3, d4, ...)  \
> +                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4)
> +
>   #endif /* __ASM_X86_HVM_TRACE_H__ */
>   
>   /*
> diff --git a/xen/include/asm-x86/hvm/vmport.h b/xen/include/asm-x86/hvm/vmport.h
> index c4f3926..401cbf4 100644
> --- a/xen/include/asm-x86/hvm/vmport.h
> +++ b/xen/include/asm-x86/hvm/vmport.h
> @@ -25,12 +25,6 @@
>   #define VMPORT_LOG_VGP_UNKNOWN     (1 << 3)
>   #define VMPORT_LOG_REALMODE_GP     (1 << 4)
>   
> -#define VMPORT_LOG_GP_NOT_VMWARE   (1 << 9)
> -
> -#define VMPORT_LOG_TRACE           (1 << 16)
> -#define VMPORT_LOG_ERROR           (1 << 17)
> -#define VMPORT_LOG_VMWARE_AFTER    (1 << 18)
> -

If you remove the debug statements in earlier patches, remember to 
remove these as well.

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-24 16:48     ` Don Slutz
@ 2014-09-24 17:42       ` Andrew Cooper
  0 siblings, 0 replies; 93+ messages in thread
From: Andrew Cooper @ 2014-09-24 17:42 UTC (permalink / raw)
  To: Don Slutz, George Dunlap, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Ian Jackson, Eddie Dong, Tim Deegan, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 24/09/14 17:48, Don Slutz wrote:
> On 09/24/14 12:01, George Dunlap wrote:
>> On 09/20/2014 07:07 PM, Don Slutz wrote:
>>
>>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>>> index 7b1dfe6..e2e4aad 100644
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -510,6 +510,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>>>       d->arch.hvm_domain.mem_sharing_enabled = 0;
>>>         d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
>>> +    d->arch.hvm_domain.is_vmware_port_enabled =
>>> +        (domcr_flags & DOMCRF_vmware_port);
>>
>> Should this be "!!(domcr..."?
>>
>
> I do not think it is needed, but happy to change to that.

Sadly it is needed, as bool_t isn't of type _Bool as one would expect.
It is int8_t, because Xen's bool_t pre-dates the general acceptance of
using header files such as <stdbool.h>.

~Andrew

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-24 15:52                 ` George Dunlap
@ 2014-09-24 18:09                   ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 18:09 UTC (permalink / raw)
  To: George Dunlap, Ian Campbell, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Stefano Stabellini,
	Tim Deegan, Eddie Dong, xen-devel, Aravind Gopalakrishnan,
	Jan Beulich, Andrew Cooper, Suravee Suthikulpanit,
	Boris Ostrovsky, Ian Jackson


On 09/24/14 11:52, George Dunlap wrote:
> On 09/23/2014 01:30 PM, Ian Campbell wrote:
>> On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
>>>>> It sounds plausible, for sure.
>>>>>
>>>>> Even so, why can't the result of that #GP be a calldown into qemu for
>>>>> further processing?
>>> This is not simple in that QEMU does not have access to the VCPU
>>> registers.  Unlike a normal
>>> I/O request, vmware_port (aka vmport) both reads and writes VCPU 
>>> registers.
>> Are you saying that emulating a normal in or out instruction doesn't
>> require accessing vcpu registers? Are you sure? Surely it needs to
>> either read the source or write the destination register somehow.
>>
>>>> I was only responding to the part of your comment in parentheses. :-)
>>>>
>>>> I suppose in large part it would depend on what the hypercalls were
>>>> actually doing; I'd have to go back and look at them to say if they
>>>> need to be in Xen or whether they could be passed on to qemu.
>>>>
>>> Clearly it is possible to pass the VCPU registers to QEMU, but that is
>>> currently not done.
>> I think there's an existing hypercall to get/set the state for a vcpu,
>> perhaps it is too heavy weight to be used here though.
>>
>> An alternative would be a semantically higher level I/O req which took a
>> guest pointer to a key and a guest pointer to the buffer etc, without
>> needing the registers themselves.
>>
>>>    So a new
>>> version of QEMU would also be needed to go this way.  None of the
>>> proposed features need any data from QEMU, so I do not think this
>>> makes sense.
>> The concern is that it is adding a load of complex looking string and
>> pointer manipulation stuff to the hypervisor, the sort of thing which
>> often leads to security vulnerabilities.
>
> Do you mean the instruction decoding in vmware_gp_check()?
>

I do not think so.  I think this is a reference to all the new code in

[PATCH 08/16] xen: Add limited support of VMware's hyper-call rpc


> I was wondering how hard it would be to use the generic emulation 
> code.  We already have to emulate IO instructions anyway.  This is 
> very complicated code, and having it duplicated in two places seems 
> like it's just asking for someone to update the one and forget to 
> update the other, opening up a bug / security vulnerability.
>

I did reply to some of this on a different thread.  The key point is that
the current IO instruction emulation would be reporting #GP, which
is not what is needed.  Also all I see is a decode and emulate.  What
I need is just a decode.  The closest to just a decode is
__get_instruction_length_from_list() (an AMD-only function...), which
has the issue of only returning the length of the instruction (and
not any of the decoding done).  As I said in svm_vmexit_gp_intercept():


     /*
      * Just use 15 for the instruction length; vmport_gp_check will
      * adjust it.  This is because
      * __get_instruction_length_from_list() has issues, and may
      * require a double read of the instruction bytes.  At some
      * point a new routine could be added that is based on the code
      * in vmport_gp_check with extensions to make it more general.
      * Since that routine is the only user of this code this can be
      * done later.
      */

So I do not know of any code that could be shared.



> The other question would be whether doing it in qemu would be fast 
> enough, or if there would be information needed by the hypercall 
> that's not available; things like GETTIME / GETTIMEFULL / GETHZ.
>

I think it would be fast enough.  But I also do not see any need to send
the simple ones you listed to QEMU for processing, only the ones
that need (or could use) QEMU, like the RPC ones.


> On the other hand, things like GETSCREENSIZE and GETGUIOPTIONS 
> probably *are* better handled by qemu.
>

Yes.  And that includes the vmware mouse support.

    -Don Slutz

>  -George
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-24 16:24         ` George Dunlap
@ 2014-09-24 18:25           ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 18:25 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, Andrew Cooper, Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Suravee Suthikulpanit


On 09/24/14 12:24, George Dunlap wrote:
> On 09/22/2014 10:22 PM, Don Slutz wrote:
>> On 09/22/14 12:34, Andrew Cooper wrote:
>>> On 22/09/14 14:41, Ian Campbell wrote:
>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>> This new libxl_domain_create_info field is used to set
>>>>> XEN_DOMCTL_CDF_vmware_port for the xc_domain_create() routine.
>>>> Does this really need to be a CDF, rather than a domctl/hvm param?
>>> I have made the argument that many things which are currently HVM 
>>> Params
>>> should be CDF, as they absolutely should be set and immutable for the
>>> entire lifetime of the domain.
>>>
>>>  From recollection, we have had several XSAs in the past which are
>>> directly attributable to the toolstack or guest being able to play with
>>> an (insufficiently locked down) HVM param after boot.
>>>
>>> Using a CDF avoids potential issues along these lines.
>>
>> It also allows setting up v->arch.hvm_vmx.exception_bitmap at
>> the right time.  domctl/hvm params are set up much later in
>> the life of a domain.
>
> Isn't that already modified on a cr change (a la vmx_update_guest_cr())?
>


The following is not true for my testing:

         if ( (!vmx_unrestricted_guest(v)) &&
              (realmode != v->arch.hvm_vmx.vmx_realmode) )
         {


vmx_unrestricted_guest() is true.

> Or did you mean the SVM side?
>

Also needed there.

> I'm not making an argument either way (although at the moment I'm more 
> sympathetic to Andy's view), just questioning whether setting the exit 
> flag is that much of an argument one way or another.
>

Since Andy and the exit flag are saying the same thing, I do not
care which is a better argument.  (I.E. the way it is coded in this
patch).


    -Don Slutz

>  -George
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-24 16:44           ` George Dunlap
@ 2014-09-24 18:29             ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 18:29 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, Ian Campbell
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/24/14 12:44, George Dunlap wrote:
> On 09/24/2014 05:31 PM, Don Slutz wrote:
>> On 09/23/14 08:20, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
>>>>> The latter would allow moving to buildinfo.u.hvm, which would be 
>>>>> nicer
>>>>> from the libxl PoV, I think.
>>>> I could not find "buildinfo.u.hvm":
>>>>
>>>>
>>>> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
>>>> dcs-xen-54:~/xen>
>>>>
>>>>
>>>> So unable to comment.
>>> It's in the idl, next to createinfo.
>>
>> I take that to mean:
>>
>>
>> libxl_domain_config = Struct("domain_config", [
>>     ("c_info", libxl_domain_create_info),
>>     ("b_info", libxl_domain_build_info),
>> ...
>>
>> I.E.
>>
>> b_info->u.hvm
>>
>>
>>>>> If yes then I still think you would want to set the default based on
>>>>> vmware-hw, wouldn't you?
>>>> I guess so since this is a BOOLEAN.
>>> defbool I hope.
>>
>> Yes.
>>
>>>>    Currently I do not know of a way to
>>>> say "set vmware_hw to 7
>>>> if vmware_port is true and vmware_hw is not specified".
>>> That's an error case, isn't it? Or at least a vmware_port is ignored
>>> case.
>>
>> Nope.  But I will agree that I have not done a lot with 3 (at least)
>> state booleans.  The 3 states being true, false, and not specified.
>>
>> And vmware_port is not ignored.
>>
>>> What I suggested was "if vmware_hw is non-zero then set vmware_port".
>>>
>>
>> I am reading that as "set vmware_port if not specified".  To avoid
>> complexity, I am treating vmware_hw as a boolean.  Using this
>> I get the following table:
>>
>> _hw   _port
>>  0     0        Just like today
>>  1     0        Only cpuid leaves change -- very unlikely
>>  1     1        Full VMware mode
>>  0     1        VMware hyper call mode.
>>
>> Adding U for unspecified:
>>
>> _hw   _port
>>  U     U        ==> _hw=0 _port=0
>>  0     U        ==> _hw=0 _port=0
>>  1     U        The case in question.
>>  U     0        ==> _hw=0 _port=0
>>  U     1        What I was talking about.
>>  0     0        Just like today
>>  1     0        Only cpuid leaves change -- very unlikely
>>  1     1        Full VMware mode
>>  0     1        VMware hyper call mode.
>>
>> The problem here is that vmware_hw is not a boolean and there is
>> currently not a value that lets you know it has not been specified.
>>
>> So I think it is just more confusing to have vmware_hw change
>> the default of vmware_port but the inverse is not true.
>
> So is it the case that if you specify vmware_hw with a value that your 
> guest isn't expecting, it may not work?
>

That is not the case for all versions of Linux I have tested with. And
since viridian should be set for windows (which hides the vmware_hw
setting), I do not expect this to be true.  Clearly someone could
write new code that fails because of this.


> I think the main thing Ian wants is probably a simple way for people 
> to just turn everything on.  Having vmware_hw!=0 => vmware_port 
> defaults to 1 seems like a reasonable way to do that.
>
> We could almost think of vmware_port as an "advanced" option that most 
> people don't need to set: i.e., you only need to set it if you want 
> one of the "unusual" modes (like CPUID-only or hypercall-only).
>

Ah, that makes sense.  Will do it.

    -Don Slutz


>  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage
  2014-09-24 17:27   ` George Dunlap
@ 2014-09-24 19:07     ` Don Slutz
  2014-09-25 15:14       ` George Dunlap
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-24 19:07 UTC (permalink / raw)
  To: George Dunlap, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/24/14 13:27, George Dunlap wrote:
> On 09/20/2014 07:07 PM, Don Slutz wrote:
>> Reduce the VMPORT_DBG_LOG calls.
>
> You should also have mentioned that you added new HVMTRACE macros 
> which will log the TSC value.
>
> The reason the HVMTRACE macros don't log the TSC values is that for 
> the most part you can get all the timing information you need from the 
> TSC on the vmexit and vmenter.  Looking at where you've added the TSC 
> values, I don't really see how it adds anything except bloat to the 
> log.  Is there a reason you need to know exactly when these different 
> things happened, instead of just being able to bracket them between 
> VMENTER and VMEXITs?
>

I did want a way to know how long the VMware code was taking.  I am
not sure this is required.

For example:

CPU1  2899550319282 (+    4170)  VMEXIT      [ exitcode = 0x00000000, rIP  = 0x00007fad13ffec8c ]
CPU1  2899550320086 (+     804)  TRAP_GP     [ inst_len = 1 edx = 0x00005658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
CPU1  2899550325054 (+    4968)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x000001b3 ecx = 0x0003001e edx = 0x00005658 esi = 0x00000000 edi = 0x000001b3 ]
CPU1  2899550326050 (+     996)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x000001b3 ecx = 0x0001001e edx = 0x00005658 esi = 0x00000000 edi = 0x000001b3 ]
CPU1  2899550326722 (+     672)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
CPU1  2899550327454 (+     732)  VMENTRY


But I am happy to drop the new log TSC macros.




>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
>> ---
>> v6:
>>        Dropped the attempt to use svm_nextrip_insn_length via
>>        __get_instruction_length (added in v2).  Just always look
>>        at up to 15 bytes on AMD.
>>
>> v5:
>>        exitinfo1 is used twice.
>>          Fixed.
>>
>>   xen/arch/x86/hvm/svm/svm.c       | 20 ++++++++++++++---
>>   xen/arch/x86/hvm/vmware/vmport.c | 48 ++++++++++++++++++++++------------------
>>   xen/arch/x86/hvm/vmx/vmx.c       | 12 ++++++++++
>>   xen/include/asm-x86/hvm/trace.h  | 45 +++++++++++++++++++++++++++++++++++++
>>   xen/include/asm-x86/hvm/vmport.h |  6 -----
>>   xen/include/public/trace.h       | 12 ++++++++++
>>   6 files changed, 113 insertions(+), 30 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index ea99dfb..716dda1 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -2081,10 +2081,18 @@ static void svm_vmexit_gp_intercept(struct 
>> cpu_user_regs *regs,
>>        */
>>       unsigned long inst_len = 15;
>>       unsigned long inst_addr = svm_rip2pointer(v);
>> -    int rc;
>> +    uint32_t starting_rdx = regs->rdx;
>> +    int rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
>> +                             vmcb->exitinfo1, vmcb->exitinfo2);
>> +
>> +    if ( hvm_long_mode_enabled(v) )
>> +        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, starting_rdx,
>> +                           TRC_PAR_LONG(vmcb->exitinfo1),
>> +                           TRC_PAR_LONG(vmcb->exitinfo2));
>> +    else
>> +        HVMTRACE_C4D(TRAP_GP, inst_len, starting_rdx, vmcb->exitinfo1,
>> +                     vmcb->exitinfo2);
>>   -    rc = vmport_gp_check(regs, v, &inst_len, inst_addr,
>> -                         vmcb->exitinfo1, vmcb->exitinfo2);
>>       if ( !rc )
>>           __update_guest_eip(regs, inst_len);
>>       else
>> @@ -2097,6 +2105,12 @@ static void svm_vmexit_gp_intercept(struct 
>> cpu_user_regs *regs,
>>                          (unsigned long)vmcb->exitinfo2, 
>> regs->error_code,
>>                          regs->rip, inst_addr, inst_len, regs->rax, 
>> regs->rbx,
>>                          regs->rcx, regs->rdx, regs->rsi, regs->rdi);
>> +        if ( hvm_long_mode_enabled(v) )
>> +            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, 
>> regs->rbx, regs->rcx,
>> +                              TRC_PAR_LONG(inst_addr));
>> +        else
>> +            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, 
>> regs->rcx,
>> +                         inst_addr);
>>           hvm_inject_hw_exception(TRAP_gp_fault, vmcb->exitinfo1);
>>       }
>>   }
>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c 
>> b/xen/arch/x86/hvm/vmware/vmport.c
>> index 811c303..962ee32 100644
>> --- a/xen/arch/x86/hvm/vmware/vmport.c
>> +++ b/xen/arch/x86/hvm/vmware/vmport.c
>> @@ -18,6 +18,7 @@
>>   #include <asm/hvm/hvm.h>
>>   #include <asm/hvm/support.h>
>>   #include <asm/hvm/vmport.h>
>> +#include <asm/hvm/trace.h>
>>     #include "backdoor_def.h"
>>   #include "guest_msg_def.h"
>> @@ -66,12 +67,15 @@ int vmport_ioport(int dir, uint32_t port, 
>> uint32_t bytes, uint32_t *val)
>>           uint64_t saved_rax = regs->rax;
>>           uint64_t value;
>>   -        VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
>> -                       "VMware trace dir=%d bytes=%u ip=%"PRIx64" 
>> cmd=%d ax=%"
>> -                       PRIx64" bx=%"PRIx64" cx=%"PRIx64" 
>> dx=%"PRIx64" si=%"
>> -                       PRIx64" di=%"PRIx64"\n", dir, bytes,
>> -                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
>> -                       regs->rdx, regs->rsi, regs->rdi);
>> +        if ( dir == IOREQ_READ )
>> +            HVMTRACE_ND(VMPORT_READ_BEFORE, 0, 1/*cycles*/, 6,
>> +                        regs->rax, regs->rbx, regs->rcx,
>> +                        regs->rdx, regs->rsi, regs->rdi);
>> +        else
>> +            HVMTRACE_ND(VMPORT_WRITE_AFTER_BEFORE, 0, 1/*cycles*/, 6,
>> +                        regs->rax, regs->rbx, regs->rcx,
>> +                        regs->rdx, regs->rsi, regs->rdi);
>
> Adding trace points in a separate patch is one thing, but adding code 
> like this and then removing it in a later patch is really poor form; 
> it could potentially make bisection difficult too, if (for example) 
> the output is so verbose in that short window as to make it unusable 
> between those changesets.
>
> I think you should go back to the previous patches and remove all the 
> VMPORT_DBG_LOG()s that don't survive until the end of the series.
>
> Unless, that is, you think that you might be making the case to accept 
> patches 1-5 for 4.5 without this patch; in which case it may make 
> sense to leave it the way it is.
>

That was a big part of it.  I can go and remove the excess.  At this
time I am expecting that 1-7 will make 4.5.


> We normally don't log both BEFORE and AFTER states of things like 
> hypercalls -- just logging the outcome of what the hypervisor did 
> should be sufficient, shouldn't it?

It is not that clear with this poorly built hyper-call.

> Do you really need to know the value of things that got clobbered?

When checking on the complex state machine that is the "RPC" code
it was very helpful.  With that code moving to QEMU, the before and
after is not so important.

Here is a real example:

CPU2  865821836508576 (+    2562)  VMEXIT      [ exitcode = 0x00000000, rIP  = 0x00007f68a8b17c8c ]
CPU2  865821836509362 (+     786)  TRAP_GP     [ inst_len = 1 edx = 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
CPU2  865821836514132 (+    4770)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0002001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
CPU2  865821836597832 (+   83700)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0001001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
CPU2  865821836598756 (+     924)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
CPU2  865821836605602 (+    6846)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
CPU2  865821836606436 (+     834)  VMENTRY
CPU2  865821836609712 (+    3276)  VMEXIT      [ exitcode = 0x00000000, rIP  = 0x00007f68a8b17c8c ]
CPU2  865821836610654 (+     942)  TRAP_GP     [ inst_len = 1 edx = 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
CPU2  865821836616828 (+    6174)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0003001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
CPU2  865821836617800 (+     972)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x00000011 ecx = 0x0003001e edx = 0x00015658 esi = 0x00000000 edi = 0x000001be ]
CPU2  865821836618664 (+     864)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
CPU2  865821836619444 (+     780)  VMENTRY

Note that in the one "RPC" call,


grep VMPORT_READ_BEFORE ~/zz-xentrace-vmware3-0.out | wc
    1592   39800  256312

It took 1592 #GP traps to handle it, and 9643628760 tsc cycles.

> You've got tracing in the error paths for when things don't go as you 
> expected.
>
> Also, same comment with the cycles: I don't see any value in logging 
> how long it took to get from the VMEXIT to here or from here to 
> anywhere else; it just makes the log really bloated.
>

Since you feel so strongly about this, I can drop it.


>> +
>>           switch ( cmd )
>>           {
>>           case BDOOR_CMD_GETMHZ:
>> @@ -143,19 +147,17 @@ int vmport_ioport(int dir, uint32_t port, 
>> uint32_t bytes, uint32_t *val)
>>               regs->rax = 0x0;
>>               break;
>>           default:
>> -            VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
>> -                           "VMware bytes=%d dir=%d cmd=%d",
>> -                           bytes, dir, cmd);
>> +            HVMTRACE_ND(VMPORT_UNKNOWN, 0, 1/*cycles*/, 6,
>> +                        (bytes << 8) + dir, cmd, regs->rbx,
>> +                        regs->rcx, regs->rsi, regs->rdi);
>
> You do realize the maximum number of 32-bit values you can log is 7, 
> not 6, right?  The macro stops at 6, but that's just where Keir got 
> tired, I think; if you want to log more registers here you can extend 
> it to 7.
>

Nope, that I did not know.  However, the 1 additional 32-bit value may
not be that helpful.

> Also, I think for clarity you should use (bytes << 8) | dir rather
> than (bytes << 8) + dir.
>

Sure, will change.

>>               break;
>>           }
>> -        VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
>> -                       "VMware after ip=%"PRIx64" cmd=%d 
>> ax=%"PRIx64" bx=%"
>> -                       PRIx64" cx=%"PRIx64" dx=%"PRIx64" 
>> si=%"PRIx64" di=%"
>> -                       PRIx64"\n",
>> -                       regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
>> -                       regs->rdx, regs->rsi, regs->rdi);
>> +
>>           if ( dir == IOREQ_READ )
>>           {
>> +            HVMTRACE_ND(VMPORT_READ_AFTER, 0, 1/*cycles*/, 6,
>> +                        regs->rax, regs->rbx, regs->rcx,
>> +                        regs->rdx, regs->rsi, regs->rdi);
>>               switch ( bytes )
>>               {
>>               case 1:
>> @@ -171,17 +173,21 @@ int vmport_ioport(int dir, uint32_t port, 
>> uint32_t bytes, uint32_t *val)
>>               *val = regs->rax;
>>           }
>>           else
>> +        {
>> +            HVMTRACE_ND(VMPORT_WRITE_AFTER, 0, 1/*cycles*/, 6,
>> +                        regs->rax, regs->rbx, regs->rcx,
>> +                        regs->rdx, regs->rsi, regs->rdi);
>>               regs->rax = saved_rax;
>> +        }
>>       }
>>       else
>>       {
>> +        if ( hvm_long_mode_enabled(current) )
>> +            HVMTRACE_LONG_C4D(VMPORT_BAD, dir, bytes, regs->rax,
>> +                              TRC_PAR_LONG(regs->rip));
>> +        else
>> +            HVMTRACE_C4D(VMPORT_BAD, dir, bytes, regs->rax, regs->rip);
>>           rc = X86EMUL_UNHANDLEABLE;
>> -        VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
>> -                       "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
>> -                       " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64" 
>> si=%"PRIx64
>> -                       " di=%"PRIx64"",
>> -                       magic, BDOOR_MAGIC, regs->rip, regs->rax, 
>> regs->rbx,
>> -                       regs->rcx, regs->rdx, regs->rsi, regs->rdi);
>>       }
>>         return rc;
>> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
>> index 73f55f2..5395028 100644
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -2613,6 +2613,12 @@ static void vmx_vmexit_gp_intercept(struct 
>> cpu_user_regs *regs,
>>       __vmread(VM_EXIT_INSTRUCTION_LEN, &inst_len);
>>       __vmread(VM_EXIT_INTR_ERROR_CODE, &ecode);
>>   +    if ( hvm_long_mode_enabled(v) )
>> +        HVMTRACE_LONG2_C4D(TRAP_GP, inst_len, regs->rdx, 
>> TRC_PAR_LONG(ecode),
>> +                           TRC_PAR_LONG(exit_qualification));
>> +    else
>> +        HVMTRACE_C4D(TRAP_GP, inst_len, regs->rdx, ecode, 
>> exit_qualification);
>
> Do you think anyone will need this 2 years from now?  That is, will 
> this actually be useful in understanding guest behavior, or is this 
> mostly to help you debug the hypervisor as you're developing it?
>

I can see a need for it.  I would hope that Intel (and AMD) would not
change the hardware so this info is needed, but I have seen this
happen in the past.

I would have no issue with making some of these compile time
conditional (like only in debug=y builds).




> I'd like to say more about my general theory for traces, but my brain 
> has about shut down... I'll send this so you can have the comments 
> I've got so far, and I'll come back to it tomorrow.
>
> Just one more thing...
>
>> +
>>   #ifndef NDEBUG
>>       orig_inst_len = inst_len;
>>   #endif
>> @@ -2636,6 +2642,12 @@ static void vmx_vmexit_gp_intercept(struct 
>> cpu_user_regs *regs,
>>                          regs->rip, inst_addr, orig_inst_len, inst_len,
>>                          regs->rax, regs->rbx, regs->rcx, regs->rdx, 
>> regs->rsi,
>>                          regs->rdi);
>> +        if ( hvm_long_mode_enabled(v) )
>> +            HVMTRACE_LONG_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, 
>> regs->rbx, regs->rcx,
>> +                              TRC_PAR_LONG(inst_addr));
>> +        else
>> +            HVMTRACE_C5D(TRAP_GP_UNKNOWN, rc, regs->rax, regs->rbx, 
>> regs->rcx,
>> +                         inst_addr);
>>           hvm_inject_hw_exception(TRAP_gp_fault, ecode);
>>       }
>>   }
>> diff --git a/xen/include/asm-x86/hvm/trace.h 
>> b/xen/include/asm-x86/hvm/trace.h
>> index de802a6..8af2d6a 100644
>> --- a/xen/include/asm-x86/hvm/trace.h
>> +++ b/xen/include/asm-x86/hvm/trace.h
>> @@ -52,8 +52,20 @@
>>   #define DO_TRC_HVM_LMSW64      DEFAULT_HVM_MISC
>>   #define DO_TRC_HVM_REALMODE_EMULATE DEFAULT_HVM_MISC
>>   #define DO_TRC_HVM_TRAP             DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_TRAP64           DEFAULT_HVM_MISC
>>   #define DO_TRC_HVM_TRAP_DEBUG       DEFAULT_HVM_MISC
>>   #define DO_TRC_HVM_VLAPIC           DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_TRAP_GP          DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_TRAP_GP64        DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_TRAP_GP_UNKNOWN  DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_TRAP_GP_UNKNOWN64 DEFAULT_HVM_MISC
>> +#define DO_TRC_HVM_VMPORT_READ_BEFORE DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_WRITE_AFTER_BEFORE DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_READ_AFTER DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_WRITE_AFTER DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_BAD         DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_BAD64       DEFAULT_HVM_IO
>> +#define DO_TRC_HVM_VMPORT_UNKNOWN     DEFAULT_HVM_IO
>>       #define TRC_PAR_LONG(par) ((par)&0xFFFFFFFF),((par)>>32)
>> @@ -98,6 +110,21 @@
>>   #define HVMTRACE_0D(evt)                            \
>>       HVMTRACE_ND(evt, 0, 0, 0,  0,  0,  0,  0,  0,  0)
>>   +#define HVMTRACE_C6D(evt, d1, d2, d3, d4, d5, d6)    \
>> +    HVMTRACE_ND(evt, 0, 1, 6, d1, d2, d3, d4, d5, d6)
>> +#define HVMTRACE_C5D(evt, d1, d2, d3, d4, d5)        \
>> +    HVMTRACE_ND(evt, 0, 1, 5, d1, d2, d3, d4, d5,  0)
>> +#define HVMTRACE_C4D(evt, d1, d2, d3, d4)            \
>> +    HVMTRACE_ND(evt, 0, 1, 4, d1, d2, d3, d4,  0,  0)
>> +#define HVMTRACE_C3D(evt, d1, d2, d3)                \
>> +    HVMTRACE_ND(evt, 0, 1, 3, d1, d2, d3,  0,  0,  0)
>> +#define HVMTRACE_C2D(evt, d1, d2)                    \
>> +    HVMTRACE_ND(evt, 0, 1, 2, d1, d2,  0,  0,  0,  0)
>> +#define HVMTRACE_C1D(evt, d1)                        \
>> +    HVMTRACE_ND(evt, 0, 1, 1, d1,  0,  0,  0,  0,  0)
>> +#define HVMTRACE_C0D(evt)                            \
>> +    HVMTRACE_ND(evt, 0, 1, 0,  0,  0,  0,  0,  0,  0)
>> +
>>   #define HVMTRACE_LONG_1D(evt, d1)                  \
>>                      HVMTRACE_2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) 
>> >> 32)
>>   #define HVMTRACE_LONG_2D(evt, d1, d2, ...)              \
>> @@ -107,6 +134,24 @@
>>   #define HVMTRACE_LONG_4D(evt, d1, d2, d3, d4, ...)  \
>>                      HVMTRACE_5D(evt ## 64, d1, d2, d3, d4)
>>   +#define HVMTRACE_LONG_C1D(evt, d1)                  \
>> +                   HVMTRACE_C2D(evt ## 64, (d1) & 0xFFFFFFFF, (d1) 
>> >> 32)
>> +#define HVMTRACE_LONG_C2D(evt, d1, d2, ...)              \
>> +                   HVMTRACE_C3D(evt ## 64, d1, d2)
>> +#define HVMTRACE_LONG_C3D(evt, d1, d2, d3, ...)      \
>> +                   HVMTRACE_C4D(evt ## 64, d1, d2, d3)
>> +#define HVMTRACE_LONG_C4D(evt, d1, d2, d3, d4, ...)  \
>> +                   HVMTRACE_C5D(evt ## 64, d1, d2, d3, d4)
>> +#define HVMTRACE_LONG_C5D(evt, d1, d2, d3, d4, d5, ...) \
>> +                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4, d5)
>> +
>> +#define HVMTRACE_LONG2_C2D(evt, d1, d2, ...)              \
>> +                   HVMTRACE_C4D(evt ## 64, d1, d2)
>> +#define HVMTRACE_LONG2_C3D(evt, d1, d2, d3, ...)      \
>> +                   HVMTRACE_C5D(evt ## 64, d1, d2, d3)
>> +#define HVMTRACE_LONG2_C4D(evt, d1, d2, d3, d4, ...)  \
>> +                   HVMTRACE_C6D(evt ## 64, d1, d2, d3, d4)
>> +
>>   #endif /* __ASM_X86_HVM_TRACE_H__ */
>>     /*
>> diff --git a/xen/include/asm-x86/hvm/vmport.h 
>> b/xen/include/asm-x86/hvm/vmport.h
>> index c4f3926..401cbf4 100644
>> --- a/xen/include/asm-x86/hvm/vmport.h
>> +++ b/xen/include/asm-x86/hvm/vmport.h
>> @@ -25,12 +25,6 @@
>>   #define VMPORT_LOG_VGP_UNKNOWN     (1 << 3)
>>   #define VMPORT_LOG_REALMODE_GP     (1 << 4)
>>   -#define VMPORT_LOG_GP_NOT_VMWARE   (1 << 9)
>> -
>> -#define VMPORT_LOG_TRACE           (1 << 16)
>> -#define VMPORT_LOG_ERROR           (1 << 17)
>> -#define VMPORT_LOG_VMWARE_AFTER    (1 << 18)
>> -
>
> If you remove the debug statements in earlier patches, remember to 
> remove these as well.
>

Sure.

    -Don Slutz

>  -George
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-24 17:19                 ` Don Slutz
@ 2014-09-24 20:21                   ` Konrad Rzeszutek Wilk
  2014-09-26 19:03                     ` Don Slutz
  2014-09-25 11:35                   ` Ian Campbell
  1 sibling, 1 reply; 93+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-24 20:21 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Eddie Dong, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Wed, Sep 24, 2014 at 01:19:38PM -0400, Don Slutz wrote:
> On 09/23/14 08:30, Ian Campbell wrote:
> >On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
> 
> [snip]
> 
> >>>I was only responding to the part of your comment in parentheses. :-)
> >>>
> >>>I suppose in large part it would depend on what the hypercalls were
> >>>actually doing; I'd have to go back and look at them to say if they
> >>>need to be in Xen or whether they could be passed on to qemu.
> >>>
> >>Clearly it is possible to pass the VCPU registers to QEMU, but that is
> >>currently not done.
> >I think there's an existing hypercall to get/set the state for a vcpu,
> >perhaps it is too heavy weight to be used here though.
> 
> Yes, very heavy weight
> 
> >An alternative would be a semantically higher level I/O req which took a
> >guest pointer to a key and a guest pointer to the buffer etc, without
> >needing the registers themselves.
> 
> I am looking at adding a new I/O req type for this.  It turns out that
> for vmware_port you need to pass 6 32bit values both ways.  And
> I can overlap the .addr, .data, .count and .size for this.  The other
> option is to increase the size of struct ioreq, which I am assuming
> is not the way to go since it would reduce the max number of vcpus
> as long as "struct shared_iopage" is limited to 1 page.
> 
> "guest pointer to a key and a guest pointer to the buffer" is not how
> this works.  The data is all passed up to 4 bytes at each IN.  A string
> (which is what guestinfo access looks like) is passed as a length, and
> then each 4 bytes of the string. (I am not trying to say this is good.)
> 
> 
> 
> >>   So a new
> >>version of QEMU would also be needed to go this way.  None of the
> >>proposed features need
> >>any data from QEMU, so I do not think this makes sense.
> >The concern is that it is adding a load of complex looking string and
> >pointer manipulation stuff to the hypervisor, the sort of thing which
> >often leads to security vulnerabilities.
> >
> >So that would be better done outside of Xen itself if possible, if a
> >qemu update is the price for that then it doesn't seem so bad to me.
> 
> I have yet to come up with a good reason why not to move the
> VMware port RPC code into QEMU.  I will be looking to do that for
> Xen 4.6 & QEMU 2.3
> 
> 
> Related to that, the code to connect Xen to QEMU so that Xen can
> use any VMware support in QEMU is not that complex.  So adding
> the Xen part in place of patches 8, 9, 10, 11, 12, 14, 15 and 16
> looks doable.  This would allow X to use the VMware mouse
> code (which is in both qemu-xen and qemu-xen-traditional).  I have
> found this to be a great improvement in using a GUI in a guest
> where the network speeds are not that fast.  I had planned
> on adjusting the Xen to QEMU connector code for 4.6
> 
> Also there is a good chance that the QEMU part could be upstreamed
> to QEMU 2.2 (and backported to Xen's QEMU tree) for 4.5
> 
> Now since I did not include this code sooner, would I need a release
> exception to include the Xen to QEMU connector code?

Yes, but without having seen the patches beforehand it might be
a bit too late as they would be brand-new-patches.

> 
> 
> One thing related to this is, should I also change qemu-xen-traditional
> to handle the new I/O req type, or to only send it if using qemu-xen.

Just qemu-xen.
> 
> It is simple to allow a new QEMU to build with pre-4.5 Xen and post-4.5
> Xen.  No idea of a good way to check that a QEMU binary has this
> support.  However I can say that enabling vmware_port does require
> a QEMU with this support in the docs.

And that would mean that users who want to take advantage of it would
need to update their QEMU version, right?

> 
> 
>     -Don Slutz
> 
> >Ian.
> >
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support
  2014-09-24 14:44   ` George Dunlap
@ 2014-09-24 21:06     ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-24 21:06 UTC (permalink / raw)
  To: George Dunlap, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/24/14 10:44, George Dunlap wrote:
> On Sat, Sep 20, 2014 at 7:07 PM, Don Slutz <dslutz@verizon.com> wrote:
>> This is used to set HVM_PARAM_VMWARE_HW. It is set to the VMware
>> virtual hardware version.
>>
>> Currently 0, 3-4, 6-11 are good values.  However the code only
>> checks for == 0 or != 0.
>>
>> If non-zero then
>>    default VGA to VMware's VGA.
>>
>> Also now allows vga=vmware
>>
>> Signed-off-by: Don Slutz <dslutz@verizon.com>
> [snip]
>> diff --git a/docs/misc/hypervisor-cpuid.markdown b/docs/misc/hypervisor-cpuid.markdown
>> new file mode 100644
>> index 0000000..901a4e1
>> --- /dev/null
>> +++ b/docs/misc/hypervisor-cpuid.markdown
>> @@ -0,0 +1,28 @@
>> +Hypervisor Cpuid
>> +================
>> +
>> +The support of hypervisor cpuid leaves has not been agreed to.
>> +Other then the range 0x40000000 to 0x400000ff can be used by
>> +hypervisors.
>> +
>> +MicroSoft Hyper-V (AKA viridian) currently must be at 0x40000000.
>> +
>> +VMware currently must be at 0x40000000.
>> +
>> +KVM currently must be at 0x40000000 (from Seabios).
>> +
>> +Xen can be found at the first otherwise unused 0x100 aligned
>> +offset between 0x40000000 and 0x40010000.
> So Xen is the only kid on the block who plays nice, huh?

Yup.
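
The "first otherwise unused 0x100 aligned offset" rule can be sketched as a guest-side scan. This is only an illustration under stated assumptions: `find_hypervisor_base`, `cpuid_fn` and `fake_cpuid` are hypothetical names, the CPUID read is passed in as a callback so the logic can be exercised outside a guest, and the 12-byte signatures are assumed to be returned in ebx, ecx, edx of the base leaf.

```c
#include <stdint.h>
#include <string.h>

/* Caller-supplied CPUID reader (regs = eax, ebx, ecx, edx).  In a real
 * guest this would wrap the CPUID instruction. */
typedef void (*cpuid_fn)(uint32_t leaf, uint32_t regs[4]);

/* Scan the 0x100-aligned bases in [0x40000000, 0x40010000) and return the
 * first one whose 12-byte signature matches sig, or 0 if none does. */
static uint32_t find_hypervisor_base(cpuid_fn cpuid, const char *sig)
{
    for (uint32_t base = 0x40000000; base < 0x40010000; base += 0x100) {
        uint32_t regs[4];
        cpuid(base, regs);
        /* ebx, ecx, edx are contiguous in regs[1..3] */
        if (memcmp(&regs[1], sig, 12) == 0)
            return base;
    }
    return 0;
}

/* Illustrative stand-in for a guest with vmware_hw set: VMware leaves at
 * 0x40000000, Xen leaves at the next 0x100-aligned slot. */
static void fake_cpuid(uint32_t leaf, uint32_t regs[4])
{
    memset(regs, 0, 4 * sizeof(regs[0]));
    if (leaf == 0x40000000)
        memcpy(&regs[1], "VMwareVMware", 12);
    else if (leaf == 0x40000100)
        memcpy(&regs[1], "XenVMMXenVMM", 12);
}
```

Calling `find_hypervisor_base(fake_cpuid, "XenVMMXenVMM")` on this stand-in finds Xen one slot above the VMware leaves, which is the layout the documentation text describes.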

>> @@ -555,7 +558,12 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
>>               break;
>>           case LIBXL_VGA_INTERFACE_TYPE_NONE:
>>               break;
>> -        }
>> +        case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
>> +            flexarray_append_pair(dm_args, "-device",
>> +                GCSPRINTF("vmware-svga,vgamem_mb=%d",
>> +                libxl__sizekb_to_mb(b_info->video_memkb)));
>> +            break;
>> +            }
> Nit: You screwed up the indentation here.

Will fix.
    -Don Slutz

> Other than that, looks good (with IanC's suggestions).
>
>   -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-22 16:18         ` Jan Beulich
  2014-09-22 18:32           ` Don Slutz
@ 2014-09-25 10:37           ` Tim Deegan
  2014-09-26 20:00             ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: Tim Deegan @ 2014-09-25 10:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Jackson, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Eddie Dong,
	Don Slutz, xen-devel, AravindGopalakrishnan, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
> >>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
> > On 09/22/2014 04:34 PM, Ian Campbell wrote:
> >> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
> >>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
> >>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
> >>>>
> >>>>> I picked this subset to start with because it only has changes in
> >>>>> Xen.
> >>>>>
> >>>>> Some of this code is already in QEMU
> >>>> As I suggest in my reply to one of the rpc port patches it's not clear
> >>>> that any of this needs to be in Xen rather than qemu in the first place.
> >>>>
> >>>> I came to think this even more once I saw the save/restore support...
> >>> I don't think qemu can get notified on either cpuid or #GP faults, can it?
> >> I understand the need for the cpuid bits, I should have made that clear.
> >>
> >>> A big chunk of the functionality here is to allow a userspace process to
> >>> transparently make the "hypercalls" without the OS needing to explicitly
> >>> give it access to the IO space, by trapping the resulting #GP faults and
> >>> checking to see if they are IO instructions .  If that's functionality
> >>> we think is important, then it will have to be done in Xen, I think.
> >> Ah, the need to #GP was what I had missed, I was thinking it was just a
> >> regular I/O port access.
> >>
> >> Having trapped the #GP and decoded it into an IO access, is there
> >> anything stopping us forwarding that to qemu for consideration?
> >>
> >> (I confess I'm not sure why this is a #GP thing and not a VTd/SVM I/O
> >> access trap, just like if userspace mmaps /dev/ioports, but I'll trust
> >> that's just my lack of x86 hw virt knowledge)
> > 
> > I'm not 100% sure of this, but my understanding was that it *would* be a 
> > normal IO trap *if* the guest OS gave access to that IO range to the 
> > guest (via IOPL, maybe?).  But if the userspace program is not 
> > explicitly given access by the OS to those ports, it will generate a #GP 
> > instead.  The idea is to allow the "hypercall" to happen *without 
> > cooperation* from the guest OS.
> > 
> > Again, that's my understanding, someone please correct me if I'm wrong...
> 
> That's indeed what was said so far. I wonder though whether opening
> this up without guest OS consent isn't going to introduce a security
> issue inside the guest (depending on the exact functionality of these
> hypercalls).

Yes indeed.  VMware seems to have CPL checks on some of the commands
(but not all).  I guess Xen will be no worse than VMware if we do the
same, though I'd like to have an official spec to follow for that.

Tim.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-24 16:31         ` Don Slutz
  2014-09-24 16:44           ` George Dunlap
@ 2014-09-25 11:24           ` Ian Campbell
  2014-09-25 14:17             ` George Dunlap
  2014-09-26 19:19             ` Don Slutz
  1 sibling, 2 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-25 11:24 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Wed, 2014-09-24 at 12:31 -0400, Don Slutz wrote:
> On 09/23/14 08:20, Ian Campbell wrote:
> > On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
> >>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
> >>> from the libxl PoV, I think.
> >> I could not find "buildinfo.u.hvm":
> >>
> >>
> >> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
> >> dcs-xen-54:~/xen>
> >>
> >>
> >> So unable to comment.
> > It's in the idl, next to createinfo.
> 
> I take that to mean:
> 
> 
> libxl_domain_config = Struct("domain_config", [
>      ("c_info", libxl_domain_create_info),
>      ("b_info", libxl_domain_build_info),
> ...
> 
> I.E.
> 
> b_info->u.hvm

Yes.


> >>    Currently I do not know of a way to
> >> say "set vmware_hw to 7
> >> if vmware_port is true and vmware_hw is not specified".
> > That's an error case, isn't it? Or at least a vmware_port is ignored
> > case.
> 
> Nope.  But I will agree that I have not done a lot with 3 (at least)
> state booleans.  The 3 states being true, false, and not specified.

The third state is "default" as in: libxl sets something sensible based
on other criteria (internal choice, other settings etc).

> And vmware_port is not ignored.
> 
> > What I suggested was "if vmware_hw is non-zero then set vmware_port".
> >
> 
> I am reading that as "set vmware_port if not specified".  To avoid
> complexity, I am treating vmware_hw as a boolean.  Using this
> I get the following table:
> 
> _hw   _port
>   0     0        Just like today
>   1     0        Only cpuid leaves change -- very unlikely
>   1     1        Full VMware mode
>   0     1        VMware hyper call mode.
> 
> Adding U for unspecified:
> 
> _hw   _port
>   U     U        ==> _hw=0 _port=0
>   0     U        ==> _hw=0 _port=0
>   1     U        The case in question.
>   U     0        ==> _hw=0 _port=0
>   U     1        What I was talking about.
>   0     0        Just like today
>   1     0        Only cpuid leaves change -- very unlikely
>   1     1        Full VMware mode
>   0     1        VMware hyper call mode.
> 
> The problem here is that vmware_hw is not a boolean and there is
> currently not a value that lets you know it has not been specified.

The unspecified value is 0, surely? All of the rows with U under _hw can
be ignored, I am talking only about _port being a defbool.

  0     U        ==> _hw=0 _port=0
  1     U        The case in question.

=> libxl should convert U to 1.

  0     0        Just like today
  1     0        Only cpuid leaves change -- very unlikely
  1     1        Full VMware mode
  0     1        VMware hyper call mode.

All reasonable things to ask for explicitly, I think (I'm not so sure
about the last one, might be an error?).

But a user who asks for vmware_hw and says nothing about _port should
get _port = 1 automatically by libxl.

This should be as simple as in the appropriate setdefault function:
  libxl_defbool_setdefault(&...vmware_port, ...vmware_hw != 0);

The setdefault sets the first argument to the second iff it is currently
== default.
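
The tri-state behaviour can be sketched with a self-contained model. This is not libxl's implementation: `tristate`, `tristate_setdefault`, `struct vmware_cfg` and `vmware_cfg_setdefault` are hypothetical stand-ins that only mirror the semantics described here (a value that stays "default" until someone sets it, and a setdefault that fills it in only in that case).

```c
#include <stdbool.h>

/* Simplified stand-in for a defbool: unset until someone chooses. */
typedef enum { TRI_DEFAULT = 0, TRI_FALSE = -1, TRI_TRUE = 1 } tristate;

/* Set *t to b only if the user left it unspecified. */
static void tristate_setdefault(tristate *t, bool b)
{
    if (*t == TRI_DEFAULT)
        *t = b ? TRI_TRUE : TRI_FALSE;
}

/* Hypothetical config holding the two options discussed in the thread. */
struct vmware_cfg {
    int vmware_hw;         /* 0 doubles as "not specified" */
    tristate vmware_port;  /* true, false, or left to the toolstack */
};

/* vmware_hw != 0 turns vmware_port on unless the user said otherwise. */
static void vmware_cfg_setdefault(struct vmware_cfg *c)
{
    tristate_setdefault(&c->vmware_port, c->vmware_hw != 0);
}
```

Under this model a config with `vmware_hw = 7` and no `vmware_port` ends up with the port enabled, while an explicit `vmware_port = 0` or `vmware_port = 1` is left untouched.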

> So I think it is just more confusing to have vmware_hw change
> the default of vmware_port but the inverse is not true.
> 
>     -Don Slutz
> 
> >>    Which would be
> >> the inverse.  I lean to
> >> not having the default of vmware_port based on vmware_hw.
> 
> 
> 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-24 17:19                 ` Don Slutz
  2014-09-24 20:21                   ` Konrad Rzeszutek Wilk
@ 2014-09-25 11:35                   ` Ian Campbell
  1 sibling, 0 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-25 11:35 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Stefano Stabellini,
	George Dunlap, Ian Jackson, Eddie Dong, Tim Deegan,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, xen-devel, Suravee Suthikulpanit

On Wed, 2014-09-24 at 13:19 -0400, Don Slutz wrote:
> I have yet to come up with a good reason why not to move the
> VMware port RPC code into QEMU.  I will be looking to do that for
> Xen 4.6 & QEMU 2.3

Great, thanks!

[...questions for Konrad...]

> One thing related to this is, should I also change qemu-xen-traditional
> to handle the new I/O req type, or to only send it if using qemu-xen.

qemu-xen-traditional is in feature freeze, so no new features please.

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-25 11:24           ` Ian Campbell
@ 2014-09-25 14:17             ` George Dunlap
  2014-09-25 14:21               ` Ian Campbell
  2014-09-26 19:19             ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-25 14:17 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/25/2014 12:24 PM, Ian Campbell wrote:
> On Wed, 2014-09-24 at 12:31 -0400, Don Slutz wrote:
>> On 09/23/14 08:20, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
>>>>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
>>>>> from the libxl PoV, I think.
>>>> I could not find "buildinfo.u.hvm":
>>>>
>>>>
>>>> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
>>>> dcs-xen-54:~/xen>
>>>>
>>>>
>>>> So unable to comment.
>>> It's in the idl, next to createinfo.
>> I take that to mean:
>>
>>
>> libxl_domain_config = Struct("domain_config", [
>>       ("c_info", libxl_domain_create_info),
>>       ("b_info", libxl_domain_build_info),
>> ...
>>
>> I.E.
>>
>> b_info->u.hvm
> Yes.
>
>
>>>>     Currently I do not know of a way to
>>>> say "set vmware_hw to 7
>>>> if vmware_port is true and vmware_hw is not specified".
>>> That's an error case, isn't it? Or at least a vmware_port is ignored
>>> case.
>> Nope.  But I will agree that I have not done a lot with 3 (at least)
>> state booleans.  The 3 states being true, false, and not specified.
> The third state is "default" as in: libxl sets something sensible based
> on other criteria (internal choice, other settings etc).
>
>> And vmware_port is not ignored.
>>
>>> What I suggested was "if vmware_hw is non-zero then set vmware_port".
>>>
>> I am reading that as "set vmware_port if not specified".  To avoid
>> complexity, I am treating vmware_hw as a boolean.  Using this
>> I get the following table:
>>
>> _hw   _port
>>    0     0        Just like today
>>    1     0        Only cpuid leaves change -- very unlikely
>>    1     1        Full VMware mode
>>    0     1        VMware hyper call mode.
>>
>> Adding U for unspecified:
>>
>> _hw   _port
>>    U     U        ==> _hw=0 _port=0
>>    0     U        ==> _hw=0 _port=0
>>    1     U        The case in question.
>>    U     0        ==> _hw=0 _port=0
>>    U     1        What I was talking about.
>>    0     0        Just like today
>>    1     0        Only cpuid leaves change -- very unlikely
>>    1     1        Full VMware mode
>>    0     1        VMware hyper call mode.
>>
>> The problem here is that vmware_hw is not a boolean and there is
>> currently not a value that lets you know it has not been specified.
> The unspecified value is 0, surely? All of the rows with U under _hw can
> be ignored, I am talking only about _port being a defbool.

You asked Don to add "vmware_hw != 0 => vmware_port ?= 1"  (Where ?= is 
like make, "set if not already set").  Don then naturally thought maybe 
you might want to do the opposite: ("vmware_port != 0 => vmware_hw ?= 
7").  That's what Don is talking about with vmware_hw not being a 
boolean: he can't tell the difference between:

vmware_port=1
vmware_hw=0

and:

vmware_port=1
[nothing about vmware_hw]

In my other e-mail, I suggest that we make vmware_hw the "primary" 
configuration thing, and not even suggest using vmware_port unless they 
want one of the "unusual" configurations.

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-25 14:17             ` George Dunlap
@ 2014-09-25 14:21               ` Ian Campbell
  0 siblings, 0 replies; 93+ messages in thread
From: Ian Campbell @ 2014-09-25 14:21 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	Ian Jackson, Tim Deegan, Don Slutz, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Thu, 2014-09-25 at 15:17 +0100, George Dunlap wrote:
> On 09/25/2014 12:24 PM, Ian Campbell wrote:
> > On Wed, 2014-09-24 at 12:31 -0400, Don Slutz wrote:
> >> On 09/23/14 08:20, Ian Campbell wrote:
> >>> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:
> >>>>> The latter would allow moving to buildinfo.u.hvm, which would be nicer
> >>>>> from the libxl PoV, I think.
> >>>> I could not find "buildinfo.u.hvm":
> >>>>
> >>>>
> >>>> dcs-xen-54:~/xen>git grep buildinfo.u.hvm
> >>>> dcs-xen-54:~/xen>
> >>>>
> >>>>
> >>>> So unable to comment.
> >>> It's in the idl, next to createinfo.
> >> I take that to mean:
> >>
> >>
> >> libxl_domain_config = Struct("domain_config", [
> >>       ("c_info", libxl_domain_create_info),
> >>       ("b_info", libxl_domain_build_info),
> >> ...
> >>
> >> I.E.
> >>
> >> b_info->u.hvm
> > Yes.
> >
> >
> >>>>     Currently I do not know of a way to
> >>>> say "set vmware_hw to 7
> >>>> if vmware_port is true and vmware_hw is not specified".
> >>> That's an error case, isn't it? Or at least a vmware_port is ignored
> >>> case.
> >> Nope.  But I will agree that I have not done a lot with 3 (at least)
> >> state booleans.  The 3 states being true, false, and not specified.
> > The third state is "default" as in: libxl sets something sensible based
> > on other criteria (internal choice, other settings etc).
> >
> >> And vmware_port is not ignored.
> >>
> >>> What I suggested was "if vmware_hw is non-zero then set vmware_port".
> >>>
> >> I am reading that as "set vmware_port if not specified".  To avoid
> >> complexity, I am treating vmware_hw as a boolean.  Using this
> >> I get the following table:
> >>
> >> _hw   _port
> >>    0     0        Just like today
> >>    1     0        Only cpuid leaves change -- very unlikely
> >>    1     1        Full VMware mode
> >>    0     1        VMware hyper call mode.
> >>
> >> Adding U for unspecified:
> >>
> >> _hw   _port
> >>    U     U        ==> _hw=0 _port=0
> >>    0     U        ==> _hw=0 _port=0
> >>    1     U        The case in question.
> >>    U     0        ==> _hw=0 _port=0
> >>    U     1        What I was talking about.
> >>    0     0        Just like today
> >>    1     0        Only cpuid leaves change -- very unlikely
> >>    1     1        Full VMware mode
> >>    0     1        VMware hyper call mode.
> >>
> >> The problem here is that vmware_hw is not a boolean and there is
> >> currently not a value that lets you know it has not been specified.
> > The unspecified value is 0, surely? All of the rows with U under _hw can
> > be ignored, I am talking only about _port being a defbool.
> 
> You asked Don to add "vmware_hw != 0 => vmware_port ?= 1"  (Where ?= is 
> like make, "set if not already set").  Don then naturally thought


>  maybe 
> you might want to do the opposite: ("vmware_port != 0 => vmware_hw ?= 
> 7").

We don't want this (I've been trying to say, badly obviously).

>   That's what Don is talking about with vmware_hw not being a 
> boolean: he can't tell the difference between:
> 
> vmware_port=1
> vmware_hw=0
> 
> and:
> 
> vmware_port=1
> [nothing about vmware_hw]

Then vmware_hw == 0 (which I think you know, but to be clear)

> In my other e-mail, I suggest that we make vmware_hw the "primary" 
> configuration thing,

This is what I've been trying to get at...

>  and not even suggest using vmware_port unless they 
> want one of the "unusual" configurations.

Indeed. Which the second of your examples is doing, just like the first.

Ian.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage
  2014-09-24 19:07     ` Don Slutz
@ 2014-09-25 15:14       ` George Dunlap
  2014-09-29 18:10         ` Don Slutz
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-25 15:14 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Wed, Sep 24, 2014 at 8:07 PM, Don Slutz <dslutz@verizon.com> wrote:
>
> On 09/24/14 13:27, George Dunlap wrote:
>>
>> On 09/20/2014 07:07 PM, Don Slutz wrote:
>>>
>>> Reduce the VMPORT_DBG_LOG calls.
>>
>>
>> You should also have mentioned that you added new HVMTRACE macros which
>> will log the TSC value.
>>
>> The reason the HVMTRACE macros don't log the TSC values is that for the
>> most part you can get all the timing information you need from the TSC on
>> the vmexit and vmenter.  Looking at where you've added the TSC values, I
>> don't really see how it adds anything except bloat to the log.  Is there a
>> reason you need to know exactly when these different things happened,
>> instead of just being able to bracket them between VMENTER and VMEXITs?
>>
>
> I did want a way to know how long the VMware code was taking.  I am
> not sure this is required.
>
> For example:
>
> CPU1  2899550319282 (+    4170)  VMEXIT      [ exitcode = 0x00000000, rIP  =
> 0x00007fad13ffec8c ]
> CPU1  2899550320086 (+     804)  TRAP_GP     [ inst_len = 1 edx = 0x00005658
> exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
> CPU1  2899550325054 (+    4968)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx
> = 0x000001b3 ecx = 0x0003001e edx = 0x00005658 esi = 0x00000000 edi =
> 0x000001b3 ]
> CPU1  2899550326050 (+     996)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx =
> 0x000001b3 ecx = 0x0001001e edx = 0x00005658 esi = 0x00000000 edi =
> 0x000001b3 ]
> CPU1  2899550326722 (+     672)  vlapic_accept_pic_intr [ i8259_target = 1,
> accept_pic_int = 0 ]
> CPU1  2899550327454 (+     732)  VMENTRY
>
>
> But I am happy to drop the new log TSC macros.

The tracing function itself is not free -- the trace_var() function
probably executes 5x the amount of code that is actually executed
between the READ_BEFORE and READ_AFTER (given it's a switch statement
and each one is basically a handful of variable assignments).  It's
not unlikely that most of the 996 cycles there are spent in the trace
function itself.

>>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c
>>> b/xen/arch/x86/hvm/vmware/vmport.c
>>> index 811c303..962ee32 100644
>>> --- a/xen/arch/x86/hvm/vmware/vmport.c
>>> +++ b/xen/arch/x86/hvm/vmware/vmport.c
>
>> We normally don't log both BEFORE and AFTER states of things like
>> hypercalls -- just logging the outcome of what the hypervisor did should be
>> sufficient, shouldn't it?
>
>
> That is not so clear with this poorly built hyper-call.
>
>> Do you really need to know the value of things that got clobbered?
>
>
> When checking on the complex state machine that is the "RPC" code
> it was very helpful.  With that code moving to QEMU, the before and
> after is not so important.
>
> Here is a real example:
>
> CPU2  865821836508576 (+    2562)  VMEXIT      [ exitcode = 0x00000000, rIP
> = 0x00007f68a8b17c8c ]
> CPU2  865821836509362 (+     786)  TRAP_GP     [ inst_len = 1 edx =
> 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
> CPU2  865821836514132 (+    4770)  VMPORT_READ_BEFORE  [ eax = 0x564d5868
> ebx = 0x00000034 ecx = 0x0002001e edx = 0x00025658 esi = 0x00000000 edi =
> 0x000001be ]
> CPU2  865821836597832 (+   83700)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx
> = 0x00000034 ecx = 0x0001001e edx = 0x00025658 esi = 0x00000000 edi =
> 0x000001be ]
> CPU2  865821836598756 (+     924)  vlapic_accept_pic_intr [ i8259_target =
> 1, accept_pic_int = 0 ]
> CPU2  865821836605602 (+    6846)  vlapic_accept_pic_intr [ i8259_target =
> 1, accept_pic_int = 0 ]
> CPU2  865821836606436 (+     834)  VMENTRY
> CPU2  865821836609712 (+    3276)  VMEXIT      [ exitcode = 0x00000000, rIP
> = 0x00007f68a8b17c8c ]
> CPU2  865821836610654 (+     942)  TRAP_GP     [ inst_len = 1 edx =
> 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
> CPU2  865821836616828 (+    6174)  VMPORT_READ_BEFORE  [ eax = 0x564d5868
> ebx = 0x00000034 ecx = 0x0003001e edx = 0x00025658 esi = 0x00000000 edi =
> 0x000001be ]
> CPU2  865821836617800 (+     972)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx
> = 0x00000011 ecx = 0x0003001e edx = 0x00015658 esi = 0x00000000 edi =
> 0x000001be ]
> CPU2  865821836618664 (+     864)  vlapic_accept_pic_intr [ i8259_target =
> 1, accept_pic_int = 0 ]
> CPU2  865821836619444 (+     780)  VMENTRY
>
> Note that in the one "RPC" call,
>
>
> grep VMPORT_READ_BEFORE ~/zz-xentrace-vmware3-0.out | wc
>    1592   39800  256312
>
> It took 1592 #GP traps to handle it, and 9643628760 tsc cycles.

Right, so what I started to say yesterday: It looks like most of the
trace points you're adding here are to help you debug the functionality
of the hypervisor.  That certainly makes sense for you to do during
development.  But what we want in upstream is something that will help
us in production.  For that to be useful, we need the logging to be as
efficient as possible. Every additional HVM trace point potentially
adds more data to someone else's HVM trace. So we don't want
extraneous information, and we don't want to log something that we can
infer from something else.

In general, it's enough to log the decisions that Xen is making, from
which the previous state can be inferred, and then to log what Xen did
in response (i.e., return values, injection of traps, &c) to help
figure out how the guest responded.
In this case, I'd probably trace:
1. vmport hypercall, handled command
 - cmd
 - return values of modified registers.  Ideally only the registers
that are modified, but just taking a big batch would be OK for now.
 Note: No need for rip here, as it will be collected at the VMEXIT
2. vmport hypercall, unhandled command
 - Just the unimplemented fail
3. In the vmport_gp_check(), if it successfully decodes an IO instruction:
 - direction (read/write)
 - size of the access
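
The three record types above could be laid out roughly as follows.  This
is only a sketch: the enum, struct and function names are hypothetical
and are not Xen's real TRC_* events or trace API, and it assumes the
command number is the low 16 bits of ecx (0x001e in the traces quoted
earlier).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event kinds, one per item in the list above. */
enum vmport_trace_kind {
    VMPORT_TRACE_HANDLED,    /* 1. command was handled */
    VMPORT_TRACE_UNHANDLED,  /* 2. command is not implemented */
    VMPORT_TRACE_IO_DECODED  /* 3. an IO instruction was decoded */
};

/* Hypothetical record; only the fields relevant to 'kind' are filled. */
struct vmport_trace_rec {
    enum vmport_trace_kind kind;
    uint16_t cmd;       /* command number (kinds 1 and 2) */
    uint32_t regs[4];   /* eax, ebx, ecx, edx after the call (kind 1) */
    bool is_write;      /* direction (kind 3) */
    uint8_t size;       /* access size in bytes (kind 3) */
};

/* Assumption: the command lives in the low 16 bits of ecx. */
static uint16_t vmport_cmd(uint32_t ecx)
{
    return (uint16_t)(ecx & 0xffff);
}

/* Kind 1: log the command and the registers as returned to the guest. */
static struct vmport_trace_rec vmport_trace_handled(uint16_t cmd,
                                                    const uint32_t out[4])
{
    struct vmport_trace_rec r = { .kind = VMPORT_TRACE_HANDLED, .cmd = cmd };

    memcpy(r.regs, out, sizeof(r.regs));
    return r;
}

/* Kind 2: log only which command was not implemented. */
static struct vmport_trace_rec vmport_trace_unhandled(uint16_t cmd)
{
    struct vmport_trace_rec r = { .kind = VMPORT_TRACE_UNHANDLED, .cmd = cmd };

    return r;
}

/* Kind 3: log direction and size of the decoded IN/OUT. */
static struct vmport_trace_rec vmport_trace_io(bool is_write, uint8_t size)
{
    assert(size == 1 || size == 2 || size == 4);

    struct vmport_trace_rec r = {
        .kind = VMPORT_TRACE_IO_DECODED,
        .is_write = is_write,
        .size = size,
    };

    return r;
}
```

No rip and no BEFORE state is stored here: rip is already in the VMEXIT
record, and the input registers can mostly be inferred from the command.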

I might consider logging something in the failure path of
*_vmexit_gp_intercept(), with information that might help you figure
out why it didn't make it to the vmcall; but on the whole I think I'd
probably leave that off.

Hopefully all that would give you enough information to figure out
where the problem was and how to reproduce the behavior locally; and
once you can reproduce it locally, you could add in debugging traces
(which wouldn't be upstreamed) to help you figure out why it wasn't
taking the path you expected.

Does that make sense?

If you really want more traces, I might consider allowing them in the
code but off by default; but I think you probably won't need more
information from running production systems.

 -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 07/16] tools: Convert vmware_port to xentrace usage
  2014-09-20 18:07 ` [PATCH for-4.5 v6 07/16] tools: " Don Slutz
@ 2014-09-25 15:18   ` George Dunlap
  0 siblings, 0 replies; 93+ messages in thread
From: George Dunlap @ 2014-09-25 15:18 UTC (permalink / raw)
  To: Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On Sat, Sep 20, 2014 at 7:07 PM, Don Slutz <dslutz@verizon.com> wrote:
> Also added missing TRAP_DEBUG & VLAPIC.

The title of this patch should have been something like, "Add gp
emulation and vmport trace records to xentrace_format" (and definitely
not exactly the same as the previous patch).

But actually it should probably be merged into the previous patch
(since technically it would be a very mild regression to have new
trace points without entries in xentrace_formats).

(Although of course I'm really bad about avoiding such regressions...)

 -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc
  2014-09-22 13:47   ` Ian Campbell
  2014-09-22 21:18     ` Don Slutz
@ 2014-09-25 16:28     ` George Dunlap
  1 sibling, 0 replies; 93+ messages in thread
From: George Dunlap @ 2014-09-25 16:28 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jun Nakajima,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/22/2014 02:47 PM, Ian Campbell wrote:
> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>> This interface is an extension of __HYPERVISOR_HVM_op.  It was
>> picked because xc_get_hvm_param() also uses it and VMware guest
>> info is a lot like a hvm param.
> Sorry if this has been discussed before, but did you consider doing all
> this in qemu rather than Xen?
>
> Unless there are frequent accesses to these things then qemu would be
> the default best place for this sort of thing, especially since as
> you've observed there is some pretty complex memory management and
> string handling which it would generally be better to avoid in the
> hypervisor.
>
> Your description of HVM_PARAM_VMPORT_RESET_TIME suggests they aren't
> typically accessed very frequently.

Well the whole architecture implies to me that VMWare have an 
unprivileged program in a service domain somewhere handling the actual 
RPC requests, almost certainly to keep all this crazy stuff out of their 
hypervisor.  We should take advantage of the asynchronous nature to keep 
it out of our hypervisor as well.

From an architectural perspective, since we're getting support for 
multiple ioreq servers, one could imagine having a special vmport ioreq 
server that would read stuff from xenstore.  But since KVM might want to 
use it at some point, it probably makes more sense to implement it there 
if it's possible.

Storing these key-value pairs in xenstore seems like the most obvious 
thing to do -- does qemu-xen have absolutely no xenstore access?

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-24 20:21                   ` Konrad Rzeszutek Wilk
@ 2014-09-26 19:03                     ` Don Slutz
  2014-09-26 19:28                       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-09-26 19:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Eddie Dong, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/24/14 16:21, Konrad Rzeszutek Wilk wrote:
> On Wed, Sep 24, 2014 at 01:19:38PM -0400, Don Slutz wrote:
>> On 09/23/14 08:30, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 13:19 -0400, Don Slutz wrote:
>> [snip]
>>
>>>>> I was only responding to the part of your comment in parentheses. :-)
>>>>>
>>>>> I suppose in large part it would depend on what the hypercalls were
>>>>> actually doing; I'd have to go back and look at them to say if they
>>>>> need to be in Xen or whether they could be passed on to qemu.
>>>>>
>>>> Clearly it is possible to pass the VCPU registers to QEMU, but that is
>>>> currently not done.
>>> I think there's an existing hypercall to get/set the state for a vcpu,
>>> perhaps it is too heavy weight to be used here though.
>> Yes, very heavy weight
>>
>>> An alternative would be a semantically higher level I/O req which took a
>>> guest pointer to a key and a guest pointer to the buffer etc, without
>>> needing the registers themselves.
>> I am looking at adding a new I/O req type for this.  It turns out that
>> for vmware_port you need to pass 6 32bit values both ways.  And
>> I can overlap the .addr, .data, .count and .size for this.  The other
>> option is to increase the size of struct ioreq, which I am assuming
>> is not the way to go since it would reduce the max number of vcpus
>> as long as "struct shared_iopage" is limited to 1 page.
>>
>> "guest pointer to a key and a guest pointer to the buffer" is not how
>> this works.  The data is all passed up to 4 bytes at a time, one chunk
>> per IN.  A string (which is what guestinfo access looks like) is passed
>> as a length, and then each 4 bytes of the string. (I am not trying to
>> say this is good.)
>>
>>
>>
>>>>    So a new
>>>> version of QEMU would also be needed to go this way.  None of the
>>>> proposed features need any data from QEMU, so I do not think this
>>>> makes sense.
>>> The concern is that it is adding a load of complex looking string and
>>> pointer manipulation stuff to the hypervisor, the sort of thing which
>>> often leads to security vulnerabilities.
>>>
>>> So that would be better done outside of Xen itself if possible, if a
>>> qemu update is the price for that then it doesn't seem so bad to me.
>> I have yet to come up with a good reason why not to move the
>> VMware port RPC code into QEMU.  I will be looking to do that for
>> Xen 4.6 & QEMU 2.3
>>
>>
>> Related to that, the code to connect Xen to QEMU so that Xen can
>> use any VMware support in QEMU is not that complex.  So adding
>> the Xen part in place of patches 8, 9, 10, 11, 12, 14, 15 and 16
>> looks doable.  This would allow X to use the VMware mouse
>> code (which is in both qemu-xen and qemu-xen-traditional).  I have
>> found this to be a great improvement in using a GUI in a guest
>> where the network speeds are not that fast.  I had planned
>> on adjusting the Xen to QEMU connector code for 4.6
>>
>> Also there is a good chance that the QEMU part could be upstreamed
>> to QEMU 2.2 (and backported to Xen's QEMU tree) for 4.5
>>
>> Now since I did not include this code sooner, would I need a release
>> exception to include the Xen to QEMU connector code?
> Yes, but without having seen the patches beforehand it might be
> a bit too late as they would be brand-new-patches.

I am ok with waiting for 4.6.  I have posted the QEMU 2.2 patch
in the hope that I can get this into that version and make things
simpler for 4.6


>>
>> One thing related to this is, should I also change qemu-xen-traditional
>> to handle the new I/O req type, or only send it if using qemu-xen?
> Just qemu-xen.

Ok, but I will see if there is a simple way to not send the new ioreq
type to qemu-xen-traditional.  If I do not find one, I need to at least
ignore the new ioreq type.
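
To illustrate the overlap mentioned in the quoted text (six 32-bit values
carried in .addr, .data, .count and .size), here is a minimal sketch.
The struct below is a simplified stand-in, not the real struct ioreq:
the field widths (two 64-bit and two 32-bit fields) are an assumption,
and the names vmport_regs, vmport_pack and vmport_unpack are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the four ioreq fields named above (widths assumed). */
struct ioreq_fields {
    uint64_t addr;
    uint64_t data;
    uint32_t count;
    uint32_t size;
};

/* The six 32-bit registers that have to travel both ways. */
struct vmport_regs {
    uint32_t eax, ebx, ecx, edx, esi, edi;
};

/* The overlay only works if both views are exactly the same size. */
static_assert(sizeof(struct vmport_regs) == sizeof(struct ioreq_fields),
              "six 32-bit registers must fit the four ioreq fields");

/* Copy the registers into the space of the four fields. */
static void vmport_pack(struct ioreq_fields *io, const struct vmport_regs *r)
{
    memcpy(io, r, sizeof(*r));
}

/* Copy the (possibly modified) registers back out. */
static void vmport_unpack(struct vmport_regs *r, const struct ioreq_fields *io)
{
    memcpy(r, io, sizeof(*r));
}
```

Using memcpy rather than a union or a pointer cast keeps the sketch free
of aliasing questions; nothing here grows the request, so the number of
vcpus that fit in one shared page is unchanged.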


>> It is simple to allow a new QEMU to build with pre-4.5 Xen and post-4.5
>> Xen.  No idea of a good way to check that a QEMU binary has this
>> support.  However I can say that enabling vmware_port does require
>> a QEMU with this support in the docs.
> And that would mean for users to take advantage of it would need
> to update their QEMU version right?

Yes and No.  The version of QEMU that is provided with Xen should
have this support.  The issue is for the people (distros?) that mix and
match Xen and QEMU.

Since I am planning on a delay until 4.6, I hope to have QEMU in a
better state by then.

     -Don Slutz

>>
>>      -Don Slutz
>>
>>> Ian.
>>>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
  2014-09-23 17:16   ` Boris Ostrovsky
  2014-09-24  8:28     ` Jan Beulich
@ 2014-09-26 19:09     ` Don Slutz
  1 sibling, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-26 19:09 UTC (permalink / raw)
  To: Boris Ostrovsky, Don Slutz, xen-devel
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	Jun Nakajima, Eddie Dong, Ian Jackson, Tim Deegan, George Dunlap,
	Aravind Gopalakrishnan, Jan Beulich, Andrew Cooper,
	Suravee Suthikulpanit


On 09/23/14 13:16, Boris Ostrovsky wrote:
> On 09/20/2014 02:07 PM, Don Slutz wrote:
>> @@ -2064,6 +2065,42 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
>>       return;
>>   }
>>   +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>> +                                    struct vcpu *v)
>> +{
>> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>> +    /*
>> +     * Just use 15 for the instruction length; vmport_gp_check will
>> +     * adjust it.  This is because
>> +     * __get_instruction_length_from_list() has issues, and may
>> +     * require a double read of the instruction bytes.  At some
>> +     * point a new routine could be added that is based on the code
>> +     * in vmport_gp_check with extensions to make it more general.
>> +     * Since that routine is the only user of this code this can be
>> +     * done later.
>> +     */
>> +    unsigned long inst_len = 15;
>
> Can you add a comment describing why you chose 15?
>
> Also, saying that __get_instruction_length_from_list() has issues I 
> think requires a bit more details (e.g. that when called from #GP 
> handler NRIP is not available, or that NRIP may not be available at 
> all on a particular HW, leading to the need read the instruction twice 
> --- once in __get_instruction_length_from_list() and then again in 
> vmport_gp_check(). Which is bad because memory may change between the 
> reads. Or something like that.).
>

Added more to the commit message about this.

    -Don Slutz

> -boris
>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 05/16] tools: Add vmware_port support
  2014-09-25 11:24           ` Ian Campbell
  2014-09-25 14:17             ` George Dunlap
@ 2014-09-26 19:19             ` Don Slutz
  1 sibling, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-26 19:19 UTC (permalink / raw)
  To: Ian Campbell, Don Slutz
  Cc: Kevin Tian, Keir Fraser, Eddie Dong, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/25/14 07:24, Ian Campbell wrote:
> On Wed, 2014-09-24 at 12:31 -0400, Don Slutz wrote:
>> On 09/23/14 08:20, Ian Campbell wrote:
>>> On Mon, 2014-09-22 at 12:42 -0400, Don Slutz wrote:

[snip]

>>> And vmware_port is not ignored. 
>>> What I suggested was "if vmware_hw is non-zero then set vmware_port".
>>>
>> I am reading that as "set vmware_port if not specified".  To avoid
>> complexity, I am treating vmware_hw as a boolean.  Using this
>> I get the following table:
>>
>> _hw   _port
>>    0     0        Just like today
>>    1     0        Only cpuid leaves change -- very unlikely
>>    1     1        Full VMware mode
>>    0     1        VMware hyper call mode.
>>
>> Adding U for unspecified:
>>
>> _hw   _port
>>    U     U        ==> _hw=0 _port=0
>>    0     U        ==> _hw=0 _port=0
>>    1     U        The case in question.
>>    U     0        ==> _hw=0 _port=0
>>    U     1        What I was talking about.
>>    0     0        Just like today
>>    1     0        Only cpuid leaves change -- very unlikely
>>    1     1        Full VMware mode
>>    0     1        VMware hyper call mode.
>>
>> The problem here is that vmware_hw is not a boolean and there is
>> currently not a value that lets you know it has not been specified.
> The unspecified value is 0, surely? All of the rows with U under _hw can
> be ignored, I am talking only about _port being a defbool.
>
>    0     U        ==> _hw=0 _port=0
>    1     U        The case in question.
>
> => libxl should convert U to 1.
>
>    0     0        Just like today
>    1     0        Only cpuid leaves change -- very unlikely
>    1     1        Full VMware mode
>    0     1        VMware hyper call mode.
>
> All reasonable things to ask for explicitly, I think (I'm not so sure
> about the last one, might be an error?).
>

This is not an error, at least according to VMware's "Mechanisms to
determine if software is running in a VMware virtual machine":

*Recommended code*

int Detect_VMware(void)
{
         if (cpuid_check())
                 return 1;               // Success running under VMware.
         else if (dmi_check() && hypervisor_port_check())
                 return 1;
         return 0;
}



If there is guest code that has trouble when the cpuid leaves say Xen
while the VMware port still "works", then yes, this is a bad config to pick.
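
For the "1 U" row, the defaulting rule quoted above (an unspecified
_port becomes 1 when _hw is non-zero) could look like this.  It is a
sketch only: enum tristate is a hypothetical stand-in for a defbool, and
resolve_vmware_port is not an existing libxl function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a defbool: unset, or explicitly 0 / 1. */
enum tristate { TRI_UNSET, TRI_FALSE, TRI_TRUE };

/*
 * vmware_hw is a number where 0 means "not specified / off";
 * vmware_port is tri-state.  An explicit setting always wins, and
 * only the unset case is derived from vmware_hw.
 */
static bool resolve_vmware_port(unsigned int vmware_hw,
                                enum tristate vmware_port)
{
    if (vmware_port == TRI_UNSET)
        return vmware_hw != 0;

    assert(vmware_port == TRI_TRUE || vmware_port == TRI_FALSE);
    return vmware_port == TRI_TRUE;
}
```

With this rule all four explicit rows of the table keep their meaning,
including _hw=0 with _port=1.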

    -Don Slutz

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-26 19:03                     ` Don Slutz
@ 2014-09-26 19:28                       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 93+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-09-26 19:28 UTC (permalink / raw)
  To: Don Slutz
  Cc: Kevin Tian, Keir Fraser, Ian Campbell, Stefano Stabellini,
	George Dunlap, Ian Jackson, Tim Deegan, xen-devel, Jan Beulich,
	Eddie Dong, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit

> >>It is simple to allow a new QEMU to build with pre-4.5 Xen and post-4.5
> >>Xen.  No idea of a good way to check that a QEMU binary has this
> >>support.  However I can say that enabling vmware_port does require
> >>a QEMU with this support in the docs.
> >And that would mean for users to take advantage of it would need
> >to update their QEMU version right?
> 
> Yes and No.  The version of QEMU that is provided with Xen should
> have this support.  The issue is for the people (distros?) that mix and
> match Xen and QEMU.
> 
> Since I am planning on a delay until 4.6, I hope to have QEMU in a
> better state by then.

OK. Sorry that it did not work out for Xen 4.5. But on the positive
side there is more runway time - and you don't have to stress about
deadlines and all that :-)

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-25 10:37           ` Tim Deegan
@ 2014-09-26 20:00             ` Don Slutz
  2014-09-29  6:50               ` Jan Beulich
  2014-10-02 10:05               ` Tim Deegan
  0 siblings, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-26 20:00 UTC (permalink / raw)
  To: Tim Deegan, Jan Beulich
  Cc: Ian Jackson, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Eddie Dong,
	Don Slutz, xen-devel, AravindGopalakrishnan, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

On 09/25/14 06:37, Tim Deegan wrote:
> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
>>> On 09/22/2014 04:34 PM, Ian Campbell wrote:
>>>> On Mon, 2014-09-22 at 16:19 +0100, George Dunlap wrote:
>>>>> On 09/22/2014 02:56 PM, Ian Campbell wrote:
>>>>>> On Sat, 2014-09-20 at 14:07 -0400, Don Slutz wrote:
>>>>>>
>>>>

[snip]

>>> I'm not 100% sure of this, but my understanding was that it *would* be a
>>> normal IO trap *if* the guest OS gave access to that IO range to the
>>> guest (via IOPL, maybe?).  But if the userspace program is not
>>> explicitly given access by the OS to those ports, it will generate a #GP
>>> instead.  The idea is to allow the "hypercall" to happen *without
>>> cooperation* from the guest OS.
>>>
>>> Again, that's my understanding, someone please correct me if I'm wrong...
>> That's indeed what was said so far. I wonder though whether opening
>> this up without guest OS consent isn't going to introduce a security
>> issue inside the guest (depending on the exact functionality of these
>> hypercalls).
> Yes indeed.  VMware seems to have CPL checks on some of the commands
> (but not all).  I guess Xen will be no worse than VMware if we do the
> same, though I'd like to have an official spec to follow for that.

Yes, VMware has CPL checks on some of the commands.  It is not at all
clear that the include file has the correct statement.  I have not done
any checking of CPL, nor does QEMU.  And the RPC (which is CPL 3) is
one of the most likely to have a security issue.

I do not know of an official spec to follow.  The best I have is
the provided include file and testing on VMware.

I do know that BDOOR_CMD_GETHZ is one that is not allowed in
ring 3, but this makes no sense to me.  I do not see why tsc_freq
and apic bus speed would be things to hide.  And VMware is not
consistent.  On newer configs this same info is available via a
cpuid leaf in ring 3.

Also I have no idea if VMware did the CPL checking "correctly".
I.e., do they assume #GP => CPL 3, or do they actually check CPL?

All this leads to me currently not checking CPL on any VMware commands.

I could look into doing this, but since the xl.cfg flag vmware_port=0
turns this all off, I do not see any need for CPL checking.
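
If CPL checking were added, one possible shape is a per-command table
consulted with the guest's actual CPL.  This is a sketch under
assumptions: the enum values are placeholders rather than the numbers
from the VMware include file, and which commands belong in the table
would have to come from that file plus testing on VMware.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder command ids; not the real BDOOR_CMD_* numbers. */
enum ex_cmd {
    EX_CMD_GETHZ,    /* stands in for BDOOR_CMD_GETHZ */
    EX_CMD_MESSAGE,  /* stands in for the RPC command */
    EX_CMD_COUNT
};

/* true = the command is refused when issued from ring 3. */
static const bool ex_denied_in_ring3[EX_CMD_COUNT] = {
    [EX_CMD_GETHZ]   = true,   /* observed: not allowed in ring 3 */
    [EX_CMD_MESSAGE] = false,  /* the RPC is used from CPL 3 */
};

/* Decide from the real CPL, not from the fact that a #GP happened. */
static bool ex_cmd_allowed(enum ex_cmd cmd, unsigned int cpl)
{
    assert(cmd < EX_CMD_COUNT && cpl <= 3);
    return cpl != 3 || !ex_denied_in_ring3[cmd];
}
```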

    -Don Slutz


> Tim.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-26 20:00             ` Don Slutz
@ 2014-09-29  6:50               ` Jan Beulich
  2014-09-29 13:27                 ` George Dunlap
  2014-10-02 10:05               ` Tim Deegan
  1 sibling, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2014-09-29  6:50 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, AravindGopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 26.09.14 at 22:00, <dslutz@verizon.com> wrote:
> On 09/25/14 06:37, Tim Deegan wrote:
>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>> That's indeed what was said so far. I wonder though whether opening
>>> this up without guest OS consent isn't going to introduce a security
>>> issue inside the guest (depending on the exact functionality of these
>>> hypercalls).
>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>> (but not all).  I guess Xen will be no worse than VMware if we do the
>> same, though I'd like to have an official spec to follow for that.
> 
> Yes, VMware has CPL checks on some of the commands.  It is not at all
> clear that the include file has the correct statement.  I have not done
> any checking of CPL, nor does QEMU.  And the RPC (which is CPL 3) is
> one of the most likely to have a security issue.
> 
> I do not know of an official spec to follow.  The best I have is
> the provided include file and testing on VMware.
> 
> I do know that BDOOR_CMD_GETHZ is one that is not allowed in
> ring 3, but this makes no sense to me.  I do not see why tsc_freq
> and apic bus speed would be things to hide.  And VMware is not
> consistent.  On newer configs this same info is available via a
> cpuid leaf in ring 3.
> 
> Also I have no idea if VMware did the CPL checking "correctly".
> I.e., do they assume #GP => CPL 3, or do they actually check CPL?
> 
> All this leads to me currently not checking CPL on any VMware commands.
> 
> I could look into doing this, but since the xl.cfg flag vmware_port=0
> turns this all off, I do not see any need for CPL checking.

Hmm, I think we need to settle on certain things here:
a) I don't think it is okay to base our emulation layer entirely
on observed behavior. At least some form of specification should
be there to follow. This is both for reviewing the code you want
committed and maintainability.
b) I don't think it is okay to introduce security issues into a guest
even if that is something that isn't enabled by default. Or at the
very least it should in such a case be clearly documented _and_
the feature should be properly marked as experimental.
c) Apparent or real flaws with VMware's native implementation
should be brought up with VMware. While mimicking their behavior
as closely as possible is certainly a desirable goal, reproducing
flaws their code has should imo be avoided if at all possible.

Jan

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-29  6:50               ` Jan Beulich
@ 2014-09-29 13:27                 ` George Dunlap
  2014-09-29 13:49                   ` Jan Beulich
  2014-09-29 23:13                   ` Don Slutz
  0 siblings, 2 replies; 93+ messages in thread
From: George Dunlap @ 2014-09-29 13:27 UTC (permalink / raw)
  To: Jan Beulich, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, AravindGopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>> On 26.09.14 at 22:00, <dslutz@verizon.com> wrote:
>> On 09/25/14 06:37, Tim Deegan wrote:
>>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>> That's indeed what was said so far. I wonder though whether opening
>>>> this up without guest OS consent isn't going to introduce a security
>>>> issue inside the guest (depending on the exact functionality of these
>>>> hypercalls).
>>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>>> (but not all).  I guess Xen will be no worse than VMware if we do the
>>> same, though I'd like to have an official spec to follow for that.
>> Yes, VMware has CPL checks on some of the commands.  It is not at all
>> clear that the include file has the correct statement.  I have not done
>> any checking of CPL, nor does QEMU.  And the RPC (which is CPL 3) is
>> one of the most likely to have a security issue.
>>
>> I do not know of an official spec to follow.  The best I have is
>> the provided include file and testing on VMware.
>>
>> I do know that BDOOR_CMD_GETHZ is one that is not allowed in
>> ring 3, but this makes no sense to me.  I do not see why tsc_freq
>> and apic bus speed would be things to hide.  And VMware is not
>> consistent.  On newer configs this same info is available via a
>> cpuid leaf in ring 3.
>>
>> Also I have no idea if VMware did the CPL checking "correctly".
>> I.e., do they assume #GP => CPL 3, or do they actually check CPL?
>>
>> All this leads to me currently not checking CPL on any VMware commands.
>>
>> I could look into doing this, but since the xl.cfg flag vmware_port=0
>> turns this all off, I do not see any need for CPL checking.
> Hmm, I think we need to settle on certain things here:
> a) I don't think it is okay to base our emulation layer entirely
> on observed behavior. At least some form of specification should
> be there to follow. This is both for reviewing the code you want
> committed and maintainability.

While that would be nice, I think that's unlikely; and overall I think 
it would be better to have a reverse-engineered implementation than no 
implementation at all.  Having a reverse-engineered spec might be a good 
idea though.

> b) I don't think it is okay to introduce security issues into a guest
> even if that is something that isn't enabled by default.

I agree with this; in particular, it's quite possible that someone will 
decide to enable VMWare functionality by default, "just in case", and 
then forget that they've done so.

> c) Apparent or real flaws with VMware's native implementation
> should be brought up with VMware. While mimicking their behavior
> as closely as possible is certainly a desirable goal, reproducing
> flaws their code has should imo be avoided if at all possible.

If our goal is compatibility with existing tools, is there really such a 
thing as "reproducing flaws"?  Obviously we shouldn't reproduce a real 
security flaw, but for everything else, if the feature is "Looks just 
like VMWare", then being as close as possible in behavior is the ideal.

  -George

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-29 13:27                 ` George Dunlap
@ 2014-09-29 13:49                   ` Jan Beulich
  2014-09-29 23:13                   ` Don Slutz
  1 sibling, 0 replies; 93+ messages in thread
From: Jan Beulich @ 2014-09-29 13:49 UTC (permalink / raw)
  To: George Dunlap, Don Slutz
  Cc: Tim Deegan, Kevin Tian, KeirFraser, Ian Campbell,
	Stefano Stabellini, JunNakajima, Andrew Cooper, IanJackson,
	xen-devel, Eddie Dong, AravindGopalakrishnan,
	SuraveeSuthikulpanit, Boris Ostrovsky

>>> On 29.09.14 at 15:27, <george.dunlap@eu.citrix.com> wrote:
> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>> c) Apparent or real flaws with VMware's native implementation
>> should be brought up with VMware. While mimicking their behavior
>> as closely as possible is certainly a desirable goal, reproducing
>> flaws their code has should imo be avoided if at all possible.
> 
> If our goal is compatibility with existing tools, is there really such a 
> thing as "reproducing flaws"?  Obviously we shouldn't reproduce a real 
> security flaw, but for everything else, if the feature is "Looks just 
> like VMWare", then being as close as possible in behavior is the ideal.

Yeah, I agree - I should have made more explicit that I really only
meant security relevant flaws here.

Jan

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage
  2014-09-25 15:14       ` George Dunlap
@ 2014-09-29 18:10         ` Don Slutz
  0 siblings, 0 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-29 18:10 UTC (permalink / raw)
  To: George Dunlap, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Ian Jackson, Eddie Dong, xen-devel,
	Jan Beulich, Aravind Gopalakrishnan, Jun Nakajima, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit


On 09/25/14 11:14, George Dunlap wrote:
> On Wed, Sep 24, 2014 at 8:07 PM, Don Slutz <dslutz@verizon.com> wrote:
>> On 09/24/14 13:27, George Dunlap wrote:
>>> On 09/20/2014 07:07 PM, Don Slutz wrote:
>>>> Reduce the VMPORT_DBG_LOG calls.
>>>
>>> You should also have mentioned that you added new HVMTRACE macros which
>>> will log the TSC value.
>>>
>>> The reason the HVMTRACE macros don't log the TSC values is that for the
>>> most part you can get all the timing information you need from the TSC on
>>> the vmexit and vmenter.  Looking at where you've added the TSC values, I
>>> don't really see how it adds anything except bloat to the log.  Is there a
>>> reason you need to know exactly when these different things happened,
>>> instead of just being able to bracket them between VMENTER and VMEXITs?
>>>
>> I did want a way to know how long the VMware code was taking.  I am
>> not sure this is required.
>>
>> For example:
>>
>> CPU1  2899550319282 (+    4170)  VMEXIT      [ exitcode = 0x00000000, rIP  = 0x00007fad13ffec8c ]
>> CPU1  2899550320086 (+     804)  TRAP_GP     [ inst_len = 1 edx = 0x00005658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
>> CPU1  2899550325054 (+    4968)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x000001b3 ecx = 0x0003001e edx = 0x00005658 esi = 0x00000000 edi = 0x000001b3 ]
>> CPU1  2899550326050 (+     996)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x000001b3 ecx = 0x0001001e edx = 0x00005658 esi = 0x00000000 edi = 0x000001b3 ]
>> CPU1  2899550326722 (+     672)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
>> CPU1  2899550327454 (+     732)  VMENTRY
>>
>>
>> But I am happy to drop the new log TSC macros.
> The tracing function itself is not free -- the trace_var() function
> probably executes 5x the amount of code that is actually executed
> between the READ_BEFORE and READ_AFTER (given it's a switch statement
> and each one is basically a handful of variable assignments).  It's
> not unlikely that most of the 996 us there is in the trace function
> itself.

OK, but the 83700 delta in the second example below is not all trace
overhead.


>>>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c
>>>> b/xen/arch/x86/hvm/vmware/vmport.c
>>>> index 811c303..962ee32 100644
>>>> --- a/xen/arch/x86/hvm/vmware/vmport.c
>>>> +++ b/xen/arch/x86/hvm/vmware/vmport.c
>>> We normally don't log both BEFORE and AFTER states of things like
>>> hypercalls -- just logging the outcome of what the hypervisor did should be
>>> sufficient, shouldn't it?
>>
>> Not that clear with this poorly built hypercall.
>>
>>> Do you really need to know the value of things that got clobbered?
>>
>> When checking on the complex state machine that is the "RPC" code
>> it was very helpful.  With that code moving to QEMU, the before and
>> after is not so important.
>>
>> Here is a real example:
>>
>> CPU2  865821836508576 (+    2562)  VMEXIT      [ exitcode = 0x00000000, rIP = 0x00007f68a8b17c8c ]
>> CPU2  865821836509362 (+     786)  TRAP_GP     [ inst_len = 1 edx = 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
>> CPU2  865821836514132 (+    4770)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0002001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
>> CPU2  865821836597832 (+   83700)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0001001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
>> CPU2  865821836598756 (+     924)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
>> CPU2  865821836605602 (+    6846)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
>> CPU2  865821836606436 (+     834)  VMENTRY
>> CPU2  865821836609712 (+    3276)  VMEXIT      [ exitcode = 0x00000000, rIP = 0x00007f68a8b17c8c ]
>> CPU2  865821836610654 (+     942)  TRAP_GP     [ inst_len = 1 edx = 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ]
>> CPU2  865821836616828 (+    6174)  VMPORT_READ_BEFORE  [ eax = 0x564d5868 ebx = 0x00000034 ecx = 0x0003001e edx = 0x00025658 esi = 0x00000000 edi = 0x000001be ]
>> CPU2  865821836617800 (+     972)  VMPORT_READ_AFTER  [ eax = 0x564d5868 ebx = 0x00000011 ecx = 0x0003001e edx = 0x00015658 esi = 0x00000000 edi = 0x000001be ]
>> CPU2  865821836618664 (+     864)  vlapic_accept_pic_intr [ i8259_target = 1, accept_pic_int = 0 ]
>> CPU2  865821836619444 (+     780)  VMENTRY
>>
>> Note that in the one "RPC" call,
>>
>>
>> grep VMPORT_READ_BEFORE ~/zz-xentrace-vmware3-0.out | wc
>>     1592   39800  256312
>>
>> It took 1592 #GP traps to handle it, and 9643628760 tsc cycles.
> Right, so what I started to say yesterday: It looks like most of the
> trace points you're adding here is to help you debug the functionality
> of the hypervisor.  That certainly makes sense for you to do during
> development.  But what we want in upstream is something that will help
> us in production.  For that to be useful, we need the logging to be as
> efficient as possible. Every additional HVM trace point potentially
> adds more data to someone else's HVM trace. So we don't want
> extraneous information, and we don't want to log something that we can
> infer from something else.
>
> In general, it's enough to give information about the decisions that
> Xen is making to infer what the previous state is; and then giving
> information about what Xen did in response (i.e., return values,
> injection of traps, &c) to help figure out how the guest responded.
> In this case, I'd probably trace:
> 1. vmport hypercall, handled command
>   - cmd
>   - return values of modified registers.  Ideally only the registers
> that are modified, but just taking a big batch would be OK for now.
>   Note: No need for rip here, as it will be collected at the VMEXIT
> 2. vmport hypercall, unhandled command
>   - Just the unimplemented fail
> 3. In the vmport_gp_check(), if it successfully decodes an IO instruction:
>   - direction (read/write)
>   - size of the access
>
> I might consider logging something the failure path of
> *_vmexit_gp_intercept(), with information that might help you figure
> out why it didn't make it to the vmcall; but on the whole I think I'd
> probably leave that off.
>
> Hopefully all that would give you enough information to figure out
> where the problem was and how to reproduce the behavior locally; and
> once you can reproduce it locally, you could add in debugging traces
> (which wouldn't be upstreamed) to help you figure out why it wasn't
> taking the path you expected.
>
> Does that make sense?

Yes.  I will attempt to cut the tracing down to this subset.

    -Don Slutz

> If you really want more traces, I might consider allowing them in the
> code but off by default; but I think you probably won't need more
> information from running production systems.
>
>   -George
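
The reduced trace subset discussed above (command number plus only the
registers that were modified) could be sketched roughly as follows.  This
is a minimal illustration, not code from the patch series: the record
layout, the enum and the function name are all hypothetical, and it
assumes the command number sits in the low 16 bits of ecx, which is
consistent with the constant 0x1e seen in the traces quoted in this
message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical record layout for a "handled vmport command" trace event.
 * None of these names come from Xen or from the patch series.
 */
enum { REG_EAX, REG_EBX, REG_ECX, REG_EDX, REG_ESI, REG_EDI, REG_COUNT };

struct vmport_trace {
    uint16_t cmd;              /* assumed: low 16 bits of ecx on entry */
    uint8_t  modified;         /* bit i set => regs[i] was changed */
    uint32_t regs[REG_COUNT];  /* register values returned to the guest */
};

/* Fill a record from the register state before and after the handler. */
static void vmport_trace_fill(struct vmport_trace *rec,
                              const uint32_t before[REG_COUNT],
                              const uint32_t after[REG_COUNT])
{
    assert(rec && before && after);

    rec->cmd = (uint16_t)(before[REG_ECX] & 0xffff);
    rec->modified = 0;

    for (int i = 0; i < REG_COUNT; i++) {
        rec->regs[i] = after[i];
        if (before[i] != after[i])
            rec->modified |= (uint8_t)(1u << i);
    }
}
```

With the CPU1 VMPORT_READ_BEFORE/VMPORT_READ_AFTER values quoted above,
only ecx differs, so a consumer of such a record would need to look at
just one register instead of twelve.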

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-29 13:27                 ` George Dunlap
  2014-09-29 13:49                   ` Jan Beulich
@ 2014-09-29 23:13                   ` Don Slutz
  2014-09-30  7:05                     ` Jan Beulich
  2014-09-30 10:09                     ` George Dunlap
  1 sibling, 2 replies; 93+ messages in thread
From: Don Slutz @ 2014-09-29 23:13 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 09/29/14 09:27, George Dunlap wrote:
> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>>> On 26.09.14 at 22:00, <dslutz@verizon.com> wrote:
>>> On 09/25/14 06:37, Tim Deegan wrote:
>>>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>>> That's indeed what was said so far. I wonder though whether opening
>>>>> this up without guest OS consent isn't going to introduce a security
>>>>> issue inside the guest (depending on the exact functionality of these
>>>>> hypercalls).
>>>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>>>> (but not all).  I guess Xen will be no worse than VMware if we do the
>>>> same, though I'd like to have an official spec to follow for that.
>>> Yes, VMware has CPL checks on some of the commands.  It is not at all
>>> clear that the include file has the correct statement.  I have not done
>>> any checking of CPL, nor does QEMU.  And the RPC (which is CPL 3) is
>>> one of the most likely to have a security issue.
>>>
>>> I do not know of an official spec to follow.  The best I have is
>>> the provided include file and testing on VMware.
>>>
>>> I do know that BDOOR_CMD_GETHZ is one that is not allowed in
>>> ring 3, but this makes no sense to me.  I do not see why tsc_freq
>>> and apic bus speed should be things to hide.  And VMware is not
>>> consistent.  On newer configs this same info is available via
>>> cpuid leaf in ring 3.
>>>
>>> Also I have no idea if VMware did the CPL checking "correctly".
>>> I.e. does a #GP imply CPL 3, or do they actually check CPL?
>>>
>>> All this is why I currently do not check CPL on any VMware commands.
>>>
>>> I could look into doing this, but since the xl.cfg flag vmware_port=0
>>> turns this all off, I do not see any need for CPL checking.
>> Hmm, I think we need to settle on certain things here:
>> a) I don't think it is okay to base our emulation layer entirely
>> on observed behavior. At least some form of specification should
>> be there to follow. This is both for reviewing the code you want
>> committed and maintainability.
>
> While that would be nice, I think that's unlikely; and overall I think 
> it would be better to have a reverse-engineered implementation than no 
> implementation at all.  Having a reverse-engineered spec might be a 
> good idea though.
>

I could work on a reverse-engineered spec.  Is having this on the wiki
good enough or does it need to be in the code?

There is an old but useful web page:

https://sites.google.com/site/chitchatvmback/backdoor

Which is basically the start of a reverse-engineered spec.

Since I am not proposing to implement all the listed commands
on that web page, I could see some use in listing the currently
supported VMware backdoor commands.

>> b) I don't think it is okay to introduce security issues into a guest
>> even if that is something that isn't enabled by default.
>
> I agree with this; in particular, it's quite possible that someone 
> will decide to enable VMWare functionality by default, "just in case", 
> and then forget that they've done so.
>

I am assuming that the phrase "security issues" is used as a
reference to things like http://xenbits.xen.org/xsa/ or
http://wiki.xen.org/wiki/Securing_Xen.

Or, as it might be stated: a way to cause a guest to crash or suffer
a DoS (Denial of Service), or a way in from outside (like "SMASH the
Bash bug").


But not the area of
http://en.wikipedia.org/wiki/Rainbow_Series or
http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria

Which talk about "Covert Channel Analysis" and other complex
security issues (like "Evaluation Assurance Level", "Trusted
Computer System Evaluation Criteria", etc.).


I feel it is "safe" to run all guests with vmware_port=1 and
vmware_hw=7.  However I am not stating that all guests function
the same with just this.  I do know that xen_platform_pci=0
may also need to be specified to get expected results.

I also do not understand the statement "enable VMWare functionality by
default".  I must be missing something because as far as I know each
guest (domU) has its own config.  Is this an xl toolstack feature (some
common config for guests)?  Or is it some other toolstack feature?


>> c) Apparent or real flaws with VMware's native implementation
>> should be brought up with VMware. While mimicking their behavior
>> as closely as possible is certainly a desirable goal, reproducing
>> flaws their code has should imo be avoided if at all possible.
>
> If our goal is compatibility with existing tools, is there really such
> a thing as "reproducing flaws"?  Obviously we shouldn't reproduce a 
> real security flaw, but for everything else, if the feature is "Looks 
> just like VMWare", then being as close as possible in behavior is the 
> ideal.
>

I can agree that it is ideal.  However, getting to this ideal place can have
a high cost.  Case in point: "BDOOR_CMD_GETHZ".  The VMware provided
include file has no CPL comments.  The same include file in "open vm
tools" does, but not for BDOOR_CMD_GETHZ.  However a VMware
test system does different things for ring 0 (aka a Linux kernel module)
and ring 3.  Neither one reports a fault, but the ring 3 one does not
return any data.  I do not know what the result is if I enable ring 3 I/O
via the TSS (since I do not know of a simple way to do that).  If I change
IOPL then it still hides information.

Now there is also the "BDOOR_CMD_GETMHZ".  It also has no CPL
statement but does return the tsc_freq in MHZ.  So why is tsc_freq
in HZ considered sensitive information, but in MHZ it is not?

Just to confuse this more, for vmware_hw=7, cpuid leaf 0x40000010
has tsc_freq in KHZ.  So again why is tsc_freq in HZ special?

How about the comment on other commands "/* CPL 0 only. */"
which appears to apply here, but the comment is missing.  So
do I check the segment register SS's DPL (what I know of as CPL),
or is it the segment register CS's DPL (which has been mistakenly called
CPL by some programs)?  Or is it something else which VMware has
decided to mean "ring 3"?


Based on all this, and since I do not see how tsc_freq in HZ could be
a "real security flaw", I do not see a reason to spend a lot of time
attempting to reverse engineer this strange behavior.

I am not saying that I would not add some type of check for "ring 3"
for this one command, but I do not see a good reason for it.  The only
case I could see is that it is one of many ways to determine that you
are running under Xen and not VMware.

    -Don Slutz

>  -George
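
To illustrate why hiding tsc_freq in HZ while exposing it in KHZ or MHZ
gains little, here is a small sketch of the unit arithmetic.  The helper
names are hypothetical, plain truncating division is assumed (how VMware
actually rounds is not known from this thread), and any frequency fed to
it is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating conversions from a frequency in Hz (assumed rounding mode). */
static uint64_t hz_to_khz(uint64_t hz) { return hz / 1000; }
static uint64_t hz_to_mhz(uint64_t hz) { return hz / 1000000; }

/*
 * How many Hz of precision are lost when a caller can only see the
 * value in kHz.  By construction this is always below 1000.
 */
static uint64_t khz_truncation_hz(uint64_t hz)
{
    uint64_t lost = hz - hz_to_khz(hz) * 1000;

    assert(lost < 1000);
    return lost;
}
```

A caller that can read the KHZ value therefore already knows the HZ value
to within 1000 Hz, which is the basis of the argument above that the HZ
variant is hard to treat as sensitive on its own.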

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-29 23:13                   ` Don Slutz
@ 2014-09-30  7:05                     ` Jan Beulich
  2014-09-30 10:02                       ` George Dunlap
  2014-09-30 10:09                     ` George Dunlap
  1 sibling, 1 reply; 93+ messages in thread
From: Jan Beulich @ 2014-09-30  7:05 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

>>> On 30.09.14 at 01:13, <dslutz@verizon.com> wrote:
> On 09/29/14 09:27, George Dunlap wrote:
>> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>> a) I don't think it is okay to base our emulation layer entirely
>>> on observed behavior. At least some form of specification should
>>> be there to follow. This is both for reviewing the code you want
>>> committed and maintainability.
>>
>> While that would be nice, I think that's unlikely; and overall I think 
>> it would be better to have a reverse-engineered implementation than no 
>> implementation at all.  Having a reverse-engineered spec might be a 
>> good idea though.
>>
> 
> I could work on a reverse-engineered spec.  Is having this on the wiki
> good enough or does it need to be in the code?

I don't think the place it's at matters that much. All that does matter
is that, if it's something outside of our control, it should be a place
that almost certainly won't go away any time soon, so that a link
placed somewhere in our tree won't become stale.

>>> b) I don't think it is okay to introduce security issues into a guest
>>> even if that is something that isn't enabled by default.
>>
>> I agree with this; in particular, it's quite possible that someone 
>> will decide to enable VMWare functionality by default, "just in case", 
>> and then forget that they've done so.
>>
> 
> I am assuming that the phrase "security issues" is used as a
> reference to things like http://xenbits.xen.org/xsa/ or
> http://wiki.xen.org/wiki/Securing_Xen.
> 
> Or, as it might be stated: a way to cause a guest to crash or suffer
> a DoS (Denial of Service), or a way in from outside (like "SMASH the
> Bash bug").
> 
> 
> But not the area of
> http://en.wikipedia.org/wiki/Rainbow_Series or
> http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria 
> 
> Which talk about "Covert Channel Analysis" and other complex
> security issues (like "Evaluation Assurance Level", "Trusted
> Computer System Evaluation Criteria", etc.).

Covert channels are considered security issues too when applying
strict criteria. But the main concern here is indeed ways for guest
user mode to badly affect the guest as a whole (or the host, but I
think that should really go without saying).

> I feel it is "safe" to run all guests with vmware_port=1 and
> vmware_hw=7.  However I am not stating that all guests function
> the same with just this.  I do know that xen_platform_pci=0
> may also need to be specified to get expected results.
> 
> I also do not understand the statement "enable VMWare functionality by
> default".  I must be missing something because as far as I know each
> guest (domU) has its own config.  Is this an xl toolstack feature (some
> common config for guests)?  Or is it some other toolstack feature?

Higher layer management tools may choose to create guest configs
that have certain settings always enabled (like at least used to be
the case in XenServer for the Viridian flag - not sure if that got
changed -, i.e. enabling this even for non-Windows guests, which
caused issues with Linux).

>>> c) Apparent or real flaws with VMware's native implementation
>>> should be brought up with VMware. While mimicking their behavior
>>> as closely as possible is certainly a desirable goal, reproducing
>>> flaws their code has should imo be avoided if at all possible.
>>
>> If our goal is compatibility with existing tools, is there really such
>> a thing as "reproducing flaws"?  Obviously we shouldn't reproduce a 
>> real security flaw, but for everything else, if the feature is "Looks 
>> just like VMWare", then being as close as possible in behavior is the 
>> ideal.
>>
> 
> I can agree that it is ideal.  However, getting to this ideal place can have
> a high cost.  Case in point: "BDOOR_CMD_GETHZ".  The VMware provided
> include file has no CPL comments.  The same include file in "open vm
> tools" does, but not for BDOOR_CMD_GETHZ.  However a VMware
> test system does different things for ring 0 (aka a Linux kernel module)
> and ring 3.  Neither one reports a fault, but the ring 3 one does not
> return any data.  I do not know what the result is if I enable ring 3 I/O
> via the TSS (since I do not know of a simple way to do that).  If I change
> IOPL then it still hides information.
> 
> Now there is also the "BDOOR_CMD_GETMHZ".  It also has no CPL
> statement but does return the tsc_freq in MHZ.  So why is tsc_freq
> in HZ considered sensitive information, but in MHZ it is not?
> 
> Just to confuse this more, for vmware_hw=7, cpuid leaf 0x40000010
> has tsc_freq in KHZ.  So again why is tsc_freq in HZ special?

I think this was just an example. All of the commands need to be
considered wrt the (guest) privilege they require.

> How about the comment on other commands "/* CPL 0 only. */"
> which appears to apply here, but the comment is missing.  So
> do I check the segment register SS's DPL (what I know of as CPL),
> or is it the segment register CS's DPL (which has been mistakenly called
> CPL by some programs)?  Or is it something else which VMware has
> decided to mean "ring 3"?

SS DPL is the canonical way for determining CPL - you can find a
number of examples throughout the code base.

Jan
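
Deriving CPL from SS's DPL, as described above, can be sketched as below.
This is a stand-alone illustration with a hypothetical, simplified segment
register structure; it is not Xen's own segment register type or accessor.
The only architectural fact relied on is that DPL occupies bits 5-6 of the
descriptor's access-rights byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a cached segment register. */
struct seg_reg {
    uint16_t sel;     /* selector */
    uint8_t  access;  /* access-rights byte: P | DPL(2 bits) | S | type(4 bits) */
};

/* DPL is bits 5-6 of the access-rights byte. */
static unsigned int seg_dpl(const struct seg_reg *seg)
{
    assert(seg != NULL);
    return (seg->access >> 5) & 3;
}

/* Current privilege level, taken from SS.DPL rather than from CS. */
static unsigned int guest_cpl(const struct seg_reg *ss)
{
    return seg_dpl(ss);
}
```

For example, an access byte of 0x93 (present, DPL 0, writable data) yields
CPL 0, while 0xf3 (present, DPL 3, writable data) yields CPL 3.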

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-30  7:05                     ` Jan Beulich
@ 2014-09-30 10:02                       ` George Dunlap
  2014-09-30 22:11                         ` Slutz, Donald Christopher
  0 siblings, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-30 10:02 UTC (permalink / raw)
  To: Jan Beulich, Don Slutz
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 09/30/2014 08:05 AM, Jan Beulich wrote:
>>>> On 30.09.14 at 01:13, <dslutz@verizon.com> wrote:
>> On 09/29/14 09:27, George Dunlap wrote:
>>> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>> a) I don't think it is okay to base our emulation layer entirely
>>>> on observed behavior. At least some form of specification should
>>>> be there to follow. This is both for reviewing the code you want
>>>> committed and maintainability.
>>> While that would be nice, I think that's unlikely; and overall I think
>>> it would be better to have a reverse-engineered implementation than no
>>> implementation at all.  Having a reverse-engineered spec might be a
>>> good idea though.
>>>
>> I could work on a reverse-engineered spec.  Is having this on the wiki
>> good enough or does it need to be in the code?
> I don't think the place it's at matters that much. All that does matter
> is if it's something outside of our control, it should be a place that
> reasonably certainly won't go away any time soon, so that a link
> placed somewhere in our tree won't become stale.

I think long term it would make sense to have a document in-tree that 
describes what the code is trying to do.

>
>>>> b) I don't think it is okay to introduce security issues into a guest
>>>> even if that is something that isn't enabled by default.
>>> I agree with this; in particular, it's quite possible that someone
>>> will decide to enable VMWare functionality by default, "just in case",
>>> and then forget that they've done so.
>>>
>> I am assuming that the phrase "security issues" is used as a
>> reference to things like http://xenbits.xen.org/xsa/ or
>> http://wiki.xen.org/wiki/Securing_Xen.
>>
>> Or, as it might be stated: a way to cause a guest to crash or suffer
>> a DoS (Denial of Service), or a way in from outside (like "SMASH the
>> Bash bug").
>>
>>
>> But not the area of
>> http://en.wikipedia.org/wiki/Rainbow_Series or
>> http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria
>>
>> Which talk about "Covert Channel Analysis" and other complex
>> security issues (like "Evaluation Assurance Level", "Trusted
>> Computer System Evaluation Criteria", etc.).
> Covert channels are considered security issues too when applying
> strict criteria. But the main concern here is indeed ways for guest
> user mode to badly affect the guest as a whole (or the host, but I
> think that should really go without saying).

Just to bring home the point -- this code makes it so that some 
instructions, namely IO instructions, running with no privilege checks 
in ring 3, can access certain extra bits of potentially arbitrarily 
complicated "virtual hardware" functionality which the OS doesn't know 
anything about and has no way to contain or prevent.  This opens up the 
possibility that there's a bug in the functionality somehow (either in 
how VMWare implements it, or how we implement it) which an attacker can 
leverage to gain privileges within the guest.

I think Jan's point is that *we* need to be thinking carefully about the 
functionality itself, and how we implement it, to make sure (as far as 
we are able) that we don't introduce such a vulnerability. Saying "this 
is the observed functionality of VMWare" isn't enough, because, well, 
they're not perfect. :-)

>> I feel it is "safe" to run all guests with vmware_port=1 and
>> vmware_hw=7.  However I am not stating that all guests function
>> the same with just this.  I do know that xen_platform_pci=0
>> may also need to be specified to get expected results.
>>
>> I also do not understand the statement "enable VMWare functionality by
>> default".  I must be missing something because as far as I know each
>> guest (domU) has its own config.  Is this an xl toolstack feature (some
>> common config for guests)?  Or is it some other toolstack feature?
> Higher layer management tools may choose to create guest configs
> that have certain settings always enabled (like at least used to be
> the case in XenServer for the Viridian flag - not sure if that got
> changed -, i.e. enabling this even for non-Windows guests, which
> caused issues with Linux).

Or "vmware_hw=7" gets into a "howto" on the internet and is mindlessly
copied.  Or it ends up in a template which is then cloned over and over
again without checking.  Don't vmdk's include some guest configuration as
well?  Or as Jan said, XenServer or OpenStack or CloudStack or
XenOrchestra or oVirt set it as a default, because it can't hurt, right?

  -George
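
One way to make the per-command privilege question explicit is a small
policy table consulted before any command handler runs.  The sketch below
is only an illustration of that idea: the command numbers, type names and
function are hypothetical (they are not real BDOOR_CMD_* values or Xen
interfaces), and unknown commands are refused by default as a deliberately
conservative choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command numbers; not real BDOOR_CMD_* values. */
enum { CMD_EXAMPLE_ANY_RING = 0x100, CMD_EXAMPLE_RING0_ONLY = 0x101 };

enum cmd_verdict { CMD_ALLOW, CMD_DENY_PRIVILEGE, CMD_DENY_UNKNOWN };

struct cmd_policy {
    uint16_t cmd;        /* command number */
    bool     cpl0_only;  /* true if only guest ring 0 may issue it */
};

static const struct cmd_policy cmd_table[] = {
    { CMD_EXAMPLE_ANY_RING,   false },
    { CMD_EXAMPLE_RING0_ONLY, true  },
};

/* Decide whether a command may run at the caller's privilege level. */
static enum cmd_verdict cmd_check(uint16_t cmd, unsigned int cpl)
{
    assert(cpl <= 3);

    for (size_t i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++) {
        if (cmd_table[i].cmd != cmd)
            continue;
        if (cmd_table[i].cpl0_only && cpl != 0)
            return CMD_DENY_PRIVILEGE;
        return CMD_ALLOW;
    }

    /* Commands not listed in the table are refused. */
    return CMD_DENY_UNKNOWN;
}
```

Keeping the policy in one table means each supported command has a
reviewable privilege decision attached to it, instead of the decision
being spread across individual handlers.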

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-29 23:13                   ` Don Slutz
  2014-09-30  7:05                     ` Jan Beulich
@ 2014-09-30 10:09                     ` George Dunlap
  2014-09-30 22:23                       ` Slutz, Donald Christopher
  1 sibling, 1 reply; 93+ messages in thread
From: George Dunlap @ 2014-09-30 10:09 UTC (permalink / raw)
  To: Don Slutz, Jan Beulich
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 09/30/2014 12:13 AM, Don Slutz wrote:
> On 09/29/14 09:27, George Dunlap wrote:
>> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>>>> On 26.09.14 at 22:00, <dslutz@verizon.com> wrote:
>>>> On 09/25/14 06:37, Tim Deegan wrote:
>>>>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>>>> That's indeed what was said so far. I wonder though whether opening
>>>>>> this up without guest OS consent isn't going to introduce a security
>>>>>> issue inside the guest (depending on the exact functionality of 
>>>>>> these
>>>>>> hypercalls).
>>>>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>>>>> (but not all).  I guess Xen will be no worse than VMware if we do the
>>>>> same, though I'd like to have an official spec to follow for that.
>>>> Yes, VMware has CPL checks on some of the commands.  It is not at all
>>>> clear that the include file has the correct statement.  I have not done
>>>> any checking of CPL, nor does QEMU.  And the RPC (which is CPL 3) is
>>>> one of the most likely to have a security issue.
>>>>
>>>> I do not know of an official spec to follow.  The best I have is
>>>> the provided include file and testing on VMware.
>>>>
>>>> I do know that BDOOR_CMD_GETHZ is one that is not allowed in
>>>> ring 3, but this makes no sense to me.  I do not see why tsc_freq
>>>> and apic bus speed should be things to hide.  And VMware is not
>>>> consistent.  On newer configs this same info is available via
>>>> cpuid leaf in ring 3.
>>>>
>>>> Also I have no idea if VMware did the CPL checking "correctly".
>>>> I.e. does a #GP imply CPL 3, or do they actually check CPL?
>>>>
>>>> All this is why I currently do not check CPL on any VMware commands.
>>>>
>>>> I could look into doing this, but since the xl.cfg flag vmware_port=0
>>>> turns this all off, I do not see any need for CPL checking.
>>> Hmm, I think we need to settle on certain things here:
>>> a) I don't think it is okay to base our emulation layer entirely
>>> on observed behavior. At least some form of specification should
>>> be there to follow. This is both for reviewing the code you want
>>> committed and maintainability.
>>
>> While that would be nice, I think that's unlikely; and overall I 
>> think it would be better to have a reverse-engineered implementation 
>> than no implementation at all.  Having a reverse-engineered spec 
>> might be a good idea though.
>>
>
> I could work on a reverse-engineered spec.  Is having this on the wiki
> good enough or does it need to be in the code?
>
> There is an old but useful web page:
>
> https://sites.google.com/site/chitchatvmback/backdoor
>
> Which is basically the start of a reverse-engineered spec.
>
> Since I am not proposing to implement all the listed commands
> on that web page, I could see some use in listing the currently
> supported VMware backdoor commands.
>
>>> b) I don't think it is okay to introduce security issues into a guest
>>> even if that is something that isn't enabled by default.
>>
>> I agree with this; in particular, it's quite possible that someone 
>> will decide to enable VMWare functionality by default, "just in 
>> case", and then forget that they've done so.
>>
>
> I am assuming that the phrase "security issues" is used as a
> reference to things like http://xenbits.xen.org/xsa/ or
> http://wiki.xen.org/wiki/Securing_Xen.
>
> Or, as it might be stated: a way to cause a guest to crash or suffer
> a DoS (Denial of Service), or a way in from outside (like "SMASH the
> Bash bug").
>
>
> But not the area of
> http://en.wikipedia.org/wiki/Rainbow_Series or
> http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria
>
> Which talk about "Covert Channel Analysis" and other complex
> security issues (like "Evaluation Assurance Level", "Trusted
> Computer System Evaluation Criteria", etc.).
>
>
> I feel it is "safe" to run all guests with vmware_port=1 and
> vmware_hw=7.  However I am not stating that all guests function
> the same with just this.  I do know that xen_platform_pci=0
> may also need to be specified to get expected results.
>
> I also do not understand the statement "enable VMWare functionality by
> default".  I must be missing something because as far as I know each
> guest (domU) has its own config.  Is this an xl toolstack feature (some
> common config for guests)?  Or is it some other toolstack feature?
>
>
>>> c) Apparent or real flaws with VMware's native implementation
>>> should be brought up with VMware. While mimicking their behavior
>>> as closely as possible is certainly a desirable goal, reproducing
>>> flaws their code has should imo be avoided if at all possible.
>>
>> If our goal is compatibility with existing tools, is there really such
>> a thing as "reproducing flaws"?  Obviously we shouldn't reproduce a 
>> real security flaw, but for everything else, if the feature is "Looks 
>> just like VMWare", then being as close as possible in behavior is the 
>> ideal.
>>
>
> I can agree that it is ideal.  However, getting to this ideal place can
> have a high cost.  Case in point: "BDOOR_CMD_GETHZ".  The VMware provided
> include file has no CPL comments.  The same include file in "open vm
> tools" does, but not for BDOOR_CMD_GETHZ.  However a VMware
> test system does different things for ring 0 (aka a Linux kernel module)
> and ring 3.  Neither one reports a fault, but the ring 3 one does not
> return any data.  I do not know what the result is if I enable ring 3 I/O
> via the TSS (since I do not know of a simple way to do that).  If I
> change IOPL then it still hides information.
>
> Now there is also the "BDOOR_CMD_GETMHZ".  It also has no CPL
> statement but does return the tsc_freq in MHZ.  So why is tsc_freq
> in HZ considered sensitive information, but in MHZ it is not?
>
> Just to confuse this more, for vmware_hw=7, cpuid leaf 0x40000010
> has tsc_freq in KHZ.  So again why is tsc_freq in HZ special?
>
> How about the comment on other commands "/* CPL 0 only. */"
> which appears to apply here, but the comment is missing.  So
> do I check the segment register SS's DPL (what I know of as CPL),
> or is it the segment register CS's DPL (which has been mistakenly called
> CPL by some programs)?  Or is it something else which VMware has
> decided to mean "ring 3"?
>
>
> Based on all this, and since I do not see how tsc_freq in HZ could be
> a "real security flaw", I do not see a reason to spend a lot of time
> attempting to reverse engineer this strange behavior.
>
> I am not saying that I would not add some type of check for "ring 3"
> for this one command, but I do not see a good reason for it.  The only
> case I could see is that it is one of many ways to determine that you
> are running under Xen and not VMware.

Sure; the actual "feature" is for VMWare tools to work as expected 
within the guest.  Having the behavior be exactly the same as VMWare is 
one way of making sure those tools work; the amount of time you spend 
duplicating the exact behavior vs just making sure the tools work as 
expected is a sort of cost/benefits analysis.

The main risk of deviating from VMWare is if there are corners of the 
tool functionality you don't test, or if VMWare updates their tools and 
the new version is compatible with old VMWare hypervisors but not old 
Xen hypervisors.

  -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-30 10:02                       ` George Dunlap
@ 2014-09-30 22:11                         ` Slutz, Donald Christopher
  0 siblings, 0 replies; 93+ messages in thread
From: Slutz, Donald Christopher @ 2014-09-30 22:11 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jan Beulich,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky, Ian Campbell

On 09/30/14 06:02, George Dunlap wrote:
> On 09/30/2014 08:05 AM, Jan Beulich wrote:
>>>>> On 30.09.14 at 01:13, <dslutz@verizon.com> wrote:
>>> On 09/29/14 09:27, George Dunlap wrote:
>>>> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>>> a) I don't think it is okay to base our emulation layer entirely
>>>>> on observed behavior. At least some form of specification should
>>>>> be there to follow. This is both for reviewing the code you want
>>>>> committed and maintainability.
>>>> While that would be nice, I think that's unlikely; and overall I think
>>>> it would be better to have a reverse-engineered implementation than no
>>>> implementation at all.  Having a reverse-engineered spec might be a
>>>> good idea though.
>>>>
>>> I could work on a reverse-engineered spec.  Is having this on the wiki
>>> good enough or does it need to be in the code?
>> I don't think the place it's at matters that much. All that does matter
>> is if it's something outside of our control, it should be a place that
>> reasonably certainly won't go away any time soon, so that a link
>> placed somewhere in our tree won't become stale.
>
> I think long term it would make sense to have a document in-tree that 
> describes what the code is trying to do.
>

Ok.

>>
>>>>> b) I don't think it is okay to introduce security issues into a guest
>>>>> even if that is something that isn't enabled by default.
>>>> I agree with this; in particular, it's quite possible that someone
>>>> will decide to enable VMWare functionality by default, "just in case",
>>>> and then forget that they've done so.
>>>>
>>> I am assuming that the phrase "security issues" is used as a
>>> reference to things like http://xenbits.xen.org/xsa/ or
>>> http://wiki.xen.org/wiki/Securing_Xen.
>>>
>>> Or as it might be stated -- A way to cause a guest to crash or have
>>> a DoS (Denial of Service) or a way in from outside (like "SMASH the
>>> Bash bug").
>>>
>>>
>>> But not the area of
>>> http://en.wikipedia.org/wiki/Rainbow_Series or
>>> http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria 
>>>
>>>
>>> Which talks about "Covert Channel Analysis" and other complex
>>> security issues (like "Evaluation Assurance Level", "Trusted
>>> Computer System Evaluation Criteria", etc.).
>> Covert channels are considered security issues too when applying
>> strict criteria. But the main concern here are indeed ways for guest
>> user mode to badly affect the guest as a whole (or the host, but I
>> think that should really go without saying).
>
> Just to bring home the point -- this code makes it so that some 
> instructions, namely IO instructions, running with no privilege checks 
> in ring 3, can access certain extra bits of potentially arbitrarily 
> complicated "virtual hardware" functionality which the OS doesn't know 
> anything about and has no way to contain or prevent.  This opens up 
> the possibility that there's a bug in the functionality somehow 
> (either in how VMWare implements it, or how we implement it) which an 
> attacker can leverage to gain privileges within the guest.
>
> I think Jan's point is that *we* need to be thinking carefully about 
> the functionality itself, and how we implement it, to make sure (as 
> far as we are able) that we don't introduce such a vulnerability. 
> Saying "this is the observed functionality of VMWare" isn't enough, 
> because, well, they're not perfect. :-)
>

Well, now that the VMware "RPC" has been moved to QEMU (which does
add a new dimension to this), what is left is not that complex.  It adds
8 ways to get "hypervisor" info, and one that sets eax (rax) to 0.  So I
am sure that all this code does not enable any gain of privileges within
the guest.  It might add a covert channel, but that is not how you gain
privileges within the guest.

I fully agree that *we* need to be thinking carefully about the
functionality.  And none of my statements about the security depend on
the observed functionality of VMware; they are all about the code posted
here.

>>> I feel it is "safe" to run all guests with vmware_port=1 and
>>> vmware_hw=7.  However I am not stating that all guests function
>>> the same with just this.  I do know that xen_platform_pci=0
>>> may also need to be specified to get expected results.
>>>
>>> I also do not understand the statement "enable VMWare functionality by
>>> default".  I must be missing something because as far as I know each
>>> guest (domU) has its own config.  Is this an xl tool stack feature
>>> (some common config for guests)? Or is it some other tool stack feature?
>> Higher layer management tools may choose to create guest configs
>> that have certain settings always enabled (like at least used to be
>> the case in XenServer for the Viridian flag - not sure if that got
>> changed -, i.e. enabling this even for non-Windows guests, which
>> caused issues with Linux).
>
> Or "vmware_hw=7" gets into a "howto" on the internet and mindlessly 
> copied.  Or a template which is then cloned over and over again 
> without checking.  Don't vmdk's include some guest configuration as 
> well?  Or as Jan said, XenServer or OpenStack or CloudStack or 
> XenOrchestra or oVirt set it as a default, because it can't hurt, right?
>

vmdk's can have guest configuration (not sure how that fits in).
I have attempted to avoid saying "vmware_hw=7" cannot hurt.  I know that
people can write code that does bad things.  To follow on your example
(as I understand it), Linux had an issue with Viridian and Xen being on
at the same time, and there were Viridian features missing (or just a
code path that was not tested).  However I am confident that enabling
these features will not add new security issues at this time.  I have no
issue with stating: "Do not mindlessly set either vmware_hw or
vmware_port".

    -Don Slutz



>  -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-30 10:09                     ` George Dunlap
@ 2014-09-30 22:23                       ` Slutz, Donald Christopher
  0 siblings, 0 replies; 93+ messages in thread
From: Slutz, Donald Christopher @ 2014-09-30 22:23 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Kevin Tian, Keir Fraser, Jan Beulich,
	Stefano Stabellini, Jun Nakajima, Andrew Cooper, Ian Jackson,
	xen-devel, Eddie Dong, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky, Ian Campbell

On 09/30/14 06:09, George Dunlap wrote:
> On 09/30/2014 12:13 AM, Don Slutz wrote:
>> On 09/29/14 09:27, George Dunlap wrote:
>>> On 09/29/2014 07:50 AM, Jan Beulich wrote:
>>>>>>> On 26.09.14 at 22:00, <dslutz@verizon.com> wrote:
>>>>> On 09/25/14 06:37, Tim Deegan wrote:
>>>>>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>>>>> That's indeed what was said so far. I wonder though whether opening
>>>>>>> this up without guest OS consent isn't going to introduce a security
>>>>>>> issue inside the guest (depending on the exact functionality of
>>>>>>> these hypercalls).
>>>>>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>>>>>> (but not all).  I guess Xen will be no worse than VMware if we do
>>>>>> the same, though I'd like to have an official spec to follow for that.
>>>>> Yes, VMware has CPL checks on some of the commands.  Not at all
>>>>> clear the include file has the correct statement.  I have not done any
>>>>> checking of CPL nor does QEMU.  And the RPC (which is CPL 3) is
>>>>> one of the most likely to have a security issue.
>>>>>
>>>>> I do not know of an official spec to follow.  The best I have is
>>>>> the provided include file and testing on VMware.
>>>>>
>>>>> I do know that BDOOR_CMD_GETHZ is one that is not allowed in
>>>>> ring 3, but this makes no sense to me.  I do not see why tsc_freq
>>>>> and apic bus speed to be things to hide.  And VMware is not
>>>>> consistent.  On newer configs this same info is available via
>>>>> cpuid leaf in ring 3.
>>>>>
>>>>> Also I have no idea if VMware did the CPL checking "correctly".
>>>>> I.E. is a #GP => CPL 3, or do they check CPL?
>>>>>
>>>>> All this leads to: I currently do not check CPL on any VMware commands.
>>>>>
>>>>> I could look into doing this, but since the xl.cfg flag vmware_port=0
>>>>> turns this all off, I do not see any need for CPL checking.
>>>> Hmm, I think we need to settle on certain things here:
>>>> a) I don't think it is okay to base our emulation layer entirely
>>>> on observed behavior. At least some form of specification should
>>>> be there to follow. This is both for reviewing the code you want
>>>> committed and maintainability.
>>>
>>> While that would be nice, I think that's unlikely; and overall I 
>>> think it would be better to have a reverse-engineered implementation 
>>> than no implementation at all.  Having a reverse-engineered spec 
>>> might be a good idea though.
>>>
>>
>> I could work on a reverse-engineered spec.  Is having this on the wiki
>> good enough or does it need to be in the code?
>>
>> There is an old but useful web page:
>>
>> https://sites.google.com/site/chitchatvmback/backdoor
>>
>> Which is basically the start of a reverse-engineered spec.
>>
>> Since I am not proposing to implement all the listed commands
>> on that web page, I could see some use in listing the currently
>> supported VMware backdoor commands.
>>
>>>> b) I don't think it is okay to introduce security issues into a guest
>>>> even if that is something that isn't enabled by default.
>>>
>>> I agree with this; in particular, it's quite possible that someone 
>>> will decide to enable VMWare functionality by default, "just in 
>>> case", and then forget that they've done so.
>>>
>>
>> I am assuming that the phrase "security issues" is used as a
>> reference to things like http://xenbits.xen.org/xsa/ or
>> http://wiki.xen.org/wiki/Securing_Xen.
>>
>> Or as it might be stated -- A way to cause a guest to crash or have
>> a DoS (Denial of Service) or a way in from outside (like "SMASH the
>> Bash bug").
>>
>>
>> But not the area of
>> http://en.wikipedia.org/wiki/Rainbow_Series or
>> http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria
>>
>> Which talks about "Covert Channel Analysis" and other complex
>> security issues (like "Evaluation Assurance Level", "Trusted
>> Computer System Evaluation Criteria", etc.).
>>
>>
>> I feel it is "safe" to run all guests with vmware_port=1 and
>> vmware_hw=7.  However I am not stating that all guests function
>> the same with just this.  I do know that xen_platform_pci=0
>> may also need to be specified to get expected results.
>>
>> I also do not understand the statement "enable VMWare functionality by
>> default".  I must be missing something because as far as I know each
>> guest (domU) has its own config.  Is this an xl tool stack feature (some
>> common config for guests)? Or is it some other tool stack feature?
>>
>>
>>>> c) Apparent or real flaws with VMware's native implementation
>>>> should be brought up with VMware. While mimicking their behavior
>>>> as closely as possible is certainly a desirable goal, reproducing
>>>> flaws their code has should imo be avoided if at all possible.
>>>
>>> If our goal is compatibility with existing tools, is there really
>>> such a thing as "reproducing flaws"?  Obviously we shouldn't 
>>> reproduce a real security flaw, but for everything else, if the 
>>> feature is "Looks just like VMWare", then being as close as possible 
>>> in behavior is the ideal.
>>>
>>
>> I can agree that it is ideal.  However getting to this ideal place
>> can have a high cost.  Case in point "BDOOR_CMD_GETHZ".  The VMware
>> provided include file has no CPL comments.  The same include file in
>> "open vm tools" does, but not for BDOOR_CMD_GETHZ.  However a VMware
>> test system does different things for ring 0 (aka a Linux kernel module)
>> and ring 3.  Neither one reports a fault, but the ring 3 one does not
>> return any data.  I do not know what the result is if I enable ring 3
>> I/O via the TSS (since I do not know of a simple way to do that). If I
>> change IOPL then it still hides information.
>>
>> Now there is also the "BDOOR_CMD_GETMHZ".  It also has no CPL
>> statement but does return the tsc_freq in MHZ.  So why is tsc_freq
>> in HZ considered sensitive information, but in MHZ it is not?
>>
>> Just to confuse this more, for vmware_hw=7, cpuid leaf 0x40000010
>> has tsc_freq in KHZ.  So again why is tsc_freq in HZ special?
>>
>> How about the comment on other commands "/* CPL 0 only. */"
>> which appears to apply here, but the comment is missing.  So
>> do I check the segment register SS's DPL (what I know of as CPL),
>> or is it the segment register CS's DPL (which has been mistakenly called
>> CPL by some programs)?  Or is it something else which VMware has
>> decided to mean "ring 3"?
>>
>>
>> Based on all this, and since I do not see how tsc_freq in HZ could be
>> a "real security flaw", I do not see a reason to spend a lot of time
>> attempting to reverse engineer this strange behavior.
>>
>> I am not saying that I would not add some type of check for "ring 3"
>> for this one command, but I do not see a good reason for it. The only
>> case I could see is that it is one of many ways to determine that you
>> are running under Xen and not VMware.
>
> Sure; the actual "feature" is for VMWare tools to work as expected 
> within the guest.  Having the behavior be exactly the same as VMWare 
> is one way of making sure those tools work; the amount of time you 
> spend duplicating the exact behavior vs just making sure the tools 
> work as expected is a sort of cost/benefits analysis.
>

Yes.  What I was trying to say (and was not clear, I guess) is that there
is no simple way to test a lot of the corner functionality.  Since the
VMware tools have a large part that runs in ring 3, they do not use this
backdoor command.  So I saw a cost in adding a check and almost 0
benefit, and went with the cheap way.
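
To make the units point from the quoted text concrete, here is a minimal sketch showing that the MHz and KHz values are just the Hz value at lower resolution.  The helper names are hypothetical and not from this series, and plain truncation is an assumption; VMware's actual rounding is not documented here.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not from the patch series): derive the values a
 * guest could see through the MHz command and the KHz cpuid leaf from
 * one underlying tsc_freq in Hz.  Truncating division is an assumption.
 */
static inline uint32_t tsc_hz_to_mhz(uint64_t tsc_hz)
{
    return (uint32_t)(tsc_hz / 1000000u);
}

static inline uint32_t tsc_hz_to_khz(uint64_t tsc_hz)
{
    return (uint32_t)(tsc_hz / 1000u);
}
```

Under that assumption, a ring 3 reader of the KHz value already knows tsc_freq to within 1000 Hz, which is why hiding only the Hz variant looks inconsistent.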

> The main risk of deviating from VMWare is if there are corners of the 
> tool functionality you don't test, or if VMWare updates their tools 
> and the new version is compatible with old VMWare hypervisors but not 
> old Xen hypervisors.
>

Yes.  I have added a check for the 1 command that has different actions
based on "dpl of ss" == 0 (which is all that I have tested on VMware).
So the next version will be even closer.
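
For readers following the "dpl of ss" remark, a minimal sketch of how a CPL value could be derived from the SS access-rights byte.  The helper names and the packed attribute layout are assumptions for illustration only; this is not the code that will be in the next version.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * In an x86 segment descriptor's access-rights byte, bits 5-6 hold the
 * DPL.  The macro and function names below are hypothetical.
 */
#define SEG_ATTR_DPL_SHIFT 5
#define SEG_ATTR_DPL_MASK  0x3u

/* Treat the DPL of SS as the caller's current privilege level. */
static inline unsigned int cpl_from_ss_attr(uint16_t ss_attr)
{
    return (ss_attr >> SEG_ATTR_DPL_SHIFT) & SEG_ATTR_DPL_MASK;
}

/* True only for a ring 0 caller, i.e. "dpl of ss" == 0. */
static inline bool caller_is_ring0(uint16_t ss_attr)
{
    return cpl_from_ss_attr(ss_attr) == 0;
}
```

In Xen the attribute value would presumably come from the guest's SS register state (for example via hvm_get_segment_register()); that wiring is left out of the sketch.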

     -Don Slutz

>  -George


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-09-26 20:00             ` Don Slutz
  2014-09-29  6:50               ` Jan Beulich
@ 2014-10-02 10:05               ` Tim Deegan
  2014-10-02 19:20                 ` Don Slutz
  1 sibling, 1 reply; 93+ messages in thread
From: Tim Deegan @ 2014-10-02 10:05 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Ian Jackson,
	Boris Ostrovsky, Suravee Suthikulpanit

At 16:00 -0400 on 26 Sep (1411743641), Don Slutz wrote:
> On 09/25/14 06:37, Tim Deegan wrote:
> > At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
> >>>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
> >> That's indeed what was said so far. I wonder though whether opening
> >> this up without guest OS consent isn't going to introduce a security
> >> issue inside the guest (depending on the exact functionality of these
> >> hypercalls).
> > Yes indeed.  VMware seems to have CPL checks on some of the commands
> > (but not all).  I guess Xen will be no worse than VMware if we do the
> > same, though I'd like to have an official spec to follow for that.
> 
> Yes, VMware has CPL checks on some of the commands.  Not at all
> clear the include file has the correct statement.  I have not done any
> checking of CPL nor does QEMU.

That needs to be fixed somewhere.  If Xen/Qemu is going to provide
this interface it _must_ copy the privilege checks, even if we don't
understand why they're there -- in fact, _especially_ if we don't
understand why they're there! :)

If the third-party header file isn't a reliable source, you'll have to
determine the correct behaviour by experiment.
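
As a rough illustration of what copying the privilege checks could look like, here is a minimal sketch of a per-command gate.  The command numbers are believed to match the open-vm-tools header but should be treated as assumptions, and the register struct, function names and register layout are hypothetical.  The "no fault, no data" behaviour for ring 3 follows what Don reported observing for BDOOR_CMD_GETHZ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; verify against the VMware-provided include file. */
#define BDOOR_CMD_GETMHZ  1u
#define BDOOR_CMD_GETHZ  45u

/* Hypothetical register view, only what this sketch needs. */
struct bdoor_regs {
    uint32_t eax;
    uint32_t ebx;
};

/* Commands found (by experiment) to return data only at CPL 0. */
static bool bdoor_cmd_needs_cpl0(uint32_t cmd)
{
    switch (cmd) {
    case BDOOR_CMD_GETHZ:
        return true;
    default:
        return false;
    }
}

/*
 * Returns true if the command filled in data.  A restricted command
 * issued from CPL > 0 raises no fault and leaves the registers alone.
 */
static bool bdoor_emulate(uint32_t cmd, unsigned int cpl,
                          uint64_t tsc_hz, struct bdoor_regs *regs)
{
    if (bdoor_cmd_needs_cpl0(cmd) && cpl != 0)
        return false;

    switch (cmd) {
    case BDOOR_CMD_GETMHZ:
        regs->eax = (uint32_t)(tsc_hz / 1000000u);
        return true;
    case BDOOR_CMD_GETHZ:
        /* Illustrative layout: low half in eax, high half in ebx. */
        regs->eax = (uint32_t)tsc_hz;
        regs->ebx = (uint32_t)(tsc_hz >> 32);
        return true;
    default:
        return false;
    }
}
```

Keeping the restricted commands in one table-like function means each experimentally determined check is recorded in a single reviewable place.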

> I could look into doing this, but since the xl.cfg flag vmware_port=0
> turns this all off, I do not see any need for CPL checking.

I strongly disagree with this.  If our implementation of this
interface makes guest OSes less secure than they would be under actual
VMware then the config option is irrelevant.

Cheers,

Tim.


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-10-02 10:05               ` Tim Deegan
@ 2014-10-02 19:20                 ` Don Slutz
  2014-10-03  7:09                   ` Tim Deegan
  0 siblings, 1 reply; 93+ messages in thread
From: Don Slutz @ 2014-10-02 19:20 UTC (permalink / raw)
  To: Tim Deegan, Don Slutz
  Cc: Jun Nakajima, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Ian Jackson,
	Boris Ostrovsky, Suravee Suthikulpanit

On 10/02/14 06:05, Tim Deegan wrote:
> At 16:00 -0400 on 26 Sep (1411743641), Don Slutz wrote:
>> On 09/25/14 06:37, Tim Deegan wrote:
>>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
>>>>>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
>>>> That's indeed what was said so far. I wonder though whether opening
>>>> this up without guest OS consent isn't going to introduce a security
>>>> issue inside the guest (depending on the exact functionality of these
>>>> hypercalls).
>>> Yes indeed.  VMware seems to have CPL checks on some of the commands
>>> (but not all).  I guess Xen will be no worse than VMware if we do the
>>> same, though I'd like to have an official spec to follow for that.
>> Yes, VMware has CPL checks on some of the commands.  Not at all
>> clear the include file has the correct statement.  I have not done any
>> checking of CPL nor does QEMU.
> That needs to be fixed somewhere.  If Xen/Qemu is going to provide
> this interface it _must_ copy the privilege checks, even if we don't
> understand why they're there -- in fact, _especially_ if we don't
> understand why they're there! :)
>
> If the third-party header file isn't a reliable source, you'll have to
> determine the correct behaviour by experiment.

I have done this.  Will be adding the check.

>> I could look into doing this, but since the xl.cfg flag vmware_port=0
>> turns this all off, I do not see any need for CPL checking.
> I strongly disagree with this.  If our implementation of this
> interface makes guest OSes less secure than they would be under actual
> VMware then the config option is irrelevant.

Ok.

    -Don Slutz

> Cheers,
>
> Tim.


* Re: [PATCH for-4.5 v6 00/16] Xen VMware tools support
  2014-10-02 19:20                 ` Don Slutz
@ 2014-10-03  7:09                   ` Tim Deegan
  0 siblings, 0 replies; 93+ messages in thread
From: Tim Deegan @ 2014-10-03  7:09 UTC (permalink / raw)
  To: Don Slutz
  Cc: Jun Nakajima, Kevin Tian, Keir Fraser, Ian Campbell,
	Stefano Stabellini, George Dunlap, Andrew Cooper, Eddie Dong,
	xen-devel, Aravind Gopalakrishnan, Jan Beulich, Ian Jackson,
	Boris Ostrovsky, Suravee Suthikulpanit

At 15:20 -0400 on 02 Oct (1412259615), Don Slutz wrote:
> On 10/02/14 06:05, Tim Deegan wrote:
> > At 16:00 -0400 on 26 Sep (1411743641), Don Slutz wrote:
> >> On 09/25/14 06:37, Tim Deegan wrote:
> >>> At 17:18 +0100 on 22 Sep (1411402700), Jan Beulich wrote:
> >>>>>>> On 22.09.14 at 17:38, <george.dunlap@eu.citrix.com> wrote:
> >>>> That's indeed what was said so far. I wonder though whether opening
> >>>> this up without guest OS consent isn't going to introduce a security
> >>>> issue inside the guest (depending on the exact functionality of these
> >>>> hypercalls).
> >>> Yes indeed.  VMware seems to have CPL checks on some of the commands
> >>> (but not all).  I guess Xen will be no worse than VMware if we do the
> >>> same, though I'd like to have an official spec to follow for that.
> >> Yes, VMware has CPL checks on some of the commands.  Not at all
> >> clear the include file has the correct statement.  I have not done any
> >> checking of CPL nor does QEMU.
> > That needs to be fixed somewhere.  If Xen/Qemu is going to provide
> > this interface it _must_ copy the privilege checks, even if we don't
> > understand why they're there -- in fact, _especially_ if we don't
> > understand why they're there! :)
> >
> > If the third-party header file isn't a reliable source, you'll have to
> > determine the correct behaviour by experiment.
> 
> I have done this.  Will be adding the check.

Great, thanks!

Tim.


end of thread, other threads:[~2014-10-03  7:09 UTC | newest]

Thread overview: 93+ messages
2014-09-20 18:07 [PATCH for-4.5 v6 00/16] Xen VMware tools support Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 01/16] xen: Add support for VMware cpuid leaves Don Slutz
2014-09-22 11:49   ` Andrew Cooper
2014-09-22 16:53     ` Don Slutz
2014-09-24 14:33   ` George Dunlap
2014-09-20 18:07 ` [PATCH for-4.5 v6 02/16] tools: Add vmware_hw support Don Slutz
2014-09-22 13:34   ` Ian Campbell
2014-09-22 22:08     ` Don Slutz
2014-09-24 14:44   ` George Dunlap
2014-09-24 21:06     ` Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 03/16] vmware: Add VMware provided include files Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 04/16] xen: Add vmware_port support Don Slutz
2014-09-23 17:16   ` Boris Ostrovsky
2014-09-24  8:28     ` Jan Beulich
2014-09-26 19:09     ` Don Slutz
2014-09-24 16:01   ` George Dunlap
2014-09-24 16:48     ` Don Slutz
2014-09-24 17:42       ` Andrew Cooper
2014-09-20 18:07 ` [PATCH for-4.5 v6 05/16] tools: " Don Slutz
2014-09-22 13:41   ` Ian Campbell
2014-09-22 16:34     ` Andrew Cooper
2014-09-22 21:22       ` Don Slutz
2014-09-24 16:24         ` George Dunlap
2014-09-24 18:25           ` Don Slutz
2014-09-22 16:42     ` Don Slutz
2014-09-23 12:20       ` Ian Campbell
2014-09-24 16:31         ` Don Slutz
2014-09-24 16:44           ` George Dunlap
2014-09-24 18:29             ` Don Slutz
2014-09-25 11:24           ` Ian Campbell
2014-09-25 14:17             ` George Dunlap
2014-09-25 14:21               ` Ian Campbell
2014-09-26 19:19             ` Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Don Slutz
2014-09-24 17:27   ` George Dunlap
2014-09-24 19:07     ` Don Slutz
2014-09-25 15:14       ` George Dunlap
2014-09-29 18:10         ` Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 07/16] tools: " Don Slutz
2014-09-25 15:18   ` George Dunlap
2014-09-20 18:07 ` [PATCH for-4.5 v6 08/16] xen: Add limited support of VMware's hyper-call rpc Don Slutz
2014-09-22 13:47   ` Ian Campbell
2014-09-22 21:18     ` Don Slutz
2014-09-23 12:34       ` Ian Campbell
2014-09-23 22:03         ` Slutz, Donald Christopher
2014-09-25 16:28     ` George Dunlap
2014-09-20 18:07 ` [PATCH for-4.5 v6 09/16] tools: " Don Slutz
2014-09-22 13:52   ` Ian Campbell
2014-09-22 21:32     ` Don Slutz
2014-09-23 12:35       ` Ian Campbell
2014-09-20 18:07 ` [PATCH for-4.5 v6 10/16] Add VMware tool's triggers Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 11/16] Add live migration of VMware's hyper-call RPC Don Slutz
2014-09-20 18:07 ` [PATCH for-4.5 v6 12/16] Add dump of HVM_SAVE_CODE(VMPORT) to xen-hvmctx Don Slutz
2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 13/16] Add xen-hvm-param Don Slutz
2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 14/16] Add xen-vmware-guestinfo Don Slutz
2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 15/16] Add xen-list-vmware-guestinfo Don Slutz
2014-09-20 18:07 ` [OPTIONAL][PATCH for-4.5 v6 16/16] Add xen-hvm-send-trigger Don Slutz
2014-09-22 13:56 ` [PATCH for-4.5 v6 00/16] Xen VMware tools support Ian Campbell
2014-09-22 15:19   ` George Dunlap
2014-09-22 15:34     ` Ian Campbell
2014-09-22 15:38       ` George Dunlap
2014-09-22 15:50         ` Ian Campbell
2014-09-22 15:55           ` George Dunlap
2014-09-22 17:19             ` Don Slutz
2014-09-22 22:00               ` Tian, Kevin
2014-09-23 12:30               ` Ian Campbell
2014-09-23 12:35                 ` George Dunlap
2014-09-23 12:40                   ` Ian Campbell
2014-09-24 15:52                 ` George Dunlap
2014-09-24 18:09                   ` Don Slutz
2014-09-24 17:19                 ` Don Slutz
2014-09-24 20:21                   ` Konrad Rzeszutek Wilk
2014-09-26 19:03                     ` Don Slutz
2014-09-26 19:28                       ` Konrad Rzeszutek Wilk
2014-09-25 11:35                   ` Ian Campbell
2014-09-22 16:18         ` Jan Beulich
2014-09-22 18:32           ` Don Slutz
2014-09-25 10:37           ` Tim Deegan
2014-09-26 20:00             ` Don Slutz
2014-09-29  6:50               ` Jan Beulich
2014-09-29 13:27                 ` George Dunlap
2014-09-29 13:49                   ` Jan Beulich
2014-09-29 23:13                   ` Don Slutz
2014-09-30  7:05                     ` Jan Beulich
2014-09-30 10:02                       ` George Dunlap
2014-09-30 22:11                         ` Slutz, Donald Christopher
2014-09-30 10:09                     ` George Dunlap
2014-09-30 22:23                       ` Slutz, Donald Christopher
2014-10-02 10:05               ` Tim Deegan
2014-10-02 19:20                 ` Don Slutz
2014-10-03  7:09                   ` Tim Deegan
2014-09-22 15:52       ` Andrew Cooper
2014-09-22 18:39         ` Don Slutz
