From mboxrd@z Thu Jan 1 00:00:00 1970 From: George Dunlap Subject: Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Date: Thu, 25 Sep 2014 16:14:30 +0100 Message-ID: References: <1411236447-7435-1-git-send-email-dslutz@verizon.com> <1411236447-7435-7-git-send-email-dslutz@verizon.com> <5422FF1D.8000308@eu.citrix.com> <5423167A.1070901@terremark.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5423167A.1070901@terremark.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Don Slutz Cc: Tim Deegan , Kevin Tian , Keir Fraser , Ian Campbell , Stefano Stabellini , Ian Jackson , Eddie Dong , "xen-devel@lists.xen.org" , Jan Beulich , Aravind Gopalakrishnan , Jun Nakajima , Andrew Cooper , Boris Ostrovsky , Suravee Suthikulpanit List-Id: xen-devel@lists.xenproject.org On Wed, Sep 24, 2014 at 8:07 PM, Don Slutz wrote: > > On 09/24/14 13:27, George Dunlap wrote: >> >> On 09/20/2014 07:07 PM, Don Slutz wrote: >>> >>> Reduce the VMPORT_DBG_LOG calls. >> >> >> You should also have mentioned that you added hew HVMTRACE macros which >> will log the TSC value. >> >> The reason the HVMTRACE macros don't log the TSC values is that for the >> most part you can get all the timing information you need from the TSC on >> the vmexit and vmenter. Looking at where you've added the TSC values, I >> don't really see how it adds anything except bloat to the log. Is there a >> reason you need to know exactly when these different things happened, >> instead of just being able to bracket them between VMENTER and VMEXITs? >> > > I did want a way to know how long the VMware code was taking. I am > not sure this is required. > > For example: > > CPU1 2899550319282 (+ 4170) VMEXIT [ exitcode = 0x00000000, rIP = > 0x00007fad13ffec8c ] > CPU1 2899550320086 (+ 804) TRAP_GP [ inst_len = 1 edx = 0x00005658 > exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] > CPU1 2899550325054 (+ 4968) VMPORT_READ_BEFORE [ eax = 0x564d5868 ebx > = 0x000001b3 ecx = 0x0003001e edx = 0x00005658 esi = 0x00000000 edi = > 0x000001b3 ] > CPU1 2899550326050 (+ 996) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx = > 0x000001b3 ecx = 0x0001001e edx = 0x00005658 esi = 0x00000000 edi = > 0x000001b3 ] > CPU1 2899550326722 (+ 672) vlapic_accept_pic_intr [ i8259_target = 1, > accept_pic_int = 0 ] > CPU1 2899550327454 (+ 732) VMENTRY > > > But I am happy to drop the new log TSC macros. The tracing function itself is not free -- the trace_var() function probably executes 5x the amount of code that is actually executed between the READ_BEFORE and READ_AFTER (given it's a switch statement and each one is basically a handful of variable assignments). It's not unlikely that most of the 996 us there is in the trace function itself. >>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c >>> b/xen/arch/x86/hvm/vmware/vmport.c >>> index 811c303..962ee32 100644 >>> --- a/xen/arch/x86/hvm/vmware/vmport.c >>> +++ b/xen/arch/x86/hvm/vmware/vmport.c > >> We normally don't log both BEFORE and AFTER states of things like >> hypercalls -- just logging the outcome of what the hypervisor did should be >> sufficient, shouldn't it? > > > Not that clear with this poorly build hyper-call. > >> Do you really need to know the value of things that got clobbered? > > > When checking on the complex state machine that is the "RPC" code > it was very helpful. With that code moving to QEMU, the before and > after is not so important. > > Here is a real example: > > CPU2 865821836508576 (+ 2562) VMEXIT [ exitcode = 0x00000000, rIP > = 0x00007f68a8b17c8c ] > CPU2 865821836509362 (+ 786) TRAP_GP [ inst_len = 1 edx = > 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] > CPU2 865821836514132 (+ 4770) VMPORT_READ_BEFORE [ eax = 0x564d5868 > ebx = 0x00000034 ecx = 0x0002001e edx = 0x00025658 esi = 0x00000000 edi = > 0x000001be ] > CPU2 865821836597832 (+ 83700) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx > = 0x00000034 ecx = 0x0001001e edx = 0x00025658 esi = 0x00000000 edi = > 0x000001be ] > CPU2 865821836598756 (+ 924) vlapic_accept_pic_intr [ i8259_target = > 1, accept_pic_int = 0 ] > CPU2 865821836605602 (+ 6846) vlapic_accept_pic_intr [ i8259_target = > 1, accept_pic_int = 0 ] > CPU2 865821836606436 (+ 834) VMENTRY > CPU2 865821836609712 (+ 3276) VMEXIT [ exitcode = 0x00000000, rIP > = 0x00007f68a8b17c8c ] > CPU2 865821836610654 (+ 942) TRAP_GP [ inst_len = 1 edx = > 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] > CPU2 865821836616828 (+ 6174) VMPORT_READ_BEFORE [ eax = 0x564d5868 > ebx = 0x00000034 ecx = 0x0003001e edx = 0x00025658 esi = 0x00000000 edi = > 0x000001be ] > CPU2 865821836617800 (+ 972) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx > = 0x00000011 ecx = 0x0003001e edx = 0x00015658 esi = 0x00000000 edi = > 0x000001be ] > CPU2 865821836618664 (+ 864) vlapic_accept_pic_intr [ i8259_target = > 1, accept_pic_int = 0 ] > CPU2 865821836619444 (+ 780) VMENTRY > > Note that in the one "RPC" call, > > > grep VMPORT_READ_BEFORE ~/zz-xentrace-vmware3-0.out | wc > 1592 39800 256312 > > It took 1592 #GP traps to handle it, and 9643628760 tsc cycles. Right, so what I started to say yesterday: It looks like most of the trace points you're adding here is to help you debug the functionality of the hypervisor. That certainly makes sense for you to do during development. But what we want in upstream is something that will help us in production. For that to be useful, we need the logging to be as efficient as possible. Every additional HVM trace point potentially adds more data to someone else's HVM trace. So we don't want extraneous information, and we don't want to log something that we can infer from something else. In general, it's enough to give information about the decisions that Xen is making to infer what the previous state is; and then giving information about what Xen did in response (i.e., return values, injection of traps, &c) to help figure out how the guest responded. In this case, I'd probably trace: 1. vmport hypercall, handled command - cmd - return values of modified registers. Ideally only the registers that are modified, but just taking a big batch would be OK for now. Note: No need for rip here, as it will be collected at the VMEXIT 2. vmport hypercall, unhandled command - Just the unimplemented fail 3. In the vmport_gp_check(), if it successfully decodes an IO instruction: - direction (read/write) - size of the access I might consider logging something the failure path of *_vmexit_gp_intercept(), with information that might help you figure out why it didn't make it to the vmcall; but on the whole I think I'd probably leave that off. Hopefully all that would give you enough information to figure out where the problem was and how to reproduce the behavior locally; and once you can reproduce it locally, you could add in debugging traces (which wouldn't be upstreamed) to help you figure out why it wasn't taking the path you expected. Does that make sense? If you really want more traces, I might consider allowing them in the code but off by default; but I think you probably won't need more information from running production systems. -George