From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jan Beulich" <JBeulich@suse.com>
Subject: Re: [PATCH v13 7/8] Add IOREQ_TYPE_VMWARE_PORT
Date: Mon, 11 Jan 2016 06:50:19 -0700
Message-ID: <5693C12B02000078000C56AE@prv-mh.provo.novell.com>
References: <1448747105-19966-1-git-send-email-Don.Slutz@Gmail.com>
	<1448747105-19966-8-git-send-email-Don.Slutz@Gmail.com>
	<5678164C02000078000C1E86@prv-mh.provo.novell.com>
	<5692B42F.1030201@Gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5692B42F.1030201@Gmail.com>
Content-Disposition: inline
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Don Slutz <don.slutz@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>, Tim Deegan <tim@xen.org>, Kevin Tian <kevin.tian@intel.com>, Wei Liu <wei.liu2@citrix.com>, Ian Campbell <ian.campbell@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <george.dunlap@eu.citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Ian Jackson <ian.jackson@eu.citrix.com>, Don Slutz <dslutz@verizon.com>, xen-devel@lists.xen.org, Eddie Dong <eddie.dong@intel.com>, Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>, Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>, Keir Fraser <keir@xen.org>, Boris Ostrovsky <boris.ostrovsky@oracle.com>
List-Id: xen-devel@lists.xenproject.org

>>> On 10.01.16 at 20:42, <don.slutz@gmail.com> wrote:
> On 12/21/15 09:10, Jan Beulich wrote:
>>>>> On 28.11.15 at 22:45, <don.slutz@gmail.com> wrote:
>>> @@ -167,26 +168,65 @@ static int hvmemul_do_io(
>>>          vio->io_req.state = STATE_IOREQ_NONE;
>>>          break;
>>>      case X86EMUL_UNHANDLEABLE:
>>> -    {
>>> -        struct hvm_ioreq_server *s =
>>> -            hvm_select_ioreq_server(curr->domain, &p);
> ...
>>> +                if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )
>>> +                    vio->io_req.state = STATE_IOREQ_NONE;
>>> +            }
>>>          }
>>>          break;
>> 
>> Especially if there is no common code to be broken out at the
>> beginning or end of these new if/else branches, then please avoid
>> re-indenting the entire existing code block (perhaps by making your
>> new code a if ( unlikely(is_vmware) ) ... else if ( ... ) prefix), to aid
>> reviewing.
>> 
> 
> The suggested if above gives:
> [...]

Bad, and not really what was meant with the comment.

> Which I feel is just as hard to review from the diff.  The only way I
> know of to reduce the diff is:
> [...]

This is mostly what was meant, with ...

>      {
>          struct hvm_ioreq_server *s =
>              hvm_select_ioreq_server(curr->domain, &p);

... the declaration (but not the initializer) retained ahead of the
vmware code block (which also wants a variable of that type).

>>> @@ -536,6 +575,21 @@ void hvm_do_resume(struct vcpu *v)
>>>          handle_mmio();
>>>          break;
>>>      case HVMIO_pio_completion:
>>> +        if ( vio->io_req.type == IOREQ_TYPE_VMWARE_PORT ) {
>>> +            vmware_regs_t *vr = get_vmport_regs_any(NULL, v);
>>> +
>>> +            if ( vr )
>>> +            {
>>> +                struct cpu_user_regs *regs = guest_cpu_user_regs();
>>> +
>>> +                /* Only change the 32bit part of the register. */
>>> +                regs->_ebx = vr->ebx;
>>> +                regs->_ecx = vr->ecx;
>>> +                regs->_edx = vr->edx;
>>> +                regs->_esi = vr->esi;
>>> +                regs->_edi = vr->edi;
>>> +            }
>> 
>> Just like in one of the other patches the comment need extending to
>> say _why_ you only change the 32-bit parts (for future readers to
>> not consider this a bug).
>> 
> 
> Will change to this:
> 
> 
> +                /* The code in QEMU that uses these registers,
> +                 * vmport.c and vmmouse.c, only uses only the 32bit
> +                 * part of the register.  This is how VMware defined
> +                 * the use of these registers.
> +                 */
> +                regs->_ebx = vr->ebx;

Hopefully with proper coding style, and ideally with one of the two
"only" removed.

>>> @@ -618,22 +672,56 @@ static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
>>>          set_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
>>>  }
>>>  
>>> -static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
>>> +typedef enum {
>>> +    IOREQ_PAGE_TYPE_IOREQ,
>>> +    IOREQ_PAGE_TYPE_BUFIOREQ,
>>> +    IOREQ_PAGE_TYPE_VMPORT,
>> 
>> Lower case please (and this being a local enum the prefix probably
>> could be shortened).
> 
> Ok, how about ioreq_pt_?

Fine with me.

>>> +} ioreq_page_type_t;
>>> +
>>> +static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, ioreq_page_type_t buf)
>> 
>> The parameter should no longer be named "buf".
> 
> pt or page_type?

Whichever you prefer.

>>> @@ -948,12 +1056,38 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>>      for ( i = 0; i < NR_IO_RANGE_TYPES; i++ )
>>>      {
>>>          char *name;
>>> +        char *type_name = NULL;
>>> +        unsigned int limit;
>>>  
>>> -        rc = asprintf(&name, "ioreq_server %d %s", s->id,
>>> -                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
>>> -                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
>>> -                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
>>> -                      "");
>>> +        switch ( i )
>>> +        {
>>> +        case HVMOP_IO_RANGE_PORT:
>>> +            type_name = "port";
>>> +            limit = MAX_NR_IO_RANGES;
>>> +            break;
>>> +        case HVMOP_IO_RANGE_MEMORY:
>>> +            type_name = "memory";
>>> +            limit = MAX_NR_IO_RANGES;
>>> +            break;
>>> +        case HVMOP_IO_RANGE_PCI:
>>> +            type_name = "pci";
>>> +            limit = MAX_NR_IO_RANGES;
>>> +            break;
>>> +        case HVMOP_IO_RANGE_VMWARE_PORT:
>>> +            type_name = "VMware port";
>>> +            limit = 1;
>>> +            break;
>> 
>> Do you really need to set up a (dummy) range set for this?
> 
> Yes, this is to allow an ioreq server to at any time enable or disable
> sending of an ioreq that is of the type IOREQ_TYPE_VMWARE_PORT to it.

Not sure I understand how this is meant to answer th question.

>>> +        case HVMOP_IO_RANGE_TIMEOFFSET:
>>> +            type_name = "timeoffset";
>>> +            limit = 1;
>>> +            break;
>> 
>> This case didn't exist before, and is unrelated to your change.
>> 
> 
> That is true.  However Paul asked me to add it and since it was a very
> small addition (these lines) I saw no reason to add a seperate patch.
> It would either have to follow this patch or include the expansion of
> s->range and to include missing range handling.

Well, it sort of depends on the resolution of the issue above: This
is a dummy as well, and I'd prefer to get away without dummies if
possible.

>>> @@ -2558,9 +2701,6 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>      if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>>>          return NULL;
>>>  
>>> -    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>>> -        return d->arch.hvm_domain.default_ioreq_server;
>> 
>> I'd prefer if this shortcut could stay.
> 
> I do not understand why.

Again best to be viewed in conjunction with the comment on the
dummies you add above.

>>> +{
>>> +    struct vcpu *curr = current;
>> 
>> You don't really need this local variable.
> 
> ok, and use d instead of currd?

No, why would you?

>>> +    struct domain *currd = curr->domain;
>>> +
>>> +    if ( port + bytes > BDOOR_PORT && port < BDOOR_PORT + 4 &&
>> 
>> This will match for e.g. port == BDOOR_PORT + 1 and bytes == 4,
>> which I don't think is correct.
> 
> My testing on VMware says that this is what happens there. I.E. the
> VMware "device" respondes to 4 ports.  Not that you can cleanly use the
> other 3 since EAX is not fully available.

But accessing BDOOR_PORT + 1 with a 4-byte operation ought to
be undefined (or properly split up). After all you don't know what is
on BDOOR_PORT + 4.

>>> @@ -66,11 +69,25 @@ struct ioreq {
>>>  };
>>>  typedef struct ioreq ioreq_t;
>>>  
>>> +struct vmware_regs {
>>> +    uint32_t esi;
>>> +    uint32_t edi;
>>> +    uint32_t ebx;
>>> +    uint32_t ecx;
>>> +    uint32_t edx;
>>> +};
>>> +typedef struct vmware_regs vmware_regs_t;
>> 
>> I doubt you need the typedef (considering the use below), and I
>> don't think having something prefixed vmware_ in the Xen public
>> headers is a good idea.
> 
> QEMU has used this typedef since QEMU 2.2.0 released on Dec 9, 2014.
> 
> Went in by pull request:
> 
> http://lists.xenproject.org/archives/html/xen-devel/2014-10/msg03705.html 
> 
> 
> The thread started at
> 
> http://lists.xenproject.org/archives/html/xen-devel/2014-09/msg04346.html 
> 
> During this time was when the name could be easily changed.  I know of
> no simple way to do so now.  Any change like this needs to start in QEMU
> and be accepted up-stream and in Xen's version of up-stream QEMU. After
> that Xen is changed.

All of this may be true and fine, but none of this is - to me - a
reason to introduce new unclean names into the Xen public
interface. In no event do I see qemu dictating naming to us.

>> Also throughout the series I didn't find any code addition to
>> guarantee (perhaps at build time) that BDOOR_PORT doesn't
>> collide with any other use ports (statically assigned ones as well
>> as the range used for dynamic assignment to PCI devices).
> 
> Since this is optional code, I am having an issue understanding this
> statement. Xen will not do anything with BDOOR_PORT when vmware_port=0.
> 
> I do not know how at build time to check for run time optional items.
> 
> What ports are in use include what QEMU has.
> 
> My understanding was that configuration issues like overlapping or
> multiple uses of I/O port is the users issue not Xen.
> 
> I do not see any code in xen that checks for this for other ports. So
> it is not clear what the set of port in xen needs to be checked.  The
> default range used for dynamic assignment to PCI devices is 0xc000 -
> 0xffff, which does not mean that the guest is prevented from adjusting
> pci bridges so that BDOOR_PORT is an overlap.  But that is true of a lot
> of the rest of the ports in Xen.
> 
> register_portio_handler() could be changed to check at run time for Xen.

That would be a runtime check, and hence too late. My concern is
for someone to change one of the definitions not realizing that such
a change would result in an overlap of ranges, which is known and
hence could be checked for at compile time.

Jan