From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH v3 05/18] x86/hvm: remove multiple open coded 'chunking' loops Date: Wed, 24 Jun 2015 10:38:24 +0100 Message-ID: <558A96B00200007800088BD9@mail.emea.novell.com> References: <1435055997-30017-1-git-send-email-paul.durrant@citrix.com> <1435055997-30017-6-git-send-email-paul.durrant@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Z7h8W-0007NG-NW for xen-devel@lists.xenproject.org; Wed, 24 Jun 2015 09:38:28 +0000 In-Reply-To: <1435055997-30017-6-git-send-email-paul.durrant@citrix.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Paul Durrant Cc: Andrew Cooper , Keir Fraser , xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org >>> On 23.06.15 at 12:39, wrote: > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -540,6 +540,115 @@ static int hvmemul_virtual_to_linear( > return X86EMUL_EXCEPTION; > } > > +static int hvmemul_phys_mmio_access( > + paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer, > + unsigned int *off) Why this (buffer, off) pair? The caller can easily adjust buffer as necessary, avoiding the other parameter altogether. And buffer itself can be void * just like it is in some of the callers (and the others should follow suit). > +{ > + unsigned long one_rep = 1; > + unsigned int chunk; > + int rc = 0; > + > + /* Accesses must fall within a page */ > + if ( (gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE ) > + return X86EMUL_UNHANDLEABLE; As for patch 4 - this imposes a restriction that real hardware doesn't have, and hence this needs to be replaced by adjusting the one caller not currently guaranteeing this such that it caps the size. > + /* > + * hvmemul_do_io() cannot handle non-power-of-2 accesses or > + * accesses larger than sizeof(long), so choose the highest power > + * of 2 not exceeding sizeof(long) as the 'chunk' size. > + */ > + chunk = 1 << (fls(size) - 1); > + if ( chunk > sizeof (long) ) > + chunk = sizeof (long); I suppose you intentionally generalize this; if so this should be mentioned in the commit message. This is particularly because it results in changed behavior (which isn't to say that I'm sure the previous way was any better in the sense of being closer to what real hardware does): Right now, an 8 byte access at the last byte of a page would get carried out as 8 1-byte accesses. Your change makes it a 1-, 4-, 2-, and 1-byte access in that order. Also, considering instruction characteristics (as explained in the original comment) I think the old way of determining the chunk size may have been cheaper than yours using fls(). > + > + while ( size != 0 ) > + { > + rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0, > + &buffer[*off]); > + if ( rc != X86EMUL_OKAY ) > + break; > + > + /* Advance to the next chunk */ > + gpa += chunk; > + *off += chunk; > + size -= chunk; > + > + /* > + * If the chunk now exceeds the remaining size, choose the next > + * lowest power of 2 that will fit. > + */ > + while ( chunk > size ) > + chunk >>= 1; Please avoid this loop when size == 0. Since the function won't be called with size being zero, I think the loop should be a for ( ; ; ) one with the loop exit condition put in the middle. > @@ -549,52 +658,26 @@ static int __hvmemul_read( > struct hvm_emulate_ctxt *hvmemul_ctxt) > { > struct vcpu *curr = current; > - unsigned long addr, reps = 1; > - unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER); > + unsigned long addr, one_rep = 1; > uint32_t pfec = PFEC_page_present; > struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io; > - paddr_t gpa; > int rc; > > rc = hvmemul_virtual_to_linear( > - seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr); > + seg, offset, bytes, &one_rep, access_type, hvmemul_ctxt, &addr); > if ( rc != X86EMUL_OKAY ) > return rc; > - off = addr & (PAGE_SIZE - 1); > - /* > - * We only need to handle sizes actual instruction operands can have. All > - * such sizes are either powers of 2 or the sum of two powers of 2. Thus > - * picking as initial chunk size the largest power of 2 not greater than > - * the total size will always result in only power-of-2 size requests > - * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2). > - */ > - while ( chunk & (chunk - 1) ) > - chunk &= chunk - 1; > - if ( off + bytes > PAGE_SIZE ) > - while ( off & (chunk - 1) ) > - chunk >>= 1; > > if ( ((access_type != hvm_access_insn_fetch > ? vio->mmio_access.read_access > : vio->mmio_access.insn_fetch)) && > (vio->mmio_gva == (addr & PAGE_MASK)) ) > { > - gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off); > - while ( (off + chunk) <= PAGE_SIZE ) > - { > - rc = hvmemul_do_mmio_buffer(gpa, &reps, chunk, IOREQ_READ, 0, > - p_data); > - if ( rc != X86EMUL_OKAY || bytes == chunk ) > - return rc; > - off += chunk; > - gpa += chunk; > - p_data += chunk; > - bytes -= chunk; > - if ( bytes < chunk ) > - chunk = bytes; > - } > + unsigned int off = 0; > + paddr_t gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | If you already touch this, pfn_to_paddr() please. Jan