From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefano Stabellini Subject: Re: [PATCH RFC 1 2/8] xen/pt: Sync up the dev.config and data values. Date: Wed, 1 Jul 2015 14:48:40 +0100 Message-ID: References: <1435611725-15161-1-git-send-email-konrad.wilk@oracle.com> <1435611725-15161-3-git-send-email-konrad.wilk@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZAIOa-0005Ds-Na for xen-devel@lists.xenproject.org; Wed, 01 Jul 2015 13:49:48 +0000 In-Reply-To: <1435611725-15161-3-git-send-email-konrad.wilk@oracle.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk Cc: xen-devel@lists.xenproject.org, qemu-devel@nongnu.org, stefano.stabellini@eu.citrix.com List-Id: xen-devel@lists.xenproject.org On Mon, 29 Jun 2015, Konrad Rzeszutek Wilk wrote: > For a passthrough device we maintain a state of emulated > registers value contained within d->config. We also consult > the host registers (and apply ro and write masks) whenever > the guest access the registers. This is done in xen_pt_pci_write_config > and xen_pt_pci_read_config. > > Also in this picture we call pci_default_write_config which > updates the d->config and if the d->config[PCI_COMMAND] register > has PCI_COMMAND_MEMORY (or PCI_COMMAND_IO) acts on those changes. > > On startup the d->config[PCI_COMMAND] are the host values, not > what the guest initial values should be, which is exactly what > we do _not_ want to do for 64-bit BARs when the guest just wants > to read the size of the BAR. Huh you say? > > To get the size of 64-bit memory space BARs, the guest has > to calculate ((BAR[x] & 0xFFFFFFF0) + ((BAR[x+1] & 0xFFFFFFFF) << 32)) > which means it has to do two writes of ~0 to BARx and BARx+1. > > prior to this patch and with XSA120-addendum patch (Linux kernel) > the PCI_COMMAND register is copied from the host it can have > PCI_COMMAND_MEMORY bit set which means that QEMU will try to > update the hypervisor's P2M with BARx+1 value to ~0 (0xffffffff) > (to sync the guest state to host) instead of just having > xen_pt_pci_write_config and xen_pt_bar_reg_write apply the > proper masks and return the size to the guest. > > To thwart this, this patch syncs up the host values with the > guest values taking into account the emu_mask (bit set means > we emulate, PCI_COMMAND_MEMORY and PCI_COMMAND_IO are set). > That is we copy the host values - masking out any bits which > we will emulate. Then merge it with the initial emulation register > values. There is also some reg->size accounting taken > into consideration - which could be removed. > > This fixes errors such as these: > > (XEN) memory_map:add: dom2 gfn=fffe0 mfn=fbce0 nr=20 > (DEBUG) 189 pci dev 04:0 BAR16 wrote ~0. > (DEBUG) 200 pci dev 04:0 BAR16 read 0x0fffe0004. > (XEN) memory_map:remove: dom2 gfn=fffe0 mfn=fbce0 nr=20 > (DEBUG) 204 pci dev 04:0 BAR16 wrote 0x0fffe0004. > (DEBUG) 217 pci dev 04:0 BAR16 read upper 0x000000000. > (XEN) memory_map:add: dom2 gfn=ffffffff00000 mfn=fbce0 nr=20 > (XEN) p2m.c:883:d0v0 p2m_set_entry failed! mfn=ffffffffffffffff rc:-22 > (XEN) memory_map:fail: dom2 gfn=ffffffff00000 mfn=fbce0 nr=20 ret:-22 > (XEN) memory_map:remove: dom2 gfn=ffffffff00000 mfn=fbce0 nr=20 > (XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=ffffffff00000 type:4 > (XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=ffffffff00001 type:4 > .. > (XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff] > (DEBUG) 222 pci dev 04:0 BAR16 read upper 0x0ffffffff. > (XEN) memory_map:remove: dom2 gfn=ffffffff00000 mfn=fbce0 nr=20 > (XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff] > > [The DEBUG is to illustate what the hvmloader was doing] > > Reported-by: Sander Eikelenboom > Signed-off-by: Konrad Rzeszutek Wilk > --- > hw/xen/xen_pt_config_init.c | 45 ++++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 44 insertions(+), 1 deletion(-) > > diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c > index e34f9f8..91c3a14 100644 > --- a/hw/xen/xen_pt_config_init.c > +++ b/hw/xen/xen_pt_config_init.c > @@ -1856,6 +1856,10 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s, > reg_entry->reg = reg; > > if (reg->init) { > + uint32_t host_mask, size_mask; > + unsigned int offset; > + uint32_t val; > + > /* initialize emulate register */ > rc = reg->init(s, reg_entry->reg, > reg_grp->base_offset + reg->offset, &data); > @@ -1868,8 +1872,47 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s, > g_free(reg_entry); > return 0; > } > + /* Sync up the data to dev.config */ > + offset = reg_grp->base_offset + reg->offset; > + size_mask = 0xFFFFFFFF >> ((4 - reg->size) << 3); > + > + if (xen_host_pci_get_long(&s->real_device, offset, &val)) > + val = pci_get_long(s->dev.config + offset); /* Pfff... */ I don't understand this, a more helpful comment would be helpful > + /* Set bits in emu_mask are the ones we emulate. The dev.config shall > + * contain the emulated view of the guest - therefore we flip the mask > + * to mask out the host values (which dev.config initially has) . */ > + host_mask = size_mask & ~reg->emu_mask; > + > + if ((data & host_mask) != (val & host_mask)) { > + uint32_t new_val; > + > + /* Mask out host (including past size). */ > + new_val = val & host_mask; > + /* Merge emulated ones (excluding the non-emulated ones). */ > + new_val |= data & host_mask; > + /* Leave intact host and emulated values past the size - even though > + * we do not care as we write per reg->size granularity, but for the > + * logging below lets have the proper value. */ > + new_val |= ((val | data)) & ~size_mask; > + XEN_PT_LOG(&s->dev,"Offset 0x%04x mismatch! Emulated=0x%04x, host=0x%04x, syncing to 0x%04x.\n", > + offset, data, val, new_val); > + val = new_val; > + } else > + val = data; > + > + /* This could be just pci_set_long as we don't modify the bits > + * past reg->size, but in case this routine is run in parallel > + * we do not want to over-write other registers. */ > + switch (reg->size) { > + case 1: pci_set_byte(s->dev.config + offset, (uint8_t)val); break; > + case 2: pci_set_word(s->dev.config + offset, (uint16_t)val); break; > + case 4: pci_set_long(s->dev.config + offset, val); break; > + default: assert(1); > + } > /* set register value */ > - reg_entry->data = data; > + reg_entry->data = val; This can potentially change reg_entry->data too with the guest view of the register, right? It is worth mentioning in the commit message, as it is one of the goals of the series. > } > /* list add register entry */ > QLIST_INSERT_HEAD(®_grp->reg_tbl_list, reg_entry, entries); > -- > 2.1.0 >