From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56727) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1btaRm-0001Cm-Eu for qemu-devel@nongnu.org; Mon, 10 Oct 2016 09:16:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1btaRi-00032c-5w for qemu-devel@nongnu.org; Mon, 10 Oct 2016 09:16:49 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43342) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1btaRh-00032O-SI for qemu-devel@nongnu.org; Mon, 10 Oct 2016 09:16:46 -0400 Date: Mon, 10 Oct 2016 15:16:42 +0200 From: Igor Mammedov Message-ID: <20161010151642.3983b4b7@nial.brq.redhat.com> In-Reply-To: <20161008061409.GC3666@pxdev.xzpeter.org> References: <20161005130657.3399-1-rkrcmar@redhat.com> <20161005130657.3399-4-rkrcmar@redhat.com> <20161007150553.2e4ccc9a@nial.brq.redhat.com> <20161007162414.GA20006@potion> <20161008061409.GC3666@pxdev.xzpeter.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v4 3/8] intel_iommu: pass whole remapped addresses to apic List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Xu Cc: Radim =?UTF-8?B?S3LEjW3DocWZ?= , Eduardo Habkost , "Michael S. Tsirkin" , qemu-devel@nongnu.org, Paolo Bonzini , Richard Henderson On Sat, 8 Oct 2016 14:14:09 +0800 Peter Xu wrote: > On Fri, Oct 07, 2016 at 06:24:15PM +0200, Radim Kr=C4=8Dm=C3=A1=C5=99 wro= te: >=20 > [...] >=20 > > KVM accepts the address in host endianess and QEMU/VTD code also uses > > host endianess for internal representation of memory addresses, so this > > hunk should be fine. > >=20 > > It is confusing, because the VTD is definitely broken with respect to > > endianess -- it is even trying to swap the order of bits in a byte in > > the definition of VTD_MSIMessage. > > I don't believe that dma_memory_write() accepted LE address on BE hosts, > > so the existing code for filling the address is wrong: > >=20 > > msg.__addr_head =3D cpu_to_le32(0xfee); =20 >=20 > Yeah. This is my fault. Sorry for the troubles. >=20 > I have a patch (as well...) to fix this in my local tree, but not > posted (as mst suggested). Maybe it's time to post some of them now (I > tried to make patches more into a bunch so that they won't be lost in > mailing list in case maintainer missed it). I agree that current code > is never tested on big endian machines yet. >=20 > Here it should be: >=20 > msg.__addr_head =3D 0xfee; >=20 > > =20 > > > should it be: > > > msg.__addr_hi =3D cpu_to_le32(irq->dest & 0xffffff00) =20 > >=20 > > I don't think so. > >=20 > > Howewer, there are endianess bugs in this patch: > > =20 > > >> @@ -2281,11 +2282,7 @@ static MemTxResult vtd_mem_ir_write(void *opa= que, hwaddr addr, > > >> " for device sid 0x%04x", > > >> to.address, to.data, sid); > > >> =20 > > >> - if (dma_memory_write(&address_space_memory, to.address, > > >> - &to.data, size)) { > > >> - VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64 > > >> - " value 0x%"PRIx32, to.address, to.data); > > >> - } > > >> + apic_get_class()->send_msi(&to); =20 > >=20 > > because dma_memory_write() does magic on data. > >=20 > > I don't understand how the code should have worked before this series, > > because kvm_apic_mem_write() expects data in little endian and > > apic_mem_writel() expects data in host endian, even though both of them > > are DEVICE_NATIVE_ENDIAN and are called the same way, so one shouldn't > > work. =20 >=20 > I guess that's because APIC is only used for x86? Then it does not > matter. And I agree with you that currently MSI is a little bit > confused on endianess. >=20 > First of all, I believe in the protocol MSI (along with PCI logics) > should be LE. >=20 > Instead, our MSIMessage struct looks more like to be for host > endianess. E.g. in msi_get_message() we are using: >=20 > msg.data =3D pci_get_word(dev->config + msi_data_off(dev, msi64bit)); >=20 > and pci_get_word() is converting LE to host endianess. >=20 > However, in all kvm_irqchip_*() APIs, we are assuming MSIMessage as > LE. E.g.: >=20 > int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg) > { > struct kvm_msi msi; > KVMMSIRoute *route; >=20 > if (kvm_direct_msi_allowed) { > ... > msi.data =3D le32_to_cpu(msg.data); > ... > } > } >=20 > These things are conflicting. >=20 > Maybe we need to clean this up. And I prefer MSIMessage to be host > endianess. Mostly internal QEMU structures are in host order and converted to guest byte-order on transfer, except for structures that are memcpy-ed, in which case they are typically marked QEMU_PACKED and that throws flag to reviewers to check if endianess is correct. At least struct VTD_MSIMessage definition need a comment saying what byte-order is expected. >=20 > >=20 > > And similarly, this hunk is wrong: > > =20 > > >> @@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *= s, uint16_t source_id, > > >> static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_= addr_reg, > > >> hwaddr mesg_data_reg) > > >> { > > >> - hwaddr addr; > > >> - uint32_t data; > > >> + MSIMessage msi; > > >> =20 > > >> assert(mesg_data_reg < DMAR_REG_SIZE); > > >> assert(mesg_addr_reg < DMAR_REG_SIZE); > > >> =20 > > >> - addr =3D vtd_get_long_raw(s, mesg_addr_reg); > > >> - data =3D vtd_get_long_raw(s, mesg_data_reg); > > >> + msi.address =3D vtd_get_long_raw(s, mesg_addr_reg); > > >> + msi.data =3D vtd_get_long_raw(s, mesg_data_reg); > > >> =20 > > >> - VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr= , data); > > >> - address_space_stl_le(&address_space_memory, addr, data, > > >> - MEMTXATTRS_UNSPECIFIED, NULL); > > >> + VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, > > >> + msi.address, msi.data); > > >> + apic_get_class()->send_msi(&msi); > > >> } =20 > >=20 > > It should have been wrong even before, because address_space_stl_le() > > seems to accept the address in host endianess and not in LE ... UGH. =20 >=20 > Again, I guess no one is running VT-d in BE machines. So problems are > not exposed. >=20 > Thanks, >=20 > -- peterx >=20