On Mon, Jul 24, 2017 at 08:48:45PM +1000, Alexey Kardashevskiy wrote: > On 24/07/17 15:53, David Gibson wrote: > > On Thu, Jul 20, 2017 at 05:22:29PM +1000, Alexey Kardashevskiy wrote: > >> This replaces g_malloc() with spapr_tce_alloc_table() as this is > >> the standard way of allocating tables and this allows moving the table > >> back to KVM when unplugging a VFIO PCI device and VFIO TCE acceleration > >> support is not present in the KVM. > > > > So, I like the idea here, and I think it will work for now, but one > > thing concerns me. > > > > AIUI, your future plans include allowing in-kernel accelerated TCE > > tables even with VFIO devices. When that happens, we might have an > > in-kernel table both before and after a change in the "need_vfio" > > state. > > The in-kernel table just stays there, mappings will be replayed but in the > end of the replay only the hardware table will change. > > > > > But you won't be able to have two in-kernel tables allocated at once, > > whereas this code relies on having both the old and new tables at once > > so it can copy the contents over. > > > > How do you intend to handle that? > > I intend to make this function no-op when cap_spapr_vfio is present. > > > > > >> Although spapr_tce_alloc_table() is expected to fail with EBUSY > >> if called when previous fd is not closed yet, in practice we will not > >> see it because cap_spapr_vfio is false at the moment. > > > > I don't follow this. How would cap_spapr_vfio be changing at any > > point? > > It depends on the version of the host kernel. Ok. I'm still a bit dubious about the line of reasoning here, but the patch is correct for now, so we just have to make sure subsequent changes work with it. Applied to ppc-for-2.11. > > > > >> > >> Signed-off-by: Alexey Kardashevskiy > >> --- > >> hw/ppc/spapr_iommu.c | 35 ++++++++++++++--------------------- > >> 1 file changed, 14 insertions(+), 21 deletions(-) > >> > >> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c > >> index e614621a83..307dc3021e 100644 > >> --- a/hw/ppc/spapr_iommu.c > >> +++ b/hw/ppc/spapr_iommu.c > >> @@ -275,33 +275,26 @@ static int spapr_tce_table_realize(DeviceState *dev) > >> void spapr_tce_set_need_vfio(sPAPRTCETable *tcet, bool need_vfio) > >> { > >> size_t table_size = tcet->nb_table * sizeof(uint64_t); > >> - void *newtable; > >> + uint64_t *oldtable; > >> + int newfd = -1; > >> > >> - if (need_vfio == tcet->need_vfio) { > >> - /* Nothing to do */ > >> - return; > >> - } > >> + g_assert(need_vfio != tcet->need_vfio); > >> > >> - if (!need_vfio) { > >> - /* FIXME: We don't support transition back to KVM accelerated > >> - * TCEs yet */ > >> - return; > >> - } > >> + tcet->need_vfio = need_vfio; > >> > >> - tcet->need_vfio = true; > >> + oldtable = tcet->table; > >> > >> - if (tcet->fd < 0) { > >> - /* Table is already in userspace, nothing to be do */ > >> - return; > >> - } > >> + tcet->table = spapr_tce_alloc_table(tcet->liobn, > >> + tcet->page_shift, > >> + tcet->bus_offset, > >> + tcet->nb_table, > >> + &newfd, > >> + need_vfio); > >> + memcpy(tcet->table, oldtable, table_size); > >> > >> - newtable = g_malloc(table_size); > >> - memcpy(newtable, tcet->table, table_size); > >> + spapr_tce_free_table(oldtable, tcet->fd, tcet->nb_table); > >> > >> - kvmppc_remove_spapr_tce(tcet->table, tcet->fd, tcet->nb_table); > >> - > >> - tcet->fd = -1; > >> - tcet->table = newtable; > >> + tcet->fd = newfd; > >> } > >> > >> sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn) > > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson