From: Bharata B Rao
Date: Fri, 27 May 2016 11:19:16 +0530
Subject: Re: [Qemu-devel] [PATCH qemu v16 19/19] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
To: David Gibson
Cc: Alexey Kardashevskiy, "qemu-devel@nongnu.org", Alexander Graf, Alex Williamson, "qemu-ppc@nongnu.org", Paolo Bonzini
In-Reply-To: <20160527044428.GY17226@voom.fritz.box>
References: <1462344751-28281-1-git-send-email-aik@ozlabs.ru> <1462344751-28281-20-git-send-email-aik@ozlabs.ru> <20160527044428.GY17226@voom.fritz.box>

On Fri, May 27, 2016 at 10:14 AM, David Gibson wrote:
> On Tue, May 17, 2016 at 11:02:48AM +0530, Bharata B Rao wrote:
>> On Mon, May 16, 2016 at 11:55 AM, Alexey Kardashevskiy wrote:
>> > On 05/13/2016 06:41 PM, Bharata B Rao wrote:
>> >>
>> >> On Wed, May 4, 2016 at 12:22 PM, Alexey Kardashevskiy wrote:
>> >
>> >
>> >>
>> >>> +
>> >>> +    avail = SPAPR_PCI_DMA_MAX_WINDOWS - spapr_phb_get_active_win_num(sphb);
>> >>> +
>> >>> +    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
>> >>> +    rtas_st(rets, 1, avail);
>> >>> +    rtas_st(rets, 2, max_window_size);
>> >>> +    rtas_st(rets, 3, pgmask);
>> >>> +    rtas_st(rets, 4, 0); /* DMA migration mask, not supported */
>> >>> +
>> >>> +    trace_spapr_iommu_ddw_query(buid, addr, avail, max_window_size, pgmask);
>> >>> +    return;
>> >>> +
>> >>> +param_error_exit:
>> >>> +    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>> >>> +}
>> >>> +
>> >>> +static void rtas_ibm_create_pe_dma_window(PowerPCCPU *cpu,
>> >>> +                                          sPAPRMachineState *spapr,
>> >>> +                                          uint32_t token, uint32_t nargs,
>> >>> +                                          target_ulong args,
>> >>> +                                          uint32_t nret, target_ulong rets)
>> >>> +{
>> >>> +    sPAPRPHBState *sphb;
>> >>> +    sPAPRTCETable *tcet = NULL;
>> >>> +    uint32_t addr, page_shift, window_shift, liobn;
>> >>> +    uint64_t buid;
>> >>> +
>> >>> +    if ((nargs != 5) || (nret != 4)) {
>> >>> +        goto param_error_exit;
>> >>> +    }
>> >>> +
>> >>> +    buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
>> >>> +    addr = rtas_ld(args, 0);
>> >>> +    sphb = spapr_pci_find_phb(spapr, buid);
>> >>> +    if (!sphb || !sphb->ddw_enabled) {
>> >>> +        goto param_error_exit;
>> >>> +    }
>> >>> +
>> >>> +    page_shift = rtas_ld(args, 3);
>> >>> +    window_shift = rtas_ld(args, 4);
>> >>
>> >>
>> >> The kernel has a bug due to which a wrong window_shift gets passed here. I
>> >> have posted a possible fix here:
>> >> https://patchwork.ozlabs.org/patch/621497/
>> >>
>> >> I have tried to work around this issue in QEMU too:
>> >> https://lists.nongnu.org/archive/html/qemu-ppc/2016-04/msg00226.html
>> >>
>> >> But the above workaround involves changing the memory representation
>> >> in the DT.
>> >
>> >
>> > What is wrong with this workaround?
>>
>> The above workaround will result in different representations for
>> memory in the DT before and after the workaround.
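[Aside, not from the original thread: the reason the DT memory layout feeds
back into DDW is that the guest kernel roughly sizes its
ibm,create-pe-dma-window request from the highest RAM address it sees in the
device tree (base RAM plus the hotplug/DR region), and that is the value read
by rtas_ld(args, 4) above. A rough sketch with hypothetical helper names:

    #include <stdint.h>

    /* Stand-in for the kernel's order_base_2()-style rounding. */
    static uint32_t ceil_log2(uint64_t x)
    {
        uint32_t shift = 0;

        while (x > (1ULL << shift)) {
            shift++;
        }
        return shift;
    }

    /*
     * A direct mapping needs the new DMA window to cover
     * [0, highest_ram_addr), so this is roughly the window_shift
     * the guest ends up requesting.
     */
    static uint32_t ddw_window_shift(uint64_t highest_ram_addr)
    {
        return ceil_log2(highest_ram_addr);
    }

With the layouts compared below, that highest RAM address (and hence the
requested window size) changes from 0x120000000 to 0x100000000.]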
>>
>> Currently for -m 2G, -numa node,nodeid=0,mem=1G -numa
>> node,nodeid=1,mem=0.5G, we will have the following nodes in DT:
>>
>> memory@0
>> memory@40000000
>> ibm,dynamic-reconfiguration-memory
>>
>> ibm,dynamic-memory will have only DR LMBs:
>>
>> [root@localhost ibm,dynamic-reconfiguration-memory]# hexdump ibm,dynamic-memory
>> 0000000 0000 000a 0000 0000 8000 0000 8000 0008
>> 0000010 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000020 9000 0000 8000 0009 0000 0000 ffff ffff
>> 0000030 0000 0000 0000 0000 a000 0000 8000 000a
>> 0000040 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000050 b000 0000 8000 000b 0000 0000 ffff ffff
>> 0000060 0000 0000 0000 0000 c000 0000 8000 000c
>> 0000070 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000080 d000 0000 8000 000d 0000 0000 ffff ffff
>> 0000090 0000 0000 0000 0000 e000 0000 8000 000e
>> 00000a0 0000 0000 ffff ffff 0000 0000 0000 0000
>> 00000b0 f000 0000 8000 000f 0000 0000 ffff ffff
>> 00000c0 0000 0000 0000 0001 0000 0000 8000 0010
>> 00000d0 0000 0000 ffff ffff 0000 0000 0000 0001
>> 00000e0 1000 0000 8000 0011 0000 0000 ffff ffff
>> 00000f0 0000 0000
>>
>> The memory region looks like this:
>>
>> memory-region: system
>>   0000000000000000-ffffffffffffffff (prio 0, RW): system
>>     0000000000000000-000000005fffffff (prio 0, RW): ppc_spapr.ram
>>     0000000080000000-000000011fffffff (prio 0, RW): hotplug-memory
>>
>> After this workaround, all this will change like below:
>>
>> memory@0
>> ibm,dynamic-reconfiguration-memory
>>
>> All LMBs in ibm,dynamic-memory:
>>
>> [root@localhost ibm,dynamic-reconfiguration-memory]# hexdump ibm,dynamic-memory
>>
>> 0000000 0000 0010 0000 0000 0000 0000 8000 0000
>> 0000010 0000 0000 0000 0000 0000 0080 0000 0000
>> 0000020 1000 0000 8000 0001 0000 0000 0000 0000
>> 0000030 0000 0080 0000 0000 2000 0000 8000 0002
>> 0000040 0000 0000 0000 0000 0000 0080 0000 0000
>> 0000050 3000 0000 8000 0003 0000 0000 0000 0000
>> 0000060 0000 0080 0000 0000 4000 0000 8000 0004
>> 0000070 0000 0000 0000 0001 0000 0008 0000 0000
>> 0000080 5000 0000 8000 0005 0000 0000 0000 0001
>> 0000090 0000 0008 0000 0000 6000 0000 8000 0006
>> 00000a0 0000 0000 ffff ffff 0000 0000 0000 0000
>> 00000b0 7000 0000 8000 0007 0000 0000 ffff ffff
>> 00000c0 0000 0000 0000 0000 8000 0000 8000 0008
>> 00000d0 0000 0000 ffff ffff 0000 0000 0000 0000
>> 00000e0 9000 0000 8000 0009 0000 0000 ffff ffff
>> 00000f0 0000 0000 0000 0000 a000 0000 8000 000a
>> 0000100 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000110 b000 0000 8000 000b 0000 0000 ffff ffff
>> 0000120 0000 0000 0000 0000 c000 0000 8000 000c
>> 0000130 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000140 d000 0000 8000 000d 0000 0000 ffff ffff
>> 0000150 0000 0000 0000 0000 e000 0000 8000 000e
>> 0000160 0000 0000 ffff ffff 0000 0000 0000 0000
>> 0000170 f000 0000 8000 000f 0000 0000 ffff ffff
>> 0000180 0000 0000
>>
>> Hotplug memory region gets a new address range now:
>>
>> memory-region: system
>>   0000000000000000-ffffffffffffffff (prio 0, RW): system
>>     0000000000000000-000000005fffffff (prio 0, RW): ppc_spapr.ram
>>     0000000060000000-00000000ffffffff (prio 0, RW): hotplug-memory
>>
>> So when a guest that was booted with older QEMU is migrated to a newer
>> QEMU that has this workaround, then it will start exhibiting the above
>> changes after first reboot post migration.
>
> Ok.. why is that bad?
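[Aside, not from the original mail: a decoder sketch for the two hexdumps
above, assuming the usual PAPR layout of ibm,dynamic-memory (a 4-byte LMB
count followed by 24-byte entries: 8-byte base address, 4-byte DRC index,
4 reserved bytes, 4-byte associativity-list index, 4-byte flags):

    #include <inttypes.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* DT property cells are big-endian. */
    static uint32_t be32(const uint8_t *p)
    {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }

    static uint64_t be64(const uint8_t *p)
    {
        return ((uint64_t)be32(p) << 32) | be32(p + 4);
    }

    static void dump_dynamic_memory(const uint8_t *prop, size_t len)
    {
        uint32_t nr_lmbs = be32(prop);
        const uint8_t *e = prop + 4;

        for (uint32_t i = 0; i < nr_lmbs && (size_t)(e - prop) + 24 <= len;
             i++, e += 24) {
            printf("LMB %u: base=0x%" PRIx64 " drc=0x%x assoc=0x%x flags=0x%x\n",
                   i, be64(e), be32(e + 8), be32(e + 16), be32(e + 20));
        }
    }

Read that way, the first dump is ten unassigned 256MB DR LMBs starting at
0x80000000 (associativity index 0xffffffff, flags 0), while the second is
sixteen LMBs starting at 0, the first six of which are assigned (flags 0x8)
to NUMA nodes 0 and 1, i.e. exactly the before/after difference described
above.]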
>
>> If the user has done memory hotplug by explicitly specifying the address
>> at the source, then even migration would fail because the addr specified
>> at the target will not be part of the hotplug-memory range.
>
> Sorry, not really following the situation you're describing here.

In the original case where the hotplug region was this:

0000000080000000-000000011fffffff (prio 0, RW): hotplug-memory

one could hotplug a DIMM at a specified address like this:

(qemu) object_add memory-backend-ram,id=ram0,size=256M
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0,addr=0x100000000
(qemu) info mtree
0000000080000000-000000011fffffff (prio 0, RW): hotplug-memory
  0000000100000000-000000010fffffff (prio 0, RW): ram0

Now if this guest has to be migrated to a target where we have this
workaround enabled, then the target QEMU started with

-incoming ... -object memory-backend-ram,id=ram0,size=256M -device pc-dimm,id=dimm0,memdev=ram0,addr=0x100000000

will fail because addr=0x100000000 isn't part of the hotplug-memory
region at the target.

Regards,
Bharata.
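[Aside, not from the original mail and not QEMU's actual plug code: the
failure on the target reduces to a containment check of the explicitly
requested DIMM address against the machine's hotplug-memory region. A
minimal sketch with made-up names:

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * A DIMM plugged with an explicit addr= must lie entirely inside the
     * hotplug-memory region, so moving or shrinking that region on the
     * target breaks incoming migration of such a DIMM.
     */
    static bool dimm_fits_hotplug_region(uint64_t region_base,
                                         uint64_t region_size,
                                         uint64_t dimm_addr,
                                         uint64_t dimm_size)
    {
        return dimm_addr >= region_base &&
               dimm_size <= region_size &&
               dimm_addr - region_base <= region_size - dimm_size;
    }

With the old layout (region at 0x80000000, size 0xa0000000) a 256M DIMM at
0x100000000 fits; with the reworked layout (region at 0x60000000, same size)
the region ends at 0x100000000, so that same addr no longer fits, which is
the failure described above.]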