From: Konrad Rzeszutek Wilk
Subject: Re: Interesting observation with network event notification and batching
Date: Fri, 14 Jun 2013 14:53:03 -0400
Message-ID: <20130614185303.GC21280@phenom.dumpdata.com>
In-Reply-To: <20130612101451.GF2765@zion.uk.xensource.com>
References: <20130612101451.GF2765@zion.uk.xensource.com>
To: Wei Liu
Cc: annie.li@oracle.com, stefano.stabellini@eu.citrix.com, andrew.bennieston@citrix.com, ian.campbell@citrix.com, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On Wed, Jun 12, 2013 at 11:14:51AM +0100, Wei Liu wrote:
> Hi all
>
> I'm hacking on netback trying to identify whether TLB flushes cause a
> heavy performance penalty on the Tx path. The hack is quite nasty (you
> would not want to know, trust me).
>
> Basically what is doesn't is, 1) alter network protocol to pass along

You probably meant: "what it does"?

> mfns instead of grant references, 2) when the backend sees a new mfn,
> map it RO and cache it in its own address space.
>
> With this hack we now have some sort of zero-copy Tx path. The backend
> doesn't need to issue any grant copy / map operation any more. When it
> sees a new packet in the ring, it just needs to pick up the pages in
> its own address space, assemble the packet from those pages and pass
> it on to the network stack.

Uh, so I am not sure I understand the RO part. If dom0 is mapping it,
won't that trigger a PTE update? And doesn't somebody (either the guest
or the initial domain) do a grant mapping to let the hypervisor know it
is OK to map a grant?

Or is dom0 actually permitted to map the MFN of any guest without using
the grants? In which case you are then using _PAGE_IOMAP somewhere and
setting up vmap entries with the MFNs that point to the foreign
domain - I think?

> In theory this should boost performance, but in practice it is the
> other way around. This hack makes Xen networking more than 50% slower
> than before (OMG). Further investigation shows that with this hack the
> batching ability is gone. Before this hack, netback batches around 64
> slots in one

That is quite interesting.

> interrupt event; however, after this hack it only batches 3 slots in
> one interrupt event -- that's no batching at all, because we can expect
> one packet to occupy 3 slots.

Right.

> Time to have some figures (iperf from DomU to Dom0).
>
> Before the hack, doing grant copy, throughput: 7.9 Gb/s, average slots
> per batch 64.
>
> After the hack, throughput: 2.5 Gb/s, average slots per batch 3.
>
> After the hack, adding 64 HYPERVISOR_xen_version calls (each just does
> a context switch into the hypervisor) in the Tx path, throughput:
> 3.2 Gb/s, average slots per batch 6.
>
> After the hack, adding 256 such calls, throughput: 5.2 Gb/s, average
> slots per batch 26.
>
> After the hack, adding 512 such calls, throughput: 7.9 Gb/s, average
> slots per batch 26.
>
> After the hack, adding 768 such calls, throughput: 5.6 Gb/s, average
> slots per batch 25.
>
> After the hack, adding 1024 such calls, throughput: 4.4 Gb/s, average
> slots per batch 25.

How do you get it to do more HYPERVISOR_xen_version calls? Did you just
add a for (i = 1024; i > 0; i--) hypervisor_yield(); loop in netback?
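
Just so we are talking about the same mechanics, here is a minimal
sketch of the kind of delay loop I am imagining in the netback Tx path.
It is only my guess at what "adds in N HYPERVISOR_xen_version" means;
the function name and where it would be called from are made up, not
your actual patch:

/* Hypothetical delay loop in the netback Tx path (a guess, not the
 * actual patch): issue N dummy hypercalls so the frontend gets a
 * chance to queue more slots before we start consuming the ring.
 */
#include <asm/xen/hypercall.h>
#include <xen/interface/version.h>

#define DUMMY_HYPERCALLS 1024   /* 64 / 256 / 512 / 768 / 1024 in the tests */

static void netbk_dummy_hypercalls(void)
{
        int i;

        for (i = DUMMY_HYPERCALLS; i > 0; i--)
                /* XENVER_version is a harmless query; the only point
                 * here is the context switch into the hypervisor. */
                HYPERVISOR_xen_version(XENVER_version, NULL);
}

If that is all it is, then the extra hypercalls are really just a delay
loop, and the throughput gain would come purely from giving the
frontend time to queue more slots before netback drains the ring.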
> Average slots per batch is calculated as follows:
> 1. count total_slots processed from start of day
> 2. count tx_count, which is the number of times the tx_action function
>    gets invoked
> 3. avg_slots_per_tx = total_slots / tx_count
>
> The counter-intuitive figures imply that there is something wrong with
> the current batching mechanism. Probably we need to fine-tune the
> batching behavior for network and play with the event pointers in the
> ring (actually I'm looking into it now). It would be good to have some
> input on this.

I am still unsure I understand how your changes would incur more of the
yields.

> Konrad, IIRC you once mentioned you discovered something with event
> notification, what's that?

The observations were bizarre. I naively expected the number of
physical NIC interrupts to be around the same as the VIF's, or less.
And I figured that the number of interrupts would be constant
regardless of the size of the packets. In other words
#packets == #interrupts.

In reality the number of interrupts the VIF took was about the same,
while for the NIC it would fluctuate. (I can't remember the details.)

But it was odd and I didn't dig deeper to figure out what was
happening, or whether for the VIF we could do something so that
#packets != #interrupts - hopefully some mechanism to adjust things so
that the number of interrupts per packet would be lower (hand waving
here).
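
On the "event pointers in the ring" that Wei mentions above: my
(untested) reading of xen/interface/io/ring.h is that the relevant knob
on the Tx side is the shared ring's req_event field.
RING_FINAL_CHECK_FOR_REQUESTS() sets it to req_cons + 1, i.e. "send me
an event for the very next slot", which would explain why batching
collapses once the backend drains the ring faster than the frontend
fills it. A rough sketch of the alternative - the function name and the
BATCH value are made up:

#include <asm/barrier.h>
#include <xen/interface/io/netif.h>
#include <xen/interface/io/ring.h>

#define BATCH 32        /* made-up value, would need tuning */

/* Sketch only: instead of asking for an event on the very next
 * request, ask the frontend to notify us only after BATCH more
 * requests have been queued, trading latency for batching. */
static int netbk_tx_work_todo_batched(struct xen_netif_tx_back_ring *ring)
{
        int work_to_do;

        work_to_do = RING_HAS_UNCONSUMED_REQUESTS(ring);
        if (work_to_do)
                return work_to_do;

        /* Push the event pointer out past the next BATCH requests. */
        ring->sring->req_event = ring->req_cons + BATCH;

        /* Barrier and re-check, as the stock macro does, to close the
         * race with requests queued while we moved the pointer. */
        mb();
        return RING_HAS_UNCONSUMED_REQUESTS(ring);
}

The obvious catch is that if the frontend queues fewer than BATCH slots
and then goes idle, nothing ever kicks the backend, so this would need
a timer or some other fallback before it could be more than an
experiment.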