From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Gonglei (Arei)"
Date: Fri, 5 Feb 2016 07:17:55 +0000
Message-ID: <33183CC9F5247A488A2544077AF19020B02DC1BD@SZXEMA503-MBS.china.huawei.com>
References: <1454236146-23293-1-git-send-email-pbonzini@redhat.com>
 <33183CC9F5247A488A2544077AF19020B02DA7EA@SZXEMA503-MBS.china.huawei.com>
 <56B325B0.8050903@redhat.com>
In-Reply-To: <56B325B0.8050903@redhat.com>
Content-Language: zh-CN
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches
To: Paolo Bonzini, "qemu-devel@nongnu.org"
Cc: "cornelia.huck@de.ibm.com", "v.maffione@gmail.com", "mst@redhat.com"

Dear Paolo,

> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Thursday, February 04, 2016 6:19 PM
>
> On 03/02/2016 13:08, Gonglei (Arei) wrote:
> >   22.56%  qemu-kvm  [.] address_space_translate
> >   13.29%  qemu-kvm  [.] qemu_get_ram_ptr
>
> We could get rid of qemu_get_ram_ptr by storing the RAMBlock pointer
> into the memory region, instead of the ram_addr_t value.  I'm happy to
> answer any question if you want to do it.
>

Good point! I implemented this change in a simple way and gained nearly
8 MB/s of throughput.

Testing AES-128-CBC cipher:
	Encrypting in chunks of 256 bytes: done. 412.16 MiB in 5.02 secs: 82.17 MiB/sec (1688202 packets)
	Encrypting in chunks of 256 bytes: done. 412.15 MiB in 5.02 secs: 82.16 MiB/sec (1688158 packets)
	Encrypting in chunks of 256 bytes: done. 412.32 MiB in 5.02 secs: 82.20 MiB/sec (1688876 packets)
	Encrypting in chunks of 256 bytes: done. 412.47 MiB in 5.02 secs: 82.23 MiB/sec (1689491 packets)
	Encrypting in chunks of 256 bytes: done. 412.31 MiB in 5.02 secs: 82.20 MiB/sec (1688825 packets)
	Encrypting in chunks of 256 bytes: done. 411.30 MiB in 5.01 secs: 82.15 MiB/sec (1684671 packets)
	Encrypting in chunks of 256 bytes: done. 412.08 MiB in 5.01 secs: 82.18 MiB/sec (1687864 packets)
	Encrypting in chunks of 256 bytes: done. 412.49 MiB in 5.02 secs: 82.23 MiB/sec (1689564 packets)

Now, 'perf top' shows me:

  16.32%  qemu-kvm            [.] address_space_translate
   5.39%  libpthread-2.19.so  [.] __pthread_mutex_unlock_usercnt
   4.13%  qemu-kvm            [.] qemu_ram_addr_from_host
   4.01%  qemu-kvm            [.] address_space_map
   3.82%  libc-2.19.so        [.] _int_malloc
   3.70%  libc-2.19.so        [.] _int_free
   3.49%  libc-2.19.so        [.] malloc
   3.18%  libpthread-2.19.so  [.] pthread_mutex_lock
   3.10%  qemu-kvm            [.] phys_page_find
   2.93%  qemu-kvm            [.] address_space_translate_internal
   2.74%  libc-2.19.so        [.] malloc_consolidate
   2.71%  libc-2.19.so        [.] __memcpy_sse2_unaligned
   1.92%  qemu-kvm            [.] find_next_zero_bit
   1.65%  qemu-kvm            [.] object_unref
   1.61%  qemu-kvm            [.] address_space_rw
   1.35%  qemu-kvm            [.] virtio_notify
   1.33%  qemu-kvm            [.] object_ref
   1.22%  libc-2.19.so        [.] memset

Please review the patch below (based on qemu-2.3, which I'm using). Thanks!
If it's OK, I can rebase it onto the master branch.

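To make the idea concrete, here is a minimal, self-contained sketch of why
keeping the block pointer in the memory region can help. It is not QEMU code:
the Toy* types, their fields, and the toy_* functions are hypothetical
stand-ins. It assumes the old path has to search a list of RAM blocks on every
access, while the new path only does offset arithmetic on a block it already
knows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for QEMU's types; names and layout are illustrative only. */
typedef uint64_t toy_ram_addr_t;

typedef struct ToyRAMBlock {
    uint8_t *host;            /* host buffer backing this block */
    toy_ram_addr_t offset;    /* start of the block in the RAM address space */
    toy_ram_addr_t length;    /* size of the block in bytes */
    struct ToyRAMBlock *next; /* next block in the global list */
} ToyRAMBlock;

typedef struct ToyMemoryRegion {
    toy_ram_addr_t ram_addr;  /* what the region stored before the change */
    ToyRAMBlock *ram_block;   /* what the change adds: the owning block */
} ToyMemoryRegion;

/* Slow path: every access scans the block list to find the owner of addr. */
static uint8_t *toy_ptr_by_scan(ToyRAMBlock *head, toy_ram_addr_t addr)
{
    for (ToyRAMBlock *b = head; b != NULL; b = b->next) {
        /* Unsigned subtraction wraps when addr < offset, so the test fails. */
        if (addr - b->offset < b->length) {
            return b->host + (addr - b->offset);
        }
    }
    return NULL;
}

/* Fast path: the block is already known, only offset arithmetic remains. */
static uint8_t *toy_ptr_from_block(ToyRAMBlock *block, toy_ram_addr_t addr)
{
    assert(block != NULL && addr - block->offset < block->length);
    return block->host + (addr - block->offset);
}

/* Returns 0 when both paths resolve the same address to the same host byte. */
int toy_demo(void)
{
    static uint8_t buf_a[16], buf_b[16];
    ToyRAMBlock b = { buf_b, 0x2000, sizeof(buf_b), NULL };
    ToyRAMBlock a = { buf_a, 0x1000, sizeof(buf_a), &b };
    ToyMemoryRegion mr = { .ram_addr = 0x2000, .ram_block = &b };

    uint8_t *slow = toy_ptr_by_scan(&a, mr.ram_addr + 4);
    uint8_t *fast = toy_ptr_from_block(mr.ram_block, mr.ram_addr + 4);
    return (slow == fast && fast == buf_b + 4) ? 0 : 1;
}
```

Calling toy_demo() from a main() function should return 0. In this toy model
the cost of the scan grows with the number of blocks, while the cached-pointer
path stays constant; whether that accounts for all of the gain measured above
is not something this sketch can show.
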
[PATCH] exec: store RAMBlock pointer into memory region

Signed-off-by: Gonglei
---
 exec.c                  | 39 ++++++++++++++++++++++++---------------
 include/exec/memory.h   |  1 +
 include/exec/ram_addr.h |  1 +
 memory.c                |  4 +++-
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/exec.c b/exec.c
index 4a16769..51d6f30 100644
--- a/exec.c
+++ b/exec.c
@@ -1544,6 +1544,7 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
         error_propagate(errp, local_err);
         return -1;
     }
+    mr->ram_block = new_block;
     return addr;
 }
 
@@ -1817,6 +1818,11 @@ found:
     return mr;
 }
 
+void *qemu_get_ram_ptr_from_block(RAMBlock *block, hwaddr addr)
+{
+    return ramblock_ptr(block, addr - block->offset);
+}
+
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
@@ -2350,7 +2356,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
             } else {
                 addr1 += memory_region_get_ram_addr(mr);
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(addr1);
+                ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
                 memcpy(ptr, buf, l);
                 invalidate_and_set_dirty(addr1, l);
             }
@@ -2384,7 +2390,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
                 }
             } else {
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(mr->ram_addr + addr1);
+                ptr = qemu_get_ram_ptr_from_block(mr->ram_block, mr->ram_addr + addr1);
                 memcpy(buf, ptr, l);
             }
         }
@@ -2437,7 +2443,7 @@ static inline void cpu_physical_memory_write_rom_internal(AddressSpace *as,
         } else {
             addr1 += memory_region_get_ram_addr(mr);
             /* ROM/RAM case */
-            ptr = qemu_get_ram_ptr(addr1);
+            ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
             switch (type) {
             case WRITE_DATA:
                 memcpy(ptr, buf, l);
@@ -2681,9 +2687,10 @@ static inline uint32_t ldl_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = ldl_le_p(ptr);
@@ -2740,9 +2747,10 @@ static inline uint64_t ldq_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = ldq_le_p(ptr);
@@ -2807,9 +2815,10 @@ static inline uint32_t lduw_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = lduw_le_p(ptr);
@@ -2856,7 +2865,7 @@ void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val)
         io_mem_write(mr, addr1, val, 4);
     } else {
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         stl_p(ptr, val);
 
         if (unlikely(in_migration)) {
@@ -2896,7 +2905,7 @@ static inline void stl_phys_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stl_le_p(ptr, val);
@@ -2959,7 +2968,7 @@ static inline void stw_phys_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stw_le_p(ptr, val);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 06ffa1d..bd9ddea 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -146,6 +146,7 @@ struct MemoryRegion {
     Int128 size;
     hwaddr addr;
     void (*destructor)(MemoryRegion *mr);
+    void *ram_block;   /* RAMBlock pointer */
     ram_addr_t ram_addr;
     uint64_t align;
     bool subpage;
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ff558a4..cc8d769 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -38,6 +38,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
+void *qemu_get_ram_ptr_from_block(RAMBlock *block, hwaddr addr);
 
 int qemu_ram_resize(ram_addr_t base, ram_addr_t newsize, Error **errp);
 
diff --git a/memory.c b/memory.c
index ee3f2a8..31bd84a 100644
--- a/memory.c
+++ b/memory.c
@@ -877,6 +877,7 @@ void memory_region_init(MemoryRegion *mr,
         mr->size = int128_2_64();
     }
     mr->name = g_strdup(name);
+    mr->ram_block = NULL;
 
     if (name) {
         char *escaped_name = memory_region_escape_name(name);
@@ -1449,7 +1450,8 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
 
     assert(mr->terminates);
 
-    return qemu_get_ram_ptr(mr->ram_addr & TARGET_PAGE_MASK);
+    return qemu_get_ram_ptr_from_block(mr->ram_block,
+                                       mr->ram_addr & TARGET_PAGE_MASK);
 }
 
 static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpace *as)
-- 
1.8.5.2

Regards,
-Gonglei