* [Qemu-devel] [RFC 0/6] Dynamic TLB sizing
@ 2018-10-06 21:45 Emilio G. Cota
  2018-10-06 21:45 ` [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing Emilio G. Cota
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Emilio G. Cota @ 2018-10-06 21:45 UTC
  To: qemu-devel; +Cc: Pranith Kumar, Richard Henderson, Alex Bennée

After reading this paper [1], I wondered how far one
could push the idea of dynamic TLB resizing. We discussed
it briefly in this thread:

 https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg02340.html

Since then, (1) rth helped me (thanks!) with TCG backend code,
and (2) I've abandoned the idea of substituting malloc
for memset, and instead focused on dynamically resizing the
TLBs. The rationale is that if a process touches a lot of
memory, having a large TLB will pay off, since the perf
gains will dwarf the increased cost of flushing via memset.
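
To make the trade-off concrete, here is a minimal sketch of resizing
at flush time. It is not the code from this series: every name, the
70%/30% thresholds, and the size bounds are hypothetical and chosen
only for illustration. The idea it shows is that the flush (a memset
over the whole table) is a natural point to grow or shrink, since the
old contents are being discarded anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names, bounds and thresholds below are hypothetical. */
#define SKETCH_MIN_TLB_BITS 6
#define SKETCH_MAX_TLB_BITS 16

typedef struct {
    unsigned long addr;    /* guest page tag; all-ones means "invalid" */
    unsigned long addend;  /* host offset (unused in this sketch) */
} SketchTLBEntry;

typedef struct {
    SketchTLBEntry *table;
    size_t n_entries;      /* always a power of two */
    size_t n_used;         /* entries filled since the last flush */
} SketchTLB;

/* Allocate a table with 2^bits entries, all marked invalid. */
static int sketch_tlb_init(SketchTLB *tlb, unsigned bits)
{
    tlb->n_entries = (size_t)1 << bits;
    tlb->n_used = 0;
    tlb->table = malloc(tlb->n_entries * sizeof(*tlb->table));
    if (tlb->table == NULL) {
        return -1;
    }
    memset(tlb->table, 0xff, tlb->n_entries * sizeof(*tlb->table));
    return 0;
}

/*
 * Flush the TLB. Before invalidating, look at how full the table got
 * since the previous flush: double it above 70% use, halve it below
 * 30% (assumed thresholds). The memset cost scales with the new size.
 */
static void sketch_tlb_flush(SketchTLB *tlb)
{
    size_t new_n = tlb->n_entries;
    size_t rate = tlb->n_used * 100 / tlb->n_entries;

    if (rate > 70 && new_n < ((size_t)1 << SKETCH_MAX_TLB_BITS)) {
        new_n *= 2;
    } else if (rate < 30 && new_n > ((size_t)1 << SKETCH_MIN_TLB_BITS)) {
        new_n /= 2;
    }

    if (new_n != tlb->n_entries) {
        SketchTLBEntry *t = malloc(new_n * sizeof(*t));
        if (t != NULL) {   /* on allocation failure, keep the old size */
            free(tlb->table);
            tlb->table = t;
            tlb->n_entries = new_n;
        }
    }
    memset(tlb->table, 0xff, tlb->n_entries * sizeof(*tlb->table));
    tlb->n_used = 0;
}
```

A workload that keeps filling the table grows it one doubling per
flush, while a mostly idle one shrinks back towards the minimum and
gets cheaper flushes.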

This series shows that the indirection necessary to do this
does not cause a perf decrease, at least for x86_64 hosts.
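
For readers unfamiliar with what "indirection" means here, the sketch
below contrasts the two index computations in plain C. It is only an
illustration: the real fast path is emitted by the TCG backend, and
the names, the 4 KiB page size, and the 256-entry static size are
assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: 4 KiB guest pages, 256-entry static TLB. */
#define SKETCH_PAGE_BITS       12
#define SKETCH_STATIC_TLB_BITS 8

/* Fixed-size TLB: the mask is a compile-time constant. */
static inline uint32_t sketch_index_static(uint64_t vaddr)
{
    return (uint32_t)((vaddr >> SKETCH_PAGE_BITS) &
                      ((1u << SKETCH_STATIC_TLB_BITS) - 1));
}

/* Run-time sized TLB: the mask lives in a per-CPU descriptor. */
typedef struct {
    uint32_t mask;  /* n_entries - 1, updated whenever the TLB is resized */
} SketchTLBDesc;

/* Same computation, plus one extra load to fetch the current mask. */
static inline uint32_t sketch_index_dynamic(const SketchTLBDesc *d,
                                            uint64_t vaddr)
{
    return (uint32_t)(vaddr >> SKETCH_PAGE_BITS) & d->mask;
}
```

The only difference on the lookup path is the load of the mask, which
is the extra cost being measured.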

This series is incomplete, since it only implements changes
to the i386 backend, and it probably only works on x86_64.
The whole point, though, is to (1) see whether the performance gains
are worth it, and (2) discuss how crazy this approach is. I was
looking for things to break badly, but so far I've found no obvious
issues. There might still be assumptions about the TLB size
baked into the code that I have missed, so please point those
out if they exist.

Performance numbers are in the last patch.

You can fetch this series from:
  https://github.com/cota/qemu/tree/tlb-dyn

Note that it applies on top of my tlb-lock-v3 series:
  https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01087.html

Thanks,

		Emilio

[1] "Optimizing Memory Translation Emulation in Full System Emulators",
Tong et al., TACO'15. https://dl.acm.org/citation.cfm?id=2686034


Thread overview: 19+ messages
2018-10-06 21:45 [Qemu-devel] [RFC 0/6] Dynamic TLB sizing Emilio G. Cota
2018-10-06 21:45 ` [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing Emilio G. Cota
2018-10-08  1:47   ` Richard Henderson
2018-10-06 21:45 ` [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb Emilio G. Cota
2018-10-08  2:09   ` Richard Henderson
2018-10-08 14:42     ` Emilio G. Cota
2018-10-08 19:46       ` Richard Henderson
2018-10-08 20:23         ` Emilio G. Cota
2018-10-06 21:45 ` [Qemu-devel] [RFC 3/6] cputlb: track TLB use rates Emilio G. Cota
2018-10-08  2:54   ` Richard Henderson
2018-10-06 21:45 ` [Qemu-devel] [RFC 4/6] tcg: define TCG_TARGET_TLB_MAX_INDEX_BITS Emilio G. Cota
2018-10-08  2:56   ` Richard Henderson
2018-10-06 21:45 ` [Qemu-devel] [RFC 5/6] cpu-defs: define MIN_CPU_TLB_SIZE Emilio G. Cota
2018-10-08  3:01   ` Richard Henderson
2018-10-06 21:45 ` [Qemu-devel] [RFC 6/6] cputlb: dynamically resize TLBs based on use rate Emilio G. Cota
2018-10-07 17:37   ` Philippe Mathieu-Daudé
2018-10-08  1:48     ` Emilio G. Cota
2018-10-08 13:46       ` Emilio G. Cota
2018-10-08  3:21   ` Richard Henderson
