From: Pavel Boldin
Date: Thu, 3 Sep 2015 18:34:31 +0300
Subject: [Qemu-devel] [Migration][TCG] Page dirtying and migration in pure-QEMU VM mode
To: qemu-devel@nongnu.org

Dear All,

As a result of fixing bug [1], I discovered that QEMU in pure emulation (TCG) mode sometimes misses page dirtying during migration. This happens at least in version 2.0.0 and, judging by the code, should be the same in master as well.

The reason for that is that only pages missing from the TLB cache are fetched using `tlb_fill`, which calls `x86_cpu_handle_mmu_fault` and finally `stl_phys_notdirty`, which marks the page as dirty in the ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] bitmap.
However, if the page is already in the TLB cache then no such code runs and the page is never marked dirty, making the memory dump created by `savevm` differ from the actual VM memory. Sometimes this corrupts memory structures during migration, resulting in a kernel Oops in the VM. Indeed, this happens very frequently with the data referenced by the APIC timer IRQ handler, because those structures are almost always in the TLB cache, especially on a non-busy VM. [2]
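
To make the failure mode concrete, here is a self-contained toy model of
the two paths (all names are made up for illustration; this is not QEMU's
actual code):

    /* A TLB hit writes through the cached host pointer and never touches
     * the dirty bitmap; only a miss ("tlb_fill") marks the page dirty. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define PAGE_BITS 12
    #define PAGE_SIZE (1u << PAGE_BITS)
    #define NUM_PAGES 4

    static uint8_t guest_ram[NUM_PAGES * PAGE_SIZE];
    static uint8_t migration_dirty[NUM_PAGES]; /* migration dirty bitmap */
    static uint8_t *tlb[NUM_PAGES];     /* cached host pointers, NULL = miss */

    static void guest_store(unsigned page, unsigned off, uint8_t val)
    {
        if (!tlb[page]) {
            /* slow path: refill the TLB entry and mark the page dirty */
            tlb[page] = &guest_ram[page * PAGE_SIZE];
            migration_dirty[page] = 1;
        }
        /* fast path: on a TLB hit the bitmap is never updated */
        tlb[page][off] = val;
    }

    int main(void)
    {
        guest_store(0, 0, 1);               /* miss: page 0 is dirtied */
        memset(migration_dirty, 0,          /* page saved, bit cleared  */
               sizeof(migration_dirty));
        guest_store(0, 1, 2);               /* hit: modified, NOT re-dirtied */
        printf("page 0 dirty after second store: %d\n", migration_dirty[0]);
        return 0;
    }

The second store changes guest memory after the page has been saved, yet
the bitmap stays clean: the saved copy is now stale.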

Sadly, just flushing the TLB in `ram_save_setup` is not enough: the corresponding TLB entry would have to be flushed every time a page is saved and marked clean, and that cannot be done in a thread-safe manner because of the race between the translated code and the migration code.
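
Schematically, the race is:

    migration thread                     vCPU (TCG) thread
    ----------------                     -----------------
    save page P, clear its dirty bit
                                         store to P through the still-cached
                                         TLB entry (no dirty bit is set)
    flush the TLB entry for P

The store slips in between the save and the flush, so the destination
never sees it.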


The question is: is there a good, portable way to reliably mark such pages dirty?

One of the possibilities is to use `mprotect` on the VM memory and dirty the pages from the write-fault handler, along the lines of the sketch below. This is not a portable solution, though.
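
A minimal Linux-only sketch of that idea (made-up names, error handling
omitted): guest RAM is mapped read-only, the first write to a page raises
SIGSEGV, and the handler dirties the page and re-enables writes until the
page is saved and re-protected.

    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define PAGE_SIZE 4096
    #define RAM_SIZE  (16 * PAGE_SIZE)

    static uint8_t *guest_ram;
    static uint8_t dirty[RAM_SIZE / PAGE_SIZE];

    static void segv_handler(int sig, siginfo_t *si, void *ctx)
    {
        uintptr_t addr = (uintptr_t)si->si_addr;
        uintptr_t base = (uintptr_t)guest_ram;

        if (addr < base || addr >= base + RAM_SIZE)
            _exit(1);                 /* a real crash, not a guest RAM write */

        size_t page = (addr - base) / PAGE_SIZE;
        dirty[page] = 1;              /* mark the page dirty for migration */
        /* re-enable writes until the page is saved and re-protected */
        mprotect(guest_ram + page * PAGE_SIZE, PAGE_SIZE,
                 PROT_READ | PROT_WRITE);
    }

    int main(void)
    {
        struct sigaction sa = { .sa_sigaction = segv_handler,
                                .sa_flags = SA_SIGINFO };
        sigaction(SIGSEGV, &sa, NULL);

        guest_ram = mmap(NULL, RAM_SIZE, PROT_READ,  /* write-protected */
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        guest_ram[5 * PAGE_SIZE] = 42; /* faults once, handler dirties page 5 */
        printf("page 5 dirty: %d\n", dirty[5]);
        return 0;
    }

Besides the portability problem, this costs a fault per written page per
migration round.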

The second one is to introduce a TCG variable that disables code generation of the TLB-cache-aware fast-path memory writes, passing each such access through a helper that dirties the pages. This is something I have a draft hack-patch for in [3]; a rough sketch of the shape follows.
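
In outline (hypothetical names only, not the actual code from [3]), while
migration is running every guest store would be emitted as a call to a
helper along these lines:

    /* Schematic only, with made-up names.  While migration is active, the
     * front end emits a helper call for each guest store instead of the
     * inline TLB fast path. */
    void helper_store_dirty(CPUArchState *env, target_ulong vaddr, uint32_t val)
    {
        ram_addr_t page = guest_vaddr_to_ram_addr(env, vaddr); /* hypothetical */

        mark_migration_dirty(page);      /* set the DIRTY_MEMORY_MIGRATION bit */
        do_guest_store(env, vaddr, val); /* then perform the store as usual */
    }

Presumably the translation cache would also have to be flushed whenever the
flag changes, so that already-generated fast-path code stops running.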


Are there any other possibilities that I have missed?
Pavel

[1] https://bugs.launchpad.net/mos/7.0.x/+bug/1371130
[2] https://bugs.launchpad.net/mos/7.0.x/+bug/1371130/comments/35
[3] https://bugs.launchpad.net/mos/7.0.x/+bug/1371130/+attachment/4456256/+files/dont-use-tlb-migration.patch