From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43532) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b83QI-0001mQ-GN for qemu-devel@nongnu.org; Wed, 01 Jun 2016 06:30:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1b83QE-0002cM-5K for qemu-devel@nongnu.org; Wed, 01 Jun 2016 06:30:49 -0400 Received: from mail-wm0-x22a.google.com ([2a00:1450:400c:c09::22a]:37335) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b83QD-0002cF-Da for qemu-devel@nongnu.org; Wed, 01 Jun 2016 06:30:46 -0400 Received: by mail-wm0-x22a.google.com with SMTP id z87so23185682wmh.0 for ; Wed, 01 Jun 2016 03:30:44 -0700 (PDT) References: <1459870344-16773-1-git-send-email-alex.bennee@linaro.org> <1459870344-16773-6-git-send-email-alex.bennee@linaro.org> <5733295B.8090401@gmail.com> From: Alex =?utf-8?Q?Benn=C3=A9e?= In-reply-to: <5733295B.8090401@gmail.com> Date: Wed, 01 Jun 2016 11:30:51 +0100 Message-ID: <87shwx6w10.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [RFC v2 05/11] tcg: protect TBContext with tb_lock. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Fedorov Cc: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com, a.rigo@virtualopensystems.com, cota@braap.org, qemu-devel@nongnu.org, mark.burton@greensocs.com, pbonzini@redhat.com, jan.kiszka@siemens.com, rth@twiddle.net, peter.maydell@linaro.org, claudio.fontana@huawei.com, Peter Crosthwaite , "Michael S. Tsirkin" , Eduardo Habkost Sergey Fedorov writes: > On 05/04/16 18:32, Alex Bennée wrote: > (snip) >> diff --git a/cpu-exec.c b/cpu-exec.c >> index 74065d9..bd50fef 100644 >> --- a/cpu-exec.c >> +++ b/cpu-exec.c >> @@ -205,18 +205,24 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles, >> if (max_cycles > CF_COUNT_MASK) >> max_cycles = CF_COUNT_MASK; >> >> + tb_lock(); >> cpu->tb_invalidated_flag = false; >> tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags, >> max_cycles | CF_NOCACHE >> | (ignore_icount ? CF_IGNORE_ICOUNT : 0)); >> tb->orig_tb = cpu->tb_invalidated_flag ? NULL : orig_tb; >> cpu->current_tb = tb; >> + tb_unlock(); >> + >> /* execute the generated code */ >> trace_exec_tb_nocache(tb, tb->pc); >> - cpu_tb_exec(cpu, tb); >> + cpu_tb_exec(cpu, tb->tc_ptr); > > Very suspicious change. I can't even find which patch changes > cpu_tb_exec() accordingly. I think that came from a patch this series was based on. It's gone now. > >> + >> + tb_lock(); >> cpu->current_tb = NULL; >> tb_phys_invalidate(tb, -1); >> tb_free(tb); >> + tb_unlock(); >> } >> #endif >> >> diff --git a/exec.c b/exec.c >> index 17f390e..c46c123 100644 >> --- a/exec.c >> +++ b/exec.c >> @@ -2111,6 +2111,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags) >> continue; >> } >> cpu->watchpoint_hit = wp; >> + >> + /* Unlocked by cpu_loop_exit or cpu_resume_from_signal. */ > > In fact, neither cpu_resume_from_signal() nor cpu_loop_exit() unlocks > the lock by itself, it gets unlocked after sigsetjmp() returns via > siglongjmp() back to cpu_exec(). So maybe it would be more clear to say > something like "'tb_lock' gets unlocked after siglongjmp()"? "Locks are reset when we longjmp back to the main cpu_exec loop"? Looking at where the patch is though I think I need to bring that bit forward from the main series. > >> + tb_lock(); >> tb_check_watchpoint(cpu); >> if (wp->flags & BP_STOP_BEFORE_ACCESS) { >> cpu->exception_index = EXCP_DEBUG; > (snip) >> diff --git a/translate-all.c b/translate-all.c >> index a7ff5e7..935d24c 100644 >> --- a/translate-all.c >> +++ b/translate-all.c >> @@ -834,7 +834,9 @@ static void page_flush_tb(void) >> } >> >> /* flush all the translation blocks */ >> -/* XXX: tb_flush is currently not thread safe */ >> +/* XXX: tb_flush is currently not thread safe. System emulation calls it only >> + * with tb_lock taken or from safe_work, so no need to take tb_lock here. >> + */ > > "System emulation"? What about user-mode emulation? It's still not thread safe ;-) It's a harder problem to solve because we can't just suspend all threads to reset the translation buffer. I'm not sure we want to try and fix it in this series. > >> void tb_flush(CPUState *cpu) >> { >> #if defined(DEBUG_FLUSH) >> @@ -1350,6 +1352,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end, >> /* we remove all the TBs in the range [start, end[ */ >> /* XXX: see if in some cases it could be faster to invalidate all >> the code */ >> + tb_lock(); > > Don't we need also protect a call to page_find() above? page_find() > calls page_find_alloc() which is noted to be called with 'tb_lock' held. > However, it might depend on the way we treat 'mmap_lock' in system mode > emulation. We might also consider taking the lock outside of > tb_invalidate_phys*() functions because they can be called after > page_find(). > >> tb = p->first_tb; >> while (tb != NULL) { >> n = (uintptr_t)tb & 3; >> @@ -1417,12 +1420,13 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end, >> if (current_tb_modified) { >> /* we generate a block containing just the instruction >> modifying the memory. It will ensure that it cannot modify >> - itself */ >> + itself. cpu_resume_from_signal unlocks tb_lock. */ >> cpu->current_tb = NULL; >> tb_gen_code(cpu, current_pc, current_cs_base, current_flags, 1); >> cpu_resume_from_signal(cpu, NULL); >> } >> #endif >> + tb_unlock(); >> } >> >> #ifdef CONFIG_SOFTMMU > (snip) >> @ -1627,6 +1636,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr) >> target_ulong pc, cs_base; >> uint64_t flags; >> >> + tb_lock(); > > We don't have to take 'tb_lock' for nether tb_find_pc() nor > cpu_restore_state_from_tb() because the lock does not protect from > tb_flush() anyway. I think the lock should be taken just before the > first call to tb_phys_invalidate() in this function. > >> tb = tb_find_pc(retaddr); >> if (!tb) { >> cpu_abort(cpu, "cpu_io_recompile: could not find TB for pc=%p", >> @@ -1678,11 +1688,15 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr) >> /* FIXME: In theory this could raise an exception. In practice >> we have already translated the block once so it's probably ok. */ >> tb_gen_code(cpu, pc, cs_base, flags, cflags); >> - /* TODO: If env->pc != tb->pc (i.e. the faulting instruction was not >> - the first in the TB) then we end up generating a whole new TB and >> - repeating the fault, which is horribly inefficient. >> - Better would be to execute just this insn uncached, or generate a >> - second new TB. */ >> + >> + /* This unlocks the tb_lock. >> + * >> + * TODO: If env->pc != tb->pc (i.e. the faulting instruction was not >> + * the first in the TB) then we end up generating a whole new TB and >> + * repeating the fault, which is horribly inefficient. >> + * Better would be to execute just this insn uncached, or generate a >> + * second new TB. >> + */ >> cpu_resume_from_signal(cpu, NULL); >> } > (snip) > > Kind regards, > Sergey -- Alex Bennée