All of lore.kernel.org
 help / color / mirror / Atom feed
From: alvise rigo <a.rigo@virtualopensystems.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: mttcg@listserver.greensocs.com,
	Claudio Fontana <claudio.fontana@huawei.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jani Kokkonen <jani.kokkonen@huawei.com>,
	VirtualOpenSystems Technical Team <tech@virtualopensystems.com>
Subject: Re: [Qemu-devel] [RFC v3 13/13] softmmu_template.h: move to multithreading
Date: Fri, 17 Jul 2015 18:19:37 +0200	[thread overview]
Message-ID: <CAH47eN1-c83Ng6R7XgBQyauws1G_7cJMO_1jPaxLOhnqOJru1Q@mail.gmail.com> (raw)
In-Reply-To: <87615i6aln.fsf@linaro.org>

On Fri, Jul 17, 2015 at 5:57 PM, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Alvise Rigo <a.rigo@virtualopensystems.com> writes:
>
>> Exploiting the tcg_excl_access_lock, port the helper_{le,be}_st_name to
>> work in real multithreading.
>>
>> - The macro lookup_cpus_ll_addr now uses directly the
>>   env->excl_protected_addr to invalidate others' LL/SC operations
>>
>> Suggested-by: Jani Kokkonen <jani.kokkonen@huawei.com>
>> Suggested-by: Claudio Fontana <claudio.fontana@huawei.com>
>> Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
>> ---
>>  softmmu_template.h | 110 +++++++++++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 89 insertions(+), 21 deletions(-)
>>
>> diff --git a/softmmu_template.h b/softmmu_template.h
>> index bc767f6..522454f 100644
>> --- a/softmmu_template.h
>> +++ b/softmmu_template.h
>> @@ -141,21 +141,24 @@
>>      vidx >= 0;                                                                \
>>  })
>>
>> +#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
>> +
>> +/* This macro requires the caller to have the tcg_excl_access_lock lock since
>> + * it modifies the excl_protected_hwaddr of a running vCPU.
>> + * The macros scans all the excl_protected_hwaddr of all the vCPUs and compare
>> + * them with the address the current vCPU is writing to. If there is a match,
>> + * we reset the value, making the SC fail. */
>
> It would have been nice if we had started with a comment when the
> function^H^H^H^H^H macro was first introduced and then updated here.

OK. Related to this, I think to refactor the patches in such a way to
drop the "move to multithreading" part, since it makes things more
confusing (even if my intent was the opposite).

>
>>  #define lookup_cpus_ll_addr(addr)                                             \
>>  ({                                                                            \
>>      CPUState *cpu;                                                            \
>>      CPUArchState *acpu;                                                       \
>> -    bool hit = false;                                                         \
>>                                                                                \
>>      CPU_FOREACH(cpu) {                                                        \
>>          acpu = (CPUArchState *)cpu->env_ptr;                                  \
>>          if (cpu != current_cpu && acpu->excl_protected_hwaddr == addr) {      \
>> -            hit = true;                                                       \
>> -            break;                                                            \
>> +            acpu->excl_protected_hwaddr = EXCLUSIVE_RESET_ADDR;               \
>>          }                                                                     \
>>      }                                                                         \
>> -                                                                              \
>> -    hit;                                                                      \
>>  })
>
> My comment about using an inline function in the earlier patch stands.
>
>>
>>  #ifndef SOFTMMU_CODE_ACCESS
>> @@ -439,18 +442,52 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
>>               * exclusive-protected memory. */
>>              hwaddr hw_addr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
>>
>> -            bool set_to_dirty;
>> -
>>              /* Two cases of invalidation: the current vCPU is writing to another
>>               * vCPU's exclusive address or the vCPU that issued the LoadLink is
>>               * writing to it, but not through a StoreCond. */
>> -            set_to_dirty = lookup_cpus_ll_addr(hw_addr);
>> -            set_to_dirty |= env->ll_sc_context &&
>> -                           (env->excl_protected_hwaddr == hw_addr);
>> +            qemu_mutex_lock(&tcg_excl_access_lock);
>> +
>> +            /* The macro lookup_cpus_ll_addr could have reset the exclusive
>> +             * address. Fail the SC in this case.
>> +             * N.B.: Here excl_succeeded == 0 means that we don't come from a
>> +             * store conditional.  */
>> +            if (env->excl_succeeded &&
>> +                        (env->excl_protected_hwaddr == EXCLUSIVE_RESET_ADDR)) {
>> +                env->excl_succeeded = 0;
>> +                qemu_mutex_unlock(&tcg_excl_access_lock);
>> +
>> +                return;
>> +            }
>> +
>> +            lookup_cpus_ll_addr(hw_addr);
>
> Add comment for side effect, also confused by the above comment given we
> call lookups_cpus_ll_addr() after the comment about it tweaking excl_succedded.

I will improve all the comments to avoid any misinterpretation. I will
also be more verbose in the cover letter about how the whole machinery
works.

>
>> +
>> +            if (!env->excl_succeeded) {
>> +                if (env->ll_sc_context &&
>> +                            (env->excl_protected_hwaddr == hw_addr)) {
>> +                    cpu_physical_memory_set_excl_dirty(hw_addr);
>> +                }
>> +            } else {
>> +                if (cpu_physical_memory_excl_is_dirty(hw_addr) ||
>> +                        env->excl_protected_hwaddr != hw_addr) {
>> +                    env->excl_protected_hwaddr = EXCLUSIVE_RESET_ADDR;
>> +                    qemu_mutex_unlock(&tcg_excl_access_lock);
>> +                    env->excl_succeeded = 0;
>> +
>> +                    return;
>> +                }
>> +            }
>
> I'm wondering if it can be more naturally written the other way round to
> aid comprehension:
>
> if (env->excl_succeeded) {
>    if (cpu_physical_memory_excl_is_dirty(hw_addr) ||
>        env->excl_protected_hwaddr != hw_addr) {
>        ..do stuff..
>        return
>    }
> } else {
>    if (env->ll_sc_context &&
>       (env->excl_protected_hwaddr == hw_addr)) {
>       cpu_physical_memory_set_excl_dirty(hw_addr);
>    }
> }
>
> Although now I'm confused as to why we push on in 3 of the 4 cases.

I will try to find another way to write this code, starting from
adding more comments about the corner cases that we are addressing.

>
>> +
>> +            haddr = addr + env->tlb_table[mmu_idx][index].addend;
>> +        #if DATA_SIZE == 1
>> +            glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
>> +        #else
>> +            glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
>> +        #endif
>> +
>> +            env->excl_protected_hwaddr = EXCLUSIVE_RESET_ADDR;
>> +            qemu_mutex_unlock(&tcg_excl_access_lock);
>>
>> -            if (set_to_dirty) {
>> -                cpu_physical_memory_set_excl_dirty(hw_addr);
>> -            } /* the vCPU is legitimately writing to the protected address */
>> +            return;
>>          } else {
>>              if ((addr & (DATA_SIZE - 1)) != 0) {
>>                  goto do_unaligned_access;
>> @@ -537,18 +574,49 @@ void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
>>               * exclusive-protected memory. */
>>              hwaddr hw_addr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
>>
>> -            bool set_to_dirty;
>> -
>>              /* Two cases of invalidation: the current vCPU is writing to another
>>               * vCPU's exclusive address or the vCPU that issued the LoadLink is
>>               * writing to it, but not through a StoreCond. */
>> -            set_to_dirty = lookup_cpus_ll_addr(hw_addr);
>> -            set_to_dirty |= env->ll_sc_context &&
>> -                           (env->excl_protected_hwaddr == hw_addr);
>> +            qemu_mutex_lock(&tcg_excl_access_lock);
>> +
>> +            /* The macro lookup_cpus_ll_addr could have reset the exclusive
>> +             * address. Fail the SC in this case.
>> +             * N.B.: Here excl_succeeded == 0 means that we don't come from a
>> +             * store conditional.  */
>> +            if (env->excl_succeeded &&
>> +                        (env->excl_protected_hwaddr == EXCLUSIVE_RESET_ADDR)) {
>> +                env->excl_succeeded = 0;
>> +                qemu_mutex_unlock(&tcg_excl_access_lock);
>> +
>> +                return;
>> +            }
>> +
>> +            lookup_cpus_ll_addr(hw_addr);
>> +
>> +            if (!env->excl_succeeded) {
>> +                if (env->ll_sc_context &&
>> +                            (env->excl_protected_hwaddr == hw_addr)) {
>> +                    cpu_physical_memory_set_excl_dirty(hw_addr);
>> +                }
>> +            } else {
>> +                if (cpu_physical_memory_excl_is_dirty(hw_addr) ||
>> +                        env->excl_protected_hwaddr != hw_addr) {
>> +                    env->excl_protected_hwaddr = EXCLUSIVE_RESET_ADDR;
>> +                    qemu_mutex_unlock(&tcg_excl_access_lock);
>> +                    env->excl_succeeded = 0;
>> +
>> +                    return;
>> +                }
>> +            }
>
> Given the amount of copy/paste between the two functions I wonder if
> there is some commonality to be re-factored out here?

Either we create inline functions outside this file (like in cputlb.c)
or we define new macros.
I don't see other viable ways to re-factor this code.

Thank you,
alvise

>
>>
>> -            if (set_to_dirty) {
>> -                cpu_physical_memory_set_excl_dirty(hw_addr);
>> -            } /* the vCPU is legitimately writing to the protected address */
>> +            haddr = addr + env->tlb_table[mmu_idx][index].addend;
>> +
>> +            glue(glue(st, SUFFIX), _be_p)((uint8_t *)haddr, val);
>> +
>> +            qemu_mutex_unlock(&tcg_excl_access_lock);
>> +            env->excl_protected_hwaddr = EXCLUSIVE_RESET_ADDR;
>> +
>> +            return;
>>          } else {
>>              if ((addr & (DATA_SIZE - 1)) != 0) {
>>                  goto do_unaligned_access;
>
> --
> Alex Bennée

  reply	other threads:[~2015-07-17 16:19 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-10  8:23 [Qemu-devel] [RFC v3 00/13] Slow-path for atomic instruction translation Alvise Rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 01/13] exec: Add new exclusive bitmap to ram_list Alvise Rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 02/13] cputlb: Add new TLB_EXCL flag Alvise Rigo
2015-07-16 14:32   ` Alex Bennée
2015-07-16 15:04     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 03/13] softmmu: Add helpers for a new slow-path Alvise Rigo
2015-07-16 14:53   ` Alex Bennée
2015-07-16 15:15     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 04/13] tcg-op: create new TCG qemu_ldlink and qemu_stcond instructions Alvise Rigo
2015-07-17  9:49   ` Alex Bennée
2015-07-17 10:05     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 05/13] target-arm: translate: implement qemu_ldlink and qemu_stcond ops Alvise Rigo
2015-07-17 12:51   ` Alex Bennée
2015-07-17 13:01     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 06/13] target-i386: " Alvise Rigo
2015-07-17 12:56   ` Alex Bennée
2015-07-17 13:27     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 07/13] ram_addr.h: Make exclusive bitmap accessors atomic Alvise Rigo
2015-07-17 13:32   ` Alex Bennée
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 08/13] exec.c: introduce a simple rendezvous support Alvise Rigo
2015-07-17 13:45   ` Alex Bennée
2015-07-17 13:54     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 09/13] cpus.c: introduce simple callback support Alvise Rigo
2015-07-10  9:36   ` Paolo Bonzini
2015-07-10  9:47     ` alvise rigo
2015-07-10  9:53       ` Frederic Konrad
2015-07-10 10:06         ` alvise rigo
2015-07-10 10:24       ` Paolo Bonzini
2015-07-10 12:16         ` Frederic Konrad
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 10/13] Simple TLB flush wrap to use as exit callback Alvise Rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 11/13] Introduce exit_flush_req and tcg_excl_access_lock Alvise Rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 12/13] softmmu_llsc_template.h: move to multithreading Alvise Rigo
2015-07-17 15:27   ` Alex Bennée
2015-07-17 15:31     ` alvise rigo
2015-07-10  8:23 ` [Qemu-devel] [RFC v3 13/13] softmmu_template.h: " Alvise Rigo
2015-07-17 15:57   ` Alex Bennée
2015-07-17 16:19     ` alvise rigo [this message]
2015-07-10  8:31 ` [Qemu-devel] [RFC v3 00/13] Slow-path for atomic instruction translation Mark Burton
2015-07-10  8:58   ` alvise rigo
2015-07-10  8:39 ` Frederic Konrad
2015-07-10  9:04   ` alvise rigo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAH47eN1-c83Ng6R7XgBQyauws1G_7cJMO_1jPaxLOhnqOJru1Q@mail.gmail.com \
    --to=a.rigo@virtualopensystems.com \
    --cc=alex.bennee@linaro.org \
    --cc=claudio.fontana@huawei.com \
    --cc=jani.kokkonen@huawei.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tech@virtualopensystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.