From: Sergey Fedorov
Message-ID: <575C6C36.9040701@gmail.com>
Date: Sat, 11 Jun 2016 22:53:26 +0300
In-Reply-To: <87ziqtf2ac.fsf@linaro.org>
Subject: Re: [Qemu-devel] [RFC 02/10] softmmu_llsc_template.h: Move to multi-threading
To: Alex Bennée
Cc: Alvise Rigo, mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
    jani.kokkonen@huawei.com, claudio.fontana@huawei.com,
    tech@virtualopensystems.com, fred.konrad@greensocs.com,
    pbonzini@redhat.com, rth@twiddle.net, cota@braap.org,
    peter.maydell@linaro.org

On 10/06/16 19:15, Alex Bennée wrote:
> Sergey Fedorov writes:
>
>> On 26/05/16 19:35, Alvise Rigo wrote:
>>> Using tcg_exclusive_{lock,unlock}(), make the emulation of
>>> LoadLink/StoreConditional thread safe.
>>>
>>> During an LL access, this lock protects the load access itself, the
>>> update of the exclusive history and the update of the VCPU's protected
>>> range. In an SC access, the lock protects the store access itself, the
>>> possible reset of other VCPUs' protected range and the reset of the
>>> exclusive context of the calling VCPU.
>>>
>>> The lock is also taken when a normal store happens to access an
>>> exclusive page, to reset other VCPUs' protected range in case of
>>> collision.
>> I think the key problem here is that the load in the LL helper can race
>> with a concurrent regular fast-path store.
(snip)
> I think this can be fixed using async_safe_run_on_cpu and tweaking the
> ldlink helper.
>
> * Change the helper_ldlink call
>   - pass it the offset of cpu->reg[n] so it can store the result of the
>     load
>   - maybe pass it the next PC (unless there is some other way to know it)
>
> The vCPU runs until the ldlink instruction occurs and jumps to the
> helper.
>
> * Once in helper_ldlink
>   - queue up an async helper function with the offset info
>   - cpu_loop_exit_restore(with next PC)
>
> The vCPU that issued the ldlink exits immediately and waits until all
> vCPUs are out of generated code.
>
> * Once in the helper_ldlink async helper
>   - everything at this point is quiescent, no vCPU activity
>   - flush all TLBs / set flags
>   - do the load from memory, store directly into cpu->reg[n]
>
> The key thing is that once we are committed to the load in the async
> helper, nothing else can get in the way. Any stores before we are in the
> helper happen as normal; once we exit the async helper, all potentially
> conflicting stores will take the slow path.
>
> There is a little messing about in knowing the next PC, which is simple
> in the ARM case but gets a bit more complicated for architectures that
> have branch delay slots. I haven't looked into this nit yet.
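If I've understood the idea correctly, the helper side would look roughly
like this. (Very rough sketch only: LLReq, do_ldlink_safe(),
cpu_set_exclusive_range() and the exact async_safe_run_on_cpu() signature
are my guesses here, not code from the series.)

typedef struct LLReq {
    target_ulong addr;  /* guest address of the exclusive load */
    int reg;            /* destination register index */
} LLReq;

/* Runs as async *safe* work, i.e. while every vCPU is out of
 * generated code, so nothing can race with the load below. */
static void do_ldlink_safe(CPUState *cpu, void *opaque)
{
    LLReq *req = opaque;
    CPUArchState *env = cpu->env_ptr;

    /* Flush TLBs / set flags so that later conflicting stores are
     * forced onto the slow path (placeholder for the real step). */
    cpu_set_exclusive_range(cpu, req->addr);

    /* Do the load and put the result straight into the register. */
    env->regs[req->reg] = cpu_ldl_data(env, req->addr);

    g_free(req);
}

/* Called from generated code at the ldlink instruction. */
void HELPER(ldlink)(CPUArchState *env, target_ulong addr,
                    uint32_t reg, target_ulong next_pc)
{
    CPUState *cs = ENV_GET_CPU(env);
    LLReq *req = g_new(LLReq, 1);

    req->addr = addr;
    req->reg = reg;

    /* Queue the real work; it runs once all vCPUs have exited. */
    async_safe_run_on_cpu(cs, do_ldlink_safe, req);

    /* Resume after the LL instruction once the safe work has run. */
    env->pc = next_pc;
    cpu_loop_exit(cs);
}

If that's the shape of it, then I agree the load itself can no longer race
with a fast-path store.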
Hmm, this looks pretty similar to what linux-user code actually does,
e.g. in do_strex(). Just restarting the LL instruction, as Alvise
suggests, may well be an easier approach (or may not).

Kind regards,
Sergey
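P.S. The linux-user pattern I mean is roughly this (paraphrased from
do_strex() in linux-user/main.c for ARM; simplified, not a verbatim copy):

/* Paraphrase of the linux-user exclusive-store pattern; the field and
 * macro names come from the ARM linux-user code, but the function
 * itself is a simplified sketch. */
static int strex_sketch(CPUARMState *env, target_ulong addr, uint32_t newval)
{
    uint32_t oldval;
    int rc = 1;  /* 1 = exclusive store failed */

    start_exclusive();  /* park every other emulated thread */
    if (env->exclusive_addr == addr
        && get_user_u32(oldval, addr) == 0
        && oldval == (uint32_t)env->exclusive_val
        && put_user_u32(newval, addr) == 0) {
        rc = 0;  /* nobody else is running, so this store cannot race */
    }
    env->exclusive_addr = -1;
    end_exclusive();  /* let the other threads run again */
    return rc;
}

i.e. the whole check-and-store runs while everything else is stopped,
which is the same "quiescent point" idea as the async safe work above.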