All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Burton <mark.burton@greensocs.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: mttcg@listserver.greensocs.com,
	"J. Kiszka" <jan.kiszka@siemens.com>,
	"Alexander Graf" <agraf@suse.de>,
	"QEMU Developers" <qemu-devel@nongnu.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"KONRAD Frédéric" <fred.konrad@greensocs.com>
Subject: Re: [Qemu-devel] [RFC 01/10] target-arm: protect cpu_exclusive_*.
Date: Tue, 3 Mar 2015 16:29:13 +0100	[thread overview]
Message-ID: <711BBCAD-9514-4523-A456-AD4F397C41CD@greensocs.com> (raw)
In-Reply-To: <CAFEAcA8Lg_iLhd2Vs8Q1KtvWUrqRyHGjmU92RemqNmC2MDEf0w@mail.gmail.com>

Hi Peter, thanks for the feedback

So - what we have discovered - using the slow path as discussed has a significant performance hit (no surprise), likewise the user-mode exit mechanism. However we are investigating that in parallel...

However, for now, we have left the TCG doing the majority of the work. 
(This was supposed to be the ‘quick an easy’ approach - we have already discussed better approaches to atomic instructions at length - but they are even more invasive!)

we have tried several schemes, and not found anything totally satisfactory:  we have a gremlin. There seems to be a race condition somewhere effectively means that the guest ‘ticket’ mechanism inside the guest kernel goes off by one, and therefore the guest kernel stalls as it never gets the ticket it’s looking for…. (it’s off by one in the ‘one too few’ sense, both CPU’s end up looking for tickets that are higher than the current finished ticket)...

The mechanism to hand out the tickets is of course a LDREX/STREX, leading us to believe that is the cause of our problems, however I am increasingly sceptical.


Our scheme is actually quite simple now:
We keep the same overall scheme as exits upstream today - we store the address and value.
We provide a lock which is used for LDREX, STREX, CLREX, and in the raise_exception code.
For LDREX we check that no other CPU is tagging the address. Because of the lock, we can be sure no STREX is executing so we are always safe to ‘steal’ a tag from another CPU (which is what the ARM ARM defines).
STREX and CLREX are similar to upstream...
We are also careful to invalidate the address tag if we take an exception.

The drawback of this scheme is of course that we are not totally protected against non exclusive writes, which could potentially fall between our reading a value and checking it against our saved value. But I am not convinced that is our problem (I have checked to make sure we dont get non exclusive writes).

This scheme would seem to be elegant and simple - but it suffers from the one small drawback of still having the issue as above…


Cheers

Mark.

ps. on our bug - we believe somehow the STREX is being marked as failed, but actually succeeds to write something.  There are only 3 ways the strex can fail:
1/ the address doesn't match - in which case no ST will be attempted
2/ the value doesn't match - which means - no ST attempted
3/ the store starts, but causes a TLB fill/exception…

The 3rd has 2 possibilities - the TLB is filled, and the store goes ahead totally normally - there should be no ‘fail’ - or an exception is generated in which case we will long jump away and never return. 

Am I missing something?



>> 
>> 


> On 2 Mar 2015, at 13:27, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On 27 February 2015 at 16:54, Mark Burton <mark.burton@greensocs.com> wrote:
>> 
>>> On 26 Feb 2015, at 23:56, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> cpu_physical_memory_rw would bypass the TLB and so be much slower.
>>> Make sure you use the functions which go via the TLB if you do
>>> this in a helper (and remember that they will longjmp out on a
>>> tlb miss!)
>> 
>> At this point speed isn’t our main concern - it’s simplicity of implementation - we want it to work, then we can worry about a better implementation (which likely should not go this path at all - as discussed above).
>> Given that - isn’t it reasonable to pass through cpu_physical_memory_rw - and hence not have to worry about the long jump ? Or am I missing something?
> 
> If you use cpu_physical_memory_rw you need to do the
> virt-to-phys translation by hand (preferably via the TLB).
> That might be something you needed to do anyway if we want
> to have architecturally correct monitors that work on
> physaddrs rather than vaddrs, but if not then the two
> step process is a bit awkward.
> 
>>> Pretty sure we've already discussed how the current ldrex/strex
>>> implementation is not architecturally correct. I think this is
>>> another of those areas.
>> 
>> We have indeed discussed this - but this is a surprise.
> 
> You're right that I didn't specifically realise this exact
> part of our incorrectness earlier.
> 
> -- PMM


	 +44 (0)20 7100 3485 x 210
 +33 (0)5 33 52 01 77x 210

	+33 (0)603762104
	mark.burton

  reply	other threads:[~2015-03-03 15:29 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-16 17:19 [Qemu-devel] [RFC 00/10] MultiThread TCG fred.konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 01/10] target-arm: protect cpu_exclusive_* fred.konrad
2015-01-27 14:36   ` Alex Bennée
2015-01-29 15:17   ` Peter Maydell
2015-02-02  8:31     ` Frederic Konrad
2015-02-02  8:36       ` Peter Maydell
2015-02-26 18:09     ` Frederic Konrad
2015-02-26 20:36       ` Alexander Graf
2015-02-26 22:56       ` Peter Maydell
2015-02-27  7:54         ` Mark Burton
2015-03-02 12:27           ` Peter Maydell
2015-03-03 15:29             ` Mark Burton [this message]
2015-03-03 15:32               ` Paolo Bonzini
2015-03-03 15:33                 ` Mark Burton
2015-03-03 15:34                   ` Paolo Bonzini
2015-03-03 15:41                     ` Mark Burton
2015-03-03 15:47                   ` Dr. David Alan Gilbert
2015-03-13 19:38                     ` Richard Henderson
2015-03-13 20:04                       ` Dr. David Alan Gilbert
2015-01-16 17:19 ` [Qemu-devel] [RFC 02/10] use a different translation block list for each cpu fred.konrad
2015-01-27 14:45   ` Alex Bennée
2015-01-27 15:16     ` Frederic Konrad
2015-01-29 15:24   ` Peter Maydell
2015-01-29 15:33     ` Mark Burton
2015-02-02  8:39     ` Frederic Konrad
2015-02-02  8:49       ` Peter Maydell
2015-02-03 16:17   ` Richard Henderson
2015-02-03 16:33     ` Paolo Bonzini
2015-01-16 17:19 ` [Qemu-devel] [RFC 03/10] replace spinlock by QemuMutex fred.konrad
2015-01-29 15:25   ` Peter Maydell
2015-02-02  8:45     ` Frederic Konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 04/10] remove unused spinlock fred.konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 05/10] extract TBContext from TCGContext fred.konrad
2015-01-29 15:44   ` Peter Maydell
2015-02-03 16:30     ` Richard Henderson
2015-01-16 17:19 ` [Qemu-devel] [RFC 06/10] protect TBContext with tb_lock fred.konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 07/10] tcg: remove tcg_halt_cond global variable fred.konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 08/10] Drop global lock during TCG code execution fred.konrad
2015-01-16 17:19 ` [Qemu-devel] [RFC 09/10] cpu: remove exit_request global fred.konrad
2015-01-29 15:52   ` Peter Maydell
2015-02-02 10:03     ` Paolo Bonzini
2015-02-02 13:12       ` Peter Maydell
2015-02-02 13:14         ` Paolo Bonzini
2015-02-03  9:37     ` Frederic Konrad
2015-02-03 10:29       ` Peter Maydell
2015-01-16 17:19 ` [Qemu-devel] [RFC 10/10] tcg: switch on multithread fred.konrad
2015-03-27 10:08 ` [Qemu-devel] [RFC 00/10] MultiThread TCG Alex Bennée
2015-03-27 10:37   ` Frederic Konrad
2015-03-30  6:52     ` Mark Burton
2015-03-30 21:46       ` Peter Maydell
2015-03-31  6:41         ` Mark Burton
2015-04-10 16:03         ` Frederic Konrad
2015-04-22 12:26           ` Frederic Konrad
2015-04-22 13:18             ` Peter Maydell
2015-04-23  7:38               ` Frederic Konrad
2015-04-23 15:44             ` Alex Bennée
2015-04-23 15:46               ` Alex Bennée
2015-04-27  7:37                 ` Frederic Konrad
2015-04-27 17:06             ` Emilio G. Cota
2015-04-28  8:17               ` Frederic Konrad
2015-04-28  9:06               ` Paolo Bonzini
2015-04-28 17:49                 ` Emilio G. Cota

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=711BBCAD-9514-4523-A456-AD4F397C41CD@greensocs.com \
    --to=mark.burton@greensocs.com \
    --cc=agraf@suse.de \
    --cc=fred.konrad@greensocs.com \
    --cc=jan.kiszka@siemens.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.