* [Qemu-devel] [RFC PATCH v1 0/9] MTTCG and record/replay fixes for rc3
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée

Hi,

This is the current state of my fixes for icount-based record and
replay. It doesn't completely fix the problem (hence the RFC status)
but improves it to the point that I have been able to record and
replay the boot of a vexpress kernel.

The first 3 patches are helper scripts I've been using during my
debugging. The first is the only real fix; the following 2 should
probably be dropped from any pull request as they introduce new
features rather than fixing something.

We then have another BQL fix for i386. I haven't had a chance to
replicate it myself so far, but it looks perfectly sane to me.

Finally the fixes for icount:

  cpus: remove icount handling from qemu_tcg_cpu_thread_fn

  Simple clean-up as we don't do icount for MTTCG

  cpus: check cpu->running in cpu_get_icount_raw()

  I'm not sure the race actually happens, and once outside of
  cpu->running the icount counters should be zero. However it seems a
  sensible precaution.

  cpus: move icount preparation out of tcg_exec_cpu

  This is a little light re-factoring that stops the icount work
  getting in the way of the main body of tcg_exec_cpu. It also removes
  some redundant assignments and replaces them with asserts for now.

  cpus: don't credit executed instructions before they have run

  This is the main one which ensures we never jump forward in time and
  cpu_get_icount_raw() remains consistent.

  replay: gracefully handle backward time events

  This is the most hand-wavey patch. It glosses over the disparity in
  time between the vCPU thread and the main-loop by jumping forward to
  the most current time value. However it is not really deterministic
  and runs into potential problems with sequencing of log events.

  I think a better fix would be to extend replay_lock() so all related
  log events are serialised and we don't end up with interleaved
  events from the vCPU thread and the main-loop.

I think the cpus: patches should probably go into the next
pull-request while we see if we can come up with a better final
solution for fixing record/replay. However given how long this
regression has run during the release candidate process I wanted to
update everyone on the current status and get feedback ASAP.

Cheers,


Alex Bennée (9):
  scripts/qemugdb/mtree.py: fix up mtree dump
  scripts/qemu-gdb/timers.py: new helper to dump timer state
  scripts/replay-dump.py: replay log dumper
  target/i386/misc_helper: wrap BQL around another IRQ generator
  cpus: remove icount handling from qemu_tcg_cpu_thread_fn
  cpus: check cpu->running in cpu_get_icount_raw()
  cpus: move icount preparation out of tcg_exec_cpu
  cpus: don't credit executed instructions before they have run
  replay: gracefully handle backward time events

 cpus.c                    |  94 +++++++++++-----
 include/qom/cpu.h         |   1 +
 replay/replay-internal.c  |   7 ++
 replay/replay.c           |   9 +-
 scripts/qemu-gdb.py       |   3 +-
 scripts/qemugdb/mtree.py  |  12 +-
 scripts/qemugdb/timers.py |  54 +++++++++
 scripts/replay-dump.py    | 272 ++++++++++++++++++++++++++++++++++++++++++++++
 target/i386/misc_helper.c |   3 +
 9 files changed, 423 insertions(+), 32 deletions(-)
 create mode 100644 scripts/qemugdb/timers.py
 create mode 100755 scripts/replay-dump.py

-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 1/9] scripts/qemugdb/mtree.py: fix up mtree dump
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée

Since QEMU became able to build with native Int128 support this script
has been broken, as it attempts to fish values out of the now
non-existent structure. Also the alias print was trying to make a %x
out of gdb.ValueType directly, which didn't seem to work.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 scripts/qemugdb/mtree.py | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/scripts/qemugdb/mtree.py b/scripts/qemugdb/mtree.py
index cc8131c2e7..e6791b7885 100644
--- a/scripts/qemugdb/mtree.py
+++ b/scripts/qemugdb/mtree.py
@@ -21,7 +21,15 @@ def isnull(ptr):
     return ptr == gdb.Value(0).cast(ptr.type)
 
 def int128(p):
-    return int(p['lo']) + (int(p['hi']) << 64)
+    '''Read an Int128 type to a python integer.
+
+    QEMU can be built with native Int128 support so we need to detect
+    if the value is a structure or the native type.
+    '''
+    if p.type.code == gdb.TYPE_CODE_STRUCT:
+        return int(p['lo']) + (int(p['hi']) << 64)
+    else:
+        return int(("%s" % p), 16)
 
 class MtreeCommand(gdb.Command):
     '''Display the memory tree hierarchy'''
@@ -69,7 +77,7 @@ class MtreeCommand(gdb.Command):
             gdb.write('%s    alias: %s@%016x (@ %s)\n' %
                       ('  ' * level,
                        alias['name'].string(),
-                       ptr['alias_offset'],
+                       int(ptr['alias_offset']),
                        alias,
                        ),
                       gdb.STDOUT)
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 2/9] scripts/qemu-gdb/timers.py: new helper to dump timer state
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée

This introduces the qemu-gdb command "qemu timers" which will dump the
state of the main timers in the system.
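
As a hedged illustration, a hypothetical session might look like this
(the values are invented; the line layout follows the gdb.write()
format strings in the patch):

  (gdb) source scripts/qemu-gdb.py
  (gdb) qemu timers
  Processing Realtime timers
    clock QEMU_CLOCK_REALTIME is enabled:true, last:123456789
  Processing Virtual timers
    clock QEMU_CLOCK_VIRTUAL is enabled:true, last:42000000
    timer 52000000/1 (cb:0x55e0c3a0 <some_timer_cb>,opq:0x0)
  ...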

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 scripts/qemu-gdb.py       |  3 ++-
 scripts/qemugdb/timers.py | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 1 deletion(-)
 create mode 100644 scripts/qemugdb/timers.py

diff --git a/scripts/qemu-gdb.py b/scripts/qemu-gdb.py
index b3f8e04f77..3e7adb87dc 100644
--- a/scripts/qemu-gdb.py
+++ b/scripts/qemu-gdb.py
@@ -26,7 +26,7 @@ import os, sys
 
 sys.path.append(os.path.dirname(__file__))
 
-from qemugdb import aio, mtree, coroutine
+from qemugdb import aio, mtree, coroutine, timers
 
 class QemuCommand(gdb.Command):
     '''Prefix for QEMU debug support commands'''
@@ -38,6 +38,7 @@ QemuCommand()
 coroutine.CoroutineCommand()
 mtree.MtreeCommand()
 aio.HandlersCommand()
+timers.TimersCommand()
 
 coroutine.CoroutineSPFunction()
 coroutine.CoroutinePCFunction()
diff --git a/scripts/qemugdb/timers.py b/scripts/qemugdb/timers.py
new file mode 100644
index 0000000000..be71a001e3
--- /dev/null
+++ b/scripts/qemugdb/timers.py
@@ -0,0 +1,54 @@
+#!/usr/bin/python
+# GDB debugging support
+#
+# Copyright 2017 Linaro Ltd
+#
+# Author: Alex Bennée <alex.bennee@linaro.org>
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+
+# 'qemu timers' -- display the current timerlists
+
+import gdb
+
+class TimersCommand(gdb.Command):
+    '''Display the current QEMU timers'''
+
+    def __init__(self):
+        'Register the class as a gdb command'
+        gdb.Command.__init__(self, 'qemu timers', gdb.COMMAND_DATA,
+                             gdb.COMPLETE_NONE)
+
+    def dump_timers(self, timer):
+        "Follow a timer and recursively dump each one in the list."
+        # timer should be of type QemuTimer
+        gdb.write("    timer %s/%s (cb:%s,opq:%s)\n" % (
+            timer['expire_time'],
+            timer['scale'],
+            timer['cb'],
+            timer['opaque']))
+
+        if int(timer['next']) > 0:
+            self.dump_timers(timer['next'])
+
+
+    def process_timerlist(self, tlist, ttype):
+        gdb.write("Processing %s timers\n" % (ttype))
+        gdb.write("  clock %s is enabled:%s, last:%s\n" % (
+            tlist['clock']['type'],
+            tlist['clock']['enabled'],
+            tlist['clock']['last']))
+        if int(tlist['active_timers']) > 0:
+            self.dump_timers(tlist['active_timers'])
+
+
+    def invoke(self, arg, from_tty):
+        'Run the command'
+        main_timers = gdb.parse_and_eval("main_loop_tlg")
+
+        # This will break if QEMUClockType in timer.h is redefined
+        self.process_timerlist(main_timers['tl'][0], "Realtime")
+        self.process_timerlist(main_timers['tl'][1], "Virtual")
+        self.process_timerlist(main_timers['tl'][2], "Host")
+        self.process_timerlist(main_timers['tl'][3], "Virtual RT")
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 3/9] scripts/replay-dump.py: replay log dumper
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée

This script is a debugging tool for looking through the contents of a
replay log file. It is incomplete but should fail gracefully at events
it doesn't understand.

It currently understands two different log formats, as the audio
record/replay support was merged after MTTCG. It was written to help
debug how the BQL changes have broken replay support.
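
As a hedged illustration, a hypothetical run might look like this (the
file name and values are invented; the line layout follows print_event
in the script):

  $ ./scripts/replay-dump.py -f replay.bin
  HEADER: version 0xe02006
  1:EVENT_CP_INIT(19)
  2:EVENT_INSTRUCTION(0) 0x20
  3:EVENT_CP_CLOCK_WARP_START(12) no additional data
  ...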

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 scripts/replay-dump.py | 272 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 272 insertions(+)
 create mode 100755 scripts/replay-dump.py

diff --git a/scripts/replay-dump.py b/scripts/replay-dump.py
new file mode 100755
index 0000000000..fdd178aba0
--- /dev/null
+++ b/scripts/replay-dump.py
@@ -0,0 +1,272 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Dump the contents of a recorded execution stream
+#
+#  Copyright (c) 2017 Alex Bennée <alex.bennee@linaro.org>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
+
+import argparse
+import struct
+from collections import namedtuple
+
+# This mirrors some of the global replay state which some of the
+# stream loading refers to. Some decoders may read the next event so
+# we need to handle that case. Calling reuse_event will ensure the next
+# event is read from the cache rather than advancing the file.
+
+class ReplayState(object):
+    def __init__(self):
+        self.event = -1
+        self.event_count = 0
+        self.already_read = False
+        self.current_checkpoint = 0
+        self.checkpoint = 0
+
+    def set_event(self, ev):
+        self.event = ev
+        self.event_count += 1
+
+    def get_event(self):
+        self.already_read = False
+        return self.event
+
+    def reuse_event(self, ev):
+        self.event = ev
+        self.already_read = True
+
+    def set_checkpoint(self):
+        self.checkpoint = self.event - self.checkpoint_start
+
+    def get_checkpoint(self):
+        return self.checkpoint
+
+replay_state = ReplayState()
+
+# Simple read functions that mirror replay-internal.c
+# The file-stream is big-endian and manually written out a byte at a time.
+
+def read_byte(fin):
+    "Read a single byte"
+    return struct.unpack('>B', fin.read(1))[0]
+
+def read_event(fin):
+    "Read a single byte event, but save some state"
+    if replay_state.already_read:
+        return replay_state.get_event()
+    else:
+        replay_state.set_event(read_byte(fin))
+        return replay_state.event
+
+def read_word(fin):
+    "Read a 16 bit word"
+    return struct.unpack('>H', fin.read(2))[0]
+
+def read_dword(fin):
+    "Read a 32 bit word"
+    return struct.unpack('>I', fin.read(4))[0]
+
+def read_qword(fin):
+    "Read a 64 bit word"
+    return struct.unpack('>Q', fin.read(8))[0]
+
+# Generic decoder structure
+Decoder = namedtuple("Decoder", "eid name fn")
+
+def call_decode(table, index, dumpfile):
+    "Search decode table for next step"
+    decoder = next((d for d in table if d.eid == index), None)
+    if not decoder:
+        print "Could not decode index: %d" % (index)
+        print "Entry is: %s" % (decoder)
+        print "Decode Table is:\n%s" % (table)
+        return False
+    else:
+        return decoder.fn(decoder.eid, decoder.name, dumpfile)
+
+# Print event
+def print_event(eid, name, string=None, event_count=None):
+    "Print event with count"
+    if not event_count:
+        event_count = replay_state.event_count
+
+    if string:
+        print "%d:%s(%d) %s" % (event_count, name, eid, string)
+    else:
+        print "%d:%s(%d)" % (event_count, name, eid)
+
+
+# Decoders for each event type
+
+def decode_unimp(eid, name, _unused_dumpfile):
+    "Unimplimented decoder, will trigger exit"
+    print "%s not handled - will now stop" % (name)
+    return False
+
+# Checkpoint decoder
+def swallow_async_qword(eid, name, dumpfile):
+    "Swallow a qword of data without looking at it"
+    step_id = read_qword(dumpfile)
+    print "  %s(%d) @ %d" % (name, eid, step_id)
+    return True
+
+async_decode_table = [ Decoder(0, "REPLAY_ASYNC_EVENT_BH", swallow_async_qword),
+                       Decoder(1, "REPLAY_ASYNC_INPUT", decode_unimp),
+                       Decoder(2, "REPLAY_ASYNC_INPUT_SYNC", decode_unimp),
+                       Decoder(3, "REPLAY_ASYNC_CHAR_READ", decode_unimp),
+                       Decoder(4, "REPLAY_ASYNC_EVENT_BLOCK", decode_unimp),
+                       Decoder(5, "REPLAY_ASYNC_EVENT_NET", decode_unimp),
+]
+# See replay_read_events/replay_read_event
+def decode_async(eid, name, dumpfile):
+    """Decode an ASYNC event"""
+
+    print_event(eid, name)
+
+    async_event_kind = read_byte(dumpfile)
+    async_event_checkpoint = read_byte(dumpfile)
+
+    if async_event_checkpoint != replay_state.current_checkpoint:
+        print "  mismatch between checkpoint %d and async data %d" % (
+            replay_state.current_checkpoint, async_event_checkpoint)
+        return True
+
+    return call_decode(async_decode_table, async_event_kind, dumpfile)
+
+
+def decode_instruction(eid, name, dumpfile):
+    ins_diff = read_dword(dumpfile)
+    print_event(eid, name, "0x%x" % (ins_diff))
+    return True
+
+def decode_audio_out(eid, name, dumpfile):
+    audio_data = read_dword(dumpfile)
+    print_event(eid, name, "%d" % (audio_data))
+    return True
+
+def decode_checkpoint(eid, name, dumpfile):
+    """Decode a checkpoint.
+
+    Checkpoints contain a series of async events with their own specific data.
+    """
+    replay_state.set_checkpoint()
+    # save event count as we peek ahead
+    event_number = replay_state.event_count
+    next_event = read_event(dumpfile)
+
+    # if the next event is EVENT_ASYNC there are a bunch of
+    # async events to read, otherwise we are done
+    if next_event != 3:
+        print_event(eid, name, "no additional data", event_number)
+    else:
+        print_event(eid, name, "more data follows", event_number)
+
+    replay_state.reuse_event(next_event)
+    return True
+
+def decode_checkpoint_init(eid, name, dumpfile):
+    print_event(eid, name)
+    return True
+
+def decode_interrupt(eid, name, dumpfile):
+    print_event(eid, name)
+    return True
+
+def decode_clock(eid, name, dumpfile):
+    clock_data = read_qword(dumpfile)
+    print_event(eid, name, "0x%x" % (clock_data))
+    return True
+
+
+# pre-MTTCG merge
+v5_event_table = [Decoder(0, "EVENT_INSTRUCTION", decode_instruction),
+                  Decoder(1, "EVENT_INTERRUPT", decode_interrupt),
+                  Decoder(2, "EVENT_EXCEPTION", decode_unimp),
+                  Decoder(3, "EVENT_ASYNC", decode_async),
+                  Decoder(4, "EVENT_SHUTDOWN", decode_unimp),
+                  Decoder(5, "EVENT_CHAR_WRITE", decode_unimp),
+                  Decoder(6, "EVENT_CHAR_READ_ALL", decode_unimp),
+                  Decoder(7, "EVENT_CHAR_READ_ALL_ERROR", decode_unimp),
+                  Decoder(8, "EVENT_CLOCK_HOST", decode_clock),
+                  Decoder(9, "EVENT_CLOCK_VIRTUAL_RT", decode_clock),
+                  Decoder(10, "EVENT_CP_CLOCK_WARP_START", decode_checkpoint),
+                  Decoder(11, "EVENT_CP_CLOCK_WARP_ACCOUNT", decode_checkpoint),
+                  Decoder(12, "EVENT_CP_RESET_REQUESTED", decode_checkpoint),
+                  Decoder(13, "EVENT_CP_SUSPEND_REQUESTED", decode_checkpoint),
+                  Decoder(14, "EVENT_CP_CLOCK_VIRTUAL", decode_checkpoint),
+                  Decoder(15, "EVENT_CP_CLOCK_HOST", decode_checkpoint),
+                  Decoder(16, "EVENT_CP_CLOCK_VIRTUAL_RT", decode_checkpoint),
+                  Decoder(17, "EVENT_CP_INIT", decode_checkpoint_init),
+                  Decoder(18, "EVENT_CP_RESET", decode_checkpoint),
+]
+
+# post-MTTCG merge, AUDIO support added
+v6_event_table = [Decoder(0, "EVENT_INSTRUCTION", decode_instruction),
+                  Decoder(1, "EVENT_INTERRUPT", decode_interrupt),
+                  Decoder(2, "EVENT_EXCEPTION", decode_unimp),
+                  Decoder(3, "EVENT_ASYNC", decode_async),
+                  Decoder(4, "EVENT_SHUTDOWN", decode_unimp),
+                  Decoder(5, "EVENT_CHAR_WRITE", decode_unimp),
+                  Decoder(6, "EVENT_CHAR_READ_ALL", decode_unimp),
+                  Decoder(7, "EVENT_CHAR_READ_ALL_ERROR", decode_unimp),
+                  Decoder(8, "EVENT_AUDIO_OUT", decode_audio_out),
+                  Decoder(9, "EVENT_AUDIO_IN", decode_unimp),
+                  Decoder(10, "EVENT_CLOCK_HOST", decode_clock),
+                  Decoder(11, "EVENT_CLOCK_VIRTUAL_RT", decode_clock),
+                  Decoder(12, "EVENT_CP_CLOCK_WARP_START", decode_checkpoint),
+                  Decoder(13, "EVENT_CP_CLOCK_WARP_ACCOUNT", decode_checkpoint),
+                  Decoder(14, "EVENT_CP_RESET_REQUESTED", decode_checkpoint),
+                  Decoder(15, "EVENT_CP_SUSPEND_REQUESTED", decode_checkpoint),
+                  Decoder(16, "EVENT_CP_CLOCK_VIRTUAL", decode_checkpoint),
+                  Decoder(17, "EVENT_CP_CLOCK_HOST", decode_checkpoint),
+                  Decoder(18, "EVENT_CP_CLOCK_VIRTUAL_RT", decode_checkpoint),
+                  Decoder(19, "EVENT_CP_INIT", decode_checkpoint_init),
+                  Decoder(20, "EVENT_CP_RESET", decode_checkpoint),
+]
+
+def parse_arguments():
+    "Grab arguments for script"
+    parser = argparse.ArgumentParser()
+    parser.add_argument("-f", "--file", help='record/replay dump to read from',
+                        required=True)
+    return parser.parse_args()
+
+def decode_file(filename):
+    "Decode a record/replay dump"
+    dumpfile = open(filename, "rb")
+
+    # read and throw away the header
+    version = read_dword(dumpfile)
+    junk = read_qword(dumpfile)
+
+    print "HEADER: version 0x%x" % (version)
+    if version == 0xe02006:
+        event_decode_table = v6_event_table
+        replay_state.checkpoint_start = 12
+    else:
+        event_decode_table = v5_event_table
+        replay_state.checkpoint_start = 10
+
+    try:
+        decode_ok = True
+        while decode_ok:
+            event = read_event(dumpfile)
+            decode_ok = call_decode(event_decode_table, event, dumpfile)
+    finally:
+        dumpfile.close()
+
+if __name__ == "__main__":
+    args = parse_arguments()
+    decode_file(args.file)
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée, Eduardo Habkost

Anything that calls into HW emulation must be protected by the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/i386/misc_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/i386/misc_helper.c b/target/i386/misc_helper.c
index ca2ea09f54..628f64aad5 100644
--- a/target/i386/misc_helper.c
+++ b/target/i386/misc_helper.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
@@ -156,7 +157,9 @@ void helper_write_crN(CPUX86State *env, int reg, target_ulong t0)
         break;
     case 8:
         if (!(env->hflags2 & HF2_VINTR_MASK)) {
+            qemu_mutex_lock_iothread();
             cpu_set_apic_tpr(x86_env_get_cpu(env)->apic_state, t0);
+            qemu_mutex_unlock_iothread();
         }
         env->v_tpr = t0 & 0x0f;
         break;
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 5/9] cpus: remove icount handling from qemu_tcg_cpu_thread_fn
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée, Peter Crosthwaite

We should never be running in multi-threaded mode with icount enabled.
There is no point calling handle_icount_deadline here, so remove it
and assert !use_icount.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cpus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 68fdbc40b9..4e48b9c4ad 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1392,6 +1392,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
 
+    g_assert(!use_icount);
+
     rcu_register_thread();
 
     qemu_mutex_lock_iothread();
@@ -1434,8 +1436,6 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
             }
         }
 
-        handle_icount_deadline();
-
         atomic_mb_set(&cpu->exit_request, 0);
         qemu_tcg_wait_io_event(cpu);
     }
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 6/9] cpus: check cpu->running in cpu_get_icount_raw()
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée, Peter Crosthwaite

The lifetime of current_cpu is now the lifetime of the vCPU thread.
However get_icount_raw() can apply a fudge factor if called while code
is running, to take into account the currently executed instruction
count.

To ensure the fudge factor is only applied while code is actually
running we also check cpu->running.
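
As a rough sketch of the intent (Python, with invented numbers; the
logic mirrors cpu_get_icount_raw() at this point in the series):

  qemu_icount = 100000     # instructions credited in timers_state
  icount_decr_low = 2500   # low 16 bits of the decrementer, not yet run
  icount_extra = 4000      # overflow budget, not yet run

  # mid-run (cpu && cpu->running): discount what has not executed yet
  assert qemu_icount - (icount_decr_low + icount_extra) == 93500
  # outside cpu->running both counters should already be zero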

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cpus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 4e48b9c4ad..18b1746770 100644
--- a/cpus.c
+++ b/cpus.c
@@ -229,7 +229,7 @@ int64_t cpu_get_icount_raw(void)
     CPUState *cpu = current_cpu;
 
     icount = timers_state.qemu_icount;
-    if (cpu) {
+    if (cpu && cpu->running) {
         if (!cpu->can_do_io) {
             fprintf(stderr, "Bad icount read\n");
             exit(1);
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée, Peter Crosthwaite

As icount is only supported for single-threaded execution, due to the
requirement for determinism, let's remove it from the common
tcg_exec_cpu path.

Also remove the additional fiddling, which shouldn't be required as
the icount counters should all be rectified as you enter the loop.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cpus.c | 67 +++++++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/cpus.c b/cpus.c
index 18b1746770..87638a75d2 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1179,47 +1179,66 @@ static void handle_icount_deadline(void)
     }
 }
 
-static int tcg_cpu_exec(CPUState *cpu)
+static void prepare_icount_for_run(CPUState *cpu)
 {
-    int ret;
-#ifdef CONFIG_PROFILER
-    int64_t ti;
-#endif
-
-#ifdef CONFIG_PROFILER
-    ti = profile_getclock();
-#endif
     if (use_icount) {
         int64_t count;
         int decr;
-        timers_state.qemu_icount -= (cpu->icount_decr.u16.low
-                                    + cpu->icount_extra);
-        cpu->icount_decr.u16.low = 0;
-        cpu->icount_extra = 0;
+
+        /* These should always be cleared by process_icount_data after
+         * each vCPU execution. However u16.high can be raised
+         * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt
+         */
+        g_assert(cpu->icount_decr.u16.low == 0);
+        g_assert(cpu->icount_extra == 0);
+
+
         count = tcg_get_icount_limit();
+
         timers_state.qemu_icount += count;
         decr = (count > 0xffff) ? 0xffff : count;
         count -= decr;
         cpu->icount_decr.u16.low = decr;
         cpu->icount_extra = count;
     }
-    qemu_mutex_unlock_iothread();
-    cpu_exec_start(cpu);
-    ret = cpu_exec(cpu);
-    cpu_exec_end(cpu);
-    qemu_mutex_lock_iothread();
-#ifdef CONFIG_PROFILER
-    tcg_time += profile_getclock() - ti;
-#endif
+}
+
+static void process_icount_data(CPUState *cpu)
+{
     if (use_icount) {
         /* Fold pending instructions back into the
            instruction counter, and clear the interrupt flag.  */
         timers_state.qemu_icount -= (cpu->icount_decr.u16.low
                         + cpu->icount_extra);
+
+        /* We must be under BQL here as cpu_exit can tweak
+           icount_decr.u32 */
+        g_assert(qemu_mutex_iothread_locked());
         cpu->icount_decr.u32 = 0;
         cpu->icount_extra = 0;
         replay_account_executed_instructions();
     }
+}
+
+
+static int tcg_cpu_exec(CPUState *cpu)
+{
+    int ret;
+#ifdef CONFIG_PROFILER
+    int64_t ti;
+#endif
+
+#ifdef CONFIG_PROFILER
+    ti = profile_getclock();
+#endif
+    qemu_mutex_unlock_iothread();
+    cpu_exec_start(cpu);
+    ret = cpu_exec(cpu);
+    cpu_exec_end(cpu);
+    qemu_mutex_lock_iothread();
+#ifdef CONFIG_PROFILER
+    tcg_time += profile_getclock() - ti;
+#endif
     return ret;
 }
 
@@ -1306,7 +1325,13 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
 
             if (cpu_can_run(cpu)) {
                 int r;
+
+                prepare_icount_for_run(cpu);
+
                 r = tcg_cpu_exec(cpu);
+
+                process_icount_data(cpu);
+
                 if (r == EXCP_DEBUG) {
                     cpu_handle_guest_debug(cpu);
                     break;
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée, Peter Crosthwaite

Outside of the vCPU thread, icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for the passage of
time by looking at how many instructions have run so far. This is only
valid inside the vCPU thread while it is running.
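
As a hedged worked example (Python, with invented numbers; mirroring
the new cpu_get_icount_executed() below):

  icount_budget = 10000    # budgeted by prepare_icount_for_run
  icount_decr_low = 1500   # remaining in the 16-bit decrementer
  icount_extra = 0         # remaining overflow budget

  executed = icount_budget - (icount_decr_low + icount_extra)
  assert executed == 8500  # only credited to timers_state.qemu_icount
                           # once the run has completed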

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cpus.c            | 27 +++++++++++++++++++++------
 include/qom/cpu.h |  1 +
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 87638a75d2..3d18374b0e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp)
     }
 }
 
+/* The current number of executed instructions is based on what we
+ * originally budgeted minus the current state of the decrementing
+ * icount counters in extra/u16.low.
+ */
+static int64_t cpu_get_icount_executed(CPUState *cpu)
+{
+    return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra);
+}
+
 int64_t cpu_get_icount_raw(void)
 {
     int64_t icount;
@@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void)
             fprintf(stderr, "Bad icount read\n");
             exit(1);
         }
-        icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
+        /* Take into account what has run */
+        icount += cpu_get_icount_executed(cpu);
     }
     return icount;
 }
@@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu)
 
         count = tcg_get_icount_limit();
 
-        timers_state.qemu_icount += count;
+        /* To calculate what we have executed so far we need to know
+         * what we originally budgeted to run this cycle */
+        cpu->icount_budget = count;
+
         decr = (count > 0xffff) ? 0xffff : count;
         count -= decr;
         cpu->icount_decr.u16.low = decr;
@@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu)
 static void process_icount_data(CPUState *cpu)
 {
     if (use_icount) {
-        /* Fold pending instructions back into the
-           instruction counter, and clear the interrupt flag.  */
-        timers_state.qemu_icount -= (cpu->icount_decr.u16.low
-                        + cpu->icount_extra);
+        /* Account for executed instructions */
+        timers_state.qemu_icount += cpu_get_icount_executed(cpu);
 
         /* We must be under BQL here as cpu_exit can tweak
            icount_decr.u32 */
         g_assert(qemu_mutex_iothread_locked());
+
+        /* Reset the counters */
         cpu->icount_decr.u32 = 0;
         cpu->icount_extra = 0;
+        cpu->icount_budget = 0;
+
         replay_account_executed_instructions();
     }
 }
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index c3292efe1c..5d10359c8f 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -332,6 +332,7 @@ struct CPUState {
     /* updates protected by BQL */
     uint32_t interrupt_request;
     int singlestep_enabled;
+    int64_t icount_budget;
     int64_t icount_extra;
     sigjmp_buf jmp_env;
 
-- 
2.11.0


* [Qemu-devel] [RFC PATCH v1 9/9] replay: gracefully handle backward time events
From: Alex Bennée @ 2017-04-03 12:45 UTC
  To: dovgaluk, rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Alex Bennée

For the purposes of record/replay, time can only move forward.

It is possible for the non-vCPU thread to find that time has moved
from under its feet as it goes on. This is OK as long as we don't try
to rewind time.
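
As a minimal sketch of the policy (Python, with invented numbers;
mirroring the guard added in replay_account_executed_instructions):

  current_step = 1000        # instructions already credited
  now = 990                  # step as observed by the other thread
  diff = now - current_step  # -10: time appears to have gone backwards
  if diff >= 0:              # time only goes forward
      current_step += diff   # a negative diff leaves the counter alone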

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 replay/replay-internal.c | 7 +++++++
 replay/replay.c          | 9 +++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index bea7b4aa6b..9656db7102 100644
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -199,6 +199,13 @@ void replay_save_instructions(void)
             replay_put_event(EVENT_INSTRUCTION);
             replay_put_dword(diff);
             replay_state.current_step += diff;
+        } else if (diff < 0) {
+            /* Time has caught up with us, so as far as we are
+             * concerned use now, not the past. We still put an event
+             * in the stream to keep in sync.
+             */
+            replay_put_event(EVENT_INSTRUCTION);
+            replay_put_dword(0);
         }
         replay_mutex_unlock();
     }
diff --git a/replay/replay.c b/replay/replay.c
index 9e0724e756..f4376df0fd 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -84,8 +84,13 @@ void replay_account_executed_instructions(void)
         if (replay_state.instructions_count > 0) {
             int count = (int)(replay_get_current_step()
                               - replay_state.current_step);
-            replay_state.instructions_count -= count;
-            replay_state.current_step += count;
+
+            /* Time only goes forward */
+            if (count >= 0) {
+                replay_state.instructions_count -= count;
+                replay_state.current_step += count;
+            }
+
             if (replay_state.instructions_count == 0) {
                 assert(replay_state.data_kind == EVENT_INSTRUCTION);
                 replay_finish_event();
-- 
2.11.0


* Re: [Qemu-devel] [RFC PATCH v1 6/9] cpus: check cpu->running in cpu_get_icount_raw()
From: Philippe Mathieu-Daudé @ 2017-04-03 14:00 UTC
  To: Alex Bennée, dovgaluk, rth, pbonzini
  Cc: mttcg, peter.maydell, nikunj, Peter Crosthwaite, a.rigo,
	qemu-devel, cota, bobby.prani, fred.konrad

On 04/03/2017 09:45 AM, Alex Bennée wrote:
> The lifetime of current_cpu is now the lifetime of the vCPU thread.
> However get_icount_raw() can apply a fudge factor if called while code
> is running to take into account the current executed instruction
> count.
>
> To ensure this is always the case we also check cpu->running.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> [...]


* Re: [Qemu-devel] [RFC PATCH v1 2/9] scripts/qemu-gdb/timers.py: new helper to dump timer state
From: Philippe Mathieu-Daudé @ 2017-04-03 14:02 UTC
  To: Alex Bennée, dovgaluk, rth, pbonzini
  Cc: mttcg, peter.maydell, nikunj, a.rigo, qemu-devel, cota,
	bobby.prani, fred.konrad

On 04/03/2017 09:45 AM, Alex Bennée wrote:
> This introduces the qemu-gdb command "qemu timers" which will dump the
> state of the main timers in the system.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> [...]


* Re: [Qemu-devel] [RFC PATCH v1 0/9] MTTCG and record/replay fixes for rc3
From: Paolo Bonzini @ 2017-04-03 17:03 UTC
  To: Alex Bennée, dovgaluk, rth
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj



On 03/04/2017 14:45, Alex Bennée wrote:
>   cpus: check cpu->running in cpu_get_icount_raw()
> 
>   I'm not sure the race happens and once outside of cpu->running the
>   icount counters should be zero. However it seems a sensible
>   precaution.

Yeah, I think this is unnecessary with patch 7's new assertions.

> I think the cpus: patches should probably go into the next
> pull-request while we see if we can come up with a better final
> solution for fixing record/replay. However given how long this
> regression has run during the release candidate process I wanted to
> update everyone on the current status and get feedback ASAP.

I agree.  I'm not sure exactly how the final race happens, but if it
causes divergence it would be caught later by the record/replay
mechanism, I think.

Paolo


* Re: [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
From: Paolo Bonzini @ 2017-04-03 17:04 UTC
  To: Alex Bennée, dovgaluk, rth
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Peter Crosthwaite



On 03/04/2017 14:45, Alex Bennée wrote:
> Outside of the vCPU thread icount time will only be tracked against
> timers_state.qemu_icount. We no longer credit cycles until they have
> completed the run. Inside the vCPU thread we adjust for passage of
> time by looking at how many have run so far. This is only valid inside
> the vCPU thread while it is running.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

I suspect icount_budget and icount_extra could be merged into one, but
not for 2.9.  For now this looks nice!

Paolo

> [...]


* Re: [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
From: Pavel Dovgalyuk @ 2017-04-04  5:37 UTC
  To: 'Alex Bennée', rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'

> From: Alex Bennée [mailto:alex.bennee@linaro.org]
> Outside of the vCPU thread icount time will only be tracked against
> timers_state.qemu_icount. We no longer credit cycles until they have
> completed the run. Inside the vCPU thread we adjust for passage of
> time by looking at how many have run so far. This is only valid inside
> the vCPU thread while it is running.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cpus.c            | 27 +++++++++++++++++++++------
>  include/qom/cpu.h |  1 +
>  2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 87638a75d2..3d18374b0e 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp)
>      }
>  }
> 
> +/* The current number of executed instructions is based on what we
> + * originally budgeted minus the current state of the decrementing
> + * icount counters in extra/u16.low.
> + */
> +static int64_t cpu_get_icount_executed(CPUState *cpu)
> +{
> +    return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra);
> +}
> +
>  int64_t cpu_get_icount_raw(void)
>  {
>      int64_t icount;
> @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void)
>              fprintf(stderr, "Bad icount read\n");
>              exit(1);
>          }
> -        icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
> +        /* Take into account what has run */
> +        icount += cpu_get_icount_executed(cpu);
>      }
>      return icount;

As far as I understand, this one will return the same value in the
iothread until the vCPU thread finishes cpu_exec?
This value will not jump forward and backward, but it still will not
allow making execution deterministic.

Consider the following scenarios:

First:
vCPU            iothread
access HW       ----
...             access HW in timer

Second:
vCPU            iothread
...             access HW in timer
access HW       ----

These scenarios will generate the same order of events in the log.
The synchronization checkpoint in the iothread will try to write the
already executed instructions, but it does not have access to
current_cpu, and the icount value will point to the "past" - it will
have fewer instructions than were already executed.

That is why you see the "negative" instruction count event.

> [...]


Pavel Dovgalyuk


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
From: Pavel Dovgalyuk @ 2017-04-04  5:39 UTC
  To: 'Alex Bennée', rth, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'

I guess you are trying to fix the symptoms of the case where the
iothread is trying to access the instruction count.

Maybe the solution is providing access to current_cpu for the
iothread, coupled with your patch 8?

Pavel Dovgalyuk


> [...]

* Re: [Qemu-devel] [RFC PATCH v1 0/9] MTTCG and record/replay fixes for rc3
  2017-04-03 17:03 ` [Qemu-devel] [RFC PATCH v1 0/9] MTTCG and record/replay fixes for rc3 Paolo Bonzini
@ 2017-04-04  8:50   ` Alex Bennée
  0 siblings, 0 replies; 32+ messages in thread
From: Alex Bennée @ 2017-04-04  8:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: dovgaluk, rth, peter.maydell, qemu-devel, mttcg, fred.konrad,
	a.rigo, cota, bobby.prani, nikunj


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 03/04/2017 14:45, Alex Bennée wrote:
>>   cpus: check cpu->running in cpu_get_icount_raw()
>>
>>   I'm not sure the race happens and once outside of cpu->running the
>>   icount counters should be zero. However it seems a sensible
>>   precaution.
>
> Yeah, I think this is unnecessary with patch 7's new assertions.

I can drop the patch.

>> I think the cpus: patches should probably go into the next
>> pull-request while we see if we can come up with a better final
>> solution for fixing record/replay. However given how long this
>> regression has run during the release candidate process I wanted to
>> update everyone on the current status and get feedback ASAP.
>
> I agree.  I'm not sure exactly how the final race happens, but if it
> causes divergence it would be caught later by the record/replay
> mechanism, I think.

It's odd because everything should be sequenced by the BQL. The
main-loop holds the BQL while writing out checkpoints and everything
that can trigger output to the replay stream should be under BQL as
well:

  - VIRTUAL timers in the outer loop
  - MMIO triggered events (block, char, audio)
  - Interrupt processing

In fact I wonder if replay_mutex could just be dropped and the BQL used
to protect all of this stuff.

I'll have to experiment with some asserts to see if this is ever not
the case.
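
For instance (untested sketch; replay_assert_bql is a made-up name,
and I'm assuming the replay_put_*/replay_get_* helpers in
replay-internal.c are the right choke point to instrument):

    /* sketch: if the BQL really does sequence every writer of the
     * replay stream, this should never fire */
    static void replay_assert_bql(void)
    {
        g_assert(qemu_mutex_iothread_locked());
    }

calling it at the top of each replay_put_*/replay_get_* helper.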

--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04  5:39   ` Pavel Dovgalyuk
@ 2017-04-04  8:56     ` Alex Bennée
  2017-04-04 10:46       ` Alex Bennée
  0 siblings, 1 reply; 32+ messages in thread
From: Alex Bennée @ 2017-04-04  8:56 UTC (permalink / raw)
  To: Pavel Dovgalyuk
  Cc: rth, pbonzini, peter.maydell, qemu-devel, mttcg, fred.konrad,
	a.rigo, cota, bobby.prani, nikunj, 'Peter Crosthwaite'


Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

> I guess you are trying to fix the symptoms of the case
> when the iothread is trying to access the instruction count.

In theory the main-loop should be sequenced before or after vCPU events
because of the BQL. I'm not sure why this is not currently the case.

> Maybe the solution is providing access to current_cpu for the iothread
> coupled with your patch 8?

Providing cross-thread access to CPU structures brings its own
challenges.

But it does occur to me we should probably ensure
timers_state.qemu_icount has appropriate barriers. This should be ensured
by the BQL, but if it is ever accessed by two threads without a BQL
transition in between then it is potentially racy.
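
Something like this would make the cross-thread accesses explicit
(sketch only; the helper names are made up, and I'm assuming the
atomic_*__nocheck macros from qemu/atomic.h are acceptable for an
int64_t here):

    /* sketch: wrap all access to the shared instruction count so we
     * don't rely purely on BQL-induced ordering */
    static int64_t qemu_icount_read(void)
    {
        return atomic_read__nocheck(&timers_state.qemu_icount);
    }

    static void qemu_icount_add(int64_t count)
    {
        atomic_set__nocheck(&timers_state.qemu_icount,
                            timers_state.qemu_icount + count);
    }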


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
  2017-04-04  5:37   ` Pavel Dovgalyuk
@ 2017-04-04 10:13     ` Paolo Bonzini
  2017-04-07 11:27       ` Pavel Dovgalyuk
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2017-04-04 10:13 UTC (permalink / raw)
  To: Pavel Dovgalyuk, 'Alex Bennée', rth
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'



On 04/04/2017 07:37, Pavel Dovgalyuk wrote:
>> -        icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
>> +        /* Take into account what has run */
>> +        icount += cpu_get_icount_executed(cpu);
>>      }
>>      return icount;
> As far as I understand, this one will return the same value in the iothread
> until the vCPU thread finishes cpu_exec?
> This value will not jump forward and backward, but still will not allow
> making execution deterministic.
> 
> Consider the following scenarios:
> 
> First:
> vCPU            iothread
> access HW       ----
> ...             access HW in timer
> 
> Second:
> vCPU            iothread
> ...             access HW in timer
> access HW       ----
> 
> These scenarios will generate the same order of events in the log.
> The synchronization checkpoint in the iothread will try to write already
> executed instructions, but it does not have access to current_cpu
> and the icount value will point to the "past" - it will have fewer
> instructions than already executed.

The actual access should be covered by a lock, but I think you're right
that the two threads can be nondeterministically off by one instruction,
even if we make gen_io_start update timers_state.qemu_icount atomically.

Paolo


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04  8:56     ` Alex Bennée
@ 2017-04-04 10:46       ` Alex Bennée
  2017-04-04 10:53         ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Alex Bennée @ 2017-04-04 10:46 UTC (permalink / raw)
  To: Pavel Dovgalyuk
  Cc: rth, pbonzini, peter.maydell, qemu-devel, mttcg, fred.konrad,
	a.rigo, cota, bobby.prani, nikunj, 'Peter Crosthwaite'


Alex Bennée <alex.bennee@linaro.org> writes:

> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:
>
>> I guess you are trying to fix the symptoms of the case
>> when the iothread is trying to access the instruction count.
>
> In theory the main-loop should be sequenced before or after vCPU events
> because of the BQL. I'm not sure why this is not currently the case.

It seems cpu_handle_exception doesn't take the BQL until
replay_exception() has done its thing. This is fixable but the function
is a mess so I'm trying to neaten that up first.
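
The rough shape I'm aiming for is to take the BQL before we consult
the replay log (a sketch of the direction, not the final patch):

    /* in cpu_handle_exception(), roughly: hold the BQL across
     * replay_exception() so the record stream can't advance while
     * the main-loop is writing a checkpoint */
    qemu_mutex_lock_iothread();
    if (replay_exception()) {
        CPUClass *cc = CPU_GET_CLASS(cpu);
        cc->do_interrupt(cpu);
        cpu->exception_index = -1;
    } else if (!replay_has_interrupt()) {
        cpu->exception_index = -1;
    }
    qemu_mutex_unlock_iothread();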


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04 10:46       ` Alex Bennée
@ 2017-04-04 10:53         ` Paolo Bonzini
  2017-04-04 12:31           ` Alex Bennée
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2017-04-04 10:53 UTC (permalink / raw)
  To: Alex Bennée, Pavel Dovgalyuk
  Cc: rth, peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'



On 04/04/2017 12:46, Alex Bennée wrote:
>> In theory the main-loop should be sequenced before or after vCPU events
>> because of the BQL. I'm not sure why this is not currently the case.
> 
> It seems cpu_handle_exception doesn't take the BQL until
> replay_exception() has done its thing. This is fixable but the function
> is a mess so I'm trying to neaten that up first.

Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
BQL at all.

Paolo


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04 10:53         ` Paolo Bonzini
@ 2017-04-04 12:31           ` Alex Bennée
  2017-04-04 12:37             ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Alex Bennée @ 2017-04-04 12:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Pavel Dovgalyuk, rth, peter.maydell, qemu-devel, mttcg,
	fred.konrad, a.rigo, cota, bobby.prani, nikunj,
	'Peter Crosthwaite'


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 04/04/2017 12:46, Alex Bennée wrote:
>>> In theory the main-loop should be sequenced before or after vCPU events
>>> because of the BQL. I'm not sure why this is not currently the case.
>>
>> It seems cpu_handle_exception doesn't take the BQL until
>> replay_exception() has done its thing. This is fixable but the function
>> is a mess so I'm trying to neaten that up first.
>
> Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
> BQL at all.

Well for record/replay they might. Otherwise we end up moving the record
stream on even though a checkpoint might be being written by the
main-loop.

As far as the cc->do_interrupt() stuff is concerned it will be guest
dependent because you could end up in device emulation code down this
path which must be protected by the BQL - the arm_gic code being a good
example.
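
i.e. the same pattern patch 4 applies on i386 (sketch; the helper and
device hook here are made-up names, not code from the tree):

    /* any TCG helper that can reach device emulation must hold the
     * BQL around the access */
    void helper_set_foo_reg(CPUARMState *env, uint64_t value)
    {
        qemu_mutex_lock_iothread();
        foo_device_update_irq(env, value);  /* hypothetical device call */
        qemu_mutex_unlock_iothread();
    }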

>
> Paolo


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04 12:31           ` Alex Bennée
@ 2017-04-04 12:37             ` Paolo Bonzini
  2017-04-04 13:29               ` Alex Bennée
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2017-04-04 12:37 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Pavel Dovgalyuk, rth, peter.maydell, qemu-devel, mttcg,
	fred.konrad, a.rigo, cota, bobby.prani, nikunj,
	'Peter Crosthwaite'



On 04/04/2017 14:31, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 04/04/2017 12:46, Alex Bennée wrote:
>>>> In theory the main-loop should be sequenced before or after vCPU events
>>>> because of the BQL. I'm not sure why this is not currently the case.
>>>
>>> It seems cpu_handle_exception doesn't take the BQL until
>>> replay_exception() has done its thing. This is fixable but the function
>>> is a mess so I'm trying to neaten that up first.
>>
>> Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
>> BQL at all.
> 
> Well for record/replay they might. Otherwise we end up moving the record
> stream on even though a checkpoint might be being written by the
> main-loop.
> 
> As far as the cc->do_interrupt() stuff is concerned it will be guest
> dependent because you could end up in device emulation code down this
> path which must be protected by the BQL - the arm_gic code being a good
> example.

I think recording an event could be split in two parts:

- recording the (icount, event) tuple and getting back a unique event id

- waiting for all events with lower event id to be complete before
starting to process this one

This doesn't require the BQL, you can use a condition variable on
replay_lock (but you do need to unlock/lock the BQL around it if
currently taken).
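
Very roughly (sketch; none of these names exist in the tree, and
init/teardown of the primitives is omitted):

    static QemuMutex replay_event_lock;
    static QemuCond replay_event_cond;
    static uint64_t next_event_id, completed_event_id;

    /* record the (icount, event) tuple and take a ticket */
    static uint64_t replay_begin_event(void)
    {
        uint64_t id;
        qemu_mutex_lock(&replay_event_lock);
        id = next_event_id++;
        qemu_mutex_unlock(&replay_event_lock);
        return id;
    }

    /* wait until all events with a lower id have been processed */
    static void replay_wait_for_turn(uint64_t id)
    {
        bool bql = qemu_mutex_iothread_locked();
        if (bql) {
            qemu_mutex_unlock_iothread(); /* don't deadlock BQL holders */
        }
        qemu_mutex_lock(&replay_event_lock);
        while (completed_event_id != id) {
            qemu_cond_wait(&replay_event_cond, &replay_event_lock);
        }
        qemu_mutex_unlock(&replay_event_lock);
        if (bql) {
            qemu_mutex_lock_iothread();
        }
    }

    /* called once an event has been fully processed */
    static void replay_complete_event(void)
    {
        qemu_mutex_lock(&replay_event_lock);
        completed_event_id++;
        qemu_cond_broadcast(&replay_event_cond);
        qemu_mutex_unlock(&replay_event_lock);
    }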

The complicated part is ensuring that there are no deadlocks where the
I/O thread needs the VCPU thread to proceed, but the VCPU thread is
waiting on the I/O thread's event processing.

Paolo


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04 12:37             ` Paolo Bonzini
@ 2017-04-04 13:29               ` Alex Bennée
  2017-04-05 10:44                 ` Pavel Dovgalyuk
  0 siblings, 1 reply; 32+ messages in thread
From: Alex Bennée @ 2017-04-04 13:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Pavel Dovgalyuk, rth, peter.maydell, qemu-devel, mttcg,
	fred.konrad, a.rigo, cota, bobby.prani, nikunj,
	'Peter Crosthwaite'


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 04/04/2017 14:31, Alex Bennée wrote:
>>
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> On 04/04/2017 12:46, Alex Bennée wrote:
>>>>> In theory the main-loop should be sequenced before or after vCPU events
>>>>> because of the BQL. I'm not sure why this is not currently the case.
>>>>
>>>> It seems cpu_handle_exception doesn't take the BQL until
>>>> replay_exception() has done its thing. This is fixable but the function
>>>> is a mess so I'm trying to neaten that up first.
>>>
>>> Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
>>> BQL at all.
>>
>> Well for record/replay they might. Otherwise we end up moving the record
>> stream on even though a checkpoint might be being written by the
>> main-loop.
>>
>> As far as the cc->do_interrupt() stuff is concerned it will be guest
>> dependent because you could end up in device emulation code down this
>> path which must be protected by the BQL - the arm_gic code being a good
>> example.
>
> I think recording an event could be split in two parts:
>
> - recording the (icount, event) tuple and getting back a unique event id
>
> - waiting for all events with lower event id to be complete before
> starting to process this one
>
> This doesn't require the BQL, you can use a condition variable on
> replay_lock (but you do need to unlock/lock the BQL around it if
> currently taken).

Would you then leave writing to the record stream to the main-loop
thread? I guess it would marshal all events that occurred before the
checkpoint first and then finish draining the queue after recording its
checkpoint?

Wrapping the exception stuff in the BQL does improve the repeatability
but of course it breaks if I take away the graceful handling of time
differences because there is a race between recording the exception
event (with current_step+insns so far) and getting back to the main loop
where insns is finally credited to timers_state.qemu_icount.

I guess we could improve the situation by updating
timers_state.qemu_icount (under BQL) as we record events. I don't know
how clunky that would get.

> The complicated part is ensuring that there are no deadlocks where the
> I/O thread needs the VCPU thread to proceed, but the VCPU thread is
> waiting on the I/O thread's event processing.

This sort of update sounds more like 2.10 material though.

--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
  2017-04-03 12:45 ` [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run Alex Bennée
  2017-04-03 17:04   ` Paolo Bonzini
  2017-04-04  5:37   ` Pavel Dovgalyuk
@ 2017-04-04 14:39   ` Paolo Bonzini
  2 siblings, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2017-04-04 14:39 UTC (permalink / raw)
  To: Alex Bennée, dovgaluk, rth
  Cc: mttcg, peter.maydell, nikunj, Peter Crosthwaite, a.rigo,
	qemu-devel, cota, bobby.prani, fred.konrad



On 03/04/2017 14:45, Alex Bennée wrote:
> Outside of the vCPU thread icount time will only be tracked against
> timers_state.qemu_icount. We no longer credit cycles until they have
> completed the run. Inside the vCPU thread we adjust for passage of
> time by looking at how many have run so far. This is only valid inside
> the vCPU thread while it is running.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cpus.c            | 27 +++++++++++++++++++++------
>  include/qom/cpu.h |  1 +
>  2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 87638a75d2..3d18374b0e 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp)
>      }
>  }
>  
> +/* The current number of executed instructions is based on what we
> + * originally budgeted minus the current state of the decrementing
> + * icount counters in extra/u16.low.
> + */
> +static int64_t cpu_get_icount_executed(CPUState *cpu)
> +{
> +    return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra);
> +}
> +
>  int64_t cpu_get_icount_raw(void)
>  {
>      int64_t icount;
> @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void)
>              fprintf(stderr, "Bad icount read\n");
>              exit(1);
>          }
> -        icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
> +        /* Take into account what has run */
> +        icount += cpu_get_icount_executed(cpu);
>      }
>      return icount;
>  }
> @@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu)
>  
>          count = tcg_get_icount_limit();
>  
> -        timers_state.qemu_icount += count;
> +        /* To calculate what we have executed so far we need to know
> +         * what we originally budgeted to run this cycle */
> +        cpu->icount_budget = count;
> +
>          decr = (count > 0xffff) ? 0xffff : count;
>          count -= decr;
>          cpu->icount_decr.u16.low = decr;
> @@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu)
>  static void process_icount_data(CPUState *cpu)
>  {
>      if (use_icount) {
> -        /* Fold pending instructions back into the
> -           instruction counter, and clear the interrupt flag.  */
> -        timers_state.qemu_icount -= (cpu->icount_decr.u16.low
> -                        + cpu->icount_extra);
> +        /* Account for executed instructions */
> +        timers_state.qemu_icount += cpu_get_icount_executed(cpu);
>  
>          /* We must be under BQL here as cpu_exit can tweak
>             icount_decr.u32 */
>          g_assert(qemu_mutex_iothread_locked());
> +
> +        /* Reset the counters */
>          cpu->icount_decr.u32 = 0;

If you only reset u16.low, then the assertion can be removed.  After all
prepare_icount_for_run is also checking for u16.low to be zero.
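
i.e. (sketch of the suggested change):

             /* Reset the counters */
    -        cpu->icount_decr.u32 = 0;
    +        cpu->icount_decr.u16.low = 0;
             cpu->icount_extra = 0;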

Thanks,

Paolo

>          cpu->icount_extra = 0;
> +        cpu->icount_budget = 0;
> +
>          replay_account_executed_instructions();
>      }
>  }
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index c3292efe1c..5d10359c8f 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -332,6 +332,7 @@ struct CPUState {
>      /* updates protected by BQL */
>      uint32_t interrupt_request;
>      int singlestep_enabled;
> +    int64_t icount_budget;
>      int64_t icount_extra;
>      sigjmp_buf jmp_env;
>  
> 


* Re: [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator
  2017-04-03 12:45 ` [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator Alex Bennée
@ 2017-04-04 16:53   ` Richard Henderson
  2017-04-04 17:36     ` Eduardo Habkost
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Henderson @ 2017-04-04 16:53 UTC (permalink / raw)
  To: Alex Bennée, dovgaluk, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Eduardo Habkost

On 04/03/2017 05:45 AM, Alex Bennée wrote:
> Anything that calls into HW emulation must be protected by the BQL.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  target/i386/misc_helper.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [RFC PATCH v1 5/9] cpus: remove icount handling from qemu_tcg_cpu_thread_fn
  2017-04-03 12:45 ` [Qemu-devel] [RFC PATCH v1 5/9] cpus: remove icount handling from qemu_tcg_cpu_thread_fn Alex Bennée
@ 2017-04-04 16:53   ` Richard Henderson
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2017-04-04 16:53 UTC (permalink / raw)
  To: Alex Bennée, dovgaluk, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Peter Crosthwaite

On 04/03/2017 05:45 AM, Alex Bennée wrote:
> We should never be running in multi-threaded mode with icount enabled.
> There is no point calling handle_icount_deadline here so remove it and
> assert !use_icount.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cpus.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [RFC PATCH v1 6/9] cpus: check cpu->running in cpu_get_icount_raw()
  2017-04-03 12:45 ` [Qemu-devel] [RFC PATCH v1 6/9] cpus: check cpu->running in cpu_get_icount_raw() Alex Bennée
  2017-04-03 14:00   ` Philippe Mathieu-Daudé
@ 2017-04-04 16:54   ` Richard Henderson
  1 sibling, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2017-04-04 16:54 UTC (permalink / raw)
  To: Alex Bennée, dovgaluk, pbonzini
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, Peter Crosthwaite

On 04/03/2017 05:45 AM, Alex Bennée wrote:
> The lifetime of current_cpu is now the lifetime of the vCPU thread.
> However get_icount_raw() can apply a fudge factor if called while code
> is running to take into account the current executed instruction
> count.
>
> To ensure this is always the case we also check cpu->running.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  cpus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator
  2017-04-04 16:53   ` Richard Henderson
@ 2017-04-04 17:36     ` Eduardo Habkost
  0 siblings, 0 replies; 32+ messages in thread
From: Eduardo Habkost @ 2017-04-04 17:36 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, dovgaluk, pbonzini, peter.maydell, qemu-devel,
	mttcg, fred.konrad, a.rigo, cota, bobby.prani, nikunj

On Tue, Apr 04, 2017 at 09:53:15AM -0700, Richard Henderson wrote:
> On 04/03/2017 05:45 AM, Alex Bennée wrote:
> > Anything that calls into HW emulation must be protected by the BQL.
> > 
> > Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > ---
> >  target/i386/misc_helper.c | 3 +++
> >  1 file changed, 3 insertions(+)
> 
> Reviewed-by: Richard Henderson <rth@twiddle.net>

In case somebody is going to queue the whole series in one take:

Acked-by: Eduardo Habkost <ehabkost@redhat.com>

-- 
Eduardo


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-04 13:29               ` Alex Bennée
@ 2017-04-05 10:44                 ` Pavel Dovgalyuk
  2017-04-05 11:18                   ` Alex Bennée
  0 siblings, 1 reply; 32+ messages in thread
From: Pavel Dovgalyuk @ 2017-04-05 10:44 UTC (permalink / raw)
  To: 'Alex Bennée', 'Paolo Bonzini'
  Cc: rth, peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'

> From: Alex Bennée [mailto:alex.bennee@linaro.org]
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 04/04/2017 14:31, Alex Bennée wrote:
> >>
> >> Paolo Bonzini <pbonzini@redhat.com> writes:
> >>
> >>> On 04/04/2017 12:46, Alex Bennée wrote:
> >>>>> In theory the main-loop should be sequenced before or after vCPU events
> >>>>> because of the BQL. I'm not sure why this is not currently the case.
> >>>>
> >>>> It seems cpu_handle_exception doesn't take the BQL until
> >>>> replay_exception() has done its thing. This is fixable but the function
> >>>> is a mess so I'm trying to neaten that up first.
> >>>
> >>> Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
> >>> BQL at all.
> >>
> >> Well for record/replay they might. Otherwise we end up moving the record
> >> stream on even though a checkpoint might be being written by the
> >> main-loop.
> >>
> >> As far as the cc->do_interrupt() stuff is concerned it will be guest
> >> dependent because you could end up in device emulation code down this
> >> path which must be protected by the BQL - the arm_gic code being a good
> >> example.
> >
> > I think recording an event could be split in two parts:
> >
> > - recording the (icount, event) tuple and getting back a unique event id
> >
> > - waiting for all events with lower event id to be complete before
> > starting to process this one
> >
> > This doesn't require the BQL, you can use a condition variable on
> > replay_lock (but you do need to unlock/lock the BQL around it if
> > currently taken).
> 
> Would you then leave writing to the record stream to the main-loop
> thread? I guess it would marshal all events that occurred before the
> checkpoint first and then finish draining the queue after recording its
> checkpoint?
> 
> Wrapping the exception stuff in the BQL does improve the repeatability
> but of course it breaks if I take away the graceful handling of time
> differences because there is a race between recording the exception
> event (with current_step+insns so far) and getting back to the main loop
> where insns is finally credited to timers_state.qemu_icount.
> 
> I guess we could improve the situation by updating
> timers_state.qemu_icount (under BQL) as we record events. I don't know
> how clunky that would get.

Do I/O instructions take some lock to prevent races in virtual hardware?

The vCPU thread updates icount at the beginning of TB execution.
This means that checkpoints in the replay log will appear only at the boundaries
of TBs. However, the same log may be generated by different scenarios.
Consider the following cases:

1. Sequence: vCPU-block-begin vCPU-update-icount iothread-io vCPU-io vCPU-block-end
2. Sequence: vCPU-block-begin vCPU-update-icount vCPU-io iothread-io vCPU-block-end

These sequences will generate the same order of replay events, but different
states of virtual hardware.

Therefore we need some lock held while the vCPU executes a translation block
(or the whole sequence of blocks, as in the old days).

> > The complicated part is ensuring that there are no deadlocks where the
> > I/O thread needs the VCPU thread to proceed, but the VCPU thread is
> > waiting on the I/O thread's event processing.
> 
> This sort of update sounds more like 2.10 material though.

Pavel Dovgalyuk


* Re: [Qemu-devel] [RFC PATCH v1 7/9] cpus: move icount preparation out of tcg_exec_cpu
  2017-04-05 10:44                 ` Pavel Dovgalyuk
@ 2017-04-05 11:18                   ` Alex Bennée
  0 siblings, 0 replies; 32+ messages in thread
From: Alex Bennée @ 2017-04-05 11:18 UTC (permalink / raw)
  To: Pavel Dovgalyuk
  Cc: 'Paolo Bonzini',
	rth, peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'


Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

>> From: Alex Bennée [mailto:alex.bennee@linaro.org]
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>> > On 04/04/2017 14:31, Alex Bennée wrote:
>> >>
>> >> Paolo Bonzini <pbonzini@redhat.com> writes:
>> >>
>> >>> On 04/04/2017 12:46, Alex Bennée wrote:
>> >>>>> In theory the main-loop should be sequenced before or after vCPU events
>> >>>>> because of the BQL. I'm not sure why this is not currently the case.
>> >>>>
>> >>>> It seems cpu_handle_exception doesn't take the BQL until
>> >>>> replay_exception() has done its thing. This is fixable but the function
>> >>>> is a mess so I'm trying to neaten that up first.
>> >>>
>> >>> Long term neither cpu_handle_exception nor cpu_handle_interrupt need the
>> >>> BQL at all.
>> >>
>> >> Well for record/replay they might. Otherwise we end up moving the record
>> >> stream on even though a checkpoint might be being written by the
>> >> main-loop.
>> >>
>> >> As far as the cc->do_interrupt() stuff is concerned it will be guest
>> >> dependent because you could end up in device emulation code down this
>> >> path which must be protected by the BQL - the arm_gic code being a good
>> >> example.
>> >
>> > I think recording an event could be split in two parts:
>> >
>> > - recording the (icount, event) tuple and getting back a unique event id
>> >
>> > - waiting for all events with lower event id to be complete before
>> > starting to process this one
>> >
>> > This doesn't require the BQL, you can use a condition variable on
>> > replay_lock (but you do need to unlock/lock the BQL around it if
>> > currently taken).
>>
>> Would you then leave the recording to the stream to the main-loop
>> thread? I guess it would marshal all events that occurred before the
>> checkpoint first and then finish draining the queue after recording its
>> checkpoint?
>>
>> Wrapping the exception stuff in the BQL does improve the repeat-ability
>> but of course it breaks if I take away the graceful handling of time
>> differences because there is a race between recording the exception
>> event (with current_step+insns so far) and getting back to the main loop
>> where insns is finally credited to timers_state.qemu_icount.
>>
>> I guess we could improve the situation by updating
>> timers_state.qemu_icount (under BQL) as we record events. I don't know
>> how clunky that would get.
>
> Do I/O instructions take some lock to prevent races in virtual
> hardware?

Yes, the BQL is taken for MMIO operations unless the hardware explicitly
does its own locking. Other operations that trigger hardware emulation
(for example changing ARM_CP_IO registers) should also take the BQL.

> The vCPU thread updates icount at the beginning of TB execution.
> This means that checkpoints in the replay log will appear only at the boundaries
> of TBs.

Not quite. The vCPU thread takes into account "in flight" instructions
when calculating the icount. In the next series I'm about to post I've
ensured this reflects an update to the main-loop icount when we read the
value.
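
Roughly like this (sketch only - the posted series may differ in
detail):

    /* fold what has executed so far into the main-loop counter and
     * shrink the budget to match, so a read from either thread sees
     * a monotonically increasing value */
    static void cpu_update_icount(CPUState *cpu)
    {
        int64_t executed = cpu_get_icount_executed(cpu);

        cpu->icount_budget -= executed;
        atomic_set__nocheck(&timers_state.qemu_icount,
                            timers_state.qemu_icount + executed);
    }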

> However, the same log may be generated by different scenarios.
> Consider the following cases:
>
> 1. Sequence: vCPU-block-begin vCPU-update-icount iothread-io vCPU-io vCPU-block-end
> 2. Sequence: vCPU-block-begin vCPU-update-icount vCPU-io iothread-io vCPU-block-end
>
> These sequences will generate the same order of replay events, but different
> states of virtual hardware.
>
> Therefore we need some lock held while the vCPU executes a translation block
> (or the whole sequence of blocks, as in the old days).

Well this can't be the BQL anymore. Using it for serialisation of
multiple threads conflicts with the general aim of reducing BQL
contention across the code-base.

Perhaps we should just push replay_lock up the call-stack so that
prepare/process_icount_data take and release the lock, and we do the same
in the checkpoint code?
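
Something like (sketch; the replay_mutex_lock/unlock wrappers are
hypothetical here):

    if (cpu_can_run(cpu)) {
        int r;

        /* hold the replay lock across the whole execution window so
         * main-loop checkpoints can't interleave with vCPU events */
        replay_mutex_lock();
        prepare_icount_for_run(cpu);
        r = tcg_cpu_exec(cpu);
        process_icount_data(cpu);
        replay_mutex_unlock();

        if (r == EXCP_DEBUG) {
            cpu_handle_guest_debug(cpu);
            break;
        }
    }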

>
>> > The complicated part is ensuring that there are no deadlocks where the
>> > I/O thread needs the VCPU thread to proceed, but the VCPU thread is
>> > waiting on the I/O thread's event processing.
>>
>> This sort of update sounds more like 2.10 material though.
>
> Pavel Dovgalyuk


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH v1 8/9] cpus: don't credit executed instructions before they have run
  2017-04-04 10:13     ` Paolo Bonzini
@ 2017-04-07 11:27       ` Pavel Dovgalyuk
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Dovgalyuk @ 2017-04-07 11:27 UTC (permalink / raw)
  To: 'Paolo Bonzini', 'Alex Bennée', rth
  Cc: peter.maydell, qemu-devel, mttcg, fred.konrad, a.rigo, cota,
	bobby.prani, nikunj, 'Peter Crosthwaite'

> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com]
> On 04/04/2017 07:37, Pavel Dovgalyuk wrote:
> >> -        icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
> >> +        /* Take into account what has run */
> >> +        icount += cpu_get_icount_executed(cpu);
> >>      }
> >>      return icount;
> > As far as I understand, this one will return the same value in the iothread
> > until the vCPU thread finishes cpu_exec?
> > This value will not jump forward and backward, but still will not allow
> > making execution deterministic.
> >
> > Consider the following scenarios:
> >
> > First:
> > vCPU            iothread
> > access HW       ----
> > ...             access HW in timer
> >
> > Second:
> > vCPU            iothread
> > ...             access HW in timer
> > access HW       ----
> >
> > These scenarios will generate the same order of events in the log.
> > The synchronization checkpoint in the iothread will try to write already
> > executed instructions, but it does not have access to current_cpu
> > and the icount value will point to the "past" - it will have fewer
> > instructions than already executed.
> 
> The actual access should be covered by a lock, but I think you're right
> that the two threads can be nondeterministically off by one instruction,
> even if we make gen_io_start update timers_state.qemu_icount atomically.

Yes. The actual problem is that icount is updated once for the whole TB.
But TB execution is not atomic, and the machine state is different at
different points within the TB.

Therefore the iothread may observe different behavior depending on the
moment when it is activated.

Pavel Dovgalyuk

