All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API
@ 2021-06-08  4:05 Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 1/4] plugins/api: expose symbol lookup to plugins Mahmoud Mandour
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Mahmoud Mandour @ 2021-06-08  4:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mahmoud Mandour, cota

This RFC series introduces a new cache TCG plugin that models separate
L1 data cache and L1 instruction cache and uses one shared cache for
all the cores.

It also includes a commit by Alex that adds an API call that resolves
the symbol of an insn.

The original RFC patch posted by Alex Bennée included incorporating
symbol resolution into the cache plugin that caused conflicts, so I
dropped the plugin additions from that and introduced them afterwards.

v2 -> v3:
    Precomputed the value of block size shift once and stored in the
    cache.

    Removed tag shifting since it's okay to leave the tag in the
    high-order bits and mask out set index and block offset.

    Used one hashtable to store InsnData structs and made the structs
    have separate counters for data misses and instruction misses.

    Used a boolean to indicate whether an access resulted in a hit or a
    miss.

    Inserted an InsnData struct into the hashtable on translation-time
    and made sure we do so once so that we don't rewrite the struct if
    an instruction is translated multiple times.

    Made the output format for most-missing instructions more
    machine-readable.

    Removed trace-generation.

    Freed tokenized strings after argument parsing.

    Returned null from cache_init() if argument cache config is bad.

    Used one enum to indicate the chosen eviction policy.

    Added function pointers for cache update and metadata initialization
    and destroying. Those pointers are assigned to policy-specific
    functions.

    Remade LRU. Assigned a generation number that is incremented on each
    set access to the currently-accessed block's priority. On miss, 
    evicted the block with the least generation number.

    Allowed to give multiple "evict" arguments and sticked to the last
    one.

Alex Bennée (1):
  plugins/api: expose symbol lookup to plugins

Mahmoud Mandour (3):
  plugins: Added a new cache modelling plugin.
  plugins/cache: Enabled cache parameterization
  plugins/cache: Added FIFO and LRU eviction policies.

 contrib/plugins/Makefile   |   1 +
 contrib/plugins/cache.c    | 642 +++++++++++++++++++++++++++++++++++++
 include/qemu/qemu-plugin.h |   9 +
 plugins/api.c              |   6 +
 4 files changed, 658 insertions(+)
 create mode 100644 contrib/plugins/cache.c

-- 
2.25.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 1/4] plugins/api: expose symbol lookup to plugins
  2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
@ 2021-06-08  4:05 ` Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 2/4] plugins: Added a new cache modelling plugin Mahmoud Mandour
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Mahmoud Mandour @ 2021-06-08  4:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mahmoud Mandour, cota, Alex Bennée

From: Alex Bennée <alex.bennee@linaro.org>

This is a quality of life helper for plugins so they don't need to
re-implement symbol lookup when dumping an address. The strings are
constant so don't need to be duplicated. One minor tweak is to return
NULL instead of a zero length string to show lookup failed.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210601145824.3849-1-alex.bennee@linaro.org>
Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
---
 include/qemu/qemu-plugin.h | 9 +++++++++
 plugins/api.c              | 6 ++++++
 2 files changed, 15 insertions(+)

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 97cdfd7761..dc3496f36c 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -525,6 +525,15 @@ qemu_plugin_register_vcpu_syscall_ret_cb(qemu_plugin_id_t id,
 
 char *qemu_plugin_insn_disas(const struct qemu_plugin_insn *insn);
 
+/**
+ * qemu_plugin_insn_symbol() - best effort symbol lookup
+ * @insn: instruction reference
+ *
+ * Return a static string referring to the symbol. This is dependent
+ * on the binary QEMU is running having provided a symbol table.
+ */
+const char *qemu_plugin_insn_symbol(const struct qemu_plugin_insn *insn);
+
 /**
  * qemu_plugin_vcpu_for_each() - iterate over the existing vCPU
  * @id: plugin ID
diff --git a/plugins/api.c b/plugins/api.c
index 817c9b6b69..332e2c60e2 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -233,6 +233,12 @@ char *qemu_plugin_insn_disas(const struct qemu_plugin_insn *insn)
     return plugin_disas(cpu, insn->vaddr, insn->data->len);
 }
 
+const char *qemu_plugin_insn_symbol(const struct qemu_plugin_insn *insn)
+{
+    const char *sym = lookup_symbol(insn->vaddr);
+    return sym[0] != 0 ? sym : NULL;
+}
+
 /*
  * The memory queries allow the plugin to query information about a
  * memory access.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 2/4] plugins: Added a new cache modelling plugin.
  2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 1/4] plugins/api: expose symbol lookup to plugins Mahmoud Mandour
@ 2021-06-08  4:05 ` Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization Mahmoud Mandour
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Mahmoud Mandour @ 2021-06-08  4:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mahmoud Mandour, cota, Alex Bennée

Added a cache modelling plugin that uses a static configuration used in
many of the commercial microprocessors and uses random eviction policy.

The purpose of the plugin is to identify the most cache-thrashing
instructions for both instruction cache and data cache.

Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
---
 contrib/plugins/Makefile |   1 +
 contrib/plugins/cache.c  | 423 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 424 insertions(+)
 create mode 100644 contrib/plugins/cache.c

diff --git a/contrib/plugins/Makefile b/contrib/plugins/Makefile
index b9d7935e5e..2237b47f8b 100644
--- a/contrib/plugins/Makefile
+++ b/contrib/plugins/Makefile
@@ -18,6 +18,7 @@ NAMES += hotpages
 NAMES += howvec
 NAMES += lockstep
 NAMES += hwprofile
+NAMES += cache
 
 SONAMES := $(addsuffix .so,$(addprefix lib,$(NAMES)))
 
diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
new file mode 100644
index 0000000000..715e5443b0
--- /dev/null
+++ b/contrib/plugins/cache.c
@@ -0,0 +1,423 @@
+/*
+ * Copyright (C) 2021, Mahmoud Mandour <ma.mandourr@gmail.com>
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+
+#include <inttypes.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <string.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <glib.h>
+
+#include <qemu-plugin.h>
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+static enum qemu_plugin_mem_rw rw = QEMU_PLUGIN_MEM_RW;
+
+static GRand *rng;
+static GHashTable *miss_ht;
+
+static GMutex mtx;
+
+static int limit;
+static bool sys;
+
+static uint64_t dmem_accesses;
+static uint64_t dmisses;
+
+static uint64_t imem_accesses;
+static uint64_t imisses;
+
+/*
+ * A CacheSet is a set of cache blocks. A memory block that maps to a set can be
+ * put in any of the blocks inside the set. The number of block per set is
+ * called the associativity (assoc).
+ *
+ * Each block contains the the stored tag and a valid bit. Since this is not
+ * a functional simulator, the data itself is not stored. We only identify
+ * whether a block is in the cache or not by searching for its tag.
+ *
+ * In order to search for memory data in the cache, the set identifier and tag
+ * are extracted from the address and the set is probed to see whether a tag
+ * match occur.
+ *
+ * An address is logically divided into three portions: The block offset,
+ * the set number, and the tag.
+ *
+ * The set number is used to identify the set in which the block may exist.
+ * The tag is compared against all the tags of a set to search for a match. If a
+ * match is found, then the access is a hit.
+ */
+
+struct CacheBlock {
+    uint64_t tag;
+    bool valid;
+};
+
+struct CacheSet {
+    struct CacheBlock *blocks;
+};
+
+struct Cache {
+    struct CacheSet *sets;
+    int num_sets;
+    int cachesize;
+    int assoc;
+    int blksize_shift;
+    uint64_t set_mask;
+    uint64_t tag_mask;
+};
+
+struct InsnData {
+    char *disas_str;
+    const char *symbol;
+    uint64_t addr;
+    uint64_t dmisses;
+    uint64_t imisses;
+};
+
+struct Cache *dcache, *icache;
+
+static int pow_of_two(int num)
+{
+    g_assert((num & (num - 1)) == 0);
+    int ret = 0;
+    while (num /= 2) {
+        ret++;
+    }
+    return ret;
+}
+
+static inline uint64_t extract_tag(struct Cache *cache, uint64_t addr)
+{
+    return addr & cache->tag_mask;
+}
+
+static inline uint64_t extract_set(struct Cache *cache, uint64_t addr)
+{
+    return (addr & cache->set_mask) >> cache->blksize_shift;
+}
+
+static struct Cache *cache_init(int blksize, int assoc, int cachesize)
+{
+    struct Cache *cache;
+    int i;
+    uint64_t blk_mask;
+
+    cache = g_new(struct Cache, 1);
+    cache->assoc = assoc;
+    cache->cachesize = cachesize;
+    cache->num_sets = cachesize / (blksize * assoc);
+    cache->sets = g_new(struct CacheSet, cache->num_sets);
+    cache->blksize_shift = pow_of_two(blksize);
+
+    for (i = 0; i < cache->num_sets; i++) {
+        cache->sets[i].blocks = g_new0(struct CacheBlock, assoc);
+    }
+
+    blk_mask = blksize - 1;
+    cache->set_mask = ((cache->num_sets - 1) << cache->blksize_shift);
+    cache->tag_mask = ~(cache->set_mask | blk_mask);
+    return cache;
+}
+
+static int get_invalid_block(struct Cache *cache, uint64_t set)
+{
+    int i;
+
+    for (i = 0; i < cache->assoc; i++) {
+        if (!cache->sets[set].blocks[i].valid) {
+            return i;
+        }
+    }
+
+    return -1;
+}
+
+static int get_replaced_block(struct Cache *cache)
+{
+    return g_rand_int_range(rng, 0, cache->assoc);
+}
+
+static bool in_cache(struct Cache *cache, uint64_t addr)
+{
+    int i;
+    uint64_t tag, set;
+
+    tag = extract_tag(cache, addr);
+    set = extract_set(cache, addr);
+
+    for (i = 0; i < cache->assoc; i++) {
+        if (cache->sets[set].blocks[i].tag == tag &&
+                cache->sets[set].blocks[i].valid) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+/**
+ * access_cache(): Simulate a cache access
+ * @cache: The cache under simulation
+ * @addr: The address of the requested memory location
+ *
+ * Returns true if the requsted data is hit in the cache and false when missed.
+ * The cache is updated on miss for the next access.
+ */
+static bool access_cache(struct Cache *cache, uint64_t addr)
+{
+    uint64_t tag, set;
+    int replaced_blk;
+
+    if (in_cache(cache, addr)) {
+        return true;
+    }
+
+    tag = extract_tag(cache, addr);
+    set = extract_set(cache, addr);
+
+    replaced_blk = get_invalid_block(cache, set);
+
+    if (replaced_blk == -1) {
+        replaced_blk = get_replaced_block(cache);
+    }
+
+    cache->sets[set].blocks[replaced_blk].tag = tag;
+    cache->sets[set].blocks[replaced_blk].valid = true;
+
+    return false;
+}
+
+static void vcpu_mem_access(unsigned int cpu_index, qemu_plugin_meminfo_t info,
+                            uint64_t vaddr, void *userdata)
+{
+    uint64_t insn_addr;
+    uint64_t effective_addr;
+    struct qemu_plugin_hwaddr *hwaddr;
+    struct InsnData *insn;
+
+    g_mutex_lock(&mtx);
+    hwaddr = qemu_plugin_get_hwaddr(info, vaddr);
+    if (hwaddr && qemu_plugin_hwaddr_is_io(hwaddr)) {
+        g_mutex_unlock(&mtx);
+        return;
+    }
+
+    insn_addr = ((struct InsnData *) userdata)->addr;
+    effective_addr = hwaddr ? qemu_plugin_hwaddr_phys_addr(hwaddr) : vaddr;
+
+    if (!access_cache(dcache, effective_addr)) {
+        insn = (struct InsnData *) userdata;
+        insn->dmisses++;
+        dmisses++;
+    }
+    dmem_accesses++;
+    g_mutex_unlock(&mtx);
+}
+
+static void vcpu_insn_exec(unsigned int vcpu_index, void *userdata)
+{
+    uint64_t insn_addr;
+    struct InsnData *insn;
+
+    g_mutex_lock(&mtx);
+    insn_addr = ((struct InsnData *) userdata)->addr;
+
+    if (!access_cache(icache, insn_addr)) {
+        insn = (struct InsnData *) userdata;
+        insn->imisses++;
+        imisses++;
+    }
+    imem_accesses++;
+    g_mutex_unlock(&mtx);
+}
+
+static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
+{
+    size_t n_insns;
+    size_t i;
+    struct InsnData *data;
+
+    n_insns = qemu_plugin_tb_n_insns(tb);
+    for (i = 0; i < n_insns; i++) {
+        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
+        uint64_t effective_addr;
+
+        if (sys) {
+            effective_addr = (uint64_t) qemu_plugin_insn_haddr(insn);
+        } else {
+            effective_addr = (uint64_t) qemu_plugin_insn_vaddr(insn);
+        }
+
+        /*
+         * Instructions might get translated multiple times, we do not create
+         * new entries for those instructions. Instead, we fetch the same
+         * entry from the hash table and register it for the callback again.
+         */
+        data = g_hash_table_lookup(miss_ht, GUINT_TO_POINTER(effective_addr));
+        if (data == NULL) {
+            data = g_new0(struct InsnData, 1);
+            data->disas_str = qemu_plugin_insn_disas(insn);
+            data->symbol = qemu_plugin_insn_symbol(insn);
+            data->addr = effective_addr;
+            g_hash_table_insert(miss_ht, GUINT_TO_POINTER(effective_addr),
+                               (gpointer) data);
+        }
+
+        qemu_plugin_register_vcpu_mem_cb(insn, vcpu_mem_access,
+                                         QEMU_PLUGIN_CB_NO_REGS,
+                                         rw, data);
+
+        qemu_plugin_register_vcpu_insn_exec_cb(insn, vcpu_insn_exec,
+                                               QEMU_PLUGIN_CB_NO_REGS, data);
+    }
+}
+
+static void free_insn(gpointer data)
+{
+    struct InsnData *insn = (struct InsnData *) data;
+    g_free(insn->disas_str);
+    g_free(insn);
+}
+
+static void free_cache(struct Cache *cache)
+{
+    for (int i = 0; i < cache->num_sets; i++) {
+        g_free(cache->sets[i].blocks);
+    }
+
+    g_free(cache->sets);
+    g_free(cache);
+}
+
+static int dcmp(gconstpointer a, gconstpointer b)
+{
+    struct InsnData *insn_a = (struct InsnData *) a;
+    struct InsnData *insn_b = (struct InsnData *) b;
+
+    return insn_a->dmisses < insn_b->dmisses ? 1 : -1;
+}
+
+static int icmp(gconstpointer a, gconstpointer b)
+{
+    struct InsnData *insn_a = (struct InsnData *) a;
+    struct InsnData *insn_b = (struct InsnData *) b;
+
+    return insn_a->imisses < insn_b->imisses ? 1 : -1;
+}
+
+static void log_stats()
+{
+    g_autoptr(GString) rep = g_string_new("");
+    g_string_append_printf(rep,
+        "Data accesses: %lu, Misses: %lu\nMiss rate: %lf%%\n\n",
+        dmem_accesses,
+        dmisses,
+        ((double) dmisses / dmem_accesses) * 100.0);
+
+    g_string_append_printf(rep,
+        "Instruction accesses: %lu, Misses: %lu\nMiss rate: %lf%%\n\n",
+        imem_accesses,
+        imisses,
+        ((double) imisses / imem_accesses) * 100.0);
+
+    qemu_plugin_outs(rep->str);
+}
+
+static void plugin_exit()
+{
+    GList *curr, *miss_insns;
+    int i;
+    struct InsnData *insn;
+
+    g_mutex_lock(&mtx);
+
+    log_stats();
+
+    miss_insns = g_hash_table_get_values(miss_ht);
+    miss_insns = g_list_sort(miss_insns, dcmp);
+    g_autoptr(GString) rep = g_string_new("");
+    g_string_append_printf(rep, "%s", "address, data misses, instruction\n");
+
+    for (curr = miss_insns, i = 0; curr && i < limit; i++, curr = curr->next) {
+        insn = (struct InsnData *) curr->data;
+        g_string_append_printf(rep, "0x%" PRIx64, insn->addr);
+        if (insn->symbol) {
+            g_string_append_printf(rep, " (%s)", insn->symbol);
+        }
+        g_string_append_printf(rep, ", %ld, %s\n", insn->dmisses,
+                               insn->disas_str);
+    }
+
+    miss_insns = g_list_sort(miss_insns, icmp);
+    g_string_append_printf(rep, "%s", "\naddress, fetch misses, instruction\n");
+
+    for (curr = miss_insns, i = 0; curr && i < limit; i++, curr = curr->next) {
+        insn = (struct InsnData *) curr->data;
+        g_string_append_printf(rep, "0x%" PRIx64, insn->addr);
+        if (insn->symbol) {
+            g_string_append_printf(rep, " (%s)", insn->symbol);
+        }
+        g_string_append_printf(rep, ", %ld, %s\n", insn->imisses,
+                               insn->disas_str);
+    }
+
+    qemu_plugin_outs(rep->str);
+
+    free_cache(dcache);
+    free_cache(icache);
+
+    g_list_free(miss_insns);
+
+    g_hash_table_destroy(miss_ht);
+    g_mutex_unlock(&mtx);
+}
+
+QEMU_PLUGIN_EXPORT
+int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
+                        int argc, char **argv)
+{
+    int i;
+    int iassoc, iblksize, icachesize;
+    int dassoc, dblksize, dcachesize;
+
+    limit = 32;
+    sys = info->system_emulation;
+
+    dassoc = 8;
+    dblksize = 64;
+    dcachesize = dblksize * dassoc * 32;
+
+    iassoc = 8;
+    iblksize = 64;
+    icachesize = iblksize * iassoc * 32;
+
+    rng = g_rand_new();
+
+    for (i = 0; i < argc; i++) {
+        char *opt = argv[i];
+        if (g_str_has_prefix(opt, "limit=")) {
+            limit = g_ascii_strtoull(opt + 6, NULL, 10);
+        } else {
+            fprintf(stderr, "option parsing failed: %s\n", opt);
+            return -1;
+        }
+    }
+
+    dcache = cache_init(dblksize, dassoc, dcachesize);
+    icache = cache_init(iblksize, iassoc, icachesize);
+
+    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+    qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
+
+    miss_ht = g_hash_table_new_full(NULL, g_direct_equal, NULL, free_insn);
+
+    return 0;
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization
  2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 1/4] plugins/api: expose symbol lookup to plugins Mahmoud Mandour
  2021-06-08  4:05 ` [RFC PATCH v3 2/4] plugins: Added a new cache modelling plugin Mahmoud Mandour
@ 2021-06-08  4:05 ` Mahmoud Mandour
  2021-06-22 14:37   ` Alex Bennée
  2021-06-08  4:05 ` [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies Mahmoud Mandour
  2021-06-22 14:57 ` [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Alex Bennée
  4 siblings, 1 reply; 8+ messages in thread
From: Mahmoud Mandour @ 2021-06-08  4:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mahmoud Mandour, cota, Alex Bennée

Made both icache and dcache configurable through plugin arguments.

Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
---
 contrib/plugins/cache.c | 44 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
index 715e5443b0..d8e8c750b6 100644
--- a/contrib/plugins/cache.c
+++ b/contrib/plugins/cache.c
@@ -104,8 +104,17 @@ static inline uint64_t extract_set(struct Cache *cache, uint64_t addr)
     return (addr & cache->set_mask) >> cache->blksize_shift;
 }
 
+static bool bad_cache_params(int blksize, int assoc, int cachesize)
+{
+    return (cachesize % blksize) != 0 || (cachesize % (blksize * assoc) != 0);
+}
+
 static struct Cache *cache_init(int blksize, int assoc, int cachesize)
 {
+    if (bad_cache_params(blksize, assoc, cachesize)) {
+        return NULL;
+    }
+
     struct Cache *cache;
     int i;
     uint64_t blk_mask;
@@ -403,8 +412,30 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
 
     for (i = 0; i < argc; i++) {
         char *opt = argv[i];
-        if (g_str_has_prefix(opt, "limit=")) {
-            limit = g_ascii_strtoull(opt + 6, NULL, 10);
+        if (g_str_has_prefix(opt, "I=")) {
+            gchar **toks = g_strsplit(opt + 2, " ", -1);
+            if (g_strv_length(toks) != 3) {
+                g_strfreev(toks);
+                fprintf(stderr, "option parsing failed: %s\n", opt);
+                return -1;
+            }
+            icachesize = g_ascii_strtoull(toks[0], NULL, 10);
+            iassoc = g_ascii_strtoull(toks[1], NULL, 10);
+            iblksize = g_ascii_strtoull(toks[2], NULL, 10);
+            g_strfreev(toks);
+        } else if (g_str_has_prefix(opt, "D=")) {
+            gchar **toks = g_strsplit(opt + 2, " ", -1);
+            if (g_strv_length(toks) != 3) {
+                g_strfreev(toks);
+                fprintf(stderr, "option parsing failed: %s\n", opt);
+                return -1;
+            }
+            dcachesize = g_ascii_strtoull(toks[0], NULL, 10);
+            dassoc = g_ascii_strtoull(toks[1], NULL, 10);
+            dblksize = g_ascii_strtoull(toks[2], NULL, 10);
+            g_strfreev(toks);
+        } else if (g_str_has_prefix(opt, "limit=")) {
+            limit = g_ascii_strtoll(opt + 6, NULL, 10);
         } else {
             fprintf(stderr, "option parsing failed: %s\n", opt);
             return -1;
@@ -412,7 +443,16 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
     }
 
     dcache = cache_init(dblksize, dassoc, dcachesize);
+    if (!dcache) {
+        fprintf(stderr, "dcache cannot be constructed from given parameters\n");
+        return -1;
+    }
+
     icache = cache_init(iblksize, iassoc, icachesize);
+    if (!icache) {
+        fprintf(stderr, "icache cannot be constructed from given parameters\n");
+        return -1;
+    }
 
     qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
     qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies.
  2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
                   ` (2 preceding siblings ...)
  2021-06-08  4:05 ` [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization Mahmoud Mandour
@ 2021-06-08  4:05 ` Mahmoud Mandour
  2021-06-22 14:57   ` Alex Bennée
  2021-06-22 14:57 ` [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Alex Bennée
  4 siblings, 1 reply; 8+ messages in thread
From: Mahmoud Mandour @ 2021-06-08  4:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mahmoud Mandour, cota, Alex Bennée

Implemented FIFO and LRU eviction policies.
Now one of the three eviction policies can be chosen as an argument. On
not specifying an argument, LRU is used by default.

Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
---
 contrib/plugins/cache.c | 205 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 192 insertions(+), 13 deletions(-)

diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
index d8e8c750b6..be817da5b6 100644
--- a/contrib/plugins/cache.c
+++ b/contrib/plugins/cache.c
@@ -34,6 +34,14 @@ static uint64_t dmisses;
 static uint64_t imem_accesses;
 static uint64_t imisses;
 
+enum EvictionPolicy {
+    LRU,
+    FIFO,
+    RAND,
+};
+
+enum EvictionPolicy policy;
+
 /*
  * A CacheSet is a set of cache blocks. A memory block that maps to a set can be
  * put in any of the blocks inside the set. The number of block per set is
@@ -53,6 +61,8 @@ static uint64_t imisses;
  * The set number is used to identify the set in which the block may exist.
  * The tag is compared against all the tags of a set to search for a match. If a
  * match is found, then the access is a hit.
+ *
+ * The CacheSet also contains bookkeaping information about eviction details.
  */
 
 struct CacheBlock {
@@ -62,6 +72,9 @@ struct CacheBlock {
 
 struct CacheSet {
     struct CacheBlock *blocks;
+    uint64_t *lru_priorities;
+    uint64_t lru_gen_counter;
+    GQueue *fifo_queue;
 };
 
 struct Cache {
@@ -82,6 +95,12 @@ struct InsnData {
     uint64_t imisses;
 };
 
+void (*update_hit)(struct Cache *cache, int set, int blk);
+void (*update_miss)(struct Cache *cache, int set, int blk);
+
+void (*metadata_init)(struct Cache *cache);
+void (*metadata_destroy)(struct Cache *cache);
+
 struct Cache *dcache, *icache;
 
 static int pow_of_two(int num)
@@ -104,6 +123,103 @@ static inline uint64_t extract_set(struct Cache *cache, uint64_t addr)
     return (addr & cache->set_mask) >> cache->blksize_shift;
 }
 
+/*
+ * LRU evection policy: For each set, a generation counter is maintained
+ * alongside a priority array.
+ *
+ * On each set access, the generation counter is incremented.
+ *
+ * On a cache hit: The hit-block is assigned the current generation counter,
+ * indicating that it is the most recently used block.
+ *
+ * On a cache miss: The block with the least priority is searched and replaced
+ * with the newly-cached block, of which the priority is set to the current
+ * generation number.
+ */
+
+static void lru_priorities_init(struct Cache *cache)
+{
+    int i, assoc;
+
+    assoc = cache->assoc;
+    for (i = 0; i < cache->num_sets; i++) {
+        cache->sets[i].lru_priorities = g_new0(uint64_t, assoc);
+    }
+}
+
+static void lru_update_blk(struct Cache *cache, int set_idx, int blk_idx)
+{
+    struct CacheSet *set = &cache->sets[set_idx];
+    set->lru_priorities[blk_idx] = cache->sets[set_idx].lru_gen_counter;
+    set->lru_gen_counter++;
+}
+
+static int lru_get_lru_block(struct Cache *cache, int set_idx)
+{
+    int i, min_idx, min_priority;
+
+    min_priority = cache->sets[set_idx].lru_priorities[0];
+    min_idx = 0;
+
+    for (i = 1; i < cache->assoc; i++) {
+        if (cache->sets[set_idx].lru_priorities[i] < min_priority) {
+            min_priority = cache->sets[set_idx].lru_priorities[i];
+            min_idx = i;
+        }
+    }
+    return min_idx;
+}
+
+static void lru_priorities_destroy(struct Cache *cache)
+{
+    int i;
+
+    for (i = 0; i < cache->num_sets; i++) {
+        g_free(cache->sets[i].lru_priorities);
+    }
+}
+
+/*
+ * FIFO eviction policy: a FIFO queue is maintained for each CacheSet that
+ * stores accesses to the cache.
+ *
+ * On a compulsory miss: The block index is enqueued to the fifo_queue to
+ * indicate that it's the latest cached block.
+ *
+ * On a conflict miss: The first-in block is removed from the cache and the new
+ * block is put in its place and enqueued to the FIFO queue.
+ */
+
+static void fifo_init(struct Cache *cache)
+{
+    int i;
+
+    for (i = 0; i < cache->num_sets; i++) {
+        cache->sets[i].fifo_queue = g_queue_new();
+    }
+}
+
+static int fifo_get_first_block(struct Cache *cache, int set)
+{
+    GQueue *q = cache->sets[set].fifo_queue;
+    return GPOINTER_TO_INT(g_queue_pop_tail(q));
+}
+
+static void fifo_update_on_miss(struct Cache *cache, int set, int blk_idx)
+{
+    GQueue *q = cache->sets[set].fifo_queue;
+    g_queue_push_head(q, GINT_TO_POINTER(blk_idx));
+}
+
+static void fifo_destroy(struct Cache *cache)
+{
+    int i;
+
+    for (i = 0; i < cache->assoc; i++) {
+        g_queue_free(cache->sets[i].fifo_queue);
+    }
+}
+
 static bool bad_cache_params(int blksize, int assoc, int cachesize)
 {
     return (cachesize % blksize) != 0 || (cachesize % (blksize * assoc) != 0);
@@ -128,11 +244,17 @@ static struct Cache *cache_init(int blksize, int assoc, int cachesize)
 
     for (i = 0; i < cache->num_sets; i++) {
         cache->sets[i].blocks = g_new0(struct CacheBlock, assoc);
+        cache->sets[i].lru_gen_counter = 0;
     }
 
     blk_mask = blksize - 1;
     cache->set_mask = ((cache->num_sets - 1) << cache->blksize_shift);
     cache->tag_mask = ~(cache->set_mask | blk_mask);
+
+    if (metadata_init) {
+        metadata_init(cache);
+    }
+
     return cache;
 }
 
@@ -149,12 +271,21 @@ static int get_invalid_block(struct Cache *cache, uint64_t set)
     return -1;
 }
 
-static int get_replaced_block(struct Cache *cache)
+static int get_replaced_block(struct Cache *cache, int set)
 {
-    return g_rand_int_range(rng, 0, cache->assoc);
+    switch (policy) {
+    case RAND:
+        return g_rand_int_range(rng, 0, cache->assoc);
+    case LRU:
+        return lru_get_lru_block(cache, set);
+    case FIFO:
+        return fifo_get_first_block(cache, set);
+    defalut:
+        g_assert_not_reached();
+    }
 }
 
-static bool in_cache(struct Cache *cache, uint64_t addr)
+static int in_cache(struct Cache *cache, uint64_t addr)
 {
     int i;
     uint64_t tag, set;
@@ -165,11 +296,11 @@ static bool in_cache(struct Cache *cache, uint64_t addr)
     for (i = 0; i < cache->assoc; i++) {
         if (cache->sets[set].blocks[i].tag == tag &&
                 cache->sets[set].blocks[i].valid) {
-            return true;
+            return i;
         }
     }
 
-    return false;
+    return -1;
 }
 
 /**
@@ -178,24 +309,32 @@ static bool in_cache(struct Cache *cache, uint64_t addr)
  * @addr: The address of the requested memory location
  *
  * Returns true if the requsted data is hit in the cache and false when missed.
- * The cache is updated on miss for the next access.
+ * The cache is then updated for subsequent accesses.
  */
 static bool access_cache(struct Cache *cache, uint64_t addr)
 {
+    int hit_blk, replaced_blk;
     uint64_t tag, set;
-    int replaced_blk;
-
-    if (in_cache(cache, addr)) {
-        return true;
-    }
 
     tag = extract_tag(cache, addr);
     set = extract_set(cache, addr);
 
+    hit_blk = in_cache(cache, addr);
+    if (hit_blk != -1) {
+        if (update_hit) {
+            update_hit(cache, set, hit_blk);
+        }
+        return true;
+    }
+
     replaced_blk = get_invalid_block(cache, set);
 
     if (replaced_blk == -1) {
-        replaced_blk = get_replaced_block(cache);
+        replaced_blk = get_replaced_block(cache, set);
+    }
+
+    if (update_miss) {
+        update_miss(cache, set, replaced_blk);
     }
 
     cache->sets[set].blocks[replaced_blk].tag = tag;
@@ -302,6 +441,10 @@ static void free_cache(struct Cache *cache)
         g_free(cache->sets[i].blocks);
     }
 
+    if (metadata_destroy) {
+        metadata_destroy(cache);
+    }
+
     g_free(cache->sets);
     g_free(cache);
 }
@@ -389,6 +532,28 @@ static void plugin_exit()
     g_mutex_unlock(&mtx);
 }
 
+static void policy_init()
+{
+    switch (policy) {
+    case LRU:
+        update_hit = lru_update_blk;
+        update_miss = lru_update_blk;
+        metadata_init = lru_priorities_init;
+        metadata_destroy = lru_priorities_destroy;
+        break;
+    case FIFO:
+        update_miss = fifo_update_on_miss;
+        metadata_init = fifo_init;
+        metadata_destroy = fifo_destroy;
+        break;
+    case RAND:
+        rng = g_rand_new();
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 QEMU_PLUGIN_EXPORT
 int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
                         int argc, char **argv)
@@ -408,7 +573,7 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
     iblksize = 64;
     icachesize = iblksize * iassoc * 32;
 
-    rng = g_rand_new();
+    policy = LRU;
 
     for (i = 0; i < argc; i++) {
         char *opt = argv[i];
@@ -436,12 +601,26 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
             g_strfreev(toks);
         } else if (g_str_has_prefix(opt, "limit=")) {
             limit = g_ascii_strtoll(opt + 6, NULL, 10);
+        } else if (g_str_has_prefix(opt, "evict=")) {
+            gchar *p = opt + 6;
+            if (g_strcmp0(p, "rand") == 0) {
+                policy = RAND;
+            } else if (g_strcmp0(p, "lru") == 0) {
+                policy = LRU;
+            } else if (g_strcmp0(p, "fifo") == 0) {
+                policy = FIFO;
+            } else {
+                fprintf(stderr, "invalid eviction policy: %s\n", opt);
+                return -1;
+            }
         } else {
             fprintf(stderr, "option parsing failed: %s\n", opt);
             return -1;
         }
     }
 
+    policy_init();
+
     dcache = cache_init(dblksize, dassoc, dcachesize);
     if (!dcache) {
         fprintf(stderr, "dcache cannot be constructed from given parameters\n");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization
  2021-06-08  4:05 ` [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization Mahmoud Mandour
@ 2021-06-22 14:37   ` Alex Bennée
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Bennée @ 2021-06-22 14:37 UTC (permalink / raw)
  To: Mahmoud Mandour; +Cc: cota, qemu-devel


Mahmoud Mandour <ma.mandourr@gmail.com> writes:

> Made both icache and dcache configurable through plugin arguments.
>
> Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
> ---
>  contrib/plugins/cache.c | 44 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
> index 715e5443b0..d8e8c750b6 100644
> --- a/contrib/plugins/cache.c
> +++ b/contrib/plugins/cache.c
> @@ -104,8 +104,17 @@ static inline uint64_t extract_set(struct Cache *cache, uint64_t addr)
>      return (addr & cache->set_mask) >> cache->blksize_shift;
>  }
>  
> +static bool bad_cache_params(int blksize, int assoc, int cachesize)
> +{
> +    return (cachesize % blksize) != 0 || (cachesize % (blksize * assoc) != 0);
> +}
> +
>  static struct Cache *cache_init(int blksize, int assoc, int cachesize)
>  {
> +    if (bad_cache_params(blksize, assoc, cachesize)) {
> +        return NULL;
> +    }
> +
>      struct Cache *cache;
>      int i;
>      uint64_t blk_mask;
> @@ -403,8 +412,30 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>  
>      for (i = 0; i < argc; i++) {
>          char *opt = argv[i];
> -        if (g_str_has_prefix(opt, "limit=")) {
> -            limit = g_ascii_strtoull(opt + 6, NULL, 10);
> +        if (g_str_has_prefix(opt, "I=")) {
> +            gchar **toks = g_strsplit(opt + 2, " ", -1);

I don't think this works because a space will trigger the shell to split
the args - the only way I could get it to work was by quoting the whole
-plugin argument. I know the syntax of optional plugin args is ugly as
hell but this should probably use "," like hwprofile.

> +            if (g_strv_length(toks) != 3) {
> +                g_strfreev(toks);
> +                fprintf(stderr, "option parsing failed: %s\n", opt);
> +                return -1;
> +            }
> +            icachesize = g_ascii_strtoull(toks[0], NULL, 10);
> +            iassoc = g_ascii_strtoull(toks[1], NULL, 10);
> +            iblksize = g_ascii_strtoull(toks[2], NULL, 10);
> +            g_strfreev(toks);
> +        } else if (g_str_has_prefix(opt, "D=")) {
> +            gchar **toks = g_strsplit(opt + 2, " ", -1);
> +            if (g_strv_length(toks) != 3) {
> +                g_strfreev(toks);
> +                fprintf(stderr, "option parsing failed: %s\n", opt);
> +                return -1;
> +            }
> +            dcachesize = g_ascii_strtoull(toks[0], NULL, 10);
> +            dassoc = g_ascii_strtoull(toks[1], NULL, 10);
> +            dblksize = g_ascii_strtoull(toks[2], NULL, 10);
> +            g_strfreev(toks);
> +        } else if (g_str_has_prefix(opt, "limit=")) {
> +            limit = g_ascii_strtoll(opt + 6, NULL, 10);
>          } else {
>              fprintf(stderr, "option parsing failed: %s\n", opt);
>              return -1;
> @@ -412,7 +443,16 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>      }
>  
>      dcache = cache_init(dblksize, dassoc, dcachesize);
> +    if (!dcache) {
> +        fprintf(stderr, "dcache cannot be constructed from given parameters\n");
> +        return -1;
> +    }
> +

Can we give users more of a hint of what's wrong? I suspect you'll need
to factor out a validate_cache_param and return a string that describes
the failure mode. Otherwise it gets frustrating to the user.

>      icache = cache_init(iblksize, iassoc, icachesize);
> +    if (!icache) {
> +        fprintf(stderr, "icache cannot be constructed from given parameters\n");
> +        return -1;
> +    }

ditto

>  
>      qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
>      qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies.
  2021-06-08  4:05 ` [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies Mahmoud Mandour
@ 2021-06-22 14:57   ` Alex Bennée
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Bennée @ 2021-06-22 14:57 UTC (permalink / raw)
  To: Mahmoud Mandour; +Cc: cota, qemu-devel


Mahmoud Mandour <ma.mandourr@gmail.com> writes:

> Implemented FIFO and LRU eviction policies.
> Now one of the three eviction policies can be chosen as an argument. On
> not specifying an argument, LRU is used by default.
>
> Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API
  2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
                   ` (3 preceding siblings ...)
  2021-06-08  4:05 ` [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies Mahmoud Mandour
@ 2021-06-22 14:57 ` Alex Bennée
  4 siblings, 0 replies; 8+ messages in thread
From: Alex Bennée @ 2021-06-22 14:57 UTC (permalink / raw)
  To: Mahmoud Mandour; +Cc: cota, qemu-devel


Mahmoud Mandour <ma.mandourr@gmail.com> writes:

> This RFC series introduces a new cache TCG plugin that models separate
> L1 data cache and L1 instruction cache and uses one shared cache for
> all the cores.
>
> It also includes a commit by Alex that adds an API call that resolves
> the symbol of an insn.
>
> The original RFC patch posted by Alex Bennée included incorporating
> symbol resolution into the cache plugin that caused conflicts, so I
> dropped the plugin additions from that and introduced them afterwards.

Queued patches 1-2 to plugins/next, will queue the rest from next
revision. Could you please also add a section to
docs/devel/tcg-plugins.rst to detail the argument options.

>
> v2 -> v3:
>     Precomputed the value of block size shift once and stored in the
>     cache.
>
>     Removed tag shifting since it's okay to leave the tag in the
>     high-order bits and mask out set index and block offset.
>
>     Used one hashtable to store InsnData structs and made the structs
>     have separate counters for data misses and instruction misses.
>
>     Used a boolean to indicate whether an access resulted in a hit or a
>     miss.
>
>     Inserted an InsnData struct into the hashtable on translation-time
>     and made sure we do so once so that we don't rewrite the struct if
>     an instruction is translated multiple times.
>
>     Made the output format for most-missing instructions more
>     machine-readable.
>
>     Removed trace-generation.
>
>     Freed tokenized strings after argument parsing.
>
>     Returned null from cache_init() if argument cache config is bad.
>
>     Used one enum to indicate the chosen eviction policy.
>
>     Added function pointers for cache update and metadata initialization
>     and destroying. Those pointers are assigned to policy-specific
>     functions.
>
>     Remade LRU. Assigned a generation number that is incremented on each
>     set access to the currently-accessed block's priority. On miss, 
>     evicted the block with the least generation number.
>
>     Allowed to give multiple "evict" arguments and sticked to the last
>     one.
>
> Alex Bennée (1):
>   plugins/api: expose symbol lookup to plugins
>
> Mahmoud Mandour (3):
>   plugins: Added a new cache modelling plugin.
>   plugins/cache: Enabled cache parameterization
>   plugins/cache: Added FIFO and LRU eviction policies.
>
>  contrib/plugins/Makefile   |   1 +
>  contrib/plugins/cache.c    | 642 +++++++++++++++++++++++++++++++++++++
>  include/qemu/qemu-plugin.h |   9 +
>  plugins/api.c              |   6 +
>  4 files changed, 658 insertions(+)
>  create mode 100644 contrib/plugins/cache.c


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-06-22 14:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08  4:05 [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Mahmoud Mandour
2021-06-08  4:05 ` [RFC PATCH v3 1/4] plugins/api: expose symbol lookup to plugins Mahmoud Mandour
2021-06-08  4:05 ` [RFC PATCH v3 2/4] plugins: Added a new cache modelling plugin Mahmoud Mandour
2021-06-08  4:05 ` [RFC PATCH v3 3/4] plugins/cache: Enabled cache parameterization Mahmoud Mandour
2021-06-22 14:37   ` Alex Bennée
2021-06-08  4:05 ` [RFC PATCH v3 4/4] plugins/cache: Added FIFO and LRU eviction policies Mahmoud Mandour
2021-06-22 14:57   ` Alex Bennée
2021-06-22 14:57 ` [RFC PATCH v3 0/4] Cache TCG plugin & symbol-resolution API Alex Bennée

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.