All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] fuzz: add generic fuzzer
@ 2020-06-11  5:56 Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 1/3] fuzz: add a general fuzzer for any qemu arguments Alexander Bulekov
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Alexander Bulekov @ 2020-06-11  5:56 UTC (permalink / raw)
  To: qemu-devel; +Cc: darren.kenny, bsd, f4bug, stefanha, Alexander Bulekov

These patches add a generic fuzzer for virtual devices. This should
allow us to fuzz devices that accept inputs over MMIO, PIO and DMA
without any device-specific code.

Example:
QEMU_FUZZ_ARGS="-device virtio-net" \
FUZZ_REGION_WHITELIST="virtio pci-" \
./i386-softmmu/qemu-fuzz-i386 --fuzz-target=general-pci-enum-fuzz

The above command will add a virtio-net device to the QEMU arguments and
restrict the fuzzer to only interact with MMIO and PIO regions with
names that contain "virtio" or "pci-". I find these names using the info
mtree monitor command. 

Basically, the fuzzer splits the input into a series of commands, such
as mmio_write, pio_write, etc. Additionally, these patches add "hooks"
to functions that are typically used by virtual-devices to read from RAM
(DMA). These hooks attempt to populate these DMA regions with fuzzed
data, just in time.  There are some differences from my reference code
that seem to result in performance issues that I am still trying to iron
out. I also need to figure out how to add the DMA "hooks" in a neat way.
Maybe I can use -Wl,--wrap for this. I appreciate any feedback.

Alexander Bulekov (3):
  fuzz: add a general fuzzer for any qemu arguments
  fuzz: add support for fuzzing DMA regions
  fuzz: Add callbacks for dma-access functions

 exec.c                                |  17 +-
 include/exec/memory.h                 |   8 +
 include/exec/memory_ldst_cached.inc.h |   9 +
 include/sysemu/dma.h                  |   5 +-
 memory_ldst.inc.c                     |  12 +
 tests/qtest/fuzz/Makefile.include     |   1 +
 tests/qtest/fuzz/general_fuzz.c       | 556 ++++++++++++++++++++++++++
 7 files changed, 606 insertions(+), 2 deletions(-)
 create mode 100644 tests/qtest/fuzz/general_fuzz.c

-- 
2.26.2



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH 1/3] fuzz: add a general fuzzer for any qemu arguments
  2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
@ 2020-06-11  5:56 ` Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 2/3] fuzz: add support for fuzzing DMA regions Alexander Bulekov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Alexander Bulekov @ 2020-06-11  5:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Alexander Bulekov, f4bug,
	darren.kenny, bsd, stefanha, Paolo Bonzini

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
 tests/qtest/fuzz/Makefile.include |   1 +
 tests/qtest/fuzz/general_fuzz.c   | 454 ++++++++++++++++++++++++++++++
 2 files changed, 455 insertions(+)
 create mode 100644 tests/qtest/fuzz/general_fuzz.c

diff --git a/tests/qtest/fuzz/Makefile.include b/tests/qtest/fuzz/Makefile.include
index f259d866c9..60f1a448ea 100644
--- a/tests/qtest/fuzz/Makefile.include
+++ b/tests/qtest/fuzz/Makefile.include
@@ -10,6 +10,7 @@ fuzz-obj-y += tests/qtest/fuzz/qos_fuzz.o
 fuzz-obj-$(CONFIG_PCI_I440FX) += tests/qtest/fuzz/i440fx_fuzz.o
 fuzz-obj-$(CONFIG_VIRTIO_NET) += tests/qtest/fuzz/virtio_net_fuzz.o
 fuzz-obj-$(CONFIG_SCSI) += tests/qtest/fuzz/virtio_scsi_fuzz.o
+fuzz-obj-y += tests/qtest/fuzz/general_fuzz.o
 
 FUZZ_CFLAGS += -I$(SRC_PATH)/tests -I$(SRC_PATH)/tests/qtest
 
diff --git a/tests/qtest/fuzz/general_fuzz.c b/tests/qtest/fuzz/general_fuzz.c
new file mode 100644
index 0000000000..5c29306bb6
--- /dev/null
+++ b/tests/qtest/fuzz/general_fuzz.c
@@ -0,0 +1,454 @@
+/*
+ * General Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2020
+ *
+ * Authors:
+ *  Alexander Bulekov   <alxndr@bu.edu>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "cpu.h"
+#include "standard-headers/linux/virtio_config.h"
+#include "tests/qtest/libqtest.h"
+#include "tests/qtest/libqos/virtio-net.h"
+#include "fuzz.h"
+#include "fork_fuzz.h"
+#include "qos_fuzz.h"
+#include "libqos/pci-pc.h"
+#include "string.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "hw/qdev-core.h"
+#include "hw/pci/pci_regs.h"
+#include "hw/boards.h"
+
+/*
+ * CMD_SEP is a random 32-bit value used to separate "commands" in the fuzz
+ * input
+ */
+#define CMD_SEP "\x84\x05\x5C\x5E"
+
+typedef struct {
+    size_t addr;
+    size_t len; /* The number of bytes until the end of the I/O region */
+} address_range;
+
+/*
+ * A pattern used to populate a DMA region or perform a memwrite. This is
+ * useful for e.g. populating tables of unique addresses.
+ * Example {.index = 1; .stride = 2; .len = 3; .data = "\x00\x01\x02"}
+ * Renders as: 00 01 02   00 03 03   00 05 03   00 07 03 ...
+ */
+typedef struct {
+    uint8_t index;      /* Index of a byte to increment by stride */
+    uint8_t stride;     /* Increment each index'th byte by this amount */
+    size_t len;
+    const uint8_t *data;
+} pattern;
+
+/*
+ * Only fuzz an IO region if its name contains a word in region_whitelist.
+ * Lazy way to limit the fuzzer to a particular device.
+ */
+char **region_whitelist;
+
+/*
+ * Allocate a block of memory and populate it with a pattern.
+ */
+static void *pattern_alloc(pattern p, size_t len)
+{
+    int i;
+    uint8_t *buf = g_malloc(len);
+    uint8_t sum = 0;
+
+    for (i = 0; i < len; ++i) {
+        buf[i] = p.data[i % p.len];
+        if ((i % p.len) == p.index) {
+            buf[i] += sum;
+            sum += p.stride;
+        }
+    }
+    return buf;
+}
+
+
+/*
+ * Here we want to convert a fuzzer-provided [io-region-index, offset] to
+ * a physical address.
+ */
+static address_range get_io_address(MemoryRegion *io,  uint8_t index,
+                                    uint16_t offset, bool root) {
+    /* The index of the candidate MemoryRegions iterated in preorder */
+    static int i;
+    MemoryRegion *child, *mr = NULL;
+    /*
+     * This loop should run at most twice:
+     * 1.) if index > num regions, to calculate num regions to calculate index
+     * % num_regions.
+     * 2.) to actually select the mr.
+     */
+    while (!mr) {
+        /* If we are recursing over a subregion, don't reset i */
+        if (root) {
+            i = 0;
+        }
+        QTAILQ_FOREACH(child, &io->subregions, subregions_link) {
+            int found = *region_whitelist ? 0 : 1;
+            char **wl_ptr = region_whitelist;
+            while (*wl_ptr != NULL) {
+                if (strstr(child->name, *wl_ptr) != NULL) {
+                    found = 1;
+                    break;
+                }
+                wl_ptr++;
+            }
+            if (found) {
+                if (index == i++) {
+                    mr = child;
+                    break;
+                }
+            }
+            address_range addr = get_io_address(child, index, offset, false);
+            if (addr.addr != -1) {
+                return (address_range){child->addr + addr.addr, addr.len};
+            }
+        }
+        if (!mr) {
+            if (i == 0 || !root) {
+                return (address_range){-1, 0};
+            }
+            index = index % i;
+        }
+    }
+    if (mr->size == 0) {
+        return (address_range){mr->addr, 0};
+    } else {
+        return (address_range){mr->addr + (offset % mr->size),
+                               mr->size - (offset % mr->size)};
+    }
+}
+
+static address_range get_pio_address(uint8_t index, uint16_t offset)
+{
+    return get_io_address(get_system_io(), index, offset, true);
+}
+static address_range get_mmio_address(uint8_t index, uint16_t offset)
+{
+    return get_io_address(get_system_memory(), index, offset, true);
+}
+
+static void op_in(QTestState *s, const unsigned char * data, size_t len)
+{
+    enum Sizes {Byte, Word, Long, end_sizes};
+    struct {
+        uint8_t size;
+        uint8_t base;
+        uint16_t offset;
+    } a;
+
+    if (len < sizeof(a)) {
+        return;
+    }
+    memcpy(&a, data, sizeof(a));
+
+    size_t addr = get_pio_address(a.base, a.offset).addr;
+    switch (a.size %= end_sizes) {
+    case Byte:
+        qtest_inb(s, addr);
+        break;
+    case Word:
+        qtest_inw(s, addr);
+        break;
+    case Long:
+        qtest_inl(s, addr);
+        break;
+    }
+}
+
+static void op_out(QTestState *s, const unsigned char * data, size_t len)
+{
+    enum Sizes {Byte, Word, Long, end_sizes};
+    struct {
+        uint8_t size;
+        uint8_t base;
+        uint16_t offset;
+        uint32_t value;
+    } a;
+
+    if (len < sizeof(a)) {
+        return;
+    }
+    memcpy(&a, data, sizeof(a));
+
+    size_t addr = get_pio_address(a.base, a.offset).addr;
+    if (addr == -1) {
+        return;
+    }
+    switch (a.size %= end_sizes) {
+    case Byte:
+        qtest_outb(s, addr, a.value & 0xFF);
+        break;
+    case Word:
+        qtest_outw(s, addr, a.value & 0xFFFF);
+        break;
+    case Long:
+        qtest_outl(s, addr, a.value);
+        break;
+    }
+}
+
+static void op_read(QTestState *s, const unsigned char * data, size_t len)
+{
+    enum Sizes {Byte, Word, Long, Quad, end_sizes};
+    struct {
+        uint8_t size;
+        uint8_t base;
+        uint16_t offset;
+    } a;
+
+    if (len < sizeof(a)) {
+        return;
+    }
+    memcpy(&a, data, sizeof(a));
+
+    size_t addr = get_mmio_address(a.base, a.offset).addr;
+    if (addr == -1) {
+        return;
+    }
+    switch (a.size %= end_sizes) {
+    case Byte:
+        qtest_readb(s, addr);
+        break;
+    case Word:
+        qtest_readw(s, addr);
+        break;
+    case Long:
+        qtest_readl(s, addr);
+        break;
+    case Quad:
+        qtest_readq(s, addr);
+        break;
+    }
+}
+
+static void op_write(QTestState *s, const unsigned char * data, size_t len)
+{
+    enum Sizes {Byte, Word, Long, Quad, end_sizes};
+    struct {
+        uint8_t size;
+        uint8_t base;
+        uint16_t offset;
+        uint64_t value;
+    } a;
+    if (len < sizeof(a)) {
+        return;
+    }
+    memcpy(&a, data, sizeof(a));
+    size_t addr = get_mmio_address(a.base, a.offset).addr;
+    if (addr == -1) {
+        return;
+    }
+    switch (a.size %= end_sizes) {
+    case Byte:
+        qtest_writeb(s, addr, a.value & 0xFF);
+        break;
+    case Word:
+        qtest_writew(s, addr, a.value & 0xFFFF);
+        break;
+    case Long:
+        qtest_writel(s, addr, a.value & 0xFFFFFFFF);
+        break;
+    case Quad:
+        qtest_writeq(s, addr, a.value);
+        break;
+    }
+}
+
+
+static void op_write_pattern(QTestState *s, const unsigned char * data,
+                             size_t len)
+{
+    struct {
+        uint8_t base;
+        uint32_t offset;
+        uint16_t length;
+        uint8_t index;
+        uint8_t stride;
+    } a;
+
+    /*  Need at least one byte for the actual pattern */
+    if (len < sizeof(a) + 1) {
+        return;
+    }
+
+    memcpy(&a, data, sizeof(a));
+    pattern p = {
+        .data = data + sizeof(a),
+        .len = len - sizeof(a),
+        .index = a.index,
+        .stride = a.stride
+    };
+
+    address_range addr = get_mmio_address(a.base, a.offset);
+    if (addr.addr == -1) {
+        return;
+    }
+    /* Cap the length and make sure it doesn't extend past the IO region. */
+    size_t write_length = MIN(MIN(0x1000, a.length), addr.len);
+
+    void *buf = pattern_alloc(p, write_length);
+    qtest_memwrite(s, addr.addr, buf, write_length);
+    free(buf);
+}
+
+static void op_clock_step(QTestState *s, const unsigned char *data, size_t len)
+{
+    qtest_clock_step_next(s);
+}
+
+/*
+ * Here, we interpret random bytes from the fuzzer, as a sequence of commands.
+ * Our commands are variable-width, so we use a separator, CMD_SEP, to specify
+ * the boundaries between commands. This is just a random 32-bit value, which
+ * is easily identified by libfuzzer+AddressSanitizer, as long as we use
+ * memmem. It can also be included in the fuzzer's dictionary. More details
+ * here:
+ * https://github.com/google/fuzzing/blob/master/docs/split-inputs.md
+ *
+ * As a result, the stream of bytes is converted into a sequence of commands.
+ * In a simplified example where CMD_SEP is 0xFF:
+ * 00 01 02 FF 03 04 05 06 FF 01 FF ...
+ * becomes this sequence of commands:
+ * 00 01 02    -> op00 (0102)   -> in (0102, 2)
+ * 03 04 05 06 -> op03 (040506) -> write (040506, 3)
+ * 01          -> op01 (-,0)    -> out (-,0)
+ * ...
+ *
+ * Note here that it is the job of the individual opcode functions to check
+ * that enough data was provided. I.e. in the last command out (,0), out needs
+ * to check that there is not enough data provided to select an address/value
+ * for the operation.
+ */
+static void general_fuzz(QTestState *s, const unsigned char *Data, size_t Size)
+{
+    void (*ops[]) (QTestState* s, const unsigned char* , size_t) = {
+        op_in,
+        op_out,
+        op_read,
+        op_write,
+        op_write_pattern,
+        op_clock_step
+    };
+    const unsigned char *cmd = Data;
+    const unsigned char *nextcmd;
+    size_t cmd_len;
+    uint8_t op;
+
+    if (fork() == 0) {
+        while (cmd && Size) {
+            /* Get the length until the next command or end of input */
+            nextcmd = memmem(cmd, Size, CMD_SEP, strlen(CMD_SEP));
+            cmd_len = nextcmd ? nextcmd - cmd : Size;
+
+            if (cmd_len > 0) {
+                /* Interpret the first byte of the command as an opcode */
+                op = *cmd % (sizeof(ops) / sizeof((ops)[0]));
+                ops[op](s, cmd + 1, cmd_len - 1);
+
+                /* Run the main loop */
+                flush_events(s);
+            }
+            /* Advance to the next command */
+            cmd = nextcmd ? nextcmd + sizeof(CMD_SEP) - 1 : nextcmd;
+            Size = Size - (cmd_len + sizeof(CMD_SEP) - 1);
+        }
+        flush_events(s);
+        _Exit(0);
+    } else {
+        flush_events(s);
+        wait(NULL);
+    }
+}
+
+/*
+ * Adapted from tests/qtest/libqos/pci.c
+ */
+static void pcidev_foreach_callback(QPCIDevice *dev, int devfn, void *data)
+{
+    static const int bar_reg_map[] = {
+        PCI_BASE_ADDRESS_0, PCI_BASE_ADDRESS_1, PCI_BASE_ADDRESS_2,
+        PCI_BASE_ADDRESS_3, PCI_BASE_ADDRESS_4, PCI_BASE_ADDRESS_5,
+    };
+    int bar_reg;
+    uint32_t addr;
+    uint32_t io_type;
+
+    for (int i = 0; i < 6; i++) {
+        bar_reg = bar_reg_map[i];
+        qpci_config_writel(dev, bar_reg, 0xFFFFFFFF);
+        addr = qpci_config_readl(dev, bar_reg);
+
+        io_type = addr & PCI_BASE_ADDRESS_SPACE;
+        if (io_type == PCI_BASE_ADDRESS_SPACE_IO) {
+            addr &= PCI_BASE_ADDRESS_IO_MASK;
+        } else {
+            addr &= PCI_BASE_ADDRESS_MEM_MASK;
+        }
+        if (addr) {
+            qpci_iomap(dev, i, NULL);
+        }
+    }
+
+    qpci_device_enable(dev);
+    if (qpci_find_capability(dev, PCI_CAP_ID_MSIX, 0)) {
+        qpci_msix_enable(dev);
+    }
+}
+
+
+static void general_pre_qos_fuzz(QTestState *s)
+{
+    if (getenv("FUZZ_REGION_WHITELIST")) {
+        region_whitelist = g_strsplit(getenv("FUZZ_REGION_WHITELIST"), " ", 0);
+    }
+    counter_shm_init();
+
+
+    qos_init_path(s);
+
+    /* Enumerate PCI devices and map BARs */
+    qpci_device_foreach(fuzz_qos_obj, -1, -1, pcidev_foreach_callback, NULL);
+}
+
+
+static void *qos_general_cmdline(GString *cmd_line, void *arg)
+{
+    if (!getenv("QEMU_FUZZ_ARGS")) {
+        printf("Please specify qemu args for fuzzing with the QEMU_FUZZ_ARGS"
+               " environment variable. "
+               " (e.g. QEMU_FUZZ_ARGS='-device virtio-net')\n");
+        exit(0);
+    }
+    g_string_append_printf(cmd_line, " %s ", getenv("QEMU_FUZZ_ARGS"));
+    return arg;
+}
+
+static void register_general_fuzz_targets(void)
+{
+    fuzz_add_qos_target(&(FuzzTarget){
+            .name = "general-pci-enum-fuzz",
+            .description = "Fuzz based on any qemu command-line args. "
+                           "Try to map all PCI Device BARs. prior to fuzzing",
+            .pre_fuzz = &general_pre_qos_fuzz,
+            .fuzz = general_fuzz},
+            "pci-bus",
+            &(QOSGraphTestOptions){.before = qos_general_cmdline}
+            );
+}
+
+fuzz_target_init(register_general_fuzz_targets);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 2/3] fuzz: add support for fuzzing DMA regions
  2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 1/3] fuzz: add a general fuzzer for any qemu arguments Alexander Bulekov
@ 2020-06-11  5:56 ` Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions Alexander Bulekov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Alexander Bulekov @ 2020-06-11  5:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Alexander Bulekov, f4bug,
	darren.kenny, bsd, stefanha, Paolo Bonzini

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
 tests/qtest/fuzz/general_fuzz.c | 102 ++++++++++++++++++++++++++++++++
 1 file changed, 102 insertions(+)

diff --git a/tests/qtest/fuzz/general_fuzz.c b/tests/qtest/fuzz/general_fuzz.c
index 5c29306bb6..9e981e870f 100644
--- a/tests/qtest/fuzz/general_fuzz.c
+++ b/tests/qtest/fuzz/general_fuzz.c
@@ -32,6 +32,7 @@
  * input
  */
 #define CMD_SEP "\x84\x05\x5C\x5E"
+#define MAX_DMA_FILL_SIZE 0x10000
 
 typedef struct {
     size_t addr;
@@ -57,6 +58,18 @@ typedef struct {
  */
 char **region_whitelist;
 
+/*
+ * List of dma regions populated since the last fuzzing command. Used to ensure
+ * that we only write to each DMA address once, to avoid race conditions when
+ * building reproducers.
+ */
+static GArray *dma_regions;
+
+static GArray *dma_patterns;
+int dma_pattern_index;
+
+void dma_read_cb(size_t addr, size_t len);
+
 /*
  * Allocate a block of memory and populate it with a pattern.
  */
@@ -76,6 +89,62 @@ static void *pattern_alloc(pattern p, size_t len)
     return buf;
 }
 
+/*
+ * Call-back for functions that perform DMA reads from guest memory. Confirm
+ * that the region has not already been populated since the last loop in
+ * general_fuzz(), avoiding potential race-conditions, which we don't have
+ * a good way for reproducing right now.
+ */
+void dma_read_cb(size_t addr, size_t len)
+{
+    int i;
+
+    /* Return immediately if we have no data to fill the dma region */
+    if (dma_patterns->len == 0) {
+        return;
+    }
+
+    /* Return immediately if the address is greater than the RAM size */
+    if (addr > current_machine->ram_size) {
+        return;
+    }
+
+    /* Cap the length of the DMA access to something reasonable */
+    len = MIN(len, MAX_DMA_FILL_SIZE);
+
+    /*
+     * If we overlap with any existing dma_regions, split the range and only
+     * populate the non-overlapping parts.
+     */
+    for (i = 0; i < dma_regions->len; ++i) {
+        address_range *region = &g_array_index(dma_regions, address_range, i);
+        if (addr < region->addr + region->len && addr + len > region->addr) {
+            if (addr < region->addr) {
+                dma_read_cb(addr, region->addr - addr);
+            }
+            if (addr + len > region->addr + region->len) {
+                dma_read_cb(region->addr + region->len,
+                        addr + len - (region->addr + region->len));
+            }
+            return;
+        }
+    }
+
+    /*
+     * Otherwise, populate the region using address_space_write_rom to avoid
+     * writing to any IO MemoryRegions
+     */
+    address_range ar = {addr, len};
+    g_array_append_val(dma_regions, ar);
+    void *buf = pattern_alloc(g_array_index(dma_patterns, pattern,
+                              dma_pattern_index), ar.len);
+    address_space_write_rom(first_cpu->as, ar.addr, MEMTXATTRS_UNSPECIFIED,
+                            buf, ar.len);
+    free(buf);
+
+    /* Increment the index of the pattern for the next DMA access */
+    dma_pattern_index = (dma_pattern_index + 1) % dma_patterns->len;
+}
 
 /*
  * Here we want to convert a fuzzer-provided [io-region-index, offset] to
@@ -269,6 +338,32 @@ static void op_write(QTestState *s, const unsigned char * data, size_t len)
     }
 }
 
+static void op_add_dma_pattern(QTestState *s,
+                               const unsigned char *data, size_t len)
+{
+    struct {
+        /*
+         * index and stride can be used to increment the index-th byte of the
+         * pattern by the value stride, for each loop of the pattern.
+         */
+        uint8_t index;
+        uint8_t stride;
+    } a;
+
+    if (len < sizeof(a) + 1) {
+        return;
+    }
+    memcpy(&a, data, sizeof(a));
+    pattern p = {a.index, a.stride, len - sizeof(a), data + sizeof(a)};
+    g_array_append_val(dma_patterns, p);
+    return;
+}
+
+static void op_clear_dma_patterns(QTestState *s,
+                                  const unsigned char *data, size_t len)
+{
+    g_array_set_size(dma_patterns, 0);
+}
 
 static void op_write_pattern(QTestState *s, const unsigned char * data,
                              size_t len)
@@ -341,6 +436,8 @@ static void general_fuzz(QTestState *s, const unsigned char *Data, size_t Size)
         op_out,
         op_read,
         op_write,
+        op_add_dma_pattern,
+        op_clear_dma_patterns,
         op_write_pattern,
         op_clock_step
     };
@@ -348,9 +445,12 @@ static void general_fuzz(QTestState *s, const unsigned char *Data, size_t Size)
     const unsigned char *nextcmd;
     size_t cmd_len;
     uint8_t op;
+    g_array_set_size(dma_patterns, 0);
+    dma_pattern_index = 0;
 
     if (fork() == 0) {
         while (cmd && Size) {
+            g_array_set_size(dma_regions, 0);
             /* Get the length until the next command or end of input */
             nextcmd = memmem(cmd, Size, CMD_SEP, strlen(CMD_SEP));
             cmd_len = nextcmd ? nextcmd - cmd : Size;
@@ -418,6 +518,8 @@ static void general_pre_qos_fuzz(QTestState *s)
     }
     counter_shm_init();
 
+    dma_regions = g_array_new(false, false, sizeof(address_range));
+    dma_patterns = g_array_new(false, false, sizeof(pattern));
 
     qos_init_path(s);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 1/3] fuzz: add a general fuzzer for any qemu arguments Alexander Bulekov
  2020-06-11  5:56 ` [RFC PATCH 2/3] fuzz: add support for fuzzing DMA regions Alexander Bulekov
@ 2020-06-11  5:56 ` Alexander Bulekov
  2020-06-23 14:14   ` Stefan Hajnoczi
  2020-06-24  9:46   ` Philippe Mathieu-Daudé
  2020-06-11  6:55 ` [RFC PATCH 0/3] fuzz: add generic fuzzer no-reply
  2020-06-23 14:16 ` Stefan Hajnoczi
  4 siblings, 2 replies; 14+ messages in thread
From: Alexander Bulekov @ 2020-06-11  5:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexander Bulekov, f4bug, darren.kenny, bsd, stefanha,
	Paolo Bonzini, Richard Henderson

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
 exec.c                                | 17 ++++++++++++++++-
 include/exec/memory.h                 |  8 ++++++++
 include/exec/memory_ldst_cached.inc.h |  9 +++++++++
 include/sysemu/dma.h                  |  5 ++++-
 memory_ldst.inc.c                     | 12 ++++++++++++
 5 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index be4be2df3a..2ed724ab54 100644
--- a/exec.c
+++ b/exec.c
@@ -3247,7 +3247,10 @@ MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr,
 {
     MemTxResult result = MEMTX_OK;
     FlatView *fv;
-
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory())
+        dma_read_cb(addr, len);
+#endif
     if (len > 0) {
         RCU_READ_LOCK_GUARD();
         fv = address_space_to_flatview(as);
@@ -3556,6 +3559,10 @@ void *address_space_map(AddressSpace *as,
         }
 
         *plen = l;
+#ifdef CONFIG_FUZZ
+        if(as->root == get_system_memory() && !is_write)
+            dma_read_cb(addr, *plen);
+#endif
         return bounce.buffer;
     }
 
@@ -3563,6 +3570,10 @@ void *address_space_map(AddressSpace *as,
     memory_region_ref(mr);
     *plen = flatview_extend_translation(fv, addr, len, mr, xlat,
                                         l, is_write, attrs);
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory() && !is_write)
+        dma_read_cb(addr, *plen);
+#endif
     ptr = qemu_ram_ptr_length(mr->ram_block, xlat, plen, true);
 
     return ptr;
@@ -3635,6 +3646,10 @@ int64_t address_space_cache_init(MemoryRegionCache *cache,
 
     assert(len > 0);
 
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory() && !is_write)
+        dma_read_cb(addr, len);
+#endif
     l = len;
     cache->fv = address_space_get_flatview(as);
     d = flatview_to_dispatch(cache->fv);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3e00cdbbfa..e9178b3e0a 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -49,6 +49,10 @@
 
 extern bool global_dirty_log;
 
+#ifdef CONFIG_FUZZ
+extern void dma_read_cb(size_t addr, size_t len);
+#endif
+
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionMmio MemoryRegionMmio;
 
@@ -2427,6 +2431,10 @@ address_space_read_cached(MemoryRegionCache *cache, hwaddr addr,
                           void *buf, hwaddr len)
 {
     assert(addr < cache->len && len <= cache->len - addr);
+
+#ifdef CONFIG_FUZZ
+    dma_read_cb(addr, len);
+#endif
     if (likely(cache->ptr)) {
         memcpy(buf, cache->ptr + addr, len);
     } else {
diff --git a/include/exec/memory_ldst_cached.inc.h b/include/exec/memory_ldst_cached.inc.h
index fd4bbb40e7..dc3ce14a97 100644
--- a/include/exec/memory_ldst_cached.inc.h
+++ b/include/exec/memory_ldst_cached.inc.h
@@ -28,6 +28,9 @@ static inline uint32_t ADDRESS_SPACE_LD_CACHED(l)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 4 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 4);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(l)(cache->ptr + addr);
     } else {
@@ -39,6 +42,9 @@ static inline uint64_t ADDRESS_SPACE_LD_CACHED(q)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 8 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 8);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(q)(cache->ptr + addr);
     } else {
@@ -50,6 +56,9 @@ static inline uint32_t ADDRESS_SPACE_LD_CACHED(uw)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 2 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 2);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(uw)(cache->ptr + addr);
     } else {
diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
index 80c5bc3e02..f32d7db7aa 100644
--- a/include/sysemu/dma.h
+++ b/include/sysemu/dma.h
@@ -105,8 +105,11 @@ static inline int dma_memory_rw(AddressSpace *as, dma_addr_t addr,
                                 void *buf, dma_addr_t len,
                                 DMADirection dir)
 {
+#ifdef CONFIG_FUZZ
+    if (dir == DMA_DIRECTION_TO_DEVICE)
+        dma_read_cb(addr, len);
+#endif
     dma_barrier(as, dir);
-
     return dma_memory_rw_relaxed(as, addr, buf, len, dir);
 }
 
diff --git a/memory_ldst.inc.c b/memory_ldst.inc.c
index c54aee4a95..1935436aff 100644
--- a/memory_ldst.inc.c
+++ b/memory_ldst.inc.c
@@ -42,6 +42,9 @@ static inline uint32_t glue(address_space_ldl_internal, SUFFIX)(ARG1_DECL,
                                         MO_32 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 4);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -110,6 +113,9 @@ static inline uint64_t glue(address_space_ldq_internal, SUFFIX)(ARG1_DECL,
                                         MO_64 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 8);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -175,6 +181,9 @@ uint32_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
         r = memory_region_dispatch_read(mr, addr1, &val, MO_8, attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 1);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         val = ldub_p(ptr);
         r = MEMTX_OK;
@@ -212,6 +221,9 @@ static inline uint32_t glue(address_space_lduw_internal, SUFFIX)(ARG1_DECL,
                                         MO_16 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 2);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/3] fuzz: add generic fuzzer
  2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
                   ` (2 preceding siblings ...)
  2020-06-11  5:56 ` [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions Alexander Bulekov
@ 2020-06-11  6:55 ` no-reply
  2020-06-23 14:16 ` Stefan Hajnoczi
  4 siblings, 0 replies; 14+ messages in thread
From: no-reply @ 2020-06-11  6:55 UTC (permalink / raw)
  To: alxndr; +Cc: darren.kenny, qemu-devel, f4bug, alxndr, bsd, stefanha

Patchew URL: https://patchew.org/QEMU/20200611055651.13784-1-alxndr@bu.edu/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200611055651.13784-1-alxndr@bu.edu
Subject: [RFC PATCH 0/3] fuzz: add generic fuzzer
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20200611055651.13784-1-alxndr@bu.edu -> patchew/20200611055651.13784-1-alxndr@bu.edu
Switched to a new branch 'test'
581b756 fuzz: Add callbacks for dma-access functions
efcea82 fuzz: add support for fuzzing DMA regions
03d7012 fuzz: add a general fuzzer for any qemu arguments

=== OUTPUT BEGIN ===
1/3 Checking commit 03d701265206 (fuzz: add a general fuzzer for any qemu arguments)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#23: 
new file mode 100644

ERROR: "foo* bar" should be "foo *bar"
#366: FILE: tests/qtest/fuzz/general_fuzz.c:339:
+    void (*ops[]) (QTestState* s, const unsigned char* , size_t) = {

total: 1 errors, 1 warnings, 461 lines checked

Patch 1/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

2/3 Checking commit efcea82301ce (fuzz: add support for fuzzing DMA regions)
ERROR: externs should be avoided in .c files
#35: FILE: tests/qtest/fuzz/general_fuzz.c:71:
+void dma_read_cb(size_t addr, size_t len);

total: 1 errors, 0 warnings, 147 lines checked

Patch 2/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/3 Checking commit 581b756ff038 (fuzz: Add callbacks for dma-access functions)
ERROR: space required before the open parenthesis '('
#20: FILE: exec.c:3251:
+    if(as->root == get_system_memory())

ERROR: space required before the open parenthesis '('
#31: FILE: exec.c:3563:
+        if(as->root == get_system_memory() && !is_write)

ERROR: braces {} are necessary for all arms of this statement
#31: FILE: exec.c:3563:
+        if(as->root == get_system_memory() && !is_write)
[...]

ERROR: space required before the open parenthesis '('
#42: FILE: exec.c:3574:
+    if(as->root == get_system_memory() && !is_write)

ERROR: braces {} are necessary for all arms of this statement
#42: FILE: exec.c:3574:
+    if(as->root == get_system_memory() && !is_write)
[...]

ERROR: space required before the open parenthesis '('
#53: FILE: exec.c:3650:
+    if(as->root == get_system_memory() && !is_write)

ERROR: braces {} are necessary for all arms of this statement
#53: FILE: exec.c:3650:
+    if(as->root == get_system_memory() && !is_write)
[...]

ERROR: braces {} are necessary for all arms of this statement
#128: FILE: include/sysemu/dma.h:109:
+    if (dir == DMA_DIRECTION_TO_DEVICE)
[...]

total: 8 errors, 0 warnings, 136 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200611055651.13784-1-alxndr@bu.edu/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-11  5:56 ` [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions Alexander Bulekov
@ 2020-06-23 14:14   ` Stefan Hajnoczi
  2020-06-23 14:55     ` Alexander Bulekov
  2020-07-09 23:48     ` Alexander Bulekov
  2020-06-24  9:46   ` Philippe Mathieu-Daudé
  1 sibling, 2 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2020-06-23 14:14 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: qemu-devel, f4bug, darren.kenny, bsd, stefanha, Paolo Bonzini,
	Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1492 bytes --]

On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  exec.c                                | 17 ++++++++++++++++-
>  include/exec/memory.h                 |  8 ++++++++
>  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
>  include/sysemu/dma.h                  |  5 ++++-
>  memory_ldst.inc.c                     | 12 ++++++++++++
>  5 files changed, 49 insertions(+), 2 deletions(-)

Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
function is clear.

The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
is undefined. In a header file:

  #ifdef CONFIG_FUZZ
  void fuzz_dma_read_cb(size_t addr, size_t len);
  #else
  static inline void fuzz_dma_read_cb(size_t addr, size_t len)
  {
      /* Do nothing */
  }
  #endif

Now the compiler should eliminate the deadcode:

  #ifdef CONFIG_FUZZ
  if (as->root == get_system_memory()) {
      dma_read_cb(addr, len);
  }
  #endif

becomes:

  if (as->root == get_system_memory()) {
      fuzz_dma_read_cb(addr, len);
  }

Hopefully gcc and clang will eliminate this and emit no instructions
when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
'is_write' too:

  void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)

This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
elimination becomes trivial for the compiler:

  fuzz_read_cb(as, is_write, addr, len);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/3] fuzz: add generic fuzzer
  2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
                   ` (3 preceding siblings ...)
  2020-06-11  6:55 ` [RFC PATCH 0/3] fuzz: add generic fuzzer no-reply
@ 2020-06-23 14:16 ` Stefan Hajnoczi
  2020-06-25 15:30   ` Dima Stepanov
  4 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2020-06-23 14:16 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: qemu-devel, f4bug, darren.kenny, bsd, dstepanov.src, stefanha

[-- Attachment #1: Type: text/plain, Size: 2016 bytes --]

On Thu, Jun 11, 2020 at 01:56:48AM -0400, Alexander Bulekov wrote:
> These patches add a generic fuzzer for virtual devices. This should
> allow us to fuzz devices that accept inputs over MMIO, PIO and DMA
> without any device-specific code.
> 
> Example:
> QEMU_FUZZ_ARGS="-device virtio-net" \
> FUZZ_REGION_WHITELIST="virtio pci-" \
> ./i386-softmmu/qemu-fuzz-i386 --fuzz-target=general-pci-enum-fuzz
> 
> The above command will add a virtio-net device to the QEMU arguments and
> restrict the fuzzer to only interact with MMIO and PIO regions with
> names that contain "virtio" or "pci-". I find these names using the info
> mtree monitor command. 
> 
> Basically, the fuzzer splits the input into a series of commands, such
> as mmio_write, pio_write, etc. Additionally, these patches add "hooks"
> to functions that are typically used by virtual-devices to read from RAM
> (DMA). These hooks attempt to populate these DMA regions with fuzzed
> data, just in time.  There are some differences from my reference code
> that seem to result in performance issues that I am still trying to iron
> out. I also need to figure out how to add the DMA "hooks" in a neat way.
> Maybe I can use -Wl,--wrap for this. I appreciate any feedback.
> 
> Alexander Bulekov (3):
>   fuzz: add a general fuzzer for any qemu arguments
>   fuzz: add support for fuzzing DMA regions
>   fuzz: Add callbacks for dma-access functions
> 
>  exec.c                                |  17 +-
>  include/exec/memory.h                 |   8 +
>  include/exec/memory_ldst_cached.inc.h |   9 +
>  include/sysemu/dma.h                  |   5 +-
>  memory_ldst.inc.c                     |  12 +
>  tests/qtest/fuzz/Makefile.include     |   1 +
>  tests/qtest/fuzz/general_fuzz.c       | 556 ++++++++++++++++++++++++++
>  7 files changed, 606 insertions(+), 2 deletions(-)
>  create mode 100644 tests/qtest/fuzz/general_fuzz.c

CCing Dima in case he is interested in this generic fuzzing approach.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-23 14:14   ` Stefan Hajnoczi
@ 2020-06-23 14:55     ` Alexander Bulekov
  2020-06-26 15:44       ` Stefan Hajnoczi
  2020-07-09 23:48     ` Alexander Bulekov
  1 sibling, 1 reply; 14+ messages in thread
From: Alexander Bulekov @ 2020-06-23 14:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, f4bug, darren.kenny, bsd, stefanha, Paolo Bonzini,
	Richard Henderson

On 200623 1514, Stefan Hajnoczi wrote:
> On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > ---
> >  exec.c                                | 17 ++++++++++++++++-
> >  include/exec/memory.h                 |  8 ++++++++
> >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> >  include/sysemu/dma.h                  |  5 ++++-
> >  memory_ldst.inc.c                     | 12 ++++++++++++
> >  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> function is clear.
> 
> The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> is undefined. In a header file:
> 
>   #ifdef CONFIG_FUZZ
>   void fuzz_dma_read_cb(size_t addr, size_t len);
>   #else
>   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
>   {
>       /* Do nothing */
>   }
>   #endif
> 
> Now the compiler should eliminate the deadcode:
> 
>   #ifdef CONFIG_FUZZ
>   if (as->root == get_system_memory()) {
>       dma_read_cb(addr, len);
>   }
>   #endif
> 
> becomes:
> 
>   if (as->root == get_system_memory()) {
>       fuzz_dma_read_cb(addr, len);
>   }
> 
> Hopefully gcc and clang will eliminate this and emit no instructions
> when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> 'is_write' too:
> 
>   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> 
> This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> elimination becomes trivial for the compiler:
> 
>   fuzz_read_cb(as, is_write, addr, len);

Do you think it would be better to have a "trace_dma_read" or
"trace_device_read_from_guest_memory"? I haven't looked under the hood
too much for the tracepoint api, but these would just be standard
tracepoints(disabled for the majority of builds). When we build the
fuzzer, we could compile with --wrap="trace_dma_read" and implement
a __wrap_trace_dma_read in the generic fuzzer. I looked at the symbols
for a qemu build and it looks like trace_* are actual functions, rather
than preprocessor magic, so maybe this could work?
-Alex


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-11  5:56 ` [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions Alexander Bulekov
  2020-06-23 14:14   ` Stefan Hajnoczi
@ 2020-06-24  9:46   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-24  9:46 UTC (permalink / raw)
  To: Alexander Bulekov, qemu-devel
  Cc: darren.kenny, bsd, Richard Henderson, stefanha, Paolo Bonzini

On 6/11/20 7:56 AM, Alexander Bulekov wrote:
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  exec.c                                | 17 ++++++++++++++++-
>  include/exec/memory.h                 |  8 ++++++++
>  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
>  include/sysemu/dma.h                  |  5 ++++-
>  memory_ldst.inc.c                     | 12 ++++++++++++
>  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index be4be2df3a..2ed724ab54 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3247,7 +3247,10 @@ MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr,
>  {
>      MemTxResult result = MEMTX_OK;
>      FlatView *fv;
> -
> +#ifdef CONFIG_FUZZ
> +    if(as->root == get_system_memory())

Since it is local to exec.c, you can directly use system_memory.

But why restrict this to the system memory anyway?

> +        dma_read_cb(addr, len);
> +#endif



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/3] fuzz: add generic fuzzer
  2020-06-23 14:16 ` Stefan Hajnoczi
@ 2020-06-25 15:30   ` Dima Stepanov
  0 siblings, 0 replies; 14+ messages in thread
From: Dima Stepanov @ 2020-06-25 15:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: darren.kenny, f4bug, qemu-devel, Alexander Bulekov, bsd, stefanha

On Tue, Jun 23, 2020 at 03:16:01PM +0100, Stefan Hajnoczi wrote:
> On Thu, Jun 11, 2020 at 01:56:48AM -0400, Alexander Bulekov wrote:
> > These patches add a generic fuzzer for virtual devices. This should
> > allow us to fuzz devices that accept inputs over MMIO, PIO and DMA
> > without any device-specific code.
> > 
> > Example:
> > QEMU_FUZZ_ARGS="-device virtio-net" \
> > FUZZ_REGION_WHITELIST="virtio pci-" \
> > ./i386-softmmu/qemu-fuzz-i386 --fuzz-target=general-pci-enum-fuzz
> > 
> > The above command will add a virtio-net device to the QEMU arguments and
> > restrict the fuzzer to only interact with MMIO and PIO regions with
> > names that contain "virtio" or "pci-". I find these names using the info
> > mtree monitor command. 
> > 
> > Basically, the fuzzer splits the input into a series of commands, such
> > as mmio_write, pio_write, etc. Additionally, these patches add "hooks"
> > to functions that are typically used by virtual-devices to read from RAM
> > (DMA). These hooks attempt to populate these DMA regions with fuzzed
> > data, just in time.  There are some differences from my reference code
> > that seem to result in performance issues that I am still trying to iron
> > out. I also need to figure out how to add the DMA "hooks" in a neat way.
> > Maybe I can use -Wl,--wrap for this. I appreciate any feedback.
> > 
> > Alexander Bulekov (3):
> >   fuzz: add a general fuzzer for any qemu arguments
> >   fuzz: add support for fuzzing DMA regions
> >   fuzz: Add callbacks for dma-access functions
> > 
> >  exec.c                                |  17 +-
> >  include/exec/memory.h                 |   8 +
> >  include/exec/memory_ldst_cached.inc.h |   9 +
> >  include/sysemu/dma.h                  |   5 +-
> >  memory_ldst.inc.c                     |  12 +
> >  tests/qtest/fuzz/Makefile.include     |   1 +
> >  tests/qtest/fuzz/general_fuzz.c       | 556 ++++++++++++++++++++++++++
> >  7 files changed, 606 insertions(+), 2 deletions(-)
> >  create mode 100644 tests/qtest/fuzz/general_fuzz.c
> 
> CCing Dima in case he is interested in this generic fuzzing approach.
> 
> Stefan
Thanks for adding me, going to look into it on this weekend.

Dima.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-23 14:55     ` Alexander Bulekov
@ 2020-06-26 15:44       ` Stefan Hajnoczi
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2020-06-26 15:44 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: Stefan Hajnoczi, qemu-devel, f4bug, darren.kenny, bsd,
	Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 2700 bytes --]

On Tue, Jun 23, 2020 at 10:55:00AM -0400, Alexander Bulekov wrote:
> On 200623 1514, Stefan Hajnoczi wrote:
> > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > ---
> > >  exec.c                                | 17 ++++++++++++++++-
> > >  include/exec/memory.h                 |  8 ++++++++
> > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > >  include/sysemu/dma.h                  |  5 ++++-
> > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > 
> > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > function is clear.
> > 
> > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > is undefined. In a header file:
> > 
> >   #ifdef CONFIG_FUZZ
> >   void fuzz_dma_read_cb(size_t addr, size_t len);
> >   #else
> >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> >   {
> >       /* Do nothing */
> >   }
> >   #endif
> > 
> > Now the compiler should eliminate the deadcode:
> > 
> >   #ifdef CONFIG_FUZZ
> >   if (as->root == get_system_memory()) {
> >       dma_read_cb(addr, len);
> >   }
> >   #endif
> > 
> > becomes:
> > 
> >   if (as->root == get_system_memory()) {
> >       fuzz_dma_read_cb(addr, len);
> >   }
> > 
> > Hopefully gcc and clang will eliminate this and emit no instructions
> > when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> > 'is_write' too:
> > 
> >   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> > 
> > This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> > elimination becomes trivial for the compiler:
> > 
> >   fuzz_read_cb(as, is_write, addr, len);
> 
> Do you think it would be better to have a "trace_dma_read" or
> "trace_device_read_from_guest_memory"? I haven't looked under the hood
> too much for the tracepoint api, but these would just be standard
> tracepoints(disabled for the majority of builds). When we build the
> fuzzer, we could compile with --wrap="trace_dma_read" and implement
> a __wrap_trace_dma_read in the generic fuzzer. I looked at the symbols
> for a qemu build and it looks like trace_* are actual functions, rather
> than preprocessor magic, so maybe this could work?

I think plain old functions are fine for what you are doing.

QEMU's tracing does not provide callbacks that are invoked when a trace
event is emitted. The fuzzing code wouldn't be able to hook
trace_device_read_from_guest_memory, you could only analyze a trace
after the fact.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-06-23 14:14   ` Stefan Hajnoczi
  2020-06-23 14:55     ` Alexander Bulekov
@ 2020-07-09 23:48     ` Alexander Bulekov
  2020-07-13 11:41       ` Stefan Hajnoczi
  1 sibling, 1 reply; 14+ messages in thread
From: Alexander Bulekov @ 2020-07-09 23:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, f4bug, darren.kenny, bsd, stefanha, Paolo Bonzini,
	Richard Henderson

On 200623 1514, Stefan Hajnoczi wrote:
> On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > ---
> >  exec.c                                | 17 ++++++++++++++++-
> >  include/exec/memory.h                 |  8 ++++++++
> >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> >  include/sysemu/dma.h                  |  5 ++++-
> >  memory_ldst.inc.c                     | 12 ++++++++++++
> >  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> function is clear.
> 
> The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> is undefined. In a header file:
> 
>   #ifdef CONFIG_FUZZ
>   void fuzz_dma_read_cb(size_t addr, size_t len);
>   #else
>   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
>   {
>       /* Do nothing */
>   }
>   #endif
> 

If I understand correctly, this still has the problem that normal
qemu-system builds under --enable-fuzzing are broken. I'm not sure if
there is some nice solution for this. For example, a sort-of ugly
solution could add this to softmmu/main.c (ie something that is linked
for the qemu-system build, but not for qemu-fuzz).

#ifdef CONFIG_FUZZ
#include "something.h"
static void fuzz_dma_read_cb(size_t addr, size_t len)
{
    /* Do nothing */
}
#endif

> Now the compiler should eliminate the deadcode:
> 
>   #ifdef CONFIG_FUZZ
>   if (as->root == get_system_memory()) {
>       dma_read_cb(addr, len);
>   }
>   #endif
> 
> becomes:
> 
>   if (as->root == get_system_memory()) {
>       fuzz_dma_read_cb(addr, len);
>   }
> 
> Hopefully gcc and clang will eliminate this and emit no instructions
> when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> 'is_write' too:
> 
>   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> 
Yes I'll add in is_write. Hopefully that will make life easier for the
compiler.
Thanks

> This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> elimination becomes trivial for the compiler:
> 
>   fuzz_read_cb(as, is_write, addr, len);



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-07-09 23:48     ` Alexander Bulekov
@ 2020-07-13 11:41       ` Stefan Hajnoczi
  2020-07-13 11:52         ` Alexander Bulekov
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2020-07-13 11:41 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: Stefan Hajnoczi, qemu-devel, f4bug, darren.kenny, bsd,
	Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1736 bytes --]

On Thu, Jul 09, 2020 at 07:48:55PM -0400, Alexander Bulekov wrote:
> On 200623 1514, Stefan Hajnoczi wrote:
> > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > ---
> > >  exec.c                                | 17 ++++++++++++++++-
> > >  include/exec/memory.h                 |  8 ++++++++
> > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > >  include/sysemu/dma.h                  |  5 ++++-
> > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > 
> > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > function is clear.
> > 
> > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > is undefined. In a header file:
> > 
> >   #ifdef CONFIG_FUZZ
> >   void fuzz_dma_read_cb(size_t addr, size_t len);
> >   #else
> >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> >   {
> >       /* Do nothing */
> >   }
> >   #endif
> > 
> 
> If I understand correctly, this still has the problem that normal
> qemu-system builds under --enable-fuzzing are broken. I'm not sure if
> there is some nice solution for this. For example, a sort-of ugly
> solution could add this to softmmu/main.c (ie something that is linked
> for the qemu-system build, but not for qemu-fuzz).
> 
> #ifdef CONFIG_FUZZ
> #include "something.h"
> static void fuzz_dma_read_cb(size_t addr, size_t len)
> {
>     /* Do nothing */
> }
> #endif

Another ugly solution is using weak symbols in the main code and a
strong symbol in the fuzzer target:
https://en.wikipedia.org/wiki/Weak_symbol

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions
  2020-07-13 11:41       ` Stefan Hajnoczi
@ 2020-07-13 11:52         ` Alexander Bulekov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Bulekov @ 2020-07-13 11:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Stefan Hajnoczi, qemu-devel, f4bug, darren.kenny, bsd,
	Paolo Bonzini, Richard Henderson

On 200713 1241, Stefan Hajnoczi wrote:
> On Thu, Jul 09, 2020 at 07:48:55PM -0400, Alexander Bulekov wrote:
> > On 200623 1514, Stefan Hajnoczi wrote:
> > > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > > ---
> > > >  exec.c                                | 17 ++++++++++++++++-
> > > >  include/exec/memory.h                 |  8 ++++++++
> > > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > > >  include/sysemu/dma.h                  |  5 ++++-
> > > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > > 
> > > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > > function is clear.
> > > 
> > > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > > is undefined. In a header file:
> > > 
> > >   #ifdef CONFIG_FUZZ
> > >   void fuzz_dma_read_cb(size_t addr, size_t len);
> > >   #else
> > >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> > >   {
> > >       /* Do nothing */
> > >   }
> > >   #endif
> > > 
> > 
> > If I understand correctly, this still has the problem that normal
> > qemu-system builds under --enable-fuzzing are broken. I'm not sure if
> > there is some nice solution for this. For example, a sort-of ugly
> > solution could add this to softmmu/main.c (ie something that is linked
> > for the qemu-system build, but not for qemu-fuzz).
> > 
> > #ifdef CONFIG_FUZZ
> > #include "something.h"
> > static void fuzz_dma_read_cb(size_t addr, size_t len)
> > {
> >     /* Do nothing */
> > }
> > #endif
> 
> Another ugly solution is using weak symbols in the main code and a
> strong symbol in the fuzzer target:
> https://en.wikipedia.org/wiki/Weak_symbol

Ok - I'll try that out. I think we'll also need a check in the actual
dma_read_cb function to confirm that we are actually the general-fuzzer.
We don't want to be hooking accesses while e.g. running the non-general
virtio-net fuzzer.

-Alex

> Stefan




^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-07-13 11:54 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-11  5:56 [RFC PATCH 0/3] fuzz: add generic fuzzer Alexander Bulekov
2020-06-11  5:56 ` [RFC PATCH 1/3] fuzz: add a general fuzzer for any qemu arguments Alexander Bulekov
2020-06-11  5:56 ` [RFC PATCH 2/3] fuzz: add support for fuzzing DMA regions Alexander Bulekov
2020-06-11  5:56 ` [RFC PATCH 3/3] fuzz: Add callbacks for dma-access functions Alexander Bulekov
2020-06-23 14:14   ` Stefan Hajnoczi
2020-06-23 14:55     ` Alexander Bulekov
2020-06-26 15:44       ` Stefan Hajnoczi
2020-07-09 23:48     ` Alexander Bulekov
2020-07-13 11:41       ` Stefan Hajnoczi
2020-07-13 11:52         ` Alexander Bulekov
2020-06-24  9:46   ` Philippe Mathieu-Daudé
2020-06-11  6:55 ` [RFC PATCH 0/3] fuzz: add generic fuzzer no-reply
2020-06-23 14:16 ` Stefan Hajnoczi
2020-06-25 15:30   ` Dima Stepanov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.