On 6/15/21 10:22 AM, Alex Bennée wrote:
> Mahmoud Mandour <ma.mandourr@gmail.com> writes:
> 
>> On 14/06/2021 11:01, Alexandre Iooss wrote:
>>> Log instruction execution and memory access to a file.
>>> This plugin can be used for reverse engineering or for side-channel analysis
>>> using QEMU.
>>>
>>> Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
>>> ---
>>>    MAINTAINERS               |   1 +
>>>    contrib/plugins/Makefile  |   1 +
>>>    contrib/plugins/execlog.c | 112 ++++++++++++++++++++++++++++++++++++++
>>>    3 files changed, 114 insertions(+)
>>>    create mode 100644 contrib/plugins/execlog.c
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 7d9cd29042..65942d5802 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -2974,6 +2974,7 @@ F: include/tcg/
>>>      TCG Plugins
>>>    M: Alex BennÃ©e <alex.bennee@linaro.org>
>>> +R: Alexandre Iooss <erdnaxe@crans.org>
>>>    S: Maintained
>>>    F: docs/devel/tcg-plugins.rst
>>>    F: plugins/
>>> diff --git a/contrib/plugins/Makefile b/contrib/plugins/Makefile
>>> index b9d7935e5e..51093acd17 100644
>>> --- a/contrib/plugins/Makefile
>>> +++ b/contrib/plugins/Makefile
>>> @@ -13,6 +13,7 @@ include $(BUILD_DIR)/config-host.mak
>>>    VPATH += $(SRC_PATH)/contrib/plugins
>>>      NAMES :=
>>> +NAMES += execlog
>>>    NAMES += hotblocks
>>>    NAMES += hotpages
>>>    NAMES += howvec
>>> diff --git a/contrib/plugins/execlog.c b/contrib/plugins/execlog.c
>>> new file mode 100644
>>> index 0000000000..80716e8eed
>>> --- /dev/null
>>> +++ b/contrib/plugins/execlog.c
>>> @@ -0,0 +1,112 @@
>>> +/*
>>> + * Copyright (C) 2021, Alexandre Iooss <erdnaxe@crans.org>
>>> + *
>>> + * Log instruction execution and memory access to a file.
>>> + * You may pass the output filename as argument.
>>> + *
>>> + * License: GNU GPL, version 2 or later.
>>> + *   See the COPYING file in the top-level directory.
>>> + */
>>> +#include <glib.h>
>>> +#include <inttypes.h>
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <string.h>
>>> +#include <unistd.h>
>>> +
>>> +#include <qemu-plugin.h>
>>> +
>>> +QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
>>> +
>>> +/* Execution trace buffer */
>>> +FILE *output;
>>
>> Is there a reason for using a regular FILE * instead of using
>> `qemu_plugin_outs()`?
>>
>> I don't see a benefit and since there's an API for logging I guess
>> it's better to use it instead
> 
> I would prefer using qemu_plugin_outs unless there is a reason to have
> multiple streams.

I was not aware of `-D` option in QEMU that does exactly what I need. So 
I am going to use qemu_plugin_outs for next version of this patch.

>>> +
>>> +/**
>>> + * Log memory read or write
>>> + */
>>> +static void vcpu_mem(unsigned int vcpu_index, qemu_plugin_meminfo_t info,
>>> +                     uint64_t vaddr, void *udata)
>>> +{
>>> +    struct qemu_plugin_hwaddr *hwaddr = qemu_plugin_get_hwaddr(info, vaddr);
>>> +    if (!hwaddr) {
>>> +        return;
>>> +    }
>>
>> So you just reject all memory accesses in user mode? I think that it
>> equally useful
>>
>> to log only the virtual address of a memory access in user-mode
>> emulation? However, we better
>>
>> wait for Alex's opinion on all this since he had remarks about
>> introducing a new ad-hoc
>>
>> tracing format...
> 
> It would be nice to integrate with the QEMU tracing support but
> following Stefan's comments I think it will be fairly involved to work
> out a way to do it. In the meantime using plain text logging will have
> to do.

I will change this to log the vaddr in user mode without device name.

>>> +
>>> +    /* Add data to execution log */
>>> +    const char *name = qemu_plugin_hwaddr_device_name(hwaddr);
>>> +    uint64_t addr = qemu_plugin_hwaddr_phys_addr(hwaddr);
>>> +    if (qemu_plugin_mem_is_store(info)) {
>>> +        fprintf(output, "mem: %s store at 0x%08lx\n", name, addr);
>>> +    } else {
>>> +        fprintf(output, "mem: %s load at 0x%08lx\n", name, addr);
>>> +    }
> 
> I would consider a comma separated output to make the file easily
> digestible by other tools. Maybe:
> 
>    memory, 0x%08lx, load|store
> 

Agree, I will change this.

>>> +}
>>> +
>>> +/**
>>> + * Log instruction execution
>>> + */
>>> +static void vcpu_insn_exec(unsigned int cpu_index, void *udata)
>>> +{
>>> +    char *insn_disas = (char *)udata;
>>> +
>>> +    /* Add data to execution log */
>>> +    fprintf(output, "insn: %s\n", insn_disas);
> 
>    insn, 0x%08lx, disas
> 
> Currently however on a multi-threaded execution you will potentially
> loose any synchronisation between the insn_exec and any memory operation
> associated with it. If you really just care about what's tickling
> hardware you could just drop the insn_exec callback and pass the
> instruction information via udata to the vcpu_mem callback. You could
> then dump everything in one line:
> 
>    0xPC, ld [x1], x2, 0xADDR, load|store
> 
> you wouldn't dump other instructions leading up to that though.

You are correct, this is indeed an issue and it's currently making my 
life really hard when I try to apply a side-channel model on the memory 
interactions.
I prefer to log all instructions, so I need to use vcpu_mem_cb when it's 
a memory instruction, or vcpu_insn_exec_cb if it's not.

>>> +}
>>> +
>>> +/**
>>> + * On translation block new translation
>>> + *
>>> + * QEMU convert code by translation block (TB). By hooking here we can then hook
>>> + * a callback on each instruction and memory access.
>>> + */
>>> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
>>> +{
>>> +    size_t n = qemu_plugin_tb_n_insns(tb);
>>> +    for (size_t i = 0; i < n; i++) {
>>> +        /* insn is shared between translations in QEMU, copy needed data here */
>>> +        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb,
>>> i);
> 
> If you had some knowledge of the guest architecture you could attempt a
> partial decode of the op and work out if it was a load/store and then
> conditionally register the mem or exec callback.

Agree. The TCG API does not offer a way to know from a insn if it's 
something that is going to trigger a vcpu_mem_cb, so I will add some 
code to decode this information from the opcode.
When the architecture is unknown, I might default to exec callback.

>>> +        char *insn_disas = qemu_plugin_insn_disas(insn);
>>> +
>>> +        /* Register callback on memory read or write */
>>> +        qemu_plugin_register_vcpu_mem_cb(insn, vcpu_mem,
>>> +                                         QEMU_PLUGIN_CB_NO_REGS,
>>> +                                         QEMU_PLUGIN_MEM_RW, NULL);
>>> +
>>> +        /* Register callback on instruction */
>>> +        qemu_plugin_register_vcpu_insn_exec_cb(
>>> +            insn, vcpu_insn_exec, QEMU_PLUGIN_CB_R_REGS, insn_disas);
>>> +    }
>>> +}
>>> +
>>> +/**
>>> + * On plugin exit, close output file
>>> + */
>>> +static void plugin_exit(qemu_plugin_id_t id, void *p)
>>> +{
>>> +    fclose(output);
>>> +}
>>> +
>>> +/**
>>> + * Install the plugin
>>> + */
>>> +QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
>>> +                                           const qemu_info_t *info, int argc,
>>> +                                           char **argv)
>>> +{
>>> +    /* Parse arguments to get output name and open for writing */
>>> +    char *filename = "execution.log";
>>> +    if (argc > 0) {
>>> +        filename = argv[0];
>>> +    }
>>> +    output = fopen(filename, "w");
>>> +    if (output == NULL) {
>>> +        qemu_plugin_outs("Cannot open output file for writing.\n");
>>
>> Here, I think that it's more logical to output error messages to
>> stderr since
>>
>> logging can be redirected to a file so QEMU will error but the error message
>>
>> would be written to the log file.
> 
> Yes please, report errors directly to stderr as the execution is going
> to trigger and exit at this point and the user needs direct feeback.
> 

I am going to remove this section of code as I don't need to open a file 
anymore.

>>
>>> +        return -1;
>>> +    }
>>> +
>>> +    /* Register translation block and exit callbacks */
>>> +    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
>>> +    qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
>>> +
>>> +    return 0;
>>> +}
>>
>>
>> Thanks,
>>
>> Mahmoud
>>
>>
>>
> 

Thanks,
-- Alexandre