All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laszlo Ersek <lersek@redhat.com>
To: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: peter.maydell@linaro.org, jordan.l.justen@intel.com,
	qemu-devel@nongnu.org, kraxel@redhat.com, pbonzini@redhat.com,
	markmb@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
Date: Wed, 4 Nov 2015 14:01:16 +0100	[thread overview]
Message-ID: <563A019C.80405@redhat.com> (raw)
In-Reply-To: <20151103220337.GH10717@HEDWIG.INI.CMU.EDU>

On 11/03/15 23:03, Gabriel L. Somlo wrote:
> On Tue, Nov 03, 2015 at 10:35:36PM +0100, Laszlo Ersek wrote:
>> On 11/03/15 18:55, Gabriel L. Somlo wrote:
>>> On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
>>>> Thank you for splitting out this patch; it makes it easier to review.
>>>> However,
>>>>
>>>> On 11/03/15 01:35, Gabriel L. Somlo wrote:
>>>>> Introduce fw_cfg_data_read(), a generic read method which works
>>>>> on all access widths (1 through 8 bytes, inclusive), and can be
>>>>> used during both IOPort and MMIO read accesses.
>>>>>
>>>>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
>>>>> data read method) is replaced by this patch. The new method
>>>>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
>>>>> combo, but without unnecessarily repeating all the validity
>>>>> checks performed by the latter on each byte being read.
>>>>
>>>> this unwinding caused a bug to creep in.
>>>>
>>>> Namely, we have to identify the set of data that remains constant
>>>> between *all* "size" calls that fw_cfg_data_mem_read() makes to
>>>> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
>>>>
>>>> Specifically,
>>>>
>>>>> This patch also modifies the trace_fw_cfg_read prototype to
>>>>> accept a 64-bit value argument, allowing it to work properly
>>>>> with the new read method, but also remain backward compatible
>>>>> with existing call sites.
>>>>>
>>>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>>>> Cc: Marc Marí <markmb@redhat.com>
>>>>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
>>>>> ---
>>>>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
>>>>>  trace-events      |  2 +-
>>>>>  2 files changed, 20 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>>>>> index c2d3a0a..8aa980c 100644
>>>>> --- a/hw/nvram/fw_cfg.c
>>>>> +++ b/hw/nvram/fw_cfg.c
>>>>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>>>>>      return ret;
>>>>>  }
>>>>>  
>>>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>>>> +{
>>>>> +    FWCfgState *s = opaque;
>>>>
>>>> This is good.
>>>>
>>>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>>>
>>>> Okay too.
>>>>
>>>>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> (1) Side point: the conversion here is faithful to the original code in
>>>> fw_cfg_read(), but even in the original code, the expression uses
>>>> "s->cur_entry" as a (masked) subscript *before* comparing it against
>>>> FW_CFG_INVALID. I don't think that's right.
>>>>
>>>> The same issue is present in fw_cfg_dma_transfer(). Care to write a
>>>> patch (before the restructuring) that fixes both?
>>>>
>>>> Note, I am aware that the expression in both of the above mentioned
>>>> functions only calculates the *address* of the nonexistent element
>>>> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
>>>>
>>>>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> But it doesn't matter; it's undefined behavior just the same. Instead,
>>>> *both* locations should say:
>>>>
>>>>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>>>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> (I share the blame for not noticing this earlier -- I too reviewed
>>>> fw_cfg_dma_transfer().)
>>>>
>>>> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
>>>> current address-of expression is not valid even for evaluation. Also, in
>>>> practice, dereferencing NULL would give us a nice (as in, non-garbage)
>>>> SIGSEGV.
>>>
>>> Done.
>>>
>>>>
>>>> Anyway, back to the topic at hand:
>>>>
>>>>> +    uint64_t value = 0;
>>>>> +
>>>>> +    assert(size <= sizeof(value));
>>>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>>>
>>>> Right, good conversion. (Side note: this does protect against
>>>> *dereferencing* "e", but it's already too late, as far as undefined
>>>> behavior is concerned.)
>>>>
>>>>> +        while (size-- && s->cur_offset < e->len) {
>>>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>>>> +        }
>>>>
>>>> (2) So, this is the bug. The pre-conversion code would keep shifting
>>>> "value" to the left until "size" was reached, regardless of the
>>>> underlying blob size, and just leave the least significant bytes zeroed
>>>> if the item ended too early. Whereas this loop *stops shifting* when the
>>>> blob ends.
>>>
>>> D'OH!!! That should teach me to pay more attention -- thanks for
>>> catching it!
>>>
>>>> Since the wide data register (which is big-endian) implements a
>>>> substring-preserving transfer (on top of QEMU's integer preserving
>>>> device r/w infrastructure), this change breaks the case when the
>>>> firmware reads, say, 8 bytes from the register in a single access, when
>>>> only 3 are left in the blob, and then uses only the three *lowest
>>>> address* bytes from the uint64_t value read. Although no known firmware
>>>> does this at the moment, it would be valid, and the above hunk would
>>>> break it.
>>>>
>>>> Hence please
>>>>
>>>> (2a) either append the missing "cumulative" shift after the loop:
>>>>
>>>>     while (size && s->cur_offset < e->len) {
>>>>         --size;
>>>>         value = (value << 8) | e->data[s->cur_offset++];
>>>>     }
>>>>     value <<= 8 * size;
>>>
>>> I went with 2a. Also added a comment to make things painfully obvious
>>> to any potential future archaeologists:
>>>
>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>> +{
>>> +    FWCfgState *s = opaque;
>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>> +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>> +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>> +    uint64_t value = 0;
>>> +
>>> +    assert(size <= sizeof(value));
>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>> +        /* The least significant 'size' bytes of the return value are
>>> +         * expected to contain a string preserving portion of the item
>>> +         * data, padded with zeros to the right in case we run out early.
>>
>> Please say "*on* the right" here, just like it reads below (emphasis
>> added only for review purposes).
> 
> Done.
> 
>> Also, while the above seems correct, I prefer my own wording from commit
>> 3c23402d4032:
>>
>>   The solution is to compose the host-endian representation [...] of
>>   the big endian interpretation [...] of the fw_cfg string [...]
>>
>> I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
>> ;)) -- my preference could be harder to interpret for "future
>> archeologist". So I'll leave it to you whether to keep yours, pick mine,
>> or run with a mixture / union.
> 
> You mean commit  36b62ae, I think :)

Sigh, yes. Sorry. Must have been multi-tasking too heavily.

> I'm going to go with a "union",
> since the "string preserving" verbiage is also in use (by our mutual
> agreement) in docs/specs/fw_cfg.txt :)
> 
> So that comment will read:
> 
>         /* The least significant 'size' bytes of the return value are
>          * expected to contain a string preserving portion of the item
>          * data, padded with zeros on the right in case we run out early.
>          * In technical terms, we're composing the host-endian representation
>          * of the big endian interpretation of the fw_cfg string.
>          */

Sounds great, thanks!
Laszlo

> 
> ... when I'll send out v5.
> 
> Thanks,
> --Gabriel
> 
>>> +         */
>>> +        while (size && s->cur_offset < e->len) {
>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>> +            size--;
>>> +        }
>>> +        /* If size is still not zero, we *did* run out early, so finish
>>> +         * left-shifting to add the appropriate number of padding zeros
>>> +         * on the right.
>>> +         */
>>> +        value <<= 8 * size;
>>> +    }
>>> +
>>> +    trace_fw_cfg_read(s, value);
>>> +    return value;
>>> +}
>>>
>>> Version 4 should be out by the end of today.
>>>
>>> Thanks again,
>>> --Gabriel
>>>
>>>>
>>>> (2b) or move the offset check from the loop's controlling expression
>>>> into the value composition:
>>>>
>>>>         while (size--) {
>>>>             value = (value << 8) | (s->cur_offset < e->len ?
>>>>                                     e->data[s->cur_offset++] :
>>>>                                     0);
>>>>         }
>>>>
>>>> The rest looks good.
>>>>
>>>> Thanks
>>>> Laszlo
>>>>
>>>>> +    }
>>>>> +
>>>>> +    trace_fw_cfg_read(s, value);
>>>>> +    return value;
>>>>> +}
>>>>> +
>>>>>  static uint8_t fw_cfg_read(FWCfgState *s)
>>>>>  {
>>>>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>>>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>>>>>      return ret;
>>>>>  }
>>>>>  
>>>>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
>>>>> -                                     unsigned size)
>>>>> -{
>>>>> -    FWCfgState *s = opaque;
>>>>> -    uint64_t value = 0;
>>>>> -    unsigned i;
>>>>> -
>>>>> -    for (i = 0; i < size; ++i) {
>>>>> -        value = (value << 8) | fw_cfg_read(s);
>>>>> -    }
>>>>> -    return value;
>>>>> -}
>>>>> -
>>>>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>>>>>                                    uint64_t value, unsigned size)
>>>>>  {
>>>>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
>>>>>  };
>>>>>  
>>>>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
>>>>> -    .read = fw_cfg_data_mem_read,
>>>>> +    .read = fw_cfg_data_read,
>>>>>      .write = fw_cfg_data_mem_write,
>>>>>      .endianness = DEVICE_BIG_ENDIAN,
>>>>>      .valid = {
>>>>> diff --git a/trace-events b/trace-events
>>>>> index 72136b9..5073040 100644
>>>>> --- a/trace-events
>>>>> +++ b/trace-events
>>>>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
>>>>>  
>>>>>  # hw/nvram/fw_cfg.c
>>>>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
>>>>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
>>>>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
>>>>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
>>>>>  
>>>>>  # hw/block/hd-geometry.c
>>>>>
>>>>
>>

  reply	other threads:[~2015-11-04 13:01 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
2015-11-03  9:51   ` Laszlo Ersek
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
2015-11-03 10:53   ` Laszlo Ersek
2015-11-03 17:55     ` Gabriel L. Somlo
2015-11-03 21:35       ` Laszlo Ersek
2015-11-03 21:44         ` Gabriel L. Somlo
2015-11-03 22:03         ` Gabriel L. Somlo
2015-11-04 13:01           ` Laszlo Ersek [this message]
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
2015-11-03 11:11   ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=563A019C.80405@redhat.com \
    --to=lersek@redhat.com \
    --cc=jordan.l.justen@intel.com \
    --cc=kraxel@redhat.com \
    --cc=markmb@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=somlo@cmu.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.