From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42577)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <somlo@cmu.edu>) id 1ZtjNb-0008WP-CI
	for qemu-devel@nongnu.org; Tue, 03 Nov 2015 16:44:37 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <somlo@cmu.edu>) id 1ZtjNW-0003Gd-9M
	for qemu-devel@nongnu.org; Tue, 03 Nov 2015 16:44:35 -0500
Received: from smtp.andrew.cmu.edu ([128.2.157.38]:37895)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <somlo@cmu.edu>) id 1ZtjNW-0003GZ-2k
	for qemu-devel@nongnu.org; Tue, 03 Nov 2015 16:44:30 -0500
Date: Tue, 3 Nov 2015 16:44:26 -0500
From: "Gabriel L. Somlo" <somlo@cmu.edu>
Message-ID: <20151103214426.GG10717@HEDWIG.INI.CMU.EDU>
References: <1446510945-18477-1-git-send-email-somlo@cmu.edu>
	<1446510945-18477-5-git-send-email-somlo@cmu.edu>
	<56389243.4040106@redhat.com>
	<20151103175515.GF10717@HEDWIG.INI.CMU.EDU>
	<563928A8.5030907@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <563928A8.5030907@redhat.com>
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read
	method
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Laszlo Ersek <lersek@redhat.com>
Cc: peter.maydell@linaro.org, jordan.l.justen@intel.com, qemu-devel@nongnu.org, kraxel@redhat.com, pbonzini@redhat.com, markmb@redhat.com

On Tue, Nov 03, 2015 at 10:35:36PM +0100, Laszlo Ersek wrote:
> On 11/03/15 18:55, Gabriel L. Somlo wrote:
> > On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
> >> Thank you for splitting out this patch; it makes it easier to review.
> >> However,
> >>
> >> On 11/03/15 01:35, Gabriel L. Somlo wrote:
> >>> Introduce fw_cfg_data_read(), a generic read method which works
> >>> on all access widths (1 through 8 bytes, inclusive), and can be
> >>> used during both IOPort and MMIO read accesses.
> >>>
> >>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
> >>> data read method) is replaced by this patch. The new method
> >>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
> >>> combo, but without unnecessarily repeating all the validity
> >>> checks performed by the latter on each byte being read.
> >>
> >> this unwinding caused a bug to creep in.
> >>
> >> Namely, we have to identify the set of data that remains constant
> >> between *all* "size" calls that fw_cfg_data_mem_read() makes to
> >> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
> >>
> >> Specifically,
> >>
> >>> This patch also modifies the trace_fw_cfg_read prototype to
> >>> accept a 64-bit value argument, allowing it to work properly
> >>> with the new read method, but also remain backward compatible
> >>> with existing call sites.
> >>>
> >>> Cc: Laszlo Ersek <lersek@redhat.com>
> >>> Cc: Gerd Hoffmann <kraxel@redhat.com>
> >>> Cc: Marc Marí <markmb@redhat.com>
> >>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> >>> ---
> >>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
> >>>  trace-events      |  2 +-
> >>>  2 files changed, 20 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> >>> index c2d3a0a..8aa980c 100644
> >>> --- a/hw/nvram/fw_cfg.c
> >>> +++ b/hw/nvram/fw_cfg.c
> >>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
> >>>      return ret;
> >>>  }
> >>>
> >>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> >>> +{
> >>> +    FWCfgState *s = opaque;
> >>
> >> This is good.
> >>
> >>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>
> >> Okay too.
> >>
> >>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (1) Side point: the conversion here is faithful to the original code in
> >> fw_cfg_read(), but even in the original code, the expression uses
> >> "s->cur_entry" as a (masked) subscript *before* comparing it against
> >> FW_CFG_INVALID. I don't think that's right.
> >>
> >> The same issue is present in fw_cfg_dma_transfer(). Care to write a
> >> patch (before the restructuring) that fixes both?
> >>
> >> Note, I am aware that the expression in both of the above mentioned
> >> functions only calculates the *address* of the nonexistent element
> >> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
> >>
> >>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> But it doesn't matter; it's undefined behavior just the same. Instead,
> >> *both* locations should say:
> >>
> >>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> >>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (I share the blame for not noticing this earlier -- I too reviewed
> >> fw_cfg_dma_transfer().)
> >>
> >> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
> >> current address-of expression is not valid even for evaluation. Also, in
> >> practice, dereferencing NULL would give us a nice (as in, non-garbage)
> >> SIGSEGV.
> >
> > Done.
> >
> >>
> >> Anyway, back to the topic at hand:
> >>
> >>> +    uint64_t value = 0;
> >>> +
> >>> +    assert(size <= sizeof(value));
> >>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> >>
> >> Right, good conversion. (Side note: this does protect against
> >> *dereferencing* "e", but it's already too late, as far as undefined
> >> behavior is concerned.)
> >>
> >>> +        while (size-- && s->cur_offset < e->len) {
> >>> +            value = (value << 8) | e->data[s->cur_offset++];
> >>> +        }
> >>
> >> (2) So, this is the bug. The pre-conversion code would keep shifting
> >> "value" to the left until "size" was reached, regardless of the
> >> underlying blob size, and just leave the least significant bytes zeroed
> >> if the item ended too early. Whereas this loop *stops shifting* when the
> >> blob ends.
> >
> > D'OH!!! That should teach me to pay more attention -- thanks for
> > catching it!
> >
> >> Since the wide data register (which is big-endian) implements a
> >> substring-preserving transfer (on top of QEMU's integer preserving
> >> device r/w infrastructure), this change breaks the case when the
> >> firmware reads, say, 8 bytes from the register in a single access, when
> >> only 3 are left in the blob, and then uses only the three *lowest
> >> address* bytes from the uint64_t value read. Although no known firmware
> >> does this at the moment, it would be valid, and the above hunk would
> >> break it.
> >>
> >> Hence please
> >>
> >> (2a) either append the missing "cumulative" shift after the loop:
> >>
> >>     while (size && s->cur_offset < e->len) {
> >>         --size;
> >>         value = (value << 8) | e->data[s->cur_offset++];
> >>     }
> >>     value <<= 8 * size;
> >
> > I went with 2a. Also added a comment to make things painfully obvious
> > to any potential future archaeologists:
> >
> > +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> > +{
> > +    FWCfgState *s = opaque;
> > +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> > +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> > +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> > +    uint64_t value = 0;
> > +
> > +    assert(size <= sizeof(value));
> > +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> > +        /* The least significant 'size' bytes of the return value are
> > +         * expected to contain a string preserving portion of the item
> > +         * data, padded with zeros to the right in case we run out early.
>
> Please say "*on* the right" here, just like it reads below (emphasis
> added only for review purposes).
>
> Also, while the above seems correct, I prefer my own wording from commit
> 3c23402d4032:
>
>   The solution is to compose the host-endian representation [...] of
>   the big endian interpretation [...] of the fw_cfg string [...]
>
> I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
> ;)) -- my preference could be harder to interpret for "future
> archaeologists". So I'll leave it to you whether to keep yours, pick mine,
> or run with a mixture / union.

Oops, I just fired off v4 literally a few seconds before this email
came in :)

I'll make the changes, and queue them for v5. I'll send that out in a
few days, or as soon as I get any more feedback on the series...
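
For the archives, here is a throwaway standalone sketch of the
difference between the v3 loop and the 2a/2b variants discussed in the
quoted review. It is not the QEMU code: "struct blob" and the three
function names are made up for illustration, and the guard around the
final shift in the 2a variant is my own addition for the corner case
where nothing at all is left in the item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the selected fw_cfg item (not FWCfgEntry). */
struct blob {
    const uint8_t *data;
    size_t len;
    size_t offset;          /* plays the role of s->cur_offset */
};

/* v3 behavior: stops shifting as soon as the blob ends, so the bytes
 * that were read end up in the *least* significant positions. */
uint64_t read_truncating(struct blob *b, unsigned size)
{
    uint64_t value = 0;

    assert(size <= sizeof(value));
    while (size-- && b->offset < b->len) {
        value = (value << 8) | b->data[b->offset++];
    }
    return value;
}

/* Option 2a: same loop, then one cumulative shift for the bytes that
 * could not be read, i.e. zero padding on the right. */
uint64_t read_padded_2a(struct blob *b, unsigned size)
{
    uint64_t value = 0;

    assert(size <= sizeof(value));
    while (size && b->offset < b->len) {
        value = (value << 8) | b->data[b->offset++];
        size--;
    }
    /* Assumption of this sketch: skip the shift when all 8 bytes are
     * missing, since shifting a uint64_t by 64 bits is undefined in C
     * and value is still 0 in that case anyway. */
    if (size < sizeof(value)) {
        value <<= 8 * size;
    }
    return value;
}

/* Option 2b: always iterate 'size' times, feeding zeros once the blob
 * is exhausted. */
uint64_t read_padded_2b(struct blob *b, unsigned size)
{
    uint64_t value = 0;

    assert(size <= sizeof(value));
    while (size--) {
        value = (value << 8) |
                (b->offset < b->len ? b->data[b->offset++] : 0);
    }
    return value;
}
```

With the three bytes 0xAA 0xBB 0xCC left in the blob and an 8-byte
read, read_truncating() returns 0x0000000000aabbcc, while both padded
variants return 0xaabbcc0000000000, which is what the pre-conversion
fw_cfg_data_mem_read() + fw_cfg_read() combo would have produced.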

Thanks,
--Gabriel

>
> Cheers!
> Laszlo
>
> > +         */
> > +        while (size && s->cur_offset < e->len) {
> > +            value = (value << 8) | e->data[s->cur_offset++];
> > +            size--;
> > +        }
> > +        /* If size is still not zero, we *did* run out early, so finish
> > +         * left-shifting to add the appropriate number of padding zeros
> > +         * on the right.
> > +         */
> > +        value <<= 8 * size;
> > +    }
> > +
> > +    trace_fw_cfg_read(s, value);
> > +    return value;
> > +}
> >
> > Version 4 should be out by the end of today.
> >
> > Thanks again,
> > --Gabriel
> >
> >>
> >> (2b) or move the offset check from the loop's controlling expression
> >> into the value composition:
> >>
> >>         while (size--) {
> >>             value = (value << 8) | (s->cur_offset < e->len ?
> >>                                     e->data[s->cur_offset++] :
> >>                                     0);
> >>         }
> >>
> >> The rest looks good.
> >>
> >> Thanks
> >> Laszlo
> >>
> >>> +    }
> >>> +
> >>> +    trace_fw_cfg_read(s, value);
> >>> +    return value;
> >>> +}
> >>> +
> >>>  static uint8_t fw_cfg_read(FWCfgState *s)
> >>>  {
> >>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
> >>>      return ret;
> >>>  }
> >>>
> >>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
> >>> -                                     unsigned size)
> >>> -{
> >>> -    FWCfgState *s = opaque;
> >>> -    uint64_t value = 0;
> >>> -    unsigned i;
> >>> -
> >>> -    for (i = 0; i < size; ++i) {
> >>> -        value = (value << 8) | fw_cfg_read(s);
> >>> -    }
> >>> -    return value;
> >>> -}
> >>> -
> >>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
> >>>                                    uint64_t value, unsigned size)
> >>>  {
> >>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
> >>>  };
> >>>
> >>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
> >>> -    .read = fw_cfg_data_mem_read,
> >>> +    .read = fw_cfg_data_read,
> >>>      .write = fw_cfg_data_mem_write,
> >>>      .endianness = DEVICE_BIG_ENDIAN,
> >>>      .valid = {
> >>> diff --git a/trace-events b/trace-events
> >>> index 72136b9..5073040 100644
> >>> --- a/trace-events
> >>> +++ b/trace-events
> >>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
> >>>
> >>>  # hw/nvram/fw_cfg.c
> >>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
> >>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
> >>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
> >>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
> >>>
> >>>  # hw/block/hd-geometry.c
> >>>
> >>
>