qemu-devel.nongnu.org archive mirror
From: Bin Meng <bmeng.cn@gmail.com>
To: Francisco Iglesias <frasse.iglesias@gmail.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	Qemu-block <qemu-block@nongnu.org>,
	"Andrew Jeffery" <andrew@aj.id.au>,
	"Bin Meng" <bin.meng@windriver.com>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Havard Skinnemoen" <hskinnemoen@google.com>,
	"Tyrone Ting" <kfting@nuvoton.com>,
	qemu-arm <qemu-arm@nongnu.org>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Joe Komlodi" <komlodi@xilinx.com>,
	"Max Reitz" <mreitz@redhat.com>, "Joel Stanley" <joel@jms.id.au>
Subject: Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands
Date: Wed, 20 Jan 2021 22:20:25 +0800
Message-ID: <CAEUhbmUBAgF4D__jsfbE7yGd++5ZH3YOutTiOBOot52sNCV-eg@mail.gmail.com>
In-Reply-To: <20210119130113.GA28306@fralle-msi>

Hi Francisco,

On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hi Bin,
>
> On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hi Bin,
> > >
> > > On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hi Bin,
> > > > >
> > > > > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Bin,
> > > > > > >
> > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
> > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > >
> > > > > > > > The m25p80 model uses s->needed_bytes to indicate how many follow-up
> > > > > > > > bytes are expected to be received after it receives a command. For
> > > > > > > > example, depending on the address mode, either 3-byte address or
> > > > > > > > 4-byte address is needed.
> > > > > > > >
> > > > > > > > For fast read family commands, some dummy cycles are required after
> > > > > > > > sending the address bytes, and the dummy cycles need to be counted
> > > > > > > > in s->needed_bytes. This is where the mess began.
> > > > > > > >
> > > > > > > > As the variable name (needed_bytes) indicates, the unit is in byte.
> > > > > > > > It is not in bit, or cycle. However for some reason the model has
> > > > > > > > been using the number of dummy cycles for s->needed_bytes. The right
> > > > > > > > approach is to convert the number of dummy cycles to bytes based on
> > > > > > > > the SPI protocol, for example, 6 dummy cycles for the Fast Read Quad
> > > > > > > > I/O (EBh) should be converted to 3 bytes per the formula (6 * 4 / 8).
> > > > > > >
> > > > > > > While not being the original implementor, I must assume that the above solution was
> > > > > > > considered but not chosen by the developers due to its inaccuracy (it
> > > > > > > wouldn't be possible to model exactly 6 dummy cycles, only a multiple of 8,
> > > > > > > meaning that if the controller is wrongly programmed to generate 7 the error
> > > > > > > wouldn't be caught and the controller will still be considered "correct"). Now
> > > > > > > that we have this detail in the implementation I'm in favor of keeping it, also
> > > > > > > because the detail is already in use for catching exactly the above error.
> > > > > > >
> > > > > >
> > > > > > I found no clue in the commit message that my proposed solution here
> > > > > > was ever considered; otherwise all SPI controller models supporting
> > > > > > software generation would have been found to be seriously broken a long
> > > > > > time ago!
> > > > >
> > > > >
> > > > > The controllers you are referring to might lack support for commands requiring
> > > > > dummy clock cycles but I really hope they work with the other commands? If so I
> > > >
> > > > I am not sure why you view dummy clock cycles as something special
> > > > that needs some special support from the SPI controller. For the case
> > > > 1 controller, it's nothing special from the controller perspective,
> > > > just like sending out a command, or address bytes, or data. The
> > > > controller just shifts data bit by bit from its tx fifo and that's it.
> > > > In the Xilinx GQSPI controller case, the dummy cycles can either be
> > > > sent via a regular data (the case 1 controller) in the tx fifo, or
> > > > automatically generated (case 2 controller) by the hardware.
> > >
> > > Ok, I'll try to explain my viewpoint a little differently. For that we also
> > > need to keep in mind that QEMU models HW, and any binary that runs on a HW
> > > board supported in QEMU should ideally run on that board inside QEMU as well
> > > (this can be a bare metal application just as well as a modified u-boot/Linux
> > > using SPI commands with a non-multiple of 8 number of dummy clock cycles).
> > >
> > > Once functionality has been introduced into QEMU it is not easy to know which
> > > intentional or unintentional features provided by the functionality are being
> > > used by users. One of the (perhaps not well known) features I'm aware of that
> > > is in use and is provided by the accurate dummy clock cycle modeling inside
> > > m25p80 is the ability to test drivers accurately regarding the dummy clock
> > > cycles (even when using commands with a non-multiple of 8 number of dummy clock
> > > cycles), but there might be others as well. So by removing this functionality
> > > the above use case will break, since those tests will no longer be reliable.
> > > Furthermore, since users tend to be creative it is not possible to know if
> > > there are other use cases that will be affected. This means that in case [1]
> > > needs to be followed the safe path is to add functionality instead of removing it.
> > > Luckily it is also easier in this case, see below.
> >
> > I understand there might be users other than U-Boot/Linux that use an
> > odd number of dummy bits (not multiple of 8). If your concern was
> > about model behavior changes, sure I can update
> > qemu/docs/system/deprecated.rst to mention that some flashes in the
> > m25p80 model now implement dummy cycles as bytes.
>
> Yes, something like that. My concern is that since this functionality has been
> in tree for a while, users have found known or unknown features that got
> introduced by it. By removing the functionality (and the known/unknown features)
> we are risking breaking our users' use cases (currently I'm aware of one
> feature/use case but it is not unlikely that there are more). [1] states that
> "In general features are intended to be supported indefinitely once introduced
> into QEMU", to me that makes very much sense because the opposite would mean
> that we were not reliable. So in case [1] needs to be honored it looks to be
> safer to add functionality instead of removing (and risking the removal of use
> cases/features). Luckily I still believe in this case that it will be easier to
> go forward (even if I also agree on what you are saying below about what I
> proposed).
>

Even if the implementation is buggy, do we need to keep the buggy
implementation forever? I think that's why
qemu/docs/system/deprecated.rst was created: for deprecating such
features.

> >
> > > >
> > > > > don't think it is fair to call them 'seriously broken' (and else we should
> > > > > probably let the maintainers know about it). Most likely the lack of support
> > > >
> > > > I called it "seriously broken" because current implementation only
> > > > considered one type of SPI controllers while completely ignoring the
> > > > other type.
> > >
> > > If we change view and see this from the perspective of m25p80, it models the
> > > commands a certain way and provides an API that the SPI controllers need to
> > > implement for interacting with it. It is true that there are SPI controllers
> > > referred to above that do not support the portion of that API that corresponds
> > > to commands with dummy clock cycles, but I don't think it is true that this is
> > > broken since there is also one SPI controller that has a working implementation
> > > of m25p80's full API also when transferring through a tx fifo (use case 1). But
> > > as mentioned above, by doing a minor extension and improvement to m25p80's API
> > > and allowing for toggling the accuracy from dummy clock cycles to dummy bytes, [1]
> > > will still be honored while at the same time making it possible to have full
> > > support for the API in the SPI controllers that currently do not (please reread
> > > the proposal in my previous reply that attempts to do this). I myself see this
> > > as a win/win situation, also because no controller should need modifications.
> > >
> >
> > I am afraid your proposal does not work. Your proposed new device
> > property 'model_dummy_bytes', which selects converting the accurate dummy
> > clock cycle count to dummy bytes inside m25p80, is hard to justify as
> > a property of the flash itself, as the behavior is tightly coupled to
> > how the SPI controller works.
>
> I agree with the above. I decided though that instead of posting sample code in here
> I'll post an RFC with hopefully an improved proposal. I'll cc you. About below,
> Xilinx ZynqMP GQSPI should not need any modification in a first step.
>

Wait, (see below)

> >
> > Please take a look at the Xilinx GQSPI controller, which supports both
> > use cases, that the dummy cycles can be transferred via tx fifo, or
> > generated by the controller automatically. Please read the example
> > given in:
> >
> >     table 24‐22, an example of Generic FIFO Contents for Quad I/O Read
> > Command (EBh)
> >
> > in https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
> >
> > If you choose to set the m25p80 device property 'model_dummy_bytes' to
> > true when working with the Xilinx GQSPI controller, you are bound to
> > only allow guest software to use tx fifo to transfer the dummy cycles,
> > and this is wrong.
> >

You missed this part. I looked at your RFC, and as I mentioned above
your proposal cannot support a complicated controller such as the Xilinx
GQSPI. Please read the example of table 24-22. With your RFC, you
mandate guest software's GQSPI driver to only use hardware dummy cycle
generation, which is wrong.
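
To illustrate why both paths matter, here is a minimal sketch of the two
ways a dummy phase can reach the flash. All names in it (dummy_source,
dummy_phase, dummy_phase_cycles) are hypothetical; they do not reflect
the GQSPI register layout or QEMU's xilinx_spips code. It only shows
that 3 dummy bytes pushed through the tx fifo on 4 lines and 6
hardware-generated cycles look the same on the wire.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of how the guest requested the dummy phase. */
enum dummy_source {
    DUMMY_VIA_TX_FIFO,   /* driver wrote dummy bytes into the tx fifo */
    DUMMY_HW_GENERATED   /* controller was told to generate N cycles  */
};

struct dummy_phase {
    enum dummy_source source;
    uint32_t count;      /* bytes for TX_FIFO, cycles for HW_GENERATED */
    uint32_t buswidth;   /* number of data lines used: 1, 2 or 4       */
};

/* Number of dummy clock cycles the flash observes for this phase. */
static uint32_t dummy_phase_cycles(const struct dummy_phase *p)
{
    assert(p->buswidth != 0);

    switch (p->source) {
    case DUMMY_VIA_TX_FIFO:
        /* Each byte is 8 bits, shifted out buswidth bits per cycle. */
        return p->count * 8 / p->buswidth;
    case DUMMY_HW_GENERATED:
        /* The count is already expressed in clock cycles. */
        return p->count;
    }
    return 0;
}
```

A guest driver is free to pick either form, so a flash-side property
that assumes only one of them cannot cover both.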

> > >
> > > >
> > > > > for the commands is because no request has been made for them. Also there is
> > > > > one controller that has support.
> > > >
> > > > Definitely it's not "no request". Nearly all SPI flashes support the
> > > > Fast Read (0Bh) command today, and 0Bh requires dummy cycles. This is
> > > > "seriously broken" for those case 1 type controllers because they
> > > > cannot read anything from the m25p80 model at all, unless the guest
> > > > software being tested only uses the Read (03h) command, which is not
> > > > affected. But I can't find any software that uses Read instead of Fast
> > > > Read.
> > > >
> > > > > > The issue you pointed out that we require the total number of dummy
> > > > > > bits should be multiple of 8 is true, that's why I added the
> > > > > > unimplemented log message in this series (patch 2/3/4) to warn users
> > > > > > if this expectation is not met. However this will not cause any issue
> > > > > > when running U-Boot or Linux, because both spi-nor drivers expect the
> > > > > > same assumption as we do here.
> > > > > >
> > > > > > See U-Boot spi_nor_read_data() and Linux spi_nor_spimem_read_data(),
> > > > > > there is a logic to calculate the dummy bytes needed for fast read
> > > > > > command:
> > > > > >
> > > > > >     /* convert the dummy cycles to the number of bytes */
> > > > > >     op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
> > > > > >
> > > > > > Note the default dummy cycles configuration for all flashes I have
> > > > > > looked into as of today, meets the multiple of 8 assumption. On some
> > > > > > flashes the dummy cycle number is configurable, and if it's been
> > > > > > configured to be an odd value, it would not work on U-Boot/Linux in
> > > > > > the first place.
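
The conversion quoted above, together with the multiple-of-8
expectation, can be sketched in C as follows. This is only an
illustration: dummy_cycles_to_bytes is a hypothetical helper, not a
function in QEMU, U-Boot or Linux, and returning false is just one way
to flag a dummy phase that does not fill whole bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a number of dummy clock cycles into the
 * number of dummy bytes, given how many data lines carry the dummy phase.
 * Returns false when the total number of dummy bits is not a multiple
 * of 8, i.e. when the phase cannot be expressed in whole bytes.
 */
static bool dummy_cycles_to_bytes(uint32_t cycles, uint32_t buswidth,
                                  uint32_t *nbytes)
{
    uint32_t bits;

    assert(buswidth == 1 || buswidth == 2 || buswidth == 4 || buswidth == 8);

    bits = cycles * buswidth;   /* one bit per data line per cycle */
    if (bits % 8 != 0) {
        return false;
    }

    *nbytes = bits / 8;
    return true;
}
```

With 6 dummy cycles on 4 lines (Fast Read Quad I/O, EBh) this gives
6 * 4 / 8 = 3 bytes, while 7 cycles on a single line is rejected.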
> > > > > >
> > > > > > > >
> > > > > > > > Things get complicated when interacting with different SPI or QSPI
> > > > > > > > flash controllers. There are two major cases:
> > > > > > > >
> > > > > > > > - Dummy bytes prepared by drivers, and written to the controller fifo.
> > > > > > > >   In this case, the driver will calculate the correct number of dummy
> > > > > > > >   bytes and write them into the tx fifo. Fixing the m25p80 model will
> > > > > > > >   fix flashes working with such controllers.
> > > > > > >
> > > > > > > The above can be fixed while still keeping the detailed dummy cycle implementation
> > > > > > > inside m25p80. Perhaps one of the following could be looked into: configuring
> > > > > > > the amount, letting the spi ctrl fetch the amount from m25p80 or inheriting
> > > > > > > some functionality handling this in the SPI controller. Or a mixture of the above.
> > > > > >
> > > > > > Please send patches to explain in detail how this is going to
> > > > > > work. I am open to all possible solutions.
> > > > >
> > > > > In that case I suggest that you instead try with a device property
> > > > > 'model_dummy_bytes' used to select to convert the accurate dummy clock cycle
> > > > > count to dummy bytes inside m25p80. Below is an example on how to modify the
> > > >
> > > > No this is wrong in my view. This is not like a DMA vs. PIO handling.
> > > >
> > > > > decode_fast_read_cmd function (the other commands requiring dummy clock cycles
> > > > > can follow a similar pattern). This way the fifo mode will be able to work the
> > > > > way you desire while also keeping the current functionality intact. Suddenly
> > > > > removing functionality (features) will take users by surprise.
> > > >
> > > > I don't think we are removing any features. This is a fix to make the
> > > > model usable by any SPI controller.
> > > >
> > > > As I pointed out, both U-Boot and Linux have the multiple of 8
> > > > assumption for the dummy bit, which is the default configuration for
> > > > all flashes I have looked into so far. Can you please comment what use
> > > > case you want to support? I requested a U-Boot/Linux kernel testing in
> > > > the previous SST thread [1] against Xilinx GQSPI but there was no
> > > > response.
> > >
> > > In [2], instructions on how to boot u-boot/Linux can be found. For building the
> > > various software components I followed the official doc in [3].
> >
> > I see the following QEMU commands are used to test booting U-Boot/Linux:
> >
> > $ qemu-system-aarch64 -M xlnx-zcu102,secure=on,virtualization=on -m 4G
> > -serial stdio -display none -device loader,file=u-boot.elf -kernel
> > bl31.elf -device loader,addr=0x40000000,file=Image -device
> > loader,addr=0x2000000,file=system.dtb
> >
> > I am not sure where the system.dtb gets built from?
>
> The instructions to look into are in [2]. 'system.dtb' is the kernel dtb for
> zcu102 ([2] has been fixed). I created [2] purely for you, so respectfully I
> will ask you to try a little first before asking for further guidance.
>

I tried, but no success. I removed the "-device loader" part for
loading kernel image and the device tree, and only focused on booting
U-Boot.

The ATF bl31.elf was built from
https://github.com/ARM-software/arm-trusted-firmware, by following
build instructions at
https://trustedfirmware-a.readthedocs.io/en/latest/plat/xilinx-zynqmp.html.
U-Boot was built from the upstream U-Boot.

$ ./qemu-system-aarch64 -M xlnx-zcu102,secure=on,virtualization=on -m
4G -serial stdio -display none -device loader,file=u-boot.elf -kernel
bl31.elf
ERROR:   Incorrect XILINX IDCODE 0x0, maskid 0x4600093
NOTICE:  ATF running on XCZUUNKN/silicon v1/RTL0.0 at 0xfffea000
NOTICE:  BL31: v2.4(release):v2.4-228-g337e493
NOTICE:  BL31: Built : 21:18:14, Jan 20 2021
ERROR:   BL31: Platform Management API version error. Expected: v1.1 -
Found: v0.0
ERROR:   Error initializing runtime service sip_svc

I also tried the Xilinx fork of ATF from
https://github.com/Xilinx/arm-trusted-firmware, by following build
instructions at
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842305/Build+ARM+Trusted+Firmware+ATF

$ ./qemu-system-aarch64 -M xlnx-zcu102,secure=on,virtualization=on -m
4G -serial stdio -display none -device loader,file=u-boot.elf -kernel
bl31.elf
ERROR:   Incorrect XILINX IDCODE 0x0, maskid 0x4600093
NOTICE:  ATF running on XCZUUNKN/silicon v1/RTL0.0 at 0xfffea000
NOTICE:  BL31: v2.2(release):xilinx-v2020.2
NOTICE:  BL31: Built : 21:52:38, Jan 20 2021
ERROR:   BL31: Platform Management API version error. Expected: v1.1 -
Found: v0.0
ERROR:   Error initializing runtime service sip_svc

Then I tried building U-Boot from the Xilinx fork at
https://github.com/Xilinx/u-boot-xlnx/, still without success.

> Best regards,
> Francisco Iglesias
>
> [1] qemu/docs/system/deprecated.rst
> [2] https://github.com/franciscoIglesias/qemu-cmdline/blob/master/xlnx-zcu102-atf-u-boot-linux.md
>
>

Regards,
Bin

