All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pratyush Yadav <p.yadav@ti.com>
To: Tudor Ambarus <Tudor.Ambarus@microchip.com>
Cc: <zajec5@gmail.com>, <linux-mtd@lists.infradead.org>,
	<linux-spi@vger.kernel.org>, <kdasu.kdev@gmail.com>,
	<boris.brezillon@collabora.com>, Mark Brown <broonie@kernel.org>
Subject: Re: spi-nor 5.11 regression: Division by zero in kernel
Date: Mon, 2 Aug 2021 23:42:01 +0530	[thread overview]
Message-ID: <20210802181201.wkff3k32to3lin4n@ti.com> (raw)
In-Reply-To: <b3d23f9d-5c84-c4cc-dede-85ec85135276@microchip.com>

+ Mark

On 02/08/21 02:39PM, Tudor.Ambarus@microchip.com wrote:
> On 8/2/21 4:24 PM, Rafał Miłecki wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> > 
> > Hi,
> 
> Hi, Rafał!
> 
> > 
> > It seems that kernel 5.11 broke spi-nor on Broadcom Northstar (BCM5301X)
> > platforms.
> > 
> > The problem seems to be spi_nor_spimem_read_data() which calculates:
> > op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
> > 
> > On Northstar this happens to be:
> > op.dummy.nbytes = (0 * 0) / 8;
> > 
> > That results in bcm_qspi_bspi_set_flex_mode() dividing by zero in the:
> > bpp |= (op->dummy.nbytes * 8) / op->dummy.buswidth;
> > 
> > Could you take a look at that issue, please?
> > 
> > GOOD    5.10.55
> > BAD     5.11.22
> > BAD     5.12.19
> > BAD     5.13.2
> > BAD     5.13.7
> > 
> 
> It's hard to guess. Would you please bisect and identify the commit that introduces
> the regression?

I think the bug is pretty obvious here. op->dummy.buswidth is 0 when 
there is no dummy phase, and that's why there is a division by zero. The 
controller driver does not check if the dummy phase exists (by checking 
op->dummy.nbytes) before performing the calculation. I saw a similar 
patch posted for spi-cadence-quadspi.c [0]. The fix is obvious IMO.

BTW, I think this was introduced by 0e30f47232ab ("mtd: spi-nor: add 
support for DTR protocol"). It set buswidths of non-existent phases to 
0.

The main question is: do we want to keep the buswidth 0 when the dummy 
phase does not exist? It seems to be tripping up controller drivers. 
FWIW, I think we should keep it 0 since I think that when dummy.nbytes 
== 0 the other fields should be "don't care". The responsibility should 
lie on the controller driver to check this.

Thoughts?

[0] https://patchwork.kernel.org/project/spi-devel-general/patch/92eea403-9b21-2488-9cc1-664bee760c5e@nskint.co.jp/

> 
> Thanks,
> ta
> 
> > [    1.075513] Division by zero in kernel.
> > [    1.079354] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.7 #18
> > [    1.085376] Hardware name: BCM5301X
> > [    1.088873] [<c0108394>] (unwind_backtrace) from [<c010498c>] (show_stack+0x10/0x14)
> > [    1.096666] [<c010498c>] (show_stack) from [<c0696470>] (dump_stack+0x94/0xa8)
> > [    1.103926] [<c0696470>] (dump_stack) from [<c03e7f94>] (Ldiv0+0x8/0x10)
> > [    1.110653] [<c03e7f94>] (Ldiv0) from [<c0699274>] (bcm_qspi_exec_mem_op+0x3e0/0x744)
> > [    1.118512] [<c0699274>] (bcm_qspi_exec_mem_op) from [<c0698740>] (spi_mem_exec_op+0x184/0x4fc)
> > [    1.127234] [<c0698740>] (spi_mem_exec_op) from [<c0698bac>] (spi_mem_dirmap_read+0xf4/0x1c8)
> > [    1.135780] [<c0698bac>] (spi_mem_dirmap_read) from [<c0697d58>] (spi_nor_spimem_read_data+0x13c/0x1ec)
> > [    1.145199] [<c0697d58>] (spi_nor_spimem_read_data) from [<c0498d60>] (spi_nor_read+0x16c/0x174)
> > [    1.154008] [<c0498d60>] (spi_nor_read) from [<c04835d0>] (mtd_read_oob_std+0x9c/0xa4)
> > [    1.161964] [<c04835d0>] (mtd_read_oob_std) from [<c04855d0>] (mtd_read_oob+0x84/0x148)
> > [    1.169997] [<c04855d0>] (mtd_read_oob) from [<c04856f4>] (mtd_read+0x60/0x90)
> > [    1.177237] [<c04856f4>] (mtd_read) from [<c048ab50>] (bcm47xxpart_parse+0x1d4/0x744)
> > [    1.185089] [<c048ab50>] (bcm47xxpart_parse) from [<c0488568>] (parse_mtd_partitions+0x188/0x424)
> > [    1.193985] [<c0488568>] (parse_mtd_partitions) from [<c0486018>] (mtd_device_parse_register+0x7c/0x1c0)
> > [    1.203489] [<c0486018>] (mtd_device_parse_register) from [<c04998b8>] (spi_nor_probe+0x20c/0x2d0)
> > [    1.212471] [<c04998b8>] (spi_nor_probe) from [<c046fbf8>] (really_probe+0xf0/0x4dc)
> > [    1.220245] [<c046fbf8>] (really_probe) from [<c046dd40>] (bus_for_each_drv+0x80/0xd0)
> > [    1.228184] [<c046dd40>] (bus_for_each_drv) from [<c04701d0>] (__device_attach+0xf8/0x15c)
> > [    1.236468] [<c04701d0>] (__device_attach) from [<c046edd4>] (bus_probe_device+0x84/0x8c)
> > [    1.244668] [<c046edd4>] (bus_probe_device) from [<c046c6c4>] (device_add+0x300/0x840)
> > [    1.252606] [<c046c6c4>] (device_add) from [<c04b3dc4>] (spi_add_device+0x9c/0x164)
> > [    1.260292] [<c04b3dc4>] (spi_add_device) from [<c04b482c>] (spi_register_controller+0x8ac/0xbc0)
> > [    1.269187] [<c04b482c>] (spi_register_controller) from [<c04b7bd4>] (bcm_qspi_probe+0x600/0x700)
> > [    1.278092] [<c04b7bd4>] (bcm_qspi_probe) from [<c0471d3c>] (platform_probe+0x48/0x8c)
> > [    1.286030] [<c0471d3c>] (platform_probe) from [<c046fbf8>] (really_probe+0xf0/0x4dc)
> > [    1.293880] [<c046fbf8>] (really_probe) from [<c04705dc>] (device_driver_attach+0xf0/0x100)
> > [    1.302254] [<c04705dc>] (device_driver_attach) from [<c0470678>] (__driver_attach+0x8c/0x11c)
> > [    1.310888] [<c0470678>] (__driver_attach) from [<c046dc74>] (bus_for_each_dev+0x74/0xc0)
> > [    1.319086] [<c046dc74>] (bus_for_each_dev) from [<c046efc8>] (bus_add_driver+0xf4/0x1dc)
> > [    1.327286] [<c046efc8>] (bus_add_driver) from [<c0470cdc>] (driver_register+0x88/0x118)
> > [    1.335397] [<c0470cdc>] (driver_register) from [<c01016dc>] (do_one_initcall+0x54/0x1d0)
> > [    1.343598] [<c01016dc>] (do_one_initcall) from [<c08010e8>] (kernel_init_freeable+0x244/0x2ac)
> > [    1.352337] [<c08010e8>] (kernel_init_freeable) from [<c069a7c8>] (kernel_init+0x8/0x118)
> > [    1.360536] [<c069a7c8>] (kernel_init) from [<c0100130>] (ret_from_fork+0x14/0x24)
> > [    1.368125] Exception stack(0xc1035fb0 to 0xc1035ff8)
> > [    1.373184] 5fa0:                                     00000000 00000000 00000000 00000000
> > [    1.381384] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [    1.389582] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > 
> > ______________________________________________________
> > Linux MTD discussion mailing list
> > http://lists.infradead.org/mailman/listinfo/linux-mtd/
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

WARNING: multiple messages have this Message-ID (diff)
From: Pratyush Yadav <p.yadav@ti.com>
To: Tudor Ambarus <Tudor.Ambarus@microchip.com>
Cc: <zajec5@gmail.com>, <linux-mtd@lists.infradead.org>,
	<linux-spi@vger.kernel.org>, <kdasu.kdev@gmail.com>,
	<boris.brezillon@collabora.com>, Mark Brown <broonie@kernel.org>
Subject: Re: spi-nor 5.11 regression: Division by zero in kernel
Date: Mon, 2 Aug 2021 23:42:01 +0530	[thread overview]
Message-ID: <20210802181201.wkff3k32to3lin4n@ti.com> (raw)
In-Reply-To: <b3d23f9d-5c84-c4cc-dede-85ec85135276@microchip.com>

+ Mark

On 02/08/21 02:39PM, Tudor.Ambarus@microchip.com wrote:
> On 8/2/21 4:24 PM, Rafał Miłecki wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> > 
> > Hi,
> 
> Hi, Rafał!
> 
> > 
> > It seems that kernel 5.11 broke spi-nor on Broadcom Northstar (BCM5301X)
> > platforms.
> > 
> > The problem seems to be spi_nor_spimem_read_data() which calculates:
> > op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
> > 
> > On Northstar this happens to be:
> > op.dummy.nbytes = (0 * 0) / 8;
> > 
> > That results in bcm_qspi_bspi_set_flex_mode() dividing by zero in the:
> > bpp |= (op->dummy.nbytes * 8) / op->dummy.buswidth;
> > 
> > Could you take a look at that issue, please?
> > 
> > GOOD    5.10.55
> > BAD     5.11.22
> > BAD     5.12.19
> > BAD     5.13.2
> > BAD     5.13.7
> > 
> 
> It's hard to guess. Would you please bisect and identify the commit that introduces
> the regression?

I think the bug is pretty obvious here. op->dummy.buswidth is 0 when 
there is no dummy phase, and that's why there is a division by zero. The 
controller driver does not check if the dummy phase exists (by checking 
op->dummy.nbytes) before performing the calculation. I saw a similar 
patch posted for spi-cadence-quadspi.c [0]. The fix is obvious IMO.

BTW, I think this was introduced by 0e30f47232ab ("mtd: spi-nor: add 
support for DTR protocol"). It set buswidths of non-existent phases to 
0.

The main question is: do we want to keep the buswidth 0 when the dummy 
phase does not exist? It seems to be tripping up controller drivers. 
FWIW, I think we should keep it 0 since I think that when dummy.nbytes 
== 0 the other fields should be "don't care". The responsibility should 
lie on the controller driver to check this.

Thoughts?

[0] https://patchwork.kernel.org/project/spi-devel-general/patch/92eea403-9b21-2488-9cc1-664bee760c5e@nskint.co.jp/

> 
> Thanks,
> ta
> 
> > [    1.075513] Division by zero in kernel.
> > [    1.079354] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.7 #18
> > [    1.085376] Hardware name: BCM5301X
> > [    1.088873] [<c0108394>] (unwind_backtrace) from [<c010498c>] (show_stack+0x10/0x14)
> > [    1.096666] [<c010498c>] (show_stack) from [<c0696470>] (dump_stack+0x94/0xa8)
> > [    1.103926] [<c0696470>] (dump_stack) from [<c03e7f94>] (Ldiv0+0x8/0x10)
> > [    1.110653] [<c03e7f94>] (Ldiv0) from [<c0699274>] (bcm_qspi_exec_mem_op+0x3e0/0x744)
> > [    1.118512] [<c0699274>] (bcm_qspi_exec_mem_op) from [<c0698740>] (spi_mem_exec_op+0x184/0x4fc)
> > [    1.127234] [<c0698740>] (spi_mem_exec_op) from [<c0698bac>] (spi_mem_dirmap_read+0xf4/0x1c8)
> > [    1.135780] [<c0698bac>] (spi_mem_dirmap_read) from [<c0697d58>] (spi_nor_spimem_read_data+0x13c/0x1ec)
> > [    1.145199] [<c0697d58>] (spi_nor_spimem_read_data) from [<c0498d60>] (spi_nor_read+0x16c/0x174)
> > [    1.154008] [<c0498d60>] (spi_nor_read) from [<c04835d0>] (mtd_read_oob_std+0x9c/0xa4)
> > [    1.161964] [<c04835d0>] (mtd_read_oob_std) from [<c04855d0>] (mtd_read_oob+0x84/0x148)
> > [    1.169997] [<c04855d0>] (mtd_read_oob) from [<c04856f4>] (mtd_read+0x60/0x90)
> > [    1.177237] [<c04856f4>] (mtd_read) from [<c048ab50>] (bcm47xxpart_parse+0x1d4/0x744)
> > [    1.185089] [<c048ab50>] (bcm47xxpart_parse) from [<c0488568>] (parse_mtd_partitions+0x188/0x424)
> > [    1.193985] [<c0488568>] (parse_mtd_partitions) from [<c0486018>] (mtd_device_parse_register+0x7c/0x1c0)
> > [    1.203489] [<c0486018>] (mtd_device_parse_register) from [<c04998b8>] (spi_nor_probe+0x20c/0x2d0)
> > [    1.212471] [<c04998b8>] (spi_nor_probe) from [<c046fbf8>] (really_probe+0xf0/0x4dc)
> > [    1.220245] [<c046fbf8>] (really_probe) from [<c046dd40>] (bus_for_each_drv+0x80/0xd0)
> > [    1.228184] [<c046dd40>] (bus_for_each_drv) from [<c04701d0>] (__device_attach+0xf8/0x15c)
> > [    1.236468] [<c04701d0>] (__device_attach) from [<c046edd4>] (bus_probe_device+0x84/0x8c)
> > [    1.244668] [<c046edd4>] (bus_probe_device) from [<c046c6c4>] (device_add+0x300/0x840)
> > [    1.252606] [<c046c6c4>] (device_add) from [<c04b3dc4>] (spi_add_device+0x9c/0x164)
> > [    1.260292] [<c04b3dc4>] (spi_add_device) from [<c04b482c>] (spi_register_controller+0x8ac/0xbc0)
> > [    1.269187] [<c04b482c>] (spi_register_controller) from [<c04b7bd4>] (bcm_qspi_probe+0x600/0x700)
> > [    1.278092] [<c04b7bd4>] (bcm_qspi_probe) from [<c0471d3c>] (platform_probe+0x48/0x8c)
> > [    1.286030] [<c0471d3c>] (platform_probe) from [<c046fbf8>] (really_probe+0xf0/0x4dc)
> > [    1.293880] [<c046fbf8>] (really_probe) from [<c04705dc>] (device_driver_attach+0xf0/0x100)
> > [    1.302254] [<c04705dc>] (device_driver_attach) from [<c0470678>] (__driver_attach+0x8c/0x11c)
> > [    1.310888] [<c0470678>] (__driver_attach) from [<c046dc74>] (bus_for_each_dev+0x74/0xc0)
> > [    1.319086] [<c046dc74>] (bus_for_each_dev) from [<c046efc8>] (bus_add_driver+0xf4/0x1dc)
> > [    1.327286] [<c046efc8>] (bus_add_driver) from [<c0470cdc>] (driver_register+0x88/0x118)
> > [    1.335397] [<c0470cdc>] (driver_register) from [<c01016dc>] (do_one_initcall+0x54/0x1d0)
> > [    1.343598] [<c01016dc>] (do_one_initcall) from [<c08010e8>] (kernel_init_freeable+0x244/0x2ac)
> > [    1.352337] [<c08010e8>] (kernel_init_freeable) from [<c069a7c8>] (kernel_init+0x8/0x118)
> > [    1.360536] [<c069a7c8>] (kernel_init) from [<c0100130>] (ret_from_fork+0x14/0x24)
> > [    1.368125] Exception stack(0xc1035fb0 to 0xc1035ff8)
> > [    1.373184] 5fa0:                                     00000000 00000000 00000000 00000000
> > [    1.381384] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [    1.389582] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > 
> > ______________________________________________________
> > Linux MTD discussion mailing list
> > http://lists.infradead.org/mailman/listinfo/linux-mtd/
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

-- 
Regards,
Pratyush Yadav
Texas Instruments Inc.

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

  reply	other threads:[~2021-08-02 18:12 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-02 13:24 spi-nor 5.11 regression: Division by zero in kernel Rafał Miłecki
2021-08-02 13:24 ` Rafał Miłecki
2021-08-02 14:39 ` Tudor.Ambarus
2021-08-02 14:39   ` Tudor.Ambarus
2021-08-02 18:12   ` Pratyush Yadav [this message]
2021-08-02 18:12     ` Pratyush Yadav
2021-08-03  4:13     ` Tudor.Ambarus
2021-08-03  4:13       ` Tudor.Ambarus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210802181201.wkff3k32to3lin4n@ti.com \
    --to=p.yadav@ti.com \
    --cc=Tudor.Ambarus@microchip.com \
    --cc=boris.brezillon@collabora.com \
    --cc=broonie@kernel.org \
    --cc=kdasu.kdev@gmail.com \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-spi@vger.kernel.org \
    --cc=zajec5@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.