Linux-RISC-V Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
       [not found] ` <20181008100728.24959-2-quentin.schulz@bootlin.com>
@ 2018-11-19 14:57   ` schwab
  2018-11-19 14:57     ` Andreas Schwab
                       ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: schwab @ 2018-11-19 14:57 UTC (permalink / raw)
  To: linux-riscv

On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:

> The Microsemi PHYs have multiple banks of registers (called pages).
> Registers can only be accessed from one page, if we need a register from
> another page, we need to switch the page and the registers of all other
> pages are not accessible anymore.
>
> Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> phy_read(phydev, 5); but you need to set the desired page beforehand.
>
> In order to guarantee that two concurrent functions do not change the
> page, we need to do some locking per page. This can be achieved with the
> use of phy_select_page and phy_restore_page functions but phy_write/read
> calls in-between those two functions shall be replaced by their
> lock-free alternative __phy_write/read.
>
> Let's migrate this driver to those functions.

This has some serious locking problem.

[<0>] __schedule+0x25e/0x74c
[<0>] schedule+0x1a/0x58
[<0>] schedule_preempt_disabled+0xc/0x14
[<0>] __mutex_lock.isra.0+0x10e/0x22e
[<0>] __mutex_lock_slowpath+0xe/0x16
[<0>] mutex_lock+0x22/0x2a
[<0>] mdiobus_read+0x36/0x60
[<0>] vsc85xx_config_init+0x4c/0x1e2
[<0>] phy_init_hw+0x3c/0x52
[<0>] phy_attach_direct+0xec/0x1dc
[<0>] phy_connect_direct+0x1a/0x56
[<0>] macb_probe+0x800/0xb5e [macb]
[<0>] platform_drv_probe+0x3e/0x7e
[<0>] really_probe+0xba/0x310
[<0>] driver_probe_device+0x54/0xf2
[<0>] __driver_attach+0xde/0x106
[<0>] bus_for_each_dev+0x4a/0x72
[<0>] driver_attach+0x1a/0x22
[<0>] bus_add_driver+0x1ce/0x212
[<0>] driver_register+0x3a/0xd0
[<0>] __platform_driver_register+0x3a/0x42
[<0>] macb_driver_init+0x20/0x28 [macb]
[<0>] do_one_initcall+0x48/0x128
[<0>] do_init_module+0x4a/0x186
[<0>] load_module+0xd6a/0xe6a
[<0>] sys_finit_module+0xc6/0xfc
[<0>] check_syscall_nr+0x22/0x22

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 14:57   ` [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions schwab
@ 2018-11-19 14:57     ` Andreas Schwab
  2018-11-19 15:10     ` andrew
  2018-11-20 13:48     ` [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config schwab
  2 siblings, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-19 14:57 UTC (permalink / raw)
  To: Quentin Schulz
  Cc: andrew, alexandre.belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:

> The Microsemi PHYs have multiple banks of registers (called pages).
> Registers can only be accessed from one page, if we need a register from
> another page, we need to switch the page and the registers of all other
> pages are not accessible anymore.
>
> Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> phy_read(phydev, 5); but you need to set the desired page beforehand.
>
> In order to guarantee that two concurrent functions do not change the
> page, we need to do some locking per page. This can be achieved with the
> use of phy_select_page and phy_restore_page functions but phy_write/read
> calls in-between those two functions shall be replaced by their
> lock-free alternative __phy_write/read.
>
> Let's migrate this driver to those functions.

This has some serious locking problem.

[<0>] __schedule+0x25e/0x74c
[<0>] schedule+0x1a/0x58
[<0>] schedule_preempt_disabled+0xc/0x14
[<0>] __mutex_lock.isra.0+0x10e/0x22e
[<0>] __mutex_lock_slowpath+0xe/0x16
[<0>] mutex_lock+0x22/0x2a
[<0>] mdiobus_read+0x36/0x60
[<0>] vsc85xx_config_init+0x4c/0x1e2
[<0>] phy_init_hw+0x3c/0x52
[<0>] phy_attach_direct+0xec/0x1dc
[<0>] phy_connect_direct+0x1a/0x56
[<0>] macb_probe+0x800/0xb5e [macb]
[<0>] platform_drv_probe+0x3e/0x7e
[<0>] really_probe+0xba/0x310
[<0>] driver_probe_device+0x54/0xf2
[<0>] __driver_attach+0xde/0x106
[<0>] bus_for_each_dev+0x4a/0x72
[<0>] driver_attach+0x1a/0x22
[<0>] bus_add_driver+0x1ce/0x212
[<0>] driver_register+0x3a/0xd0
[<0>] __platform_driver_register+0x3a/0x42
[<0>] macb_driver_init+0x20/0x28 [macb]
[<0>] do_one_initcall+0x48/0x128
[<0>] do_init_module+0x4a/0x186
[<0>] load_module+0xd6a/0xe6a
[<0>] sys_finit_module+0xc6/0xfc
[<0>] check_syscall_nr+0x22/0x22

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 14:57   ` [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions schwab
  2018-11-19 14:57     ` Andreas Schwab
@ 2018-11-19 15:10     ` andrew
  2018-11-19 15:10       ` Andrew Lunn
  2018-11-19 15:13       ` schwab
  2018-11-20 13:48     ` [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config schwab
  2 siblings, 2 replies; 34+ messages in thread
From: andrew @ 2018-11-19 15:10 UTC (permalink / raw)
  To: linux-riscv

On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> 
> > The Microsemi PHYs have multiple banks of registers (called pages).
> > Registers can only be accessed from one page, if we need a register from
> > another page, we need to switch the page and the registers of all other
> > pages are not accessible anymore.
> >
> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> >
> > In order to guarantee that two concurrent functions do not change the
> > page, we need to do some locking per page. This can be achieved with the
> > use of phy_select_page and phy_restore_page functions but phy_write/read
> > calls in-between those two functions shall be replaced by their
> > lock-free alternative __phy_write/read.
> >
> > Let's migrate this driver to those functions.
> 
> This has some serious locking problem.

Hi Andreas

Could you be more specific. Are you getting a deadlock? A WARN_ON?

Thanks,
	Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:10     ` andrew
@ 2018-11-19 15:10       ` Andrew Lunn
  2018-11-19 15:13       ` schwab
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Lunn @ 2018-11-19 15:10 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: alexandre.belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> 
> > The Microsemi PHYs have multiple banks of registers (called pages).
> > Registers can only be accessed from one page, if we need a register from
> > another page, we need to switch the page and the registers of all other
> > pages are not accessible anymore.
> >
> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> >
> > In order to guarantee that two concurrent functions do not change the
> > page, we need to do some locking per page. This can be achieved with the
> > use of phy_select_page and phy_restore_page functions but phy_write/read
> > calls in-between those two functions shall be replaced by their
> > lock-free alternative __phy_write/read.
> >
> > Let's migrate this driver to those functions.
> 
> This has some serious locking problem.

Hi Andreas

Could you be more specific. Are you getting a deadlock? A WARN_ON?

Thanks,
	Andrew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:10     ` andrew
  2018-11-19 15:10       ` Andrew Lunn
@ 2018-11-19 15:13       ` schwab
  2018-11-19 15:13         ` Andreas Schwab
  2018-11-19 15:28         ` andrew
  1 sibling, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-19 15:13 UTC (permalink / raw)
  To: linux-riscv

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
>> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
>> 
>> > The Microsemi PHYs have multiple banks of registers (called pages).
>> > Registers can only be accessed from one page, if we need a register from
>> > another page, we need to switch the page and the registers of all other
>> > pages are not accessible anymore.
>> >
>> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
>> > phy_read(phydev, 5); but you need to set the desired page beforehand.
>> >
>> > In order to guarantee that two concurrent functions do not change the
>> > page, we need to do some locking per page. This can be achieved with the
>> > use of phy_select_page and phy_restore_page functions but phy_write/read
>> > calls in-between those two functions shall be replaced by their
>> > lock-free alternative __phy_write/read.
>> >
>> > Let's migrate this driver to those functions.
>> 
>> This has some serious locking problem.
>
> Hi Andreas
>
> Could you be more specific. Are you getting a deadlock? A WARN_ON?

See the stack trace.  That's where it hangs.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:13       ` schwab
@ 2018-11-19 15:13         ` Andreas Schwab
  2018-11-19 15:28         ` andrew
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-19 15:13 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: alexandre.belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
>> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
>> 
>> > The Microsemi PHYs have multiple banks of registers (called pages).
>> > Registers can only be accessed from one page, if we need a register from
>> > another page, we need to switch the page and the registers of all other
>> > pages are not accessible anymore.
>> >
>> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
>> > phy_read(phydev, 5); but you need to set the desired page beforehand.
>> >
>> > In order to guarantee that two concurrent functions do not change the
>> > page, we need to do some locking per page. This can be achieved with the
>> > use of phy_select_page and phy_restore_page functions but phy_write/read
>> > calls in-between those two functions shall be replaced by their
>> > lock-free alternative __phy_write/read.
>> >
>> > Let's migrate this driver to those functions.
>> 
>> This has some serious locking problem.
>
> Hi Andreas
>
> Could you be more specific. Are you getting a deadlock? A WARN_ON?

See the stack trace.  That's where it hangs.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:13       ` schwab
  2018-11-19 15:13         ` Andreas Schwab
@ 2018-11-19 15:28         ` andrew
  2018-11-19 15:28           ` Andrew Lunn
  2018-11-19 15:40           ` alexandre.belloni
  1 sibling, 2 replies; 34+ messages in thread
From: andrew @ 2018-11-19 15:28 UTC (permalink / raw)
  To: linux-riscv

On Mon, Nov 19, 2018 at 04:13:10PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> >> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> >> 
> >> > The Microsemi PHYs have multiple banks of registers (called pages).
> >> > Registers can only be accessed from one page, if we need a register from
> >> > another page, we need to switch the page and the registers of all other
> >> > pages are not accessible anymore.
> >> >
> >> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> >> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> >> >
> >> > In order to guarantee that two concurrent functions do not change the
> >> > page, we need to do some locking per page. This can be achieved with the
> >> > use of phy_select_page and phy_restore_page functions but phy_write/read
> >> > calls in-between those two functions shall be replaced by their
> >> > lock-free alternative __phy_write/read.
> >> >
> >> > Let's migrate this driver to those functions.
> >> 
> >> This has some serious locking problem.
> >
> > Hi Andreas
> >
> > Could you be more specific. Are you getting a deadlock? A WARN_ON?
> 
> See the stack trace.  That's where it hangs.

So you never said it hangs. The stacktrace helps, but a description of
what actually happens also helps. And i expect Quentin has booted this
code lots of times and not had a hang. So some hits how to reproduce
it would also help. Maybe your kernel config?

I'm interested because he is using the core mdio locking
primitives. If those are broken, i want to know.

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:28         ` andrew
@ 2018-11-19 15:28           ` Andrew Lunn
  2018-11-19 15:40           ` alexandre.belloni
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Lunn @ 2018-11-19 15:28 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: alexandre.belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Mon, Nov 19, 2018 at 04:13:10PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> >> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> >> 
> >> > The Microsemi PHYs have multiple banks of registers (called pages).
> >> > Registers can only be accessed from one page, if we need a register from
> >> > another page, we need to switch the page and the registers of all other
> >> > pages are not accessible anymore.
> >> >
> >> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> >> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> >> >
> >> > In order to guarantee that two concurrent functions do not change the
> >> > page, we need to do some locking per page. This can be achieved with the
> >> > use of phy_select_page and phy_restore_page functions but phy_write/read
> >> > calls in-between those two functions shall be replaced by their
> >> > lock-free alternative __phy_write/read.
> >> >
> >> > Let's migrate this driver to those functions.
> >> 
> >> This has some serious locking problem.
> >
> > Hi Andreas
> >
> > Could you be more specific. Are you getting a deadlock? A WARN_ON?
> 
> See the stack trace.  That's where it hangs.

So you never said it hangs. The stacktrace helps, but a description of
what actually happens also helps. And i expect Quentin has booted this
code lots of times and not had a hang. So some hits how to reproduce
it would also help. Maybe your kernel config?

I'm interested because he is using the core mdio locking
primitives. If those are broken, i want to know.

Thanks
	Andrew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:28         ` andrew
  2018-11-19 15:28           ` Andrew Lunn
@ 2018-11-19 15:40           ` alexandre.belloni
  2018-11-19 15:40             ` Alexandre Belloni
  2018-11-19 15:50             ` schwab
  1 sibling, 2 replies; 34+ messages in thread
From: alexandre.belloni @ 2018-11-19 15:40 UTC (permalink / raw)
  To: linux-riscv

On 19/11/2018 16:28:30+0100, Andrew Lunn wrote:
> On Mon, Nov 19, 2018 at 04:13:10PM +0100, Andreas Schwab wrote:
> > On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> > 
> > > On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> > >> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> > >> 
> > >> > The Microsemi PHYs have multiple banks of registers (called pages).
> > >> > Registers can only be accessed from one page, if we need a register from
> > >> > another page, we need to switch the page and the registers of all other
> > >> > pages are not accessible anymore.
> > >> >
> > >> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> > >> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> > >> >
> > >> > In order to guarantee that two concurrent functions do not change the
> > >> > page, we need to do some locking per page. This can be achieved with the
> > >> > use of phy_select_page and phy_restore_page functions but phy_write/read
> > >> > calls in-between those two functions shall be replaced by their
> > >> > lock-free alternative __phy_write/read.
> > >> >
> > >> > Let's migrate this driver to those functions.
> > >> 
> > >> This has some serious locking problem.
> > >
> > > Hi Andreas
> > >
> > > Could you be more specific. Are you getting a deadlock? A WARN_ON?
> > 
> > See the stack trace.  That's where it hangs.
> 
> So you never said it hangs. The stacktrace helps, but a description of
> what actually happens also helps. And i expect Quentin has booted this
> code lots of times and not had a hang. So some hits how to reproduce
> it would also help. Maybe your kernel config?
> 
> I'm interested because he is using the core mdio locking
> primitives. If those are broken, i want to know.
> 

My first intuition is that he mac driver quentin is using does
phy_connect when the interface is opened while macb is doing it at probe
time. I didn't investigate but maybe this can help :)

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:40           ` alexandre.belloni
@ 2018-11-19 15:40             ` Alexandre Belloni
  2018-11-19 15:50             ` schwab
  1 sibling, 0 replies; 34+ messages in thread
From: Alexandre Belloni @ 2018-11-19 15:40 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: f.fainelli, Quentin Schulz, Andreas Schwab, linux-kernel,
	allan.nielsen, thomas.petazzoni, netdev, linux-riscv, davem

On 19/11/2018 16:28:30+0100, Andrew Lunn wrote:
> On Mon, Nov 19, 2018 at 04:13:10PM +0100, Andreas Schwab wrote:
> > On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> > 
> > > On Mon, Nov 19, 2018 at 03:57:17PM +0100, Andreas Schwab wrote:
> > >> On Okt 08 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> > >> 
> > >> > The Microsemi PHYs have multiple banks of registers (called pages).
> > >> > Registers can only be accessed from one page, if we need a register from
> > >> > another page, we need to switch the page and the registers of all other
> > >> > pages are not accessible anymore.
> > >> >
> > >> > Basically, to read register 5 from page 0, 1, 2, etc., you do the same
> > >> > phy_read(phydev, 5); but you need to set the desired page beforehand.
> > >> >
> > >> > In order to guarantee that two concurrent functions do not change the
> > >> > page, we need to do some locking per page. This can be achieved with the
> > >> > use of phy_select_page and phy_restore_page functions but phy_write/read
> > >> > calls in-between those two functions shall be replaced by their
> > >> > lock-free alternative __phy_write/read.
> > >> >
> > >> > Let's migrate this driver to those functions.
> > >> 
> > >> This has some serious locking problem.
> > >
> > > Hi Andreas
> > >
> > > Could you be more specific. Are you getting a deadlock? A WARN_ON?
> > 
> > See the stack trace.  That's where it hangs.
> 
> So you never said it hangs. The stacktrace helps, but a description of
> what actually happens also helps. And i expect Quentin has booted this
> code lots of times and not had a hang. So some hits how to reproduce
> it would also help. Maybe your kernel config?
> 
> I'm interested because he is using the core mdio locking
> primitives. If those are broken, i want to know.
> 

My first intuition is that he mac driver quentin is using does
phy_connect when the interface is opened while macb is doing it at probe
time. I didn't investigate but maybe this can help :)

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:40           ` alexandre.belloni
  2018-11-19 15:40             ` Alexandre Belloni
@ 2018-11-19 15:50             ` schwab
  2018-11-19 15:50               ` Andreas Schwab
  2018-11-19 16:12               ` andrew
  1 sibling, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-19 15:50 UTC (permalink / raw)
  To: linux-riscv

On Nov 19 2018, Alexandre Belloni <alexandre.belloni@bootlin.com> wrote:

> My first intuition is that he mac driver quentin is using does
> phy_connect when the interface is opened while macb is doing it at probe
> time. I didn't investigate but maybe this can help :)

The phy probing of macb is very unreliable, perhaps it needs to be
rewritten.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:50             ` schwab
@ 2018-11-19 15:50               ` Andreas Schwab
  2018-11-19 16:12               ` andrew
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-19 15:50 UTC (permalink / raw)
  To: Alexandre Belloni
  Cc: Andrew Lunn, f.fainelli, Quentin Schulz, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

On Nov 19 2018, Alexandre Belloni <alexandre.belloni@bootlin.com> wrote:

> My first intuition is that he mac driver quentin is using does
> phy_connect when the interface is opened while macb is doing it at probe
> time. I didn't investigate but maybe this can help :)

The phy probing of macb is very unreliable, perhaps it needs to be
rewritten.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 15:50             ` schwab
  2018-11-19 15:50               ` Andreas Schwab
@ 2018-11-19 16:12               ` andrew
  2018-11-19 16:12                 ` Andrew Lunn
                                   ` (2 more replies)
  1 sibling, 3 replies; 34+ messages in thread
From: andrew @ 2018-11-19 16:12 UTC (permalink / raw)
  To: linux-riscv

On Mon, Nov 19, 2018 at 04:50:14PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Alexandre Belloni <alexandre.belloni@bootlin.com> wrote:
> 
> > My first intuition is that he mac driver quentin is using does
> > phy_connect when the interface is opened while macb is doing it at probe
> > time. I didn't investigate but maybe this can help :)
> 
> The phy probing of macb is very unreliable, perhaps it needs to be
> rewritten.

Hi Andreas

I still don't see why that would cause a hang.

Could you turn on lockdep and see if it reports a deadlock.

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:12               ` andrew
@ 2018-11-19 16:12                 ` Andrew Lunn
  2018-11-19 16:14                 ` schwab
  2018-11-20 11:39                 ` schwab
  2 siblings, 0 replies; 34+ messages in thread
From: Andrew Lunn @ 2018-11-19 16:12 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Mon, Nov 19, 2018 at 04:50:14PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Alexandre Belloni <alexandre.belloni@bootlin.com> wrote:
> 
> > My first intuition is that he mac driver quentin is using does
> > phy_connect when the interface is opened while macb is doing it at probe
> > time. I didn't investigate but maybe this can help :)
> 
> The phy probing of macb is very unreliable, perhaps it needs to be
> rewritten.

Hi Andreas

I still don't see why that would cause a hang.

Could you turn on lockdep and see if it reports a deadlock.

Thanks
	Andrew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:12               ` andrew
  2018-11-19 16:12                 ` Andrew Lunn
@ 2018-11-19 16:14                 ` schwab
  2018-11-19 16:14                   ` Andreas Schwab
  2018-11-19 16:25                   ` andrew
  2018-11-20 11:39                 ` schwab
  2 siblings, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-19 16:14 UTC (permalink / raw)
  To: linux-riscv

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> Could you turn on lockdep and see if it reports a deadlock.

How do I "turn on lockdep"?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:14                 ` schwab
@ 2018-11-19 16:14                   ` Andreas Schwab
  2018-11-19 16:25                   ` andrew
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-19 16:14 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> Could you turn on lockdep and see if it reports a deadlock.

How do I "turn on lockdep"?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:14                 ` schwab
  2018-11-19 16:14                   ` Andreas Schwab
@ 2018-11-19 16:25                   ` andrew
  2018-11-19 16:25                     ` Andrew Lunn
  2018-11-19 16:32                     ` schwab
  1 sibling, 2 replies; 34+ messages in thread
From: andrew @ 2018-11-19 16:25 UTC (permalink / raw)
  To: linux-riscv

On Mon, Nov 19, 2018 at 05:14:38PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > Could you turn on lockdep and see if it reports a deadlock.
> 
> How do I "turn on lockdep"?

make menuconfig
Kernel hacking
Lock Debugging (spinlocks, mutexes, etc...) 
Lock debugging: prove locking correctness

It has changes its name at some point. it used to be called
CONFIG_LOCKDEP, for locking dependencies. Anybody who has been kernel
hacking for a while still calls it lockdep.

	Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:25                   ` andrew
@ 2018-11-19 16:25                     ` Andrew Lunn
  2018-11-19 16:32                     ` schwab
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Lunn @ 2018-11-19 16:25 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Mon, Nov 19, 2018 at 05:14:38PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > Could you turn on lockdep and see if it reports a deadlock.
> 
> How do I "turn on lockdep"?

make menuconfig
Kernel hacking
Lock Debugging (spinlocks, mutexes, etc...) 
Lock debugging: prove locking correctness

It has changes its name at some point. it used to be called
CONFIG_LOCKDEP, for locking dependencies. Anybody who has been kernel
hacking for a while still calls it lockdep.

	Andrew

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:25                   ` andrew
  2018-11-19 16:25                     ` Andrew Lunn
@ 2018-11-19 16:32                     ` schwab
  2018-11-19 16:32                       ` Andreas Schwab
  2018-11-19 16:44                       ` andrew
  1 sibling, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-19 16:32 UTC (permalink / raw)
  To: linux-riscv

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> Lock debugging: prove locking correctness

404

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:32                     ` schwab
@ 2018-11-19 16:32                       ` Andreas Schwab
  2018-11-19 16:44                       ` andrew
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-19 16:32 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> Lock debugging: prove locking correctness

404

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:32                     ` schwab
  2018-11-19 16:32                       ` Andreas Schwab
@ 2018-11-19 16:44                       ` andrew
  2018-11-19 16:44                         ` Andrew Lunn
  1 sibling, 1 reply; 34+ messages in thread
From: andrew @ 2018-11-19 16:44 UTC (permalink / raw)
  To: linux-riscv

On Mon, Nov 19, 2018 at 05:32:14PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > Lock debugging: prove locking correctness
> 

Lets see how much this gets messed up by email...

.config - Linux/arm 4.20.0-rc2 Kernel Configuration
 > Kernel hacking > Lock Debugging (spinlocks, mutexes, etc...) ??????
  ?????????? Lock Debugging (spinlocks, mutexes, etc...) ???????????
  ?  Arrow keys navigate the menu.  <Enter> selects submenus --->  ?  
  ?  (or empty submenus ----).  Highlighted letters are hotkeys.   ?  
  ?  Pressing <Y> includes, <N> excludes, <M> modularizes          ?  
  ?  features.  Press <Esc><Esc> to exit, <?> for Help, </> for    ?  
  ? ?????????????????????????????????????????????????????????????? ?  
  ? ?    [*] Lock debugging: prove locking correctness           ? ?  
  ? ?    [ ] Lock usage statistics                               ? ?  
  ? ?    -*- RT Mutex debugging, deadlock detection              ? ?  
  ? ?    -*- Spinlock and rw-lock debugging: basic checks        ? ?  
  ? ?    -*- Mutex debugging: basic checks                       ? ?  
  ? ?    -*- Wait/wound mutex debugging: Slowpath testing        ? ?  
  ? ?    -*- RW Semaphore debugging: basic checks                ? ?  
  ? ?    -*- Lock debugging: detect incorrect freeing of live loc? ?  
  ? ?    [ ] Lock dependency engine debugging                    ? ?  
  ? ?    [*] Sleep inside atomic section checking                ? ?  
  ? ?    [ ] Locking API boot-time self-tests                    ? ?  
  ? ?    < > torture tests for locking                           ? ?  
  ? ?    < > Wait/wound mutex selftests                          ? ?  
  ? ?????????????????????????????????????????????????????????????? ?  
  ??????????????????????????????????????????????????????????????????  
  ?    <Select>    < Exit >    < Help >    < Save >    < Load >    ?  
  ??????????????????????????????????????????????????????????????????  
    
If you cannot see it at all, it probably means you are missing a
dependency:

  ?   Depends on: DEBUG_KERNEL [=y] && \                           ?  
  ? LOCK_DEBUGGING_SUPPORT [=y]                                    ?  

    Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:44                       ` andrew
@ 2018-11-19 16:44                         ` Andrew Lunn
  0 siblings, 0 replies; 34+ messages in thread
From: Andrew Lunn @ 2018-11-19 16:44 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Mon, Nov 19, 2018 at 05:32:14PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > Lock debugging: prove locking correctness
> 

Lets see how much this gets messed up by email...

.config - Linux/arm 4.20.0-rc2 Kernel Configuration
 > Kernel hacking > Lock Debugging (spinlocks, mutexes, etc...) ──────
  ┌───────── Lock Debugging (spinlocks, mutexes, etc...) ──────────┐
  │  Arrow keys navigate the menu.  <Enter> selects submenus --->  │  
  │  (or empty submenus ----).  Highlighted letters are hotkeys.   │  
  │  Pressing <Y> includes, <N> excludes, <M> modularizes          │  
  │  features.  Press <Esc><Esc> to exit, <?> for Help, </> for    │  
  │ ┌────────────────────────────────────────────────────────────┐ │  
  │ │    [*] Lock debugging: prove locking correctness           │ │  
  │ │    [ ] Lock usage statistics                               │ │  
  │ │    -*- RT Mutex debugging, deadlock detection              │ │  
  │ │    -*- Spinlock and rw-lock debugging: basic checks        │ │  
  │ │    -*- Mutex debugging: basic checks                       │ │  
  │ │    -*- Wait/wound mutex debugging: Slowpath testing        │ │  
  │ │    -*- RW Semaphore debugging: basic checks                │ │  
  │ │    -*- Lock debugging: detect incorrect freeing of live loc│ │  
  │ │    [ ] Lock dependency engine debugging                    │ │  
  │ │    [*] Sleep inside atomic section checking                │ │  
  │ │    [ ] Locking API boot-time self-tests                    │ │  
  │ │    < > torture tests for locking                           │ │  
  │ │    < > Wait/wound mutex selftests                          │ │  
  │ └────────────────────────────────────────────────────────────┘ │  
  ├────────────────────────────────────────────────────────────────┤  
  │    <Select>    < Exit >    < Help >    < Save >    < Load >    │  
  └────────────────────────────────────────────────────────────────┘  
    
If you cannot see it at all, it probably means you are missing a
dependency:

  │   Depends on: DEBUG_KERNEL [=y] && \                           │  
  │ LOCK_DEBUGGING_SUPPORT [=y]                                    │  

    Andrew



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-19 16:12               ` andrew
  2018-11-19 16:12                 ` Andrew Lunn
  2018-11-19 16:14                 ` schwab
@ 2018-11-20 11:39                 ` schwab
  2018-11-20 11:39                   ` Andreas Schwab
  2018-11-20 13:20                   ` quentin.schulz
  2 siblings, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-20 11:39 UTC (permalink / raw)
  To: linux-riscv

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> I still don't see why that would cause a hang.

[  996.370000] [<ffffffe0011d7324>] mutex_lock+0x22/0x2a
[  996.380000] [<ffffffe001039ba0>] mdiobus_read+0x36/0x60
[  996.380000] [<ffffffe00103b82c>] vsc85xx_config_init+0x4c/0x1e2

vsc85xx_config_init is calling mdiobus_read while holding the mdio_lock.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-20 11:39                 ` schwab
@ 2018-11-20 11:39                   ` Andreas Schwab
  2018-11-20 13:20                   ` quentin.schulz
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-20 11:39 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Alexandre Belloni, f.fainelli, Quentin Schulz, netdev,
	linux-kernel, allan.nielsen, thomas.petazzoni, linux-riscv,
	davem

On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:

> I still don't see why that would cause a hang.

[  996.370000] [<ffffffe0011d7324>] mutex_lock+0x22/0x2a
[  996.380000] [<ffffffe001039ba0>] mdiobus_read+0x36/0x60
[  996.380000] [<ffffffe00103b82c>] vsc85xx_config_init+0x4c/0x1e2

vsc85xx_config_init is calling mdiobus_read while holding the mdio_lock.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-20 11:39                 ` schwab
  2018-11-20 11:39                   ` Andreas Schwab
@ 2018-11-20 13:20                   ` quentin.schulz
  2018-11-20 13:20                     ` Quentin Schulz
  1 sibling, 1 reply; 34+ messages in thread
From: quentin.schulz @ 2018-11-20 13:20 UTC (permalink / raw)
  To: linux-riscv

Hi Andreas,

On Tue, Nov 20, 2018 at 12:39:51PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > I still don't see why that would cause a hang.
> 
> [  996.370000] [<ffffffe0011d7324>] mutex_lock+0x22/0x2a
> [  996.380000] [<ffffffe001039ba0>] mdiobus_read+0x36/0x60
> [  996.380000] [<ffffffe00103b82c>] vsc85xx_config_init+0x4c/0x1e2
> 
> vsc85xx_config_init is calling mdiobus_read while holding the mdio_lock.
> 

Thanks for reporting the bug. However, it could be nice the next time
you report a bug to give as much info as you can, such as the device
you're using. Architecture of your platform (or the board itself if it's
a DT in the Linux kernel), the MAC and the PHY IPs (the Microsemi PHY
driver now supports 6 different PHYs that do not work the same way so I
can't guess the one you're using), they all matter and help to
understand your bug. Also, if you could give a full bootlog of the
kernel and how it's possible to reproduce your issue (.config of your
kernel, on which commit on which branch you built your kernel).
Otherwise, you might not have any answer or considerably slow down the
bug identifying and fixing processes, which is annoying to you,
the users, the developers and the maintainers: a lot of upset people :)

Could you try something please? I'm pretty sure I found an issue (I
don't know if it's fixing your issue or not, but it might be).

Please replace in vsc85xx_default_config() the phy_read and phy_write by
respectively __phy_read and __phy_write (note the __ in front).

Let me know if you need help with this.

Thanks,
Quentin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20181120/305099c4/attachment.sig>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions
  2018-11-20 13:20                   ` quentin.schulz
@ 2018-11-20 13:20                     ` Quentin Schulz
  0 siblings, 0 replies; 34+ messages in thread
From: Quentin Schulz @ 2018-11-20 13:20 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Andrew Lunn, Alexandre Belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

[-- Attachment #1.1: Type: text/plain, Size: 1649 bytes --]

Hi Andreas,

On Tue, Nov 20, 2018 at 12:39:51PM +0100, Andreas Schwab wrote:
> On Nov 19 2018, Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > I still don't see why that would cause a hang.
> 
> [  996.370000] [<ffffffe0011d7324>] mutex_lock+0x22/0x2a
> [  996.380000] [<ffffffe001039ba0>] mdiobus_read+0x36/0x60
> [  996.380000] [<ffffffe00103b82c>] vsc85xx_config_init+0x4c/0x1e2
> 
> vsc85xx_config_init is calling mdiobus_read while holding the mdio_lock.
> 

Thanks for reporting the bug. However, it could be nice the next time
you report a bug to give as much info as you can, such as the device
you're using. Architecture of your platform (or the board itself if it's
a DT in the Linux kernel), the MAC and the PHY IPs (the Microsemi PHY
driver now supports 6 different PHYs that do not work the same way so I
can't guess the one you're using), they all matter and help to
understand your bug. Also, if you could give a full bootlog of the
kernel and how it's possible to reproduce your issue (.config of your
kernel, on which commit on which branch you built your kernel).
Otherwise, you might not have any answer or considerably slow down the
bug identifying and fixing processes, which is annoying to you,
the users, the developers and the maintainers: a lot of upset people :)

Could you try something please? I'm pretty sure I found an issue (I
don't know if it's fixing your issue or not, but it might be).

Please replace in vsc85xx_default_config() the phy_read and phy_write by
respectively __phy_read and __phy_write (note the __ in front).

Let me know if you need help with this.

Thanks,
Quentin

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-19 14:57   ` [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions schwab
  2018-11-19 14:57     ` Andreas Schwab
  2018-11-19 15:10     ` andrew
@ 2018-11-20 13:48     ` schwab
  2018-11-20 13:48       ` Andreas Schwab
  2018-11-20 13:55       ` quentin.schulz
  2 siblings, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-20 13:48 UTC (permalink / raw)
  To: linux-riscv

Don't use locking phy_read/phy_write functions while inside
phy_select_patch/phy_restore_page region.

Fixes: 6a0bfbbe20 ("net: phy: mscc: migrate to phy_select/restore_page functions")
Signed-off-by: Andreas Schwab <schwab@suse.de>
---
 drivers/net/phy/mscc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index a2e59f4f6f..a398ab9410 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -814,10 +814,10 @@ static int vsc85xx_default_config(struct phy_device *phydev)
 	if (rc < 0)
 		goto out_unlock;
 
-	reg_val = phy_read(phydev, MSCC_PHY_RGMII_CNTL);
+	reg_val = __phy_read(phydev, MSCC_PHY_RGMII_CNTL);
 	reg_val &= ~(RGMII_RX_CLK_DELAY_MASK);
 	reg_val |= (RGMII_RX_CLK_DELAY_1_1_NS << RGMII_RX_CLK_DELAY_POS);
-	phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
+	__phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
 
 out_unlock:
 	rc = phy_restore_page(phydev, rc, rc > 0 ? 0 : rc);
-- 
2.19.1


-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 13:48     ` [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config schwab
@ 2018-11-20 13:48       ` Andreas Schwab
  2018-11-20 13:55       ` quentin.schulz
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-20 13:48 UTC (permalink / raw)
  To: Quentin Schulz
  Cc: andrew, alexandre.belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

Don't use locking phy_read/phy_write functions while inside
phy_select_patch/phy_restore_page region.

Fixes: 6a0bfbbe20 ("net: phy: mscc: migrate to phy_select/restore_page functions")
Signed-off-by: Andreas Schwab <schwab@suse.de>
---
 drivers/net/phy/mscc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index a2e59f4f6f..a398ab9410 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -814,10 +814,10 @@ static int vsc85xx_default_config(struct phy_device *phydev)
 	if (rc < 0)
 		goto out_unlock;
 
-	reg_val = phy_read(phydev, MSCC_PHY_RGMII_CNTL);
+	reg_val = __phy_read(phydev, MSCC_PHY_RGMII_CNTL);
 	reg_val &= ~(RGMII_RX_CLK_DELAY_MASK);
 	reg_val |= (RGMII_RX_CLK_DELAY_1_1_NS << RGMII_RX_CLK_DELAY_POS);
-	phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
+	__phy_write(phydev, MSCC_PHY_RGMII_CNTL, reg_val);
 
 out_unlock:
 	rc = phy_restore_page(phydev, rc, rc > 0 ? 0 : rc);
-- 
2.19.1


-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 13:48     ` [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config schwab
  2018-11-20 13:48       ` Andreas Schwab
@ 2018-11-20 13:55       ` quentin.schulz
  2018-11-20 13:55         ` Quentin Schulz
  2018-11-20 14:01         ` schwab
  1 sibling, 2 replies; 34+ messages in thread
From: quentin.schulz @ 2018-11-20 13:55 UTC (permalink / raw)
  To: linux-riscv

Hi Andreas,

On Tue, Nov 20, 2018 at 02:48:37PM +0100, Andreas Schwab wrote:
> Don't use locking phy_read/phy_write functions while inside
> phy_select_patch/phy_restore_page region.
> 
> Fixes: 6a0bfbbe20 ("net: phy: mscc: migrate to phy_select/restore_page functions")
> Signed-off-by: Andreas Schwab <schwab@suse.de>

Have you tested this patch? Does it solve your issue?

Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
anyway since it's to be done.

You need to send your patch as a mail separate to this patch series.
You also need to prefix your patch with [PATCH net-next] instead of only
[PATCH]. This is specific to the net subsystem.

Use scripts/get_maintainer.pl on your patch to get the people to put in
Cc and in To.

Thanks,
Quentin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20181120/eae2d892/attachment.sig>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 13:55       ` quentin.schulz
@ 2018-11-20 13:55         ` Quentin Schulz
  2018-11-20 14:01         ` schwab
  1 sibling, 0 replies; 34+ messages in thread
From: Quentin Schulz @ 2018-11-20 13:55 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: andrew, alexandre.belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

[-- Attachment #1.1: Type: text/plain, Size: 781 bytes --]

Hi Andreas,

On Tue, Nov 20, 2018 at 02:48:37PM +0100, Andreas Schwab wrote:
> Don't use locking phy_read/phy_write functions while inside
> phy_select_patch/phy_restore_page region.
> 
> Fixes: 6a0bfbbe20 ("net: phy: mscc: migrate to phy_select/restore_page functions")
> Signed-off-by: Andreas Schwab <schwab@suse.de>

Have you tested this patch? Does it solve your issue?

Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
anyway since it's to be done.

You need to send your patch as a mail separate to this patch series.
You also need to prefix your patch with [PATCH net-next] instead of only
[PATCH]. This is specific to the net subsystem.

Use scripts/get_maintainer.pl on your patch to get the people to put in
Cc and in To.

Thanks,
Quentin

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 13:55       ` quentin.schulz
  2018-11-20 13:55         ` Quentin Schulz
@ 2018-11-20 14:01         ` schwab
  2018-11-20 14:01           ` Andreas Schwab
  2018-11-20 14:17           ` quentin.schulz
  1 sibling, 2 replies; 34+ messages in thread
From: schwab @ 2018-11-20 14:01 UTC (permalink / raw)
  To: linux-riscv

On Nov 20 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:

> You also need to prefix your patch with [PATCH net-next] instead of only
> [PATCH]. This is specific to the net subsystem.

Why next?  This is a bug fix that needs to go in now, not next.

> Use scripts/get_maintainer.pl on your patch to get the people to put in
> Cc and in To.

The same people as the original patch the added the bug.  Why would it
be different now?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab at suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 14:01         ` schwab
@ 2018-11-20 14:01           ` Andreas Schwab
  2018-11-20 14:17           ` quentin.schulz
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2018-11-20 14:01 UTC (permalink / raw)
  To: Quentin Schulz
  Cc: andrew, alexandre.belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

On Nov 20 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:

> You also need to prefix your patch with [PATCH net-next] instead of only
> [PATCH]. This is specific to the net subsystem.

Why next?  This is a bug fix that needs to go in now, not next.

> Use scripts/get_maintainer.pl on your patch to get the people to put in
> Cc and in To.

The same people as the original patch the added the bug.  Why would it
be different now?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 14:01         ` schwab
  2018-11-20 14:01           ` Andreas Schwab
@ 2018-11-20 14:17           ` quentin.schulz
  2018-11-20 14:17             ` Quentin Schulz
  1 sibling, 1 reply; 34+ messages in thread
From: quentin.schulz @ 2018-11-20 14:17 UTC (permalink / raw)
  To: linux-riscv

Hi Andreas,

On Tue, Nov 20, 2018 at 03:01:25PM +0100, Andreas Schwab wrote:
> On Nov 20 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> 
> > You also need to prefix your patch with [PATCH net-next] instead of only
> > [PATCH]. This is specific to the net subsystem.
> 
> Why next?  This is a bug fix that needs to go in now, not next.
> 

Indeed. Use [PATCH net] then.
Please refer to https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

> > Use scripts/get_maintainer.pl on your patch to get the people to put in
> > Cc and in To.
> 
> The same people as the original patch the added the bug.  Why would it
> be different now?
> 

First, people may have changed mail addresses (especially maintainers)
and that is picked by get_maintainer.

Second, other people may have worked on this driver since the patch was
introduced and they might want to know about this change too and have
their word on it. This is what happened with this driver. Gustavo and
Heiner sent patches since then and they have already been merged.

This is also part of the contribution process which is defined here:
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html

You have not answered my question yet. Does this fix your issue, yes or
no?

Quentin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20181120/8b0ae393/attachment.sig>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config
  2018-11-20 14:17           ` quentin.schulz
@ 2018-11-20 14:17             ` Quentin Schulz
  0 siblings, 0 replies; 34+ messages in thread
From: Quentin Schulz @ 2018-11-20 14:17 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: andrew, alexandre.belloni, f.fainelli, netdev, linux-kernel,
	allan.nielsen, thomas.petazzoni, linux-riscv, davem

[-- Attachment #1.1: Type: text/plain, Size: 1309 bytes --]

Hi Andreas,

On Tue, Nov 20, 2018 at 03:01:25PM +0100, Andreas Schwab wrote:
> On Nov 20 2018, Quentin Schulz <quentin.schulz@bootlin.com> wrote:
> 
> > You also need to prefix your patch with [PATCH net-next] instead of only
> > [PATCH]. This is specific to the net subsystem.
> 
> Why next?  This is a bug fix that needs to go in now, not next.
> 

Indeed. Use [PATCH net] then.
Please refer to https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

> > Use scripts/get_maintainer.pl on your patch to get the people to put in
> > Cc and in To.
> 
> The same people as the original patch the added the bug.  Why would it
> be different now?
> 

First, people may have changed mail addresses (especially maintainers)
and that is picked by get_maintainer.

Second, other people may have worked on this driver since the patch was
introduced and they might want to know about this change too and have
their word on it. This is what happened with this driver. Gustavo and
Heiner sent patches since then and they have already been merged.

This is also part of the contribution process which is defined here:
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html

You have not answered my question yet. Does this fix your issue, yes or
no?

Quentin

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, back to index

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20181008100728.24959-1-quentin.schulz@bootlin.com>
     [not found] ` <20181008100728.24959-2-quentin.schulz@bootlin.com>
2018-11-19 14:57   ` [PATCH net-next v3 1/6] net: phy: mscc: migrate to phy_select/restore_page functions schwab
2018-11-19 14:57     ` Andreas Schwab
2018-11-19 15:10     ` andrew
2018-11-19 15:10       ` Andrew Lunn
2018-11-19 15:13       ` schwab
2018-11-19 15:13         ` Andreas Schwab
2018-11-19 15:28         ` andrew
2018-11-19 15:28           ` Andrew Lunn
2018-11-19 15:40           ` alexandre.belloni
2018-11-19 15:40             ` Alexandre Belloni
2018-11-19 15:50             ` schwab
2018-11-19 15:50               ` Andreas Schwab
2018-11-19 16:12               ` andrew
2018-11-19 16:12                 ` Andrew Lunn
2018-11-19 16:14                 ` schwab
2018-11-19 16:14                   ` Andreas Schwab
2018-11-19 16:25                   ` andrew
2018-11-19 16:25                     ` Andrew Lunn
2018-11-19 16:32                     ` schwab
2018-11-19 16:32                       ` Andreas Schwab
2018-11-19 16:44                       ` andrew
2018-11-19 16:44                         ` Andrew Lunn
2018-11-20 11:39                 ` schwab
2018-11-20 11:39                   ` Andreas Schwab
2018-11-20 13:20                   ` quentin.schulz
2018-11-20 13:20                     ` Quentin Schulz
2018-11-20 13:48     ` [PATCH] net: phy: mscc: fix locking in vsc85xx_default_config schwab
2018-11-20 13:48       ` Andreas Schwab
2018-11-20 13:55       ` quentin.schulz
2018-11-20 13:55         ` Quentin Schulz
2018-11-20 14:01         ` schwab
2018-11-20 14:01           ` Andreas Schwab
2018-11-20 14:17           ` quentin.schulz
2018-11-20 14:17             ` Quentin Schulz

Linux-RISC-V Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-riscv/0 linux-riscv/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-riscv linux-riscv/ https://lore.kernel.org/linux-riscv \
		linux-riscv@lists.infradead.org infradead-linux-riscv@archiver.kernel.org
	public-inbox-index linux-riscv


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-riscv


AGPL code for this site: git clone https://public-inbox.org/ public-inbox