* Re: Ntb Transport Driver Problem After Power-up
       [not found] <CAJQW-Q4_hJ+23T0qhK9BRMbcfqdF7rnJkWWPmGKfN5WzyD-eJg@mail.gmail.com>
@ 2017-11-27 16:39 ` Jon Mason
  2017-11-27 16:47   ` Logan Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Jon Mason @ 2017-11-27 16:39 UTC (permalink / raw)
  To: ThanhTuThai; +Cc: linux-ntb

(Ccing NTB mailing list)

On Sun, Nov 19, 2017 at 9:36 PM, ThanhTuThai <cruisethai@gmail.com> wrote:
> Dear Jon,
>
> We are using the ntb_transport driver from Linux with Microsemi's NTB hardware.
> We have a problem when one peer is suddenly powered off without removing the
> drivers: after it powers up again, the good peer cannot reconnect to it.
> The good peer needs to reload the drivers in order to reconnect.
> I guess the good peer needs to re-init something in order to catch up with
> the other one, but I don't know what it is.
>
> I know that when one peer starts, the driver sends out a message
> through the doorbell, and when the other peer catches that message, I can
> announce the ntb_transport link-down ( ntb_link_event(&sndev->ntb); ).
>
> But in this case, when one peer powers down and up, the good peer doesn't
> receive any message from it in the interrupt (switchtec_ntb_message_isr),
> although I have checked that it already sent out the message.
>
> Do you have any idea about it ?

It sounds like the link down/up isn't working properly.  Is the
Microsemi NTB not able to detect a link down?

Thanks,
Jon

>
> Thank you very much !


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 16:39 ` Ntb Transport Driver Problem After Power-up Jon Mason
@ 2017-11-27 16:47   ` Logan Gunthorpe
  2017-11-27 17:00     ` Jon Mason
  2017-11-28  0:14     ` Karl Kao
  0 siblings, 2 replies; 14+ messages in thread
From: Logan Gunthorpe @ 2017-11-27 16:47 UTC (permalink / raw)
  To: Jon Mason, ThanhTuThai; +Cc: linux-ntb



On 27/11/17 09:39 AM, Jon Mason wrote:
> It sounds like the link down/up isn't working properly.  Is the
> Microsemi NTB not able to detect a link down?

It can detect a link down, but the link doesn't actually go down in a 
lot of cases. If the host is just soft rebooted after a crash, I don't 
think the link will go down.

Logan


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 16:47   ` Logan Gunthorpe
@ 2017-11-27 17:00     ` Jon Mason
  2017-11-27 17:09       ` Logan Gunthorpe
  2017-11-28  0:14     ` Karl Kao
  1 sibling, 1 reply; 14+ messages in thread
From: Jon Mason @ 2017-11-27 17:00 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: ThanhTuThai, linux-ntb

On Mon, Nov 27, 2017 at 11:47 AM, Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
> On 27/11/17 09:39 AM, Jon Mason wrote:
>>
>> It sounds like the link down/up isn't working properly.  Is the
>> Microsemi NTB not able to detect a link down?
>
>
> It can detect a link down, but the link doesn't actually go down in a lot of
> cases. If the host is just soft rebooted after a crash, I don't think the
> link will go down.

This is really not optimal :(

We can have a SW watchdog timer to poll it (aka heartbeat) to detect
this, but that's going to eat cycles and could allow for a window
where the link is down and the sender is writing into oblivion.
Thoughts?
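
A minimal sketch of the heartbeat idea, written as plain userspace C rather than kernel code. It assumes each side bumps a counter that the other side can read (for example through a scratchpad register) and that the local side polls it on a timer. The names hb_monitor, hb_init and hb_poll are made up for illustration and are not part of the NTB API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical heartbeat monitor; none of these names exist in the NTB API. */
struct hb_monitor {
	uint32_t last_seen;      /* last counter value read from the peer */
	unsigned int missed;     /* consecutive polls with no change */
	unsigned int max_missed; /* polls tolerated before declaring link down */
	bool link_up;            /* software view of the link */
};

static void hb_init(struct hb_monitor *m, unsigned int max_missed)
{
	m->last_seen = 0;
	m->missed = 0;
	m->max_missed = max_missed;
	m->link_up = false;
}

/*
 * Call once per timer tick with the counter value read from the peer.
 * Returns true if the software link state changed on this tick.
 */
static bool hb_poll(struct hb_monitor *m, uint32_t peer_counter)
{
	bool was_up = m->link_up;

	if (peer_counter != m->last_seen) {
		/* The peer made progress, so treat it as alive. */
		m->last_seen = peer_counter;
		m->missed = 0;
		m->link_up = true;
	} else if (m->link_up && ++m->missed >= m->max_missed) {
		/* No progress for max_missed ticks: declare the link down. */
		m->link_up = false;
	}

	return m->link_up != was_up;
}
```

The window mentioned above is visible here: a dead peer is still treated as up for max_missed ticks, so a shorter poll interval narrows the window at the cost of more cycles.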

>
> Logan
>

* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 17:00     ` Jon Mason
@ 2017-11-27 17:09       ` Logan Gunthorpe
  2017-11-27 17:38         ` Jon Mason
  0 siblings, 1 reply; 14+ messages in thread
From: Logan Gunthorpe @ 2017-11-27 17:09 UTC (permalink / raw)
  To: Jon Mason; +Cc: ThanhTuThai, linux-ntb



On 27/11/17 10:00 AM, Jon Mason wrote:
> We can have a SW watchdog timer to poll it (aka heartbeat) to detect
> this, but that's going to eat cycles and could allow for a window
> where the link is down and the sender is writing into oblivion.
> Thoughts?

I think the easiest way is if we get a link up event, and we already 
think the link is up, then we just put the link down before sending a 
second link up event. I can probably look at doing something like that 
shortly. However, unfortunately, my setup isn't suited to test this as 
I'm actually looping back both partitions to a single host :(. I'll 
submit a patch that others can test though.

Logan
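
A minimal sketch of the approach described in this message, in plain userspace C and not taken from any actual patch: a link up event that arrives while the link is already considered up is turned into a link down followed by a link up, so upper layers get a chance to reinitialize. The names link_tracker, notify and handle_peer_event are hypothetical, and notify() only counts events as a stand-in for telling clients such as ntb_transport.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is not the switchtec driver code. */
enum link_event { EV_LINK_DOWN, EV_LINK_UP };

struct link_tracker {
	bool link_up;            /* software view of the NTB link */
	unsigned int downs_sent; /* link-down notifications sent to clients */
	unsigned int ups_sent;   /* link-up notifications sent to clients */
};

/* Stand-in for notifying upper layers such as ntb_transport. */
static void notify(struct link_tracker *t, enum link_event ev)
{
	if (ev == EV_LINK_UP)
		t->ups_sent++;
	else
		t->downs_sent++;
}

static void handle_peer_event(struct link_tracker *t, enum link_event ev)
{
	if (ev == EV_LINK_UP) {
		/* Peer restarted without us noticing: bounce the link first. */
		if (t->link_up)
			notify(t, EV_LINK_DOWN);
		t->link_up = true;
		notify(t, EV_LINK_UP);
		return;
	}

	/* Duplicate link-down events are still filtered out. */
	if (t->link_up) {
		t->link_up = false;
		notify(t, EV_LINK_DOWN);
	}
}
```

With this shape, a second link up from a rebooted peer produces one extra down and one extra up on the surviving side instead of being dropped.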


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 17:09       ` Logan Gunthorpe
@ 2017-11-27 17:38         ` Jon Mason
  2017-11-27 23:33           ` ThanhTuThai
  0 siblings, 1 reply; 14+ messages in thread
From: Jon Mason @ 2017-11-27 17:38 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: ThanhTuThai, linux-ntb

On Mon, Nov 27, 2017 at 12:09 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
> On 27/11/17 10:00 AM, Jon Mason wrote:
>>
>> We can have a SW watchdog timer to poll it (aka heartbeat) to detect
>> this, but that's going to eat cycles and could allow for a window
>> where the link is down and the sender is writing into oblivion.
>> Thoughts?
>
>
> I think the easiest way is if we get a link up event, and we already think
> the link is up, then we just put the link down before sending a second link
> up event. I can probably look at doing something like that shortly. However,
> unfortunately, my setup isn't suited to test this as I'm actually looping
> back both partitions to a single host :(. I'll submit a patch that others
> can test though.

Sounds great.  ThanhTuThai, can you test this?

>
> Logan
>

* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 17:38         ` Jon Mason
@ 2017-11-27 23:33           ` ThanhTuThai
  2017-11-27 23:35             ` Logan Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: ThanhTuThai @ 2017-11-27 23:33 UTC (permalink / raw)
  To: Jon Mason; +Cc: Logan Gunthorpe, linux-ntb

No, it doesn't.

I've also thought about this strategy, and implemented it in my setup.
For the soft reboot, it works well: I receive the link up event when the peer
completes rebooting.
But for a power-on reset (in this case, the switchtec is also power-reset),
I don't receive any link up message. So the good peer cannot reset the
link status as mentioned above. But if I reload the drivers on the good peer,
they work perfectly.

Thanks !

On Nov 28, 2017 1:38 AM, "Jon Mason" <jdmason@kudzu.us> wrote:

> On Mon, Nov 27, 2017 at 12:09 PM, Logan Gunthorpe <logang@deltatee.com>
> wrote:
> >
> >
> > On 27/11/17 10:00 AM, Jon Mason wrote:
> >>
> >> We can have a SW watchdog timer to poll it (aka heartbeat) to detect
> >> this, but that's going to eat cycles and could allow for a window
> >> where the link is down and the sender is writing into oblivion.
> >> Thoughts?
> >
> >
> > I think the easiest way is if we get a link up event, and we already
> think
> > the link is up, then we just put the link down before sending a second
> link
> > up event. I can probably look at doing something like that shortly.
> However,
> > unfortunately, my setup isn't suited to test this as I'm actually looping
> > back both partitions to a single host :(. I'll submit a patch that others
> > can test though.
>
> Sounds great.  ThanhTuThai, can you test this?
>
> >
> > Logan
> >

* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 23:33           ` ThanhTuThai
@ 2017-11-27 23:35             ` Logan Gunthorpe
  2017-11-27 23:44               ` ThanhTuThai
  0 siblings, 1 reply; 14+ messages in thread
From: Logan Gunthorpe @ 2017-11-27 23:35 UTC (permalink / raw)
  To: ThanhTuThai, Jon Mason; +Cc: linux-ntb



On 27/11/17 04:33 PM, ThanhTuThai wrote:
> No, it doesn't.
> 
> I've also thought about this strategy, and implemented it in my setup.
> For the soft reboot, it works well: I receive the link up event when
> the peer completes rebooting.
> But for a power-on reset (in this case, the switchtec is also power-reset),
> I don't receive any link up message. So the good peer cannot
> reset the link status as mentioned above. But if I reload the drivers on the
> good peer, they work perfectly.

That's because it's filtering out the link up because it already thinks 
the link is up. Please wait until I send a patch and test that. It'll be 
in the next day or two.

Thanks,

Logan


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 23:35             ` Logan Gunthorpe
@ 2017-11-27 23:44               ` ThanhTuThai
  0 siblings, 0 replies; 14+ messages in thread
From: ThanhTuThai @ 2017-11-27 23:44 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: jdmason, linux-ntb

Ok, sounds good.

Thanks, Logan!

On Nov 28, 2017 7:35 AM, "Logan Gunthorpe" <logang@deltatee.com> wrote:

>
>
> On 27/11/17 04:33 PM, ThanhTuThai wrote:
>
>> No, it doesn't.
>>
>> I've also thought about this strategy, and implemented it in my setup.
>> For the soft reboot, it works well: I receive the link up event when
>> the peer completes rebooting.
>> But for a power-on reset (in this case, the switchtec is also power-reset),
>> I don't receive any link up message. So the good peer cannot
>> reset the link status as mentioned above. But if I reload the drivers on the
>> good peer, they work perfectly.
>>
>
> That's because it's filtering out the link up because it already thinks
> the link is up. Please wait until I send a patch and test that. It'll be in
> the next day or two.
>
> Thanks,
>
> Logan
>


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-27 16:47   ` Logan Gunthorpe
  2017-11-27 17:00     ` Jon Mason
@ 2017-11-28  0:14     ` Karl Kao
  2017-11-28  1:42       ` Logan Gunthorpe
  1 sibling, 1 reply; 14+ messages in thread
From: Karl Kao @ 2017-11-28  0:14 UTC (permalink / raw)
  To: linux-ntb


Would Microsemi turn off the NTB PCIe link once the hardware chip is going
through a reset?
We could have the chip disable the NTB link by default until the hardware
driver is loaded, so that after either a reset or a power cycle the NTB link
is down.

On Monday, November 27, 2017 at 8:47:25 AM UTC-8, Logan Gunthorpe wrote:
>
>
>
> On 27/11/17 09:39 AM, Jon Mason wrote: 
> > It sounds like the link down/up isn't working properly.  Is the 
> > Microsemi NTB not able to detect a link down? 
>
> It can detect a link down, but the link doesn't actually go down in a 
> lot of cases. If the host is just soft rebooted after a crash, I don't 
> think the link will go down. 
>
> Logan 
>


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-28  0:14     ` Karl Kao
@ 2017-11-28  1:42       ` Logan Gunthorpe
  2017-11-28  2:16         ` Karl Kao
  0 siblings, 1 reply; 14+ messages in thread
From: Logan Gunthorpe @ 2017-11-28  1:42 UTC (permalink / raw)
  To: Karl Kao, linux-ntb



On 27/11/17 05:14 PM, Karl Kao wrote:
> Would Microsemi turn off the NTB PCIe link once the hardware chip is going
> through a reset?
> We could have the chip disable the NTB link by default until the hardware
> driver is loaded, so that after either a reset or a power cycle the NTB link
> is down.

The hardware chip won't be reset in this situation. That would kill the
other host. The "NTB link" is tracked in the driver as hardware only has
support for link events on each of the ports.

Logan


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-28  1:42       ` Logan Gunthorpe
@ 2017-11-28  2:16         ` Karl Kao
  2017-11-28  3:01           ` Logan Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Karl Kao @ 2017-11-28  2:16 UTC (permalink / raw)
  To: linux-ntb


I meant to say the "peer" which was suddenly powered off. Once the "peer" 
is being reset or power cycled, the peer's hardware chip is supposed to be 
reset, correct?

On Monday, November 27, 2017 at 5:42:52 PM UTC-8, Logan Gunthorpe wrote:
>
>
>
> On 27/11/17 05:14 PM, Karl Kao wrote: 
> > Would Microsemi turn off the NTB PCIe link once the hardware chip is going
> > through a reset?
> > We could have the chip disable the NTB link by default until the hardware
> > driver is loaded, so that after either a reset or a power cycle the NTB link
> > is down.
>
> The hardware chip won't be reset in this situation. That would kill the 
> other host. The "NTB link" is tracked in the driver as hardware only has 
> support for link events on each of the ports. 
>
> Logan 
>


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-28  2:16         ` Karl Kao
@ 2017-11-28  3:01           ` Logan Gunthorpe
  2017-11-28  3:24             ` Karl Kao
  0 siblings, 1 reply; 14+ messages in thread
From: Logan Gunthorpe @ 2017-11-28  3:01 UTC (permalink / raw)
  To: Karl Kao, linux-ntb



On 2017-11-27 7:16 PM, Karl Kao wrote:
> I meant to say the "peer" which was suddenly powered off. Once the
> "peer" is being reset or power cycled, the peer's hardware chip is
> supposed to be reset, correct

There's only one hardware chip in an NTB configuration. If you reset it, 
you reset all the peers (which you do not want to do).

Logan


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-28  3:01           ` Logan Gunthorpe
@ 2017-11-28  3:24             ` Karl Kao
  2017-11-29 23:44               ` Karl Kao
  0 siblings, 1 reply; 14+ messages in thread
From: Karl Kao @ 2017-11-28  3:24 UTC (permalink / raw)
  To: linux-ntb


I don't understand why, when one system goes through a power cycle, the
NTB chip in that system would not be reset.

On Monday, November 27, 2017 at 7:02:36 PM UTC-8, Logan Gunthorpe wrote:
>
>
>
> On 2017-11-27 7:16 PM, Karl Kao wrote: 
> > I meant to say the "peer" which was suddenly powered off. Once the 
> > "peer" is being reset or power cycled, the peer's hardware chip is 
> > supposed to be reset, correct 
>
> There's only one hardware chip in an NTB configuration. If you reset it, 
> you reset all the peers (which you do not want to do). 
>
> Logan 
>


* Re: Ntb Transport Driver Problem After Power-up
  2017-11-28  3:24             ` Karl Kao
@ 2017-11-29 23:44               ` Karl Kao
  0 siblings, 0 replies; 14+ messages in thread
From: Karl Kao @ 2017-11-29 23:44 UTC (permalink / raw)
  To: linux-ntb


Just realized the current driver supports two partitions in a single
Microsemi chip, instead of crosslink. Sorry for the mix-up.
In that case, I am confused, since the initial question is about sudden
power loss to one system while the other stays alive. What is the configuration
of the systems with regard to the Microsemi NTB? Isn't it crosslink?

Thanks,
Karl

On Monday, November 27, 2017 at 7:24:03 PM UTC-8, Karl Kao wrote:
>
> I don't understand why, when one system goes through a power cycle, the
> NTB chip in that system would not be reset.
>
> On Monday, November 27, 2017 at 7:02:36 PM UTC-8, Logan Gunthorpe wrote:
>>
>>
>>
>> On 2017-11-27 7:16 PM, Karl Kao wrote: 
>> > I meant to say the "peer" which was suddenly powered off. Once the 
>> > "peer" is being reset or power cycled, the peer's hardware chip is 
>> > supposed to be reset, correct 
>>
>> There's only one hardware chip in an NTB configuration. If you reset it, 
>> you reset all the peers (which you do not want to do). 
>>
>> Logan 
>>
>
