* Out-of-band NIC management
@ 2019-07-16 21:45 Ben Wei
  2019-07-17  3:10 ` Joel Stanley
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ben Wei @ 2019-07-16 21:45 UTC (permalink / raw)
  To: openbmc

Hi all, 
 
Would anyone be interested in collaborating on out-of-band NIC management and monitoring? 

DMTF has an NC-SI spec (https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf) that defines a standard interface for BMCs to manage NICs.
In kernel 5.x, the NC-SI driver supports a Netlink interface for communicating with userspace processes.
  
I'm thinking of adding the following tools to OpenBMC as a starting point and building from there:
 
      1. A command line utility (e.g. ncsi-util) to send raw NC-SI commands, useful for debugging and initial NIC bring-up.
      For example:
          ncsi-util -eth0 -ch 0 <raw NC-SI command>
 
      We can further extend this command line tool to support other management interfaces, e.g. sending MCTP or PLDM commands to the NIC.
 
      2. A daemon running on OpenBMC (e.g. ncsid) that monitors NIC status, for example:
          a. Query and log NIC capabilities and current parameter settings
          b. Periodically check NIC link status, re-initialize the NC-SI link if the NIC is unreachable, and log the status
          c. Enable and monitor NIC Asynchronous Event Notifications (AENs)
                i. such as Link Status Change, Configuration Required, and Host Driver Status Change
                ii. there are OEM-specific AENs that the BMC may also enable and monitor
                iii. log these events and/or perform recovery and remediation as needed
          d. Additional monitoring such as
                i. temperature (not in a standard NC-SI command yet),
                ii. firmware version, update events, network traffic statistics
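
To make the "raw NC-SI command" idea in item 1 a bit more concrete, here is a rough sketch of how a userspace tool might fill in the 16-byte control packet header and compute the packet checksum. The function names are made up for illustration, and the field layout is my reading of DSP0222 and the kernel's struct ncsi_pkt_hdr, so please check it against the spec before relying on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCSI_HDR_LEN      16    /* control packet header size in bytes */
#define NCSI_HDR_REVISION 0x01  /* header revision, assumed for NC-SI 1.x */

/* Fill the 16-byte NC-SI control packet header for one command.
 * The channel ID byte is assumed to carry the package ID in bits 7..5
 * and the internal channel ID in bits 4..0. */
static void ncsi_build_header(uint8_t *buf, uint8_t iid, uint8_t type,
                              uint8_t package, uint8_t channel,
                              uint16_t payload_len)
{
    memset(buf, 0, NCSI_HDR_LEN);
    buf[0] = 0x00;                      /* management controller ID */
    buf[1] = NCSI_HDR_REVISION;
    buf[3] = iid;                       /* instance ID, chosen by the sender */
    buf[4] = type;                      /* control packet type (command code) */
    buf[5] = (uint8_t)((package << 5) | (channel & 0x1f));
    buf[6] = (uint8_t)((payload_len >> 8) & 0x0f);  /* 12-bit payload length */
    buf[7] = (uint8_t)(payload_len & 0xff);
}

/* 32-bit two's complement of the sum of all 16-bit big-endian words.
 * len is expected to be even; a trailing odd byte is ignored. */
static uint32_t ncsi_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    return ~sum + 1;
}
```

With the kernel's netlink interface the driver may well fill in parts of this itself, so a CLI might only need the command type and payload. The sketch is mainly meant to show what "raw" would cover.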
 
Both the CLI tool and the monitoring daemon can talk to the kernel driver directly and independently via Netlink, or we can have the ncsi daemon act as a command serializer between the kernel and other userspace processes.
These are just some of my initial thoughts, and I'd love to hear feedback on whether these would be useful to OpenBMC.
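
As an illustration of item 2.b, the daemon's periodic link check could be a small state step like the one below. This is only a sketch: the threshold of three consecutive failures and all of the names are placeholders I picked, not anything that exists today.

```c
#include <stdbool.h>

#define LINK_MAX_FAILURES 3  /* hypothetical threshold before re-init */

enum link_action {
    LINK_ACTION_NONE,    /* NIC answered, nothing to do */
    LINK_ACTION_LOG,     /* poll failed, log it and retry on the next tick */
    LINK_ACTION_REINIT   /* too many failures, re-initialize the NC-SI link */
};

struct link_monitor {
    unsigned int failures;  /* consecutive failed polls */
};

/* Process the result of one periodic link-status poll. */
static enum link_action link_monitor_step(struct link_monitor *m, bool reachable)
{
    if (reachable) {
        m->failures = 0;
        return LINK_ACTION_NONE;
    }
    if (++m->failures >= LINK_MAX_FAILURES) {
        m->failures = 0;   /* start counting again after the re-init */
        return LINK_ACTION_REINIT;
    }
    return LINK_ACTION_LOG;
}
```

Keeping the decision separate from the actual polling and re-init code would let us unit test it without a NIC attached.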

If anyone is interested in collaborating on these, we can discuss the features and design details further.
 
Regards,
-Ben


* Re: Out-of-band NIC management
  2019-07-16 21:45 Out-of-band NIC management Ben Wei
@ 2019-07-17  3:10 ` Joel Stanley
  2019-07-17  4:27   ` Ratan Gupta
  2019-07-17 17:59   ` Ben Wei
  2019-07-17 16:43 ` Supreeth Venkatesh
  2019-07-17 16:47 ` Supreeth Venkatesh
  2 siblings, 2 replies; 11+ messages in thread
From: Joel Stanley @ 2019-07-17  3:10 UTC (permalink / raw)
  To: Ben Wei, Sam Mendoza-Jonas, Jeremy Kerr; +Cc: openbmc

On Tue, 16 Jul 2019 at 21:46, Ben Wei <benwei@fb.com> wrote:
>
> Hi all,
>
> Would anyone be interested in collaborating on out-of-band NIC management and monitoring?
>
> DMTF has as a NCSI spec (https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf), that defines a standard interface for BMCs to manage NICs.
> And in kernel 5.x , NC-SI driver supports Netlink interface for communicating with userspace processes.
>
> I'm thinking adding the following tools to OpenBMC as a starting point and build form there:
>
>       1. A command line utility (e.g. ncsi-util) to send raw NC-SI commands, useful for debugging and initial NIC bring up,
>       For example:
>           ncsi-util -eth0 -ch 0 <raw NC-SI command>

The NCSI kernel maintainer, Sam, has written a tool that fits this description:

 https://github.com/sammj/ncsi-netlink

>
>       We can further extend this command line tool to support other management interfaces, e.g sending MCTP or PLDM commands to NIC.

I have added Jeremy to cc, who has been doing some MCTP related work recently.

>
>       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC status,  for example:
>           a. Query and log NIC capability and current parameter setting
>           b. Periodically check NIC link status, re-initialize NC-SI link if NIC is unreachable, log the status
>           c. Enable and monitor NIC Asynchronous Event Notifications (AENs)
>                 i. such as  Link Status Change, Configuration required, Host driver status change
>                 ii. there are OEM-specific AENs that BMC may also enable and monitor
>                 iii. either log these events, and/or performs recovery and remediation as needed
>           d. Additional monitoring such as
>                 i.  temperature (not in standard NC-SI command yet),
>                 ii. firmware version, update event, network traffic statistics
>
> Both the CLI tool and the monitoring daemon can either communicate to kernel driver directly via Netlink independently, or we can have the ncsi daemon acting as command serializer to kernel and other user space processes.
> These are just some of my initial thoughts and I'd love to hear some feedback if these would be useful to OpenBMC.
>
> If anyone in interested in collaborate on these we can discuss more on features and design details.
>
> Regards,
> -Ben


* Re: Out-of-band NIC management
  2019-07-17  3:10 ` Joel Stanley
@ 2019-07-17  4:27   ` Ratan Gupta
  2019-07-17 17:59   ` Ben Wei
  1 sibling, 0 replies; 11+ messages in thread
From: Ratan Gupta @ 2019-07-17  4:27 UTC (permalink / raw)
  To: openbmc, joel Stanley, Ben Wei, Jeremy Kerr

Hi Joel,

We have ported a similar ncsi-netlink utility to OpenBMC under
phosphor-networkd.

https://github.com/openbmc/phosphor-networkd/blob/master/ncsi_util.hpp

Ben,

we can extend the same.

Regards
Ratan Gupta



On 17/07/19 8:40 AM, Joel Stanley wrote:
> On Tue, 16 Jul 2019 at 21:46, Ben Wei <benwei@fb.com> wrote:
>> Hi all,
>>
>> Would anyone be interested in collaborating on out-of-band NIC management and monitoring?
>>
>> DMTF has as a NCSI spec (https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf), that defines a standard interface for BMCs to manage NICs.
>> And in kernel 5.x , NC-SI driver supports Netlink interface for communicating with userspace processes.
>>
>> I'm thinking adding the following tools to OpenBMC as a starting point and build form there:
>>
>>        1. A command line utility (e.g. ncsi-util) to send raw NC-SI commands, useful for debugging and initial NIC bring up,
>>        For example:
>>            ncsi-util -eth0 -ch 0 <raw NC-SI command>
> The NCSI kernel maintainer, Sam, has written a tool that fits this descirption:
>
>   https://github.com/sammj/ncsi-netlink
>
>>        We can further extend this command line tool to support other management interfaces, e.g sending MCTP or PLDM commands to NIC.
> I have added Jeremy to cc, who has been doing some MCTP related work recently.
>
>>        2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC status,  for example:
>>            a. Query and log NIC capability and current parameter setting
>>            b. Periodically check NIC link status, re-initialize NC-SI link if NIC is unreachable, log the status
>>            c. Enable and monitor NIC Asynchronous Event Notifications (AENs)
>>                  i. such as  Link Status Change, Configuration required, Host driver status change
>>                  ii. there are OEM-specific AENs that BMC may also enable and monitor
>>                  iii. either log these events, and/or performs recovery and remediation as needed
>>            d. Additional monitoring such as
>>                  i.  temperature (not in standard NC-SI command yet),
>>                  ii. firmware version, update event, network traffic statistics
>>
>> Both the CLI tool and the monitoring daemon can either communicate to kernel driver directly via Netlink independently, or we can have the ncsi daemon acting as command serializer to kernel and other user space processes.
>> These are just some of my initial thoughts and I'd love to hear some feedback if these would be useful to OpenBMC.
>>
>> If anyone in interested in collaborate on these we can discuss more on features and design details.
>>
>> Regards,
>> -Ben


* Re: Out-of-band NIC management
  2019-07-16 21:45 Out-of-band NIC management Ben Wei
  2019-07-17  3:10 ` Joel Stanley
@ 2019-07-17 16:43 ` Supreeth Venkatesh
  2019-07-17 18:25   ` Ben Wei
  2019-07-17 16:47 ` Supreeth Venkatesh
  2 siblings, 1 reply; 11+ messages in thread
From: Supreeth Venkatesh @ 2019-07-17 16:43 UTC (permalink / raw)
  To: Ben Wei, openbmc; +Cc: dong.wei, Jeff.Booher-Kaeding

On Tue, 2019-07-16 at 16:45 -0500, Ben Wei wrote:
> Hi all, 
Hi Ben,
>  
> Would anyone be interested in collaborating on out-of-band NIC
> management and monitoring?
Yes. If there is an existing implementation that can be
leveraged/extended within OpenBMC, it will be fantastic.

>  
> 
> DMTF has as a NCSI spec (
> https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf
> ), that defines a standard interface for BMCs to manage NICs.
I assume that NC-SI over MCTP Binding 
https://www.dmtf.org/sites/default/files/standards/documents/DSP0261_1.2.1.pdf
 will also be targeted. Correct?

Jeremy was working on MCTP, so we should collaborate with Jeremy and
team. 


> And in kernel 5.x , NC-SI driver supports Netlink interface for
> communicating with userspace processes.
>   
> I'm thinking adding the following tools to OpenBMC as a starting
> point and build form there:
>  
>       1. A command line utility (e.g. ncsi-util) to send raw NC-SI
> commands, useful for debugging and initial NIC bring up, 
>       For example:
>           ncsi-util -eth0 -ch 0 <raw NC-SI command>
>  
>       We can further extend this command line tool to support other
> management interfaces, e.g sending MCTP or PLDM commands to NIC.
>  
>       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC
> status,  for example:
>           a. Query and log NIC capability and current parameter
> setting
>           b. Periodically check NIC link status, re-initialize NC-SI
> link if NIC is unreachable, log the status
>           c. Enable and monitor NIC Asynchronous Event Notifications
> (AENs) 
>                 i. such as  Link Status Change, Configuration
> required, Host driver status change
>                 ii. there are OEM-specific AENs that BMC may also
> enable and monitor
>                 iii. either log these events, and/or performs
> recovery and remediation as needed
>           d. Additional monitoring such as 
>                 i.  temperature (not in standard NC-SI command yet), 
>                 ii. firmware version, update event, network traffic
> statistics
>  
> Both the CLI tool and the monitoring daemon can either communicate to
> kernel driver directly via Netlink independently, or we can have the
> ncsi daemon acting as command serializer to kernel and other user
> space processes.
> These are just some of my initial thoughts and I'd love to hear some
> feedback if these would be useful to OpenBMC. 
> 
> If anyone in interested in collaborate on these we can discuss more
> on features and design details.
I am interested in collaborating on the design details.

>  
> Regards,
> -Ben

Thanks,
Supreeth


* Re: Out-of-band NIC management
  2019-07-16 21:45 Out-of-band NIC management Ben Wei
  2019-07-17  3:10 ` Joel Stanley
  2019-07-17 16:43 ` Supreeth Venkatesh
@ 2019-07-17 16:47 ` Supreeth Venkatesh
  2 siblings, 0 replies; 11+ messages in thread
From: Supreeth Venkatesh @ 2019-07-17 16:47 UTC (permalink / raw)
  To: Ben Wei, openbmc; +Cc: Dong Wei, Jeff Booher-Kaeding


On Tue, 2019-07-16 at 16:45 -0500, Ben Wei wrote:
> Hi all,

Hi Ben,
>
> Would anyone be interested in collaborating on out-of-band NIC
> management and monitoring?

Yes. If there is an existing implementation that can be
leveraged/extended within OpenBMC, it will be fantastic.

>
>
> DMTF has as a NCSI spec (
>
https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf
> ), that defines a standard interface for BMCs to manage NICs.

I assume that NC-SI over MCTP Binding

https://www.dmtf.org/sites/default/files/standards/documents/DSP0261_1.2.1.pdf
 will also be targeted. Correct?

Jeremy was working on MCTP, so we should collaborate with Jeremy and
team.


> And in kernel 5.x , NC-SI driver supports Netlink interface for
> communicating with userspace processes.
>
> I'm thinking adding the following tools to OpenBMC as a starting
> point and build form there:
>
>       1. A command line utility (e.g. ncsi-util) to send raw NC-SI
> commands, useful for debugging and initial NIC bring up,
>       For example:
>           ncsi-util -eth0 -ch 0 <raw NC-SI command>
>
>       We can further extend this command line tool to support other
> management interfaces, e.g sending MCTP or PLDM commands to NIC.
>
>       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC
> status,  for example:
>           a. Query and log NIC capability and current parameter
> setting
>           b. Periodically check NIC link status, re-initialize NC-SI
> link if NIC is unreachable, log the status
>           c. Enable and monitor NIC Asynchronous Event Notifications
> (AENs)
>                 i. such as  Link Status Change, Configuration
> required, Host driver status change
>                 ii. there are OEM-specific AENs that BMC may also
> enable and monitor
>                 iii. either log these events, and/or performs
> recovery and remediation as needed
>           d. Additional monitoring such as
>                 i.  temperature (not in standard NC-SI command yet),
>                 ii. firmware version, update event, network traffic
> statistics
>
> Both the CLI tool and the monitoring daemon can either communicate to
> kernel driver directly via Netlink independently, or we can have the
> ncsi daemon acting as command serializer to kernel and other user
> space processes.
> These are just some of my initial thoughts and I'd love to hear some
> feedback if these would be useful to OpenBMC.
>
> If anyone in interested in collaborate on these we can discuss more
> on features and design details.

I am interested in collaborating on the design details.

>
> Regards,
> -Ben

Thanks,
Supreeth


* RE: Out-of-band NIC management
  2019-07-17  3:10 ` Joel Stanley
  2019-07-17  4:27   ` Ratan Gupta
@ 2019-07-17 17:59   ` Ben Wei
  2019-07-24  1:04     ` Joel Stanley
  1 sibling, 1 reply; 11+ messages in thread
From: Ben Wei @ 2019-07-17 17:59 UTC (permalink / raw)
  To: Joel Stanley, Sam Mendoza-Jonas, Jeremy Kerr; +Cc: openbmc

> On Tue, 16 Jul 2019 at 21:46, Ben Wei <benwei@fb.com> wrote:
> >
> > Hi all,
> >
> > Would anyone be interested in collaborating on out-of-band NIC management and monitoring?
> >
> > DMTF has as a NCSI spec (https://urldefense.proofpoint.com/v2/url?u=https-3A__www.dmtf.org_sites_default_files_standards_documents_DSP0222-5F1.1.0.pdf&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=U35IaQ-7Tnwjs7q_Fwf_bQ&m=lXSq8KYmN6_5__0s64ulIMwH5bwqJQjM2d-IqHL7kcw&s=L-c3XEEs7crMHpKscqEdHYKM8fRR2xHM9NkQdfohAcU&e= ), that defines a standard interface for BMCs to manage NICs.
> > And in kernel 5.x , NC-SI driver supports Netlink interface for communicating with userspace processes.
> >
> > I'm thinking adding the following tools to OpenBMC as a starting point and build form there:
> >
> >       1. A command line utility (e.g. ncsi-util) to send raw NC-SI commands, useful for debugging and initial NIC bring up,
> >       For example:
> >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
>
> The NCSI kernel maintainer, Sam, has written a tool that fits this descirption:
>
>  https://github.com/sammj/ncsi-netlink

Thanks, this is exactly what I was looking for!
One question on this: do you plan to add some command-specific parsing, especially for commands like get version ID, capability & parameters, and the various statistics? I think these are especially useful for initial NIC bring-up and debugging.
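
To show the kind of parsing I mean, here is a rough sketch for a Get Version ID response. The byte offsets are my assumption based on the kernel's struct ncsi_rsp_gvi_pkt (16-byte common header, then response code and reason code, then the version fields), and the struct and function names are just placeholders, so treat this as a starting point to check against DSP0222 rather than a reference.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets, assumed from the kernel's struct ncsi_rsp_gvi_pkt. */
#define GVI_OFF_RSP_CODE   16
#define GVI_OFF_REASON     18
#define GVI_OFF_FW_NAME    28
#define GVI_FW_NAME_LEN    12
#define GVI_OFF_FW_VERSION 40
#define GVI_OFF_MF_ID      52
#define GVI_MIN_LEN        56

struct gvi_info {
    uint16_t rsp_code;
    uint16_t reason;
    char     fw_name[GVI_FW_NAME_LEN + 1];  /* NUL-terminated copy */
    uint32_t fw_version;
    uint32_t mf_id;
};

/* NC-SI fields are big-endian on the wire. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

/* Returns 0 on success, -1 if the buffer is too short. */
static int gvi_parse(const uint8_t *pkt, size_t len, struct gvi_info *out)
{
    if (len < GVI_MIN_LEN)
        return -1;
    out->rsp_code = be16(pkt + GVI_OFF_RSP_CODE);
    out->reason = be16(pkt + GVI_OFF_REASON);
    memcpy(out->fw_name, pkt + GVI_OFF_FW_NAME, GVI_FW_NAME_LEN);
    out->fw_name[GVI_FW_NAME_LEN] = '\0';
    out->fw_version = be32(pkt + GVI_OFF_FW_VERSION);
    out->mf_id = be32(pkt + GVI_OFF_MF_ID);
    return 0;
}
```

The same pattern (fixed offsets plus big-endian helpers) should carry over to the capability, parameter, and statistics responses.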

Regards,
-Ben


* RE: Out-of-band NIC management
  2019-07-17 16:43 ` Supreeth Venkatesh
@ 2019-07-17 18:25   ` Ben Wei
  2019-07-17 19:44     ` Justin.Lee1
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Wei @ 2019-07-17 18:25 UTC (permalink / raw)
  To: Supreeth Venkatesh, openbmc; +Cc: dong.wei, Jeff.Booher-Kaeding

> > Hi all,
> Hi Ben,
> >  
> > Would anyone be interested in collaborating on out-of-band NIC 
> > management and monitoring?
> Yes. If there is an existing implementation that can be leveraged/extended within OpenBMC, it will be fantastic.
>
> >  
> > 
> > DMTF has as a NCSI spec (
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.dmtf.org_site
> > s_default_files_standards_documents_DSP0222-5F1.1.0.pdf&d=DwICaQ&c=5VD
> > 0RTtNlTh3ycd41b3MUw&r=U35IaQ-7Tnwjs7q_Fwf_bQ&m=JEop7ohMmgognpGqc17Ib11
> > BzokuLufcEDI-uGoh-wQ&s=sbnQESowB-lh1RYUBwfgx7qH5Hi11KX_Jtzm3ZnG2_I&e=
> > ), that defines a standard interface for BMCs to manage NICs.
> I assume that NC-SI over MCTP Binding
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.dmtf.org_sites_default_files_standards_documents_DSP0261-5F1.2.1.pdf&d=DwICaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=U35IaQ-7Tnwjs7q_Fwf_bQ&m=JEop7ohMmgognpGqc17Ib11BzokuLufcEDI-uGoh-wQ&s=jqbZoZIZ6zAbpaL1DhB7t8nbcFqHKT-caHdQNZGvFfU&e=
>  will also be targeted. Correct?
>
> Jeremy was working on MCTP, so we should collaborate with Jeremy and team. 

For the CLI tool and the management & monitoring daemon, I was initially thinking of using NC-SI over RMII/RBT, mainly because the kernel already supports this and provides a netlink interface for userspace to send/receive commands.
But I think we can make our management tool transport-agnostic: for NC-SI over RMII/RBT it communicates with the kernel NC-SI driver over netlink, and for NC-SI over MCTP it uses the mechanism provided by libmctp.
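
One way to keep the tool transport-agnostic is a small table of function pointers per backend. The sketch below is purely illustrative: struct ncsi_transport, ncsi_send() and the loopback backend are names I made up, and the real netlink and libmctp backends are not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One backend per transport; names here are illustrative only. */
struct ncsi_transport {
    const char *name;
    /* Send a command and fill rsp; returns response length or -1. */
    int (*xfer)(void *ctx, const uint8_t *cmd, size_t cmd_len,
                uint8_t *rsp, size_t rsp_cap);
    void *ctx;  /* backend-private state, e.g. a socket */
};

/* Transport-independent entry point used by the CLI or daemon. */
static int ncsi_send(const struct ncsi_transport *t,
                     const uint8_t *cmd, size_t cmd_len,
                     uint8_t *rsp, size_t rsp_cap)
{
    if (t == NULL || t->xfer == NULL)
        return -1;
    return t->xfer(t->ctx, cmd, cmd_len, rsp, rsp_cap);
}

/* Stand-in backend that echoes the command back, useful for unit tests.
 * A real backend would talk netlink (RBT) or libmctp (MCTP) here. */
static int loopback_xfer(void *ctx, const uint8_t *cmd, size_t cmd_len,
                         uint8_t *rsp, size_t rsp_cap)
{
    (void)ctx;
    if (cmd_len > rsp_cap)
        return -1;
    memcpy(rsp, cmd, cmd_len);
    return (int)cmd_len;
}
```

The command building and response parsing code would then only ever call ncsi_send() and would not need to know which transport is underneath.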

> > And in kernel 5.x , NC-SI driver supports Netlink interface for 
> > communicating with userspace processes.
> >   
> > I'm thinking adding the following tools to OpenBMC as a starting point 
> > and build form there:
> >  
> >       1. A command line utility (e.g. ncsi-util) to send raw NC-SI 
> > commands, useful for debugging and initial NIC bring up,
> >       For example:
> >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
> >  
> >       We can further extend this command line tool to support other 
> > management interfaces, e.g sending MCTP or PLDM commands to NIC.
> >  
> >       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC 
> > status,  for example:
> >           a. Query and log NIC capability and current parameter 
> > setting
> >           b. Periodically check NIC link status, re-initialize NC-SI 
> > link if NIC is unreachable, log the status
> >           c. Enable and monitor NIC Asynchronous Event Notifications
> > (AENs) 
> >                 i. such as  Link Status Change, Configuration 
> > required, Host driver status change
> >                 ii. there are OEM-specific AENs that BMC may also 
> > enable and monitor
> >                 iii. either log these events, and/or performs recovery 
> > and remediation as needed
> >           d. Additional monitoring such as 
> >                 i.  temperature (not in standard NC-SI command yet), 
> >                 ii. firmware version, update event, network traffic 
> > statistics
> >  
> > Both the CLI tool and the monitoring daemon can either communicate to 
> > kernel driver directly via Netlink independently, or we can have the 
> > ncsi daemon acting as command serializer to kernel and other user 
> space processes.
> > These are just some of my initial thoughts and I'd love to hear some 
> > feedback if these would be useful to OpenBMC.
> > 
> > If anyone in interested in collaborate on these we can discuss more on 
> > features and design details.
> I am interested in collaborating on the design details.

Great! I can put a draft on Gerrit and we can work together on this. Do you have additional use cases you're looking for?

Regards
-Ben


* RE: Out-of-band NIC management
  2019-07-17 18:25   ` Ben Wei
@ 2019-07-17 19:44     ` Justin.Lee1
  2019-07-17 20:49       ` Ben Wei
  0 siblings, 1 reply; 11+ messages in thread
From: Justin.Lee1 @ 2019-07-17 19:44 UTC (permalink / raw)
  To: benwei, supreeth.venkatesh, openbmc; +Cc: Jeff.Booher-Kaeding, dong.wei, sam

Hi Ben,

I have a few questions about the 2.c item below.


> For the CLI tool and management & monitoring daemon, I was initially thinking using NC-SI over RMII/RBT, mainly because kernel already supports this and it provides a netlink interface for userspace to send/receive commands.
> But I think we can make our management tool transportation agnostic, so for NCSIoRMII/RBT, it communicates to kernel NCSI driver over netlink, and for NCSI over MCTP, it uses a the mechanism provided by libmctp.
> 
> > > And in kernel 5.x , NC-SI driver supports Netlink interface for 
> > > communicating with userspace processes.
> > >   
> > > I'm thinking adding the following tools to OpenBMC as a starting 
> > > point and build form there:
> > >  
> > >       1. A command line utility (e.g. ncsi-util) to send raw NC-SI 
> > > commands, useful for debugging and initial NIC bring up,
> > >       For example:
> > >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
> > >  
> > >       We can further extend this command line tool to support other 
> > > management interfaces, e.g sending MCTP or PLDM commands to NIC.
> > >  
> > >       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC 
> > > status,  for example:
> > >           a. Query and log NIC capability and current parameter 
> > > setting
> > >           b. Periodically check NIC link status, re-initialize NC-SI 
> > > link if NIC is unreachable, log the status
> > >           c. Enable and monitor NIC Asynchronous Event Notifications
> > > (AENs) 


For selected channels, AEN is enabled by default. Do you plan to enable AEN for non-selected channels too?

If yes, what approach are you going to take? Enable it from userspace, or modify the NC-SI driver to achieve that?
We are planning to monitor all channels but are still looking for the best way.


For delivering the AEN to userspace, we currently implement it locally via the mcgrps, but we plan to upstream it.

enum ncsi_genl_multicast_groups {
	NCSI_GENL_MCGRP_AEN,
};

static const struct genl_multicast_group ncsi_genl_mcgrps[] = {
	[NCSI_GENL_MCGRP_AEN] = { .name = NCSI_GENL_MCGRP_AEN_NAME },
};

static struct genl_family ncsi_genl_family __ro_after_init = {
	.name = "NCSI",
	.version = 0,
	.maxattr = NCSI_ATTR_MAX,
	.module = THIS_MODULE,
	.ops = ncsi_ops,
	.n_ops = ARRAY_SIZE(ncsi_ops),
	.mcgrps = ncsi_genl_mcgrps,
	.n_mcgrps = ARRAY_SIZE(ncsi_genl_mcgrps),
};


> > >                 i. such as  Link Status Change, Configuration 
> > > required, Host driver status change
> > >                 ii. there are OEM-specific AENs that BMC may also 
> > > enable and monitor
> > >                 iii. either log these events, and/or performs 
> > > recovery and remediation as needed
> > >           d. Additional monitoring such as 
> > >                 i.  temperature (not in standard NC-SI command yet), 
> > >                 ii. firmware version, update event, network traffic 
> > > statistics
> > >  
> > > Both the CLI tool and the monitoring daemon can either communicate 
> > > to kernel driver directly via Netlink independently, or we can have 
> > > the ncsi daemon acting as command serializer to kernel and other 
> > > user
> > space processes.
> > > These are just some of my initial thoughts and I'd love to hear some 
> > > feedback if these would be useful to OpenBMC.
> > > 
> > > If anyone in interested in collaborate on these we can discuss more 
> > > on features and design details.
> > I am interested in collaborating on the design details.
> 
> Great! I can put a draft on Gerrit and we can work together on this. Do you have additional uses cases you're looking for?
> 
> Regards
> -Ben

Thanks,
Justin


* RE: Out-of-band NIC management
  2019-07-17 19:44     ` Justin.Lee1
@ 2019-07-17 20:49       ` Ben Wei
  2019-07-17 22:19         ` Justin.Lee1
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Wei @ 2019-07-17 20:49 UTC (permalink / raw)
  To: Justin.Lee1, supreeth.venkatesh, openbmc
  Cc: Jeff.Booher-Kaeding, dong.wei, sam

> Hi Ben,
>
> I have a few questions about the 2.c item below.
>
>
> > For the CLI tool and management & monitoring daemon, I was initially thinking using NC-SI over RMII/RBT, mainly because kernel already supports this and it provides a netlink interface for userspace to send/receive commands.
> > But I think we can make our management tool transportation agnostic, so for NCSIoRMII/RBT, it communicates to kernel NCSI driver over netlink, and for NCSI over MCTP, it uses a the mechanism provided by libmctp.
> > 
> > > > And in kernel 5.x , NC-SI driver supports Netlink interface for 
> > > > communicating with userspace processes.
> > > >   
> > > > I'm thinking adding the following tools to OpenBMC as a starting 
> > > > point and build form there:
> > > >  
> > > >       1. A command line utility (e.g. ncsi-util) to send raw NC-SI 
> > > > commands, useful for debugging and initial NIC bring up,
> > > >       For example:
> > > >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
> > > >  
> > > >       We can further extend this command line tool to support 
> > > > other management interfaces, e.g sending MCTP or PLDM commands to NIC.
> > > >  
> > > >       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC 
> > > > status,  for example:
> > > >           a. Query and log NIC capability and current parameter 
> > > > setting
> > > >           b. Periodically check NIC link status, re-initialize 
> > > > NC-SI link if NIC is unreachable, log the status
> > > >           c. Enable and monitor NIC Asynchronous Event 
> > > > Notifications
> > > > (AENs)
>
>
> For selected channels, AEN is enabled by default. Do you plan to enable the AEN for non-selected channels too?
> If yes, what is the approach you are going to do? Enable it by userspace or modify NC-SI driver to achieve that?
> We are planning to monitor all channels but still look for the best way.
>

Hi Justin,
For now I only plan to monitor the selected channel. But I'm curious about AEN being enabled by default. Is this performed by the kernel driver today?
In the previous version (e.g. 4.x), I had to manually enable (or subscribe to) them after NC-SI initialization.

(This is why I was thinking a userspace CLI to check NIC parameters would be useful for getting the current NIC settings.)

But in either case (selected vs. all channels), I think having some tool or daemon in userspace to modify the AEN settings would provide flexibility, since the kernel driver already provides an access mechanism.
For the 2.c case, we may selectively enable AENs based on the "get capability" cmd. We may also check which OEM AENs (if any) are supported and, based on our needs, selectively enable/disable these.
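
As a small sketch of that selection step: take the AEN support bits reported by "get capability" and AND them with what we want. The bit values below are my assumption (they match the kernel's NCSI_CAP_AEN_* definitions as far as I recall, with the upper 16 bits left for OEM AENs), and the macro and function names are placeholders.

```c
#include <stdint.h>

/* Standard AEN control bits (assumed bit 0..2); upper 16 bits assumed OEM. */
#define AEN_LINK_STATUS_CHANGE   0x00000001u
#define AEN_CONFIG_REQUIRED      0x00000002u
#define AEN_HOST_DRIVER_STATUS   0x00000004u
#define AEN_OEM_MASK             0xffff0000u

/* Keep only the AENs that the channel reports as supported. */
static uint32_t aen_select(uint32_t supported, uint32_t wanted)
{
    return supported & wanted;
}
```

The result would be the value passed in the AEN Enable command, so we never ask the NIC for an event it did not advertise.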


> For delivering the AEN to userspace, currently, we implement it via the mcgrps locally but plan to upstream.
>
> enum ncsi_genl_multicast_groups {
>	NCSI_GENL_MCGRP_AEN,
> };
>
> static const struct genl_multicast_group ncsi_genl_mcgrps[] = {
>	[NCSI_GENL_MCGRP_AEN] = { .name = NCSI_GENL_MCGRP_AEN_NAME }, };
>
> static struct genl_family ncsi_genl_family __ro_after_init = {
>	.name = "NCSI",
>	.version = 0,
>	.maxattr = NCSI_ATTR_MAX,
>	.module = THIS_MODULE,
>	.ops = ncsi_ops,
>	.n_ops = ARRAY_SIZE(ncsi_ops),
>	.mcgrps = ncsi_genl_mcgrps,
>	.n_mcgrps = ARRAY_SIZE(ncsi_genl_mcgrps), };
>
>

This is a great idea! Previously I had a hacky solution that sent a custom netlink message for AENs, but this multicast group is more flexible.
In your current design, do you have multiple processes listening to these, or one process that handles all AENs?

Regards,
-Ben


* RE: Out-of-band NIC management
  2019-07-17 20:49       ` Ben Wei
@ 2019-07-17 22:19         ` Justin.Lee1
  0 siblings, 0 replies; 11+ messages in thread
From: Justin.Lee1 @ 2019-07-17 22:19 UTC (permalink / raw)
  To: benwei, supreeth.venkatesh, openbmc; +Cc: Jeff.Booher-Kaeding, dong.wei, sam



> > Hi Ben,
> >
> > I have a few questions about the 2.c item below.
> >
> >
> > > For the CLI tool and management & monitoring daemon, I was initially thinking using NC-SI over RMII/RBT, mainly because kernel already supports this and it provides a netlink interface for userspace to send/receive commands.
> > > But I think we can make our management tool transportation agnostic, so for NCSIoRMII/RBT, it communicates to kernel NCSI driver over netlink, and for NCSI over MCTP, it uses a the mechanism provided by libmctp.
> > > 
> > > > > And in kernel 5.x , NC-SI driver supports Netlink interface for 
> > > > > communicating with userspace processes.
> > > > >   
> > > > > I'm thinking adding the following tools to OpenBMC as a starting 
> > > > > point and build form there:
> > > > >  
> > > > >       1. A command line utility (e.g. ncsi-util) to send raw 
> > > > > NC-SI commands, useful for debugging and initial NIC bring up,
> > > > >       For example:
> > > > >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
> > > > >  
> > > > >       We can further extend this command line tool to support 
> > > > > other management interfaces, e.g sending MCTP or PLDM commands to NIC.
> > > > >  
> > > > >       2. A daemon running on OpenBMC (e.g ncsid) monitoring NIC 
> > > > > status,  for example:
> > > > >           a. Query and log NIC capability and current parameter 
> > > > > setting
> > > > >           b. Periodically check NIC link status, re-initialize 
> > > > > NC-SI link if NIC is unreachable, log the status
> > > > >           c. Enable and monitor NIC Asynchronous Event 
> > > > > Notifications
> > > > > (AENs)
> >
> >
> > For selected channels, AEN is enabled by default. Do you plan to enable the AEN for non-selected channels too?
> > If yes, what is the approach you are going to do? Enable it by userspace or modify NC-SI driver to achieve that?
> > We are planning to monitor all channels but still look for the best way.
> >
> 
> Hi Justin,
> For now I only plan to monitor selected channel. But I'm curious about the AEN enabled by default. Is this performed by kernel driver today?
> In the previous version (e.g. 4.x), I had to manually enable  (or subscribe to) them after NC-SI initialization.


Hi Ben,

Inside the ncsi_configure_channel() function, there is a state that enables AEN (unless the AEN capability bit is not set).

		} else if (nd->state == ncsi_dev_state_config_ae) {
			nca.type = NCSI_PKT_CMD_AE;
			nca.bytes[0] = 0;
			nca.dwords[1] = nc->caps[NCSI_CAP_AEN].cap;


> 
> (this is why I was thinking a userspace CLI to check NIC parameters would be useful to get the current NIC setting)
> 
> But in any case (selected vs all channels), I am thinking having some tool or daemon in userspace to modify AEN setting would provide flexibility, since kernel driver already provides access mechanism.
> For the 2.c case, we may selectively enable AENs based on "get capability" cmd. Also we may check which OEM AENs (if any) are supported, and based on our needs selectively enable/disable these.
> 
> 
> > For delivering the AEN to userspace, currently, we implement it via the mcgrps locally but plan to upstream.
> >
> > enum ncsi_genl_multicast_groups {
> >	NCSI_GENL_MCGRP_AEN,
> > };
> >
> > static const struct genl_multicast_group ncsi_genl_mcgrps[] = {
> >	[NCSI_GENL_MCGRP_AEN] = { .name = NCSI_GENL_MCGRP_AEN_NAME }, };
> >
> > static struct genl_family ncsi_genl_family __ro_after_init = {
> >	.name = "NCSI",
> >	.version = 0,
> >	.maxattr = NCSI_ATTR_MAX,
> >	.module = THIS_MODULE,
> >	.ops = ncsi_ops,
> >	.n_ops = ARRAY_SIZE(ncsi_ops),
> >	.mcgrps = ncsi_genl_mcgrps,
> >	.n_mcgrps = ARRAY_SIZE(ncsi_genl_mcgrps), };
> >
> >
> 
> This is a great idea! Previously I had a hacky solution to send a custom netlink message for AENs, but this multicast group is more flexible.
> In your current design, do you have multiple processes listening to these? Or 1 process that handles all AENs.


Currently, we will have only one process listening to all AENs, but this is not finalized yet.
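
For what it is worth, a single listener could classify events with something like the sketch below. It assumes the multicast message carries the raw AEN packet, which is not decided yet, and the offsets and type values are my reading of the kernel's struct ncsi_aen_pkt_hdr and NCSI_PKT_AEN_* definitions; aen_describe() is just a placeholder name.

```c
#include <stddef.h>
#include <stdint.h>

#define NCSI_PKT_TYPE_AEN 0xff  /* control packet type assumed for AENs */
#define AEN_OFF_PKT_TYPE  4     /* packet type byte in the common header */
#define AEN_OFF_AEN_TYPE  19    /* AEN type byte after the 16-byte header */
#define AEN_MIN_LEN       20

/* Map an AEN packet to a short label; returns NULL if it is not an AEN. */
static const char *aen_describe(const uint8_t *pkt, size_t len)
{
    if (len < AEN_MIN_LEN || pkt[AEN_OFF_PKT_TYPE] != NCSI_PKT_TYPE_AEN)
        return NULL;
    switch (pkt[AEN_OFF_AEN_TYPE]) {
    case 0x00: return "link status change";
    case 0x01: return "configuration required";
    case 0x02: return "host driver status change";
    default:   return "other AEN";   /* OEM-specific or not handled here */
    }
}
```

A dispatcher in that one process could use the returned label for logging and switch on the AEN type byte for any recovery action.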


> 
> Regards,
> -Ben
> 

Thanks,
Justin


* Re: Out-of-band NIC management
  2019-07-17 17:59   ` Ben Wei
@ 2019-07-24  1:04     ` Joel Stanley
  0 siblings, 0 replies; 11+ messages in thread
From: Joel Stanley @ 2019-07-24  1:04 UTC (permalink / raw)
  To: Ben Wei; +Cc: Sam Mendoza-Jonas, Jeremy Kerr, openbmc

On Wed, 17 Jul 2019 at 18:00, Ben Wei <benwei@fb.com> wrote:
>
> > On Tue, 16 Jul 2019 at 21:46, Ben Wei <benwei@fb.com> wrote:
> > >       1. A command line utility (e.g. ncsi-util) to send raw NC-SI commands, useful for debugging and initial NIC bring up,
> > >       For example:
> > >           ncsi-util -eth0 -ch 0 <raw NC-SI command>
> >
> > The NCSI kernel maintainer, Sam, has written a tool that fits this descirption:
> >
> >  https://github.com/sammj/ncsi-netlink
>
> Thanks, this is exactly what I was looking for!
> One question on this, do you plan add some command-specific parsing.  Especially for commands like get version ID,  capability & parameters, and various statistics. I think these are especially useful for initial NIC bring up and debugging.

I am sure patches to do this would be accepted.

I won't be doing any work on the tool myself.

Cheers,

Joel


end of thread, other threads:[~2019-07-24  1:04 UTC | newest]

Thread overview: 11+ messages
2019-07-16 21:45 Out-of-band NIC management Ben Wei
2019-07-17  3:10 ` Joel Stanley
2019-07-17  4:27   ` Ratan Gupta
2019-07-17 17:59   ` Ben Wei
2019-07-24  1:04     ` Joel Stanley
2019-07-17 16:43 ` Supreeth Venkatesh
2019-07-17 18:25   ` Ben Wei
2019-07-17 19:44     ` Justin.Lee1
2019-07-17 20:49       ` Ben Wei
2019-07-17 22:19         ` Justin.Lee1
2019-07-17 16:47 ` Supreeth Venkatesh
