From: Johnathan Mantey <johnathanx.mantey@intel.com>
To: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>,
	Patrick Williams <patrick@stwcx.xyz>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>,
	OpenBMC Maillist <openbmc@lists.ozlabs.org>,
	Lei Yu <yulei.sh@bytedance.com>
Subject: Re: Checking for network online
Date: Wed, 23 Feb 2022 12:04:12 -0800
Message-ID: <3f4f0cc0-7967-66f9-a085-a6b2ae978a01@intel.com>
In-Reply-To: <112c8819-24bc-2a24-45a3-9c919088f43a@linux.intel.com>



On 2/23/22 09:44, Jiaqing Zhao wrote:
> On 2022-02-23 21:48, Patrick Williams wrote:
>> On Wed, Feb 23, 2022 at 10:09:19AM +0800, Jiaqing Zhao wrote:
>>> I think a solution is to set RequiredForOnline=no (https://www.freedesktop.org/software/systemd/man/systemd.network.html#RequiredForOnline=) in all network interface config. This option skips the interface when running systemd-networkd-wait-online.service. Canonical netplan (used in ubuntu server) also uses this option to skip the online check for given interface (https://github.com/canonical/netplan/blob/main/src/networkd.c#L636-L639).
>>>
>>> I'll submit a patch to phosphor-networkd later.
>>
>> I really don't think this is appropriate for all systems.  Services have
>> dependencies on network-online.target for a reason.  If the side-effect of
>> having the BMC network cable unplugged is that the host doesn't boot, that might
>> be entirely reasonable behavior in some environments.
>>
>> We use rsyslog as the mechanism to offload our BMC logging data to an
>> aggregation point.  When you have a very large scale deployment, it is actually
>> better for the system to not come online than for us to lose out on that data,
>> since we have spare capacity to take its place.
> 
> My understanding is that in OpenBMC, the purpose of rsyslog is to format the Redfish and IPMI SEL logs from the system journal. The "r" of rsyslogd is not used in most cases. I think "network not available" can be handled the same way as "server misconfigured" in rsyslogd, as in both cases it fails to connect to the server, and may exit or print some error messages? (not tried yet)
> 
> Johnathan mentions in his initial email that the 120s wait blocks multi-user.target. Considering that there is no BMC serial port on most production hardware, when the BMC has no network connection, the only way to interact with the BMC is to use IPMI from the host. However, IPMI services are started in multi-user.target; if the BMC waits indefinitely for the network to come online, there would be no way to debug the issue.
> 
>> Note that the Canonical netplan only applies this option if the configuration
>> indicates that the interface is optional, which is entirely appropriate.  The
>> way you wrote it could have been interpreted that they set this on *every*
>> interface, which is what it seems like you're proposing to do to
>> phosphor-networkd
>>
>> If this is desired behavior for someone, can't you supply a wildcard .network
>> file that adds this option, rather than modifying phosphor-networkd to manually
>> add it to each network interface that it is managing?
> 
> Maybe we can add a similar DBus property like how netplan does? Reading/writing systemd-networkd config files is feasible in phosphor-networkd. Default value can be assigned via build option.
>   
>> I believe some designs use a USB network device to connect two internal pieces
>> of the system and those interfaces are not necessarily managed by
>> phosphor-networkd (interfaces that, for example connect BMC-to-BMC or
>> BMC-to-Host).  While it is obviously up to the system designer to work through
>> this bug, by applying this configuration as you proposed you are causing
>> unusual default behavior in that networkd is going to start waiting for these
>> internal connections to come online instead of the external interface.
> 
> I think this is an extremely rare case; internal interfaces should be configurable. For example, the host OS can change the IP of its BMC-Host virtual interface, so the BMC should also be able to change its own, and for BMC-to-BMC interfaces, it is impossible to assign a fixed LAN IP without conflicts in manufacturing. The easiest way to configure it is to use phosphor-networkd.
> 
> Even if it is not managed by phosphor-networkd, keeping the default RequiredForOnline=yes will cause the 120s wait on BMC boot. Developers can simply search for it and find the solution. I remember it shows a timer with a message on the BMC serial console; that's how I found that I should set "optional" on my Ubuntu server.

FWIW, my experimentation with systemd-networkd-wait-online was not 
successful in doing much to change the 120 second timeout.

Setting the RequiredForOnline entry to false in systemd.network did not 
prevent the 120 second timeout from elapsing.
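
A minimal sketch of such a setting, for anyone who wants to reproduce the
experiment. The file name and the eth0 match are assumptions for
illustration only, not what phosphor-networkd generates. Per the result
above, this alone did not shorten the wait.

```ini
# /etc/systemd/network/00-bmc-eth0.network  (hypothetical file name)
[Match]
Name=eth0

[Link]
# Intended to make systemd-networkd-wait-online skip this link.
RequiredForOnline=no
```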

Setting any of the following switches in the service file failed to 
eliminate the timeout:
--ignore=eth0
--interface=eth0:no-carrier            # overrides RequiredForOnline
--interface=eth0:no-carrier:no-carrier # <- probably a bad setting in
                                        # hindsight

It appears systemd-networkd-wait-online expects some state greater than 
no-carrier before it considers the link online and exits with a SUCCESS 
exit code. This happens even when it is explicitly told that no-carrier 
counts as "online".

The only switch that seemed to perform as expected in this instance was 
--timeout. Assigning a value less than 120 to --timeout did reduce the 
wait period. It does return a SUCCESS exit code upon timing out, which is 
expected behavior.
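
A sketch of how a shorter timeout could be applied as a drop-in override,
assuming a 30 second value and a binary installed under /lib/systemd
(both are illustrative assumptions; check the ExecStart line of the
shipped unit for the real path):

```ini
# /etc/systemd/system/systemd-networkd-wait-online.service.d/timeout.conf
# (hypothetical drop-in name)
[Service]
# An empty ExecStart= clears the command inherited from the shipped unit.
ExecStart=
# Give up after 30 seconds instead of the default 120.
ExecStart=/lib/systemd/systemd-networkd-wait-online --timeout=30
```

A "systemctl daemon-reload" is needed before the override takes effect.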

systemd-networkd-wait-online appears to have logic preventing no-carrier 
state from being assigned as the "network online" value.

The rsyslogd service depends on both the network and network-online 
targets. If the network-online target is removed, then 
systemd-networkd-wait-online doesn't run, and any configuration of that 
service appears to be pointless. My conclusion is that 
network-online.target is a valid dependency for a service to declare.
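
For illustration, a service that pulls in network-online.target typically
carries lines like the following. This is a sketch of the general pattern,
not a copy of the actual rsyslog service file:

```ini
[Unit]
# Wants= pulls network-online.target (and with it
# systemd-networkd-wait-online.service) into the boot transaction.
Wants=network-online.target
# After= delays this service until those targets have been reached.
After=network.target network-online.target
```

Dropping the network-online.target entries from a unit like this is what
removes the wait for that service.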

There may be OpenBMC-powered servers that do use the distributed logging 
provided by rsyslogd. If there are, then globally removing network-online 
from the rsyslog service file is undesirable. I consider the same to be 
true of assigning a default RequiredForOnline=false.

Based on the above, it's my opinion that how to configure 
rsyslog/systemd-networkd-wait-online is a vendor-specific decision.

-- 
Johnathan Mantey
Senior Software Engineer
*azad technology partners*
Contributing to Technology Innovation since 1992
Phone: (503) 712-6764
Email: johnathanx.mantey@intel.com <mailto:johnathanx.mantey@intel.com>



