* preventing chassis power-on until bmc Ready
@ 2022-04-19 21:02 Andrew Geissler
  2022-04-19 21:30 ` William Kennington
  2022-04-20 18:37 ` Michael Richardson
From: Andrew Geissler @ 2022-04-19 21:02 UTC
  To: OpenBMC Maillist

Greetings,

We at IBM have been finding cases where we wrote our services so that they
assume the BMC has reached the "Ready" state before a chassis power-on and host
firmware boot are allowed to start. For example, to power on the chassis, you
need to have collected all of the VPD associated with the VRMs and power
supplies. This VPD collection occurs on the way to BMC Ready, and services
in the power-on target assume it has all been collected. We have other scenarios
like this, and we're wondering whether we should keep playing whack-a-mole by
fixing these individually with explicit service dependencies, or do something a
bit more global to ensure our systems aren't allowed to power on until the BMC
has reached the "Ready" state. This state ensures all inventory and other
system data has been collected and created on D-Bus.

The BMC reaches the "Ready" state once all services within the multi-user.target
have successfully started running.

I know in the past I've heard of servers that allow both the BMC and Host
to boot in parallel (which sounds awesome), but we're not there yet. I'm
contemplating an optional package in phosphor-state-manager that could be
installed into the obmc-chassis-poweron@.target and prevent any other
services from running until the BMC has reached Ready.
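
As a rough illustration of what such a blocking step could look like, here is
a minimal Python sketch that polls the BMC state and returns once it is Ready.
It is only a sketch under assumptions, not a proposed implementation: the
D-Bus service, path, interface, and property names are what I believe
phosphor-state-manager exposes and should be verified on a real system, and
the timeout and polling interval are arbitrary.

```python
import subprocess
import time

# Assumed D-Bus names for the BMC state object; verify them on your system.
BMC_SERVICE = "xyz.openbmc_project.State.BMC"
BMC_PATH = "/xyz/openbmc_project/state/bmc0"
BMC_IFACE = "xyz.openbmc_project.State.BMC"
READY = "xyz.openbmc_project.State.BMC.BMCState.Ready"


def read_bmc_state():
    """Return the CurrentBMCState string, or "" if it cannot be read yet."""
    try:
        out = subprocess.run(
            ["busctl", "get-property", BMC_SERVICE, BMC_PATH, BMC_IFACE,
             "CurrentBMCState"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (subprocess.CalledProcessError, FileNotFoundError):
        return ""  # state service not reachable yet; treat as not ready
    # busctl prints the type signature, then the quoted value: s "..."
    parts = out.split(None, 1)
    return parts[1].strip().strip('"') if len(parts) == 2 else ""


def wait_for_bmc_ready(read_state=read_bmc_state, timeout=600.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the BMC reports Ready; return False if the timeout expires."""
    deadline = clock() + timeout
    while True:
        if read_state() == READY:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A hypothetical oneshot service could run this and exit non-zero when
wait_for_bmc_ready() returns False. Services in the power-on target would
still need to be ordered after that service for the wait to have any effect.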

The obmc-chassis-poweron@.target does have an "After=multi-user.target" within
it but that doesn't control the services within the poweron target. It just
ensures systemd will not consider the obmc-chassis-poweron@.target complete
until multi-user.target has completed.
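
To make that distinction concrete, here is a toy Python model of ordering
dependencies. It is not how systemd is implemented, and the service names and
durations are hypothetical. Each unit starts once everything it is ordered
after has finished; the target is ordered after multi-user.target and, as
targets are by default, after the service it pulls in.

```python
# Toy model: "after" lists ordering dependencies, "duration" is the time
# (in seconds) a unit takes to become active once it has been started.
UNGATED = {
    "slow-inventory.service": {"after": [], "duration": 30},
    "multi-user.target": {"after": ["slow-inventory.service"], "duration": 0},
    # Hypothetical service pulled in by the power-on target, with no
    # ordering of its own against multi-user.target.
    "power-sequencer.service": {"after": [], "duration": 5},
    "obmc-chassis-poweron@0.target": {
        "after": ["multi-user.target", "power-sequencer.service"],
        "duration": 0,
    },
}

# Same units, but the service itself is now ordered after multi-user.target.
GATED = {
    **UNGATED,
    "power-sequencer.service": {"after": ["multi-user.target"], "duration": 5},
}


def start_time(units, name):
    """A unit starts when every unit it is ordered after has finished."""
    return max((finish_time(units, dep) for dep in units[name]["after"]),
               default=0)


def finish_time(units, name):
    """A unit finishes 'duration' seconds after it starts."""
    return start_time(units, name) + units[name]["duration"]


for label, units in (("ungated", UNGATED), ("gated", GATED)):
    print(label,
          "service starts at", start_time(units, "power-sequencer.service"),
          "target reached at",
          finish_time(units, "obmc-chassis-poweron@0.target"))
```

In the ungated case the service starts immediately even though the target
is only reached after multi-user.target; only an ordering on the service
itself holds it back.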

Anyone else have a similar issue and/or thoughts on this problem?

Thanks,
Andrew


* Re: preventing chassis power-on until bmc Ready
  2022-04-19 21:02 preventing chassis power-on until bmc Ready Andrew Geissler
@ 2022-04-19 21:30 ` William Kennington
  2022-05-13 13:19   ` Andrew Geissler
  2022-04-20 18:37 ` Michael Richardson
From: William Kennington @ 2022-04-19 21:30 UTC
  To: Andrew Geissler; +Cc: OpenBMC Maillist

I don't really like using multi-user.target to order host startup
because we may have lots of optional or non-critical services that
take a long time to become ready and just end up delaying the boot
time of the server (which is a critical amount of time to reduce for
our use case). I can also see how different platforms probably have
different definitions of "critical" components based on what the
bootup firmware ends up querying the BMC for. But having some kind of
unit we can opt in to ordering services against may be useful, as we
have our own targets for this purpose on Google BMCs.



* Re: preventing chassis power-on until bmc Ready
  2022-04-19 21:02 preventing chassis power-on until bmc Ready Andrew Geissler
  2022-04-19 21:30 ` William Kennington
@ 2022-04-20 18:37 ` Michael Richardson
From: Michael Richardson @ 2022-04-20 18:37 UTC
  To: Andrew Geissler; +Cc: OpenBMC Maillist


Andrew Geissler <geissonator@gmail.com> wrote:
    > I know in the past I've heard of servers that allow both the BMC and
    > Host to boot in parallel (which sounds awesome) but we're not there
    > yet.

That would really be awesome... server boot times have become ridiculous,
with the amount of Black Screen time (BMC boot time, I think) seeming to
be increasing...
I think that Dell had to tweak some things a decade ago when people started
putting multiple hundreds of GB of RAM in; I have old servers that take 10+
minutes to POST.

I do wonder if, as you say, the whack-a-mole should continue, or if the host
should just be able to inquire about (and wait for) the BMC to finish booting.
So, don't prevent the host from booting, but allow the host to synchronize
with the BMC before it continues.
That would be in the BIOS, and perhaps could even be prototyped as a (host)
grub module.

It seems like there is a lot of mechanism in the BMC that affects the host
booting (like virtual USB bootable media).
It would also be very, very annoying if one could never get boot console
capture after a cold boot, but only after a warm boot.

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        | network architect  [
]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [




* Re: preventing chassis power-on until bmc Ready
  2022-04-19 21:30 ` William Kennington
@ 2022-05-13 13:19   ` Andrew Geissler
From: Andrew Geissler @ 2022-05-13 13:19 UTC
  To: William Kennington; +Cc: OpenBMC Maillist



> On Apr 19, 2022, at 5:30 PM, William Kennington <wak@google.com> wrote:
> 
> I don't really like using multi-user.target to order host startup
> because we may have lots of optional or non-critical services that
> take a long time to become ready that just end up delaying the boot
> time of the server (which is a critical amount of time to reduce for
> our use case).

Yep, I agree. If you only have a few critical services needed to boot,
waiting on multi-user.target is very inefficient.
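
A back-of-the-envelope Python sketch of that cost, assuming all services start
in parallel at time zero. The service names and startup times are made up for
illustration and are not measurements from any system.

```python
# Hypothetical startup durations in seconds; these are not measurements.
STARTUP_SECONDS = {
    "vpd-collection.service": 20,
    "inventory.service": 15,
    "web-ui.service": 45,
    "telemetry.service": 90,
}


def gate_delay(startup_seconds, gate_on):
    """Seconds until every unit in gate_on is up, if all start at t=0."""
    return max((startup_seconds[name] for name in gate_on), default=0)


# Gate power-on only on what the host boot actually needs...
critical = ["vpd-collection.service", "inventory.service"]
# ...versus gating on everything, as waiting on multi-user.target would.
everything = list(STARTUP_SECONDS)

print("critical only:", gate_delay(STARTUP_SECONDS, critical))
print("everything:", gate_delay(STARTUP_SECONDS, everything))
```

With these invented numbers, gating on the two critical services costs 20
seconds, while gating on everything costs 90.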


> I can also see how different platforms probably have
> different definitions of "critical" components based on what the
> bootup firmware ends up querying the BMC for. But having some kind of
> unit we can opt in to ordering services against may be useful, as we
> have our own targets for this purpose on Google BMCs.

Yeah, I like this: an optional, opt-in unit that system owners can order
their services against.

I took a first stab at a design up at:
  https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/53723 


