How to deal with failing services in the boot targets

From: Xo Wang @ 2017-01-25 23:29 UTC (permalink / raw)
  To: OpenBMC Maillist

Hi folks,

I'm seeing vcs-on@0.service failing occasionally. I know the cause of
it (i2c errors) but I'd like to know how to deal with failing services
in the context of OpenBMC boot sequencing.

For example, the service failure isn't reflected by any subsequent
target failures (it reaches obmc-chassis-start@0.target with no
command line errors, only a journal error for vcs-on@0.service
itself), nor did it prevent the boot from proceeding to pdbg host
control.

This is expected behavior given the systemd Unit relationships I used,
but I don't see a clean way to make a unit like vcs-on@.service block
the boot.

I tried making vcs-on@.service [Install]
RequiredBy=obmc-chassis-start@%i.target (and modifying the service
install similarly), but this only prints out a message that
obmc-chassis-start@0.target couldn't be reached due to its failed
dependency. It did not stop the pdbg start IPL.
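
A minimal sketch of that first attempt, showing only the [Install]
section described above (the rest of the unit is omitted, not guessed):

```ini
# vcs-on@.service (fragment)
[Install]
# When the unit is enabled, this gives the target a Requires= dependency
# on vcs-on@%i.service. It does not create a dependency from whichever
# unit starts the IPL, so that unit can still run when the target fails.
RequiredBy=obmc-chassis-start@%i.target
```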

I also tried RequiredBy=obmc-host-start-pre@%i.target. This turned out
even worse because our targets don't require their preceding targets,
so obmc-host-start@0.target is still reachable even with a failure in
obmc-host-start-pre@0.target. Likewise for
obmc-chassis-start@0.target, which now prints no console error at all.

Finally, I could add RequiredBy=start_host@%i.service to
vcs-on@.service, but this seems fragile compared to using the targets
as synchronization points.
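
As a sketch, that direct coupling could look like the fragment below.
The Before= line is my assumption: systemd only refuses to start the
requiring unit when it is also ordered After= the failed one, and I
haven't checked whether that ordering already exists here.

```ini
# vcs-on@.service (fragment; Before= is an assumption, see above)
[Unit]
# Requirement and ordering are separate in systemd. Without an ordering
# dependency the two units are started in parallel.
Before=start_host@%i.service

[Install]
RequiredBy=start_host@%i.service
```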

1) How should I make a host boot service be a blocking step in the chain?

2) Will this require a structural change in the OpenBMC targets?
Making targets require their preceding targets comes to mind. This
would make targets useful not only for sequencing but also for
dependency checking.

3) Do other people also want this? To me it seems obvious that failure
to power on should always block starting IPL, but maybe somebody else
has a good reason to use weaker relationships.
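
To make question 2 concrete, the structural change could be sketched as
a drop-in for one of the targets. The drop-in file name is hypothetical,
and I'm assuming the target doesn't already carry these lines:

```ini
# obmc-host-start@.target.d/require-pre.conf (hypothetical drop-in name)
[Unit]
# Fail this target if the preceding target could not be reached ...
Requires=obmc-host-start-pre@%i.target
# ... and wait for it to be reached before starting this one.
After=obmc-host-start-pre@%i.target
```

Units pulled in by the later target would still need their own ordering
against it (or against the earlier target) for a failure to hold them
back.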

thanks
xo
