openbmc.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* bmcweb non-standard OEM integration
@ 2021-10-27  4:37 Vernon Mauery
  2021-10-27 16:22 ` Ed Tanous
  0 siblings, 1 reply; 7+ messages in thread
From: Vernon Mauery @ 2021-10-27  4:37 UTC (permalink / raw)
  To: Ed Tanous, Development list for OpenBMC

I can't imagine that Intel is the only company on this project that has 
a set of patches against bmcweb. Some of these are for features that 
have not yet been released. Some of these are for OEM things that don't 
fit any of the Redfish-OEM schemas. Some are for features that are 
long-standing upstream reviews that have not yet been merged for 
whatever reason.

We want to move away from patches. We want to be able to share our 
changes with the community. Right now, there is not a way for this sort 
of OEM changes getting into bmcweb. I know not everyone is a huge fan of 
the way that the ipmi-providers code works, but it does work. Companies 
need to be able to have those changes that allow them to differentiate 
from the common phosphor core. I am proposing that we make changes to 
bmcweb that would allow this sort of OEM customization.

I don't have a full proposal yet. But I wanted to get this concept out 
in the open before diving headlong into a metaphorical brick wall. While 
I compared what I would like to the ipmi-providers mechanism, there are 
other ways to do this that can work around many of the objections that 
have been raised to that architecture.

--Vernon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-27  4:37 bmcweb non-standard OEM integration Vernon Mauery
@ 2021-10-27 16:22 ` Ed Tanous
  2021-10-27 20:42   ` Gunnar Mills
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ed Tanous @ 2021-10-27 16:22 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: Development list for OpenBMC

On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
<vernon.mauery@linux.intel.com> wrote:
>
> I can't imagine that Intel is the only company on this project that has
> a set of patches against bmcweb. Some of these are for features that
> have not yet been released. Some of these are for OEM things that don't
> fit any of the Redfish-OEM schemas. Some are for features that are
> long-standing upstream reviews that have not yet been merged for
> whatever reason.
>
> We want to move away from patches.


As an attempt to make this more concrete, I tried to look at Intel-BMC
to figure out what you're talking about.  The only OEM schema I see is
0001-Firmware-update-configuration-changes.patch, which adds support
for defaulting the setup on a firmware update.  DMTF has been
discussing this idea of defaulting a setup very recently (I think we
talked about it last week), so that will hopefully be in the standard
soon, and if you're interested in particular properties of it, you
might want to participate there.

That's the only OEM patch I see;  Is there more?

> We want to be able to share our
> changes with the community. Right now, there is not a way for this sort
> of OEM changes getting into bmcweb.

I'm not sure why you think this, but the current policy is definitely
not "no way".  Have you read the doc on this?
https://github.com/openbmc/bmcweb/blob/master/OEM_SCHEMAS.md

To paraphrase, the above doesn't say "no OEM schemas in
upstream".  It says "OEM schema features need to be discussed in the
relevant communities".  This policy as written was attempting to be
similar to our policy on systemd, linux, ect.

> I know not everyone is a huge fan of
> the way that the ipmi-providers code works, but it does work. Companies
> need to be able to have those changes that allow them to differentiate
> from the common phosphor core. I am proposing that we make changes to
> bmcweb that would allow this sort of OEM customization.

Intentionally avoiding the comparison with ipmi-oem for a moment,
let's see if we can agree on some things:
1. We should implement Redfish schemas (be they OEM or not) to the
standard, with the same level of quality as the standard.
2. The OpenBMC community has a track record of writing incorrect
schemas, then abandoning them (there are positive examples as well).
3. We should avoid duplicating code between schemas in as many cases
as possible.

Now jumping into the comparison, I would posit that Redfish is
different from IPMI:
1. There is a very active standards body, with new spec changes coming
every few months.  IPMI has no such standards body at this point, so
even if you wanted to get a change in, there's no path forward.
2. Redfish has a configurable privilege system, and its interfaces are
reflectable.  This significantly complicates any runtime-dynamic
plugin solution, given that previously static resources would now have
to be generated at startup, which would be quite a bit of code.

None of this is to say "no OEM schemas", more to point out that in
terms of engineering and code, they're not very similar problem
spaces.

>
> I don't have a full proposal yet. But I wanted to get this concept out
> in the open before diving headlong into a metaphorical brick wall. While
> I compared what I would like to the ipmi-providers mechanism, there are
> other ways to do this that can work around many of the objections that
> have been raised to that architecture.
>

I'm not against having OEM schemas, but I do think we need some
guardrails to keep them useful and maintained in the long run, and
there's a good bit of architecture work to make them possible.

Some things to consider in your proposal:
1. Redfish requires a PrivilegeRegistry.  The moment you implement an
OEM resource, you now need an OEM privilege registry to manage the
privileges for it.  Up until my patch recently, bmcweb had a hardcoded
privilege map, which didn't allow for modification on a per-system
basis.  After a few patches, we're at least generating our privilege
tables now, but this would need to be significantly improved to handle
OEM schemas privileges properly.
2. Redfish requires schema files.  In general for company-specific OEM
registries, it's not desirable to have another company's OEM schemas
show up on your systems (and for all I know, might be a copyright
violation, but I'm not a lawyer).  This requires some mechanism to
switch features on and off, which bmcweb already has, in the form of
the meson options.  If we're going to allow OEM options in a lot more
places, we'll need an engineering solution for this that scales
better.
3. Redfish schemas are VERY difficult to write, and the current batch
of maintainers (myself especially) is not up to the task of reviewing
schema file submissions at this time.  I'm personally working to get
better, but at this point every single OEM schema that we've accepted
has bugs.  Side note, I tried to generate type-safe client bindings
for a project recently, and I ended up having to comment out all the
OEM schemas, given how broken they are, so this has real-world effects
on clients usage.
3. As it's in one piece of code, bmcweb can make changes to internal
details, clean up code, and make things more usable over time by
simplifying and reducing code.  Some examples of this include
readJson, boost-dbus to sdbusplus conversion, variable naming,
clang-tidy and many more.  Any change to how we inject OEM handlers
should understand that and make sure it's still possible in the
future.
4. Functionally, code shouldn't be duplicated.  Please plan on your
design doc covering how we're going to deal with that functionally
when others want to reuse OEM code.
5. In your proposal, please plan to propose a set of guidelines for
determining which things should be upstreamed to DMTF, which things
should remain as OpenBMC OEM, and which should remain company specific
OEM.  There are definitely examples that should be each of the three,
but the vast majority of things that I get asked about belong in DMTF
upstream, and I struggle as a maintainer when the DMTF community tells
me they want more participation, but OpenBMC OEM schemas immediately
bifurcate the conversation to a much smaller audience.  IMO in many
cases, OEM resources are used as a trapdoor to avoid having to
upstream things, which is not a pattern we should promote.
6. Given that we've had problems in the webui with hardcoding
resources, please plan on coming up with similar client usage
guidelines to avoid similar problems as we've had with uid handling in
our clients, as well as deprecation policies when things do get
supported in upstream.


In short, I don't see a "metaphorical brick wall", but a gentle set of
rolling hills to climb.  If you can point to something specifically
you'd like to see made OEM that we can use as an example for this kind
of thing to work through the engineering and process changes, please
do.

> --Vernon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-27 16:22 ` Ed Tanous
@ 2021-10-27 20:42   ` Gunnar Mills
  2021-10-27 22:28   ` Ed Tanous
  2021-10-28 22:56   ` Vernon Mauery
  2 siblings, 0 replies; 7+ messages in thread
From: Gunnar Mills @ 2021-10-27 20:42 UTC (permalink / raw)
  To: Ed Tanous, Vernon Mauery; +Cc: Development list for OpenBMC

On 10/27/2021 10:22 AM, Ed Tanous wrote:
> On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
> <vernon.mauery@linux.intel.com> wrote:
>>
>> I can't imagine that Intel is the only company on this project that has
>> a set of patches against bmcweb. 

IBM has a growing number of patches against bmcweb as well.


>> Some of these are for features that
>> have not yet been released. Some of these are for OEM things that don't
>> fit any of the Redfish-OEM schemas. Some are for features that are
>> long-standing upstream reviews that have not yet been merged for
>> whatever reason.

Are OEM things a majority of these patches? Would allowing "OEM 
customization" make much difference in the number of patches against 
bmcweb? I know for us only a small percentage of patches against bmcweb 
are adding Redfish OEM.

>>
>> We want to move away from patches.
> 
> 
> As an attempt to make this more concrete, I tried to look at Intel-BMC
> to figure out what you're talking about.  The only OEM schema I see is
> 0001-Firmware-update-configuration-changes.patch, which adds support
> for defaulting the setup on a firmware update.  DMTF has been
> discussing this idea of defaulting a setup very recently (I think we
> talked about it last week), so that will hopefully be in the standard
> soon, and if you're interested in particular properties of it, you
> might want to participate there
I briefly looked too at Intel-BMC, saw several patches that look to be 
implementing standard Redfish. Things that are already in Redfish. In 
some cases, I didn't see an upstream review. Is there a reason why these 
can't be upstream? What things can we do to help?

> 
> That's the only OEM patch I see;  Is there more?

+1. I would like to understand what Redfish OEM features you are trying 
to add.

> 
>> We want to be able to share our
>> changes with the community. Right now, there is not a way for this sort
>> of OEM changes getting into bmcweb.
> 
> I'm not sure why you think this, but the current policy is definitely
> not "no way". 

OpenBMC has 5 OEM schemas today.
There is also an IBM and Google Rest API. Although I don't think I would 
recommend this approach.

Have you read the doc on this?
> https://github.com/openbmc/bmcweb/blob/master/OEM_SCHEMAS.md
> 
> To paraphrase, the above doesn't say "no OEM schemas in
> upstream".  It says "OEM schema features need to be discussed in the
> relevant communities".  This policy as written was attempting to be
> similar to our policy on systemd, linux, ect.

IBM has had quite a bit of success adding additional features to 
Redfish. Anyone can post to https://redfishforum.com/ and Redfish member 
companies can open issues, attend meetings, and submit PRs.

After adding to Redfish, the bmcweb code is a lot more straightforward. 
For example, we got IdlePowerSave added to the 2021.2 Redfish release, 
the bmcweb commit then doesn't involve much schema discussion and can 
move faster https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/47776

Thanks,
Gunnar


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-27 16:22 ` Ed Tanous
  2021-10-27 20:42   ` Gunnar Mills
@ 2021-10-27 22:28   ` Ed Tanous
  2021-10-28 23:05     ` Vernon Mauery
  2021-10-28 22:56   ` Vernon Mauery
  2 siblings, 1 reply; 7+ messages in thread
From: Ed Tanous @ 2021-10-27 22:28 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: Development list for OpenBMC

On Wed, Oct 27, 2021 at 9:22 AM Ed Tanous <edtanous@google.com> wrote:
>
> On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
> <vernon.mauery@linux.intel.com> wrote:
> >
> >
> > We want to move away from patches.
>

Given this is your goal, let's enumerate these patches to see if
there's a way to get the easy ones knocked out, and get next steps for
all of them.  Overall, I see a lot of half-finished things, or things
that the submitters abandoned immediately after opening the patchset
upstream.  A lot of them could be upstreamed with pretty minimal
rebase and rework.

As a layman's analysis of this, most of the problems don't seem to be
OEM schemas, and fall into a couple categories:
1. Patches already on master, just need to rebase.
2. Standard features that make assumptions about specific addresses,
parts, and names on Intel systems.  I'm not sure how we would get
these into upstream given that we need to support more flash chips and
naming conventions than just those that Intel uses.
3. Patchsets that reimplement things that exist in the standard.  I
think in general we don't want to duplicate things already covered in
the standard, but I'm happy to have a discussion if your opinion
differs.
4. Patchsets that were submitted, then after review never received any
other response from the submitter.  In some cases these just required
minor rework to make them usable on master, then would've been fine.

./0001-Firmware-update-configuration-changes.patch
I talked about this one previously on this thread.  Check out the DMTF
proposals for this.

./0002-Use-chip-id-based-UUID-for-Service-Root.patch
This is reading the chipid directly from the filesystem, and hardcodes
both the chipid and offset specific to intel systems.  The feature is
novel and would be useful, but I think needs rolled into its own
application that bmcweb can grab from so that bmcweb isn't directly
talking to hardware, and other systems can override with their own
specific UUID implementations.  For the moment, bmcweb generates the
Redfish UUID internally, so if this patch were submitted today it
would break a majority of systems.  Re-develop it to not do that, and
it seems like something we could have on master.

./0010-managers-add-attributes-for-Manager.CommandShell.patch
This one was submitted here, but rejected because it would be
incorrect for some systems and effectively break them.  Next steps
involve not hardcoding this data for all systems and relying on a dbus
interface to determine the presence of a serial console, which we
already have helper functions to do.
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/38331

./0011-bmcweb-Add-PhysicalContext-to-Thermal-resources.patch
This one relies on intel's specific naming of dbus paths to generate a
standard property "PhysicalContext".  If that works for Intel, it
seems like it needs to stay a patch, but should be relatively easy to
come up with a solution that all systems could implement that didn't
effectively search for specific names.

./0012-Log-RedFish-event-for-Invalid-login-attempt.patch
This patch looks reasonable (a couple minor things present that need
fixed), I just don't think it was ever submitted.

./0013-Add-UART-routing-logic-into-host-console-connection-.patch
This is directly opening a hardcoded uart on a client connecting, and
directly writing hardware from bmcweb, which is a big design
anti-pattern.  If this is needed, this feature needs to go into
phosphor-console, and not hardcode the specific uart address.

./0014-recommended-fixes-by-crypto-review-team.patch
99% of this looks like "apply mozilla modern cipher rules", which we
used to have a config option for, but nobody used it and it got out of
date.  If this is something you want to see, let's get it behind some
compiler flags, and backed by a security authority like Mozilla, then
this should be fine to submit.  That should at least minimize the
patch down to "places where Intel security teams disagree with
mozilla" which I would hope would be minimal.  Send this to gerrit as
is and we can talk about it more.  There might be other options for
making some of this compile time configurable.

./0015-Add-state-sensor-messages-to-the-registry.patch
This was submitted here:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/40047
and was abandoned 6 months later when the submitter never replied to
the first round of comments.  Not sure how I can help more on that one

./0017-Add-msg-registry-for-subscription-related-actions.patch
This looks like it's duplicating the Resource registry items, which
aren't specific to Event subscriptions, and much more useful.  It's
not clear why you would want to go this route, and the commit message
doesn't talk about why this route was chosen.  If you want to submit
this as-is, we can talk more about it in the review.

./0018-bmcweb-Add-BMC-Time-update-log-to-the-registry.patch
Got submitted here:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/42880
And is waiting on a response from the author.

./0019-Add-generic-message-PropertySizeExceeded.patch
This is making edits to a DMTF owned message registry.  Per the spec,
we can't make modifications to DMTF owned registries, although in DMTF
these types of new messages seem to get in easily, so it seems likely
that you could upstream it to DMTF, or rewrite the patch to put it in
the correct registry.  With that said, it adds code which isn't used
anywhere, so it's not clear why it's needed.

./0020-Redfish-Deny-set-AccountLockDuration-to-zero.patch
This was never submitted, but this kind of business logic and validity
checking needs to go into phosphor-user-manager, not bmcweb, otherwise
we'll duplicate the logic in both bmcweb and ipmi.

./0021-Add-message-registry-entry-for-FirmwareResiliencyErr.patch
This patchset is already merged here:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/43280
Maybe it's just waiting on Intel to Rebase?

./0023-Add-get-IPMI-session-id-s-to-Redfish.patch
This is currently in review here:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/37785
And is waiting on a response from the submitter.  Last response from
Gunnar on August 4th was "I am still really struggling with why we
want this myself."  I have similar concerns.

./0024-Add-count-sensor-type.patch
This adds a new type of sensor not in phosphor-dbus-interfaces, which
also goes somewhat against what we've been doing for other counters,
but let's get the commit open against PDI and start discussing.  The
bmcweb patchset itself is pretty trivial once we have an agreement on
how counters will work in dbus.

./biosconfig/0001-Define-Redfish-interface-Registries-Bios.patch
./biosconfig/0002-BaseBiosTable-Add-support-for-PATCH-operation.patch
./biosconfig/0003-Add-support-to-ResetBios-action.patch
./biosconfig/0004-Add-support-to-ChangePassword-action.patch
./biosconfig/0005-Fix-remove-bios-user-pwd-change-option-via-Redfish.patch

This whole series of patches got submitted and effectively abandoned
(they might still be open) because they didn't implement the Redfish
Registry versioning requirements properly.  Implement that properly,
and I suspect others will want to see this on master as well.

./bmcweb.socket
I have no idea why Intel duplicated the socket file locally in their
meta layer;  It looks roughly the same as upstream (although upstream
can now template configure the port).  I suspect this can be removed?

./eventservice/0001-EventService-Fix-retry-handling-for-http-client.patch
This is already present on master.  Still waiting to rebase?

./eventservice/0002-EventService-https-client-support.patch
This has security issues that were pointed out in the review.  Namely,
it fails to verify certificates, which defeats the point of adding SSL
support.  Once that gets fixed, along with the other comments in the
review here, this should be fine to add to master, but I had to
abandon it because the submitters never replied to comments and the
lack of feedback made it too much effort to review (it made it to
patchset 45 with almost no responses before I abandoned it).  I left a
note that once they replied to the comments they could reopen it, but
that never happened:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/31735

./eventservice/0004-Add-Server-Sent-Events-support.patch
./eventservice/0005-Add-SSE-style-subscription-support-to-eventservice.patch
./eventservice/0006-Add-EventService-SSE-filter-support.patch

The SSE stuff had to be disabled on master because it didn't handle
errors and was trivial to cause bmcweb segfaults from outside the
system.  If that's been fixed here, submit the above patches and we'll
get it re-enabled.  Alternatively, we could enable it under a
bmcweb-unsecure option if you aren't worried about the security
consequences.

./eventservice/0007-EventService-Log-events-for-subscription-actions.patch
This looks like it's duplicated with
0017-Add-msg-registry-for-subscription-related-actions.patch, and has
the same answer.  Use the Resource registry that already exists in
dmtf redfish.

./eventservice/0008-Add-checks-on-Event-Subscription-input-parameters.patch
This patch looks mostly fine, but was never submitted.  Get it
submitted and we'll get the minor things cleaned up and on master.

./eventservice/0009-Restructure-Redifsh-EventLog-Transmit-code-flow.patch
This needs some cleanup (some of which is already on master) then this
looks fine.  It looks like it's just a bunch of error checking and
handling, which is a good thing in general, but I've never seen it
submitted.

./telemetry/0001-Add-support-for-MetricDefinition-scheme.patch
./telemetry/0002-Sync-Telmetry-service-with-EventService.patch
./telemetry/0003-Switched-bmcweb-to-use-new-telemetry-service-API.patch
./telemetry/0004-Add-support-for-MetricDefinition-property-in-MetricReport.patch
./telemetry/0005-Add-DELETE-method-for-MetricReport.patch
./telemetry/0006-Add-GET-method-for-TriggerCollection.patch
./telemetry/0007-Revert-Remove-LogService-from-TelemetryService.patch
./telemetry/0008-event-service-fix-added-Context-field-to-response.patch
./telemetry/0009-Generalize-ReadingType-in-MetricDefinition.patch

The telemetry patches have been difficult, because originally they
were submitted in series, so they're slow to review.  0001 and 0005
are already on master.  The rest have had varying degrees of design
discussions that revolve around getting further clarification on the
redfish specification, and how/if we should implement caching.  In the
first revision, they also choose to reimplement a number of things
that bmcweb already had abstractions for, so the submitters were asked
to clean that up before they merged, and that took quite a bit of
time, although we're still making progress.

./vm/0001-Revert-Disable-nbd-proxy-from-the-build.patch
./vm/0002-bmcweb-handle-device-or-resource-busy-exception.patch
./vm/0003-Add-ConnectedVia-property-to-virtual-media-item-temp.patch
./vm/0004-Invalid-status-code-from-InsertMedia-REST-methods.patch
./vm/0005-Set-Inserted-redfish-property-for-not-inserted-resou.patch
./vm/0006-Bmcweb-handle-permission-denied-exception.patch
./vm/0007-Fix-unmounting-image-in-proxy-mode.patch
./0016-Fix-bmcweb-crashes-if-socket-directory-not-present.patch
These are patches against the nbd backend that never got upstreamed,
so I don't really like accepting patches for dead code.  Several
attempts have been made to reach out to the authors to figure out what
the upstream plan for the nbd backend should be, given it's unusable
in upstream, and I haven't heard any responses.  Currently this whole
module is slated for "possible removal", and is uncompilable on
master.  Ironically patch 0001 comments that line back in, so I know
that someone at intel has at least seen and groked the comment left,
and directly decided that it wasn't on their priority list to upstream
anything.  Hopefully your email above implies that this isn't the
general policy, and Intel is looking to upstream it?  Let's get a
discussion started there.
https://github.com/openbmc/bmcweb/blob/9d832618c74052bd445d6e8b3461946f3c431ca3/meson_options.txt#L7
https://github.com/openbmc/bmcweb/issues/188

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-27 16:22 ` Ed Tanous
  2021-10-27 20:42   ` Gunnar Mills
  2021-10-27 22:28   ` Ed Tanous
@ 2021-10-28 22:56   ` Vernon Mauery
  2021-10-28 23:52     ` Ed Tanous
  2 siblings, 1 reply; 7+ messages in thread
From: Vernon Mauery @ 2021-10-28 22:56 UTC (permalink / raw)
  To: Ed Tanous; +Cc: Development list for OpenBMC

On 27-Oct-2021 09:22 AM, Ed Tanous wrote:
>On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
><vernon.mauery@linux.intel.com> wrote:
>>
>> I can't imagine that Intel is the only company on this project that has
>> a set of patches against bmcweb. Some of these are for features that
>> have not yet been released. Some of these are for OEM things that don't
>> fit any of the Redfish-OEM schemas. Some are for features that are
>> long-standing upstream reviews that have not yet been merged for
>> whatever reason.
>>
>> We want to move away from patches.
>
>
>As an attempt to make this more concrete, I tried to look at Intel-BMC
>to figure out what you're talking about.  The only OEM schema I see is
>0001-Firmware-update-configuration-changes.patch, which adds support
>for defaulting the setup on a firmware update.  DMTF has been
>discussing this idea of defaulting a setup very recently (I think we
>talked about it last week), so that will hopefully be in the standard
>soon, and if you're interested in particular properties of it, you
>might want to participate there.
>
>That's the only OEM patch I see;  Is there more?

All the patches that make changes are OEM patches. Just because they 
don't have an OEM schema, that doesn't mean they are not OEM patches. I 
see OEM patches as a patch written by an OEM. In this case, Intel.

>> We want to be able to share our
>> changes with the community. Right now, there is not a way for this sort
>> of OEM changes getting into bmcweb.
>
>I'm not sure why you think this, but the current policy is definitely
>not "no way".  Have you read the doc on this?
>https://github.com/openbmc/bmcweb/blob/master/OEM_SCHEMAS.md
>
>To paraphrase, the above doesn't say "no OEM schemas in
>upstream".  It says "OEM schema features need to be discussed in the
>relevant communities".  This policy as written was attempting to be
>similar to our policy on systemd, linux, ect.

Discussing things in relevant communities is good for when it comes time 
to actually upstream a patch and make it public. But there are patches 
that are not yet public because of pre-release hardware/specs/features 
that will eventually get upstreamed. Those are in a really tricky 
position because they might actually need to get re-architected and 
re-written as part of this discussion process during the upstream 
process. I agree that might be part of a calculated risk that we would 
have to take on weighing the benefit of getting the feature working vs. 
the risk of possibly needing to rework it in the future. But in the 
meantime, it would be advantageous for Intel to be able to keep it in a 
format that does not break every time it has to be rebased on top of the 
latest bmcweb.

>> I know not everyone is a huge fan of
>> the way that the ipmi-providers code works, but it does work. Companies
>> need to be able to have those changes that allow them to differentiate
>> from the common phosphor core. I am proposing that we make changes to
>> bmcweb that would allow this sort of OEM customization.
>
>Intentionally avoiding the comparison with ipmi-oem for a moment,
>let's see if we can agree on some things:
>1. We should implement Redfish schemas (be they OEM or not) to the
>standard, with the same level of quality as the standard.

I understand that the IPMI spec is dated and unclear in many cases, but 
even with a standard that is still 'living', there will always be cases 
where something does not fit into the standard. It may be a 'not yet' 
situation where a future version adds the feature, or it may be a 
scenario that does not really fit the standard and they don't want it 
but we do. In either case, the implementation needs to exist from the 
point of requirement forward. In the former case, it will eventually get 
merged with bmcweb. In the latter case, it still needs a home.

>2. The OpenBMC community has a track record of writing incorrect
>schemas, then abandoning them (there are positive examples as well).

I feel like this is normal and part of growing pains. How many times has 
any given line in the Linux kernel been rewritten? Projects grow and 
change. If we are waiting for the perfect schema before implementation 
of a feature we will never get there. And then who will pay the bills? I 
am not suggesting that we just throw everything at the wall and see what 
sticks, but have some faith in organizations to come up with at least 
a first good step to appease their customers that have needs.

>3. We should avoid duplicating code between schemas in as many cases
>as possible.

Sure. No argument here.

>Now jumping into the comparison, I would posit that Redfish is
>different from IPMI:
>1. There is a very active standards body, with new spec changes coming
>every few months.  IPMI has no such standards body at this point, so
>even if you wanted to get a change in, there's no path forward.
>2. Redfish has a configurable privilege system, and its interfaces are
>reflectable.  This significantly complicates any runtime-dynamic
>plugin solution, given that previously static resources would now have
>to be generated at startup, which would be quite a bit of code.

It is just code. With code anything is possible. :)

>None of this is to say "no OEM schemas", more to point out that in
>terms of engineering and code, they're not very similar problem
>spaces.

So they will have different solutions. I didn't expect it would be a cut 
and paste situation.

>>
>> I don't have a full proposal yet. But I wanted to get this concept out
>> in the open before diving headlong into a metaphorical brick wall. While
>> I compared what I would like to the ipmi-providers mechanism, there are
>> other ways to do this that can work around many of the objections that
>> have been raised to that architecture.
>>
>
>I'm not against having OEM schemas, but I do think we need some
>guardrails to keep them useful and maintained in the long run, and
>there's a good bit of architecture work to make them possible.
>
>Some things to consider in your proposal:
>1. Redfish requires a PrivilegeRegistry.  The moment you implement an
>OEM resource, you now need an OEM privilege registry to manage the
>privileges for it.  Up until my patch recently, bmcweb had a hardcoded
>privilege map, which didn't allow for modification on a per-system
>basis.  After a few patches, we're at least generating our privilege
>tables now, but this would need to be significantly improved to handle
>OEM schemas privileges properly.

This is good to keep in mind; like I said, so far, only the desire to do 
this is present, not a plan.

>2. Redfish requires schema files.  In general for company-specific OEM
>registries, it's not desirable to have another company's OEM schemas
>show up on your systems (and for all I know, might be a copyright
>violation, but I'm not a lawyer).  This requires some mechanism to
>switch features on and off, which bmcweb already has, in the form of
>the meson options.  If we're going to allow OEM options in a lot more
>places, we'll need an engineering solution for this that scales
>better.

Yes. Nobody wants to endure that sort of embarrassment.

>3. Redfish schemas are VERY difficult to write, and the current batch
>of maintainers (myself especially) is not up to the task of reviewing
>schema file submissions at this time.  I'm personally working to get
>better, but at this point every single OEM schema that we've accepted
>has bugs.  Side note, I tried to generate type-safe client bindings
>for a project recently, and I ended up having to comment out all the
>OEM schemas, given how broken they are, so this has real-world effects
>on clients usage.

This is not my area of expertise, but it sounds like generating valid 
schemas is the sort of thing that needs to be automated to let the 
computer do the work or something. Mere humans are not up to the task.

>3. As it's in one piece of code, bmcweb can make changes to internal
>details, clean up code, and make things more usable over time by
>simplifying and reducing code.  Some examples of this include
>readJson, boost-dbus to sdbusplus conversion, variable naming,
>clang-tidy and many more.  Any change to how we inject OEM handlers
>should understand that and make sure it's still possible in the
>future.

Yup.

>4. Functionally, code shouldn't be duplicated.  Please plan on your
>design doc covering how we're going to deal with that functionally
>when others want to reuse OEM code.

Yes. This is one of my dislikes of the ipmi-oem situation.

>5. In your proposal, please plan to propose a set of guidelines for
>determining which things should be upstreamed to DMTF, which things
>should remain as OpenBMC OEM, and which should remain company specific
>OEM.  There are definitely examples that should be each of the three,
>but the vast majority of things that I get asked about belong in DMTF
>upstream, and I struggle as a maintainer when the DMTF community tells
>me they want more participation, but OpenBMC OEM schemas immediately
>bifurcate the conversation to a much smaller audience.  IMO in many
>cases, OEM resources are used as a trapdoor to avoid having to
>upstream things, which is not a pattern we should promote.

Sure. And given those three scenarios, it might also change where the 
code lives. I agree that OEMs should not use this as just a carte 
blanche approval to do whatever they want.

>6. Given that we've had problems in the webui with hardcoding
>resources, please plan on coming up with similar client usage
>guidelines to avoid similar problems as we've had with uid handling in
>our clients, as well as deprecation policies when things do get
>supported in upstream.

I am not sure I grok this section...

>In short, I don't see a "metaphorical brick wall", but a gentle set of
>rolling hills to climb.  If you can point to something specifically
>you'd like to see made OEM that we can use as an example for this kind
>of thing to work through the engineering and process changes, please
>do.

Thanks for your detailed response.

--Vernon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-27 22:28   ` Ed Tanous
@ 2021-10-28 23:05     ` Vernon Mauery
  0 siblings, 0 replies; 7+ messages in thread
From: Vernon Mauery @ 2021-10-28 23:05 UTC (permalink / raw)
  To: Ed Tanous; +Cc: Development list for OpenBMC

On 27-Oct-2021 03:28 PM, Ed Tanous wrote:
>On Wed, Oct 27, 2021 at 9:22 AM Ed Tanous <edtanous@google.com> wrote:
>>
>> On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
>> <vernon.mauery@linux.intel.com> wrote:
>> >
>> >
>> > We want to move away from patches.
>>
>
>Given this is your goal, let's enumerate these patches to see if
>there's a way to get the easy ones knocked out, and get next steps for
>all of them.  Overall, I see a lot of half-finished things, or things
>that the submitters abandoned immediately after opening the patchset
>upstream.  A lot of them could be upstreamed with pretty minimal
>rebase and rework.

This all started as an internal discussion about the mounting technical 
debt of carrying patches internally for upstream projects. We did a 
similar categorization ourselves.

>As a layman's analysis of this, most of the problems don't seem to be
>OEM schemas, and fall into a couple categories:
>1. Patches already on master, just need to rebase.

Yes. I am trying to push a mindset internally that the feature is not 
complete until it is accepted upstream.

>2. Standard features that make assumptions about specific addresses,
>parts, and names on Intel systems.  I'm not sure how we would get
>these into upstream given that we need to support more flash chips and
>naming conventions than just those that Intel uses.

This is the kind of thing where it is an OEM patch, but maybe not an OEM 
schema. The proposal will need to make room for this code as well.

>3. Patchsets that reimplement things that exist in the standard.  I
>think in general we don't want to duplicate things already covered in
>the standard, but I'm happy to have a discussion if your opinion
>differs.

If the resulting output differs from the original code, we need to have 
a discussion on why. It may be that some of this code is actually the 
same as #2. If the output is the same, but the implementation is 
different, this is just code that should be removed.

>4. Patchsets that were submitted, then after review never received any
>other response from the submitter.  In some cases these just required
>minor rework to make them usable on master, then would've been fine.

Again, this is part of nurturing the changes until they are accepted. It 
is work, but it is current work instead of future work.

I appreciate all the detailed feedback below. I will try to see if we 
can at least reduce the size of our internal patch tree by following 
your recommendations.

--Vernon

>./0001-Firmware-update-configuration-changes.patch
>I talked about this one previously on this thread.  Check out the DMTF
>proposals for this.
>
>./0002-Use-chip-id-based-UUID-for-Service-Root.patch
>This is reading the chipid directly from the filesystem, and hardcodes
>both the chipid and offset specific to intel systems.  The feature is
>novel and would be useful, but I think needs rolled into its own
>application that bmcweb can grab from so that bmcweb isn't directly
>talking to hardware, and other systems can override with their own
>specific UUID implementations.  For the moment, bmcweb generates the
>Redfish UUID internally, so if this patch were submitted today it
>would break a majority of systems.  Re-develop it to not do that, and
>it seems like something we could have on master.
>
>./0010-managers-add-attributes-for-Manager.CommandShell.patch
>This one was submitted here, but rejected because it would be
>incorrect for some systems and effectively break them.  Next steps
>involve not hardcoding this data for all systems and relying on a dbus
>interface to determine the presence of a serial console, which we
>already have helper functions to do.
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/38331
>
>./0011-bmcweb-Add-PhysicalContext-to-Thermal-resources.patch
>This one relies on intel's specific naming of dbus paths to generate a
>standard property "PhysicalContext".  If that works for Intel, it
>seems like it needs to stay a patch, but should be relatively easy to
>come up with a solution that all systems could implement that didn't
>effectively search for specific names.
>
>./0012-Log-RedFish-event-for-Invalid-login-attempt.patch
>This patch looks reasonable (a couple minor things present that need
>fixed), I just don't think it was ever submitted.
>
>./0013-Add-UART-routing-logic-into-host-console-connection-.patch
>This is directly opening a hardcoded uart on a client connecting, and
>directly writing hardware from bmcweb, which is a big design
>anti-pattern.  If this is needed, this feature needs to go into
>phosphor-console, and not hardcode the specific uart address.
>
>./0014-recommended-fixes-by-crypto-review-team.patch
>99% of this looks like "apply mozilla modern cipher rules", which we
>used to have a config option for, but nobody used it and it got out of
>date.  If this is something you want to see, let's get it behind some
>compiler flags, and backed by a security authority like Mozilla, then
>this should be fine to submit.  That should at least minimize the
>patch down to "places where Intel security teams disagree with
>mozilla" which I would hope would be minimal.  Send this to gerrit as
>is and we can talk about it more.  There might be other options for
>making some of this compile time configurable.
>
>./0015-Add-state-sensor-messages-to-the-registry.patch
>This was submitted here:
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/40047
>and was abandoned 6 months later when the submitter never replied to
>the first round of comments.  Not sure how I can help more on that one
>
>./0017-Add-msg-registry-for-subscription-related-actions.patch
>This looks like it's duplicating the Resource registry items, which
>aren't specific to Event subscriptions, and much more useful.  It's
>not clear why you would want to go this route, and the commit message
>doesn't talk about why this route was chosen.  If you want to submit
>this as-is, we can talk more about it in the review.
>
>./0018-bmcweb-Add-BMC-Time-update-log-to-the-registry.patch
>Got submitted here:
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/42880
>And is waiting on a response from the author.
>
>./0019-Add-generic-message-PropertySizeExceeded.patch
>This is making edits to a DMTF owned message registry.  Per the spec,
>we can't make modifications to DMTF owned registries, although in DMTF
>these types of new messages seem to get in easily, so it seems likely
>that you could upstream it to DMTF, or rewrite the patch to put it in
>the correct registry.  With that said, it adds code which isn't used
>anywhere, so it's not clear why it's needed.
>
>./0020-Redfish-Deny-set-AccountLockDuration-to-zero.patch
>This was never submitted, but this kind of business logic and validity
>checking needs to go into phosphor-user-manager, not bmcweb, otherwise
>we'll duplicate the logic in both bmcweb and ipmi.
>
>./0021-Add-message-registry-entry-for-FirmwareResiliencyErr.patch
>This patchset is already merged here:
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/43280
>Maybe it's just waiting on Intel to Rebase?
>
>./0023-Add-get-IPMI-session-id-s-to-Redfish.patch
>This is currently in review here:
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/37785
>And is waiting on a response from the submitter.  Last response from
>Gunnar on August 4th was "I am still really struggling with why we
>want this myself."  I have similar concerns.
>
>./0024-Add-count-sensor-type.patch
>This adds a new type of sensor not in phosphor-dbus-interfaces, which
>also goes somewhat against what we've been doing for other counters,
>but let's get the commit open against PDI and start discussing.  The
>bmcweb patchset itself is pretty trivial once we have an agreement on
>how counters will work in dbus.
>
>./biosconfig/0001-Define-Redfish-interface-Registries-Bios.patch
>./biosconfig/0002-BaseBiosTable-Add-support-for-PATCH-operation.patch
>./biosconfig/0003-Add-support-to-ResetBios-action.patch
>./biosconfig/0004-Add-support-to-ChangePassword-action.patch
>./biosconfig/0005-Fix-remove-bios-user-pwd-change-option-via-Redfish.patch
>
>This whole series of patches got submitted and effectively abandoned
>(they might still be open) because they didn't implement the Redfish
>Registry versioning requirements properly.  Implement that properly,
>and I suspect others will want to see this on master as well.
>
>./bmcweb.socket
>I have no idea why Intel duplicated the socket file locally in their
>meta layer;  It looks roughly the same as upstream (although upstream
>can now template configure the port).  I suspect this can be removed?
>
>./eventservice/0001-EventService-Fix-retry-handling-for-http-client.patch
>This is already present on master.  Still waiting to rebase?
>
>./eventservice/0002-EventService-https-client-support.patch
>This has security issues that were pointed out in the review.  Namely,
>it fails to verify certificates, which defeats the point of adding SSL
>support.  Once that gets fixed, along with the other comments in the
>review here, this should be fine to add to master, but I had to
>abandon it because the submitters never replied to comments and the
>lack of feedback made it too much effort to review (it made it to
>patchset 45 with almost no responses before I abandoned it).  I left a
>note that once they replied to the comments they could reopen it, but
>that never happened:
>https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/31735
>
>./eventservice/0004-Add-Server-Sent-Events-support.patch
>./eventservice/0005-Add-SSE-style-subscription-support-to-eventservice.patch
>./eventservice/0006-Add-EventService-SSE-filter-support.patch
>
>The SSE stuff had to be disabled on master because it didn't handle
>errors and was trivial to cause bmcweb segfaults from outside the
>system.  If that's been fixed here, submit the above patches and we'll
>get it re-enabled.  Alternatively, we could enable it under a
>bmcweb-unsecure option if you aren't worried about the security
>consequences.
>
>./eventservice/0007-EventService-Log-events-for-subscription-actions.patch
>This looks like it's duplicated with
>0017-Add-msg-registry-for-subscription-related-actions.patch, and has
>the same answer.  Use the Resource registry that already exists in
>dmtf redfish.
>
>./eventservice/0008-Add-checks-on-Event-Subscription-input-parameters.patch
>This patch looks mostly fine, but was never submitted.  Get it
>submitted and we'll get the minor things cleaned up and on master.
>
>./eventservice/0009-Restructure-Redifsh-EventLog-Transmit-code-flow.patch
>This needs some cleanup (some of which is already on master) then this
>looks fine.  It looks like it's just a bunch of error checking and
>handling, which is a good thing in general, but I've never seen it
>submitted.
>
>./telemetry/0001-Add-support-for-MetricDefinition-scheme.patch
>./telemetry/0002-Sync-Telmetry-service-with-EventService.patch
>./telemetry/0003-Switched-bmcweb-to-use-new-telemetry-service-API.patch
>./telemetry/0004-Add-support-for-MetricDefinition-property-in-MetricReport.patch
>./telemetry/0005-Add-DELETE-method-for-MetricReport.patch
>./telemetry/0006-Add-GET-method-for-TriggerCollection.patch
>./telemetry/0007-Revert-Remove-LogService-from-TelemetryService.patch
>./telemetry/0008-event-service-fix-added-Context-field-to-response.patch
>./telemetry/0009-Generalize-ReadingType-in-MetricDefinition.patch
>
>The telemetry patches have been difficult, because originally they
>were submitted in series, so they're slow to review.  0001 and 0005
>are already on master.  The rest have had varying degrees of design
>discussions that revolve around getting further clarification on the
>redfish specification, and how/if we should implement caching.  In the
>first revision, they also choose to reimplement a number of things
>that bmcweb already had abstractions for, so the submitters were asked
>to clean that up before they merged, and that took quite a bit of
>time, although we're still making progress.
>
>./vm/0001-Revert-Disable-nbd-proxy-from-the-build.patch
>./vm/0002-bmcweb-handle-device-or-resource-busy-exception.patch
>./vm/0003-Add-ConnectedVia-property-to-virtual-media-item-temp.patch
>./vm/0004-Invalid-status-code-from-InsertMedia-REST-methods.patch
>./vm/0005-Set-Inserted-redfish-property-for-not-inserted-resou.patch
>./vm/0006-Bmcweb-handle-permission-denied-exception.patch
>./vm/0007-Fix-unmounting-image-in-proxy-mode.patch
>./0016-Fix-bmcweb-crashes-if-socket-directory-not-present.patch
>These are patches against the nbd backend that never got upstreamed,
>so I don't really like accepting patches for dead code.  Several
>attempts have been made to reach out to the authors to figure out what
>the upstream plan for the nbd backend should be, given it's unusable
>in upstream, and I haven't heard any responses.  Currently this whole
>module is slated for "possible removal", and is uncompilable on
>master.  Ironically patch 0001 comments that line back in, so I know
>that someone at intel has at least seen and groked the comment left,
>and directly decided that it wasn't on their priority list to upstream
>anything.  Hopefully your email above implies that this isn't the
>general policy, and Intel is looking to upstream it?  Let's get a
>discussion started there.
>https://github.com/openbmc/bmcweb/blob/9d832618c74052bd445d6e8b3461946f3c431ca3/meson_options.txt#L7
>https://github.com/openbmc/bmcweb/issues/188

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bmcweb non-standard OEM integration
  2021-10-28 22:56   ` Vernon Mauery
@ 2021-10-28 23:52     ` Ed Tanous
  0 siblings, 0 replies; 7+ messages in thread
From: Ed Tanous @ 2021-10-28 23:52 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: Development list for OpenBMC

On Thu, Oct 28, 2021 at 3:57 PM Vernon Mauery
<vernon.mauery@linux.intel.com> wrote:
>
> On 27-Oct-2021 09:22 AM, Ed Tanous wrote:
> >On Tue, Oct 26, 2021 at 9:37 PM Vernon Mauery
> ><vernon.mauery@linux.intel.com> wrote:
> >>
> >> I can't imagine that Intel is the only company on this project that has
> >> a set of patches against bmcweb. Some of these are for features that
> >> have not yet been released. Some of these are for OEM things that don't
> >> fit any of the Redfish-OEM schemas. Some are for features that are
> >> long-standing upstream reviews that have not yet been merged for
> >> whatever reason.
> >>
> >> We want to move away from patches.
> >
> >
> >As an attempt to make this more concrete, I tried to look at Intel-BMC
> >to figure out what you're talking about.  The only OEM schema I see is
> >0001-Firmware-update-configuration-changes.patch, which adds support
> >for defaulting the setup on a firmware update.  DMTF has been
> >discussing this idea of defaulting a setup very recently (I think we
> >talked about it last week), so that will hopefully be in the standard
> >soon, and if you're interested in particular properties of it, you
> >might want to participate there.
> >
> >That's the only OEM patch I see;  Is there more?
>
> All the patches that make changes are OEM patches. Just because they
> don't have an OEM schema, that doesn't mean they are not OEM patches. I
> see OEM patches as a patch written by an OEM. In this case, Intel.

Gotcha, so you're not referring to "OEM" in the way redfish refers to
it, you're referring to it in the way "anything I want to override for
a system".  Got it.

>
> >> We want to be able to share our
> >> changes with the community. Right now, there is not a way for this sort
> >> of OEM changes getting into bmcweb.
> >
> >I'm not sure why you think this, but the current policy is definitely
> >not "no way".  Have you read the doc on this?
> >https://github.com/openbmc/bmcweb/blob/master/OEM_SCHEMAS.md
> >
> >To paraphrase, the above doesn't say "no OEM schemas in
> >upstream".  It says "OEM schema features need to be discussed in the
> >relevant communities".  This policy as written was attempting to be
> >similar to our policy on systemd, linux, ect.
>
> Discussing things in relevant communities is good for when it comes time
> to actually upstream a patch and make it public. But there are patches
> that are not yet public because of pre-release hardware/specs/features
> that will eventually get upstreamed. Those are in a really tricky
> position because they might actually need to get re-architected and
> re-written as part of this discussion process during the upstream
> process.

But isn't this the point of submitting to upstream?  Reachitecting it
to make sure it's useful to others, and has community input?  For the
not-yet-public pre-release hardware, I'd really like to understand
that use case.  By design, there shouldn't be any system-specific code
in bmcweb.

FWIW, none of the patches I saw in intel-bmc were system specific, or
leaked any information about pre-release hardware, considering all but
one were following the DMTF spec pieces.  Maybe there's other examples
coming?

> I agree that might be part of a calculated risk that we would
> have to take on weighing the benefit of getting the feature working vs.
> the risk of possibly needing to rework it in the future. But in the
> meantime, it would be advantageous for Intel to be able to keep it in a
> format that does not break every time it has to be rebased on top of the
> latest bmcweb.

I think this is a case where "advantageous to Intel" and "Advantageous
to OpenBMC" are in conflict.  I'm not sure how we can avoid breaking
code that we can't see or build, unless you have some thoughts there?
FWIW, IPMI layer doesn't solve this either although maybe it makes it
more manageable?  I'd be interested to see some concrete designs on
how this could work.  Today, bmcweb tries to treat dbus as the
system-specific abstraction layer, but clearly that's not working.

>
> >> I know not everyone is a huge fan of
> >> the way that the ipmi-providers code works, but it does work. Companies
> >> need to be able to have those changes that allow them to differentiate
> >> from the common phosphor core. I am proposing that we make changes to
> >> bmcweb that would allow this sort of OEM customization.
> >
> >Intentionally avoiding the comparison with ipmi-oem for a moment,
> >let's see if we can agree on some things:
> >1. We should implement Redfish schemas (be they OEM or not) to the
> >standard, with the same level of quality as the standard.
>
> I understand that the IPMI spec is dated and unclear in many cases, but
> even with a standard that is still 'living', there will always be cases
> where something does not fit into the standard. It may be a 'not yet'
> situation where a future version adds the feature, or it may be a
> scenario that does not really fit the standard and they don't want it
> but we do. In either case, the implementation needs to exist from the
> point of requirement forward.

I'm not following, are you saying that we should allow patches that
break the Redfish spec?  That seems like a significant departure, and
I would hope I'm misunderstanding it, because it would significantly
change how testing, tooling, and other things need to work if we
needed to maintain an openbmc-specific version of all the testing
tools.  If there are cases where the schemas are in review or accepted
on DMTF master but not yet in a release, I'm happy to discuss better
staging of that kind of thing, but in practice, they tend to be faster
to review and merge than OpenBMC is.

> In the former case, it will eventually get
> merged with bmcweb. In the latter case, it still needs a home.

The OEM_SCHEMAS.md doesn't preclude having things in review that
haven't hit a standard yet.  It's just asked that they're on that
path, which involves submitting a request to DMTF first.

>
> >2. The OpenBMC community has a track record of writing incorrect
> >schemas, then abandoning them (there are positive examples as well).
>
> I feel like this is normal and part of growing pains. How many times has
> any given line in the Linux kernel been rewritten?

The better analogy would be "How many times has a given uabi been
changed?" considering we're talking about a user-facing API schema
here.  The answer is, rare to never.  The other analogy would be
kernel modules, which to my understanding have no version-to-version
guarantees, but maybe are more manageable than patches?

> Projects grow and
> change. If we are waiting for the perfect schema before implementation
> of a feature we will never get there.

I'm not asking for perfection, I'm asking for something that at first
glance and in testing follows the spec and has someone that owns it
and takes responsibility for testing and improving it over time.
Maybe that's too much?

> And then who will pay the bills? I
> am not suggesting that we just throw everything at the wall and see what
> sticks, but have some faith in organizations to come up with at least
> a first good step to appease their customers that have needs.

Customers generally ask for some level of API stability, given that
they're coding against specific implementations, and any change has
the risk of breaking things.  Maybe we need a bmcweb-unstable branch
for api-breaking things that are WIP?  Just thinking out loud.  I'm
not really sure I'm up to maintain such a thing.

I do like the model that we've been taking with the new sensor
schemas, where we have options for both the new and old behaviors, and
they can be switched on and off at will.  Maybe some model like that
would work?

>
> >3. We should avoid duplicating code between schemas in as many cases
> >as possible.
>
> Sure. No argument here.
>
> >Now jumping into the comparison, I would posit that Redfish is
> >different from IPMI:
> >1. There is a very active standards body, with new spec changes coming
> >every few months.  IPMI has no such standards body at this point, so
> >even if you wanted to get a change in, there's no path forward.
> >2. Redfish has a configurable privilege system, and its interfaces are
> >reflectable.  This significantly complicates any runtime-dynamic
> >plugin solution, given that previously static resources would now have
> >to be generated at startup, which would be quite a bit of code.
>
> It is just code. With code anything is possible. :)

Sure code can do anything, but nobody (with the exception of me
recently with the Privilege generation code) has sent any code to
solve this.  If there's code to solve this, or it's easier than I'm
thinking it is, I'd love to see the patch :)

>
> >None of this is to say "no OEM schemas", more to point out that in
> >terms of engineering and code, they're not very similar problem
> >spaces.
>
> So they will have different solutions. I didn't expect it would be a cut
> and paste situation.
>
> >>
> >> I don't have a full proposal yet. But I wanted to get this concept out
> >> in the open before diving headlong into a metaphorical brick wall. While
> >> I compared what I would like to the ipmi-providers mechanism, there are
> >> other ways to do this that can work around many of the objections that
> >> have been raised to that architecture.
> >>
> >
> >I'm not against having OEM schemas, but I do think we need some
> >guardrails to keep them useful and maintained in the long run, and
> >there's a good bit of architecture work to make them possible.
> >
> >Some things to consider in your proposal:
> >1. Redfish requires a PrivilegeRegistry.  The moment you implement an
> >OEM resource, you now need an OEM privilege registry to manage the
> >privileges for it.  Up until my patch recently, bmcweb had a hardcoded
> >privilege map, which didn't allow for modification on a per-system
> >basis.  After a few patches, we're at least generating our privilege
> >tables now, but this would need to be significantly improved to handle
> >OEM schemas privileges properly.
>
> This is good to keep in mind; like I said, so far, only the desire to do
> this is present, not a plan.

FWIW, this work is ongoing, and could use some adventurous people like
yourself to keep helping it along.  The next link in the chain is:
https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/45125

>
> >2. Redfish requires schema files.  In general for company-specific OEM
> >registries, it's not desirable to have another company's OEM schemas
> >show up on your systems (and for all I know, might be a copyright
> >violation, but I'm not a lawyer).  This requires some mechanism to
> >switch features on and off, which bmcweb already has, in the form of
> >the meson options.  If we're going to allow OEM options in a lot more
> >places, we'll need an engineering solution for this that scales
> >better.
>
> Yes. Nobody wants to endure that sort of embarrassment.
>
> >3. Redfish schemas are VERY difficult to write, and the current batch
> >of maintainers (myself especially) is not up to the task of reviewing
> >schema file submissions at this time.  I'm personally working to get
> >better, but at this point every single OEM schema that we've accepted
> >has bugs.  Side note, I tried to generate type-safe client bindings
> >for a project recently, and I ended up having to comment out all the
> >OEM schemas, given how broken they are, so this has real-world effects
> >on clients usage.
>
> This is not my area of expertise, but it sounds like generating valid
> schemas is the sort of thing that needs to be automated to let the
> computer do the work or something.

Seems reasonable, but it's going to require someone to look into it.
Recently DMTF has published their Schema generator tool, which might
have some solutions for this if we hooked it into the build system,
but I haven't looked into it in depth.  If someone else has, I'd love
to hear your experience.

> Mere humans are not up to the task.
>
> >3. As it's in one piece of code, bmcweb can make changes to internal
> >details, clean up code, and make things more usable over time by
> >simplifying and reducing code.  Some examples of this include
> >readJson, boost-dbus to sdbusplus conversion, variable naming,
> >clang-tidy and many more.  Any change to how we inject OEM handlers
> >should understand that and make sure it's still possible in the
> >future.
>
> Yup.
>
> >4. Functionally, code shouldn't be duplicated.  Please plan on your
> >design doc covering how we're going to deal with that functionally
> >when others want to reuse OEM code.
>
> Yes. This is one of my dislikes of the ipmi-oem situation.
>
> >5. In your proposal, please plan to propose a set of guidelines for
> >determining which things should be upstreamed to DMTF, which things
> >should remain as OpenBMC OEM, and which should remain company specific
> >OEM.  There are definitely examples that should be each of the three,
> >but the vast majority of things that I get asked about belong in DMTF
> >upstream, and I struggle as a maintainer when the DMTF community tells
> >me they want more participation, but OpenBMC OEM schemas immediately
> >bifurcate the conversation to a much smaller audience.  IMO in many
> >cases, OEM resources are used as a trapdoor to avoid having to
> >upstream things, which is not a pattern we should promote.
>
> Sure. And given those three scenarios, it might also change where the
> code lives. I agree that OEMs should not use this as just a carte
> blanche approval to do whatever they want.
>
> >6. Given that we've had problems in the webui with hardcoding
> >resources, please plan on coming up with similar client usage
> >guidelines to avoid similar problems as we've had with uid handling in
> >our clients, as well as deprecation policies when things do get
> >supported in upstream.
>
> I am not sure I grok this section...

To give a quick example, if one system supports the key
ServiceRoot/MyOEMProperty, and another system doesn't, you can't write
the code (python)

j = requests.get("/redfish/v1/").json()
print(j["OEM"]["MyOEMProperty"])

This code will throw an exception for any system that doesn't support
that property.  We need to document that these keys don't have the
same level of stability as built-in redfish things, and no openbmc
client should directly rely on them.  Also, given that this is a
user-facing interface, we need to set expectations for version to
version stability (ie, we probably shouldn't be pulling a
long-standing feature out from underneath someone without warning).

>
> >In short, I don't see a "metaphorical brick wall", but a gentle set of
> >rolling hills to climb.  If you can point to something specifically
> >you'd like to see made OEM that we can use as an example for this kind
> >of thing to work through the engineering and process changes, please
> >do.
>
> Thanks for your detailed response.

Sounds like we're mostly on the same page.   Looking forward to seeing
what you're coming up with.

>
> --Vernon

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-28 23:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-27  4:37 bmcweb non-standard OEM integration Vernon Mauery
2021-10-27 16:22 ` Ed Tanous
2021-10-27 20:42   ` Gunnar Mills
2021-10-27 22:28   ` Ed Tanous
2021-10-28 23:05     ` Vernon Mauery
2021-10-28 22:56   ` Vernon Mauery
2021-10-28 23:52     ` Ed Tanous

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).