* Using D-Bus well-known service names and deprecating mapper service lookups
@ 2019-04-11  5:02 Andrew Jeffery
  2019-04-11 15:51 ` Andrew Geissler
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Jeffery @ 2019-04-11  5:02 UTC (permalink / raw)
  To: openbmc; +Cc: Brad Bishop, Andrew Geissler, wak

Hello,

Recently Andrew Geissler, William and I have been running into limitations of the mapper's service lookups. In my opinion we should move to using well-known service names and avoid relying on the mapper to determine who to talk to.

My issues with using the mapper for service dependencies are:

1. Ambiguities in the way `mapper wait` functions
2. Race conditions between bus name announcement, mapper introspection and inbound service lookups
3. Increased IPC traffic (decreasing performance).

Expanding on 1, `mapper wait` takes a path name argument and waits until the path is available on a bus connection. The problem here is that multiple connections can (and do) implement objects under a given path, so the successful exit of `mapper wait` with the expectation that a given service is available can be an undefined result, as it's possible the *wrong* service was used to resolve the wait.

Where this hits home is that we currently have a systemd service, mapper-wait@.service, that allows units to depend on the presence of a service providing a required path. With the ambiguity outlined above it's possible that dependent services are started earlier than they should be due to `mapper wait` falsely exiting success with respect to the dependent's actual requirement.
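To make the ambiguity concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the mapper's implementation, and the path and service names are hypothetical. It contrasts a path-based wait, which is satisfied by any owner of the path, with the check a dependent unit actually needs.

```python
# Toy model of the ambiguity: which services own objects at which paths.
# All service and path names here are hypothetical.
from typing import Dict, Set


def path_available(owners: Dict[str, Set[str]], path: str) -> bool:
    """Path-based wait: satisfied as soon as *any* connection owns the path."""
    return bool(owners.get(path))


def service_available(owners: Dict[str, Set[str]], path: str, service: str) -> bool:
    """What the dependent actually needs: a *specific* service on the path."""
    return service in owners.get(path, set())


path = "/xyz/openbmc_project/example"
wanted = "xyz.openbmc_project.Wanted"
# Only an unrelated service has claimed the path so far.
owners = {path: {"xyz.openbmc_project.Other"}}

print(path_available(owners, path))             # the wait would succeed
print(service_available(owners, path, wanted))  # but the wanted service is absent
```

In this model the wait reports success while the service the dependent cares about has not appeared yet, which is the early-start case described above.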

A quick grep of the openbmc tree gives an idea of the potential problem size:

$ git ls-files -- recipes-phosphor/ | grep '\.service$' | wc -l
98
$ git ls-files -- recipes-phosphor/ | grep '\.service$' | xargs grep -li mapper-wait | wc -l
32

So about a third of the generic services are potentially exposed.

Point 2 leads to patches like [1] where we wind up implementing a fallback to a well-known service name if we get issues with a mapper lookup. If this is a valid approach then we should just avoid looking up the service altogether; implementing this fallback approach everywhere seems tedious and error-prone, and the lookup is a waste of resources if the fallback works regardless.
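A minimal Python sketch of the pattern being criticised, under stated assumptions: `lookup` is a hypothetical stand-in for a mapper query, and the bus name is made up. It is not the code from [1]; it only shows that when the fallback is always valid, the lookup adds nothing but a round trip.

```python
# Sketch of lookup-with-fallback versus using the well-known name directly.
# `lookup` stands in for a mapper query; it is a hypothetical callable.
from typing import Callable, Optional

WELL_KNOWN = "xyz.openbmc_project.Example"  # hypothetical bus name


def resolve_with_fallback(lookup: Callable[[str], Optional[str]], path: str) -> str:
    """Ask the mapper first; fall back to the well-known name on failure."""
    try:
        service = lookup(path)
    except LookupError:
        service = None
    return service or WELL_KNOWN


def resolve_direct(path: str) -> str:
    """No lookup at all: the well-known name is the answer."""
    return WELL_KNOWN


def racing_lookup(path: str) -> Optional[str]:
    """Models a lookup that arrives before the mapper has indexed the service."""
    raise LookupError(path)


print(resolve_with_fallback(racing_lookup, "/xyz/openbmc_project/example"))
print(resolve_direct("/xyz/openbmc_project/example"))
```

Both calls yield the same name, so every caller that carries the fallback could equally use the direct form.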

By moving to well-known service names we can exploit more of systemd's D-Bus integration: with units of Type=dbus, and by depending either on unit names aliased to D-Bus service names or on D-Bus service activation, we get reliable behaviour for D-Bus-based dependencies.
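As a rough illustration, the Python sketch below renders the text of two unit files. `Type=dbus`, `BusName=`, `Alias=`, `Wants=` and `After=` are real systemd directives; the bus name, the executable paths and the idea of generating units from Python are assumptions made purely for the example.

```python
# Renders the text of a Type=dbus unit and of a unit that depends on it.
# Bus name and ExecStart paths are hypothetical placeholders.

def render_dbus_unit(bus_name: str, exec_start: str) -> str:
    """Provider unit: systemd treats it as started once `bus_name` is owned."""
    return "\n".join([
        "[Unit]",
        f"Description=Example provider of {bus_name}",
        "",
        "[Service]",
        "Type=dbus",
        f"BusName={bus_name}",
        f"ExecStart={exec_start}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        # Alias the unit to the D-Bus service name so others can depend on it.
        f"Alias=dbus-{bus_name}.service",
        "",
    ])


def render_dependent_unit(provider_unit: str, exec_start: str) -> str:
    """Dependent unit: ordered after the provider has acquired its bus name."""
    return "\n".join([
        "[Unit]",
        f"Wants={provider_unit}",
        f"After={provider_unit}",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        "",
    ])


name = "xyz.openbmc_project.Example"
provider = render_dbus_unit(name, "/usr/bin/example-provider")
dependent = render_dependent_unit(f"dbus-{name}.service", "/usr/bin/example-client")
print(provider)
print(dependent)
```

The dependent needs no wait helper here: the ordering is expressed against the provider's bus name rather than against an object path.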

Point 3 is something William has complained about, and I think it's a good point - avoiding IPC where we can is a good thing no matter what we're doing.

I'm kicking this thread off because it's something that's bothered me for a while. I've complained about it to various people in private but have never taken it any further. My impression is that there's support for the idea that we move away from the mapper for service lookups (though we can't eliminate it entirely, we still need the interface reverse mapping feature), and I want to gauge how much support it has and whether there are solid counter-arguments.

Cheers,

Andrew

[1] https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-state-manager/+/20460


* Re: Using D-Bus well-known service names and deprecating mapper service lookups
  2019-04-11  5:02 Using D-Bus well-known service names and deprecating mapper service lookups Andrew Jeffery
@ 2019-04-11 15:51 ` Andrew Geissler
  2019-04-12  3:47   ` Andrew Jeffery
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Geissler @ 2019-04-11 15:51 UTC (permalink / raw)
  To: Andrew Jeffery; +Cc: OpenBMC Maillist, Brad Bishop, William Kennington

On Thu, Apr 11, 2019 at 12:03 AM Andrew Jeffery <andrew@aj.id.au> wrote:
>
> Hello,
>
> Recently Andrew Geissler, William and myself have been running into limitations of the mapper's service lookups. In my opinion we should move to use well known service names and avoiding reliance on the mapper to determine who to talk to.

One thing to ask ourselves is what are the advantages to mapper and
the GetObject concept (I think we all agree(?) that features like
getting the subtree and associations are still useful).
For GetObject the main thing I see as an advantage is actually
relevant in my patch you linked. My code wants to talk to the host.
The interface to talk to the host is currently provided by ipmid but
at some point in the future that may move to the pldm daemon. Using
mapper to get the object that provides my interface abstracts that from
me. Will the pldm daemon use the same well-known name as ipmid does
now? I'm not so sure about that. I know ipmid does currently implement
two well known names, so we could potentially move it.

Another feature we've thought about would be if we some day support
multi-BMC distributed systems. Having something like mapper
abstracting cross BMC communication could be useful (i.e. process A
just knows it needs to talk to interface B, mapper could route process
A through a dbus proxy in another BMC without process A having any
knowledge of that). Not really a fully formed thought but just
something we kicked around a bit when talking about distributed BMCs.

>
> My issues with using the mapper for service dependencies are:
>
> 1. Ambiguities in the way `mapper wait` functions
> 2. Race conditions between bus name announcement, mapper introspection and inbound service lookups
> 3. Increased IPC traffic (decreasing performance).
>
> Expanding on 1, `mapper wait` takes a path name argument and waits until the path is available on a bus connection. The problem here is that multiple connections can (and do) implement objects under a given path, so the successful exit of `mapper wait` with the expectation that a given service is available can be an undefined result, as it's possible the *wrong* service was used to resolve the wait.
>
> Where this hits home is that we currently have a systemd service, mapper-wait@.service, that allows units to depend on the presence of a service providing a required path. With the ambiguity outlined above it's possible that dependent services are started earlier than they should be due to `mapper wait` falsely exiting success with respect to the dependent's actual requirement.
>
> A quick grep of the openbmc tree gives an idea of the potential problem size:
>
> $ git ls-files -- recipes-phosphor/ | grep '\.service$' | wc -l
> 98
> $ git ls-files -- recipes-phosphor/ | grep '\.service$' | xargs grep -li mapper-wait | wc -l
> 32
>
> So about a 1/3 generic services are potentially exposed.

Yes, this has been hounding me for a while now. We had a discussion in
IRC a few months ago about having mapper stop doing introspection and
move to using the InterfacesAdded signal. This would improve
performance (introspection is resource intensive) and it would
severely limit (if not fix) the race condition issues. I took a stab
at a patch (https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-objmgr/+/17837)
but have been running into some weird issues I haven't gotten to the
bottom of yet.
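As a rough illustration of the signal-driven idea, the Python sketch below keeps an index up to date from InterfacesAdded and InterfacesRemoved events instead of re-introspecting services. It is an in-memory model with no real D-Bus connection, and it is not the code in the patch above; the class and method names are hypothetical.

```python
# In-memory sketch of a signal-driven object index; no real D-Bus is involved.
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class ObjectIndex:
    def __init__(self) -> None:
        # object path -> service name -> set of interface names
        self._objects: Dict[str, Dict[str, Set[str]]] = defaultdict(dict)

    def interfaces_added(self, service: str, path: str, interfaces: Iterable[str]) -> None:
        """Record interfaces announced by `service` for the object at `path`."""
        self._objects[path].setdefault(service, set()).update(interfaces)

    def interfaces_removed(self, service: str, path: str, interfaces: Iterable[str]) -> None:
        """Forget interfaces; drop empty services and paths as they empty out."""
        owned = self._objects.get(path, {}).get(service)
        if owned is None:
            return
        owned.difference_update(interfaces)
        if not owned:
            del self._objects[path][service]
        if not self._objects[path]:
            del self._objects[path]

    def get_object(self, path: str) -> Dict[str, List[str]]:
        """Return {service: sorted interfaces} for `path`, without any introspection."""
        return {svc: sorted(ifaces) for svc, ifaces in self._objects.get(path, {}).items()}


index = ObjectIndex()
index.interfaces_added("svc.A", "/example/obj", ["iface.One", "iface.Two"])
print(index.get_object("/example/obj"))
index.interfaces_removed("svc.A", "/example/obj", ["iface.One", "iface.Two"])
print(index.get_object("/example/obj"))
```

Because the index changes only when a service announces a change, a lookup never has to wait on an introspection pass that may still be in flight.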

>
> Point 2 leads to patches like [1] where we wind up implementing a fallback to a well-known service name if we get issues with a mapper lookup. If this is a valid approach then we should just avoid looking up the service altogether; implementing this fallback approach everywhere seems tedious and error-prone, and the lookup is a waste of resources if the fallback works regardless.
>
> By moving to well known service names we can exploit more of systemd's D-Bus integration with units of Type=dbus, and depending either on unit names aliased to D-Bus service names, or on D-Bus service activation, and receive reliable behaviour for D-Bus-based dependencies.
>
> Point 3 is something William has complained about, and I think it's a good point - avoiding IPC where we can is a good thing no matter what we're doing.
>
> I'm kicking this thread off because it's something that's bothered me for a while. I've complained about it to various people in private but have never taken it any further. My impression is that there's support for the idea that we move away from the mapper for service lookups (though we can't eliminate it entirely, we still need the interface reverse mapping feature), and I want to gauge how much support it has and whether there are solid counter-arguments.

Thanks for getting this started. Definitely something that's been
bothering me more and more lately as some of the issues you've noted
above have become more prevalent. I agree that I'd like to transition
away from GetObject usage in our processes. I'd be interested in the
history from Brad on the reasons this was initially created and
whether they are still relevant though.

>
> Cheers,
>
> Andrew
>
> [1] https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-state-manager/+/20460


* Re: Using D-Bus well-known service names and deprecating mapper service lookups
  2019-04-11 15:51 ` Andrew Geissler
@ 2019-04-12  3:47   ` Andrew Jeffery
  2019-04-15  2:52     ` Brad Bishop
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Jeffery @ 2019-04-12  3:47 UTC (permalink / raw)
  To: Andrew Geissler; +Cc: OpenBMC Maillist, Brad Bishop, William Kennington



On Fri, 12 Apr 2019, at 01:21, Andrew Geissler wrote:
> On Thu, Apr 11, 2019 at 12:03 AM Andrew Jeffery <andrew@aj.id.au> wrote:
> >
> > Hello,
> >
> > Recently Andrew Geissler, William and myself have been running into limitations of the mapper's service lookups. In my opinion we should move to use well known service names and avoiding reliance on the mapper to determine who to talk to.
> 
> One thing to ask ourselves is what are the advantages to mapper and
> the GetObject concept (I think we all agree(?) that features like
> getting the subtree and associations are still useful).
> For GetObject the main thing I see as an advantage is actually
> relevant in my patch you linked. My code wants to talk to the host.
> The interface to talk to the host is currently provided by ipmid but
> at some point in the future that may move to the pldm daemon. Using
> mapper to get the object that provides my interface abstracts that from
> me. Will the pldm daemon use the same well-known name as ipmid does
> now? 

Only if it at least implements the same set of object paths and interfaces
on those objects. I'm not convinced it will (yet).

> I'm not so sure about that. I know ipmid does currently implement
> two well known names, so we could potentially move it.
> 
> Another feature we've thought about would be if we some day support
> multi-BMC distributed systems. Having something like mapper
> abstracting cross BMC communication could be useful (i.e. process A
> just knows it needs to talk to interface B, mapper could route process
> A through a dbus proxy in another BMC without process A having any
> knowledge of that). Not really a fully formed thought but just
> something we kicked around a bit when talking about distributed BMCs.

I looked into this briefly and doing federated D-Bus seemed pretty ugly.
I feel that inter-BMC communication needs a lot more thought than
anyone has given it (to my knowledge).

My gut feeling is we might be better served by another approach.

> 
> >
> > My issues with using the mapper for service dependencies are:
> >
> > 1. Ambiguities in the way `mapper wait` functions
> > 2. Race conditions between bus name announcement, mapper introspection and inbound service lookups
> > 3. Increased IPC traffic (decreasing performance).
> >
> > Expanding on 1, `mapper wait` takes a path name argument and waits until the path is available on a bus connection. The problem here is that multiple connections can (and do) implement objects under a given path, so the successful exit of `mapper wait` with the expectation that a given service is available can be an undefined result, as it's possible the *wrong* service was used to resolve the wait.
> >
> > Where this hits home is that we currently have a systemd service, mapper-wait@.service, that allows units to depend on the presence of a service providing a required path. With the ambiguity outlined above it's possible that dependent services are started earlier than they should be due to `mapper wait` falsely exiting success with respect to the dependent's actual requirement.
> >
> > A quick grep of the openbmc tree gives an idea of the potential problem size:
> >
> > $ git ls-files -- recipes-phosphor/ | grep '\.service$' | wc -l
> > 98
> > $ git ls-files -- recipes-phosphor/ | grep '\.service$' | xargs grep -li mapper-wait | wc -l
> > 32
> >
> > So about a 1/3 generic services are potentially exposed.
> 
> Yes, this has been hounding me for a while now. We had a discussion in
> IRC a few months ago about having mapper stop doing introspection and
> move to using the InterfacesAdded signal. This would improve
> performance (introspection is resource intensive) and it would
> severely limit (if not fix) the race condition issues. I took a stab
> at a patch 
> (https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-objmgr/+/17837)
> but have been running into some weird issues I haven't gotten to the
> bottom of yet.

Have you documented the weird issues anywhere? This seems like
a nice optimisation, would be good to resolve whatever the problems
are.

> 
> >
> > Point 2 leads to patches like [1] where we wind up implementing a fallback to a well-known service name if we get issues with a mapper lookup. If this is a valid approach then we should just avoid looking up the service altogether; implementing this fallback approach everywhere seems tedious and error-prone, and the lookup is a waste of resources if the fallback works regardless.
> >
> > By moving to well known service names we can exploit more of systemd's D-Bus integration with units of Type=dbus, and depending either on unit names aliased to D-Bus service names, or on D-Bus service activation, and receive reliable behaviour for D-Bus-based dependencies.
> >
> > Point 3 is something William has complained about, and I think it's a good point - avoiding IPC where we can is a good thing no matter what we're doing.
> >
> > I'm kicking this thread off because it's something that's bothered me for a while. I've complained about it to various people in private but have never taken it any further. My impression is that there's support for the idea that we move away from the mapper for service lookups (though we can't eliminate it entirely, we still need the interface reverse mapping feature), and I want to gauge how much support it has and whether there are solid counter-arguments.
> 
> Thanks for getting this started. Definitely something that's been
> bothering me more and more lately as some of the issues you've noted
> above have become more prevalent. I agree that I'd like to transition
> away from GetObject usage in our processes. I'd be interested in the
> history from Brad on the reasons this was initially created and
> whether they are still relevant though.

Great! Yeah, I think getting Brad's input is important here.

Thanks for the response. I was intending to Cc you but apparently
didn't, sorry about that.

Andrew


* Re: Using D-Bus well-known service names and deprecating mapper service lookups
  2019-04-12  3:47   ` Andrew Jeffery
@ 2019-04-15  2:52     ` Brad Bishop
  2019-04-16  1:51       ` Andrew Jeffery
  0 siblings, 1 reply; 6+ messages in thread
From: Brad Bishop @ 2019-04-15  2:52 UTC (permalink / raw)
  To: Andrew Jeffery, Andrew Geissler; +Cc: OpenBMC Maillist, William Kennington

On Thu, 2019-04-11 at 23:47 -0400, Andrew Jeffery wrote:
> 
> On Fri, 12 Apr 2019, at 01:21, Andrew Geissler wrote:
> > On Thu, Apr 11, 2019 at 12:03 AM Andrew Jeffery <andrew@aj.id.au>
> > wrote:
> > > 
> > > I'm kicking this thread off because it's something that's
> > > bothered me for a while. I've complained about it to various
> > > people in private but have never taken it any further. My
> > > impression is that there's support for the idea that we move away
> > > from the mapper for service lookups (though we can't eliminate it
> > > entirely, we still need the interface reverse mapping feature), 

I'd like to eliminate the mapper entirely someday, if we can.  Can you
elaborate on why you think we can't?  I'm not sure I understand what
you mean by the interface reverse mapping feature.

> > > and I want to gauge how much support it has and whether there are
> > > solid counter-arguments.
> > 
> > Thanks for getting this started. Definitely something that's been
> > bothering me more and more lately as some of the issues you've
> > noted
> > above have become more prevalent. I agree that I'd like to
> > transition
> > away from GetObject usage in our processes. I'd be interested in
> > the
> > history from Brad on the reasons this was initially created and
> > whether they are still relevant though.
> 
> Great! Yeah, I think getting Brad's input is important here.

Doing lookups based on well-known bus names would make DBus programming
in OpenBMC more like DBus programming on the desktop, and that is a win
IMO.

FWIW the mapper was conceived (back in 2015) as a means to provide
flexibility in where (which process) objects are implemented, yet still
enable client reuse across those different implementations.  Want to
implement all your sensors in a single application?  No problem.  Want
to implement one application for each type of sensor?  No problem. 
Want to implement an application for every sensor instance?  No
problem.

Here is another way to look at it.  If we start applying a schema to
the well-known bus name as is normal in desktop-land, what would the
scope be?  Consider sensors.  You could have either specific or general
scope e.g. bus names like:

xyz.openbmc_project.sensors (general)

-or-

xyz.openbmc_project.nvme (specific)

In the case of the former, every sensor in a system has to be
implemented by a single process.  That isn't workable is it?

In the case of the latter, every client application looking for sensors
in the general sense (it doesn't care that it is a sensor on an nvme
drive that talks nvme-mi over i2c) has to introspect this bus, along
with every other bus to find any objects that implement a sensor
interface.
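The cost of the "specific" case can be sketched in Python as follows. This is an in-memory model, not the mapper's code: the bus names and paths are hypothetical and the interface name is used only as an illustrative string. It contrasts a client scanning every service with a reverse index from interface to (path, service) pairs built once.

```python
# Toy model: bus name -> {object path -> set of interface names}.
# Bus names and paths below are hypothetical.
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Services = Dict[str, Dict[str, Set[str]]]


def find_by_interface(services: Services, interface: str) -> List[Tuple[str, str]]:
    """Client-side scan: visit every object on every service for each query."""
    return sorted(
        (path, name)
        for name, objects in services.items()
        for path, ifaces in objects.items()
        if interface in ifaces
    )


def build_reverse_index(services: Services) -> Dict[str, List[Tuple[str, str]]]:
    """One pass over everything, after which a query is a single dict lookup."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for name, objects in services.items():
        for path, ifaces in objects.items():
            for iface in ifaces:
                index[iface].append((path, name))
    return {iface: sorted(entries) for iface, entries in index.items()}


SENSOR = "xyz.openbmc_project.Sensor.Value"
services: Services = {
    "xyz.openbmc_project.nvme": {"/sensors/temperature/nvme0": {SENSOR}},
    "xyz.openbmc_project.hwmon": {"/sensors/temperature/ambient": {SENSOR}},
    "xyz.openbmc_project.other": {"/other/object": {"some.Other.Interface"}},
}

print(find_by_interface(services, SENSOR))
print(build_reverse_index(services)[SENSOR])
```

Both calls return the same sensors; the difference is whether each client repeats the scan or a shared index answers the query.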

OpenBMC still needs a way to address these issues today - that hasn't
changed I don't think.  Is the mapper the best possible solution? 
Probably not.

-brad


* Re: Using D-Bus well-known service names and deprecating mapper service lookups
  2019-04-15  2:52     ` Brad Bishop
@ 2019-04-16  1:51       ` Andrew Jeffery
  2019-04-24 12:15         ` Brad Bishop
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Jeffery @ 2019-04-16  1:51 UTC (permalink / raw)
  To: Brad Bishop, Andrew Geissler; +Cc: OpenBMC Maillist, William Kennington



On Mon, 15 Apr 2019, at 12:23, Brad Bishop wrote:
> On Thu, 2019-04-11 at 23:47 -0400, Andrew Jeffery wrote:
> > 
> > On Fri, 12 Apr 2019, at 01:21, Andrew Geissler wrote:
> > > On Thu, Apr 11, 2019 at 12:03 AM Andrew Jeffery <andrew@aj.id.au>
> > > wrote:
> > > > 
> > > > I'm kicking this thread off because it's something that's
> > > > bothered me for a while. I've complained about it to various
> > > > people in private but have never taken it any further. My
> > > > impression is that there's support for the idea that we move away
> > > > from the mapper for service lookups (though we can't eliminate it
> > > > entirely, we still need the interface reverse mapping feature), 
> 
> I'd like to eliminate the mapper entirely someday, if we can.  Can you
> elaborate on why you think we can't?  I'm not sure I understand what
> you mean by the interface reverse mapping feature.

Your discussion below is what I was talking about, probably just poor
terminology on my part.

> 
> > > > and I want to gauge how much support it has and whether there are
> > > > solid counter-arguments.
> > > 
> > > Thanks for getting this started. Definitely something that's been
> > > bothering me more and more lately as some of the issues you've
> > > noted
> > > above have become more prevalent. I agree that I'd like to
> > > transition
> > > away from GetObject usage in our processes. I'd be interested in
> > > the
> > > history from Brad on the reasons this was initially created and
> > > whether they are still relevant though.
> > 
> > Great! Yeah, I think getting Brad's input is important here.
> 
> Doing lookups based on well-known bus names would make DBus programming
> in OpenBMC more like DBus programming on the desktop, and that is a win
> IMO.
> 
> FWIW the mapper was conceived (back in 2015) as a means to provide
> flexibility in where (which process) objects are implemented, yet still
> enable client reuse across those different implementations.  Want to
> implement all your sensors in a single application?  No problem.  Want
> to implement one application for each type of sensor?  No problem. 
> Want to implement an application for every sensor instance?  No
> problem.
> 
> Here is another way to look at it.  If we start applying a schema to
> the well-known bus name as is normal in desktop-land, what would the
> scope be?  Consider sensors.  You could have either specific or general
> scope e.g. bus names like:
> 
> xyz.openbmc_project.sensors (general)
> 
> -or-
> 
> xyz.openbmc_project.nvme (specific)
> 
> In the case of the former, every sensor in a system has to be
> implemented by a single process.  That isn't workable is it?
> 
> In the case of the latter, every client application looking for sensors
> in the general sense (it doesn't care that it is a sensor on an nvme
> drive that talks nvme-mi over i2c) has to introspect this bus, along
> with every other bus to find any objects that implement a sensor
> interface.
> 
> OpenBMC still needs a way to address these issues today - that hasn't
> changed I don't think.  Is the mapper the best possible solution? 
> Probably not.

So I think you've painted a reasonable picture of the problem here,
however I also think that the solution to this specific problem has
been over-generalised to the point of pervasive use in OpenBMC,
and my argument is I don't think that's necessary.

Backing up, the motivation for this thread was that I wanted to know
whether phosphor-ipmi-host had got to the point of being considered
"on the bus". I knew we had the mapper-wait@ systemd service that
several units already used, so I gave that a go, and found that it didn't
behave as I expected (as outlined in the original message). From there
I moved on to poking at systemd and dbus-activation. The point is that
we're using the mapper in a lot of places where it should be
unnecessary. Having a well-defined schema and hence a well-known
bus name for ipmid doesn't seem too far-fetched. The practical reality
of the implementation is it's done as a single process.

I'm not arguing that we eliminate the mapper entirely, your scenario
above does a good job of motivating its existence. I'm just arguing that
we should rely on it less, and do a better job of defining what services
should have the benefit of well-known bus names, as this enables
reliable dependency management without hacks like mapper-wait@.

Maybe we could put together a list of what could benefit from well
known bus names and what requires the mapper? Might help us
understand the scope and whether any of this is a useful idea.

Cheers,

Andrew


* Re: Using D-Bus well-known service names and deprecating mapper service lookups
  2019-04-16  1:51       ` Andrew Jeffery
@ 2019-04-24 12:15         ` Brad Bishop
  0 siblings, 0 replies; 6+ messages in thread
From: Brad Bishop @ 2019-04-24 12:15 UTC (permalink / raw)
  To: Andrew Jeffery; +Cc: Andrew Geissler, OpenBMC Maillist, William Kennington

On Mon, Apr 15, 2019 at 09:51:19PM -0400, Andrew Jeffery wrote:
>
>
>On Mon, 15 Apr 2019, at 12:23, Brad Bishop wrote:
>> OpenBMC still needs a way to address these issues today - that hasn't
>> changed I don't think.  Is the mapper the best possible solution?
>> Probably not.
>
>So I think you've painted a reasonable picture of the problem here,
>however I also think that the solution to this specific problem has
>been over-generalised to the point of pervasive use in OpenBMC,
>and my argument is I don't think that's necessary.

Agreed.  FWIW in the past there was a heavy tendency to solve problems
that didn't necessarily exist yet.

>
>Backing up, the motivation for this thread was that I wanted to know
>whether phosphor-ipmi-host had got to the point of being considered
>"on the bus". I knew we had the mapper-wait@ systemd service that
>several units already used, so I gave that a go, and found that it didn't
>behave as I expected (as outlined in the original message). From there
>I moved on to poking at systemd and dbus-activation. The point is that
>we're using the mapper in a lot of places where it should be
>unnecessary. Having a well-defined schema and hence a well-known
>bus name for ipmid doesn't seem too far fetched. The practical reality
>of the implementation is it's done as a single process.
>
>I'm not arguing that we eliminate the mapper entirely, your scenario
>above does a good job of motivating its existence. I'm just arguing that
>we should rely on it less, and do a better job of defining what services
>should have the benefit of well-known bus names, as this enables
>reliable dependency management without hacks like mapper-wait@.

Agreed.

>
>Maybe we could put together a list of what could benefit from well
>known bus names and what requires the mapper? Might help us
>understand the scope and whether any of this is a useful idea.

It certainly can't hurt to have something like this.

>
>Cheers,
>
>Andrew
