Re: [PATCH v2 00/13] introduce fail-safe PMD

From: Bruce Richardson <bruce.richardson@intel.com>
To: "Gaëtan Rivet" <gaetan.rivet@6wind.com>
Cc: Neil Horman <nhorman@tuxdriver.com>,
	dev@dpdk.org, Thomas Monjalon <thomas.monjalon@6wind.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>
Subject: Re: [PATCH v2 00/13] introduce fail-safe PMD
Date: Wed, 15 Mar 2017 03:28:55 +0000
Message-ID: <20170315032853.GA366048@bricha3-MOBL3.ger.corp.intel.com>
In-Reply-To: <20170314144947.GO908@bidouze.vm.6wind.com>

On Tue, Mar 14, 2017 at 03:49:47PM +0100, Gaëtan Rivet wrote:
> > > The central question that I would like to tackle is this: why should
> > > we require from our users declaring a bonding device to have
> > > hot-plug support?
> > >
> > Well, strictly speaking, I suppose we don't have to require it.  But by that
> > same token, we don't need to do it in a separate PMD either, there are lots of
> > other options.
> > 
> 
> I think I used an ambiguous formulation here. To be sure that we
> understand each other, what I want to express is that it will certainly
> seem very strange for a user to declare a bond on a single device,
> first, and to be expected to do so if they wanted hot-plug support on
> that particular device.
> 
> The bonding here would only be used as a place-holder, which does not
> really make sense from the user point of view.
> 
> I think you understood that, and what I take from your response is that
> while you agree, you would like to do it differently. I just wanted to
> be clear.
> 
> > > I took some time to illustrate a few modes of operation:
> > >
> > > Fig. 1
> > >
> > >    .-------------.
> > >    | application |
> > >    `------.------'
> > >           |
> > >      .----'-----.---------. <------ init, conf, Rx/Tx
> > >      |          |         |
> > >      |      .---|--.------|--. <--- conf, link check, Rx/Tx
> > >      |      |   |  |      |  |
> > >      v      |   v  v      v  v
> > > .---------. | .-------. .------.
> > > | bonding | | | ixgbe | | mlx4 |
> > > `----.----' | `-------' `------'
> > >      |      |
> > >      `------'
> > >
> > > Typical link fail-over.
> > >
> > >
> > > Fig. 2
> > >
> > >  .-------------.
> > >  | application |
> > >  `------.------'
> > >         | <-------- init, conf, Rx/Tx
> > >         v
> > >   .-----------.
> > >   | fail-safe |
> > >   `-----.-----'
> > >         |
> > >     .---'----. <--- init, conf, dev check, Rx/Tx
> > >     |        |
> > >     v        v
> > > .-------. .------.
> > > | ixgbe | | mlx4 |
> > > `-------' `------'
> > >
> > > Typical automatic hot-plug handling with device fail-over.
> > >
> > > [...]
> > 
> > Yes, I think we all understand the purpose of your PMD - it's there to provide a
> > placeholder device that can respond to application requests in a sane manner,
> > until such time as real hardware is put in its place via a hot plug/failure.  That's all
> > well and good; I'm saying there are better ways to go about this that can
> > provide the same functionality without having to add an extra 4k lines of code
> > to the project, many of which already exist.
> > 
> 
> Ah, yes, I didn't want to imply that the purpose of this PMD wasn't
> understood already by many. These figures are there to illustrate some
> use cases that some users could recognize, and serve as a support for
> the point made afterward.
> 
> The main thing that can be taken from these is the division between the
> link-level and device-level checking that is done. This explains most of
> my position. The nature of those checks implies different kinds of code,
> most of which is thus actually not duplicated / would require pretty
> much the same amount of code to be implemented either as libraries or as
> part of the bonding PMD. This is the crux of my argument, which I expand
> upon below.
> 
> > > 1. LSC vs. RMV
> > >
> > >  A link status change is a valid state for a device. It calls for
> > >  specific responses, e.g. a link switch in a bonding device, without
> > >  losing the general configuration of the port.
> > >
> > >  The removal of a device calls for more than pausing operations and
> > >  switching an active device. The party responsible for initializing the
> > >  device should take care of closing it properly. If this party also
> > >  wants to be able to restore the device if it was plugged back in, it
> > >  would need to be able to initialize it again and reconfigure its
> > >  previous state.
> > >
> > >  As we can see in [Fig. 1], this responsibility lies with the
> > >  application.
> > >
> > Again, yes, I think we all see the benefit of centralizing hot plug operations,
> > no one is disagreeing with that, it's the code/functional duplication that is
> > concerning.
> > 
> 
> Certainly, I will try to explain why the code is not actually duplicated
> / why the functions are actually only superficially overlapping.
> 
> > > 2. Bonding and link availability
> > >
> > >  The hot-plug functionality is not a core function of the bonding PMD.
> > >  It is only interested in knowing if the link is active or not.
> > >
> > Currently, yes.  The suggestion was that you augment the bonding driver so that
> > hot plug is a core function of bonding.
> > 
> > >  Adding the device persistence to the bonding PMD would mean adding the
> > >  ability to flexibly parse device definitions to cope with plug-ins in
> > >  evolving busses (PCI hot-plug could mean changing bus addresses), being
> > >  able to emulate the EAL and the ether layer and to properly store the
> > >  device configuration.  This means formally describing the life of a
> > >  device in a DPDK application from start to finish.
> > >
> > Which seems to me to be exactly what your PMD does.  I don't see why it's
> > fundamentally harder to do that in an existing pmd, than it is in a new one.
> > 
> 
> Indeed it does. I must emphasize the "formally describe the life of a
> device". The hot-plug functionality goes beyond the link-level check.
> The description of a device from a DPDK standpoint is complete in the
> fail-safe PMD. The state-machine must be able to describe the entire
> life of a device, from the devargs parsing to its start-up.
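
To make the "life of a device" point concrete, here is a minimal sketch of
such a state machine. It is a toy model, not code from the patch series:
the state names, the event names and both helper functions are hypothetical
and chosen only for illustration. It shows the distinction discussed above:
a link status change (LSC) leaves the life-cycle state alone, while a
removal (RMV) forces the whole life to be replayed from the parsing step.

```c
/* Hypothetical life-cycle states of a sub-device, in order (assumption). */
enum dev_state {
	DEV_UNDEFINED, /* nothing known yet, or device was removed */
	DEV_PARSED,    /* device definition (devargs) parsed */
	DEV_PROBED,    /* device found and initialized */
	DEV_ACTIVE,    /* stored configuration applied */
	DEV_STARTED    /* Rx/Tx running */
};

/* Hypothetical events: link status change and device removal. */
enum dev_event { EV_LSC, EV_RMV };

struct sub_device {
	enum dev_state state;
	int link_up;
};

/* Move one step forward in the life-cycle; returns the new state. */
static enum dev_state dev_advance(struct sub_device *d)
{
	if (d->state < DEV_STARTED)
		d->state = (enum dev_state)(d->state + 1);
	return d->state;
}

/* React to an event reported for the sub-device. */
static void dev_handle_event(struct sub_device *d, enum dev_event ev,
			     int link_up)
{
	switch (ev) {
	case EV_LSC:
		/* Valid state for a device: only the link flag changes. */
		d->link_up = link_up;
		break;
	case EV_RMV:
		/* Device is gone: its whole life has to be replayed. */
		d->state = DEV_UNDEFINED;
		d->link_up = 0;
		break;
	}
}
```

In this model, whoever calls dev_advance() after an EV_RMV is the party
that owns init and configuration, which is the responsibility being
debated in this thread.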
> 
> We cannot reuse the existing bonding PMD architecture for this.  We
> would have to rewrite the bonding PMD from the ground up for the
> hot-plug function, because it is actually a different approach to
> managing the slaves.
> 
> This is what I wanted to illustrate in [Fig. 1] and [Fig. 2]:
> 
> - In the bonding, the init and configuration steps are still the
>  responsibility of the application and no one else. The bonding PMD
>  captures the device, re-applies its configuration upon dev_configure()
>  which is actually re-applying part of the configuration already present
>  within the slave eth_dev (cf. rte_eth_dev_config_restore).
> 
> - In the fail-safe, the init and configuration are both the
>  responsibilities of the fail-safe PMD itself, not the application
>  anymore. This handling of these responsibilities in lieu of the
>  application is the whole point of the "deferred hot-plug" support, of
>  proposing a simple implementation to the user.
> 
> This change in responsibilities is the bulk of the fail-safe code. It
> would have to be added as-is to the bonding. Verifying the correctness
> of the sync of the initialization phase (acceptable states of a device
> following several events registered by the fail-safe PMD) and the
> configuration items between the state the application believes it is in
> and the fail-safe knows it is in, is the bulk of the fail-safe code.
> 
> This function is not overlapping with that of the bonding. The reason I
> did not add this whole architecture to the bonding is that when I tried
> to do so, I found that I only had two possibilities:
> 
> - The current slave handling path is kept, and we only add a new one
>  with additional functionalities: full init and conf handling with
>  extended parsing capabilities.
> 
> - The current slave handling is scrapped and replaced entirely by the new
>  slave management. The old capturing of existing devices is not done
>  anymore.
> 
> The first solution is not acceptable, because we effectively end up with
> a maintenance nightmare by having to validate two types of slaves with
> differing capabilities, differing initialization paths and differing
> configuration code.  This is extremely awkward and architecturally
> unsound. This is essentially the same as having the exact code of the
> fail-safe as an aside in the bonding, maintaining exactly the same
> breadth of code while having muddier interfaces and organization.
> 
> The second solution is not acceptable, because we are bending the whole
> existing bonding API to our whim. We could just as well simply rename
> the fail-safe PMD as bonding, add a few grouping capabilities and call
> it a day. This is not acceptable for users.
> 
If the first solution is indeed not an option, why do you think this
second one would be unacceptable for users? If the functionality remains
the same, I don't see how it matters much for users which driver
provides it or where the code originates.

Despite all the discussion, it still just doesn't make sense to me to
have more than one DPDK driver to handle failover - be it link or
device. If nothing else, it's going to be awkward to explain to users
that if they want fail-over for when a link goes down they have to use
driver A, but if they want fail-over when a NIC gets hotplugged they use
driver B, and if they want both kinds of failover - which would surely
be the expected case - they need to use both drivers. The usability is
a problem here.
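
As a rough illustration of what "both kinds of failover" means from the
user's side, here is a minimal sketch. It is a toy model only: the struct
and the function are hypothetical and do not correspond to any existing
DPDK API or to either PMD. The point is that, seen from the application,
"link is down" and "device is gone" both reduce to the same question of
which port is usable right now.

```c
#include <stddef.h>

/* Hypothetical per-port status as a single fail-over driver might see it. */
struct port_status {
	int present; /* 0 after a device removal, 1 once plugged in */
	int link_up; /* 0 after a link-down event, 1 when the link is up */
};

/*
 * Return the index of the first port that is both present and has its
 * link up, or -1 if no port is currently usable.
 */
static int pick_active_port(const struct port_status *ports, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ports[i].present && ports[i].link_up)
			return (int)i;
	return -1;
}
```

Whether the two checks behind those flags can share one implementation is
exactly the question under discussion; the sketch only covers the
user-visible selection.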

Regards,
/Bruce