On 2020.03.27 08:44:59 +0000, Tian, Kevin wrote:
> > From: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Sent: Friday, March 27, 2020 4:12 PM
> > 
> [...]
> > > > +int intel_vgpu_adjust_resource_count(struct intel_vgpu *vgpu)
> > > > +{
> > > > +	if ((vgpu_aperture_sz(vgpu) != vgpu->param.low_gm_sz *
> > > > +	     vgpu->param.aggregation) ||
> > > > +	    (vgpu_hidden_sz(vgpu) != vgpu->param.high_gm_sz *
> > > > +	     vgpu->param.aggregation)) {
> > > > +		/* handle aggregation change */
> > > > +		intel_vgpu_free_resource_count(vgpu);
> > > > +		intel_vgpu_alloc_resource_count(vgpu);
> > >
> > > this logic sounds like different from the commit msg. Earlier you
> > > said the resource is not allocated until mdev open, while the
> > > aggregated_interfaces is only allowed to be written before
> > > mdev open. In such case, why would it required to handle the
> > > case where a vgpu already has resource allocated then coming
> > > a new request to adjust the number of instances?
> > 
> > This is to handle resource accounting before mdev open by aggregation
> > setting change. When vgpu create from mdev type, default resource
> > count according to type is set for vgpu. So this function is just to
> > change resource count by aggregation.
> 
> then better change the name, e.g. .xxx_adjust_resource_accounting,
> otherwise it's easy to be confused.
>

ok

> [...]
> > > >  	if (ret)
> > > >  		goto out_clean_vgpu_mmio;
> > > >
> > > > -	populate_pvinfo_page(vgpu);
> > > > +	if (!delay_res_alloc) {
> > > > +		ret = intel_vgpu_res_alloc(vgpu);
> > > > +		if (ret)
> > > > +			goto out_clean_vgpu_mmio;
> > > > +	}
> > >
> > > If delayed resource allocation works correctly, why do we still
> > > need support non-delayed flavor? Even a type doesn't support
> > > aggregation, it doesn't hurt to do allocation until mdev open...
> > >
> > 
> > The difference is user expectation I think, if there's really
> > awareness of this. As original way is to allocate at creat time, so
> > once created success, resource is guaranteed. But for aggregation type
> > which could be changed before open, alloc happens at that time which
> > may have different scenario, e.g might fail caused by other instance
> > or host. So original idea is to keep old behavior but only change for
> > aggregation type.
> 
> but how could one expect any difference between instant allocation
> and delayed allocation? You already update resource accounting so
> the remaining instances are accurate anyway. Then the user only knows
> how the vgpu looks like when it is opened...
> 
> > 
> > If we think this user expectation is not important, delayed alloc
> > could help to create vgpu quickly but may have more chance to fail
> > later..
> > 
> 
> why? If delayed allocation has more chance to fail, it means our
> resource accounting has problem. Even for type w/o aggregation
> capability, we should reserve one instance resource by default anyway
> 

If under really heavy load of host and many other vgpu running, we
might not have left continual gfx mem space..This is not new problem,
just that now we handle it at vgpu create time to reserve the
resource. Once host side could promise some limit, then our usage
will be guaranteed.

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827