linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jacob Pan <jacob.jun.pan@intel.com>
To: Tejun Heo <tj@kernel.org>
Cc: Vipin Sharma <vipinsh@google.com>,
	mkoutny@suse.com, rdunlap@infradead.org, thomas.lendacky@amd.com,
	brijesh.singh@amd.com, jon.grimm@amd.com,
	eric.vantassell@amd.com, pbonzini@redhat.com, hannes@cmpxchg.org,
	frankja@linux.ibm.com, borntraeger@de.ibm.com, corbet@lwn.net,
	seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
	jmattson@google.com, joro@8bytes.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	gingell@google.com, rientjes@google.com, dionnaglaze@google.com,
	kvm@vger.kernel.org, x86@kernel.org, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, "Tian,
	Kevin" <kevin.tian@intel.com>, "Liu, Yi L" <yi.l.liu@intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	"jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	jacob.jun.pan@intel.com
Subject: Re: [RFC v2 2/2] cgroup: sev: Miscellaneous cgroup documentation.
Date: Mon, 15 Mar 2021 15:11:55 -0700	[thread overview]
Message-ID: <20210315151155.383a7e6e@jacob-builder> (raw)
In-Reply-To: <YEz+8HbfkbGgG5Tm@mtj.duckdns.org>

Hi Tejun,

On Sat, 13 Mar 2021 13:05:36 -0500, Tejun Heo <tj@kernel.org> wrote:

> Hello,
> 
> On Sat, Mar 13, 2021 at 08:57:01AM -0800, Jacob Pan wrote:
> > Isn't PIDs controller doing the charge/uncharge? I was under the
> > impression that each resource can be independently charged/uncharged,
> > why it affects other resources? Sorry for the basic question.  
> 
> Yeah, PID is an exception as we needed the initial migration to seed new
> cgroups and it gets really confusing with other ways to observe the
> processes - e.g. if you follow the original way of creating a cgroup,
> forking and then moving the seed process into the target cgroup, if we
> don't migrate the pid charge together, the numbers wouldn't agree and the
> seeder cgroup may end up running out of pids if there are any
> restrictions.
> 
Thanks for explaining. Unfortunately, it seems IOASIDs has a similar needs
in terms of migrating the charge.

> > I also didn't quite get the limitation on cgroup v2 migration, this is
> > much simpler than memcg. Could you give me some pointers?  
> 
> Migration itself doesn't have restrictions but all resources are
> distributed on the same hierarchy, so the controllers are supposed to
> follow the same conventions that can be implemented by all controllers.
> 
Got it, I guess that is the behavior required by the unified hierarchy.
Cgroup v1 would be ok? But I am guessing we are not extending on v1?

> > BTW, since the IOASIDs are used to tag DMA and bound with guest
> > process(mm) for shared virtual addressing. fork() cannot be supported,
> > so I guess clone is not a solution here.  
> 
> Can you please elaborate what wouldn't work? The new spawning into a new
> cgroup w/ clone doesn't really change the usage model. It's just a neater
> way to seed a new cgroup. If you're saying that the overall usage model
> doesn't fit the needs of IOASIDs, it likely shouldn't be a cgroup
> controller.
> 
The IOASIDs are programmed into devices to generate DMA requests tagged
with them. The IOMMU has a per device IOASID table with each entry has two
pointers:
 - the PGD of the guest process.
 - the PGD of the host process

The result of this 2 stage/nested translation is that we can share virtual
address (SVA) between guest process and DMA. The host process needs to
allocate multiple IOASIDs since one IOASID is needed for each guest process
who wants SVA.

The DMA binding among device-IOMMU-process is setup via a series of user
APIs (e.g. via VFIO).

If a process calls fork(), the children does not inherit the IOASIDs and
their bindings. Children who wish to use SVA has to call those APIs to
establish the binding for themselves.

Therefore, if a host process allocates 10 IOASIDs then does a
fork()/clone(), it cannot charge 10 IOASIDs in the new cgroup. i.e. the 10
IOASIDs stays with the process wherever it goes.

I feel this fit in the domain model, true?

> Thanks.
> 


Thanks,

Jacob

  reply	other threads:[~2021-03-15 22:10 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-02  8:17 [RFC v2 0/2] cgroup: New misc cgroup controller Vipin Sharma
2021-03-02  8:17 ` [RFC v2 1/2] cgroup: sev: Add " Vipin Sharma
2021-03-03 15:42   ` Tejun Heo
2021-03-04  6:12     ` Vipin Sharma
2021-03-04  8:53       ` Tejun Heo
2021-03-02  8:17 ` [RFC v2 2/2] cgroup: sev: Miscellaneous cgroup documentation Vipin Sharma
2021-03-04  2:55   ` Jacob Pan
2021-03-04  6:22     ` Vipin Sharma
2021-03-04  8:51       ` Tejun Heo
2021-03-12 20:58         ` Jacob Pan
2021-03-12 21:15           ` Vipin Sharma
2021-03-12 22:59             ` Jacob Pan
2021-03-13 10:20               ` Tejun Heo
2021-03-13 16:57                 ` Jacob Pan
2021-03-13 18:05                   ` Tejun Heo
2021-03-15 22:11                     ` Jacob Pan [this message]
2021-03-15 22:19                       ` Tejun Heo
2021-03-15 23:40                         ` Jacob Pan
2021-03-15 23:54                           ` Tejun Heo
2021-03-16  1:30                             ` Jacob Pan
2021-03-16  2:22                               ` Tejun Heo
2021-03-16 18:19                                 ` Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210315151155.383a7e6e@jacob-builder \
    --to=jacob.jun.pan@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=cgroups@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=dionnaglaze@google.com \
    --cc=eric.vantassell@amd.com \
    --cc=frankja@linux.ibm.com \
    --cc=gingell@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@nvidia.com \
    --cc=jmattson@google.com \
    --cc=jon.grimm@amd.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=pbonzini@redhat.com \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tj@kernel.org \
    --cc=vipinsh@google.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).