All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Roman Gushchin <guroan@gmail.com>, Tejun Heo <tj@kernel.org>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v4 4/7] cgroup: cgroup v2 freezer
Date: Tue, 4 Dec 2018 00:11:13 +0000	[thread overview]
Message-ID: <20181204001105.GA14586@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <20181203144717.GD31795@redhat.com>

On Mon, Dec 03, 2018 at 03:47:18PM +0100, Oleg Nesterov wrote:
> To be honest, I fail to understand this patch. At least after a quick glance,
> I will try to read it again tomorrow but so far I do not even understand the
> desired semantics wrt signals/ptrace.
> 
> On 11/30, Roman Gushchin wrote:
> >
> > @@ -368,6 +369,8 @@ static inline int signal_pending_state(long state, struct task_struct *p)
> >  		return 0;
> >  	if (!signal_pending(p))
> >  		return 0;
> > +	if (unlikely(cgroup_task_frozen(p)))
> > +		return __fatal_signal_pending(p);
> 
> Oh, this is not nice. And doesn't look right.
> 
> > +/*
> > + * Entry path into frozen state.
> > + * If the task was not frozen before, counters are updated and the cgroup state
> > + * is revisited. Otherwise, the task is put into the TASK_KILLABLE sleep.
> > + */
> > +void cgroup_enter_frozen(void)
> > +{
> > +	if (!current->frozen) {
> > +		struct cgroup *cgrp;
> > +
> > +		spin_lock_irq(&css_set_lock);
> > +		current->frozen = true;
> > +		cgrp = task_dfl_cgroup(current);
> > +		cgrp->freezer.nr_frozen_tasks++;
> > +		WARN_ON_ONCE(cgrp->freezer.nr_frozen_tasks >
> > +			     cgrp->freezer.nr_tasks_to_freeze);
> > +		cgroup_update_frozen(cgrp, true);
> > +		spin_unlock_irq(&css_set_lock);
> > +	}
> > +
> > +	__set_current_state(TASK_INTERRUPTIBLE);
> > +	schedule();
> 
> The comment above says TASK_KILLABLE, very confusing.

Sorry, it's a leftover from one of the previous versions. Fixed.

> 
> Probably this pairs with the change in signal_pending_state() above. So this
> schedule() should actually "work" in that it won't return if signal_pending().
> 
> But this can't protect from another signal_wake_up(). Yes, iiuc in this case
> cgroup_enter_frozen() will be called again "soon" but this all looks strange.

So, the idea here is to make ptrace traps and fatal signals working, but
non-fatal shouldn't be delivered.

As soon as the frozen task is looping in the signal delivery loop, it's fine,
it's not going anywhere.

Without the change above the task is getting out of schedule() immediately,
if any signal is pending (including non-fatals). So it becomes a busy-loop.

> 
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -410,6 +410,13 @@ static int ptrace_attach(struct task_struct *task, long request,
> >
> >  	spin_lock(&task->sighand->siglock);
> >
> > +	/*
> > +	 * If the process is frozen, let's wake it up to give it a chance
> > +	 * to enter the ptrace trap.
> > +	 */
> > +	if (cgroup_task_frozen(task))
> > +		wake_up_process(task);
> 
> And why this can't race with cgroup_enter_frozen() ?
> 
> Or think of PTRACE_INTERRUPT. It can race with cgroup_enter_frozen() too, the
> tracee can miss this request because of that change in signal_pending_state().

It's a good point. So I need an additional synchronization around
checking/setting the JOBCTL_TRAP_FREEZE?

> 
> 
> >  static void do_jobctl_trap(void)
> >  {
> > +	struct sighand_struct *sighand = current->sighand;
> >  	struct signal_struct *signal = current->signal;
> >  	int signr = current->jobctl & JOBCTL_STOP_SIGMASK;
> >  
> > -	if (current->ptrace & PT_SEIZED) {
> > -		if (!signal->group_stop_count &&
> > -		    !(signal->flags & SIGNAL_STOP_STOPPED))
> > -			signr = SIGTRAP;
> > -		WARN_ON_ONCE(!signr);
> > -		ptrace_do_notify(signr, signr | (PTRACE_EVENT_STOP << 8),
> > -				 CLD_STOPPED);
> > -	} else {
> > -		WARN_ON_ONCE(!signr);
> > -		ptrace_stop(signr, CLD_STOPPED, 0, NULL);
> > -		current->exit_code = 0;
> > +	if (current->jobctl & (JOBCTL_TRAP_STOP | JOBCTL_TRAP_NOTIFY)) {
> > +		if (current->ptrace & PT_SEIZED) {
> > +			if (!signal->group_stop_count &&
> > +			    !(signal->flags & SIGNAL_STOP_STOPPED))
> > +				signr = SIGTRAP;
> > +			WARN_ON_ONCE(!signr);
> > +			ptrace_do_notify(signr,
> > +					 signr | (PTRACE_EVENT_STOP << 8),
> > +					 CLD_STOPPED);
> > +		} else {
> > +			WARN_ON_ONCE(!signr);
> > +			ptrace_stop(signr, CLD_STOPPED, 0, NULL);
> > +			current->exit_code = 0;
> > +		}
> > +	} else if (current->jobctl & JOBCTL_TRAP_FREEZE) {
> > +		/*
> > +		 * Enter the frozen state, unless the task is about to exit.
> > +		 */
> > +		if (fatal_signal_pending(current)) {
> > +			current->jobctl &= ~JOBCTL_TRAP_FREEZE;
> > +		} else {
> > +			spin_unlock_irq(&sighand->siglock);
> > +			cgroup_enter_frozen();
> > +			spin_lock_irq(&sighand->siglock);
> > +		}
> >  	}
> >  }
> >  
> > @@ -2401,12 +2420,23 @@ bool get_signal(struct ksignal *ksig)
> >  		    do_signal_stop(0))
> >  			goto relock;
> >  
> > -		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
> > +		if (unlikely(current->jobctl &
> > +			     (JOBCTL_TRAP_MASK | JOBCTL_TRAP_FREEZE))) {
> >  			do_jobctl_trap();
> 
> Cosmetic nit, but can't you add another helper? To me something like
> 
> 		if (JOBCTL_TRAP_MASK)
> 			do_jobctl_trap();
> 		else if (JOBCTL_TRAP_FREEZE)
> 			do_jobctl_freeze();
> 
> will look more clean, but I won't insist.

Sure.

> 
> 
> > +		/*
> > +		 * If the task is leaving the frozen state, let's update
> > +		 * cgroup counters and reset the frozen bit.
> > +		 */
> > +		if (unlikely(cgroup_task_frozen(current))) {
> > +			spin_unlock_irq(&sighand->siglock);
> > +			cgroup_leave_frozen();
> > +			spin_lock_irq(&sighand->siglock);
> 
> looks like this needs another "goto relock", no?
> 
> And perhaps this another reason for the new do_jobctl_freeze() helper which
> could absorb this leave_frozen() ?

Makes sense. Will follow this path in v5.

Thank you!

  reply	other threads:[~2018-12-04  0:11 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-30 23:47 [PATCH v4 0/7] freezer for cgroup v2 Roman Gushchin
2018-11-30 23:47 ` [PATCH v4 1/7] cgroup: rename freezer.c into legacy_freezer.c Roman Gushchin
2018-11-30 23:47 ` [PATCH v4 2/7] cgroup: implement __cgroup_task_count() helper Roman Gushchin
2018-11-30 23:47 ` [PATCH v4 3/7] cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock Roman Gushchin
2018-12-03 16:17   ` Tejun Heo
2018-12-03 17:55     ` Roman Gushchin
2018-11-30 23:47 ` [PATCH v4 4/7] cgroup: cgroup v2 freezer Roman Gushchin
2018-12-03 14:47   ` Oleg Nesterov
2018-12-04  0:11     ` Roman Gushchin [this message]
2018-12-03 14:49   ` Oleg Nesterov
2018-12-03 15:35   ` Oleg Nesterov
2018-11-30 23:47 ` [PATCH v4 5/7] kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy() Roman Gushchin
2018-11-30 23:47   ` Roman Gushchin
2018-11-30 23:47   ` guroan
2018-11-30 23:47 ` [PATCH v4 6/7] kselftests: cgroup: add freezer controller self-tests Roman Gushchin
2018-11-30 23:47   ` Roman Gushchin
2018-11-30 23:47   ` guroan
2018-11-30 23:47 ` [PATCH v4 7/7] cgroup: document cgroup v2 freezer interface Roman Gushchin
2018-12-03  6:21   ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181204001105.GA14586@castle.DHCP.thefacebook.com \
    --to=guro@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=cgroups@vger.kernel.org \
    --cc=dan.carpenter@oracle.com \
    --cc=guroan@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.