From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751965AbdHMToc (ORCPT ); Sun, 13 Aug 2017 15:44:32 -0400 Received: from mail-qt0-f194.google.com ([209.85.216.194]:36620 "EHLO mail-qt0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751528AbdHMTo1 (ORCPT ); Sun, 13 Aug 2017 15:44:27 -0400 Date: Sun, 13 Aug 2017 12:44:21 -0700 From: Tejun Heo To: Waiman Long Cc: lizefan@huawei.com, hannes@cmpxchg.org, peterz@infradead.org, mingo@redhat.com, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, pjt@google.com, luto@amacapital.net, efault@gmx.de, torvalds@linux-foundation.org, guro@fb.com Subject: Re: [PATCH 3/3] cgroup: Implement cgroup2 basic CPU usage accounting Message-ID: <20170813194421.GB1438922@devbig577.frc2.facebook.com> References: <20170811163754.3939102-1-tj@kernel.org> <20170811163754.3939102-4-tj@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, Waiman. On Fri, Aug 11, 2017 at 04:19:07PM -0400, Waiman Long wrote: > > * CPU usage stats are collected and shown in "cgroup.stat" with "cpu." > > prefix. Total usage is collected from scheduling events. User/sys > > breakdown is sourced from tick sampling and adjusted to the usage > > using cputime_adjuts(). > > Typo: cputime_adjust() Oops, will fix. > > +static DEFINE_MUTEX(cgroup_stat_mutex); > > +static DEFINE_PER_CPU(raw_spinlock_t, cgroup_cpu_stat_lock); > > If the hierarchy is large enough and the stat data hasn't been read > recently, it may take a while to accumulate all the stat data even for > one cpu in cgroup_stat_flush_locked(). So I think it will make more > sense to use regular spinlock_t instead of raw_spinlock_t. They need to be raw spinlocks because the accounting side may grab them while scheduling. If the accumulating latency becomes problematic, we can test for need_resched and spin_needbreak and break out as necessary. The iteration logic is built to allow that and an earlier revision actually did that but I wasn't sure whether it's actually necessary and removed it for simplicity. If the latency ever becomes a problem, resurrecting that isn't difficult at all. > > +static struct cgroup *cgroup_cpu_stat_next_updated(struct cgroup *pos, > > + struct cgroup *root, int cpu) > > This function is trying to unwind one cgrp from the updated_children and > updated_next linkage. It is somewhat like the opposite of > cgroup_cpu_stat_updated(). I just feel like its name isn't intuitive > enough to convey what it is doing. Maybe use name like > cgroup_cpu_stat_unlink_one() to match cgroup_cpu_stat_flush_one(). Hmm... the name comes from it being an iterator - most interators are named _next. But, yeah, the name doesn't signifiy that it unlinks as it goes along. I'll rename it to cgroup_cpu_stat_pop_updated(). Thanks. -- tejun From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tejun Heo Subject: Re: [PATCH 3/3] cgroup: Implement cgroup2 basic CPU usage accounting Date: Sun, 13 Aug 2017 12:44:21 -0700 Message-ID: <20170813194421.GB1438922@devbig577.frc2.facebook.com> References: <20170811163754.3939102-1-tj@kernel.org> <20170811163754.3939102-4-tj@kernel.org> Mime-Version: 1.0 Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=/oUjGZEzRR1POR3sBhYtsRDUQoLd4zLcHeXlEJu1Ht4=; b=RTbwCzLCnILD5YYyiZj1ZoMOOOnYspvApIGhVfY+XQ+u6KlbEb5eA03SHjNAnMJbyA 3lVHBy4qHuhGFSwd6EcL7HeESkQ19VIK1+6ORNz48fAvacDYclmKOSJEgdxZVGCmU3Xz GnzyUNDM0mQcYAJEEgkWXSt5oIZcIMgyhr4CBWPav6sozJCvOVIck7WO952evW4FhR/h cRHdJsz58RwX967zKIQJ0C8hP5yVjfLC7fSFQndqrP2UJUVF308M3Upbwz5ogA37GDC+ b7v9qFMwWoYaB0ECuuukRF5mFUkQzxdseC2/WRGLSovYV+vxSd0CFEYyTCxC/j5RimjM Vl2Q== Content-Disposition: inline In-Reply-To: Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-ID: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Waiman Long Cc: lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org, hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org, peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org, mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, kernel-team-b10kYP2dOMg@public.gmane.org, pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org, luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org, efault-Mmb7MZpHnFY@public.gmane.org, torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org, guro-b10kYP2dOMg@public.gmane.org Hello, Waiman. On Fri, Aug 11, 2017 at 04:19:07PM -0400, Waiman Long wrote: > > * CPU usage stats are collected and shown in "cgroup.stat" with "cpu." > > prefix. Total usage is collected from scheduling events. User/sys > > breakdown is sourced from tick sampling and adjusted to the usage > > using cputime_adjuts(). > > Typo: cputime_adjust() Oops, will fix. > > +static DEFINE_MUTEX(cgroup_stat_mutex); > > +static DEFINE_PER_CPU(raw_spinlock_t, cgroup_cpu_stat_lock); > > If the hierarchy is large enough and the stat data hasn't been read > recently, it may take a while to accumulate all the stat data even for > one cpu in cgroup_stat_flush_locked(). So I think it will make more > sense to use regular spinlock_t instead of raw_spinlock_t. They need to be raw spinlocks because the accounting side may grab them while scheduling. If the accumulating latency becomes problematic, we can test for need_resched and spin_needbreak and break out as necessary. The iteration logic is built to allow that and an earlier revision actually did that but I wasn't sure whether it's actually necessary and removed it for simplicity. If the latency ever becomes a problem, resurrecting that isn't difficult at all. > > +static struct cgroup *cgroup_cpu_stat_next_updated(struct cgroup *pos, > > + struct cgroup *root, int cpu) > > This function is trying to unwind one cgrp from the updated_children and > updated_next linkage. It is somewhat like the opposite of > cgroup_cpu_stat_updated(). I just feel like its name isn't intuitive > enough to convey what it is doing. Maybe use name like > cgroup_cpu_stat_unlink_one() to match cgroup_cpu_stat_flush_one(). Hmm... the name comes from it being an iterator - most interators are named _next. But, yeah, the name doesn't signifiy that it unlinks as it goes along. I'll rename it to cgroup_cpu_stat_pop_updated(). Thanks. -- tejun