linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context
       [not found] ` <20210614152306.25668-5-longman@redhat.com>
@ 2021-06-14 15:52   ` Greg KH
  2021-06-14 16:32     ` Waiman Long
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2021-06-14 15:52 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin <hpa@zytor.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael J. Wysocki ,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, x86, cgroups,
	linux-kernel, linux-doc, linux-fsdevel

On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
> Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
> offline if it is not a valid cpu in a proper cpuset context when the
> cpuset_bound_cpuinfo sysctl parameter is turned on.

This says _what_ you are doing, but I do not understand _why_ you want
to do this.

What is going to use this information?  And now you are showing more
files than you previously did, so what userspace tool is now going to
break?



> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  drivers/base/core.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 54ba506e5a89..176b927fade2 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -29,6 +29,7 @@
>  #include <linux/sched/mm.h>
>  #include <linux/sysfs.h>
>  #include <linux/dma-map-ops.h> /* for dma_default_coherent */
> +#include <linux/cpuset.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -2378,11 +2379,24 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr,
>  }
>  static DEVICE_ATTR_RW(uevent);
>  
> +static bool is_device_cpu(struct device *dev)
> +{
> +	return dev->bus && dev->bus->dev_name
> +			&& !strcmp(dev->bus->dev_name, "cpu");
> +}

No, this is not ok, there is a reason we did not put RTTI in struct
devices, so don't try to fake one here please.

> +
>  static ssize_t online_show(struct device *dev, struct device_attribute *attr,
>  			   char *buf)
>  {
>  	bool val;
>  
> +	/*
> +	 * Show a cpu as offline if the cpu number is not valid in a
> +	 * proper cpuset bounding cpuinfo context.
> +	 */
> +	if (is_device_cpu(dev) && !cpuset_current_cpu_valid(dev->id))
> +		return sysfs_emit(buf, "0\n");

Why are you changing the driver core for a single random, tiny set of
devices?  The device code for those devices can handle this just fine,
do NOT modify the driver core for each individual driver type, that way
lies madness.

This change is not ok, sorry.

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context
  2021-06-14 15:52   ` [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context Greg KH
@ 2021-06-14 16:32     ` Waiman Long
  2021-06-14 17:00       ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Waiman Long @ 2021-06-14 16:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Greg Kroah-Hartman, Rafael J. Wysocki, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, x86, cgroups, linux-kernel, linux-doc,
	linux-fsdevel

On 6/14/21 11:52 AM, Greg KH wrote:
> On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
>> Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
>> offline if it is not a valid cpu in a proper cpuset context when the
>> cpuset_bound_cpuinfo sysctl parameter is turned on.
> This says _what_ you are doing, but I do not understand _why_ you want
> to do this.
>
> What is going to use this information?  And now you are showing more
> files than you previously did, so what userspace tool is now going to
> break?

One reason that is provided by the customer asking for this 
functionality is because some applications use the number of cpu cores 
for licensing purpose. Even though the applications are running in a 
container with a smaller set of cpus, they may still charge as if all 
the cpus are available. They ended up using a bind mount to mount over 
the cpuX/online file.

I should have included this information in the patchset.


>
>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   drivers/base/core.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 54ba506e5a89..176b927fade2 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -29,6 +29,7 @@
>>   #include <linux/sched/mm.h>
>>   #include <linux/sysfs.h>
>>   #include <linux/dma-map-ops.h> /* for dma_default_coherent */
>> +#include <linux/cpuset.h>
>>   
>>   #include "base.h"
>>   #include "power/power.h"
>> @@ -2378,11 +2379,24 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr,
>>   }
>>   static DEVICE_ATTR_RW(uevent);
>>   
>> +static bool is_device_cpu(struct device *dev)
>> +{
>> +	return dev->bus && dev->bus->dev_name
>> +			&& !strcmp(dev->bus->dev_name, "cpu");
>> +}
> No, this is not ok, there is a reason we did not put RTTI in struct
> devices, so don't try to fake one here please.
>
>> +
>>   static ssize_t online_show(struct device *dev, struct device_attribute *attr,
>>   			   char *buf)
>>   {
>>   	bool val;
>>   
>> +	/*
>> +	 * Show a cpu as offline if the cpu number is not valid in a
>> +	 * proper cpuset bounding cpuinfo context.
>> +	 */
>> +	if (is_device_cpu(dev) && !cpuset_current_cpu_valid(dev->id))
>> +		return sysfs_emit(buf, "0\n");
> Why are you changing the driver core for a single random, tiny set of
> devices?  The device code for those devices can handle this just fine,
> do NOT modify the driver core for each individual driver type, that way
> lies madness.
>
> This change is not ok, sorry.

OK, thanks for the comments. I will see if there is alternative way of 
doing it.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context
  2021-06-14 16:32     ` Waiman Long
@ 2021-06-14 17:00       ` Greg KH
  0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2021-06-14 17:00 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Rafael J. Wysocki, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	x86, cgroups, linux-kernel, linux-doc, linux-fsdevel

On Mon, Jun 14, 2021 at 12:32:01PM -0400, Waiman Long wrote:
> On 6/14/21 11:52 AM, Greg KH wrote:
> > On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
> > > Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
> > > offline if it is not a valid cpu in a proper cpuset context when the
> > > cpuset_bound_cpuinfo sysctl parameter is turned on.
> > This says _what_ you are doing, but I do not understand _why_ you want
> > to do this.
> > 
> > What is going to use this information?  And now you are showing more
> > files than you previously did, so what userspace tool is now going to
> > break?
> 
> One reason that is provided by the customer asking for this functionality is
> because some applications use the number of cpu cores for licensing purpose.
> Even though the applications are running in a container with a smaller set
> of cpus, they may still charge as if all the cpus are available. They ended
> up using a bind mount to mount over the cpuX/online file.

Great, then stick with the bind mount for foolish things like that.

There's no technical reason for doing this then, just marketing?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info
       [not found] <20210614152306.25668-1-longman@redhat.com>
       [not found] ` <20210614152306.25668-5-longman@redhat.com>
@ 2021-06-14 20:43 ` Tejun Heo
  2021-06-15  2:53   ` Waiman Long
  2021-06-15  9:14   ` Christian Brauner
  1 sibling, 2 replies; 7+ messages in thread
From: Tejun Heo @ 2021-06-14 20:43 UTC (permalink / raw)
  To: Waiman Long
  Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov,
	H. Peter Anvin <hpa@zytor.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael J. Wysocki ,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, x86, cgroups,
	linux-kernel, linux-doc, linux-fsdevel

Hello,

On Mon, Jun 14, 2021 at 11:23:02AM -0400, Waiman Long wrote:
> The current container management system is able to create the illusion
> that applications running within a container have limited resources and
> devices available for their use. However, one thing that is hard to hide
> is the number of CPUs available in the system. In fact, the container
> developers are asking for the kernel to provide such capability.
> 
> There are two places where cpu information are available for the
> applications to see - /proc/cpuinfo and /sys/devices/system/cpu sysfs
> directory.
> 
> This patchset introduces a new sysctl parameter cpuset_bound_cpuinfo
> which, when set, will limit the amount of information disclosed by
> /proc/cpuinfo and /sys/devices/system/cpu.

The goal of cgroup has never been masquerading system information so that
applications can pretend that they own the whole system and the proposed
solution requires application changes anyway. The information being provided
is useful but please do so within the usual cgroup interface - e.g.
cpuset.stat. The applications (or libraries) that want to determine its
confined CPU availability can locate the file through /proc/self/cgroup.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info
  2021-06-14 20:43 ` [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info Tejun Heo
@ 2021-06-15  2:53   ` Waiman Long
  2021-06-15 15:59     ` Tejun Heo
  2021-06-15  9:14   ` Christian Brauner
  1 sibling, 1 reply; 7+ messages in thread
From: Waiman Long @ 2021-06-15  2:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Greg Kroah-Hartman,
	Rafael J. Wysocki, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	x86, cgroups, linux-kernel, linux-doc, linux-fsdevel

On 6/14/21 4:43 PM, Tejun Heo wrote:
> Hello,
>
> On Mon, Jun 14, 2021 at 11:23:02AM -0400, Waiman Long wrote:
>> The current container management system is able to create the illusion
>> that applications running within a container have limited resources and
>> devices available for their use. However, one thing that is hard to hide
>> is the number of CPUs available in the system. In fact, the container
>> developers are asking for the kernel to provide such capability.
>>
>> There are two places where cpu information are available for the
>> applications to see - /proc/cpuinfo and /sys/devices/system/cpu sysfs
>> directory.
>>
>> This patchset introduces a new sysctl parameter cpuset_bound_cpuinfo
>> which, when set, will limit the amount of information disclosed by
>> /proc/cpuinfo and /sys/devices/system/cpu.
> The goal of cgroup has never been masquerading system information so that
> applications can pretend that they own the whole system and the proposed
> solution requires application changes anyway. The information being provided
> is useful but please do so within the usual cgroup interface - e.g.
> cpuset.stat. The applications (or libraries) that want to determine its
> confined CPU availability can locate the file through /proc/self/cgroup.

Thanks for your comment. I understand your point making change via 
cgroup interface files. However, this is not what the customers are 
asking for. They are using tools that look at /proc/cpuinfo and the 
sysfs files. It is a much bigger effort to make all those tools look at 
a new cgroup file interface instead. It can be more efficiently done at 
the kernel level.

Anyway, I am OK if the consensus is that it is not a kernel problem and 
have to be handled in userspace.

BTW, do you have any comment on another cpuset patch that I sent a week 
earlier?

https://lore.kernel.org/lkml/20210603212416.25934-1-longman@redhat.com/

I am looking forward for your feedback.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info
  2021-06-14 20:43 ` [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info Tejun Heo
  2021-06-15  2:53   ` Waiman Long
@ 2021-06-15  9:14   ` Christian Brauner
  1 sibling, 0 replies; 7+ messages in thread
From: Christian Brauner @ 2021-06-15  9:14 UTC (permalink / raw)
  To: Tejun Heo, Waiman Long, Greg Kroah-Hartman, Johannes Weiner
  Cc: Zefan Li, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin <hpa@zytor.com>,
	Rafael J. Wysocki ,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, x86, cgroups,
	linux-kernel, linux-doc, linux-fsdevel

On Mon, Jun 14, 2021 at 04:43:28PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Mon, Jun 14, 2021 at 11:23:02AM -0400, Waiman Long wrote:
> > The current container management system is able to create the illusion
> > that applications running within a container have limited resources and
> > devices available for their use. However, one thing that is hard to hide
> > is the number of CPUs available in the system. In fact, the container
> > developers are asking for the kernel to provide such capability.
> > 
> > There are two places where cpu information are available for the
> > applications to see - /proc/cpuinfo and /sys/devices/system/cpu sysfs
> > directory.
> > 
> > This patchset introduces a new sysctl parameter cpuset_bound_cpuinfo
> > which, when set, will limit the amount of information disclosed by
> > /proc/cpuinfo and /sys/devices/system/cpu.
> 
> The goal of cgroup has never been masquerading system information so that
> applications can pretend that they own the whole system and the proposed
> solution requires application changes anyway. The information being provided
> is useful but please do so within the usual cgroup interface - e.g.
> cpuset.stat. The applications (or libraries) that want to determine its
> confined CPU availability can locate the file through /proc/self/cgroup.

Fyi, there's another concurrent push going on to provide a new file
/proc/self/meminfo that is a subset of /proc/meminfo (cf. [1]) and
virtualizes based on cgroups as well.

But there it's a new file not virtualizing exisiting files and
directories so there things seem to be out of sync between these groups
at the same company.

In any case I would like to point out that this has a complete solution
in userspace. We have had this problem of providing virtualized
information to containers since they started existing. So we created
LXCFS in 2014 (cf. [2]) a tiny fuse fileystem to provide a virtualized
view based on cgroups and other information for containers.

The two patchsets seems like they're on the way trying to move 1:1 what
we're already doing in userspace into the kernel. LXCFS is quite well
known and widely used so it's suprising to not see it mentioned at all.

And the container people will want more then just the cpu and meminfo
stuff sooner or later. Just look at what we currently virtualize:

/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu

## So for example /proc/cpuinfo
#### Host
brauner@wittgenstein|~
> grep ^processor /proc/cpuinfo
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7

#### Container
brauner@wittgenstein|~
> lxc exec f1 -- grep ^processor /proc/cpuinfo
processor       : 0
processor       : 1

## and for /sys/devices/system/cpu
#### Host
brauner@wittgenstein|~
> ls -al /sys/devices/system/cpu/ | grep cpu[[:digit:]]
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu0
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu1
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu2
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu3
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu4
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu5
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu6
drwxr-xr-x 10 root root    0 Jun 14 21:22 cpu7

#### Container
brauner@wittgenstein|~
> lxc exec f1 -- ls -al /sys/devices/system/cpu/ | grep cpu[[:digit:]]
drwxr-xr-x  2 nobody nogroup   0 Jun 15 09:07 cpu3
drwxr-xr-x  2 nobody nogroup   0 Jun 15 09:07 cpu4

We have a wide variety of users from various distros.

[1]: https://lore.kernel.org/containers/f62b652c-3f6f-31ba-be0f-5f97b304599f@metux.net
[2]: https://github.com/lxc/lxcfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info
  2021-06-15  2:53   ` Waiman Long
@ 2021-06-15 15:59     ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2021-06-15 15:59 UTC (permalink / raw)
  To: Waiman Long
  Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Greg Kroah-Hartman,
	Rafael J. Wysocki, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	x86, cgroups, linux-kernel, linux-doc, linux-fsdevel

Hello, Waiman.

On Mon, Jun 14, 2021 at 10:53:53PM -0400, Waiman Long wrote:
> Thanks for your comment. I understand your point making change via cgroup
> interface files. However, this is not what the customers are asking for.

It's not like we can always follow what specific customers request. If there
are actual use-cases that can't be achieved with the existing interfaces and
features, we can look into how to provide those but making interface
decisions based on specific customer requests tends to lead to long term
pains.

> They are using tools that look at /proc/cpuinfo and the sysfs files. It is a
> much bigger effort to make all those tools look at a new cgroup file
> interface instead. It can be more efficiently done at the kernel level.

Short term, sure, it sure is more painful to adapt, but I don't think longer
term solution lies in the kernel trying to masquerage existing sytsem-wide
information interfaces. e.g. cpuset is one thing but what are we gonna do
about weight control or work-conserving memory controls? Pro-rate cpu count
and available memory?

> Anyway, I am OK if the consensus is that it is not a kernel problem and have
> to be handled in userspace.

I'd be happy to provide more information from kernel side as necessary but
the approach taken here doesn't seem generic or scalable at all.

> BTW, do you have any comment on another cpuset patch that I sent a week
> earlier?
> 
> https://lore.kernel.org/lkml/20210603212416.25934-1-longman@redhat.com/
> 
> I am looking forward for your feedback.

Sorry about the delay. Will take a look later today.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-06-15 16:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210614152306.25668-1-longman@redhat.com>
     [not found] ` <20210614152306.25668-5-longman@redhat.com>
2021-06-14 15:52   ` [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context Greg KH
2021-06-14 16:32     ` Waiman Long
2021-06-14 17:00       ` Greg KH
2021-06-14 20:43 ` [PATCH 0/4] cgroup/cpuset: Allow cpuset to bound displayed cpu info Tejun Heo
2021-06-15  2:53   ` Waiman Long
2021-06-15 15:59     ` Tejun Heo
2021-06-15  9:14   ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).