From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-9.0 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
	MENTIONS_GIT_HOSTING,NUMERIC_HTTP_ADDR,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id EF888C6377D
	for <linux-kernel@archiver.kernel.org>; Thu, 22 Jul 2021 07:53:57 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id C614E61221
	for <linux-kernel@archiver.kernel.org>; Thu, 22 Jul 2021 07:53:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S230313AbhGVHNV (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Thu, 22 Jul 2021 03:13:21 -0400
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:20046 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S229642AbhGVHNT (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 22 Jul 2021 03:13:19 -0400
Received: from pps.filterd (m0098413.ppops.net [127.0.0.1])
        by mx0b-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 16M7Ybfg148306;
        Thu, 22 Jul 2021 03:53:44 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=to : cc : from : subject
 : message-id : date : content-type : content-transfer-encoding :
 mime-version; s=pp1; bh=Caa7KHT646SLX5TNhCX7Cf9dqEUSWEuShbv7ao6ul6g=;
 b=I0sxX/JhCEh2JRzzblXk4a6Nk3mAD+k4lmr9fGA6L+W7jOsSIYOCNkDZuAEUpZeeFKgS
 w5XcxQE98UHDsmX/xHhmyWXjd0SilDC6ShQ0gks4tIKTGuftJrODXgc9al5JzdDoi/NT
 78PGhoc9hCJl4U0xLcefsMrTtj2YHilp/VOIIZV9qzkEswKj6I5jDEZ0gcW8L/IIzdt6
 TaLEYxF0pyL4/RjPM/8hK6H4WwQXj16V5DfgCZlkQ8NGVXGU+wzY/u8O1qPfXtYqkMeL
 05FB1zskm0SKRRvG/C6JHM6g2XM59vm7tT0o9kVJKjbAA8JqaC3TFIr9L/A54hfPkqNJ xQ== 
Received: from pps.reinject (localhost [127.0.0.1])
        by mx0b-001b2d01.pphosted.com with ESMTP id 39y3ycs4xp-1
        (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
        Thu, 22 Jul 2021 03:53:43 -0400
Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1])
        by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 16M7YdW5148600;
        Thu, 22 Jul 2021 03:53:43 -0400
Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99])
        by mx0b-001b2d01.pphosted.com with ESMTP id 39y3ycs4x7-1
        (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
        Thu, 22 Jul 2021 03:53:43 -0400
Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1])
        by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 16M7r7N5025553;
        Thu, 22 Jul 2021 07:53:41 GMT
Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196])
        by ppma04ams.nl.ibm.com with ESMTP id 39xhx48g4k-1
        (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
        Thu, 22 Jul 2021 07:53:41 +0000
Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160])
        by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 16M7rdHn27132250
        (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
        Thu, 22 Jul 2021 07:53:39 GMT
Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1])
        by IMSVA (Postfix) with ESMTP id 2EAE0A4060;
        Thu, 22 Jul 2021 07:53:39 +0000 (GMT)
Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1])
        by IMSVA (Postfix) with ESMTP id CD7DBA405B;
        Thu, 22 Jul 2021 07:53:34 +0000 (GMT)
Received: from [9.102.1.144] (unknown [9.102.1.144])
        by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTP;
        Thu, 22 Jul 2021 07:53:34 +0000 (GMT)
To:     Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        containers@lists.linux.dev, containers@lists.linux-foundation.org
Cc:     legion@kernel.org, akpm@linux-foundation.org,
        christian.brauner@ubuntu.com, ebiederm@xmission.com,
        hannes@cmpxchg.org, mhocko@kernel.org,
        Alexey Makhalov <amakhalov@vmware.com>, llong@redhat.com,
        Pratik Sampat <psampat@linux.ibm.com>,
        pratik.r.sampat@gmail.com
From:   Pratik Sampat <psampat@linux.ibm.com>
Subject: [RFD] Provide virtualized CPU system information for containers
Message-ID: <ac76aada-f94d-d596-9b3c-1dca5a9914d0@linux.ibm.com>
Date:   Thu, 22 Jul 2021 13:23:33 +0530
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.11.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: v5ZXRYAqtNhegTzdwUGvJBMLlQkc-pwk
X-Proofpoint-ORIG-GUID: V4RZQX6zRBA66QHW0D3Ovkknj18Mi9sj
Content-Transfer-Encoding: 7bit
X-Proofpoint-UnRewURL: 0 URL was un-rewritten
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.790
 definitions=2021-07-22_03:2021-07-22,2021-07-22 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0
 priorityscore=1501 malwarescore=0 suspectscore=0 adultscore=0
 clxscore=1011 impostorscore=0 mlxscore=0 phishscore=0 spamscore=0
 mlxlogscore=999 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2104190000 definitions=main-2107220043
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Abstract
========

Today, applications that run on containers enforce their CPU and memory
limits, requirements with the help of cgroups. However, many applications
legacy or otherwise get the view of the system through sysfs/procfs and
allocate resources like number of threads/processes, memory allocation based
on that information. This can lead to unexpected running behaviors as well as
have a high impact on performance.

The problem is not only limited to the coherency of information. Cloud runtime
environments requests for CPU runtime in millicores[1], which translate to
using CFS period and quota to limit CPU runtime in cgroups. However, generally,
applications operate in terms of threads with little to no cognizance of the
millicore limit or its connotation.

The scope of the RFD, along with the experimental results is anchored
towards CPU system information, rather than the challenges posed by Memory
limits information or its likes in this proposal.

Problem Statement
=================
Provide Virtualized CPU system information to applications running within
the container semantics.

Experiments
===========
Picked a relatively common container application nginx[2] configured with
"worker_processes: auto"[3] (which ensures that the number of processes to spawn
will be derived from resources viewed on the system) and a benchmark/driver
application wrk[4]

Nginx: Nginx is a web server that can also be used as a reverse proxy, load
balancer, mail proxy and HTTP cache
Wrk: wrk is a modern HTTP benchmarking tool capable of generating significant
load when run on a single multi-core CPU

Docker is used as the containerization platform of choice.

For the scope of experimentation a fake sysfs (/sys/devices/system/cpu) is
mounted which encapsulates information in coherence with the limits set to
the container.

The aim of the experiment is to quantify the effects of incoherent information
on resources allocated as well as performance

System configuration1 -- Intel
1. Intel(R) Xeon(R) CPU E5-2470
2. CPUs: 32
3. Memory: 94Gi

System configuration2 -- IBM POWER
1. IBM POWER 9
2. CPUs: 176
3. Memory: 127GB

Exp1: Effects of incorrect CPU information with cpuset
------------------------------------------------------
See [12] for detailed stats -- POWER
See [13] for detailed stats -- Intel

Case1: The container has access to all the CPUs
Case2: cpuset limits set on nginx container to only "0-3". However, the default
        sys/ and proc/ file systems display system CPUs
Case3: cpuset limits set to "0-3" and sysfs faked to give coherent information
        pertaining to only 0-3

No significant improvement or degradation in terms of performance is observed.

Summary stats -- IBM POWER
+----------------+--------+--------+--------+
| Metric         | Case 1 | Case 2 | Case 3 |
+----------------+--------+--------+--------+
| PIDs           | 177    | 177    | 5      |
| mem usg (init) | 411.1  | 290.8  | 26.69  |
| mem usg (peak) | 662.8  | 295.3  | 30.69  |
+----------------+--------+--------+--------+

Summary stats -- Intel
+----------------+--------+--------+--------+
| Metric         | Case 1 | Case 2 | Case 3 |
+----------------+--------+--------+--------+
| PIDs           | 33     | 33     | 5      |
| mem usg (init) | 28.63  | 25.37  | 5.914  |
| mem usg (peak) | 40.14  | 30.7   | 9.914  |
+----------------+--------+--------+--------+

Observations -- Both platforms show the same trend in statistics:
1. The number of PIDs in case 3 are in coherence with the cpu limit provided.
    4 worker threads + 1 Master thread, compared for the former cases where the
    number of threads spawned were based on the CPUs on the system
2. The memory footprint dropped significantly from case1 to case3 just because
    the application received a coherent view of the system

Exp2: Effects of Period and quota information
---------------------------------------------
See [14] for detailed stats -- POWER
See [15] for detailed stats -- Intel

Case1: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
        worker_processes: auto - No limits
Case2: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
        worker_processes: auto, fake sysfs to export 4 cpus - Exact CPUs
Case3: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
        worker_processes: auto, fake sysfs to export 8 cpus - Overcommit of CPUs
Case4: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
        worker_processes: auto, fake sysfs to export 8 cpus - Undercommit of CPUs

Summary statistics of the experiment -- IBM POWER:
+----------------+----------+----------+----------+----------+
| Metric         | case1    | case2    | case3    | case4    |
+----------------+----------+----------+----------+----------+
| PIDs           | 177      | 5        | 9        | 3        |
| mem usg (init) | 422.2    | 67.5     | 87.12    | 62.5     |
| mem usg (peak) | 571.4    | 130.6    | 131.6    | 85.38    |
| Throttle %     | 96.8     | 20.12    | 97.08    | 0        |
| Requests/sec   | 18849.97 | 66356.02 | 61121.65 | 35265.99 |
| Transfer/sec   | 15.28    | 53.79    | 49.54    | 28.59    |
+----------------+----------+----------+----------+----------+

Summary statistics of the experiment -- Intel:
+----------------+----------+----------+----------+----------+
| Metric         | case1    | case2    | case3    | case4    |
+----------------+----------+----------+----------+----------+
| PIDs           | 33       | 5        | 9        | 3        |
| mem usg (init) | 29.12    | 7.574    | 10.83    | 6.07     |
| mem usg (peak) | 37.78    | 16.34    | 18.59    | 12.69    |
| Throttle %     | 97.4     | 19.80    | 97.4     | 0        |
| Requests/sec   | 32778.57 | 44754.85 | 42296.64 | 22500.00 |
| Transfer/sec   | 26.57    | 36.28    | 34.28    | 18.24    |
+----------------+----------+----------+----------+----------+

Obervations -- Both platforms show the same trend in statistics:
When the CPU quota limit is set to run for the duration of 4 CPUs and,
Case1: Nginx spawns processes based on the view of the system then there is a
        high amount of throttling, high memory footprint as well as low
        performance
Case2: A fake sysfs is mounted to display 4 cpus, when period and quota
        reflects 4 cpus worth of runtime then the throttling is the lowest as
        well as the performance is the highest.
        Also, memory footprint is seen to improve.
Case3: A fake sysfs is mounted to display 8 cpus i.e overcommit, then
        throttling is seen to increase, while the throttle time is lesser
        than case1, the throttle % is the same. Performance also drops as well
        as higher memory footprint can be seen when compared to case 2 but
        less than case 1
Case4: A fake sysfs is mounted to display 2 cpus i.e undercommit, There is
        virtually no throttling to be observed as there is no contention.
        The memory footprint is also the lowest, however the performance takes
        a dip too and is the worst of all the cases

The above experiments show us that there is merit for applications observing
coherent information in terms of tasks spawned, memory footprint and
performance.

Existing solutions
==================

1. Why don't current applications look at the cgroupfs interface instead of the
    old sys and procfs if they need coherent information?

Most of the information that applications seek from the traditional
filesystems is correctly populated in the cgroupfs and that applications
should modify their libraries to receive coherent information from there. This
is a strong argument and cannot be discounted, however it does present two
problems along with it.
a. There are a lot of applications that currently use the traditional
    interface which can be range from legacy applications as well relatively
    modern applications like nginx as we have seen. Therefore, the sheer volume
    of applications and their libraries may make it difficult to implement this
    currently.
b. Applications which previously didn't know the concept of millicores would
    now have to incorporate that into their business logic for their thread
    requirements as well by deriving and interpreting this information from CFS
    period and quota

2. Userspace tools like LXCFS[5]
In the experiment above, to give a coherent view of the system we mounted
fake sysfs directories, which is precisely the modus operandi of LXFCS.
LXCFS is a userspace tools which uses FUSE filesystem to provide coherency of
information and mount cgroupfs based information in sys, procfs like:

/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu/*
/sys/devices/system/cpu/online

It is also capable to virtualize period and quota information with --enable-cfs
option[6]. It divides period by quota and the resulting number of CPUs "N" is
presented in /sys/devices/system/cpu/online as "0-N".

The benefit of LXCFS is that it is a light, relatively easy to setup userspace
tool which can be used by applications to get coherent information presented
from cgroupfs to sysfs. It does seem to be currently in use with
Kubenetes as described by Google Anthos[7] and the Alibab Cloud tutorial[8]

However, it does pose a couple of concerns too:
a. From a CPU point of view, when it comes to virtualizing of CPUs based on
    periods and quotas will always lead to list of CPUs starting from 0 to N,
    where N is the translation of number of CPUs it should get a runtime of.
    The question aries if this can become an issue where the applications
    depend of the CPU list itself, that it is task-setting or setting affinity
    to those CPUs?
    If that is possible, then in that case where there are multiple
    container applications running with the same taskset CPUset; can experience
    unwarranted throttling.
b. LXCFS is an external solution that needs to be explicitly setup for
    applications that experience problems from incorrect information
    in sys/procfs

Hence, I believe an argument can be made to have an in-kernel interface that
can virtualize CPU information and namespace each logical container into its
own view of the CPU topology.

3. Introduce a new interface to present information in-kernel
A patchset was suggested[9] which added /proc/self/meminfo which contained a
subset of /proc/meminfo respecting cgroup restrictions for the memory
incoherence problem.

This design can also be ported for the CPU view of the system too.
The advantage of this approach is that a new interface is setup
without overriding the current interfaces which enables us to not break any
assumptions already established on those sys and procfs interface.

However, this could turn out to be a potential disadvantage too.
As there can be two kinds of applications that the solution is currently
designed for:
a. Legacy applications
b. Newer applications that still look at traditional interfaces
For both a.) and b.) if they do not currently look at the cgroupfs interfaces;
then introduction of yet another interface may not be motivating enough to
modify their codebase to receive this information.
This argument was also presented by Christian Brauner in the same patchset[10],
while also highlighting overlapping points presented from this proposal.

Honorable mention: Kubenetes CPU manager[11]. The CPU manager is a feature for
QoS in container orchestration, here the CPU manager manages the cpuset given
exclusively to pods based on the requests of CPUs in its configuration.
While it is a nifty feature to manage cpuset information, it still does not
reflect this information in traditional sys/procfs interfaces and a LXCFS hook
is needed along with it for the same.

Proposed Solution -> CPU namespace
==================================

This RFD proposes the inclusion of a new namespace feature - CPU namespace.
A CPU namespace can present coherent system CPU information to the contain
applications that reside within it in accordance with the cgroups limits set
onto it. The namespace also virtualizes CPU information and can maintain an
internal translation from the namespace CPU to the logical CPU in the kernel.

Designing a namespace this way presents a coherent interface as well as is able
to cleanly abstract details about the system and it's configuration from the
higher level applications.
The advantage of this approach is also that this can be acheived without the
introduction of a new interface and by just reimagining the interpretation of
the existing sys and proc interfaces.

On the lines of namespaces, an alternative namespace that could also be
proposed is a sys/proc namespace that can virtualize information presented from
cgroupfs. It could be CPUs, memory, even other system topology. This would
resolve memory limits inconsistency issues as reported in [9]. However,
presenting CPU information this way does pose a challenge. There are metrics
like period and quota as discussed earlier which need to be derived to present
as CPUs as well as needs to be abstracted out. If a coherent interpretation of
these derived metrics can be agreed upon then the following could also be a
viable alternative.

The aim of the above proposal is to:
a. Garner perspective from the community around the problem, its implications
    in the real world and the cementing a consensus if there is a need to
    solving it
b. Spark a discussion around a potential solution

If a consensus can be reached, first towards acceptance of the problem and then
towards a coherent CPU namespace mechanism; I would gladly volunteer to help
in building it out.

Thanks,
Pratik Sampat
IBM, Linux Technology Center

[1]: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
[2]: https://docs.nginx.com/nginx/
[3]: http://nginx.org/en/docs/ngx_core_module.html#worker_processes
[4]: https://github.com/wg/wrk
[5]: https://linuxcontainers.org/lxcfs/
[6]: https://www.mankier.com/1/lxcfs#--enable-cfs
[7]: https://cloud.google.com/blog/products/containers-kubernetes/migrate-for-anthos-streamlines-legacy-java-app-modernization
[8]: https://www.alibabacloud.com/blog/kubernetes-demystified-using-lxcfs-to-improve-container-resource-visibility_594109
[9]: https://lore.kernel.org/lkml/ac070cd90c0d45b7a554366f235262fa5c566435.1622716926.git.legion@kernel.org/
[10]: https://lore.kernel.org/lkml/20210615113222.edzkaqfvrris4nth@wittgenstein/
[11]: https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/
[12]: POWER - EXP1: Effects of incorrect CPU information with cpuset
     Case1: The container has access to all the CPUs (0-175)

     IDLE container stat
     NAME   CPU %      MEM USAGE / LIMIT   MEM %  NET I/O      BLOCK I/O    PIDS
     pnginx 0.00%      411.1MiB / 127.5GiB 0.31%  2.29kB / 0B  0B / 8.19kB  177
     PEAK WORKLOAD
     pnginx 14383.42%  662.8MiB / 127.5GiB 0.51% 389MB / 2.11GB 0B / 8.19kB  177


     Case2: cpuset limits set on nginx container to only "0-3". However the
            default sys/ and proc/ file systems display 176 CPUs.

     IDLE container stat
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O       BLOCK I/O    PIDS
     pnginx  0.00%    290.8MiB / 127.5GiB  0.22%  2.29kB / 0B   0B / 8.19kB  177
     PEAK WORKLOAD
     pnginx  399.21%  295.3MiB / 127.5GiB  0.23%  197MB / 1.1GB 0B / 8.19kB  177


     Case3: cpuset limits set to "0-3" and sysfs faked to give coherent
            information pertaining to only 0-3

     IDLE container stat
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O         BLOCK I/O    PIDS
     pnginx  0.00%    26.69MiB / 127.5GiB  0.02%  2.22kB / 0B     0B / 8.19kB  5
     PEAK WORKLOAD
     pnginx  399.24%  30.69MiB / 127.5GiB  0.02%  183MB / 1.03GB  0B / 8.19kB  5

[13]: Intel - EXP1: Effects of incorrect CPU information with cpuset
     Case1: The container has access to all the CPUs (0-31)

     IDLE container stat
     NAME  CPU %  MEM USAGE / LIMIT   MEM %   NET I/O       BLOCK I/O     PIDS
     pnginx 0.00% 28.63MiB / 94.38GiB 0.03%   1.54kB / 0B   69.6kB / 8.19kB   33
     PEAK WORKLOAD
     pnginx 1562.51% 40.14MiB / 94.38GiB   0.04% 765MB / 4.08GB  0B / 8.19kB  33


     Case2: cpuset limits set on nginx container to only "0-3". However the
            default sys/ and proc/ file systems display 32 CPUs.

     IDLE container stat
     NAME   CPU %  MEM USAGE / LIMIT   MEM %   NET I/O       BLOCK I/O     PIDS
     pnginx 0.00%  25.37MiB / 94.38GiB   0.03%  2.01kB / 0B   0B / 8.19kB   33
     PEAK WORKLOAD
     pnginx 406.82% 30.7MiB / 94.38GiB   0.03%  243MB / 1.36GB  0B / 8.19kB  33

     Case3: cpuset limits set to "0-3" and sysfs faked to give coherent
            information pertaining to only 0-3

     IDLE container stat
     NAME   CPU %  MEM USAGE / LIMIT   MEM %   NET I/O       BLOCK I/O     PIDS
     pnginx 0.00%  5.914MiB / 94.38GiB 0.01%   2.08kB / 0B   0B / 8.19kB   5
     PEAK WORKLOAD
     pnginx 406.04% 9.914MiB / 94.38GiB  0.01%  251MB / 1.41GB  0B / 8.19kB  5

[14]: POWER - Exp2: Effects of Period and quota information
     Case1: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto - No limits
     Inital nginx stats
     --docker stats--
     NAME    CPU %  MEM USAGE / LIMIT    MEM %  NET I/O      BLOCK I/O    PIDS
     pnginx  0.00%  422.2MiB / 127.5GiB  0.32%  2.36kB / 0B  0B / 8.19kB  177
     --throttle stats--
     nr_periods 7
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O        BLOCK I/O    PIDS
     pnginx  391.18%  571.4MiB / 127.5GiB  0.44%  101MB / 561MB  0B / 8.19kB  177
     --throttle stats--
     nr_periods 313
     nr_throttled 303
     throttled_time 2168846281268

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.2:80/index.html
     Running 30s test @ http://172.17.0.2:80/index.html
     4 threads and 500 connections
     Thread Stats   Avg      Stdev     Max   +/- Stdev
     Latency    59.17ms   89.55ms   1.19s    88.62%
     Req/Sec     4.75k     4.03k   27.79k    74.00%
     567045 requests in 30.08s, 459.63MB read
     Requests/sec:  18849.97
     Transfer/sec:     15.28MB

     Case2: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 4 cpus - Exact CPUs

     Inital nginx stats
     --docker stats--
     NAME    CPU %  MEM USAGE / LIMIT   MEM %  NET I/O      BLOCK I/O    PIDS
     pnginx  0.00%  67.5MiB / 127.5GiB  0.05%  2.29kB / 0B  0B / 8.19kB  5
     --throttle stats--
     nr_periods 5
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O        BLOCK I/O    PIDS
     pnginx  398.36%  130.6MiB / 127.5GiB  0.10%  337MB / 1.9GB  0B / 8.19kB  5
     --throttle stats--
     nr_periods 308
     nr_throttled 62
     throttled_time 375890674

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.2:80/index.html
     Running 30s test @ http://172.17.0.2:80/index.html
       4 threads and 500 connections
       Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency    17.57ms   32.08ms 341.08ms   89.20%
         Req/Sec    16.71k     1.26k   24.71k    78.17%
       1996404 requests in 30.09s, 1.58GB read
     Requests/sec:  66356.02
     Transfer/sec:     53.79MB

     Case3: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 8 cpus - Overcommit of CPUs
     Inital nginx stats
     --docker stats--
     NAME    CPU %  MEM USAGE / LIMIT    MEM %  NET I/O      BLOCK I/O    PIDS
     pnginx  0.00%  87.12MiB / 127.5GiB  0.07%  2.36kB / 0B  0B / 8.19kB  9
     --throttle stats--
     nr_periods 5
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O        BLOCK I/O    PIDS
     pnginx  401.48%  131.6MiB / 127.5GiB  0.10%  300MB / 1.7GB  0B / 8.19kB  9
     --throttle stats--
     nr_periods 309
     nr_throttled 300
     throttled_time 119159115734

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.2:80/index.html
     Running 30s test @ http://172.17.0.2:80/index.html
       4 threads and 500 connections
       Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency    14.39ms   16.52ms 151.55ms   81.31%
         Req/Sec    15.39k     0.91k   30.95k    90.08%
       1838179 requests in 30.07s, 1.46GB read
     Requests/sec:  61121.65
     Transfer/sec:     49.54MB

     Case4: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 2 cpus - Undercommit of CPUs

     Inital nginx stats
     --docker stats--
     NAME    CPU %  MEM USAGE / LIMIT   MEM %  NET I/O      BLOCK I/O    PIDS
     pnginx  0.00%  62.5MiB / 127.5GiB  0.05%  2.29kB / 0B  0B / 8.19kB  3
     --throttle stats--
     nr_periods 5
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT    MEM %  NET I/O        BLOCK I/O    PIDS
     pnginx  199.47%  85.38MiB / 127.5GiB  0.07%  170MB / 963MB  0B / 8.19kB  3
     --throttle stats--
     nr_periods 308
     nr_throttled 0
     throttled_time 0

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.2:80/index.html
     Running 30s test @ http://172.17.0.2:80/index.html
       4 threads and 500 connections
       Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency   159.81ms  251.64ms   1.05s    81.16%
         Req/Sec     8.88k     1.89k   15.59k    71.00%
       1060592 requests in 30.07s, 859.69MB read
     Requests/sec:  35265.99
     Transfer/sec:     28.59MB

[15]: Intel - Exp2: Effects of Period and quota information
     Case1: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto - No limits

     Inital nginx stats
     --docker stats--
     NAME   CPU %  MEM USAGE / LIMIT    MEM %  NET I/O      BLOCK I/O        PIDS
     pnginx 0.00%  29.12MiB / 94.38GiB  0.03%  1.74kB / 0B  2.26MB / 8.19kB  33
     --throttle stats--
     nr_periods 5
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O         BLOCK I/O         PIDS
     pnginx  403.43%  37.78MiB / 94.38GiB   0.04%     184MB / 912MB   2.26MB / 8.19kB   33
     --throttle stats--
     nr_periods 309
     nr_throttled 301
     throttled_time 506059002784

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.4:80/index.html
     Running 30s test @ http://172.17.0.4:80/index.html
     4 threads and 500 connections
     Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency    26.10ms   31.45ms 189.88ms   79.53%
         Req/Sec     8.25k     1.67k   22.62k    79.92%
     985441 requests in 30.06s, 798.78MB read
     Requests/sec:  32778.57
     Transfer/sec:     26.57MB

     Case2: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 4 cpus - Exact CPUs

     Inital nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O       BLOCK I/O         PIDS
     pnginx  0.00%    7.574MiB / 94.38GiB   0.01%     2.01kB / 0B   90.1kB / 8.19kB   5
     --throttle stats--
     nr_periods 5
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O         PIDS
     pnginx  408.06%  16.34MiB / 94.38GiB   0.02%     227MB / 1.28GB   90.1kB / 8.19kB   5
     --throttle stats--
     nr_periods 308
     nr_throttled 61
     throttled_time 100989735

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.4:80/index.html
     Running 30s test @ http://172.17.0.4:80/index.html
     4 threads and 500 connections
     Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency    26.47ms   48.54ms 448.54ms   89.32%
         Req/Sec    11.26k   844.04    14.61k    68.67%
     1344115 requests in 30.03s, 1.06GB read
     Requests/sec:  44754.85
     Transfer/sec:     36.28MB

     Case3: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 8 cpus - Overcommit of CPUs
     Inital nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O       BLOCK I/O     PIDS
     pnginx  0.00%    10.83MiB / 94.38GiB   0.01%     2.01kB / 0B   0B / 8.19kB   9
     --throttle stats--
     nr_periods 6
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O     PIDS
     pnginx  403.62%  18.59MiB / 94.38GiB   0.02%     236MB / 1.23GB   0B / 8.19kB   9
     --throttle stats--
     nr_periods 308
     nr_throttled 300
     throttled_time 11847978641

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.4:80/index.html
     Running 30s test @ http://172.17.0.4:80/index.html
     4 threads and 500 connections
     Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency    17.52ms   18.08ms 176.48ms   81.30%
         Req/Sec    10.64k   692.48    19.12k    80.50%
     1270019 requests in 30.03s, 1.01GB read
     Requests/sec:  42296.64
     Transfer/sec:     34.28MB

     Case4: 4 CPUs worth of runtime (period: 100000us quota: 400000 us) ,
            worker_processes: auto, fake sysfs to export 2 cpus - Undercommit of CPUs

     Inital nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT    MEM %     NET I/O       BLOCK I/O     PIDS
     pnginx  0.00%    6.07MiB / 94.38GiB   0.01%     2.15kB / 0B   0B / 8.19kB   3
     --throttle stats--
     nr_periods 6
     nr_throttled 0
     throttled_time 0

     Peak workload nginx stats
     --docker stats--
     NAME    CPU %    MEM USAGE / LIMIT     MEM %     NET I/O         BLOCK I/O     PIDS
     pnginx  202.32%  12.69MiB / 94.38GiB   0.01%     126MB / 681MB   0B / 8.19kB   3
     --throttle stats--
     nr_periods 308
     nr_throttled 0
     throttled_time 0

     Benchmark stats
     # ./wrk -t4 -c500 --latency -d30s http://172.17.0.4:80/index.html
     Running 30s test @ http://172.17.0.4:80/index.html
     4 threads and 500 connections
     Thread Stats   Avg      Stdev     Max   +/- Stdev
         Latency   237.39ms  385.12ms   1.49s    81.66%
         Req/Sec     5.66k     1.24k    8.34k    63.42%
     676025 requests in 30.05s, 547.97MB read
     Requests/sec:  22500.00
     Transfer/sec:     18.24MB