From: Shuang Zhai <szhai2@cs.rochester.edu> To: yuzhao@google.com Cc: Michael@michaellarabel.com, ak@linux.intel.com, akpm@linux-foundation.org, axboe@kernel.dk, catalin.marinas@arm.com, corbet@lwn.net, dave.hansen@linux.intel.com, hannes@cmpxchg.org, hdanton@sina.com, jsbarnes@google.com, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mgorman@suse.de, mhocko@kernel.org, page-reclaim@google.com, riel@surriel.com, torvalds@linux-foundation.org, vbabka@suse.cz, will@kernel.org, willy@infradead.org, x86@kernel.org, ying.huang@intel.com Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework Date: Tue, 4 Jan 2022 21:44:23 -0500 [thread overview] Message-ID: <20220105024423.26409-1-szhai2@cs.rochester.edu> (raw) In-Reply-To: <20220104202227.2903605-1-yuzhao@google.com> Fio / pmem benchmark with MGLRU TLDR ==== With the MGLRU, fio achieved 95% CIs [38.95, 40.26]%, [4.12, 6.64]% and [9.26, 10.36]% higher throughput, respectively, for random access, Zipfian (distribution) access and Gaussian (distribution) access, when the average number of jobs per CPU is 1; 95% CIs [42.32, 49.15]%, [9.44, 9.89]% and [20.99, 22.86]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. Background ========== Many applications running on warehouse-scale computers heavily use POSIX read(2)/write(2) and page cache, e.g., Apache Kafka, a distributed streaming application used by "more than 80% of all Fortune 100 companies" [1] and PostgreSQL, "the world's most advanced open source relational database" [2]. Intel DC Persistent Memory, as an affordable alternative to DRAM, can deliver large capacity and data persistence. Specifically, the device used in this benchmark can achieve up to 36 GiB/s and 15 GiB/s throughput, respectively, for sequential and random read access. Our research group at the University of Rochester focuses on the intersection of computer architecture and system software. My current research interest is memory management on tiered memory systems. Matrix ====== Kernels: version [+ patchset] * Baseline: 5.15 * Patched: 5.15 + MGLRU Access patterns (4KB read): * Random (uniform) * Zipfian (theta 0.8; the recommended range is 0-2) * Gaussian (deviation 40; the possible range is 0-100) Concurrency conditions (the average number of jobs per CPU): * 1 * 2 Total file size (GB): 400 (~2x memory capacity) Total configurations: 12 Data points per configuration: 10 Total run duration (minutes) per data point: ~30 Notes ----- 1. All files were stored on pmem. Each job had the exclusive access to a single file. 2. Due to the hardware limitation when accessing remote pmem [3], numactl was used to bind the fio processes to the local pmem. Only one of the two NUMA nodes was used during the benchmark. 3. During dry runs, we observed that the throughput doesn't improve beyond 2 jobs per CPU for random access. Moreover, the patched kernel showed consistent improvements over the baseline kernel when using 3 or 4 jobs per CPU. 4. We wanted to simulate the real-world scenarios and therefore used default swap configuration (on). Moreover, we didn't observe any negative impact on performance with dry runs that disabled swap. Procedure ========= <for each kernel> grub2-reboot <baseline, patched> <for each concurrency condition> <generate test files> <for each access pattern> <for each data point> <reboot> <run fio> Hardware -------- Memory (GiB per socket): 192 CPU (# per socket): 40 Pmem (GiB per socket): 768 Fio --- $ fio -version fio-3.28 $ numactl --cpubind=0 --membind=0 fio --name=randread \ --directory=/mnt/pmem/ --size={10G, 5G} --io_size=1000TB \ --time_based --numjobs={40, 80} --ioengine=io_uring \ --ramp_time=20m --runtime=10m --iodepth=128 \ --iodepth_batch_submit=32 --iodepth_batch_complete=32 \ --rw=randread --random_distribution={random, zipf:0.8, normal:40} \ --direct=0 --norandommap --group_reporting Results ======= Throughput ---------- The patched kernel achieved substantially higher throughput for all three access patterns and two concurrency conditions. Specifically, comparing the patched with the baseline kernel, fio achieved 95% CIs [38.95, 40.26]%, [4.12, 6.64]% and [9.26, 10.36]% higher throughput, respectively, for random access, Zipfian access, and Gaussian access, when the average number of jobs per CPU is 1; 95% CIs [42.32, 49.15]%, [9.44, 9.89]% and [20.99, 22.86]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. +---------------------+---------------+---------------+ | Mean MiB/s [95% CI] | 1 job / CPU | 2 jobs / CPU | +---------------------+---------------+---------------+ | Random access | 8411 / 11742 | 8417 / 12267 | | | [3275, 3387] | [3562, 4137] | +---------------------+---------------+---------------+ | Zipfian access | 14576 / 15360 | 12932 / 14181 | | | [600, 967] | [1220, 1279] | +---------------------+---------------+---------------+ | Gaussian access | 14564 / 15993 | 11513 / 14037 | | | [1348, 1508] | [2417, 2631] | +---------------------+---------------+---------------+ Table 1. Throughput comparison between the baseline and the patched kernels The patched kernel exhibited less degradation in throughput when running more concurrent jobs. Comparing 2 jobs per CPU with 1 job per CPU, fio achieved 95% CIs [-11.54, -11.02]%, [-16.91, -12.01]% and [-21.61, -20.30]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when using the baseline kernel; 95% CIs [2.04, 6.92]%, [-8.86, -6.48]% and [-12.83, -11.64]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when using the patched kernel. There were no statistically significant changes in throughput for the rest of the test matrix. +---------------------+-----------------+----------------+ | Mean MiB/s [95% CI] | Baseline kernel | Patched kernel | +---------------------+-----------------+----------------+ | Random access | 8411 / 8417 | 11741 / 12267 | | | [-55, 69] | [239, 812] | +---------------------+-----------------+----------------+ | Zipfian access | 14576 / 12932 | 15360/ 14181 | | | [-1682, -1607] | [-1361, -996] | +---------------------+-----------------+----------------+ | Gaussian access | 14565 / 11513 | 15993 / 14037 | | | [-3147, -2957] | [-2051, -1861] | +---------------------+-----------------+----------------+ Table 2. Throughput comparison between 1 job per CPU and 2 jobs per CPU Tail Latency ------------ Comparing the patched with the baseline kernel, fio experienced 95% CIs [-41.77, -40.35]% and [6.64, 13.95]% higher latency at the 99th percentile, respectively, for random access and Gaussian access, when the average number of jobs per CPU is 1; 95% CIs [-41.97, -40.59]%, [-47.74, -47.04]% and [-51.32, -50.27]% higher latency at the 99th percentile, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. There were no statistically significant changes in latency at the 99th percentile for the rest of the test matrix. +------------------------------+----------------+------------------+ | 99th percentile latency (us) | 1 job / CPU | 2 jobs / CPU | +------------------------------+----------------+------------------+ | Random access | 12466 / 7347 | 25560 / 15008 | | | [-5207, -5030] | [-10729, -10375] | +------------------------------+----------------+------------------+ | Zipfian access | 3395 / 3382 | 14563 / 7661 | | | [-131, 105] | [-6953,-6850] | +------------------------------+----------------+------------------+ | Gaussian access | 3280 / 3618 | 15611 / 7681 | | | [217, 457] | [-8012, -7848] | +------------------------------+----------------+------------------+ Table 3. Comparison of the 99th percentile latency between the baseline and the patched kernels (lower is better) Metrics collected during each run are available at: https://github.com/zhaishuang1/MglruPerf/tree/master A peek at 5.16-rc6 ------------------ We also ran the benchmark on 5.16-rc6 with swap off. However, we haven't collected enough data points to establish a 95% CI. Here are a few numbers we've collected: +----------------+------------+----------+----------------+----------+ | Access pattern | Jobs / CPU | 5.16-rc6 | 5.16-rc6-mglru | % change | +----------------+------------+----------+----------------+----------+ | Random access | 1 | 7467 | 10440 | 39.8% | +----------------+------------+----------+----------------+----------+ | Random access | 2 | 7504 | 13417 | 78.8% | +----------------+------------+----------+----------------+----------+ | Random access | 3 | 7511 | 13954 | 85.8% | +----------------+------------+----------+----------------+----------+ | Random access | 4 | 7542 | 13925 | 84.6% | +----------------+------------+----------+----------------+----------+ Reference ========= [1] https://kafka.apache.org/documentation/#design_filesystem [2] https://www.postgresql.org/docs/11/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY [3] System Evaluation of the Intel Optane byte-addressable NVM, MEMSYS 2019. Appendix ======== Throughput ---------- $ cat raw_data_fio.r v <- c( # baseline 40 procs random 8467.89, 8428.34, 8383.32, 8253.12, 8464.65, 8307.42, 8424.78, 8434.44, 8474.88, 8468.26, # baseline 40 procs zipf 14570.44, 14598.03, 14550.74, 14640.29, 14591.4, 14573.35, 14503.18, 14613.39, 14598.61, 14522.27, # baseline 40 procs gaussian 14504.95, 14427.23, 14652.19, 14519.47, 14557.97, 14617.92, 14555.87, 14446.94, 14678.12, 14688.33, # baseline 80 procs random 8427.51, 8267.23, 8437.48, 8432.37, 8441.4, 8454.26, 8413.13, 8412.44, 8444.36, 8444.32, # baseline 80 procs zipf 12980.12, 12946.43, 12911.95, 12925.83, 12952.75, 12841.44, 12920.35, 12924.19, 12944.38, 12967.72, # baseline 80 procs gaussian 11666.29, 11624.72, 11454.82, 11482.36, 11462.24, 11379.46, 11691.5, 11471.19, 11402.08, 11494.13, # patched 40 procs random 11706.69, 11778.1, 11774.07, 11750.07, 11744.97, 11766.65, 11727.79, 11708.41, 11745.3, 11716.45, # patched 40 procs zipf 15498.31, 14647.94, 15423.35, 15467.32, 15467.05, 15342.49, 15511.34, 15414.06, 15401.1, 15431.57, # patched 40 procs gaussian 15957.86, 15957.13, 16022.69, 16035.85, 16150.2, 15904.5, 15943.36, 16036.78, 16025.95, 15900.56, # patched 80 procs random 12568.51, 11772.25, 11622.15, 12057.66, 11971.72, 12693.36, 12399.71, 12553.23, 12242.74, 12793.34, # patched 80 procs zipf 14194.78, 14213.61, 14148.66, 14182.35, 14183.91, 14192.23, 14163.2, 14179.7, 14162.12, 14196.34, # patched 80 procs gaussian 14084.86, 13706.34, 14089.42, 14058.4, 14096.74, 14108.06, 14043.41, 14072.15, 14088.44, 14024.51 ) a <- array(v, dim = c(10, 3, 2, 2)) # baseline vs patched for (concurr in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, concurr, 1], a[, dist, concurr, 2]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("concurr%d dist%d: no significance", concurr, dist) } else { s <- sprintf("concurr%d dist%d: [%.2f, %.2f]%%", concurr, dist, -p[2], -p[1]) } print(s) } } # low concurr vs high concurr for (kern in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, 1, kern], a[, dist, 2, kern]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("kern%d dist%d: no significance", kern, dist) } else { s <- sprintf("kern%d dist%d: [%.2f, %.2f]%%", kern, dist, -p[2], -p[1]) } print(s) } } $ R -q -s -f raw_data_fio.r Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -132.15, df = 11.177, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -3386.514 -3275.766 sample estimates: mean of x mean of y 8410.71 11741.85 [1] "concurr1 dist1: [38.95, 40.26]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -9.5917, df = 9.4797, p-value = 3.463e-06 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -967.8353 -600.7307 sample estimates: mean of x mean of y 14576.17 15360.45 [1] "concurr1 dist2: [4.12, 6.64]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -37.744, df = 17.33, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1508.328 -1348.850 sample estimates: mean of x mean of y 14564.90 15993.49 [1] "concurr1 dist3: [9.26, 10.36]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -30.144, df = 9.3334, p-value = 1.281e-10 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -4137.381 -3562.653 sample estimates: mean of x mean of y 8417.45 12267.47 [1] "concurr2 dist1: [42.32, 49.15]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -92.164, df = 13.276, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1279.417 -1220.931 sample estimates: mean of x mean of y 12931.52 14181.69 [1] "concurr2 dist2: [9.44, 9.89]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -49.453, df = 17.863, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -2631.656 -2417.052 sample estimates: mean of x mean of y 11512.88 14037.23 [1] "concurr2 dist3: [20.99, 22.86]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = -0.22947, df = 16.403, p-value = 0.8213 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -68.88155 55.40155 sample estimates: mean of x mean of y 8410.71 8417.45 [1] "kern1 dist1: no significance" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 91.86, df = 17.875, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1607.021 1682.287 sample estimates: mean of x mean of y 14576.17 12931.52 [1] "kern1 dist2: [-11.54, -11.02]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 67.477, df = 17.539, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 2956.815 3147.225 sample estimates: mean of x mean of y 14564.90 11512.88 [1] "kern1 dist3: [-21.61, -20.30]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = -4.1443, df = 9.0781, p-value = 0.002459 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -812.1507 -239.0833 sample estimates: mean of x mean of y 11741.85 12267.47 [1] "kern2 dist1: [2.04, 6.92]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 14.566, df = 9.1026, p-value = 1.291e-07 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 996.0064 1361.5196 sample estimates: mean of x mean of y 15360.45 14181.69 [1] "kern2 dist2: [-8.86, -6.48]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 43.826, df = 15.275, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1861.263 2051.247 sample estimates: mean of x mean of y 15993.49 14037.23 [1] "kern2 dist3: [-12.83, -11.64]%" 99th Percentile Latency ----------------------- $ cat raw_data_fio_lat.r v <- c( # baseline 40 procs random 12649, 12387, 12518, 12518, 12518, 12387, 12518, 12518, 12387, 12256, # baseline 40 procs zipf 3458, 3294, 3425, 3294, 3294, 3359, 3752, 3326, 3294, 3458, # baseline 40 procs gaussian 3326, 3458, 3195, 3392, 3326, 3228, 3228, 3326, 3130, 3195, # baseline 80 procs random 25560, 26084, 25560, 25560, 25297, 25297, 25822, 25560, 25560, 25297, # baseline 80 procs zipf 14484, 14615, 14615, 14484, 14484, 14615, 14615, 14615, 14615, 14484, # baseline 80 procs gaussian 15664, 15664, 15533, 15533, 15533, 15664, 15795, 15533, 15664, 15533, # patched 40 procs random 7439, 7242, 7373, 7373, 7373, 7439, 7242, 7308, 7308, 7373, # patched 40 procs zipf 3261, 3425, 3392, 3294, 3359, 3556, 3228, 3490, 3458, 3359, # patched 40 procs gaussian 3687, 3523, 3556, 3523, 3752, 3654, 3884, 3490, 3392, 3720, # patched 80 procs random 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, # patched 80 procs zipf 7701, 7635, 7701, 7701, 7635, 7635, 7701, 7635, 7635, 7635, # patched 80 procs gaussian 7635, 7898, 7701, 7635, 7635, 7635, 7635, 7635, 7701, 7701 ) a <- array(v, dim = c(10, 3, 2, 2)) # baseline vs patched for (concurr in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, concurr, 1], a[, dist, concurr, 2]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("concurr%d dist%d: no significance", concurr, dist) } else { s <- sprintf("concurr%d dist%d: [%.2f, %.2f]%%", concurr, dist, -p[2], -p[1]) } print(s) } } $ R -q -s -f raw_data_fio_lat.r Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 123.52, df = 15.287, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 5030.417 5206.783 sample estimates: mean of x mean of y 12465.6 7347.0 [1] "concurr1 dist1: [-41.77, -40.35]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 0.23667, df = 16.437, p-value = 0.8158 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -104.7812 131.1812 sample estimates: mean of x mean of y 3395.4 3382.2 [1] "concurr1 dist2: no significance" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -5.9754, df = 16.001, p-value = 1.94e-05 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -457.5065 -217.8935 sample estimates: mean of x mean of y 3280.4 3618.1 [1] "concurr1 dist3: [6.64, 13.95]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 134.89, df = 9, p-value = 3.437e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 10374.74 10728.66 sample estimates: mean of x mean of y 25559.7 15008.0 [1] "concurr2 dist1: [-41.97, -40.59]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 288.1, df = 13.292, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 6849.566 6952.834 sample estimates: mean of x mean of y 14562.6 7661.4 [1] "concurr2 dist2: [-47.74, -47.04]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 203.64, df = 17.798, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 7848.616 8012.384 sample estimates: mean of x mean of y 15611.6 7681.1 [1] "concurr2 dist3: [-51.32, -50.27]%"
WARNING: multiple messages have this Message-ID (diff)
From: Shuang Zhai <szhai2@cs.rochester.edu> To: yuzhao@google.com Cc: Michael@michaellarabel.com, ak@linux.intel.com, akpm@linux-foundation.org, axboe@kernel.dk, catalin.marinas@arm.com, corbet@lwn.net, dave.hansen@linux.intel.com, hannes@cmpxchg.org, hdanton@sina.com, jsbarnes@google.com, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mgorman@suse.de, mhocko@kernel.org, page-reclaim@google.com, riel@surriel.com, torvalds@linux-foundation.org, vbabka@suse.cz, will@kernel.org, willy@infradead.org, x86@kernel.org, ying.huang@intel.com Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework Date: Tue, 4 Jan 2022 21:44:23 -0500 [thread overview] Message-ID: <20220105024423.26409-1-szhai2@cs.rochester.edu> (raw) In-Reply-To: <20220104202227.2903605-1-yuzhao@google.com> Fio / pmem benchmark with MGLRU TLDR ==== With the MGLRU, fio achieved 95% CIs [38.95, 40.26]%, [4.12, 6.64]% and [9.26, 10.36]% higher throughput, respectively, for random access, Zipfian (distribution) access and Gaussian (distribution) access, when the average number of jobs per CPU is 1; 95% CIs [42.32, 49.15]%, [9.44, 9.89]% and [20.99, 22.86]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. Background ========== Many applications running on warehouse-scale computers heavily use POSIX read(2)/write(2) and page cache, e.g., Apache Kafka, a distributed streaming application used by "more than 80% of all Fortune 100 companies" [1] and PostgreSQL, "the world's most advanced open source relational database" [2]. Intel DC Persistent Memory, as an affordable alternative to DRAM, can deliver large capacity and data persistence. Specifically, the device used in this benchmark can achieve up to 36 GiB/s and 15 GiB/s throughput, respectively, for sequential and random read access. Our research group at the University of Rochester focuses on the intersection of computer architecture and system software. My current research interest is memory management on tiered memory systems. Matrix ====== Kernels: version [+ patchset] * Baseline: 5.15 * Patched: 5.15 + MGLRU Access patterns (4KB read): * Random (uniform) * Zipfian (theta 0.8; the recommended range is 0-2) * Gaussian (deviation 40; the possible range is 0-100) Concurrency conditions (the average number of jobs per CPU): * 1 * 2 Total file size (GB): 400 (~2x memory capacity) Total configurations: 12 Data points per configuration: 10 Total run duration (minutes) per data point: ~30 Notes ----- 1. All files were stored on pmem. Each job had the exclusive access to a single file. 2. Due to the hardware limitation when accessing remote pmem [3], numactl was used to bind the fio processes to the local pmem. Only one of the two NUMA nodes was used during the benchmark. 3. During dry runs, we observed that the throughput doesn't improve beyond 2 jobs per CPU for random access. Moreover, the patched kernel showed consistent improvements over the baseline kernel when using 3 or 4 jobs per CPU. 4. We wanted to simulate the real-world scenarios and therefore used default swap configuration (on). Moreover, we didn't observe any negative impact on performance with dry runs that disabled swap. Procedure ========= <for each kernel> grub2-reboot <baseline, patched> <for each concurrency condition> <generate test files> <for each access pattern> <for each data point> <reboot> <run fio> Hardware -------- Memory (GiB per socket): 192 CPU (# per socket): 40 Pmem (GiB per socket): 768 Fio --- $ fio -version fio-3.28 $ numactl --cpubind=0 --membind=0 fio --name=randread \ --directory=/mnt/pmem/ --size={10G, 5G} --io_size=1000TB \ --time_based --numjobs={40, 80} --ioengine=io_uring \ --ramp_time=20m --runtime=10m --iodepth=128 \ --iodepth_batch_submit=32 --iodepth_batch_complete=32 \ --rw=randread --random_distribution={random, zipf:0.8, normal:40} \ --direct=0 --norandommap --group_reporting Results ======= Throughput ---------- The patched kernel achieved substantially higher throughput for all three access patterns and two concurrency conditions. Specifically, comparing the patched with the baseline kernel, fio achieved 95% CIs [38.95, 40.26]%, [4.12, 6.64]% and [9.26, 10.36]% higher throughput, respectively, for random access, Zipfian access, and Gaussian access, when the average number of jobs per CPU is 1; 95% CIs [42.32, 49.15]%, [9.44, 9.89]% and [20.99, 22.86]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. +---------------------+---------------+---------------+ | Mean MiB/s [95% CI] | 1 job / CPU | 2 jobs / CPU | +---------------------+---------------+---------------+ | Random access | 8411 / 11742 | 8417 / 12267 | | | [3275, 3387] | [3562, 4137] | +---------------------+---------------+---------------+ | Zipfian access | 14576 / 15360 | 12932 / 14181 | | | [600, 967] | [1220, 1279] | +---------------------+---------------+---------------+ | Gaussian access | 14564 / 15993 | 11513 / 14037 | | | [1348, 1508] | [2417, 2631] | +---------------------+---------------+---------------+ Table 1. Throughput comparison between the baseline and the patched kernels The patched kernel exhibited less degradation in throughput when running more concurrent jobs. Comparing 2 jobs per CPU with 1 job per CPU, fio achieved 95% CIs [-11.54, -11.02]%, [-16.91, -12.01]% and [-21.61, -20.30]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when using the baseline kernel; 95% CIs [2.04, 6.92]%, [-8.86, -6.48]% and [-12.83, -11.64]% higher throughput, respectively, for random access, Zipfian access and Gaussian access, when using the patched kernel. There were no statistically significant changes in throughput for the rest of the test matrix. +---------------------+-----------------+----------------+ | Mean MiB/s [95% CI] | Baseline kernel | Patched kernel | +---------------------+-----------------+----------------+ | Random access | 8411 / 8417 | 11741 / 12267 | | | [-55, 69] | [239, 812] | +---------------------+-----------------+----------------+ | Zipfian access | 14576 / 12932 | 15360/ 14181 | | | [-1682, -1607] | [-1361, -996] | +---------------------+-----------------+----------------+ | Gaussian access | 14565 / 11513 | 15993 / 14037 | | | [-3147, -2957] | [-2051, -1861] | +---------------------+-----------------+----------------+ Table 2. Throughput comparison between 1 job per CPU and 2 jobs per CPU Tail Latency ------------ Comparing the patched with the baseline kernel, fio experienced 95% CIs [-41.77, -40.35]% and [6.64, 13.95]% higher latency at the 99th percentile, respectively, for random access and Gaussian access, when the average number of jobs per CPU is 1; 95% CIs [-41.97, -40.59]%, [-47.74, -47.04]% and [-51.32, -50.27]% higher latency at the 99th percentile, respectively, for random access, Zipfian access and Gaussian access, when the average number of jobs per CPU is 2. There were no statistically significant changes in latency at the 99th percentile for the rest of the test matrix. +------------------------------+----------------+------------------+ | 99th percentile latency (us) | 1 job / CPU | 2 jobs / CPU | +------------------------------+----------------+------------------+ | Random access | 12466 / 7347 | 25560 / 15008 | | | [-5207, -5030] | [-10729, -10375] | +------------------------------+----------------+------------------+ | Zipfian access | 3395 / 3382 | 14563 / 7661 | | | [-131, 105] | [-6953,-6850] | +------------------------------+----------------+------------------+ | Gaussian access | 3280 / 3618 | 15611 / 7681 | | | [217, 457] | [-8012, -7848] | +------------------------------+----------------+------------------+ Table 3. Comparison of the 99th percentile latency between the baseline and the patched kernels (lower is better) Metrics collected during each run are available at: https://github.com/zhaishuang1/MglruPerf/tree/master A peek at 5.16-rc6 ------------------ We also ran the benchmark on 5.16-rc6 with swap off. However, we haven't collected enough data points to establish a 95% CI. Here are a few numbers we've collected: +----------------+------------+----------+----------------+----------+ | Access pattern | Jobs / CPU | 5.16-rc6 | 5.16-rc6-mglru | % change | +----------------+------------+----------+----------------+----------+ | Random access | 1 | 7467 | 10440 | 39.8% | +----------------+------------+----------+----------------+----------+ | Random access | 2 | 7504 | 13417 | 78.8% | +----------------+------------+----------+----------------+----------+ | Random access | 3 | 7511 | 13954 | 85.8% | +----------------+------------+----------+----------------+----------+ | Random access | 4 | 7542 | 13925 | 84.6% | +----------------+------------+----------+----------------+----------+ Reference ========= [1] https://kafka.apache.org/documentation/#design_filesystem [2] https://www.postgresql.org/docs/11/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY [3] System Evaluation of the Intel Optane byte-addressable NVM, MEMSYS 2019. Appendix ======== Throughput ---------- $ cat raw_data_fio.r v <- c( # baseline 40 procs random 8467.89, 8428.34, 8383.32, 8253.12, 8464.65, 8307.42, 8424.78, 8434.44, 8474.88, 8468.26, # baseline 40 procs zipf 14570.44, 14598.03, 14550.74, 14640.29, 14591.4, 14573.35, 14503.18, 14613.39, 14598.61, 14522.27, # baseline 40 procs gaussian 14504.95, 14427.23, 14652.19, 14519.47, 14557.97, 14617.92, 14555.87, 14446.94, 14678.12, 14688.33, # baseline 80 procs random 8427.51, 8267.23, 8437.48, 8432.37, 8441.4, 8454.26, 8413.13, 8412.44, 8444.36, 8444.32, # baseline 80 procs zipf 12980.12, 12946.43, 12911.95, 12925.83, 12952.75, 12841.44, 12920.35, 12924.19, 12944.38, 12967.72, # baseline 80 procs gaussian 11666.29, 11624.72, 11454.82, 11482.36, 11462.24, 11379.46, 11691.5, 11471.19, 11402.08, 11494.13, # patched 40 procs random 11706.69, 11778.1, 11774.07, 11750.07, 11744.97, 11766.65, 11727.79, 11708.41, 11745.3, 11716.45, # patched 40 procs zipf 15498.31, 14647.94, 15423.35, 15467.32, 15467.05, 15342.49, 15511.34, 15414.06, 15401.1, 15431.57, # patched 40 procs gaussian 15957.86, 15957.13, 16022.69, 16035.85, 16150.2, 15904.5, 15943.36, 16036.78, 16025.95, 15900.56, # patched 80 procs random 12568.51, 11772.25, 11622.15, 12057.66, 11971.72, 12693.36, 12399.71, 12553.23, 12242.74, 12793.34, # patched 80 procs zipf 14194.78, 14213.61, 14148.66, 14182.35, 14183.91, 14192.23, 14163.2, 14179.7, 14162.12, 14196.34, # patched 80 procs gaussian 14084.86, 13706.34, 14089.42, 14058.4, 14096.74, 14108.06, 14043.41, 14072.15, 14088.44, 14024.51 ) a <- array(v, dim = c(10, 3, 2, 2)) # baseline vs patched for (concurr in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, concurr, 1], a[, dist, concurr, 2]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("concurr%d dist%d: no significance", concurr, dist) } else { s <- sprintf("concurr%d dist%d: [%.2f, %.2f]%%", concurr, dist, -p[2], -p[1]) } print(s) } } # low concurr vs high concurr for (kern in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, 1, kern], a[, dist, 2, kern]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("kern%d dist%d: no significance", kern, dist) } else { s <- sprintf("kern%d dist%d: [%.2f, %.2f]%%", kern, dist, -p[2], -p[1]) } print(s) } } $ R -q -s -f raw_data_fio.r Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -132.15, df = 11.177, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -3386.514 -3275.766 sample estimates: mean of x mean of y 8410.71 11741.85 [1] "concurr1 dist1: [38.95, 40.26]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -9.5917, df = 9.4797, p-value = 3.463e-06 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -967.8353 -600.7307 sample estimates: mean of x mean of y 14576.17 15360.45 [1] "concurr1 dist2: [4.12, 6.64]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -37.744, df = 17.33, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1508.328 -1348.850 sample estimates: mean of x mean of y 14564.90 15993.49 [1] "concurr1 dist3: [9.26, 10.36]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -30.144, df = 9.3334, p-value = 1.281e-10 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -4137.381 -3562.653 sample estimates: mean of x mean of y 8417.45 12267.47 [1] "concurr2 dist1: [42.32, 49.15]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -92.164, df = 13.276, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1279.417 -1220.931 sample estimates: mean of x mean of y 12931.52 14181.69 [1] "concurr2 dist2: [9.44, 9.89]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -49.453, df = 17.863, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -2631.656 -2417.052 sample estimates: mean of x mean of y 11512.88 14037.23 [1] "concurr2 dist3: [20.99, 22.86]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = -0.22947, df = 16.403, p-value = 0.8213 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -68.88155 55.40155 sample estimates: mean of x mean of y 8410.71 8417.45 [1] "kern1 dist1: no significance" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 91.86, df = 17.875, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1607.021 1682.287 sample estimates: mean of x mean of y 14576.17 12931.52 [1] "kern1 dist2: [-11.54, -11.02]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 67.477, df = 17.539, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 2956.815 3147.225 sample estimates: mean of x mean of y 14564.90 11512.88 [1] "kern1 dist3: [-21.61, -20.30]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = -4.1443, df = 9.0781, p-value = 0.002459 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -812.1507 -239.0833 sample estimates: mean of x mean of y 11741.85 12267.47 [1] "kern2 dist1: [2.04, 6.92]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 14.566, df = 9.1026, p-value = 1.291e-07 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 996.0064 1361.5196 sample estimates: mean of x mean of y 15360.45 14181.69 [1] "kern2 dist2: [-8.86, -6.48]%" Welch Two Sample t-test data: a[, dist, 1, kern] and a[, dist, 2, kern] t = 43.826, df = 15.275, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1861.263 2051.247 sample estimates: mean of x mean of y 15993.49 14037.23 [1] "kern2 dist3: [-12.83, -11.64]%" 99th Percentile Latency ----------------------- $ cat raw_data_fio_lat.r v <- c( # baseline 40 procs random 12649, 12387, 12518, 12518, 12518, 12387, 12518, 12518, 12387, 12256, # baseline 40 procs zipf 3458, 3294, 3425, 3294, 3294, 3359, 3752, 3326, 3294, 3458, # baseline 40 procs gaussian 3326, 3458, 3195, 3392, 3326, 3228, 3228, 3326, 3130, 3195, # baseline 80 procs random 25560, 26084, 25560, 25560, 25297, 25297, 25822, 25560, 25560, 25297, # baseline 80 procs zipf 14484, 14615, 14615, 14484, 14484, 14615, 14615, 14615, 14615, 14484, # baseline 80 procs gaussian 15664, 15664, 15533, 15533, 15533, 15664, 15795, 15533, 15664, 15533, # patched 40 procs random 7439, 7242, 7373, 7373, 7373, 7439, 7242, 7308, 7308, 7373, # patched 40 procs zipf 3261, 3425, 3392, 3294, 3359, 3556, 3228, 3490, 3458, 3359, # patched 40 procs gaussian 3687, 3523, 3556, 3523, 3752, 3654, 3884, 3490, 3392, 3720, # patched 80 procs random 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, 15008, # patched 80 procs zipf 7701, 7635, 7701, 7701, 7635, 7635, 7701, 7635, 7635, 7635, # patched 80 procs gaussian 7635, 7898, 7701, 7635, 7635, 7635, 7635, 7635, 7701, 7701 ) a <- array(v, dim = c(10, 3, 2, 2)) # baseline vs patched for (concurr in 1:2) { for (dist in 1:3) { r <- t.test(a[, dist, concurr, 1], a[, dist, concurr, 2]) print(r) p <- r$conf.int * 100 / r$estimate[1] if ((p[1] > 0 && p[2] < 0) || (p[1] < 0 && p[2] > 0)) { s <- sprintf("concurr%d dist%d: no significance", concurr, dist) } else { s <- sprintf("concurr%d dist%d: [%.2f, %.2f]%%", concurr, dist, -p[2], -p[1]) } print(s) } } $ R -q -s -f raw_data_fio_lat.r Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 123.52, df = 15.287, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 5030.417 5206.783 sample estimates: mean of x mean of y 12465.6 7347.0 [1] "concurr1 dist1: [-41.77, -40.35]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 0.23667, df = 16.437, p-value = 0.8158 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -104.7812 131.1812 sample estimates: mean of x mean of y 3395.4 3382.2 [1] "concurr1 dist2: no significance" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = -5.9754, df = 16.001, p-value = 1.94e-05 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -457.5065 -217.8935 sample estimates: mean of x mean of y 3280.4 3618.1 [1] "concurr1 dist3: [6.64, 13.95]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 134.89, df = 9, p-value = 3.437e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 10374.74 10728.66 sample estimates: mean of x mean of y 25559.7 15008.0 [1] "concurr2 dist1: [-41.97, -40.59]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 288.1, df = 13.292, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 6849.566 6952.834 sample estimates: mean of x mean of y 14562.6 7661.4 [1] "concurr2 dist2: [-47.74, -47.04]%" Welch Two Sample t-test data: a[, dist, concurr, 1] and a[, dist, concurr, 2] t = 203.64, df = 17.798, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 7848.616 8012.384 sample estimates: mean of x mean of y 15611.6 7681.1 [1] "concurr2 dist3: [-51.32, -50.27]%" _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-01-05 3:34 UTC|newest] Thread overview: 223+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-01-04 20:22 [PATCH v6 0/9] Multigenerational LRU Framework Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 1/9] mm: x86, arm64: add arch_has_hw_pte_young() Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-05 10:45 ` Will Deacon 2022-01-05 10:45 ` Will Deacon 2022-01-05 20:47 ` Yu Zhao 2022-01-05 20:47 ` Yu Zhao 2022-01-06 10:30 ` Will Deacon 2022-01-06 10:30 ` Will Deacon 2022-01-07 7:25 ` Yu Zhao 2022-01-07 7:25 ` Yu Zhao 2022-01-11 14:19 ` Will Deacon 2022-01-11 14:19 ` Will Deacon 2022-01-11 22:27 ` Yu Zhao 2022-01-11 22:27 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 2/9] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-04 21:24 ` Linus Torvalds 2022-01-04 21:24 ` Linus Torvalds 2022-01-04 20:22 ` [PATCH v6 3/9] mm/vmscan.c: refactor shrink_node() Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 4/9] mm: multigenerational lru: groundwork Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-04 21:34 ` Linus Torvalds 2022-01-04 21:34 ` Linus Torvalds 2022-01-11 8:16 ` Aneesh Kumar K.V 2022-01-11 8:16 ` Aneesh Kumar K.V 2022-01-12 2:16 ` Yu Zhao 2022-01-12 2:16 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 5/9] mm: multigenerational lru: mm_struct list Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-07 9:06 ` Michal Hocko 2022-01-07 9:06 ` Michal Hocko 2022-01-08 0:19 ` Yu Zhao 2022-01-08 0:19 ` Yu Zhao 2022-01-10 15:21 ` Michal Hocko 2022-01-10 15:21 ` Michal Hocko 2022-01-12 8:08 ` Yu Zhao 2022-01-12 8:08 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 6/9] mm: multigenerational lru: aging Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-06 16:06 ` Michal Hocko 2022-01-06 16:06 ` Michal Hocko 2022-01-06 21:27 ` Yu Zhao 2022-01-06 21:27 ` Yu Zhao 2022-01-07 8:43 ` Michal Hocko 2022-01-07 8:43 ` Michal Hocko 2022-01-07 21:12 ` Yu Zhao 2022-01-07 21:12 ` Yu Zhao 2022-01-06 16:12 ` Michal Hocko 2022-01-06 16:12 ` Michal Hocko 2022-01-06 21:41 ` Yu Zhao 2022-01-06 21:41 ` Yu Zhao 2022-01-07 8:55 ` Michal Hocko 2022-01-07 8:55 ` Michal Hocko 2022-01-07 9:00 ` Michal Hocko 2022-01-07 9:00 ` Michal Hocko 2022-01-10 3:58 ` Yu Zhao 2022-01-10 3:58 ` Yu Zhao 2022-01-10 14:37 ` Michal Hocko 2022-01-10 14:37 ` Michal Hocko 2022-01-13 9:43 ` Yu Zhao 2022-01-13 9:43 ` Yu Zhao 2022-01-13 12:02 ` Michal Hocko 2022-01-13 12:02 ` Michal Hocko 2022-01-19 6:31 ` Yu Zhao 2022-01-19 6:31 ` Yu Zhao 2022-01-19 9:44 ` Michal Hocko 2022-01-19 9:44 ` Michal Hocko 2022-01-10 15:01 ` Michal Hocko 2022-01-10 15:01 ` Michal Hocko 2022-01-10 16:01 ` Vlastimil Babka 2022-01-10 16:01 ` Vlastimil Babka 2022-01-10 16:25 ` Michal Hocko 2022-01-10 16:25 ` Michal Hocko 2022-01-11 23:16 ` Yu Zhao 2022-01-11 23:16 ` Yu Zhao 2022-01-12 10:28 ` Michal Hocko 2022-01-12 10:28 ` Michal Hocko 2022-01-13 9:25 ` Yu Zhao 2022-01-13 9:25 ` Yu Zhao 2022-01-07 13:11 ` Michal Hocko 2022-01-07 13:11 ` Michal Hocko 2022-01-07 23:36 ` Yu Zhao 2022-01-07 23:36 ` Yu Zhao 2022-01-10 15:35 ` Michal Hocko 2022-01-10 15:35 ` Michal Hocko 2022-01-11 1:18 ` Yu Zhao 2022-01-11 1:18 ` Yu Zhao 2022-01-11 9:00 ` Michal Hocko 2022-01-11 9:00 ` Michal Hocko [not found] ` <1641900108.61dd684cb0e59@mail.inbox.lv> 2022-01-11 12:15 ` Michal Hocko 2022-01-11 12:15 ` Michal Hocko 2022-01-13 17:00 ` Alexey Avramov 2022-01-13 17:00 ` Alexey Avramov 2022-01-11 14:22 ` Alexey Avramov 2022-01-11 14:22 ` Alexey Avramov 2022-01-07 14:44 ` Michal Hocko 2022-01-07 14:44 ` Michal Hocko 2022-01-10 4:47 ` Yu Zhao 2022-01-10 4:47 ` Yu Zhao 2022-01-10 10:54 ` Michal Hocko 2022-01-10 10:54 ` Michal Hocko 2022-01-19 7:04 ` Yu Zhao 2022-01-19 7:04 ` Yu Zhao 2022-01-19 9:42 ` Michal Hocko 2022-01-19 9:42 ` Michal Hocko 2022-01-23 21:28 ` Yu Zhao 2022-01-23 21:28 ` Yu Zhao 2022-01-24 14:01 ` Michal Hocko 2022-01-24 14:01 ` Michal Hocko 2022-01-10 16:57 ` Michal Hocko 2022-01-10 16:57 ` Michal Hocko 2022-01-12 1:01 ` Yu Zhao 2022-01-12 1:01 ` Yu Zhao 2022-01-12 10:17 ` Michal Hocko 2022-01-12 10:17 ` Michal Hocko 2022-01-12 23:43 ` Yu Zhao 2022-01-12 23:43 ` Yu Zhao 2022-01-13 11:57 ` Michal Hocko 2022-01-13 11:57 ` Michal Hocko 2022-01-23 21:40 ` Yu Zhao 2022-01-23 21:40 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 7/9] mm: multigenerational lru: eviction Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-11 10:37 ` Aneesh Kumar K.V 2022-01-11 10:37 ` Aneesh Kumar K.V 2022-01-12 8:05 ` Yu Zhao 2022-01-12 8:05 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 8/9] mm: multigenerational lru: user interface Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-10 10:27 ` Mike Rapoport 2022-01-10 10:27 ` Mike Rapoport 2022-01-12 8:35 ` Yu Zhao 2022-01-12 8:35 ` Yu Zhao 2022-01-12 10:31 ` Michal Hocko 2022-01-12 10:31 ` Michal Hocko 2022-01-12 15:45 ` Mike Rapoport 2022-01-12 15:45 ` Mike Rapoport 2022-01-13 9:47 ` Yu Zhao 2022-01-13 9:47 ` Yu Zhao 2022-01-13 10:31 ` Aneesh Kumar K.V 2022-01-13 10:31 ` Aneesh Kumar K.V 2022-01-13 23:02 ` Yu Zhao 2022-01-13 23:02 ` Yu Zhao 2022-01-14 5:20 ` Aneesh Kumar K.V 2022-01-14 5:20 ` Aneesh Kumar K.V 2022-01-14 6:50 ` Yu Zhao 2022-01-14 6:50 ` Yu Zhao 2022-01-04 20:22 ` [PATCH v6 9/9] mm: multigenerational lru: Kconfig Yu Zhao 2022-01-04 20:22 ` Yu Zhao 2022-01-04 21:39 ` Linus Torvalds 2022-01-04 21:39 ` Linus Torvalds 2022-01-04 20:22 ` [PATCH v6 0/9] Multigenerational LRU Framework Yu Zhao 2022-01-04 20:30 ` Yu Zhao 2022-01-04 20:30 ` Yu Zhao 2022-01-04 21:43 ` Linus Torvalds 2022-01-04 21:43 ` Linus Torvalds 2022-01-05 21:12 ` Yu Zhao 2022-01-05 21:12 ` Yu Zhao 2022-01-07 9:38 ` Michal Hocko 2022-01-07 9:38 ` Michal Hocko 2022-01-07 18:45 ` Yu Zhao 2022-01-07 18:45 ` Yu Zhao 2022-01-10 15:39 ` Michal Hocko 2022-01-10 15:39 ` Michal Hocko 2022-01-10 22:04 ` Yu Zhao 2022-01-10 22:04 ` Yu Zhao 2022-01-10 22:46 ` Jesse Barnes 2022-01-10 22:46 ` Jesse Barnes 2022-01-11 1:41 ` Linus Torvalds 2022-01-11 1:41 ` Linus Torvalds 2022-01-11 10:40 ` Michal Hocko 2022-01-11 10:40 ` Michal Hocko 2022-01-11 8:41 ` Yu Zhao 2022-01-11 8:41 ` Yu Zhao 2022-01-11 8:53 ` Holger Hoffstätte 2022-01-11 8:53 ` Holger Hoffstätte 2022-01-11 9:26 ` Jan Alexander Steffens (heftig) 2022-01-11 16:04 ` Shuang Zhai 2022-01-11 16:04 ` Shuang Zhai 2022-01-12 1:46 ` Suleiman Souhlal 2022-01-12 1:46 ` Suleiman Souhlal 2022-01-12 6:07 ` Sofia Trinh 2022-01-12 6:07 ` Sofia Trinh 2022-01-12 16:17 ` Daniel Byrne 2022-01-18 9:21 ` Yu Zhao 2022-01-18 9:21 ` Yu Zhao 2022-01-18 9:36 ` Donald Carr 2022-01-18 9:36 ` Donald Carr 2022-01-19 20:19 ` Steven Barrett 2022-01-19 20:19 ` Steven Barrett 2022-01-19 22:25 ` Brian Geffon 2022-01-19 22:25 ` Brian Geffon 2022-01-05 2:44 ` Shuang Zhai [this message] 2022-01-05 2:44 ` Shuang Zhai 2022-01-05 8:55 ` SeongJae Park 2022-01-05 8:55 ` SeongJae Park 2022-01-05 10:53 ` Yu Zhao 2022-01-05 10:53 ` Yu Zhao 2022-01-05 11:12 ` Borislav Petkov 2022-01-05 11:12 ` Borislav Petkov 2022-01-05 11:25 ` SeongJae Park 2022-01-05 11:25 ` SeongJae Park 2022-01-05 21:06 ` Yu Zhao 2022-01-05 21:06 ` Yu Zhao 2022-01-10 14:49 ` Alexey Avramov 2022-01-10 14:49 ` Alexey Avramov 2022-01-11 10:24 ` Alexey Avramov 2022-01-11 10:24 ` Alexey Avramov 2022-01-12 20:56 ` Oleksandr Natalenko 2022-01-12 20:56 ` Oleksandr Natalenko 2022-01-13 8:59 ` Yu Zhao 2022-01-13 8:59 ` Yu Zhao 2022-01-23 5:43 ` Barry Song 2022-01-23 5:43 ` Barry Song 2022-01-25 6:48 ` Yu Zhao 2022-01-25 6:48 ` Yu Zhao 2022-01-28 8:54 ` Barry Song 2022-01-28 8:54 ` Barry Song 2022-02-08 9:16 ` Yu Zhao 2022-02-08 9:16 ` Yu Zhao
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20220105024423.26409-1-szhai2@cs.rochester.edu \ --to=szhai2@cs.rochester.edu \ --cc=Michael@michaellarabel.com \ --cc=ak@linux.intel.com \ --cc=akpm@linux-foundation.org \ --cc=axboe@kernel.dk \ --cc=catalin.marinas@arm.com \ --cc=corbet@lwn.net \ --cc=dave.hansen@linux.intel.com \ --cc=hannes@cmpxchg.org \ --cc=hdanton@sina.com \ --cc=jsbarnes@google.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-doc@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mgorman@suse.de \ --cc=mhocko@kernel.org \ --cc=page-reclaim@google.com \ --cc=riel@surriel.com \ --cc=torvalds@linux-foundation.org \ --cc=vbabka@suse.cz \ --cc=will@kernel.org \ --cc=willy@infradead.org \ --cc=x86@kernel.org \ --cc=ying.huang@intel.com \ --cc=yuzhao@google.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.