On Mon, Sep 18, 2017 at 06:33:20PM +0300, Tariq Toukan wrote:
> 
> On 18/09/2017 10:44 AM, Aaron Lu wrote:
> > On Mon, Sep 18, 2017 at 03:34:47PM +0800, Aaron Lu wrote:
> > > On Sun, Sep 17, 2017 at 07:16:15PM +0300, Tariq Toukan wrote:
> > > > 
> > > > It's nice to have the option to dynamically play with the parameter.
> > > > But maybe we should also think about changing the default fraction
> > > > guaranteed to the PCP, so that unaware admins of networking servers
> > > > would also benefit.
> > > 
> > > I collected some performance data with will-it-scale/page_fault1 in
> > > process mode on different machines with different pcp->batch sizes,
> > > starting from the default 31 (calculated by zone_batchsize(); 31 is
> > > the standard value for any zone that has more than 1/2MiB of memory)
> > > and incrementing by 31 up to 527. The PCP's upper limit is 6*batch.
> > > 
> > > An image is plotted and attached: batch_full.png ("full" here means
> > > the number of processes started equals the CPU count).
> > 
> > To be clear: the X-axis is the batch size (31, 62, 93, ..., 527) and
> > the Y-axis is the per_process_ops value generated by will-it-scale;

One correction here: the Y-axis isn't per_process_ops but
per_process_ops * nr_processes. Still, higher is better.

> > higher is better.
> > 
> > > From the image:
> > > - The EX machines all see throughput increase with batch size,
> > >   peaking at around batch_size=310, then falling.
> > > - Among the EP machines, Haswell-EP and Broadwell-EP also see
> > >   throughput increase with batch size, peaking at batch_size=279 and
> > >   then falling; batch_size=310 also delivers a pretty good result.
> > >   Skylake-EP is quite different: it doesn't see any obvious
> > >   throughput increase after batch_size=93; the trend is still
> > >   upward, but only slightly, finally peaking at batch_size=403
> > >   before falling. Ivybridge-EP behaves much like the desktop
> > >   machines.
> > > - The desktop machines do not see any obvious change with increased
> > >   batch_size.
> > > 
> > > So the default batch size (31) doesn't deliver a good enough result;
> > > we should probably change the default value.
> 
> Thanks, Aaron, for sharing your experiment results.
> That's a good analysis of the effect of the batch value.
> I agree with your conclusion.
> 
> From a networking perspective, we should reconsider the defaults so
> that we can keep up with ever-increasing NIC line rates.
> Not only for pcp->batch, but also for pcp->high.

I guess I didn't make it clear in my last email: when pcp->batch is
changed, pcp->high is changed with it. Their relationship is:
pcp->high = pcp->batch * 6.

Manipulating percpu_pagelist_fraction can increase pcp->high, but not
pcp->batch (which currently has an upper limit of 96).

My test shows that even with pcp->high held the same, changing
pcp->batch can further improve will-it-scale's performance. E.g. in
the two cases below, pcp->high is 1860 in both, but with different
pcp->batch values:

                will-it-scale         native_queued_spin_lock_slowpath (perf)
pcp->batch=96   15762348              79.95%
pcp->batch=310  19291492 (+22.3%)     74.87% (-5.1%)

Granted, this is the case for will-it-scale and may not apply to your
workload.

I have a small patch that adds a batch interface for debugging:
echoing a value sets batch, and high becomes batch * 6. You are welcome
to give it a try if you think it's worthwhile (attached).

Regards,
Aaron
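
P.S. To make the two code paths concrete, here is a small userspace
model of how pcp->high and pcp->batch relate on each path. It is only
a sketch, not the attached patch: the high = 6 * batch rule and the
cap of 96 are as discussed above, the batch = high / 4 derivation on
the fraction path reflects my reading of mm/page_alloc.c, and the
function names are made up for illustration.

#include <stdio.h>

/*
 * Batch path (the default, and what the attached debug patch exposes):
 * a batch value is chosen and high follows as 6 * batch.
 */
static void set_from_batch(unsigned long batch,
			   unsigned long *high_out, unsigned long *batch_out)
{
	*batch_out = batch > 1 ? batch : 1;
	*high_out = 6 * batch;
}

/*
 * Fraction path (percpu_pagelist_fraction): high is set directly and
 * batch is derived from it, capped at 96 (PAGE_SHIFT * 8 on 4K pages).
 */
static void set_from_high(unsigned long high,
			  unsigned long *high_out, unsigned long *batch_out)
{
	unsigned long batch = high / 4;

	if (batch < 1)
		batch = 1;
	if (batch > 96)
		batch = 96;

	*high_out = high;
	*batch_out = batch;
}

int main(void)
{
	unsigned long high, batch;

	/* Debug patch: writing 310 gives batch=310, high=1860. */
	set_from_batch(310, &high, &batch);
	printf("batch path:    batch=%lu high=%lu\n", batch, high);

	/* Fraction knob: even with high=1860, batch is capped at 96. */
	set_from_high(1860, &high, &batch);
	printf("fraction path: batch=%lu high=%lu\n", batch, high);

	return 0;
}

This is why the two rows in the table above can have the same
pcp->high (1860) but different pcp->batch values.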
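
If you want to observe the cap without any patch: raise pcp->high
through the existing knob, e.g. "echo 8 >
/proc/sys/vm/percpu_pagelist_fraction" (8 is the smallest fraction the
sysctl accepts, so it yields the largest high), then check the per-cpu
"high:" and "batch:" lines in /proc/zoneinfo; batch stays capped at 96
no matter how large high becomes.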