* Re: high number of dropped packets/rx_missed_errors from 4.17 kernel
       [not found] ` <47586104-a816-1419-13c0-b1d297289fd5@intel.com>
@ 2020-11-18 18:47   ` Rafael J. Wysocki
  2020-12-03 19:43     ` Andrei Popa
  0 siblings, 1 reply; 3+ messages in thread
From: Rafael J. Wysocki @ 2020-11-18 18:47 UTC
To: Rafael J. Wysocki; +Cc: Andrei Popa, linux-kernel, peterz, Linux PM

On Tuesday, November 17, 2020 7:31:29 PM CET Rafael J. Wysocki wrote:
> On 11/16/2020 8:11 AM, Andrei Popa wrote:
> > Hello,
> >
> > After an update from vmlinuz-4.15.0-106-generic to vmlinuz-5.4.0-37-generic we experience, on a number of servers, a very high number of rx_missed_errors and dropped packets only on the uplink 10G interface. We have another 10G downlink interface with no problems.
> >
> > The affected servers have the following mainboards:
> > S5520HC ver E26045-455
> > S5520UR ver E22554-751
> > S5520UR ver E22554-753
> > S5000VSA
> >
> > On other 30 servers with similar mainboards and/or configs there are no dropped packets with vmlinuz-5.4.0-37-generic.
> >
> > We’ve installed vanilla 4.16 and there were no dropped packets.
> > Vanilla 4.17 had a very high number of dropped packets like the following:
> >
> > root@shaper:~# cat test
> > #!/bin/bash
> > while true
> > do
> > ethtool -S ens6f1|grep "missed_errors"
> > ifconfig ens6f1|grep RX|grep dropped
> > sleep 1
> > done
> >
> > root@shaper:~# ./test
> > rx_missed_errors: 2418845
> > RX errors 0 dropped 2418888 overruns 0 frame 0
> > rx_missed_errors: 2426175
> > RX errors 0 dropped 2426218 overruns 0 frame 0
> > rx_missed_errors: 2431910
> > RX errors 0 dropped 2431953 overruns 0 frame 0
> > rx_missed_errors: 2437266
> > RX errors 0 dropped 2437309 overruns 0 frame 0
> > rx_missed_errors: 2443305
> > RX errors 0 dropped 2443348 overruns 0 frame 0
> > rx_missed_errors: 2448357
> > RX errors 0 dropped 2448400 overruns 0 frame 0
> > rx_missed_errors: 2452539
> > RX errors 0 dropped 2452582 overruns 0 frame 0
> >
> > We did a git bisect and we’ve found that the following commit generates the high number of dropped packets:
> >
> > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Date:   Thu Apr 5 19:12:43 2018 +0200
> >
> >     cpuidle: menu: Avoid selecting shallow states with stopped tick
> >
> >     If the scheduler tick has been stopped already and the governor
> >     selects a shallow idle state, the CPU can spend a long time in that
> >     state if the selection is based on an inaccurate prediction of idle
> >     time. That effect turns out to be relevant, so it needs to be
> >     mitigated.
> >
> >     To that end, modify the menu governor to discard the result of the
> >     idle time prediction if the tick is stopped and the predicted idle
> >     time is less than the tick period length, unless the tick timer is
> >     going to expire soon.
> >
> >     Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >     Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >
> > diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> > index 267982e471e0..1bfe03ceb236 100644
> > --- a/drivers/cpuidle/governors/menu.c
> > +++ b/drivers/cpuidle/governors/menu.c
> > @@ -352,13 +352,28 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> >  	 */
> >  	data->predicted_us = min(data->predicted_us, expected_interval);
> > -	/*
> > -	 * Use the performance multiplier and the user-configurable
> > -	 * latency_req to determine the maximum exit latency.
> > -	 */
> > -	interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
> > -	if (latency_req > interactivity_req)
> > -		latency_req = interactivity_req;
>
> The tick_nohz_tick_stopped() check may be done after the above and it
> may be reworked a bit.
>
> I'll send a test patch to you shortly.

The patch is appended, but please note that it has been rebased by hand and
not tested.

Please let me know if it makes any difference.

And in the future please avoid pasting the entire kernel config to your
reports, that's problematic.

---
 drivers/cpuidle/governors/menu.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/menu.c
+++ linux-pm/drivers/cpuidle/governors/menu.c
@@ -308,18 +308,18 @@ static int menu_select(struct cpuidle_dr
 				get_typical_interval(data, predicted_us)) *
 				NSEC_PER_USEC;
 
-	if (tick_nohz_tick_stopped()) {
-		/*
-		 * If the tick is already stopped, the cost of possible short
-		 * idle duration misprediction is much higher, because the CPU
-		 * may be stuck in a shallow idle state for a long time as a
-		 * result of it.  In that case say we might mispredict and use
-		 * the known time till the closest timer event for the idle
-		 * state selection.
-		 */
-		if (data->predicted_us < TICK_USEC)
-			data->predicted_us = min_t(unsigned int, TICK_USEC,
-						   ktime_to_us(delta_next));
+	/*
+	 * If the tick is already stopped, the cost of possible short idle
+	 * duration misprediction is much higher, because the CPU may be stuck
+	 * in a shallow idle state for a long time as a result of it.  In that
+	 * case, say we might mispredict and use the known time till the closest
+	 * timer event for the idle state selection, unless that event is going
+	 * to occur within the tick time frame (in which case the CPU will be
+	 * woken up from whatever idle state it gets into soon enough anyway).
+	 */
+	if (tick_nohz_tick_stopped() && data->predicted_us < TICK_USEC &&
+	    delta_next >= TICK_NSEC) {
+		data->predicted_us = ktime_to_us(delta_next);
 	} else {
 		/*
 		 * Use the performance multiplier and the user-configurable
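
To make the behavioural difference in the test patch concrete, here is a small Python model of the two versions of the predicted-idle-time adjustment. It is only a sketch: `adjust_before`, `adjust_after` and the value `TICK_USEC = 4000` (a 4 ms tick, i.e. HZ=250) are assumptions made for illustration, not kernel code, and the model leaves out the `else` branch that applies the performance multiplier to the latency limit.

```python
# Toy model of the predicted-idle-time adjustment before and after the test
# patch. Not kernel code: names mirror the diff, the tick length is assumed.

NSEC_PER_USEC = 1_000
TICK_USEC = 4_000                       # assumption: HZ=250, i.e. a 4 ms tick
TICK_NSEC = TICK_USEC * NSEC_PER_USEC


def adjust_before(tick_stopped: bool, predicted_us: int, delta_next_ns: int) -> int:
    """Removed logic: with the tick stopped and a sub-tick prediction,
    use min(TICK_USEC, time until the next timer event)."""
    if tick_stopped and predicted_us < TICK_USEC:
        return min(TICK_USEC, delta_next_ns // NSEC_PER_USEC)
    return predicted_us


def adjust_after(tick_stopped: bool, predicted_us: int, delta_next_ns: int) -> int:
    """Added logic: override the prediction only when the next timer event
    is at least one tick period away; otherwise keep the prediction."""
    if tick_stopped and predicted_us < TICK_USEC and delta_next_ns >= TICK_NSEC:
        return delta_next_ns // NSEC_PER_USEC
    return predicted_us


if __name__ == "__main__":
    cases = [
        (True, 50, 1_000_000),    # next timer in 1 ms, inside the tick window
        (True, 50, 20_000_000),   # next timer in 20 ms, beyond the tick window
        (False, 50, 20_000_000),  # tick still running: no override either way
    ]
    for stopped, predicted, delta in cases:
        print(stopped, predicted, delta,
              adjust_before(stopped, predicted, delta),
              adjust_after(stopped, predicted, delta))
```

Under these assumptions, with the tick stopped and a 50 us prediction, a timer 1 ms away gives 1000 us before the patch and the unchanged 50 us after it, while a timer 20 ms away gives 4000 us before and 20000 us after.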
* Re: high number of dropped packets/rx_missed_errors from 4.17 kernel
  2020-11-18 18:47   ` high number of dropped packets/rx_missed_errors from 4.17 kernel Rafael J. Wysocki
@ 2020-12-03 19:43     ` Andrei Popa
  2020-12-04  6:45       ` Andrei Popa
  0 siblings, 1 reply; 3+ messages in thread
From: Andrei Popa @ 2020-12-03 19:43 UTC
To: Rafael J. Wysocki; +Cc: Rafael J. Wysocki, linux-kernel, peterz, Linux PM

Hi,

On what kernel version should I try the patch? I tried on 5.9 and it doesn't build.

> On 18 Nov 2020, at 20:47, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> [...]
>
> The patch is appended, but please note that it has been rebased by hand and
> not tested.
>
> Please let me know if it makes any difference.
>
> [...]
* Re: high number of dropped packets/rx_missed_errors from 4.17 kernel
  2020-12-03 19:43     ` Andrei Popa
@ 2020-12-04  6:45       ` Andrei Popa
  0 siblings, 0 replies; 3+ messages in thread
From: Andrei Popa @ 2020-12-04 6:45 UTC
To: Rafael J. Wysocki; +Cc: Rafael J. Wysocki, linux-kernel, peterz, Linux PM

Hi,

I’ve applied your patch on kernel 4.17.0 and dropped packets and rx_missed_errors are still present, though they are increasing at a lower rate.

root@shaper:~# ./test
rx_missed_errors: 2135
RX errors 0 dropped 2155 overruns 0 frame 0
sleeping 60 seconds
rx_missed_errors: 2433
RX errors 0 dropped 2459 overruns 0 frame 0
sleeping 60 seconds
rx_missed_errors: 2433
RX errors 0 dropped 2465 overruns 0 frame 0
sleeping 60 seconds
rx_missed_errors: 2526
RX errors 0 dropped 2564 overruns 0 frame 0
sleeping 60 seconds

> On 3 Dec 2020, at 21:43, Andrei Popa <andreipopad@gmail.com> wrote:
>
> Hi,
>
> On what kernel version should I try the patch ? I tried on 5.9 and it doesn't build.
>
> [...]
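
The "lower rate" in the last message can be quantified from the rx_missed_errors values quoted in the thread. The sketch below assumes the samples are evenly spaced: 1 second apart for the unpatched 4.17 run (the script sleeps for 1 second) and 60 seconds apart for the patched run (the output says "sleeping 60 seconds"). `mean_rate` is a hypothetical helper, the time spent running `ethtool` and `ifconfig` is ignored, and the two runs may not have seen the same traffic.

```python
def mean_rate(samples, interval_s):
    """Average counter increase per second over evenly spaced samples."""
    if len(samples) < 2 or interval_s <= 0:
        raise ValueError("need at least two samples and a positive interval")
    elapsed_s = (len(samples) - 1) * interval_s
    return (samples[-1] - samples[0]) / elapsed_s


# rx_missed_errors values as reported in the thread.
UNPATCHED_4_17 = [2418845, 2426175, 2431910, 2437266, 2443305, 2448357, 2452539]
PATCHED_4_17 = [2135, 2433, 2433, 2526]

if __name__ == "__main__":
    before = mean_rate(UNPATCHED_4_17, interval_s=1)   # assumed 1 s spacing
    after = mean_rate(PATCHED_4_17, interval_s=60)     # assumed 60 s spacing
    print(f"unpatched 4.17: {before:.1f} missed packets/s")
    print(f"patched 4.17:   {after:.2f} missed packets/s")
```

With those spacing assumptions the unpatched run averages roughly 5,600 missed packets per second and the patched run about 2 per second, which matches the report that errors persist but grow far more slowly.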