From: "Paul E. McKenney" <paulmck@kernel.org>
To: SeongJae Park <sjpark@amazon.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	David Miller <davem@davemloft.net>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jakub Kicinski <kuba@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	sj38.park@gmail.com, netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	SeongJae Park <sjpark@amazon.de>,
	snu@amazon.com, amit@kernel.org, stable@vger.kernel.org
Subject: Re: Re: Re: Re: Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life cycle change
Date: Wed, 6 May 2020 07:41:51 -0700
Message-ID: <20200506144151.GZ2869@paulmck-ThinkPad-P72>
In-Reply-To: <20200506125926.29844-1-sjpark@amazon.com>

On Wed, May 06, 2020 at 02:59:26PM +0200, SeongJae Park wrote:
> TL;DR: It was not the kernel's fault, but the benchmark program's.
> 
> So, the problem is reproducible using lebench[1] only.  I carefully read
> its code again.
> 
> Before running the "poll big" sub test in which the problem occurred, lebench
> executes the "context switch" sub test.  For that test, it sets its own cpu
> affinity[2] and process priority[3] to '0' and '-20', respectively.  However,
> it does not restore the original values after the "context switch" sub test
> finishes.  As a result, the "select big" sub test also runs bound to CPU 0
> with the lowest nice value.  It can therefore disturb the RCU callback thread
> for CPU 0, which processes the deferred deallocations of the sockets, and
> that in turn triggers the OOM.
> 
> We confirmed that the problem disappears when the RCU callbacks are offloaded
> from CPU 0 using the rcu_nocbs=0 boot parameter, or when the affinity and/or
> priority is simply restored.
> 
> Someone _might_ still argue that this is a kernel problem because the problem
> didn't occur on the old kernels prior to Al's patches.  However, setting
> the affinity and priority was possible only because the program had been
> granted permission to do so.  Therefore, it would be reasonable to blame the
> system administrators rather than the kernel.
> 
> So, please ignore this patchset, and apologies for the confusion.  If you
> still have doubts or need more tests, please let me know.
> 
> [1] https://github.com/LinuxPerfStudy/LEBench
> [2] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L820
> [3] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L822

Thank you for chasing this down!

I have had this sort of thing on my list as a potential issue, but given
that it is now really showing up, it sounds like it is time to bump
up its priority a bit.  Of course there are limits, so if userspace is
running at any of the real-time priorities, making sufficient CPU time
available to RCU's kthreads becomes userspace's responsibility.  But if
everything is running at SCHED_OTHER (which is the case here, correct?),
then it is reasonable for RCU to do some work to avoid this situation.

But still, yes, the immediate job is fixing the benchmark.  ;-)

							Thanx, Paul

PS.  Why not just attack all potential issues on my list?  Because I
     usually learn quite a bit from seeing the problem actually happen.
     And sometimes other changes in RCU eliminate the potential issue
     before it has a chance to happen.
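
The benchmark-side fix discussed above (restoring the affinity and priority
once the "context switch" sub test is done) could be sketched roughly as
follows.  This is only an illustration and not LEBench's actual code: the
struct sched_snapshot type and the sched_snapshot_save(), sched_pin() and
sched_snapshot_restore() helpers are hypothetical names, built on the standard
sched_getaffinity(2), sched_setaffinity(2), getpriority(2) and setpriority(2)
calls.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper type: scheduling state to put back after a sub test. */
struct sched_snapshot {
	cpu_set_t affinity;	/* CPUs the process was allowed to run on */
	int nice;		/* nice value of the process */
};

/* Record the current affinity and nice value.  Returns 0 on success. */
int sched_snapshot_save(struct sched_snapshot *snap)
{
	CPU_ZERO(&snap->affinity);
	if (sched_getaffinity(0, sizeof(snap->affinity), &snap->affinity))
		return -1;

	/* getpriority() may legitimately return -1, so check errno as well. */
	errno = 0;
	snap->nice = getpriority(PRIO_PROCESS, 0);
	if (snap->nice == -1 && errno)
		return -1;
	return 0;
}

/* Pin to one CPU and set the nice value, e.g. sched_pin(0, -20). */
int sched_pin(int cpu, int niceval)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set))
		return -1;
	return setpriority(PRIO_PROCESS, 0, niceval);
}

/* Put back what sched_snapshot_save() recorded.  Returns 0 on success. */
int sched_snapshot_restore(const struct sched_snapshot *snap)
{
	if (sched_setaffinity(0, sizeof(snap->affinity), &snap->affinity))
		return -1;
	return setpriority(PRIO_PROCESS, 0, snap->nice);
}
```

A sub test would call sched_snapshot_save() first, then sched_pin(0, -20), run
its measurement, and finally call sched_snapshot_restore() so that later sub
tests do not inherit the pinning.  Note that lowering the nice value normally
requires CAP_SYS_NICE (or a suitable RLIMIT_NICE), so sched_pin(0, -20) is
expected to fail for an unprivileged process.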
