linux-kernel.vger.kernel.org archive mirror
From: Reinette Chatre <reinette.chatre@intel.com>
To: Jarkko Sakkinen <jarkko@kernel.org>,
	<dave.hansen@linux.intel.com>, <tglx@linutronix.de>,
	<bp@alien8.de>, <mingo@redhat.com>, <linux-sgx@vger.kernel.org>,
	<x86@kernel.org>
Cc: <seanjc@google.com>, <tony.luck@intel.com>, <hpa@zytor.com>,
	<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>
Subject: Re: [PATCH] x86/sgx: Fix free page accounting
Date: Mon, 8 Nov 2021 11:48:18 -0800
Message-ID: <d7a6dedb-03c5-fad1-e112-c912473c7214@intel.com>
In-Reply-To: <2a0b84575733e4aaee13926387d997c35ac23130.camel@kernel.org>

Hi Jarkko,

On 11/7/2021 8:47 AM, Jarkko Sakkinen wrote:
> On Sun, 2021-11-07 at 18:45 +0200, Jarkko Sakkinen wrote:
>> On Thu, 2021-11-04 at 11:28 -0700, Reinette Chatre wrote:
>>> The consequence of sgx_nr_free_pages not being protected is that
>>> its value may not accurately reflect the actual number of free
>>> pages on the system, impacting the availability of free pages in
>>> support of many flows. The problematic scenario is when the
>>> reclaimer never runs because it believes there to be sufficient
>>> free pages while any attempt to allocate a page fails because there
>>> are no free pages available. The worst scenario observed was a
>>> user space hang because of repeated page faults caused by
>>> no free pages ever made available.
>>
>> Can you go in detail with the "concrete scenario" in the commit
>> message? It does not have to describe all the possible scenarios
>> but at least one sequence of events.


I provided significant detail regarding the "concrete scenario" in a 
separate response to Greg:
https://lore.kernel.org/lkml/a636290d-db04-be16-1c86-a8dcc3719b39@intel.com/

That message describes the test that was run (it hangs before the fix 
and completes after it), the traces captured when the test hung, an 
analysis of those traces with the root cause of the hang, and traces 
captured with the fix applied that show why user space is able to make 
progress and why the test completes.

Unfortunately the traces I provided wrapped and are not easy to read. 
The essential message of the two traces is this: the first trace 
(before the fix) shows the system stuck (almost 100% of the time) in the 
SGX page fault handler, unable to make any progress, so user space 
hangs. The second trace (after the fix) shows the system splitting its 
time between the SGX page fault handler and the reclaimer, which enables 
user space to make progress and the test to complete.

> I.e. I don't have anything fundamentally against changing it to
> atomic but the commit message is completely lacking the stimulus
> of changing anything.

The problem needing to be fixed is that sgx_nr_free_pages is not updated 
safely on systems with more than one NUMA node.
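
As an illustration only, here is a minimal user-space sketch of the
accounting pattern under discussion. It is not the kernel patch. All
names in it (struct node, nr_free_pages, run_accounting_demo, the node
and page counts) are hypothetical, and it assumes each node has its own
lock while one global counter is shared by all nodes. Because two
threads holding different per-node locks do not exclude each other, a
plain counter updated under those locks could lose increments; making
the counter atomic removes that dependency on the locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NR_NODES       2        /* hypothetical: two NUMA nodes */
#define PAGES_PER_NODE 100000   /* hypothetical workload size */

/* Each node has its own lock and its own count of free pages. */
struct node {
	pthread_mutex_t lock;
	long free_on_node;      /* protected by this node's lock */
};

static struct node nodes[NR_NODES];

/*
 * Global count shared by all nodes. The per-node locks do not
 * serialize updates coming from different nodes, so the counter
 * itself is atomic.
 */
static atomic_long nr_free_pages;

/* Simulate one node returning pages to its free list. */
static void *free_pages_on_node(void *arg)
{
	struct node *n = arg;

	for (long i = 0; i < PAGES_PER_NODE; i++) {
		pthread_mutex_lock(&n->lock);
		n->free_on_node++;
		pthread_mutex_unlock(&n->lock);

		/* Safe without any lock: atomic read-modify-write. */
		atomic_fetch_add(&nr_free_pages, 1);
	}
	return NULL;
}

/*
 * Run one thread per node and return the drift between the global
 * counter and the sum of the per-node counts (0 means consistent).
 */
long run_accounting_demo(void)
{
	pthread_t threads[NR_NODES];
	long sum = 0;

	atomic_store(&nr_free_pages, 0);

	for (int i = 0; i < NR_NODES; i++) {
		pthread_mutex_init(&nodes[i].lock, NULL);
		nodes[i].free_on_node = 0;
		int rc = pthread_create(&threads[i], NULL,
					free_pages_on_node, &nodes[i]);
		assert(rc == 0);
		(void)rc;
	}

	for (int i = 0; i < NR_NODES; i++) {
		pthread_join(threads[i], NULL);
		pthread_mutex_destroy(&nodes[i].lock);
		sum += nodes[i].free_on_node;
	}

	return atomic_load(&nr_free_pages) - sum;
}
```

Calling run_accounting_demo() from a small main() and checking that it
returns 0 confirms that the global count matches the per-node totals.
If nr_free_pages were a plain long incremented with ++ outside a common
lock, the result would be a data race and the drift could be nonzero.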

Reinette
