All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation
Date: Fri, 17 Apr 2020 17:17:07 +0200	[thread overview]
Message-ID: <20200417151707.GG20730@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200417144244.gdoblao656l6ktkr@box>

On Fri, Apr 17, 2020 at 05:42:44PM +0300, Kirill A. Shutemov wrote:
> On Fri, Apr 17, 2020 at 02:18:28PM +0200, Peter Zijlstra wrote:
> Do you mean that it is not aligned to '(' on the previous line?
> 
> Okay, I'll fix it up (and change my vim setup). But I find indenting with
> spaces disgusting.

set cino=:0(0


> > >  static void cpa_flush(struct cpa_data *data, int cache)
> > >  {
> > > +	LIST_HEAD(pgtables);
> > > +	struct page *page, *tmp;
> > 
> > xmas fail
> 
> Emm. What are rules here?
> 
> > >  	struct cpa_data *cpa = data;
> > >  	unsigned int i;

Basically we (tip) strive for the inverse xmas tree thing, so here that
would be:

	struct cpa_data *cpa = data;
+	struct page *page, *tmp;
+	LIST_HEAD(pgtables);
	unsigned int i;


> > > +	pr_debug("2M restored at %#lx\n", addr);
> > 
> > While I appreciate it's usefulness while writing this, I do think we can
> > do without that print once we know it works.
> 
> Even with pr_debug()? I shouldn't be noisy for most users.

I always have debug on; also there is no counterpart on split either.

> > > +/*
> > > + * Restore PMD and PUD pages in the kernel mapping around the address where
> > > + * possible.
> > > + *
> > > + * Caller must flush TLB and free page tables queued on the list before
> > > + * touching the new entries. CPU must not see TLB entries of different size
> > > + * with different attributes.
> > > + */
> > > +static void restore_large_pages(unsigned long addr, struct list_head *pgtables)
> > > +{
> > > +	pgd_t *pgd;
> > > +	p4d_t *p4d;
> > > +	pud_t *pud;
> > > +
> > > +	addr &= PUD_MASK;
> > > +
> > > +	spin_lock(&pgd_lock);
> > > +	pgd = pgd_offset_k(addr);
> > > +	if (pgd_none(*pgd))
> > > +		goto out;
> > > +	p4d = p4d_offset(pgd, addr);
> > > +	if (p4d_none(*p4d))
> > > +		goto out;
> > > +	pud = pud_offset(p4d, addr);
> > > +	if (!pud_present(*pud) || pud_large(*pud))
> > > +		goto out;
> > > +
> > > +	restore_pud_page(pud, addr, pgtables);
> > > +out:
> > > +	spin_unlock(&pgd_lock);
> > > +}
> > 
> > I find this odd, at best. AFAICT this does not attempt to reconstruct a
> > PMD around @addr when possible. When the first PMD of the PUD can't be
> > reconstructed, we give up entirely.
> 
> No, you misread. If the first PMD is not suitable, we give up
> reconstructing PUD page, but we still look at all PMD of the PUD range.

Aah, indeed. I got my brain in a twist reading that pud function.

> But this might be excessive. What you are proposing below should be fine
> and more efficient. If we do everything right the rest of PMDs in the PUD
> have to already large or not suitable for reconstructing.

Just so.

> We might still look at the rest of PMDs for CONFIG_CPA_DEBUG, just to make
> sure we don't miss some corner case where we didn't restore a PMD.
> 
> (Also I think about s/restore/reconstruct/g)

Right, and WARN if they do succeed collapsing ;-)

  reply	other threads:[~2020-04-17 15:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-16 21:32 [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation Kirill A. Shutemov
2020-04-16 22:03 ` Dave Hansen
2020-04-16 22:12   ` Kirill A. Shutemov
2020-04-16 22:38     ` Dave Hansen
2020-04-17  0:52 ` Mika Penttilä
2020-04-17 11:42   ` Kirill A. Shutemov
2020-04-17 12:05     ` Mika Penttilä
2020-04-17 12:18 ` Peter Zijlstra
2020-04-17 14:42   ` Kirill A. Shutemov
2020-04-17 15:17     ` Peter Zijlstra [this message]
2020-04-17 15:47 ` Peter Zijlstra
2020-04-17 16:54   ` Kirill A. Shutemov
2020-04-25 11:43 ` [x86/mm/pat] ae64ac1a83: BUG:Bad_page_state_in_process kernel test robot
2020-04-25 11:43   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200417151707.GG20730@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.