From: Matthew Wilcox <willy@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux MM <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH 1/3] pfn_t: Change the encoding
Date: Sat, 12 Mar 2016 13:30:05 -0500
Message-ID: <20160312183005.GA2525@linux.intel.com>
In-Reply-To: <CAPcyv4g82US298_mCd75toj9kEeyDhw0cP_Ott0R8fOydWNsSg@mail.gmail.com>

On Fri, Mar 11, 2016 at 01:40:20PM -0800, Dan Williams wrote:
> On Fri, Mar 11, 2016 at 1:13 PM, Matthew Wilcox
> <matthew.r.wilcox@intel.com> wrote:
> > By moving the flag bits to the bottom, we encourage commonality
> > between SGs with pages and those using pfn_t.  We can also then insert
> > a pfn_t into a radix tree, as it uses the same two bits for indirect &
> > exceptional indicators.
> 
> It's not immediately clear to me what we gain with SG entry
> commonality.  The down side is that we lose the property that
> pfn_to_pfn_t() is a nop.  This was Dave's suggestion so that the
> nominal case did not change the binary layout of a typical pfn.

I understand that motivation!

> Can we just bit swizzle a pfn_t on insertion/retrieval from the radix?

Of course we *can*, but we end up doing more swizzling that way than we
do this way.  In the Brave New Future where we're storing pfn_t in the
radix tree, on a page fault we find the pfn_t in the radix tree and then
want to insert it into the page tables.  With swizzling, DAX would first
have to convert the radix tree entry to a pfn_t, then the page table code
has to
convert the pfn_t into a pte/pmd/pud (which we currently do by converting
a pfn_t to a pfn, then converting the pfn to a pte/pmd/pud, but I assume
that either the compiler optimises that into a single conversion, or we'll
add pfn_t_pte to each architecture in future if it's actually a problem).

Much easier to look up a pfn_t in the radix tree and pass it directly
to vm_insert_mixed().
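
To make the trade-off concrete, here is a minimal user-space sketch of the two
encodings being compared.  It is not the code from the patch: every name
(demo_pfn_t, DEMO_FLAG_BITS, the demo_* helpers) and the choice of exactly two
flag bits are assumptions made for illustration.  It shows that with the flags
at the top, converting a flagless pfn is a no-op but storing the value in a
slot that reserves its low two bits needs a swizzle, while with the flags at
the bottom the pfn conversion is a shift and the slot conversion is the no-op.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two flag bits, everything else is the PFN. */
#define DEMO_FLAG_BITS 2
#define DEMO_FLAG_MASK ((UINT64_C(1) << DEMO_FLAG_BITS) - 1)
#define DEMO_TOP_SHIFT (64 - DEMO_FLAG_BITS)

typedef struct { uint64_t val; } demo_pfn_t;

/* Flags at the bottom: the PFN is shifted up to make room. */
static inline demo_pfn_t demo_low_encode(uint64_t pfn, uint64_t flags)
{
	demo_pfn_t p = { (pfn << DEMO_FLAG_BITS) | (flags & DEMO_FLAG_MASK) };
	return p;
}

static inline uint64_t demo_low_to_pfn(demo_pfn_t p)
{
	return p.val >> DEMO_FLAG_BITS;
}

/* Flags at the top: a PFN with no flags keeps its binary layout. */
static inline demo_pfn_t demo_high_encode(uint64_t pfn, uint64_t flags)
{
	demo_pfn_t p = { pfn | ((flags & DEMO_FLAG_MASK) << DEMO_TOP_SHIFT) };
	return p;
}

static inline uint64_t demo_high_to_pfn(demo_pfn_t p)
{
	return p.val & ~(DEMO_FLAG_MASK << DEMO_TOP_SHIFT);
}

/*
 * A tree slot that reserves its low two bits for tags (an assumption
 * standing in for the radix tree's indirect/exceptional bits).
 */
static inline uint64_t demo_slot_from_low(demo_pfn_t p)
{
	return p.val;			/* already in slot format */
}

static inline uint64_t demo_slot_from_high(demo_pfn_t p)
{
	uint64_t flags = p.val >> DEMO_TOP_SHIFT;

	/* Swizzle: move the flags from the top bits to the bottom bits. */
	return (demo_high_to_pfn(p) << DEMO_FLAG_BITS) | flags;
}
```

Under these assumptions the fault path with low-bit flags is "look up slot,
use it as a pfn_t", whereas the high-bit layout pays for a swizzle on every
insertion and retrieval in addition to the later pfn_t-to-pfn conversion.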

If there's any part of the kernel that is doing a *lot* of conversion
between pfn_t and pfn, that surely indicates a place in the kernel where
we need to convert an interface from pfn to pfn_t.

(It occurs to me we can make the code simpler on architectures that
don't support PUDs.  The PFN_HUGE bit is still available to distinguish
between PMDs and PTEs, but we won't need to clear the bottom bit of the
PFN if PFN_HUGE is set, since nobody can add a PUD pfn to the radix tree).

Thread overview:
2016-03-11 21:13 [PATCH 0/3] Make pfn_t suitable for placing in the radix tree Matthew Wilcox
2016-03-11 21:13 ` [PATCH 1/3] pfn_t: Change the encoding Matthew Wilcox
2016-03-11 21:40   ` Dan Williams
2016-03-12 18:30     ` Matthew Wilcox [this message]
2016-03-13 23:09       ` Dan Williams
2016-03-14 15:00         ` Matthew Wilcox
2016-03-11 21:13 ` [PATCH 2/3] pfn_t: Support for huge PFNs Matthew Wilcox
2016-03-11 21:13 ` [PATCH 3/3] pfn_t: New functions pfn_t_add and pfn_t_cmp Matthew Wilcox
