From: Matthew Wilcox <willy@infradead.org>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: linux-mm@kvack.org
Subject: Re: How to use huge pages in drivers?
Date: Tue, 3 Sep 2019 11:42:30 -0700
Message-ID: <20190903184230.GJ29434@bombadil.infradead.org>
In-Reply-To: <20190903182627.GA6079@nautica>

On Tue, Sep 03, 2019 at 08:26:27PM +0200, Dominique Martinet wrote:
> Some context first. I'm inquiring in the context of mckernel[1], a
> lightweight kernel that runs alongside Linux (basically, it offlines
> a few/most cores, reserves some memory, and boots a second OS on that
> to run HPC applications).
> Being brutally honest here, this is mostly research and anyone here
> looking into it will probably scream, but I might as well try not to
> add too many more reasons to do so....
> 
> One of the mechanisms here is that sometimes we want to access the
> mckernel memory from Linux (either from the process that spawned the
> mckernel-side process or from a driver in Linux), and to do that we
> have mapped the mckernel-side virtual memory range into that process
> so it can page fault.
> The (horrible) function doing that, rus_vm_fault, can be found
> here[2] - it sends a message to the other side to identify the
> physical address corresponding to what we reserved earlier, and maps
> it quite manually.
> 
> At this point we can know whether it was a huge page (very likely) or
> not; I'm observing a huge performance difference with some
> interconnects if I add a big kludge emulating huge pages here
> (directly manipulating the process's page table), so I'd very much
> like to use huge pages when we know a huge page has been mapped on
> the other side.
> 
> What I'd like to know is:
>  - we know (assuming the other side isn't too buggy, but if it is
> we're screwed anyway) exactly which huge-page-sized physical memory
> range has been mapped on the other side; is there a way to manually
> gather the corresponding pages and merge them into a huge page?

You're using the word "page" here, but I suspect what you really mean
is "pfn" or "pte".  As you've described it, it doesn't matter what
data structure Linux is using for the memory, since Linux doesn't know
about the memory.
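
A quick illustrative sketch of the distinction ("phys" here is just a
placeholder for some physical address, not anything in your code):

	/* A pfn is simply the physical frame number of an address: */
	unsigned long pfn = phys >> PAGE_SHIFT;
	/*
	 * A pte (or pmd) is the page-table entry that maps a pfn at
	 * some virtual address, while a struct page only exists for
	 * memory that Linux itself manages:
	 */
	struct page *page = pfn_to_page(pfn);	/* invalid for memory
						   Linux doesn't know */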

We have vmf_insert_pfn_pmd() which is designed to be called from your
->huge_fault handler.  See dev_dax_huge_fault() -> __dev_dax_pmd_fault()
for an example.  It's a fairly new mechanism, so I don't think it's
popular with device drivers yet.

All you really need is the physical address of the memory to make this work.
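
To make that concrete, here is a minimal, untested sketch of such a
handler, loosely modelled on __dev_dax_pmd_fault().  Everything named
"my_*" is a hypothetical placeholder; it assumes the backing physical
range is contiguous and that the VMA starts PMD-aligned:

#include <linux/mm.h>
#include <linux/huge_mm.h>
#include <linux/pfn_t.h>

/* Hypothetical: base of the reserved physical range, found at probe. */
static phys_addr_t my_dev_phys_base;

static vm_fault_t my_huge_fault(struct vm_fault *vmf,
				enum page_entry_size pe_size)
{
	unsigned long pmd_addr = vmf->address & PMD_MASK;
	phys_addr_t phys;
	pfn_t pfn;

	/* Let the core fall back to PTE-sized faults for anything else. */
	if (pe_size != PE_SIZE_PMD)
		return VM_FAULT_FALLBACK;

	/* Don't map past the ends of the VMA. */
	if (pmd_addr < vmf->vma->vm_start ||
	    pmd_addr + PMD_SIZE > vmf->vma->vm_end)
		return VM_FAULT_SIGBUS;

	/* Assumes a contiguous range behind a PMD-aligned vm_start. */
	phys = my_dev_phys_base + (pmd_addr - vmf->vma->vm_start);
	pfn = phys_to_pfn_t(phys, PFN_DEV | PFN_MAP);

	return vmf_insert_pfn_pmd(vmf, pfn, vmf->flags & FAULT_FLAG_WRITE);
}

static const struct vm_operations_struct my_vm_ops = {
	.huge_fault = my_huge_fault,
};

The driver's mmap() would install my_vm_ops, mark the VMA suitably,
and provide a regular ->fault handler for the fallback path; see
dax_mmap() in drivers/dax/device.c for the reference on that side.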



Thread overview: 8+ messages
2019-09-03 18:26 How to use huge pages in drivers? Dominique Martinet
2019-09-03 18:42 ` Matthew Wilcox [this message]
2019-09-03 21:28   ` Dominique Martinet
2019-09-04 17:00     ` Dominique Martinet
2019-09-04 17:50       ` Matthew Wilcox
2019-09-05 15:44         ` Dominique Martinet
2019-09-05 18:15           ` Matthew Wilcox
2019-09-05 18:50             ` Dominique Martinet
