linux-arch.vger.kernel.org archive mirror
From: Yury Norov <ynorov@caviumnetworks.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: libc-alpha@sourceware.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	szabolcs.nagy@arm.com, heiko.carstens@de.ibm.com,
	cmetcalf@ezchip.com, philipp.tomsich@theobroma-systems.com,
	joseph@codesourcery.com, zhouchengming1@huawei.com,
	Prasun.Kapoor@caviumnetworks.com, agraf@suse.de,
	geert@linux-m68k.org, kilobyte@angband.pl,
	manuel.montezelo@gmail.com, pinskia@gmail.com,
	linyongting@huawei.com, klimov.linux@gmail.com,
	broonie@kernel.org, bamvor.zhangjian@huawei.com,
	linux-arm-kernel@lists.infradead.org, maxim.kuvyrkov@linaro.org,
	Nathan_Lynch@mentor.com, schwidefsky@de.ibm.com,
	davem@davemloft.net, christoph.muellner@theobroma-systems.com
Subject: Re: [Question] New mmap64 syscall?
Date: Wed, 7 Dec 2016 16:04:51 +0530
Message-ID: <20161207103451.GA869@yury-N73SV>
In-Reply-To: <3014428.VXGdOARdm1@wuerfel>

On Tue, Dec 06, 2016 at 10:20:20PM +0100, Arnd Bergmann wrote:
> On Wednesday, December 7, 2016 12:24:40 AM CET Yury Norov wrote:
> > 3. Introduce new mmap64() syscall like this:
> > sys_mmap64(void *addr, size_t len, int prot, int flags, int fd, struct off_pair *off);
> > (The pointer here because otherwise we have 7 args, if simply pass off_hi and
> > off_lo in registers.)
> 
> This wouldn't have to be a pair, just a pointer to a 64-bit number.
> 
> > With new 64-bit interface we can deprecate mmap2(), and generalize all
> > implementations in kernel.
> > 
> > I think we can discuss it because 64-bit is the default size for off_t 
> > in all new 32-bit architectures. So generic solution may take place.
> > 
> > The last question here is how important to support offsets bigger than
> > 2^44 on 32-bit machines in practice? It may be a case for ARM64 servers,
> > which are looking like main aarch64/ilp32 users. If no, we can leave
> > things as is, and just do nothing.
> 
> If there is a use case for larger than 16TB offsets, we should add
> the call on all architectures, probably using your approach 3. I don't
> think that we should treat it as anything special for arm64 though.

From this point of view, a 16+TB offset is just a matter of having
16+TB of storage, and that is more than realistic. Another argument
for adding it is that we already support 64-bit offsets in syscalls
like sys_llseek(), so mmap64() would simply extend that support.
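
For reference, the 16TB figure comes from mmap2() taking the offset as a
32-bit count of 4096-byte units. A minimal sketch of that arithmetic
follows. The helper name is made up for illustration, and the 4096-byte
unit is an assumption that holds on most, but not all, architectures.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the first byte offset that mmap2() cannot express
 * when its last argument is a 32-bit count of (1 << unit_shift)-byte units.
 */
static uint64_t mmap2_offset_limit(unsigned int unit_shift)
{
    return (uint64_t)1 << (32 + unit_shift);
}
```

With unit_shift = 12 this gives 2^44 bytes, i.e. 16TB (strictly, 16 TiB).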

I can prepare this patch. First, I'd like to clarify some
implementation details.

Syscall declaration:
SYSCALL_DEFINE6(mmap64, unsigned long, addr, unsigned long, len,
                unsigned long, prot, unsigned long, flags,
                unsigned long, fd, unsigned long long *, offset);

sys_mmap64() deprecates sys_mmap2(), and __ARCH_WANT_MMAP2 is
introduced to keep mmap2() enabled on all existing architectures.
New arches (aarch64/ilp32 is the first candidate) will have mmap64()
only. The precedent is the patches that dropped set/getrlimit() and
renameat() (b0da6d44).
                                
On the glibc side, __OFF_T_MATCHES_OFF64_T will wire mmap() from
linux/generic/wordsize32/mmap.c to mmap64() from linux/mmap64.c.

mmap64() will first try __NR_mmap64. If it is not defined, or if the
kernel returns ENOSYS, __NR_mmap2 will be called instead. This lets a
userspace that supports both mmap2() and mmap64() get full 64-bit
offsets where the kernel provides them, not just 44-bit ones.
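
To make the fallback order concrete, here is a rough userspace sketch.
It is not the proposed glibc code: the function-pointer types, the
wrapper name and the fake_* stand-ins are all invented for illustration,
since __NR_mmap64 does not exist yet. A NULL mmap64 pointer stands in
for "__NR_mmap64 not defined", and the 4096-byte mmap2() unit is assumed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Raw syscall entry points, kernel style: negative errno on failure.
 * Illustrative types only, not an existing glibc or kernel interface. */
typedef long (*raw_mmap64_fn)(void *addr, size_t len, int prot, int flags,
                              int fd, const uint64_t *off);
typedef long (*raw_mmap2_fn)(void *addr, size_t len, int prot, int flags,
                             int fd, unsigned long pgoff);

#define MMAP2_UNIT_SHIFT 12 /* assumed: mmap2() counts in 4096-byte units */

/* Try mmap64 first; use mmap2 only if mmap64 is absent or returns ENOSYS. */
static long mmap64_with_fallback(raw_mmap64_fn do_mmap64,
                                 raw_mmap2_fn do_mmap2,
                                 void *addr, size_t len, int prot, int flags,
                                 int fd, uint64_t off)
{
    if (do_mmap64 != NULL) {
        long ret = do_mmap64(addr, len, prot, flags, fd, &off);
        if (ret != -ENOSYS)
            return ret; /* success, or a real error passed through as is */
    }
    /* The range check on off is left out of this sketch. */
    return do_mmap2(addr, len, prot, flags, fd,
                    (unsigned long)(off >> MMAP2_UNIT_SHIFT));
}

/* Stand-ins so the sketch can run without kernel support. The return
 * values 64 and 2 only mark which path was taken. */
static long fake_mmap64_ok(void *addr, size_t len, int prot, int flags,
                           int fd, const uint64_t *off)
{
    (void)addr; (void)len; (void)prot; (void)flags; (void)fd; (void)off;
    return 64;
}

static long fake_mmap64_enosys(void *addr, size_t len, int prot, int flags,
                               int fd, const uint64_t *off)
{
    (void)addr; (void)len; (void)prot; (void)flags; (void)fd; (void)off;
    return -ENOSYS;
}

static long fake_mmap2(void *addr, size_t len, int prot, int flags,
                       int fd, unsigned long pgoff)
{
    (void)addr; (void)len; (void)prot; (void)flags; (void)fd; (void)pgoff;
    return 2;
}
```

Only ENOSYS triggers the fallback here. Any other error from mmap64 is
returned unchanged, so a genuine failure is not masked by a second
attempt through mmap2.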

For the __NR_mmap2 case, I'd also add a check for offsets that do not
fit in 44 bits, and set errno to EOVERFLOW in that case.
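
A minimal sketch of what that check could look like, assuming the usual
4096-byte mmap2() unit. The function name is hypothetical, and the
EINVAL result for an unaligned offset is my assumption, not part of the
proposal above.

```c
#include <errno.h>
#include <stdint.h>

#define MMAP2_UNIT_SHIFT 12 /* assumed: mmap2() counts in 4096-byte units */

/*
 * Hypothetical helper: convert a 64-bit byte offset into the 32-bit unit
 * count that mmap2() takes. Returns 0 on success, or -1 with errno set.
 */
static int mmap2_pgoff_from_offset(uint64_t off, uint32_t *pgoff)
{
    /* Offsets must be a multiple of the mmap2() unit (assumed policy). */
    if (off & ((UINT64_C(1) << MMAP2_UNIT_SHIFT) - 1)) {
        errno = EINVAL;
        return -1;
    }
    /* Anything at or above 2^44 does not fit in a 32-bit unit count. */
    if ((off >> MMAP2_UNIT_SHIFT) > UINT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    *pgoff = (uint32_t)(off >> MMAP2_UNIT_SHIFT);
    return 0;
}
```

The largest offset that still passes is 2^44 - 4096, which maps to a
unit count of UINT32_MAX.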

Any thoughts?

Yury.

