From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751881AbdKVDNv (ORCPT ); Tue, 21 Nov 2017 22:13:51 -0500 Received: from mail-it0-f65.google.com ([209.85.214.65]:34829 "EHLO mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751590AbdKVDNt (ORCPT ); Tue, 21 Nov 2017 22:13:49 -0500 X-Google-Smtp-Source: AGs4zMbAtZhOqEd4iUrRcn0MWioPK+rHpqs4UitgY0r0OvTnWgGJZ1/tcs4N7fChiLSkHDABwd6kO49RrF2jIV8oe6Q= MIME-Version: 1.0 In-Reply-To: References: <61b5cd5e7f2cf2f485a990cdf238788b817a1d53.1510118606.git.green.hu@gmail.com> From: Vincent Chen Date: Wed, 22 Nov 2017 11:13:47 +0800 Message-ID: Subject: Re: FW: [PATCH 15/31] nds32: System calls handling To: Arnd Bergmann Cc: Greentime , Linux Kernel Mailing List , linux-arch , Thomas Gleixner , Jason Cooper , Marc Zyngier , Rob Herring , Networking , Vincent Chen Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 2017-11-13 19:42 GMT+08:00 Arnd Bergmann : > On Mon, Nov 13, 2017 at 3:51 AM, Vincent Chen wrote: >>>>On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu wrote: >>>> From: Greentime Hu > >> >>>> +#define __ARCH_WANT_SYS_CLONE >>> >>>This seems ok, though it would be nice to have the reverse logic and have architectures opt-out of the generic version when they need to provide their own, rather than having most architectures set it. >>> >> >> Thanks >> I will provide nds32 SYSCALL_DEFINE_5(clone) in the next version patch. > > That's not what I meant, my suggestion is to create a new patch to remove the > __ARCH_WANT_SYS_CLONE symbol from all architectures, and instead use > something like __ARCH_HAVE_SYS_CLONE on architectures that do *not* > want the generic implementation. > > nds32 clearly wants the generic implementation here, it just shouldn't have to > select that symbol to get it. > >>>> +/* >>>> + * Special system call wrappers >>>> + * >>>> + * $r0 = syscall number >>>> + * $r8 = syscall table >>>> + */ >>>> + .type sys_syscall, #function >>>> +ENTRY(sys_syscall) >>>> + addi $p1, $r0, #-__NR_syscalls >>>> + bgtz $p1, 3f >>>> + move $p1, $r0 >>>> + move $r0, $r1 >>>> + move $r1, $r2 >>>> + move $r2, $r3 >>>> + move $r3, $r4 >>>> + move $r4, $r5 >>>> +! add for syscall 6 args >>>> + lwi $r5, [$sp + #SP_OFFSET ] >>>> + lwi $r5, [$r5] >>>> +! ~add for syscall 6 args >>>> + >>>> + lw $p1, [tbl+$p1<<2] >>>> .+ jr $p1 >>>> +3: b sys_ni_syscall >>>> +ENDPROC(sys_syscall) >>> >>>Can you explain what this is used for? >>> >> >> This is used to handle syscall(int number, ....). >> Unlike other architectures, the system number shall be determined in >> compile time when issuing system call in nds32. >> Therefore, we only can parse the content of syscall(int number, ....) >> and distribute it to destination handler in kernel space >> (Other architecture can handle it in user space by glibc's syscall wrapper) > > Hmm, I think other architectures that run into this problem use self-modifying > code for syscall(), but that is also ugly as it requires having a page that > is both writable and executable. > > I think your approach can be tricky for things like seccomp(). It's possible > that you get all of it right, but if you can come up with a different solution, > that might be better. > > How much would it cost to simply always pass the syscall number > in a register, and not use the immediate argument at all? > After re-evaluating the performance, we decide to use a particular register to transfer syscall number instead of immediate argument. So, above sys_syscall will be removed in the next version patch Thanks >>>> --- /dev/null >>>> +++ b/arch/nds32/kernel/sys_nds32.c >>>> + >>>> +long sys_mmap2(unsigned long addr, unsigned long len, >>>> + unsigned long prot, unsigned long flags, >>>> + unsigned long fd, unsigned long pgoff) { >>>> + if (pgoff & (~PAGE_MASK >> 12)) >>>> + return -EINVAL; >>>> + >>>> + return sys_mmap_pgoff(addr, len, prot, flags, fd, >>>> + pgoff >> (PAGE_SHIFT - 12)); } >>>> + >>>> +asmlinkage long sys_fadvise64_64_wrapper(int fd, int advice, loff_t offset, >>>> + loff_t len) { >>>> + return sys_fadvise64_64(fd, offset, len, advice); } >>> >>>You should always use SYSCALL_DEFINE*() macros to define entry points for your own syscalls in C code for consistency. I also wonder if we should just move those two into common code, a lot of architectures need the first one in particular. >>> >> >> The sys_fadvise64_64_wrapper is used to reorder the input parameter. >> >> In order to solve register alignment problem, we adjust the input >> parameter order of fadvise64_64 while issuing this syscall. >> Therefore, we need this wrapper to reorder the input parameter to fit >> sys_fadvise64_64's API in kernel. > > I understand what it's used for, it's just that I would recommend writing it > differently, as > > SYSCALL_DEFINE4(fadvise64_64_wrapper, int, fd, int, advice, loff_t, > offset, loff_t, len) > { > return sys_fadvise64_64(fd, offset, len, advice); > } > > Arnd