* [PATCH] parisc: Fix syscall restarts @ 2015-12-18 23:30 Helge Deller 2015-12-20 13:59 ` Mathieu Desnoyers ` (3 more replies) 0 siblings, 4 replies; 30+ messages in thread From: Helge Deller @ 2015-12-18 23:30 UTC (permalink / raw) To: linux-parisc, James Bottomley, John David Anglin; +Cc: Mathieu Desnoyers On parisc syscalls which are interrupted by signals sometimes fail to restart and instead return -ENOSYS which then in the worst case lead to userspace crashes. A similiar problem existed on MIPS and was fixed by commit e967ef02 ("MIPS: Fix restart of indirect syscalls"). On parisc the current syscall restart code assumes hat the syscall number is always loaded in the delay branch of the ble instruction as defined in the unistd.h header file and as such never restored %r20 before returning to userspace: ble 0x100(%sr2, %r0) ldi #syscall_nr, %r20 This assumption is at least not true for code which uses the syscall() glibc function, which instead uses this syntax: ble 0x100(%sr2, %r0) copy regX, %r20 where regX depend on how the compiler optimizes the code and register usage. This patch fixes this problem by adding code to analyze how the syscall number is loaded in the delay branch and - if needed - copy the syscall number to regX prior returning to userspace for the syscall restart. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c index dc1ea79..b0414ad 100644 --- a/arch/parisc/kernel/signal.c +++ b/arch/parisc/kernel/signal.c @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, int in_syscall) regs->gr[28]); } +/* + * Check the delay branch in userspace how the syscall number gets loaded into + * %r20 and adjust as needed. + */ + +static void check_syscallno_in_delay_branch(struct pt_regs *regs) +{ + unsigned int opcode, source_reg; + u32 __user *uaddr; + + /* Usually we don't have to restore %r20 (the system call number) + * because it gets loaded in the delay slot of the branch external + * instruction via the ldi instruction. + * In some cases a register-to-register copy instruction might have + * been used instead, in which case we need to copy the syscall + * number into the source register before returning to userspace. + */ + + /* A syscall is just a branch, so all + * we have to do is fiddle the return pointer. + */ + regs->gr[31] -= 8; /* delayed branching */ + + /* Get assembler opcode of code in delay branch */ + uaddr = (unsigned int *) (regs->gr[31] + 1); + get_user(opcode, uaddr); + + /* Check if delay branch uses "ldi int,%r20" */ + if ((opcode & 0xffff0000) == 0x34140000) + return; /* everything ok, just return */ + + /* Check if delay branch uses "copy %rX,%r20" */ + if ((opcode & 0xff00ffff) == 0x08000254) { + source_reg = (opcode >> 16) & 31; + regs->gr[source_reg] = regs->gr[20]; + return; + } + + pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n", + current->comm, task_pid_nr(current), opcode); +} + static inline void syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) { @@ -457,10 +499,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) } /* fallthrough */ case -ERESTARTNOINTR: - /* A syscall is just a branch, so all - * we have to do is fiddle the return pointer. - */ - regs->gr[31] -= 8; /* delayed branching */ + check_syscallno_in_delay_branch(regs); break; } } @@ -510,15 +549,9 @@ insert_restart_trampoline(struct pt_regs *regs) } case -ERESTARTNOHAND: case -ERESTARTSYS: - case -ERESTARTNOINTR: { - /* Hooray for delayed branching. We don't - * have to restore %r20 (the system call - * number) because it gets loaded in the delay - * slot of the branch external instruction. - */ - regs->gr[31] -= 8; + case -ERESTARTNOINTR: + check_syscallno_in_delay_branch(regs); return; - } default: break; } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller @ 2015-12-20 13:59 ` Mathieu Desnoyers 2015-12-20 14:09 ` Mathieu Desnoyers 2015-12-20 19:39 ` John David Anglin ` (2 subsequent siblings) 3 siblings, 1 reply; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-20 13:59 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de wrote: > On parisc syscalls which are interrupted by signals sometimes fail to restart > and instead return -ENOSYS which then in the worst case lead to userspace > crashes. > A similiar problem existed on MIPS and was fixed by commit e967ef02 > ("MIPS: Fix restart of indirect syscalls"). > > On parisc the current syscall restart code assumes hat the syscall number is > always loaded in the delay branch of the ble instruction as defined in the > unistd.h header file and as such never restored %r20 before returning to > userspace: > ble 0x100(%sr2, %r0) > ldi #syscall_nr, %r20 > > This assumption is at least not true for code which uses the syscall() glibc > function, which instead uses this syntax: > ble 0x100(%sr2, %r0) > copy regX, %r20 > where regX depend on how the compiler optimizes the code and register usage. > > This patch fixes this problem by adding code to analyze how the syscall number > is loaded in the delay branch and - if needed - copy the syscall number to regX > prior returning to userspace for the syscall restart. > > Signed-off-by: Helge Deller <deller@gmx.de> > Cc: stable@vger.kernel.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c > index dc1ea79..b0414ad 100644 > --- a/arch/parisc/kernel/signal.c > +++ b/arch/parisc/kernel/signal.c > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, > int in_syscall) > regs->gr[28]); > } > > +/* > + * Check the delay branch in userspace how the syscall number gets loaded into > + * %r20 and adjust as needed. I'm pretty sure "Check the delay branch in userspace how the syscall..." is not an English construct. ;-) Suggested rewording: "Check how the syscall number gets loaded into %r20 within the delay branch in userspace and adjust as needed." > + */ > + > +static void check_syscallno_in_delay_branch(struct pt_regs *regs) > +{ > + unsigned int opcode, source_reg; Why "unsigned int" above rather than u32 ? Since we're using opcode as target variable for a get_user, it would be clearer if the type of the __user * match the type of the target kernel variable. (understood that those happen to have the same bitness and type size on all Linux architectures, but it would be clearer nevertheless). > + u32 __user *uaddr; > + > + /* Usually we don't have to restore %r20 (the system call number) > + * because it gets loaded in the delay slot of the branch external > + * instruction via the ldi instruction. > + * In some cases a register-to-register copy instruction might have > + * been used instead, in which case we need to copy the syscall > + * number into the source register before returning to userspace. > + */ > + > + /* A syscall is just a branch, so all > + * we have to do is fiddle the return pointer. > + */ > + regs->gr[31] -= 8; /* delayed branching */ > + > + /* Get assembler opcode of code in delay branch */ > + uaddr = (unsigned int *) (regs->gr[31] + 1); > + get_user(opcode, uaddr); get_user() can fail due to EFAULT. This error should be handled here, otherwise this could lead to the following code using an uninitialized opcode variable, which could indirectly leak a few bits of kernel stack information to userspace (security concern). One attack vector I have in mind for this is ptrace(), which might be able to tweak those register values. > + > + /* Check if delay branch uses "ldi int,%r20" */ > + if ((opcode & 0xffff0000) == 0x34140000) > + return; /* everything ok, just return */ > + > + /* Check if delay branch uses "copy %rX,%r20" */ > + if ((opcode & 0xff00ffff) == 0x08000254) { > + source_reg = (opcode >> 16) & 31; Can you explain the reasoning behind the & 31 mask ? I'm no parisc expert, but this seems rather odd. Do you really mean "% 31" which translates to "& 5" ? It would make more sense since there are 32 "gr" registers. With & 31, a carefully crafted opcode could overflow the gr[32] array, and cause a kernel overflow allowing to overwrite kernel memory (security issue). If it's the case, then it would also be good to check that the register selector within the opcode is not larger than 31 (e.g. applying to fr or sr register), rather than applying the modulo and assuming it's a gr and corrupt userspace state. Thanks, Mathieu > + regs->gr[source_reg] = regs->gr[20]; > + return; > + } > + > + pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n", > + current->comm, task_pid_nr(current), opcode); > +} > + > static inline void > syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) > { > @@ -457,10 +499,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction > *ka) > } > /* fallthrough */ > case -ERESTARTNOINTR: > - /* A syscall is just a branch, so all > - * we have to do is fiddle the return pointer. > - */ > - regs->gr[31] -= 8; /* delayed branching */ > + check_syscallno_in_delay_branch(regs); > break; > } > } > @@ -510,15 +549,9 @@ insert_restart_trampoline(struct pt_regs *regs) > } > case -ERESTARTNOHAND: > case -ERESTARTSYS: > - case -ERESTARTNOINTR: { > - /* Hooray for delayed branching. We don't > - * have to restore %r20 (the system call > - * number) because it gets loaded in the delay > - * slot of the branch external instruction. > - */ > - regs->gr[31] -= 8; > + case -ERESTARTNOINTR: > + check_syscallno_in_delay_branch(regs); > return; > - } > default: > break; > } -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 13:59 ` Mathieu Desnoyers @ 2015-12-20 14:09 ` Mathieu Desnoyers 2015-12-20 15:49 ` Helge Deller 0 siblings, 1 reply; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-20 14:09 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de wrote: > >> On parisc syscalls which are interrupted by signals sometimes fail to restart >> and instead return -ENOSYS which then in the worst case lead to userspace >> crashes. >> A similiar problem existed on MIPS and was fixed by commit e967ef02 >> ("MIPS: Fix restart of indirect syscalls"). >> >> On parisc the current syscall restart code assumes hat the syscall number is >> always loaded in the delay branch of the ble instruction as defined in the >> unistd.h header file and as such never restored %r20 before returning to >> userspace: >> ble 0x100(%sr2, %r0) >> ldi #syscall_nr, %r20 >> >> This assumption is at least not true for code which uses the syscall() glibc >> function, which instead uses this syntax: >> ble 0x100(%sr2, %r0) >> copy regX, %r20 >> where regX depend on how the compiler optimizes the code and register usage. >> >> This patch fixes this problem by adding code to analyze how the syscall number >> is loaded in the delay branch and - if needed - copy the syscall number to regX >> prior returning to userspace for the syscall restart. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Cc: stable@vger.kernel.org >> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> >> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c >> index dc1ea79..b0414ad 100644 >> --- a/arch/parisc/kernel/signal.c >> +++ b/arch/parisc/kernel/signal.c >> @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, >> int in_syscall) >> regs->gr[28]); >> } >> >> +/* >> + * Check the delay branch in userspace how the syscall number gets loaded into >> + * %r20 and adjust as needed. > > I'm pretty sure "Check the delay branch in userspace how the syscall..." > is not an English construct. ;-) Suggested rewording: > > "Check how the syscall number gets loaded into %r20 within > the delay branch in userspace and adjust as needed." > >> + */ >> + >> +static void check_syscallno_in_delay_branch(struct pt_regs *regs) >> +{ >> + unsigned int opcode, source_reg; > > Why "unsigned int" above rather than u32 ? Since we're using > opcode as target variable for a get_user, it would be clearer > if the type of the __user * match the type of the target kernel > variable. (understood that those happen to have the same bitness > and type size on all Linux architectures, but it would be clearer > nevertheless). > >> + u32 __user *uaddr; >> + >> + /* Usually we don't have to restore %r20 (the system call number) >> + * because it gets loaded in the delay slot of the branch external >> + * instruction via the ldi instruction. >> + * In some cases a register-to-register copy instruction might have >> + * been used instead, in which case we need to copy the syscall >> + * number into the source register before returning to userspace. >> + */ >> + >> + /* A syscall is just a branch, so all >> + * we have to do is fiddle the return pointer. >> + */ >> + regs->gr[31] -= 8; /* delayed branching */ >> + >> + /* Get assembler opcode of code in delay branch */ >> + uaddr = (unsigned int *) (regs->gr[31] + 1); >> + get_user(opcode, uaddr); > > get_user() can fail due to EFAULT. This error should be > handled here, otherwise this could lead to the following > code using an uninitialized opcode variable, which could > indirectly leak a few bits of kernel stack information > to userspace (security concern). One attack vector I have > in mind for this is ptrace(), which might be able to tweak > those register values. > >> + >> + /* Check if delay branch uses "ldi int,%r20" */ >> + if ((opcode & 0xffff0000) == 0x34140000) >> + return; /* everything ok, just return */ >> + >> + /* Check if delay branch uses "copy %rX,%r20" */ >> + if ((opcode & 0xff00ffff) == 0x08000254) { >> + source_reg = (opcode >> 16) & 31; > > Can you explain the reasoning behind the & 31 mask ? > I'm no parisc expert, but this seems rather odd. > Do you really mean "% 31" which translates to "& 5" ? > It would make more sense since there are 32 "gr" > registers. With & 31, a carefully crafted opcode > could overflow the gr[32] array, and cause a kernel > overflow allowing to overwrite kernel memory > (security issue). Hrm, I got my masks temporarily mixed up (early morning here). This is why I always use constructs such as: #define GR_REGS_BITS 5 #define NR_GR_REGS (1U << GR_REGS_BITS) #define GR_REGS_MASK (NR_GR_REGS - 1) and then v & GR_REGS_MASK; Which makes everything super-obvious. The & 31 mask seems therefore technically correct. The paragraph below still holds though: > > If it's the case, then it would also be good to > check that the register selector within the opcode > is not larger than 31 (e.g. applying to fr or sr > register), rather than applying the modulo and > assuming it's a gr and corrupt userspace state. > Thanks, Mathieu > Thanks, > > Mathieu > >> + regs->gr[source_reg] = regs->gr[20]; >> + return; >> + } >> + >> + pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n", >> + current->comm, task_pid_nr(current), opcode); >> +} >> + >> static inline void >> syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) >> { >> @@ -457,10 +499,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction >> *ka) >> } >> /* fallthrough */ >> case -ERESTARTNOINTR: >> - /* A syscall is just a branch, so all >> - * we have to do is fiddle the return pointer. >> - */ >> - regs->gr[31] -= 8; /* delayed branching */ >> + check_syscallno_in_delay_branch(regs); >> break; >> } >> } >> @@ -510,15 +549,9 @@ insert_restart_trampoline(struct pt_regs *regs) >> } >> case -ERESTARTNOHAND: >> case -ERESTARTSYS: >> - case -ERESTARTNOINTR: { >> - /* Hooray for delayed branching. We don't >> - * have to restore %r20 (the system call >> - * number) because it gets loaded in the delay >> - * slot of the branch external instruction. >> - */ >> - regs->gr[31] -= 8; >> + case -ERESTARTNOINTR: >> + check_syscallno_in_delay_branch(regs); >> return; >> - } >> default: >> break; >> } > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 14:09 ` Mathieu Desnoyers @ 2015-12-20 15:49 ` Helge Deller 2015-12-20 16:50 ` James Bottomley 2015-12-20 18:31 ` John David Anglin 0 siblings, 2 replies; 30+ messages in thread From: Helge Deller @ 2015-12-20 15:49 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: linux-parisc, James Bottomley, John David Anglin On 20.12.2015 15:09, Mathieu Desnoyers wrote: > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > >> ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de wrote: >> >>> On parisc syscalls which are interrupted by signals sometimes fail to restart >>> and instead return -ENOSYS which then in the worst case lead to userspace >>> crashes. >>> A similiar problem existed on MIPS and was fixed by commit e967ef02 >>> ("MIPS: Fix restart of indirect syscalls"). >>> >>> On parisc the current syscall restart code assumes hat the syscall number is >>> always loaded in the delay branch of the ble instruction as defined in the >>> unistd.h header file and as such never restored %r20 before returning to >>> userspace: >>> ble 0x100(%sr2, %r0) >>> ldi #syscall_nr, %r20 >>> >>> This assumption is at least not true for code which uses the syscall() glibc >>> function, which instead uses this syntax: >>> ble 0x100(%sr2, %r0) >>> copy regX, %r20 >>> where regX depend on how the compiler optimizes the code and register usage. >>> >>> This patch fixes this problem by adding code to analyze how the syscall number >>> is loaded in the delay branch and - if needed - copy the syscall number to regX >>> prior returning to userspace for the syscall restart. >>> >>> Signed-off-by: Helge Deller <deller@gmx.de> >>> Cc: stable@vger.kernel.org >>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>> >>> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c >>> index dc1ea79..b0414ad 100644 >>> --- a/arch/parisc/kernel/signal.c >>> +++ b/arch/parisc/kernel/signal.c >>> @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, >>> int in_syscall) >>> regs->gr[28]); >>> } >>> >>> +/* >>> + * Check the delay branch in userspace how the syscall number gets loaded into >>> + * %r20 and adjust as needed. >> >> I'm pretty sure "Check the delay branch in userspace how the syscall..." >> is not an English construct. ;-) Suggested rewording: >> >> "Check how the syscall number gets loaded into %r20 within >> the delay branch in userspace and adjust as needed." Thanks! I'll change that. >>> + */ >>> + >>> +static void check_syscallno_in_delay_branch(struct pt_regs *regs) >>> +{ >>> + unsigned int opcode, source_reg; >> >> Why "unsigned int" above rather than u32 ? Since we're using >> opcode as target variable for a get_user, it would be clearer >> if the type of the __user * match the type of the target kernel >> variable. (understood that those happen to have the same bitness >> and type size on all Linux architectures, but it would be clearer >> nevertheless). Yes, seems OK. I'll change that. >>> + u32 __user *uaddr; >>> + >>> + /* Usually we don't have to restore %r20 (the system call number) >>> + * because it gets loaded in the delay slot of the branch external >>> + * instruction via the ldi instruction. >>> + * In some cases a register-to-register copy instruction might have >>> + * been used instead, in which case we need to copy the syscall >>> + * number into the source register before returning to userspace. >>> + */ >>> + >>> + /* A syscall is just a branch, so all >>> + * we have to do is fiddle the return pointer. >>> + */ >>> + regs->gr[31] -= 8; /* delayed branching */ >>> + >>> + /* Get assembler opcode of code in delay branch */ >>> + uaddr = (unsigned int *) (regs->gr[31] + 1); >>> + get_user(opcode, uaddr); >> >> get_user() can fail due to EFAULT. This error should be >> handled here, otherwise this could lead to the following >> code using an uninitialized opcode variable, which could >> indirectly leak a few bits of kernel stack information >> to userspace (security concern). One attack vector I have >> in mind for this is ptrace(), which might be able to tweak >> those register values. Yes, generally get_user() can fail. But this would be rather strange in that case, because the syscall was started by userspace from this address. So, without the code at that address in userspace, we would never have reached this get_user(). And, there is no leak (of kernel stack) either. All this function does is to move the content of userspace register X to userspace register Y. There is no kernel info involved. >>> + >>> + /* Check if delay branch uses "ldi int,%r20" */ >>> + if ((opcode & 0xffff0000) == 0x34140000) >>> + return; /* everything ok, just return */ >>> + >>> + /* Check if delay branch uses "copy %rX,%r20" */ >>> + if ((opcode & 0xff00ffff) == 0x08000254) { >>> + source_reg = (opcode >> 16) & 31; >> >> Can you explain the reasoning behind the & 31 mask ? >> I'm no parisc expert, but this seems rather odd. >> Do you really mean "% 31" which translates to "& 5" ? >> It would make more sense since there are 32 "gr" >> registers. With & 31, a carefully crafted opcode >> could overflow the gr[32] array, and cause a kernel >> overflow allowing to overwrite kernel memory >> (security issue). > > Hrm, I got my masks temporarily mixed up (early morning > here). This is why I always use constructs such as: > > #define GR_REGS_BITS 5 > #define NR_GR_REGS (1U << GR_REGS_BITS) > #define GR_REGS_MASK (NR_GR_REGS - 1) > > and then > > v & GR_REGS_MASK; > > Which makes everything super-obvious. The & 31 mask > seems therefore technically correct. Good. I was struggling with your comments as well :-) > The paragraph below still holds though: > >> >> If it's the case, then it would also be good to >> check that the register selector within the opcode >> is not larger than 31 (e.g. applying to fr or sr >> register), rather than applying the modulo and >> assuming it's a gr and corrupt userspace state. I'll check that. Thanks! Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 15:49 ` Helge Deller @ 2015-12-20 16:50 ` James Bottomley 2015-12-20 20:35 ` Helge Deller 2015-12-20 18:31 ` John David Anglin 1 sibling, 1 reply; 30+ messages in thread From: James Bottomley @ 2015-12-20 16:50 UTC (permalink / raw) To: Helge Deller, Mathieu Desnoyers; +Cc: linux-parisc, John David Anglin On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote: > On 20.12.2015 15:09, Mathieu Desnoyers wrote: > > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers > > mathieu.desnoyers@efficios.com wrote: > > > > > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de wro > > > te: > > > > > > > On parisc syscalls which are interrupted by signals sometimes > > > > fail to restart > > > > and instead return -ENOSYS which then in the worst case lead to > > > > userspace > > > > crashes. > > > > A similiar problem existed on MIPS and was fixed by commit > > > > e967ef02 > > > > ("MIPS: Fix restart of indirect syscalls"). > > > > > > > > On parisc the current syscall restart code assumes hat the > > > > syscall number is > > > > always loaded in the delay branch of the ble instruction as > > > > defined in the > > > > unistd.h header file and as such never restored %r20 before > > > > returning to > > > > userspace: > > > > ble 0x100(%sr2, %r0) > > > > ldi #syscall_nr, %r20 > > > > > > > > This assumption is at least not true for code which uses the > > > > syscall() glibc > > > > function, which instead uses this syntax: > > > > ble 0x100(%sr2, %r0) > > > > copy regX, %r20 > > > > where regX depend on how the compiler optimizes the code and > > > > register usage. > > > > > > > > This patch fixes this problem by adding code to analyze how the > > > > syscall number > > > > is loaded in the delay branch and - if needed - copy the > > > > syscall number to regX > > > > prior returning to userspace for the syscall restart. > > > > > > > > Signed-off-by: Helge Deller <deller@gmx.de> > > > > Cc: stable@vger.kernel.org > > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > > > > > diff --git a/arch/parisc/kernel/signal.c > > > > b/arch/parisc/kernel/signal.c > > > > index dc1ea79..b0414ad 100644 > > > > --- a/arch/parisc/kernel/signal.c > > > > +++ b/arch/parisc/kernel/signal.c > > > > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct > > > > pt_regs *regs, > > > > int in_syscall) > > > > regs->gr[28]); > > > > } > > > > > > > > +/* > > > > + * Check the delay branch in userspace how the syscall number > > > > gets loaded into > > > > + * %r20 and adjust as needed. > > > > > > I'm pretty sure "Check the delay branch in userspace how the > > > syscall..." > > > is not an English construct. ;-) Suggested rewording: > > > > > > "Check how the syscall number gets loaded into %r20 within > > > the delay branch in userspace and adjust as needed." > > Thanks! > I'll change that. > > > > > > + */ > > > > + > > > > +static void check_syscallno_in_delay_branch(struct pt_regs > > > > *regs) > > > > +{ > > > > + unsigned int opcode, source_reg; > > > > > > Why "unsigned int" above rather than u32 ? Since we're using > > > opcode as target variable for a get_user, it would be clearer > > > if the type of the __user * match the type of the target kernel > > > variable. (understood that those happen to have the same bitness > > > and type size on all Linux architectures, but it would be clearer > > > nevertheless). > > Yes, seems OK. > I'll change that. > > > > > + u32 __user *uaddr; > > > > + > > > > + /* Usually we don't have to restore %r20 (the system > > > > call number) > > > > + * because it gets loaded in the delay slot of the > > > > branch external > > > > + * instruction via the ldi instruction. > > > > + * In some cases a register-to-register copy > > > > instruction might have > > > > + * been used instead, in which case we need to copy > > > > the syscall > > > > + * number into the source register before returning to > > > > userspace. > > > > + */ > > > > + > > > > + /* A syscall is just a branch, so all > > > > + * we have to do is fiddle the return pointer. > > > > + */ > > > > + regs->gr[31] -= 8; /* delayed branching */ > > > > + > > > > + /* Get assembler opcode of code in delay branch */ > > > > + uaddr = (unsigned int *) (regs->gr[31] + 1); > > > > + get_user(opcode, uaddr); > > > > > > get_user() can fail due to EFAULT. This error should be > > > handled here, otherwise this could lead to the following > > > code using an uninitialized opcode variable, which could > > > indirectly leak a few bits of kernel stack information > > > to userspace (security concern). One attack vector I have > > > in mind for this is ptrace(), which might be able to tweak > > > those register values. > > Yes, generally get_user() can fail. > But this would be rather strange in that case, because > the syscall was started by userspace from this address. > So, without the code at that address in userspace, we would > never have reached this get_user(). Actually, that's not necessarily a safe assumption. Any memory allocation in a syscall path (except GFP_ATOMIC) can trigger reclaim and since this is a signal restart path, that's entirely possible. Reclaim could pull the backing page out from under the syscall, so in a low memory situation it is possible get_user() could fail with EFAULT unless get_user_page() has been called somewhere to pin the page. James ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 16:50 ` James Bottomley @ 2015-12-20 20:35 ` Helge Deller 2015-12-21 8:03 ` James Bottomley 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 20:35 UTC (permalink / raw) To: James Bottomley, Mathieu Desnoyers; +Cc: linux-parisc, John David Anglin On 20.12.2015 17:50, James Bottomley wrote: > On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote: >> On 20.12.2015 15:09, Mathieu Desnoyers wrote: >>> ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers >>> mathieu.desnoyers@efficios.com wrote: >>> >>>> ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de wro >>>> te: >>>> >>>>> On parisc syscalls which are interrupted by signals sometimes >>>>> fail to restart >>>>> and instead return -ENOSYS which then in the worst case lead to >>>>> userspace >>>>> crashes. >>>>> A similiar problem existed on MIPS and was fixed by commit >>>>> e967ef02 >>>>> ("MIPS: Fix restart of indirect syscalls"). >>>>> >>>>> On parisc the current syscall restart code assumes hat the >>>>> syscall number is >>>>> always loaded in the delay branch of the ble instruction as >>>>> defined in the >>>>> unistd.h header file and as such never restored %r20 before >>>>> returning to >>>>> userspace: >>>>> ble 0x100(%sr2, %r0) >>>>> ldi #syscall_nr, %r20 >>>>> >>>>> This assumption is at least not true for code which uses the >>>>> syscall() glibc >>>>> function, which instead uses this syntax: >>>>> ble 0x100(%sr2, %r0) >>>>> copy regX, %r20 >>>>> where regX depend on how the compiler optimizes the code and >>>>> register usage. >>>>> >>>>> This patch fixes this problem by adding code to analyze how the >>>>> syscall number >>>>> is loaded in the delay branch and - if needed - copy the >>>>> syscall number to regX >>>>> prior returning to userspace for the syscall restart. >>>>> >>>>> Signed-off-by: Helge Deller <deller@gmx.de> >>>>> Cc: stable@vger.kernel.org >>>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>>>> >>>>> diff --git a/arch/parisc/kernel/signal.c >>>>> b/arch/parisc/kernel/signal.c >>>>> index dc1ea79..b0414ad 100644 >>>>> --- a/arch/parisc/kernel/signal.c >>>>> +++ b/arch/parisc/kernel/signal.c >>>>> @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct >>>>> pt_regs *regs, >>>>> int in_syscall) >>>>> regs->gr[28]); >>>>> } >>>>> >>>>> +/* >>>>> + * Check the delay branch in userspace how the syscall number >>>>> gets loaded into >>>>> + * %r20 and adjust as needed. >>>> >>>> I'm pretty sure "Check the delay branch in userspace how the >>>> syscall..." >>>> is not an English construct. ;-) Suggested rewording: >>>> >>>> "Check how the syscall number gets loaded into %r20 within >>>> the delay branch in userspace and adjust as needed." >> >> Thanks! >> I'll change that. >> >> >>>>> + */ >>>>> + >>>>> +static void check_syscallno_in_delay_branch(struct pt_regs >>>>> *regs) >>>>> +{ >>>>> + unsigned int opcode, source_reg; >>>> >>>> Why "unsigned int" above rather than u32 ? Since we're using >>>> opcode as target variable for a get_user, it would be clearer >>>> if the type of the __user * match the type of the target kernel >>>> variable. (understood that those happen to have the same bitness >>>> and type size on all Linux architectures, but it would be clearer >>>> nevertheless). >> >> Yes, seems OK. >> I'll change that. >> >>>>> + u32 __user *uaddr; >>>>> + >>>>> + /* Usually we don't have to restore %r20 (the system >>>>> call number) >>>>> + * because it gets loaded in the delay slot of the >>>>> branch external >>>>> + * instruction via the ldi instruction. >>>>> + * In some cases a register-to-register copy >>>>> instruction might have >>>>> + * been used instead, in which case we need to copy >>>>> the syscall >>>>> + * number into the source register before returning to >>>>> userspace. >>>>> + */ >>>>> + >>>>> + /* A syscall is just a branch, so all >>>>> + * we have to do is fiddle the return pointer. >>>>> + */ >>>>> + regs->gr[31] -= 8; /* delayed branching */ >>>>> + >>>>> + /* Get assembler opcode of code in delay branch */ >>>>> + uaddr = (unsigned int *) (regs->gr[31] + 1); >>>>> + get_user(opcode, uaddr); >>>> >>>> get_user() can fail due to EFAULT. This error should be >>>> handled here, otherwise this could lead to the following >>>> code using an uninitialized opcode variable, which could >>>> indirectly leak a few bits of kernel stack information >>>> to userspace (security concern). One attack vector I have >>>> in mind for this is ptrace(), which might be able to tweak >>>> those register values. >> >> Yes, generally get_user() can fail. >> But this would be rather strange in that case, because >> the syscall was started by userspace from this address. >> So, without the code at that address in userspace, we would >> never have reached this get_user(). > > Actually, that's not necessarily a safe assumption. Any memory > allocation in a syscall path (except GFP_ATOMIC) can trigger reclaim > and since this is a signal restart path, that's entirely possible. > Reclaim could pull the backing page out from under the syscall, so in > a low memory situation it is possible get_user() could fail with EFAULT Really? Maybe I misunderstood...? So, let's say we have low memory and the kernel "swapped" out the userspace. I assume that when there is no memory pressure get_user() would pull the page in again, and if it's memory pressure, then with the assumption there is no memory left it probably return EFAULT (is that true?). But what happens then when we return to userspace? I expect userspace then to segfault. I will add the check for EFAULT, but just trying to understand... > unless get_user_page() has been called somewhere to pin the page. Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:35 ` Helge Deller @ 2015-12-21 8:03 ` James Bottomley 2015-12-21 14:39 ` Mathieu Desnoyers 0 siblings, 1 reply; 30+ messages in thread From: James Bottomley @ 2015-12-21 8:03 UTC (permalink / raw) To: Helge Deller, Mathieu Desnoyers; +Cc: linux-parisc, John David Anglin On Sun, 2015-12-20 at 21:35 +0100, Helge Deller wrote: > On 20.12.2015 17:50, James Bottomley wrote: > > On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote: > > > On 20.12.2015 15:09, Mathieu Desnoyers wrote: > > > > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers > > > > mathieu.desnoyers@efficios.com wrote: > > > > > > > > > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de > > > > > wro > > > > > te: > > > > > > > > > > > On parisc syscalls which are interrupted by signals > > > > > > sometimes > > > > > > fail to restart > > > > > > and instead return -ENOSYS which then in the worst case > > > > > > lead to > > > > > > userspace > > > > > > crashes. > > > > > > A similiar problem existed on MIPS and was fixed by commit > > > > > > e967ef02 > > > > > > ("MIPS: Fix restart of indirect syscalls"). > > > > > > > > > > > > On parisc the current syscall restart code assumes hat the > > > > > > syscall number is > > > > > > always loaded in the delay branch of the ble instruction as > > > > > > defined in the > > > > > > unistd.h header file and as such never restored %r20 before > > > > > > returning to > > > > > > userspace: > > > > > > ble 0x100(%sr2, %r0) > > > > > > ldi #syscall_nr, %r20 > > > > > > > > > > > > This assumption is at least not true for code which uses > > > > > > the > > > > > > syscall() glibc > > > > > > function, which instead uses this syntax: > > > > > > ble 0x100(%sr2, %r0) > > > > > > copy regX, %r20 > > > > > > where regX depend on how the compiler optimizes the code > > > > > > and > > > > > > register usage. > > > > > > > > > > > > This patch fixes this problem by adding code to analyze how > > > > > > the > > > > > > syscall number > > > > > > is loaded in the delay branch and - if needed - copy the > > > > > > syscall number to regX > > > > > > prior returning to userspace for the syscall restart. > > > > > > > > > > > > Signed-off-by: Helge Deller <deller@gmx.de> > > > > > > Cc: stable@vger.kernel.org > > > > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > > > > > > > > > > diff --git a/arch/parisc/kernel/signal.c > > > > > > b/arch/parisc/kernel/signal.c > > > > > > index dc1ea79..b0414ad 100644 > > > > > > --- a/arch/parisc/kernel/signal.c > > > > > > +++ b/arch/parisc/kernel/signal.c > > > > > > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, > > > > > > struct > > > > > > pt_regs *regs, > > > > > > int in_syscall) > > > > > > regs->gr[28]); > > > > > > } > > > > > > > > > > > > +/* > > > > > > + * Check the delay branch in userspace how the syscall > > > > > > number > > > > > > gets loaded into > > > > > > + * %r20 and adjust as needed. > > > > > > > > > > I'm pretty sure "Check the delay branch in userspace how the > > > > > syscall..." > > > > > is not an English construct. ;-) Suggested rewording: > > > > > > > > > > "Check how the syscall number gets loaded into %r20 within > > > > > the delay branch in userspace and adjust as needed." > > > > > > Thanks! > > > I'll change that. > > > > > > > > > > > > + */ > > > > > > + > > > > > > +static void check_syscallno_in_delay_branch(struct pt_regs > > > > > > *regs) > > > > > > +{ > > > > > > + unsigned int opcode, source_reg; > > > > > > > > > > Why "unsigned int" above rather than u32 ? Since we're using > > > > > opcode as target variable for a get_user, it would be clearer > > > > > if the type of the __user * match the type of the target > > > > > kernel > > > > > variable. (understood that those happen to have the same > > > > > bitness > > > > > and type size on all Linux architectures, but it would be > > > > > clearer > > > > > nevertheless). > > > > > > Yes, seems OK. > > > I'll change that. > > > > > > > > > + u32 __user *uaddr; > > > > > > + > > > > > > + /* Usually we don't have to restore %r20 (the > > > > > > system > > > > > > call number) > > > > > > + * because it gets loaded in the delay slot of the > > > > > > branch external > > > > > > + * instruction via the ldi instruction. > > > > > > + * In some cases a register-to-register copy > > > > > > instruction might have > > > > > > + * been used instead, in which case we need to > > > > > > copy > > > > > > the syscall > > > > > > + * number into the source register before > > > > > > returning to > > > > > > userspace. > > > > > > + */ > > > > > > + > > > > > > + /* A syscall is just a branch, so all > > > > > > + * we have to do is fiddle the return pointer. > > > > > > + */ > > > > > > + regs->gr[31] -= 8; /* delayed branching */ > > > > > > + > > > > > > + /* Get assembler opcode of code in delay branch */ > > > > > > + uaddr = (unsigned int *) (regs->gr[31] + 1); > > > > > > + get_user(opcode, uaddr); > > > > > > > > > > get_user() can fail due to EFAULT. This error should be > > > > > handled here, otherwise this could lead to the following > > > > > code using an uninitialized opcode variable, which could > > > > > indirectly leak a few bits of kernel stack information > > > > > to userspace (security concern). One attack vector I have > > > > > in mind for this is ptrace(), which might be able to tweak > > > > > those register values. > > > > > > Yes, generally get_user() can fail. > > > But this would be rather strange in that case, because > > > the syscall was started by userspace from this address. > > > So, without the code at that address in userspace, we would > > > never have reached this get_user(). > > > > Actually, that's not necessarily a safe assumption. Any memory > > allocation in a syscall path (except GFP_ATOMIC) can trigger > > reclaim > > and since this is a signal restart path, that's entirely possible. > > Reclaim could pull the backing page out from under the syscall, so > > in > > a low memory situation it is possible get_user() could fail with > > EFAULT > > Really? > Maybe I misunderstood...? > So, let's say we have low memory and the kernel "swapped" out the > userspace. > I assume that when there is no memory pressure get_user() would > pull the page in again, and if it's memory pressure, then with the > assumption > there is no memory left it probably return EFAULT (is that true?). > But what happens then when we return to userspace? > I expect userspace then to segfault. > > I will add the check for EFAULT, but just trying to understand... Actually, you're right, I was misremembering why these functions return EFAULT. It's if the access is invalid. They will actually try to page back in via the fault handler if the backing has gone. James > > unless get_user_page() has been called somewhere to pin the page. > > Helge > -- > To unsubscribe from this list: send the line "unsubscribe linux > -parisc" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-21 8:03 ` James Bottomley @ 2015-12-21 14:39 ` Mathieu Desnoyers 0 siblings, 0 replies; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-21 14:39 UTC (permalink / raw) To: James Bottomley; +Cc: Helge Deller, linux-parisc, John David Anglin ----- On Dec 21, 2015, at 3:03 AM, James Bottomley James.Bottomley@HansenPartnership.com wrote: > On Sun, 2015-12-20 at 21:35 +0100, Helge Deller wrote: >> On 20.12.2015 17:50, James Bottomley wrote: >> > On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote: >> > > On 20.12.2015 15:09, Mathieu Desnoyers wrote: >> > > > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers >> > > > mathieu.desnoyers@efficios.com wrote: >> > > > >> > > > > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@gmx.de >> > > > > wro >> > > > > te: >> > > > > >> > > > > > On parisc syscalls which are interrupted by signals >> > > > > > sometimes >> > > > > > fail to restart >> > > > > > and instead return -ENOSYS which then in the worst case >> > > > > > lead to >> > > > > > userspace >> > > > > > crashes. >> > > > > > A similiar problem existed on MIPS and was fixed by commit >> > > > > > e967ef02 >> > > > > > ("MIPS: Fix restart of indirect syscalls"). >> > > > > > >> > > > > > On parisc the current syscall restart code assumes hat the >> > > > > > syscall number is >> > > > > > always loaded in the delay branch of the ble instruction as >> > > > > > defined in the >> > > > > > unistd.h header file and as such never restored %r20 before >> > > > > > returning to >> > > > > > userspace: >> > > > > > ble 0x100(%sr2, %r0) >> > > > > > ldi #syscall_nr, %r20 >> > > > > > >> > > > > > This assumption is at least not true for code which uses >> > > > > > the >> > > > > > syscall() glibc >> > > > > > function, which instead uses this syntax: >> > > > > > ble 0x100(%sr2, %r0) >> > > > > > copy regX, %r20 >> > > > > > where regX depend on how the compiler optimizes the code >> > > > > > and >> > > > > > register usage. >> > > > > > >> > > > > > This patch fixes this problem by adding code to analyze how >> > > > > > the >> > > > > > syscall number >> > > > > > is loaded in the delay branch and - if needed - copy the >> > > > > > syscall number to regX >> > > > > > prior returning to userspace for the syscall restart. >> > > > > > >> > > > > > Signed-off-by: Helge Deller <deller@gmx.de> >> > > > > > Cc: stable@vger.kernel.org >> > > > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> > > > > > >> > > > > > diff --git a/arch/parisc/kernel/signal.c >> > > > > > b/arch/parisc/kernel/signal.c >> > > > > > index dc1ea79..b0414ad 100644 >> > > > > > --- a/arch/parisc/kernel/signal.c >> > > > > > +++ b/arch/parisc/kernel/signal.c >> > > > > > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, >> > > > > > struct >> > > > > > pt_regs *regs, >> > > > > > int in_syscall) >> > > > > > regs->gr[28]); >> > > > > > } >> > > > > > >> > > > > > +/* >> > > > > > + * Check the delay branch in userspace how the syscall >> > > > > > number >> > > > > > gets loaded into >> > > > > > + * %r20 and adjust as needed. >> > > > > >> > > > > I'm pretty sure "Check the delay branch in userspace how the >> > > > > syscall..." >> > > > > is not an English construct. ;-) Suggested rewording: >> > > > > >> > > > > "Check how the syscall number gets loaded into %r20 within >> > > > > the delay branch in userspace and adjust as needed." >> > > >> > > Thanks! >> > > I'll change that. >> > > >> > > >> > > > > > + */ >> > > > > > + >> > > > > > +static void check_syscallno_in_delay_branch(struct pt_regs >> > > > > > *regs) >> > > > > > +{ >> > > > > > + unsigned int opcode, source_reg; >> > > > > >> > > > > Why "unsigned int" above rather than u32 ? Since we're using >> > > > > opcode as target variable for a get_user, it would be clearer >> > > > > if the type of the __user * match the type of the target >> > > > > kernel >> > > > > variable. (understood that those happen to have the same >> > > > > bitness >> > > > > and type size on all Linux architectures, but it would be >> > > > > clearer >> > > > > nevertheless). >> > > >> > > Yes, seems OK. >> > > I'll change that. >> > > >> > > > > > + u32 __user *uaddr; >> > > > > > + >> > > > > > + /* Usually we don't have to restore %r20 (the >> > > > > > system >> > > > > > call number) >> > > > > > + * because it gets loaded in the delay slot of the >> > > > > > branch external >> > > > > > + * instruction via the ldi instruction. >> > > > > > + * In some cases a register-to-register copy >> > > > > > instruction might have >> > > > > > + * been used instead, in which case we need to >> > > > > > copy >> > > > > > the syscall >> > > > > > + * number into the source register before >> > > > > > returning to >> > > > > > userspace. >> > > > > > + */ >> > > > > > + >> > > > > > + /* A syscall is just a branch, so all >> > > > > > + * we have to do is fiddle the return pointer. >> > > > > > + */ >> > > > > > + regs->gr[31] -= 8; /* delayed branching */ >> > > > > > + >> > > > > > + /* Get assembler opcode of code in delay branch */ >> > > > > > + uaddr = (unsigned int *) (regs->gr[31] + 1); >> > > > > > + get_user(opcode, uaddr); >> > > > > >> > > > > get_user() can fail due to EFAULT. This error should be >> > > > > handled here, otherwise this could lead to the following >> > > > > code using an uninitialized opcode variable, which could >> > > > > indirectly leak a few bits of kernel stack information >> > > > > to userspace (security concern). One attack vector I have >> > > > > in mind for this is ptrace(), which might be able to tweak >> > > > > those register values. >> > > >> > > Yes, generally get_user() can fail. >> > > But this would be rather strange in that case, because >> > > the syscall was started by userspace from this address. >> > > So, without the code at that address in userspace, we would >> > > never have reached this get_user(). >> > >> > Actually, that's not necessarily a safe assumption. Any memory >> > allocation in a syscall path (except GFP_ATOMIC) can trigger >> > reclaim >> > and since this is a signal restart path, that's entirely possible. >> > Reclaim could pull the backing page out from under the syscall, so >> > in >> > a low memory situation it is possible get_user() could fail with >> > EFAULT >> >> Really? >> Maybe I misunderstood...? >> So, let's say we have low memory and the kernel "swapped" out the >> userspace. >> I assume that when there is no memory pressure get_user() would >> pull the page in again, and if it's memory pressure, then with the >> assumption >> there is no memory left it probably return EFAULT (is that true?). >> But what happens then when we return to userspace? >> I expect userspace then to segfault. >> >> I will add the check for EFAULT, but just trying to understand... > > Actually, you're right, I was misremembering why these functions return > EFAULT. It's if the access is invalid. They will actually try to page > back in via the fault handler if the backing has gone. I think you could still trigger the EFAULT by having a concurrent thread running within the same process address space invoking munmap on the memory range that includes the get_user() target memory area. This would clearly be a hostile user-space application, but it's important to validate all user inputs very carefully, especially for misbehaving applications. Thanks, Mathieu > > James > >> > unless get_user_page() has been called somewhere to pin the page. >> >> Helge >> -- >> To unsubscribe from this list: send the line "unsubscribe linux >> -parisc" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 15:49 ` Helge Deller 2015-12-20 16:50 ` James Bottomley @ 2015-12-20 18:31 ` John David Anglin 2015-12-20 19:32 ` Helge Deller 2015-12-21 14:42 ` Mathieu Desnoyers 1 sibling, 2 replies; 30+ messages in thread From: John David Anglin @ 2015-12-20 18:31 UTC (permalink / raw) To: Helge Deller; +Cc: Mathieu Desnoyers, linux-parisc, James Bottomley On 2015-12-20, at 10:49 AM, Helge Deller wrote: >>>> /* Check if delay branch uses "copy %rX,%r20" */ >>>> + if ((opcode & 0xff00ffff) == 0x08000254) { >>>> + source_reg = (opcode >> 16) & 31; >>> >>> Can you explain the reasoning behind the & 31 mask ? >>> I'm no parisc expert, but this seems rather odd. >>> Do you really mean "% 31" which translates to "& 5" ? >>> It would make more sense since there are 32 "gr" >>> registers. With & 31, a carefully crafted opcode >>> could overflow the gr[32] array, and cause a kernel >>> overflow allowing to overwrite kernel memory >>> (security issue). >> >> Hrm, I got my masks temporarily mixed up (early morning >> here). This is why I always use constructs such as: >> >> #define GR_REGS_BITS 5 >> #define NR_GR_REGS (1U << GR_REGS_BITS) >> #define GR_REGS_MASK (NR_GR_REGS - 1) >> >> and then >> >> v & GR_REGS_MASK; >> >> Which makes everything super-obvious. The & 31 mask >> seems therefore technically correct. > > Good. I was struggling with your comments as well :-) > > >> The paragraph below still holds though: >> >>> >>> If it's the case, then it would also be good to >>> check that the register selector within the opcode >>> is not larger than 31 (e.g. applying to fr or sr >>> register), rather than applying the modulo and >>> assuming it's a gr and corrupt userspace state. > > I'll check that. The register field cannot be larger than 31. A fr or sr regster can't be used in a LDO "copy" instruction. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 18:31 ` John David Anglin @ 2015-12-20 19:32 ` Helge Deller 2015-12-20 19:46 ` John David Anglin 2015-12-21 14:42 ` Mathieu Desnoyers 1 sibling, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 19:32 UTC (permalink / raw) To: John David Anglin; +Cc: Mathieu Desnoyers, linux-parisc, James Bottomley On 20.12.2015 19:31, John David Anglin wrote: > On 2015-12-20, at 10:49 AM, Helge Deller wrote: > >>>>> /* Check if delay branch uses "copy %rX,%r20" */ >>>>> + if ((opcode & 0xff00ffff) == 0x08000254) { >>>>> + source_reg = (opcode >> 16) & 31; >>>> >>>> Can you explain the reasoning behind the & 31 mask ? >>>> I'm no parisc expert, but this seems rather odd. >>>> Do you really mean "% 31" which translates to "& 5" ? >>>> It would make more sense since there are 32 "gr" >>>> registers. With & 31, a carefully crafted opcode >>>> could overflow the gr[32] array, and cause a kernel >>>> overflow allowing to overwrite kernel memory >>>> (security issue). >>> >>> Hrm, I got my masks temporarily mixed up (early morning >>> here). This is why I always use constructs such as: >>> >>> #define GR_REGS_BITS 5 >>> #define NR_GR_REGS (1U << GR_REGS_BITS) >>> #define GR_REGS_MASK (NR_GR_REGS - 1) >>> >>> and then >>> >>> v & GR_REGS_MASK; >>> >>> Which makes everything super-obvious. The & 31 mask >>> seems therefore technically correct. >> >> Good. I was struggling with your comments as well :-) >> >> >>> The paragraph below still holds though: >>> >>>> >>>> If it's the case, then it would also be good to >>>> check that the register selector within the opcode >>>> is not larger than 31 (e.g. applying to fr or sr >>>> register), rather than applying the modulo and >>>> assuming it's a gr and corrupt userspace state. >> >> I'll check that. > > The register field cannot be larger than 31. A fr or sr > regster can't be used Both correct. > in a LDO "copy" instruction. Actually it's the "OR,cond r1,r2,t" instruction. https://parisc.wiki.kernel.org/images-parisc/6/68/Pa11_acd.pdf page 5-105 if ((opcode & 0xff00ffff) == 0x08000254) The mask should be 0xffe0ffff, so that some bits of r2 (which needs to be 0) are not missed. Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 19:32 ` Helge Deller @ 2015-12-20 19:46 ` John David Anglin 2015-12-20 20:06 ` Helge Deller 2015-12-20 23:57 ` John David Anglin 0 siblings, 2 replies; 30+ messages in thread From: John David Anglin @ 2015-12-20 19:46 UTC (permalink / raw) To: Helge Deller; +Cc: Mathieu Desnoyers, linux-parisc, James Bottomley On 2015-12-20, at 2:32 PM, Helge Deller wrote: >> in a LDO "copy" instruction. > > Actually it's the "OR,cond r1,r2,t" instruction. > https://parisc.wiki.kernel.org/images-parisc/6/68/Pa11_acd.pdf > page 5-105 > > if ((opcode & 0xff00ffff) == 0x08000254) > > The mask should be 0xffe0ffff, so that some bits of r2 (which needs to be 0) are not missed. There are multiple instructions that could be used. The PA 2.0 arch says LDO on page 7-83. I think we should fix syscall(). It's not used that much. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 19:46 ` John David Anglin @ 2015-12-20 20:06 ` Helge Deller 2015-12-20 23:57 ` John David Anglin 1 sibling, 0 replies; 30+ messages in thread From: Helge Deller @ 2015-12-20 20:06 UTC (permalink / raw) To: John David Anglin; +Cc: Mathieu Desnoyers, linux-parisc, James Bottomley On 20.12.2015 20:46, John David Anglin wrote: > On 2015-12-20, at 2:32 PM, Helge Deller wrote: > >>> in a LDO "copy" instruction. >> >> Actually it's the "OR,cond r1,r2,t" instruction. >> https://parisc.wiki.kernel.org/images-parisc/6/68/Pa11_acd.pdf >> page 5-105 >> >> if ((opcode & 0xff00ffff) == 0x08000254) >> >> The mask should be 0xffe0ffff, so that some bits of r2 (which needs to be 0) are not missed. This is actually not the full truth. I missed the point, that "r2" is a register and not an immediate value. My code just checks that "r2" means %r0, while it could be any other register as well, which either has a null-value, or even worse, if someone decides to "calculate" the syscall number. > There are multiple instructions that could be used. The PA 2.0 arch says LDO on page 7-83. True. But luckily it seems gas converts "copy" to the "or" syntax mentioned above. In any case, keeping my kernel patch will report any cases (which might by mistake get added) in the future so that we can fix userspace then (or enhance the kernel workaround). Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 19:46 ` John David Anglin 2015-12-20 20:06 ` Helge Deller @ 2015-12-20 23:57 ` John David Anglin 1 sibling, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-20 23:57 UTC (permalink / raw) To: John David Anglin Cc: Helge Deller, Mathieu Desnoyers, linux-parisc, James Bottomley [-- Attachment #1: Type: text/plain, Size: 198 bytes --] On 2015-12-20, at 2:46 PM, John David Anglin wrote: > I think we should fix syscall(). It's not used that much. Attached is glibc fix. Testing. Dave -- John David Anglin dave.anglin@bell.net [-- Attachment #2: syscall-restart.d.txt --] [-- Type: text/plain, Size: 1128 bytes --] diff --git a/sysdeps/unix/sysv/linux/hppa/syscall.c b/sysdeps/unix/sysv/linux/hppa/syscall.c index 958fa47..9c7ee4f 100644 --- a/sysdeps/unix/sysv/linux/hppa/syscall.c +++ b/sysdeps/unix/sysv/linux/hppa/syscall.c @@ -48,8 +48,9 @@ syscall (long int __sysno, ...) PIC_REG_DEF LOAD_REGS_6 asm volatile (SAVE_ASM_PIC - " ble 0x100(%%sr2, %%r0) \n" " copy %1, %%r20 \n" + " ble 0x100(%%sr2, %%r0) \n" + " nop \n" LOAD_ASM_PIC : "=r" (__res) : "r" (__sysno) PIC_REG_USE ASM_ARGS_6 diff --git a/sysdeps/unix/sysv/linux/hppa/sysdep.h b/sysdeps/unix/sysv/linux/hppa/sysdep.h index 2cae70f..cc16c1a 100644 --- a/sysdeps/unix/sysv/linux/hppa/sysdep.h +++ b/sysdeps/unix/sysv/linux/hppa/sysdep.h @@ -434,8 +434,9 @@ L(pre_end): ASM_LINE_SEP \ /* FIXME: HACK save/load r19 around syscall */ \ asm volatile( \ SAVE_ASM_PIC \ - " ble 0x100(%%sr2, %%r0)\n" \ " copy %1, %%r20\n" \ + " ble 0x100(%%sr2, %%r0)\n" \ + " nop\n" \ LOAD_ASM_PIC \ : "=r" (__res) \ : "r" (name) PIC_REG_USE ASM_ARGS_##nr \ ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 18:31 ` John David Anglin 2015-12-20 19:32 ` Helge Deller @ 2015-12-21 14:42 ` Mathieu Desnoyers 2015-12-21 15:12 ` John David Anglin 1 sibling, 1 reply; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-21 14:42 UTC (permalink / raw) To: John David Anglin; +Cc: Helge Deller, linux-parisc, James Bottomley ----- On Dec 20, 2015, at 1:31 PM, John David Anglin dave.anglin@bell.net wrote: > On 2015-12-20, at 10:49 AM, Helge Deller wrote: > >>>>> /* Check if delay branch uses "copy %rX,%r20" */ >>>>> + if ((opcode & 0xff00ffff) == 0x08000254) { >>>>> + source_reg = (opcode >> 16) & 31; >>>> >>>> Can you explain the reasoning behind the & 31 mask ? >>>> I'm no parisc expert, but this seems rather odd. >>>> Do you really mean "% 31" which translates to "& 5" ? >>>> It would make more sense since there are 32 "gr" >>>> registers. With & 31, a carefully crafted opcode >>>> could overflow the gr[32] array, and cause a kernel >>>> overflow allowing to overwrite kernel memory >>>> (security issue). >>> >>> Hrm, I got my masks temporarily mixed up (early morning >>> here). This is why I always use constructs such as: >>> >>> #define GR_REGS_BITS 5 >>> #define NR_GR_REGS (1U << GR_REGS_BITS) >>> #define GR_REGS_MASK (NR_GR_REGS - 1) >>> >>> and then >>> >>> v & GR_REGS_MASK; >>> >>> Which makes everything super-obvious. The & 31 mask >>> seems therefore technically correct. >> >> Good. I was struggling with your comments as well :-) >> >> >>> The paragraph below still holds though: >>> >>>> >>>> If it's the case, then it would also be good to >>>> check that the register selector within the opcode >>>> is not larger than 31 (e.g. applying to fr or sr >>>> register), rather than applying the modulo and >>>> assuming it's a gr and corrupt userspace state. >> >> I'll check that. > > The register field cannot be larger than 31. A fr or sr > regster can't be used in a LDO "copy" instruction. Independently of the instruction set, if an application pulls the carpet from under the kernel from a concurrent thread (or possibly ptrace) by forging an invalid instruction after the system call has started executing, it would be good to catch it and report it with a printk, rather than blindly expecting that some bits should always be 0. Thanks, Mathieu > > Dave > -- > John David Anglin dave.anglin@bell.net -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-21 14:42 ` Mathieu Desnoyers @ 2015-12-21 15:12 ` John David Anglin 0 siblings, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-21 15:12 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: Helge Deller, linux-parisc, James Bottomley On 2015-12-21 9:42 AM, Mathieu Desnoyers wrote: > ----- On Dec 20, 2015, at 1:31 PM, John David Anglin dave.anglin@bell.net wrote: > >> On 2015-12-20, at 10:49 AM, Helge Deller wrote: >> >>>>>> /* Check if delay branch uses "copy %rX,%r20" */ >>>>>> + if ((opcode & 0xff00ffff) == 0x08000254) { >>>>>> + source_reg = (opcode >> 16) & 31; >>>>> Can you explain the reasoning behind the & 31 mask ? >>>>> I'm no parisc expert, but this seems rather odd. >>>>> Do you really mean "% 31" which translates to "& 5" ? >>>>> It would make more sense since there are 32 "gr" >>>>> registers. With & 31, a carefully crafted opcode >>>>> could overflow the gr[32] array, and cause a kernel >>>>> overflow allowing to overwrite kernel memory >>>>> (security issue). >>>> Hrm, I got my masks temporarily mixed up (early morning >>>> here). This is why I always use constructs such as: >>>> >>>> #define GR_REGS_BITS 5 >>>> #define NR_GR_REGS (1U << GR_REGS_BITS) >>>> #define GR_REGS_MASK (NR_GR_REGS - 1) >>>> >>>> and then >>>> >>>> v & GR_REGS_MASK; >>>> >>>> Which makes everything super-obvious. The & 31 mask >>>> seems therefore technically correct. >>> Good. I was struggling with your comments as well :-) >>> >>> >>>> The paragraph below still holds though: >>>> >>>>> If it's the case, then it would also be good to >>>>> check that the register selector within the opcode >>>>> is not larger than 31 (e.g. applying to fr or sr >>>>> register), rather than applying the modulo and >>>>> assuming it's a gr and corrupt userspace state. >>> I'll check that. >> The register field cannot be larger than 31. A fr or sr >> regster can't be used in a LDO "copy" instruction. > Independently of the instruction set, if an application > pulls the carpet from under the kernel from a concurrent > thread (or possibly ptrace) by forging an invalid > instruction after the system call has started executing, > it would be good to catch it and report it with a printk, > rather than blindly expecting that some bits should always > be 0. The mask should be 0xffe0ffff. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller 2015-12-20 13:59 ` Mathieu Desnoyers @ 2015-12-20 19:39 ` John David Anglin 2015-12-20 19:48 ` Helge Deller 2015-12-20 20:14 ` John David Anglin 2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller 3 siblings, 1 reply; 30+ messages in thread From: John David Anglin @ 2015-12-20 19:39 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-18, at 6:30 PM, Helge Deller wrote: > + /* Usually we don't have to restore %r20 (the system call number) > + * because it gets loaded in the delay slot of the branch external > + * instruction via the ldi instruction. > + * In some cases a register-to-register copy instruction might have > + * been used instead, in which case we need to copy the syscall > + * number into the source register before returning to userspace. > + */ I'm thinking it might be better to fix syscall() in glibc. The copy could be moved before ble and a nop placed delay slot. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 19:39 ` John David Anglin @ 2015-12-20 19:48 ` Helge Deller 2015-12-20 20:01 ` John David Anglin 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 19:48 UTC (permalink / raw) To: John David Anglin; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 20.12.2015 20:39, John David Anglin wrote: > On 2015-12-18, at 6:30 PM, Helge Deller wrote: > >> + /* Usually we don't have to restore %r20 (the system call number) >> + * because it gets loaded in the delay slot of the branch external >> + * instruction via the ldi instruction. >> + * In some cases a register-to-register copy instruction might have >> + * been used instead, in which case we need to copy the syscall >> + * number into the source register before returning to userspace. >> + */ > > I'm thinking it might be better to fix syscall() in glibc. The copy could be > moved before ble and a nop placed delay slot. Yes, I think it should be fixed in glibc which makes it cleaner. I looked at dietlibc. There a "nop" is being used. Nevertheless, it may happen anytime if people forget, that we will see a "copy" there again, so IMHO it's probably safer to include the workaround in kernel too. Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 19:48 ` Helge Deller @ 2015-12-20 20:01 ` John David Anglin 2015-12-20 20:18 ` Helge Deller 0 siblings, 1 reply; 30+ messages in thread From: John David Anglin @ 2015-12-20 20:01 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-20, at 2:48 PM, Helge Deller wrote: > On 20.12.2015 20:39, John David Anglin wrote: >> On 2015-12-18, at 6:30 PM, Helge Deller wrote: >> >>> + /* Usually we don't have to restore %r20 (the system call number) >>> + * because it gets loaded in the delay slot of the branch external >>> + * instruction via the ldi instruction. >>> + * In some cases a register-to-register copy instruction might have >>> + * been used instead, in which case we need to copy the syscall >>> + * number into the source register before returning to userspace. >>> + */ >> >> I'm thinking it might be better to fix syscall() in glibc. The copy could be >> moved before ble and a nop placed delay slot. > > Yes, I think it should be fixed in glibc which makes it cleaner. > I looked at dietlibc. There a "nop" is being used. A "nop" implies %r20 needs to be restored. > > Nevertheless, it may happen anytime if people forget, that we will see a > "copy" there again, so IMHO it's probably safer to include the workaround in > kernel too. The current patch assumes regs->gr[source_reg] is restored. That needs to be checked given previous comment about %r20. Essentially, all syscall clobbered registers need to be restored. Can the user space stuff be avoided by jumping to the gateway entry point? Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:01 ` John David Anglin @ 2015-12-20 20:18 ` Helge Deller 2015-12-20 20:45 ` John David Anglin 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 20:18 UTC (permalink / raw) To: John David Anglin; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 20.12.2015 21:01, John David Anglin wrote: > On 2015-12-20, at 2:48 PM, Helge Deller wrote: > >> On 20.12.2015 20:39, John David Anglin wrote: >>> On 2015-12-18, at 6:30 PM, Helge Deller wrote: >>> >>>> + /* Usually we don't have to restore %r20 (the system call number) >>>> + * because it gets loaded in the delay slot of the branch external >>>> + * instruction via the ldi instruction. >>>> + * In some cases a register-to-register copy instruction might have >>>> + * been used instead, in which case we need to copy the syscall >>>> + * number into the source register before returning to userspace. >>>> + */ >>> >>> I'm thinking it might be better to fix syscall() in glibc. The copy could be >>> moved before ble and a nop placed delay slot. >> >> Yes, I think it should be fixed in glibc which makes it cleaner. >> I looked at dietlibc. There a "nop" is being used. > > A "nop" implies %r20 needs to be restored. Yes, I tested that. It is being restored correctly, although the comments imply different behaviour. >> Nevertheless, it may happen anytime if people forget, that we will see a >> "copy" there again, so IMHO it's probably safer to include the workaround in >> kernel too. > > > The current patch assumes regs->gr[source_reg] is restored. Yes and no. The real problem we actually faced is, that the glibc() syscall function uses %r28 (aka the return value of the syscall) as "source_reg". That's the reason why we failed with "ENOSYS" in the end, because when we returned (before my patch) from the first syscall we returned -ERESTARTSYS in %r28 (which is basically correct), but then the "copy %r28,%r20" in userspace moved "-ERESTARTSYS" back into %r20, jumped into the kernel, and the kernel (correctly) reported back "ENOSYS" since there is no such syscall number with the value of "-ERESTARTSYS". The problem: %r28 is of course not one of the registers which is being restored. Quite complicated... I hope I could explain it somewhat...? > That needs to be > checked given previous comment about %r20. > > Essentially, all syscall clobbered registers need to be restored. Yes, they are (with the exception of the return value %r28). > Can the user space stuff be avoided by jumping to the gateway entry point? Probably. I didn't looked into that. Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:18 ` Helge Deller @ 2015-12-20 20:45 ` John David Anglin 0 siblings, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-20 20:45 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-20, at 3:18 PM, Helge Deller wrote: >>> Nevertheless, it may happen anytime if people forget, that we will see a >>> "copy" there again, so IMHO it's probably safer to include the workaround in >>> kernel too. >> >> >> The current patch assumes regs->gr[source_reg] is restored. > > Yes and no. > > The real problem we actually faced is, that the glibc() syscall function > uses %r28 (aka the return value of the syscall) as "source_reg". > That's the reason why we failed with "ENOSYS" in the end, because when we returned > (before my patch) from the first syscall we returned -ERESTARTSYS > in %r28 (which is basically correct), but then the "copy %r28,%r20" in > userspace moved "-ERESTARTSYS" back into %r20, jumped into the kernel, and the > kernel (correctly) reported back "ENOSYS" since there is no such syscall > number with the value of "-ERESTARTSYS". > The problem: %r28 is of course not one of the registers which is being restored. > Quite complicated... I hope I could explain it somewhat...? So, the patch doesn't work and we need to fix glibc. The only other solution is to load %r20 into %r28 and restore %r28 on a syscall restart. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller 2015-12-20 13:59 ` Mathieu Desnoyers 2015-12-20 19:39 ` John David Anglin @ 2015-12-20 20:14 ` John David Anglin 2015-12-20 20:19 ` Helge Deller 2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller 3 siblings, 1 reply; 30+ messages in thread From: John David Anglin @ 2015-12-20 20:14 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-18, at 6:30 PM, Helge Deller wrote: > + /* Get assembler opcode of code in delay branch */ > + uaddr = (unsigned int *) (regs->gr[31] + 1); Shouldn't increment be 4? Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:14 ` John David Anglin @ 2015-12-20 20:19 ` Helge Deller 2015-12-20 20:21 ` Helge Deller 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 20:19 UTC (permalink / raw) To: John David Anglin; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 20.12.2015 21:14, John David Anglin wrote: > On 2015-12-18, at 6:30 PM, Helge Deller wrote: > >> + /* Get assembler opcode of code in delay branch */ >> + uaddr = (unsigned int *) (regs->gr[31] + 1); > > Shouldn't increment be 4? No. The first address which is being executed in userspace is regs->gr[31]-3. Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:19 ` Helge Deller @ 2015-12-20 20:21 ` Helge Deller 2015-12-20 20:53 ` John David Anglin 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-20 20:21 UTC (permalink / raw) To: John David Anglin; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 20.12.2015 21:19, Helge Deller wrote: > On 20.12.2015 21:14, John David Anglin wrote: >> On 2015-12-18, at 6:30 PM, Helge Deller wrote: >> >>> + /* Get assembler opcode of code in delay branch */ >>> + uaddr = (unsigned int *) (regs->gr[31] + 1); >> >> Shouldn't increment be 4? > > No. > The first address which is being executed in userspace is regs->gr[31]-3. I could (should?) have used uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts 2015-12-20 20:21 ` Helge Deller @ 2015-12-20 20:53 ` John David Anglin 0 siblings, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-20 20:53 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-20, at 3:21 PM, Helge Deller wrote: > On 20.12.2015 21:19, Helge Deller wrote: >> On 20.12.2015 21:14, John David Anglin wrote: >>> On 2015-12-18, at 6:30 PM, Helge Deller wrote: >>> >>>> + /* Get assembler opcode of code in delay branch */ >>>> + uaddr = (unsigned int *) (regs->gr[31] + 1); >>> >>> Shouldn't increment be 4? >> >> No. >> The first address which is being executed in userspace is regs->gr[31]-3. > > I could (should?) have used > uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); Possibly that is better. The former assumes user space is at level 3. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller ` (2 preceding siblings ...) 2015-12-20 20:14 ` John David Anglin @ 2015-12-21 9:19 ` Helge Deller 2015-12-21 13:11 ` John David Anglin 2015-12-21 20:27 ` Mathieu Desnoyers 3 siblings, 2 replies; 30+ messages in thread From: Helge Deller @ 2015-12-21 9:19 UTC (permalink / raw) To: linux-parisc, James Bottomley, John David Anglin; +Cc: Mathieu Desnoyers This is version 2 of the patch: On parisc syscalls which are interrupted by signals sometimes failed to restart and instead returned -ENOSYS which in the worst case lead to userspace crashes. A similiar problem existed on MIPS and was fixed by commit e967ef02 ("MIPS: Fix restart of indirect syscalls"). On parisc the current syscall restart code assumes that all syscall callers load the syscall number in the delay slot of the ble instruction. That's how it is e.g. done in the unistd.h header file: ble 0x100(%sr2, %r0) ldi #syscall_nr, %r20 Because of that assumption the current code never restored %r20 before returning to userspace. This assumption is at least not true for code which uses the glibc syscall() function, which instead uses this syntax: ble 0x100(%sr2, %r0) copy regX, %r20 where regX depend on how the compiler optimizes the code and register usage. This patch fixes this problem by adding code to analyze how the syscall number is loaded in the delay branch and - if needed - copy the syscall number to regX prior returning to userspace for the syscall restart. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c index dc1ea79..2264f68 100644 --- a/arch/parisc/kernel/signal.c +++ b/arch/parisc/kernel/signal.c @@ -435,6 +435,55 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, int in_syscall) regs->gr[28]); } +/* + * Check how the syscall number gets loaded into %r20 within + * the delay branch in userspace and adjust as needed. + */ + +static void check_syscallno_in_delay_branch(struct pt_regs *regs) +{ + u32 opcode, source_reg; + u32 __user *uaddr; + int err; + + /* Usually we don't have to restore %r20 (the system call number) + * because it gets loaded in the delay slot of the branch external + * instruction via the ldi instruction. + * In some cases a register-to-register copy instruction might have + * been used instead, in which case we need to copy the syscall + * number into the source register before returning to userspace. + */ + + /* A syscall is just a branch, so all we have to do is fiddle the + * return pointer so that the ble instruction gets executed again. + */ + regs->gr[31] -= 8; /* delayed branching */ + + /* Get assembler opcode of code in delay branch */ + uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); + err = get_user(opcode, uaddr); + if (err) + return; + + /* Check if delay branch uses "ldi int,%r20" */ + if ((opcode & 0xffff0000) == 0x34140000) + return; /* everything ok, just return */ + + /* Check if delay branch uses "nop" */ + if (opcode == INSN_NOP) + return; + + /* Check if delay branch uses "copy %rX,%r20" */ + if ((opcode & 0xffe0ffff) == 0x08000254) { + source_reg = (opcode >> 16) & 31; + regs->gr[source_reg] = regs->gr[20]; + return; + } + + pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n", + current->comm, task_pid_nr(current), opcode); +} + static inline void syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) { @@ -457,10 +506,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) } /* fallthrough */ case -ERESTARTNOINTR: - /* A syscall is just a branch, so all - * we have to do is fiddle the return pointer. - */ - regs->gr[31] -= 8; /* delayed branching */ + check_syscallno_in_delay_branch(regs); break; } } @@ -510,15 +556,9 @@ insert_restart_trampoline(struct pt_regs *regs) } case -ERESTARTNOHAND: case -ERESTARTSYS: - case -ERESTARTNOINTR: { - /* Hooray for delayed branching. We don't - * have to restore %r20 (the system call - * number) because it gets loaded in the delay - * slot of the branch external instruction. - */ - regs->gr[31] -= 8; + case -ERESTARTNOINTR: + check_syscallno_in_delay_branch(regs); return; - } default: break; } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller @ 2015-12-21 13:11 ` John David Anglin 2015-12-21 20:27 ` Mathieu Desnoyers 1 sibling, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-21 13:11 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, Mathieu Desnoyers On 2015-12-21, at 4:19 AM, Helge Deller wrote: > On parisc the current syscall restart code assumes that all syscall > callers load the syscall number in the delay slot of the ble > instruction. That's how it is e.g. done in the unistd.h header file: > ble 0x100(%sr2, %r0) > ldi #syscall_nr, %r20 > Because of that assumption the current code never restored %r20 before > returning to userspace. Is %r20 restored for nop case? The comments in the patch suggest that it is not necessary to restore it. I'm thinking we need to review syscall clobbered register list for glibc. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller 2015-12-21 13:11 ` John David Anglin @ 2015-12-21 20:27 ` Mathieu Desnoyers 2015-12-21 20:54 ` Helge Deller 1 sibling, 1 reply; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-21 20:27 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin ----- On Dec 21, 2015, at 4:19 AM, Helge Deller deller@gmx.de wrote: > This is version 2 of the patch: > > On parisc syscalls which are interrupted by signals sometimes failed to > restart and instead returned -ENOSYS which in the worst case lead to > userspace crashes. > A similiar problem existed on MIPS and was fixed by commit e967ef02 > ("MIPS: Fix restart of indirect syscalls"). > > On parisc the current syscall restart code assumes that all syscall > callers load the syscall number in the delay slot of the ble > instruction. That's how it is e.g. done in the unistd.h header file: > ble 0x100(%sr2, %r0) > ldi #syscall_nr, %r20 > Because of that assumption the current code never restored %r20 before > returning to userspace. > > This assumption is at least not true for code which uses the glibc > syscall() function, which instead uses this syntax: > ble 0x100(%sr2, %r0) > copy regX, %r20 > where regX depend on how the compiler optimizes the code and register > usage. > > This patch fixes this problem by adding code to analyze how the syscall > number is loaded in the delay branch and - if needed - copy the syscall > number to regX prior returning to userspace for the syscall restart. > > Signed-off-by: Helge Deller <deller@gmx.de> > Cc: stable@vger.kernel.org > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c > index dc1ea79..2264f68 100644 > --- a/arch/parisc/kernel/signal.c > +++ b/arch/parisc/kernel/signal.c > @@ -435,6 +435,55 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, > int in_syscall) > regs->gr[28]); > } > > +/* > + * Check how the syscall number gets loaded into %r20 within > + * the delay branch in userspace and adjust as needed. > + */ > + > +static void check_syscallno_in_delay_branch(struct pt_regs *regs) > +{ > + u32 opcode, source_reg; > + u32 __user *uaddr; > + int err; > + > + /* Usually we don't have to restore %r20 (the system call number) > + * because it gets loaded in the delay slot of the branch external > + * instruction via the ldi instruction. > + * In some cases a register-to-register copy instruction might have > + * been used instead, in which case we need to copy the syscall > + * number into the source register before returning to userspace. > + */ > + > + /* A syscall is just a branch, so all we have to do is fiddle the > + * return pointer so that the ble instruction gets executed again. > + */ > + regs->gr[31] -= 8; /* delayed branching */ > + > + /* Get assembler opcode of code in delay branch */ > + uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); Is it valid to have unaligned instructions ? Does the architecture allow it, or it's a fumble and we should pr_warn ? > + err = get_user(opcode, uaddr); > + if (err) Should we add a pr_warn here ? > + return; > + > + /* Check if delay branch uses "ldi int,%r20" */ > + if ((opcode & 0xffff0000) == 0x34140000) > + return; /* everything ok, just return */ > + > + /* Check if delay branch uses "nop" */ > + if (opcode == INSN_NOP) > + return; When we find a NOP in the delay slot, how can we be sure %r20 still holds the syscall value when we re-play the branch instruction ? Can it be overwritten during the syscall, either from start of syscall to here, or from here to return to userspace ? > + > + /* Check if delay branch uses "copy %rX,%r20" */ > + if ((opcode & 0xffe0ffff) == 0x08000254) { > + source_reg = (opcode >> 16) & 31; > + regs->gr[source_reg] = regs->gr[20]; Similar question here, how can we be sure regs->gr[20] still has the system call number at this point (not overwritten from start of syscall to here) ? Thanks, Mathieu > + return; > + } > + > + pr_warn("syscall restart: %s (pid %d): unexpected opcode 0x%08x\n", > + current->comm, task_pid_nr(current), opcode); > +} > + > static inline void > syscall_restart(struct pt_regs *regs, struct k_sigaction *ka) > { > @@ -457,10 +506,7 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction > *ka) > } > /* fallthrough */ > case -ERESTARTNOINTR: > - /* A syscall is just a branch, so all > - * we have to do is fiddle the return pointer. > - */ > - regs->gr[31] -= 8; /* delayed branching */ > + check_syscallno_in_delay_branch(regs); > break; > } > } > @@ -510,15 +556,9 @@ insert_restart_trampoline(struct pt_regs *regs) > } > case -ERESTARTNOHAND: > case -ERESTARTSYS: > - case -ERESTARTNOINTR: { > - /* Hooray for delayed branching. We don't > - * have to restore %r20 (the system call > - * number) because it gets loaded in the delay > - * slot of the branch external instruction. > - */ > - regs->gr[31] -= 8; > + case -ERESTARTNOINTR: > + check_syscallno_in_delay_branch(regs); > return; > - } > default: > break; > } -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-21 20:27 ` Mathieu Desnoyers @ 2015-12-21 20:54 ` Helge Deller 2015-12-24 16:07 ` Mathieu Desnoyers 0 siblings, 1 reply; 30+ messages in thread From: Helge Deller @ 2015-12-21 20:54 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: linux-parisc, James Bottomley, John David Anglin On 21.12.2015 21:27, Mathieu Desnoyers wrote: > ----- On Dec 21, 2015, at 4:19 AM, Helge Deller deller@gmx.de wrote: > >> This is version 2 of the patch: >> >> On parisc syscalls which are interrupted by signals sometimes failed to >> restart and instead returned -ENOSYS which in the worst case lead to >> userspace crashes. >> A similiar problem existed on MIPS and was fixed by commit e967ef02 >> ("MIPS: Fix restart of indirect syscalls"). >> >> On parisc the current syscall restart code assumes that all syscall >> callers load the syscall number in the delay slot of the ble >> instruction. That's how it is e.g. done in the unistd.h header file: >> ble 0x100(%sr2, %r0) >> ldi #syscall_nr, %r20 >> Because of that assumption the current code never restored %r20 before >> returning to userspace. >> >> This assumption is at least not true for code which uses the glibc >> syscall() function, which instead uses this syntax: >> ble 0x100(%sr2, %r0) >> copy regX, %r20 >> where regX depend on how the compiler optimizes the code and register >> usage. >> >> This patch fixes this problem by adding code to analyze how the syscall >> number is loaded in the delay branch and - if needed - copy the syscall >> number to regX prior returning to userspace for the syscall restart. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Cc: stable@vger.kernel.org >> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> >> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c >> index dc1ea79..2264f68 100644 >> --- a/arch/parisc/kernel/signal.c >> +++ b/arch/parisc/kernel/signal.c >> @@ -435,6 +435,55 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, >> int in_syscall) >> regs->gr[28]); >> } >> >> +/* >> + * Check how the syscall number gets loaded into %r20 within >> + * the delay branch in userspace and adjust as needed. >> + */ >> + >> +static void check_syscallno_in_delay_branch(struct pt_regs *regs) >> +{ >> + u32 opcode, source_reg; >> + u32 __user *uaddr; >> + int err; >> + >> + /* Usually we don't have to restore %r20 (the system call number) >> + * because it gets loaded in the delay slot of the branch external >> + * instruction via the ldi instruction. >> + * In some cases a register-to-register copy instruction might have >> + * been used instead, in which case we need to copy the syscall >> + * number into the source register before returning to userspace. >> + */ >> + >> + /* A syscall is just a branch, so all we have to do is fiddle the >> + * return pointer so that the ble instruction gets executed again. >> + */ >> + regs->gr[31] -= 8; /* delayed branching */ >> + >> + /* Get assembler opcode of code in delay branch */ >> + uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); > > Is it valid to have unaligned instructions ? Does the architecture > allow it, or it's a fumble and we should pr_warn ? How can it be unaligned? It's about u32... And, no, unaligned instructions are not allowed (I think that at least). >> + err = get_user(opcode, uaddr); >> + if (err) > > Should we add a pr_warn here ? No. There is no gain to have a warning here. >> + return; >> + >> + /* Check if delay branch uses "ldi int,%r20" */ >> + if ((opcode & 0xffff0000) == 0x34140000) >> + return; /* everything ok, just return */ >> + >> + /* Check if delay branch uses "nop" */ >> + if (opcode == INSN_NOP) >> + return; > > When we find a NOP in the delay slot, how can we be sure %r20 > still holds the syscall value when we re-play the branch > instruction ? I looked at the code and even tested it (with your testcase actually). > Can it be overwritten during the syscall, > either from start of syscall to here, or from here to > return to userspace ? No. >> + >> + /* Check if delay branch uses "copy %rX,%r20" */ >> + if ((opcode & 0xffe0ffff) == 0x08000254) { >> + source_reg = (opcode >> 16) & 31; >> + regs->gr[source_reg] = regs->gr[20]; > > Similar question here, how can we be sure regs->gr[20] > still has the system call number at this point (not > overwritten from start of syscall to here) ? Those registers are saved at entry of syscall and restored at exit (with exception of a few registers e.g. like r28 which is the return value of the syscall). Helge ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-21 20:54 ` Helge Deller @ 2015-12-24 16:07 ` Mathieu Desnoyers 2015-12-24 16:51 ` John David Anglin 0 siblings, 1 reply; 30+ messages in thread From: Mathieu Desnoyers @ 2015-12-24 16:07 UTC (permalink / raw) To: Helge Deller; +Cc: linux-parisc, James Bottomley, John David Anglin ----- On Dec 21, 2015, at 3:54 PM, Helge Deller deller@gmx.de wrote: > On 21.12.2015 21:27, Mathieu Desnoyers wrote: >> ----- On Dec 21, 2015, at 4:19 AM, Helge Deller deller@gmx.de wrote: >> >>> This is version 2 of the patch: >>> >>> On parisc syscalls which are interrupted by signals sometimes failed to >>> restart and instead returned -ENOSYS which in the worst case lead to >>> userspace crashes. >>> A similiar problem existed on MIPS and was fixed by commit e967ef02 >>> ("MIPS: Fix restart of indirect syscalls"). >>> >>> On parisc the current syscall restart code assumes that all syscall >>> callers load the syscall number in the delay slot of the ble >>> instruction. That's how it is e.g. done in the unistd.h header file: >>> ble 0x100(%sr2, %r0) >>> ldi #syscall_nr, %r20 >>> Because of that assumption the current code never restored %r20 before >>> returning to userspace. >>> >>> This assumption is at least not true for code which uses the glibc >>> syscall() function, which instead uses this syntax: >>> ble 0x100(%sr2, %r0) >>> copy regX, %r20 >>> where regX depend on how the compiler optimizes the code and register >>> usage. >>> >>> This patch fixes this problem by adding code to analyze how the syscall >>> number is loaded in the delay branch and - if needed - copy the syscall >>> number to regX prior returning to userspace for the syscall restart. >>> >>> Signed-off-by: Helge Deller <deller@gmx.de> >>> Cc: stable@vger.kernel.org >>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>> >>> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c >>> index dc1ea79..2264f68 100644 >>> --- a/arch/parisc/kernel/signal.c >>> +++ b/arch/parisc/kernel/signal.c >>> @@ -435,6 +435,55 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs, >>> int in_syscall) >>> regs->gr[28]); >>> } >>> >>> +/* >>> + * Check how the syscall number gets loaded into %r20 within >>> + * the delay branch in userspace and adjust as needed. >>> + */ >>> + >>> +static void check_syscallno_in_delay_branch(struct pt_regs *regs) >>> +{ >>> + u32 opcode, source_reg; >>> + u32 __user *uaddr; >>> + int err; >>> + >>> + /* Usually we don't have to restore %r20 (the system call number) >>> + * because it gets loaded in the delay slot of the branch external >>> + * instruction via the ldi instruction. >>> + * In some cases a register-to-register copy instruction might have >>> + * been used instead, in which case we need to copy the syscall >>> + * number into the source register before returning to userspace. >>> + */ >>> + >>> + /* A syscall is just a branch, so all we have to do is fiddle the >>> + * return pointer so that the ble instruction gets executed again. >>> + */ >>> + regs->gr[31] -= 8; /* delayed branching */ >>> + >>> + /* Get assembler opcode of code in delay branch */ >>> + uaddr = (unsigned int *) ((regs->gr[31] & ~3) + 4); >> >> Is it valid to have unaligned instructions ? Does the architecture >> allow it, or it's a fumble and we should pr_warn ? > > How can it be unaligned? It's about u32... That would be an instruction that is volountarily offset from 1, 2, 3 bytes from 4-bytes multiples by the application. The only situation where I have seen this is in cases where applications are trying to play games with the debugger or disassembler and hide what they are doing: they can offset the start of a function like this, and therefore all the instructions within that function. > And, no, unaligned instructions are not allowed (I think that at least). Might be interesting to try it out though. I'm not saying it's a valid use-case for an application, but it would be at least good to known whether this is an input we can expect. > >>> + err = get_user(opcode, uaddr); >>> + if (err) >> >> Should we add a pr_warn here ? > > No. There is no gain to have a warning here. Allright. > >>> + return; >>> + >>> + /* Check if delay branch uses "ldi int,%r20" */ >>> + if ((opcode & 0xffff0000) == 0x34140000) >>> + return; /* everything ok, just return */ >>> + >>> + /* Check if delay branch uses "nop" */ >>> + if (opcode == INSN_NOP) >>> + return; >> >> When we find a NOP in the delay slot, how can we be sure %r20 >> still holds the syscall value when we re-play the branch >> instruction ? > > I looked at the code and even tested it (with your testcase actually). > >> Can it be overwritten during the syscall, >> either from start of syscall to here, or from here to >> return to userspace ? > > No. > >>> + >>> + /* Check if delay branch uses "copy %rX,%r20" */ >>> + if ((opcode & 0xffe0ffff) == 0x08000254) { >>> + source_reg = (opcode >> 16) & 31; >>> + regs->gr[source_reg] = regs->gr[20]; >> >> Similar question here, how can we be sure regs->gr[20] >> still has the system call number at this point (not >> overwritten from start of syscall to here) ? > > Those registers are saved at entry of syscall and > restored at exit (with exception of a few registers > e.g. like r28 which is the return value of the syscall). Can r28 be used as a source_reg ? If so, what happens then ? Thanks, Mathieu > > Helge -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] parisc: Fix syscall restarts (v2) 2015-12-24 16:07 ` Mathieu Desnoyers @ 2015-12-24 16:51 ` John David Anglin 0 siblings, 0 replies; 30+ messages in thread From: John David Anglin @ 2015-12-24 16:51 UTC (permalink / raw) To: Mathieu Desnoyers, Helge Deller; +Cc: linux-parisc, James Bottomley On 2015-12-24 11:07 AM, Mathieu Desnoyers wrote: >>> Is it valid to have unaligned instructions ? Does the architecture >>> >>allow it, or it's a fumble and we should pr_warn ? >> > >> >How can it be unaligned? It's about u32... > That would be an instruction that is volountarily offset > from 1, 2, 3 bytes from 4-bytes multiples by the application. > The only situation where I have seen this is in cases where > applications are trying to play games with the debugger or > disassembler and hide what they are doing: they can offset > the start of a function like this, and therefore all the > instructions within that function. > This is not possible on PA-RISC. Indeed, user instruction addresses are always offset by three. There is no way to branch to an instruction that is offset. The least significant two bits of an instruction address contain a priority level. User code on linux and hpux executes at level 3. The only way a user can change privilege level is with a "gate" instruction (a special branch). Whether this is permitted or not depends on page table permissions that the user can't change. In practice, a level change is only allowed on the gateway page. Dave -- John David Anglin dave.anglin@bell.net ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2015-12-24 16:51 UTC | newest] Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-12-18 23:30 [PATCH] parisc: Fix syscall restarts Helge Deller 2015-12-20 13:59 ` Mathieu Desnoyers 2015-12-20 14:09 ` Mathieu Desnoyers 2015-12-20 15:49 ` Helge Deller 2015-12-20 16:50 ` James Bottomley 2015-12-20 20:35 ` Helge Deller 2015-12-21 8:03 ` James Bottomley 2015-12-21 14:39 ` Mathieu Desnoyers 2015-12-20 18:31 ` John David Anglin 2015-12-20 19:32 ` Helge Deller 2015-12-20 19:46 ` John David Anglin 2015-12-20 20:06 ` Helge Deller 2015-12-20 23:57 ` John David Anglin 2015-12-21 14:42 ` Mathieu Desnoyers 2015-12-21 15:12 ` John David Anglin 2015-12-20 19:39 ` John David Anglin 2015-12-20 19:48 ` Helge Deller 2015-12-20 20:01 ` John David Anglin 2015-12-20 20:18 ` Helge Deller 2015-12-20 20:45 ` John David Anglin 2015-12-20 20:14 ` John David Anglin 2015-12-20 20:19 ` Helge Deller 2015-12-20 20:21 ` Helge Deller 2015-12-20 20:53 ` John David Anglin 2015-12-21 9:19 ` [PATCH] parisc: Fix syscall restarts (v2) Helge Deller 2015-12-21 13:11 ` John David Anglin 2015-12-21 20:27 ` Mathieu Desnoyers 2015-12-21 20:54 ` Helge Deller 2015-12-24 16:07 ` Mathieu Desnoyers 2015-12-24 16:51 ` John David Anglin
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.