* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 7:42 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 7:42 UTC (permalink / raw) To: linux-arm-kernel; +Cc: Catalin Marinas, Colin Cross, Russell King, linux-kernel The exception handler in entry-armv.S checks for thumb mode and correctly determines the exception location and instruction, but VFP_bounce uses the uncorrected location off the stack. If the VFP exception occured in Thumb mode, fix up the exception location to match the value that would be returned in ARM mode. Fixes segfaults in userspace applications running in Thumb mode caused by a handled VFP exception returning to the middle of the instruction that triggered the exception. Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba Signed-off-by: Colin Cross <ccross@android.com> --- arch/arm/vfp/vfpmodule.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 0797cb5..63ed73d 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); /* + * If the exception occured in thumb mode, pc is exception location + 2, + * the middle of the 32-bit VFP instruction. Add 2 to get exception + * location + 4, the same we get in ARM mode. + */ +#ifdef CONFIG_ARM_THUMB + if (regs->ARM_cpsr & PSR_T_BIT) + regs->ARM_pc += 2; +#endif + + /* * At this point, FPEXC can have the following configuration: * * EX DEX IXE -- 1.7.3.1 ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 7:42 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 7:42 UTC (permalink / raw) To: linux-arm-kernel The exception handler in entry-armv.S checks for thumb mode and correctly determines the exception location and instruction, but VFP_bounce uses the uncorrected location off the stack. If the VFP exception occured in Thumb mode, fix up the exception location to match the value that would be returned in ARM mode. Fixes segfaults in userspace applications running in Thumb mode caused by a handled VFP exception returning to the middle of the instruction that triggered the exception. Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba Signed-off-by: Colin Cross <ccross@android.com> --- arch/arm/vfp/vfpmodule.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 0797cb5..63ed73d 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); /* + * If the exception occured in thumb mode, pc is exception location + 2, + * the middle of the 32-bit VFP instruction. Add 2 to get exception + * location + 4, the same we get in ARM mode. + */ +#ifdef CONFIG_ARM_THUMB + if (regs->ARM_cpsr & PSR_T_BIT) + regs->ARM_pc += 2; +#endif + + /* * At this point, FPEXC can have the following configuration: * * EX DEX IXE -- 1.7.3.1 ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 7:42 ` Colin Cross @ 2011-01-14 11:43 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 11:43 UTC (permalink / raw) To: Colin Cross; +Cc: linux-arm-kernel, Russell King, linux-kernel On 14 January 2011 07:42, Colin Cross <ccross@android.com> wrote: > The exception handler in entry-armv.S checks for thumb mode and > correctly determines the exception location and instruction, > but VFP_bounce uses the uncorrected location off the stack. > If the VFP exception occured in Thumb mode, fix up the > exception location to match the value that would be returned > in ARM mode. > > Fixes segfaults in userspace applications running in Thumb mode > caused by a handled VFP exception returning to the middle of the > instruction that triggered the exception. > > Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba > Signed-off-by: Colin Cross <ccross@android.com> > --- > arch/arm/vfp/vfpmodule.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c > index 0797cb5..63ed73d 100644 > --- a/arch/arm/vfp/vfpmodule.c > +++ b/arch/arm/vfp/vfpmodule.c > @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) > pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > /* > + * If the exception occured in thumb mode, pc is exception location + 2, > + * the middle of the 32-bit VFP instruction. Add 2 to get exception > + * location + 4, the same we get in ARM mode. > + */ > +#ifdef CONFIG_ARM_THUMB > + if (regs->ARM_cpsr & PSR_T_BIT) > + regs->ARM_pc += 2; > +#endif You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 11:43 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 11:43 UTC (permalink / raw) To: linux-arm-kernel On 14 January 2011 07:42, Colin Cross <ccross@android.com> wrote: > The exception handler in entry-armv.S checks for thumb mode and > correctly determines the exception location and instruction, > but VFP_bounce uses the uncorrected location off the stack. > If the VFP exception occured in Thumb mode, fix up the > exception location to match the value that would be returned > in ARM mode. > > Fixes segfaults in userspace applications running in Thumb mode > caused by a handled VFP exception returning to the middle of the > instruction that triggered the exception. > > Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba > Signed-off-by: Colin Cross <ccross@android.com> > --- > ?arch/arm/vfp/vfpmodule.c | ? 10 ++++++++++ > ?1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c > index 0797cb5..63ed73d 100644 > --- a/arch/arm/vfp/vfpmodule.c > +++ b/arch/arm/vfp/vfpmodule.c > @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) > ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > ? ? ? ?/* > + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2, > + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception > + ? ? ? ?* location + 4, the same we get in ARM mode. > + ? ? ? ?*/ > +#ifdef CONFIG_ARM_THUMB > + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT) > + ? ? ? ? ? ? ? regs->ARM_pc += 2; > +#endif You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 11:43 ` Catalin Marinas @ 2011-01-14 12:02 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 12:02 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: > > pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > > > /* > > + * If the exception occured in thumb mode, pc is exception location + 2, > > + * the middle of the 32-bit VFP instruction. Add 2 to get exception > > + * location + 4, the same we get in ARM mode. > > + */ > > +#ifdef CONFIG_ARM_THUMB > > + if (regs->ARM_cpsr & PSR_T_BIT) > > + regs->ARM_pc += 2; > > +#endif > > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. I don't think this is correct. On entry to the undefined instruction handler, we get the uncorrected PC value, so PC points to the instruction after the faulting instruction. If it was an ARM instruction, that is located at PC-4. If it was a Thumb instruction, it is located at PC-2. This PC value is passed unmodified to the VFP entry code, and the passed r2 reflect the value in regs->ARM_pc. The VFP entry assembly doesn't touch the PC value, except when it wants to retry an instruction: sub r2, r2, #4 str r2, [sp, #S_PC] @ retry the instruction So I think that 2 to the PC when in thumb mode is incorrect, as that'll cause us to skip the instruction following the faulted one. I think that the undefined instruction handling needs reworking for Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb instruction, and we have no way of knowing without repeatedly decoding that instruction. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 12:02 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 12:02 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: > > ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > > > ? ? ? ?/* > > + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2, > > + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception > > + ? ? ? ?* location + 4, the same we get in ARM mode. > > + ? ? ? ?*/ > > +#ifdef CONFIG_ARM_THUMB > > + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT) > > + ? ? ? ? ? ? ? regs->ARM_pc += 2; > > +#endif > > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. I don't think this is correct. On entry to the undefined instruction handler, we get the uncorrected PC value, so PC points to the instruction after the faulting instruction. If it was an ARM instruction, that is located at PC-4. If it was a Thumb instruction, it is located at PC-2. This PC value is passed unmodified to the VFP entry code, and the passed r2 reflect the value in regs->ARM_pc. The VFP entry assembly doesn't touch the PC value, except when it wants to retry an instruction: sub r2, r2, #4 str r2, [sp, #S_PC] @ retry the instruction So I think that 2 to the PC when in thumb mode is incorrect, as that'll cause us to skip the instruction following the faulted one. I think that the undefined instruction handling needs reworking for Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb instruction, and we have no way of knowing without repeatedly decoding that instruction. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 12:02 ` Russell King - ARM Linux @ 2011-01-14 14:10 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 14:10 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: > > > pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > > > > > /* > > > + * If the exception occured in thumb mode, pc is exception location + 2, > > > + * the middle of the 32-bit VFP instruction. Add 2 to get exception > > > + * location + 4, the same we get in ARM mode. > > > + */ > > > +#ifdef CONFIG_ARM_THUMB > > > + if (regs->ARM_cpsr & PSR_T_BIT) > > > + regs->ARM_pc += 2; > > > +#endif > > > > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. > > I don't think this is correct. On entry to the undefined instruction > handler, we get the uncorrected PC value, so PC points to the > instruction after the faulting instruction. > > If it was an ARM instruction, that is located at PC-4. If it was a > Thumb instruction, it is located at PC-2. This PC value is passed > unmodified to the VFP entry code, and the passed r2 reflect the > value in regs->ARM_pc. The entry-armv.S code adds 2 to the r2 register in case of a 32-bit Thumb instruction, so it is no longer the same as the ARM_pc. Since the VFP instructions in Thumb mode are always 32-bit, Colin's patch made sense to me. > I think that the undefined instruction handling needs reworking for > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > instruction, and we have no way of knowing without repeatedly > decoding that instruction. We already handle the r2 for in __und_usr. We don't deal with ARM_pc but we could either do it in __und_usr or let the code handling the undef fix it up. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 14:10 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 14:10 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: > > > pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); > > > > > > /* > > > + * If the exception occured in thumb mode, pc is exception location + 2, > > > + * the middle of the 32-bit VFP instruction. Add 2 to get exception > > > + * location + 4, the same we get in ARM mode. > > > + */ > > > +#ifdef CONFIG_ARM_THUMB > > > + if (regs->ARM_cpsr & PSR_T_BIT) > > > + regs->ARM_pc += 2; > > > +#endif > > > > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. > > I don't think this is correct. On entry to the undefined instruction > handler, we get the uncorrected PC value, so PC points to the > instruction after the faulting instruction. > > If it was an ARM instruction, that is located at PC-4. If it was a > Thumb instruction, it is located at PC-2. This PC value is passed > unmodified to the VFP entry code, and the passed r2 reflect the > value in regs->ARM_pc. The entry-armv.S code adds 2 to the r2 register in case of a 32-bit Thumb instruction, so it is no longer the same as the ARM_pc. Since the VFP instructions in Thumb mode are always 32-bit, Colin's patch made sense to me. > I think that the undefined instruction handling needs reworking for > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > instruction, and we have no way of knowing without repeatedly > decoding that instruction. We already handle the r2 for in __und_usr. We don't deal with ARM_pc but we could either do it in __und_usr or let the code handling the undef fix it up. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 14:10 ` Catalin Marinas @ 2011-01-14 15:49 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 15:49 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > I don't think this is correct. On entry to the undefined instruction > > handler, we get the uncorrected PC value, so PC points to the > > instruction after the faulting instruction. > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > Thumb instruction, it is located at PC-2. This PC value is passed > > unmodified to the VFP entry code, and the passed r2 reflect the > > value in regs->ARM_pc. > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > Thumb instruction, so it is no longer the same as the ARM_pc. That's something that should be fixed - the entry conditions should be the same irrespective of thumb or arm encoding. > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > patch made sense to me. I looked up the VADD instruction in the ARM ARM. It has both a 16-bit and 32-bit encoding. > > I think that the undefined instruction handling needs reworking for > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > > instruction, and we have no way of knowing without repeatedly > > decoding that instruction. > > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but > we could either do it in __und_usr or let the code handling the undef > fix it up. At the moment its just confusing as things stand, as some things are changed in one place and not the other. Let's kill the pointless addition of 2 in the undefined instruction handler so that in every case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc as per the ARM ARM undefined exception entry LR. Undefined instruction exception handlers can then rely on the meaning of both of these. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 15:49 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 15:49 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > I don't think this is correct. On entry to the undefined instruction > > handler, we get the uncorrected PC value, so PC points to the > > instruction after the faulting instruction. > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > Thumb instruction, it is located at PC-2. This PC value is passed > > unmodified to the VFP entry code, and the passed r2 reflect the > > value in regs->ARM_pc. > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > Thumb instruction, so it is no longer the same as the ARM_pc. That's something that should be fixed - the entry conditions should be the same irrespective of thumb or arm encoding. > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > patch made sense to me. I looked up the VADD instruction in the ARM ARM. It has both a 16-bit and 32-bit encoding. > > I think that the undefined instruction handling needs reworking for > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > > instruction, and we have no way of knowing without repeatedly > > decoding that instruction. > > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but > we could either do it in __und_usr or let the code handling the undef > fix it up. At the moment its just confusing as things stand, as some things are changed in one place and not the other. Let's kill the pointless addition of 2 in the undefined instruction handler so that in every case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc as per the ARM ARM undefined exception entry LR. Undefined instruction exception handlers can then rely on the meaning of both of these. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 15:49 ` Russell King - ARM Linux @ 2011-01-14 16:23 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 16:23 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > I don't think this is correct. On entry to the undefined instruction > > > handler, we get the uncorrected PC value, so PC points to the > > > instruction after the faulting instruction. > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > value in regs->ARM_pc. > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > Thumb instruction, so it is no longer the same as the ARM_pc. > > That's something that should be fixed - the entry conditions should be > the same irrespective of thumb or arm encoding. But in this case you have to fix the vfphw.S code to check for Thumb and subtract 2 rather than 4 from r2. > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > > patch made sense to me. > > I looked up the VADD instruction in the ARM ARM. It has both a 16-bit > and 32-bit encoding. Are you sure? The Thumb encoding is made up of two 16-bit values but it is still 32-bit in total. > > > I think that the undefined instruction handling needs reworking for > > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > > > instruction, and we have no way of knowing without repeatedly > > > decoding that instruction. > > > > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but > > we could either do it in __und_usr or let the code handling the undef > > fix it up. > > At the moment its just confusing as things stand, as some things are > changed in one place and not the other. Let's kill the pointless > addition of 2 in the undefined instruction handler so that in every > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > as per the ARM ARM undefined exception entry LR. > > Undefined instruction exception handlers can then rely on the meaning > of both of these. That's an alternative, though we may end up with checking the encoding twice. The Undef handler already reads the instruction opcode and it needs to know whether it is a 16 or a 32-bit wide instruction. But I agree that the current implementation is a bit confusing. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 16:23 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 16:23 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > I don't think this is correct. On entry to the undefined instruction > > > handler, we get the uncorrected PC value, so PC points to the > > > instruction after the faulting instruction. > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > value in regs->ARM_pc. > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > Thumb instruction, so it is no longer the same as the ARM_pc. > > That's something that should be fixed - the entry conditions should be > the same irrespective of thumb or arm encoding. But in this case you have to fix the vfphw.S code to check for Thumb and subtract 2 rather than 4 from r2. > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > > patch made sense to me. > > I looked up the VADD instruction in the ARM ARM. It has both a 16-bit > and 32-bit encoding. Are you sure? The Thumb encoding is made up of two 16-bit values but it is still 32-bit in total. > > > I think that the undefined instruction handling needs reworking for > > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb > > > instruction, and we have no way of knowing without repeatedly > > > decoding that instruction. > > > > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but > > we could either do it in __und_usr or let the code handling the undef > > fix it up. > > At the moment its just confusing as things stand, as some things are > changed in one place and not the other. Let's kill the pointless > addition of 2 in the undefined instruction handler so that in every > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > as per the ARM ARM undefined exception entry LR. > > Undefined instruction exception handlers can then rely on the meaning > of both of these. That's an alternative, though we may end up with checking the encoding twice. The Undef handler already reads the instruction opcode and it needs to know whether it is a 16 or a 32-bit wide instruction. But I agree that the current implementation is a bit confusing. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 16:23 ` Catalin Marinas @ 2011-01-14 16:35 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 16:35 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote: > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > > I don't think this is correct. On entry to the undefined instruction > > > > handler, we get the uncorrected PC value, so PC points to the > > > > instruction after the faulting instruction. > > > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > > value in regs->ARM_pc. > > > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > > Thumb instruction, so it is no longer the same as the ARM_pc. > > > > That's something that should be fixed - the entry conditions should be > > the same irrespective of thumb or arm encoding. > > But in this case you have to fix the vfphw.S code to check for Thumb and > subtract 2 rather than 4 from r2. So is this right for Thumb? Or does it need to be 2 for thumb and 4 for ARM? Maybe it needs documenting to say why 4 is always correct (if that is the case). check_for_exception: tst r1, #FPEXC_EX bne process_exception @ might as well handle the pending @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff VFPFMXR FPEXC, r1 @ restore FPEXC last sub r2, r2, #4 str r2, [sp, #S_PC] @ retry the instruction > > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > > > patch made sense to me. > > > > I looked up the VADD instruction in the ARM ARM. It has both a 16-bit > > and 32-bit encoding. > > Are you sure? The Thumb encoding is made up of two 16-bit values but it > is still 32-bit in total. No, I'm not sure - it looks like it is made up from two 16-bit instructions. > > At the moment its just confusing as things stand, as some things are > > changed in one place and not the other. Let's kill the pointless > > addition of 2 in the undefined instruction handler so that in every > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > > as per the ARM ARM undefined exception entry LR. > > > > Undefined instruction exception handlers can then rely on the meaning > > of both of these. > > That's an alternative, though we may end up with checking the encoding > twice. The Undef handler already reads the instruction opcode and it > needs to know whether it is a 16 or a 32-bit wide instruction. At the moment we add 2 in one place, take off 4 in another, and now we're going to add 2 in a completely different place. This is insane. It's a big mess, one which it's impossible to tell if anything is correct or even easy to follow what's going on. I don't really care what it's replaced with provided its replaced by something sane, easy to follow and obviously correct. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 16:35 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 16:35 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote: > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > > I don't think this is correct. On entry to the undefined instruction > > > > handler, we get the uncorrected PC value, so PC points to the > > > > instruction after the faulting instruction. > > > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > > value in regs->ARM_pc. > > > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > > Thumb instruction, so it is no longer the same as the ARM_pc. > > > > That's something that should be fixed - the entry conditions should be > > the same irrespective of thumb or arm encoding. > > But in this case you have to fix the vfphw.S code to check for Thumb and > subtract 2 rather than 4 from r2. So is this right for Thumb? Or does it need to be 2 for thumb and 4 for ARM? Maybe it needs documenting to say why 4 is always correct (if that is the case). check_for_exception: tst r1, #FPEXC_EX bne process_exception @ might as well handle the pending @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff VFPFMXR FPEXC, r1 @ restore FPEXC last sub r2, r2, #4 str r2, [sp, #S_PC] @ retry the instruction > > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > > > patch made sense to me. > > > > I looked up the VADD instruction in the ARM ARM. It has both a 16-bit > > and 32-bit encoding. > > Are you sure? The Thumb encoding is made up of two 16-bit values but it > is still 32-bit in total. No, I'm not sure - it looks like it is made up from two 16-bit instructions. > > At the moment its just confusing as things stand, as some things are > > changed in one place and not the other. Let's kill the pointless > > addition of 2 in the undefined instruction handler so that in every > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > > as per the ARM ARM undefined exception entry LR. > > > > Undefined instruction exception handlers can then rely on the meaning > > of both of these. > > That's an alternative, though we may end up with checking the encoding > twice. The Undef handler already reads the instruction opcode and it > needs to know whether it is a 16 or a 32-bit wide instruction. At the moment we add 2 in one place, take off 4 in another, and now we're going to add 2 in a completely different place. This is insane. It's a big mess, one which it's impossible to tell if anything is correct or even easy to follow what's going on. I don't really care what it's replaced with provided its replaced by something sane, easy to follow and obviously correct. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 16:35 ` Russell King - ARM Linux @ 2011-01-14 16:58 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 16:58 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, 2011-01-14 at 16:35 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote: > > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > > > I don't think this is correct. On entry to the undefined instruction > > > > > handler, we get the uncorrected PC value, so PC points to the > > > > > instruction after the faulting instruction. > > > > > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > > > value in regs->ARM_pc. > > > > > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > > > Thumb instruction, so it is no longer the same as the ARM_pc. > > > > > > That's something that should be fixed - the entry conditions should be > > > the same irrespective of thumb or arm encoding. > > > > But in this case you have to fix the vfphw.S code to check for Thumb and > > subtract 2 rather than 4 from r2. > > So is this right for Thumb? Or does it need to be 2 for thumb and 4 for > ARM? Maybe it needs documenting to say why 4 is always correct (if that > is the case). > > check_for_exception: > tst r1, #FPEXC_EX > bne process_exception @ might as well handle the pending > @ exception before retrying branch > @ out before setting an FPEXC that > @ stops us reading stuff > VFPFMXR FPEXC, r1 @ restore FPEXC last > sub r2, r2, #4 > str r2, [sp, #S_PC] @ retry the instruction If we don't touch r2 in __und_usr, than in vfphw.S we would need to subtract 2 for Thumb and 4 for ARM. But since we did +2 in __und_usr, we always subtracted 4 here (confusingly though). > > > At the moment its just confusing as things stand, as some things are > > > changed in one place and not the other. Let's kill the pointless > > > addition of 2 in the undefined instruction handler so that in every > > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > > > as per the ARM ARM undefined exception entry LR. > > > > > > Undefined instruction exception handlers can then rely on the meaning > > > of both of these. > > > > That's an alternative, though we may end up with checking the encoding > > twice. The Undef handler already reads the instruction opcode and it > > needs to know whether it is a 16 or a 32-bit wide instruction. > > At the moment we add 2 in one place, take off 4 in another, and now > we're going to add 2 in a completely different place. This is insane. > It's a big mess, one which it's impossible to tell if anything is > correct or even easy to follow what's going on. I agree, this code needs some clean-up. Maybe for Undef we could unify the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the breakpoint code, I haven't checked). Otherwise just let the code handling the undef deal with the ARM/Thumb difference. For SVC, it makes sense to have different offsets as we always return to the next instruction. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 16:58 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-14 16:58 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2011-01-14 at 16:35 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote: > > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote: > > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote: > > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: > > > > > I don't think this is correct. On entry to the undefined instruction > > > > > handler, we get the uncorrected PC value, so PC points to the > > > > > instruction after the faulting instruction. > > > > > > > > > > If it was an ARM instruction, that is located at PC-4. If it was a > > > > > Thumb instruction, it is located at PC-2. This PC value is passed > > > > > unmodified to the VFP entry code, and the passed r2 reflect the > > > > > value in regs->ARM_pc. > > > > > > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > > > > Thumb instruction, so it is no longer the same as the ARM_pc. > > > > > > That's something that should be fixed - the entry conditions should be > > > the same irrespective of thumb or arm encoding. > > > > But in this case you have to fix the vfphw.S code to check for Thumb and > > subtract 2 rather than 4 from r2. > > So is this right for Thumb? Or does it need to be 2 for thumb and 4 for > ARM? Maybe it needs documenting to say why 4 is always correct (if that > is the case). > > check_for_exception: > tst r1, #FPEXC_EX > bne process_exception @ might as well handle the pending > @ exception before retrying branch > @ out before setting an FPEXC that > @ stops us reading stuff > VFPFMXR FPEXC, r1 @ restore FPEXC last > sub r2, r2, #4 > str r2, [sp, #S_PC] @ retry the instruction If we don't touch r2 in __und_usr, than in vfphw.S we would need to subtract 2 for Thumb and 4 for ARM. But since we did +2 in __und_usr, we always subtracted 4 here (confusingly though). > > > At the moment its just confusing as things stand, as some things are > > > changed in one place and not the other. Let's kill the pointless > > > addition of 2 in the undefined instruction handler so that in every > > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc > > > as per the ARM ARM undefined exception entry LR. > > > > > > Undefined instruction exception handlers can then rely on the meaning > > > of both of these. > > > > That's an alternative, though we may end up with checking the encoding > > twice. The Undef handler already reads the instruction opcode and it > > needs to know whether it is a 16 or a 32-bit wide instruction. > > At the moment we add 2 in one place, take off 4 in another, and now > we're going to add 2 in a completely different place. This is insane. > It's a big mess, one which it's impossible to tell if anything is > correct or even easy to follow what's going on. I agree, this code needs some clean-up. Maybe for Undef we could unify the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the breakpoint code, I haven't checked). Otherwise just let the code handling the undef deal with the ARM/Thumb difference. For SVC, it makes sense to have different offsets as we always return to the next instruction. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 16:58 ` Catalin Marinas @ 2011-01-14 17:30 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 17:30 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > I agree, this code needs some clean-up. Maybe for Undef we could unify > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > breakpoint code, I haven't checked). > > Otherwise just let the code handling the undef deal with the ARM/Thumb > difference. For SVC, it makes sense to have different offsets as we > always return to the next instruction. I think it just needs better documentation. Having been through all this, there _are_ bugs lurking in the code exactly because of this randomness with what PC value is means what. When the VFP support code tests the state of the VFP hardware during boot, it sets the VFP handler to point at vfp_testing_entry, bypassing the normal VFP handling code, and executes a VFP instruction. If this VFP instruction faults (eg, because there is no VFP hardware present or we're not permitted to use it), it could end up resuming execution in the middle of the 16-bit paired instruction because regs->ARM_pc points in the middle of it. So vfp_testing_entry should at least store r2 into regs->ARM_pc to guarantee resuming at the following instruction. So maybe the right answer is to store r2 into regs->ARM_pc in process_exception in the VFP assembly code too? Or maybe we should just make it unconditional that whenever we have an undefined instruction exception, the regs->ARM_pc value will always be set for resuming execution after the faulted instruction. That makes it consistent with r2 throughout the code in every case. diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the first 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,14 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,12 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction - if Thumb + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, + @ else it's one 32-bit instruction, so + @ always subtract 4 from the following + @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 17:30 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 17:30 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > I agree, this code needs some clean-up. Maybe for Undef we could unify > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > breakpoint code, I haven't checked). > > Otherwise just let the code handling the undef deal with the ARM/Thumb > difference. For SVC, it makes sense to have different offsets as we > always return to the next instruction. I think it just needs better documentation. Having been through all this, there _are_ bugs lurking in the code exactly because of this randomness with what PC value is means what. When the VFP support code tests the state of the VFP hardware during boot, it sets the VFP handler to point at vfp_testing_entry, bypassing the normal VFP handling code, and executes a VFP instruction. If this VFP instruction faults (eg, because there is no VFP hardware present or we're not permitted to use it), it could end up resuming execution in the middle of the 16-bit paired instruction because regs->ARM_pc points in the middle of it. So vfp_testing_entry should at least store r2 into regs->ARM_pc to guarantee resuming at the following instruction. So maybe the right answer is to store r2 into regs->ARM_pc in process_exception in the VFP assembly code too? Or maybe we should just make it unconditional that whenever we have an undefined instruction exception, the regs->ARM_pc value will always be set for resuming execution after the faulted instruction. That makes it consistent with r2 throughout the code in every case. diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the first 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,14 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,12 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction - if Thumb + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, + @ else it's one 32-bit instruction, so + @ always subtract 4 from the following + @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 17:30 ` Russell King - ARM Linux @ 2011-01-14 18:47 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 18:47 UTC (permalink / raw) To: Catalin Marinas; +Cc: linux-kernel, linux-arm-kernel, Colin Cross On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > > I agree, this code needs some clean-up. Maybe for Undef we could unify > > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > > breakpoint code, I haven't checked). > > > > Otherwise just let the code handling the undef deal with the ARM/Thumb > > difference. For SVC, it makes sense to have different offsets as we > > always return to the next instruction. > > I think it just needs better documentation. > > Having been through all this, there _are_ bugs lurking in the code exactly > because of this randomness with what PC value is means what. > > When the VFP support code tests the state of the VFP hardware during boot, > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > VFP handling code, and executes a VFP instruction. > > If this VFP instruction faults (eg, because there is no VFP hardware > present or we're not permitted to use it), it could end up resuming > execution in the middle of the 16-bit paired instruction because > regs->ARM_pc points in the middle of it. > > So vfp_testing_entry should at least store r2 into regs->ARM_pc to > guarantee resuming at the following instruction. > > So maybe the right answer is to store r2 into regs->ARM_pc in > process_exception in the VFP assembly code too? > > Or maybe we should just make it unconditional that whenever we have an > undefined instruction exception, the regs->ARM_pc value will always be > set for resuming execution after the faulted instruction. That makes > it consistent with r2 throughout the code in every case. So... this incrementally on top of the previous patch (which I've reproduced below as there's a subtle comment change in there wrt IRQ state.) This means we have consistent state - both r2 and regs->ARM_pc always point to the next instruction to be executed in every case, which means its easy to understand and remember while reading through the code. diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S --- b/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -499,10 +499,11 @@ blo __und_usr_unknown 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 - orr r0, r0, r5, lsl #16 + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update + orr r0, r0, r5, lsl #16 @ regs->ARM_pc @ @ r0 = the two 16-bit Thumb instructions which caused the exception - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) @ r4 = PC value for the first 16-bit Thumb instruction @ #else 8<-x-x- diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the first 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,15 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ IRQs disabled. +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,12 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction - if Thumb + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, + @ else it's one 32-bit instruction, so + @ always subtract 4 from the following + @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 18:47 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 18:47 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote: > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > > I agree, this code needs some clean-up. Maybe for Undef we could unify > > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > > breakpoint code, I haven't checked). > > > > Otherwise just let the code handling the undef deal with the ARM/Thumb > > difference. For SVC, it makes sense to have different offsets as we > > always return to the next instruction. > > I think it just needs better documentation. > > Having been through all this, there _are_ bugs lurking in the code exactly > because of this randomness with what PC value is means what. > > When the VFP support code tests the state of the VFP hardware during boot, > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > VFP handling code, and executes a VFP instruction. > > If this VFP instruction faults (eg, because there is no VFP hardware > present or we're not permitted to use it), it could end up resuming > execution in the middle of the 16-bit paired instruction because > regs->ARM_pc points in the middle of it. > > So vfp_testing_entry should at least store r2 into regs->ARM_pc to > guarantee resuming at the following instruction. > > So maybe the right answer is to store r2 into regs->ARM_pc in > process_exception in the VFP assembly code too? > > Or maybe we should just make it unconditional that whenever we have an > undefined instruction exception, the regs->ARM_pc value will always be > set for resuming execution after the faulted instruction. That makes > it consistent with r2 throughout the code in every case. So... this incrementally on top of the previous patch (which I've reproduced below as there's a subtle comment change in there wrt IRQ state.) This means we have consistent state - both r2 and regs->ARM_pc always point to the next instruction to be executed in every case, which means its easy to understand and remember while reading through the code. diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S --- b/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -499,10 +499,11 @@ blo __und_usr_unknown 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 - orr r0, r0, r5, lsl #16 + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update + orr r0, r0, r5, lsl #16 @ regs->ARM_pc @ @ r0 = the two 16-bit Thumb instructions which caused the exception - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) @ r4 = PC value for the first 16-bit Thumb instruction @ #else 8<-x-x- diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the first 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,15 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ IRQs disabled. +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,12 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction - if Thumb + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, + @ else it's one 32-bit instruction, so + @ always subtract 4 from the following + @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 18:47 ` Russell King - ARM Linux @ 2011-01-14 19:23 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 19:23 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > So... this incrementally on top of the previous patch (which I've > reproduced below as there's a subtle comment change in there wrt IRQ > state.) > > This means we have consistent state - both r2 and regs->ARM_pc always > point to the next instruction to be executed in every case, which means > its easy to understand and remember while reading through the code. > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > --- b/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -499,10 +499,11 @@ > blo __und_usr_unknown > 3: ldrht r0, [r4] > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > - orr r0, r0, r5, lsl #16 > + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update > + orr r0, r0, r5, lsl #16 @ regs->ARM_pc > @ > @ r0 = the two 16-bit Thumb instructions which caused the exception > - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > @ r4 = PC value for the first 16-bit Thumb instruction > @ > #else > > 8<-x-x- > > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > index 2b46fea..5876eec 100644 > --- a/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > .align 5 > __und_usr: > usr_entry > - > - @ > - @ fall through to the emulation code, which returns using r9 if > - @ it has emulated the instruction, or the more conventional lr > - @ if we are to treat this as a real undefined instruction > @ > - @ r0 - instruction > + @ The emulation code returns using r9 if it has emulated the > + @ instruction, or the more conventional lr if we are to treat > + @ this as a real undefined instruction > @ > adr r9, BSYM(ret_from_exception) > adr lr, BSYM(__und_usr_unknown) > + @ > + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > + @ faulting instruction depending on Thumb mode. > + @ r3 = regs->ARM_cpsr > + @ > tst r3, #PSR_T_BIT @ Thumb mode? > - itet eq @ explicit IT needed for the 1f label > + itttt eq @ explicit IT needed for the 1f label > subeq r4, r2, #4 @ ARM instr at LR - 4 > - subne r4, r2, #2 @ Thumb instr at LR - 2 > 1: ldreqt r0, [r4] > #ifdef CONFIG_CPU_ENDIAN_BE8 > reveq r0, r0 @ little endian instruction > #endif > + @ > + @ r0 = 32-bit ARM instruction which caused the exception > + @ r2 = PC value for the following instruction (:= regs->ARM_pc) > + @ r4 = PC value for the faulting instruction > + @ > beq call_fpe > + > @ Thumb instruction > #if __LINUX_ARM_ARCH__ >= 7 > + sub r4, r2, #2 @ Thumb instr at LR - 2 > 2: > ARM( ldrht r5, [r4], #2 ) > THUMB( ldrht r5, [r4] ) > @@ -492,18 +500,19 @@ __und_usr: > 3: ldrht r0, [r4] > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > orr r0, r0, r5, lsl #16 > + @ > + @ r0 = the two 16-bit Thumb instructions which caused the exception > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + @ r4 = PC value for the first 16-bit Thumb instruction > + @ > #else > b __und_usr_unknown > #endif > - UNWIND(.fnend ) > + UNWIND(.fnend) > ENDPROC(__und_usr) > > - @ > - @ fallthrough to call_fpe > - @ > - > /* > - * The out of line fixup for the ldrt above. > + * The out of line fixup for the ldrt instructions above. > */ > .pushsection .fixup, "ax" > 4: mov pc, r9 > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > * NEON handler code. > * > * Emulators may wish to make use of the following registers: > - * r0 = instruction opcode. > - * r2 = PC+4 > + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > + * r2 = PC value to resume execution after successful emulation > * r9 = normal "successful" return address > - * r10 = this threads thread_info structure. > + * r10 = this threads thread_info structure > * lr = unrecognised instruction return address > + * IRQs disabled, FIQs enabled. > */ > @ > @ Fall-through from Thumb-2 __und_usr > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index ee57640..eeb9250 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > void __user *pc; > > /* > - * According to the ARM ARM, PC is 2 or 4 bytes ahead, > - * depending whether we're in Thumb mode or not. > - * Correct this offset. > + * According to the ARM ARM, the PC is 2 or 4 bytes ahead > + * depending on Thumb mode. Correct this offset so that > + * regs->ARM_pc points at the faulting instruction. > */ > regs->ARM_pc -= correction; > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > index 4fa9903..2bf6089 100644 > --- a/arch/arm/vfp/entry.S > +++ b/arch/arm/vfp/entry.S > @@ -19,6 +19,15 @@ > #include <asm/vfpmacros.h> > #include "../kernel/entry-header.S" > > +@ VFP entry point. > +@ > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ r2 = PC value to resume execution after successful emulation > +@ r9 = normal "successful" return address > +@ r10 = this threads thread_info structure > +@ lr = unrecognised instruction return address > +@ IRQs disabled. > +@ > ENTRY(do_vfp) > #ifdef CONFIG_PREEMPT > ldr r4, [r10, #TI_PREEMPT] @ get preempt count > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > index 9897dcf..7292921 100644 > --- a/arch/arm/vfp/vfphw.S > +++ b/arch/arm/vfp/vfphw.S > @@ -61,13 +61,13 @@ > > @ VFP hardware support entry point. > @ > -@ r0 = faulted instruction > -@ r2 = faulted PC+4 > -@ r9 = successful return > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ r2 = PC value to resume execution after successful emulation > +@ r9 = normal "successful" return address > @ r10 = vfp_state union > @ r11 = CPU number > -@ lr = failure return > - > +@ lr = unrecognised instruction return address > +@ IRQs enabled. > ENTRY(vfp_support_entry) > DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 I tested copying r2 to regs->ARM_pc like this patch does, and it fixes my test case. Could this second patch go first so it can be applied to stable? ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 19:23 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 19:23 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > So... this incrementally on top of the previous patch (which I've > reproduced below as there's a subtle comment change in there wrt IRQ > state.) > > This means we have consistent state - both r2 and regs->ARM_pc always > point to the next instruction to be executed in every case, which means > its easy to understand and remember while reading through the code. > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > --- b/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -499,10 +499,11 @@ > ? ? ? ?blo ? ? __und_usr_unknown > ?3: ? ? ldrht ? r0, [r4] > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > - ? ? ? orr ? ? r0, r0, r5, lsl #16 > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc > ? ? ? ?@ > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction > ? ? ? ?@ > ?#else > > 8<-x-x- > > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > index 2b46fea..5876eec 100644 > --- a/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > ? ? ? ?.align ?5 > ?__und_usr: > ? ? ? ?usr_entry > - > - ? ? ? @ > - ? ? ? @ fall through to the emulation code, which returns using r9 if > - ? ? ? @ it has emulated the instruction, or the more conventional lr > - ? ? ? @ if we are to treat this as a real undefined instruction > ? ? ? ?@ > - ? ? ? @ ?r0 - instruction > + ? ? ? @ The emulation code returns using r9 if it has emulated the > + ? ? ? @ instruction, or the more conventional lr if we are to treat > + ? ? ? @ this as a real undefined instruction > ? ? ? ?@ > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) > + ? ? ? @ > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > + ? ? ? @ faulting instruction depending on Thumb mode. > + ? ? ? @ r3 = regs->ARM_cpsr > + ? ? ? @ > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > ?1: ? ? ldreqt ?r0, [r4] > ?#ifdef CONFIG_CPU_ENDIAN_BE8 > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction > ?#endif > + ? ? ? @ > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) > + ? ? ? @ r4 = PC value for the faulting instruction > + ? ? ? @ > ? ? ? ?beq ? ? call_fpe > + > ? ? ? ?@ Thumb instruction > ?#if __LINUX_ARM_ARCH__ >= 7 > + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > ?2: > ?ARM( ?ldrht ? r5, [r4], #2 ? ?) > ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?) > @@ -492,18 +500,19 @@ __und_usr: > ?3: ? ? ldrht ? r0, [r4] > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > ? ? ? ?orr ? ? r0, r0, r5, lsl #16 > + ? ? ? @ > + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction > + ? ? ? @ > ?#else > ? ? ? ?b ? ? ? __und_usr_unknown > ?#endif > - UNWIND(.fnend ? ? ? ? ) > + UNWIND(.fnend) > ?ENDPROC(__und_usr) > > - ? ? ? @ > - ? ? ? @ fallthrough to call_fpe > - ? ? ? @ > - > ?/* > - * The out of line fixup for the ldrt above. > + * The out of line fixup for the ldrt instructions above. > ?*/ > ? ? ? ?.pushsection .fixup, "ax" > ?4: ? ? mov ? ? pc, r9 > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > ?* NEON handler code. > ?* > ?* Emulators may wish to make use of the following registers: > - * ?r0 ?= instruction opcode. > - * ?r2 ?= PC+4 > + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > + * ?r2 ?= PC value to resume execution after successful emulation > ?* ?r9 ?= normal "successful" return address > - * ?r10 = this threads thread_info structure. > + * ?r10 = this threads thread_info structure > ?* ?lr ?= unrecognised instruction return address > + * IRQs disabled, FIQs enabled. > ?*/ > ? ? ? ?@ > ? ? ? ?@ Fall-through from Thumb-2 __und_usr > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index ee57640..eeb9250 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > ? ? ? ?void __user *pc; > > ? ? ? ?/* > - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead, > - ? ? ? ?* depending whether we're in Thumb mode or not. > - ? ? ? ?* Correct this offset. > + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead > + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that > + ? ? ? ?* regs->ARM_pc points at the faulting instruction. > ? ? ? ? */ > ? ? ? ?regs->ARM_pc -= correction; > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > index 4fa9903..2bf6089 100644 > --- a/arch/arm/vfp/entry.S > +++ b/arch/arm/vfp/entry.S > @@ -19,6 +19,15 @@ > ?#include <asm/vfpmacros.h> > ?#include "../kernel/entry-header.S" > > +@ VFP entry point. > +@ > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ ?r2 ?= PC value to resume execution after successful emulation > +@ ?r9 ?= normal "successful" return address > +@ ?r10 = this threads thread_info structure > +@ ?lr ?= unrecognised instruction return address > +@ ?IRQs disabled. > +@ > ?ENTRY(do_vfp) > ?#ifdef CONFIG_PREEMPT > ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > index 9897dcf..7292921 100644 > --- a/arch/arm/vfp/vfphw.S > +++ b/arch/arm/vfp/vfphw.S > @@ -61,13 +61,13 @@ > > ?@ VFP hardware support entry point. > ?@ > -@ ?r0 ?= faulted instruction > -@ ?r2 ?= faulted PC+4 > -@ ?r9 ?= successful return > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ ?r2 ?= PC value to resume execution after successful emulation > +@ ?r9 ?= normal "successful" return address > ?@ ?r10 = vfp_state union > ?@ ?r11 = CPU number > -@ ?lr ?= failure return > - > +@ ?lr ?= unrecognised instruction return address > +@ ?IRQs enabled. > ?ENTRY(vfp_support_entry) > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 I tested copying r2 to regs->ARM_pc like this patch does, and it fixes my test case. Could this second patch go first so it can be applied to stable? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 19:23 ` Colin Cross @ 2011-01-14 19:51 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 19:51 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: > On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: >> So... this incrementally on top of the previous patch (which I've >> reproduced below as there's a subtle comment change in there wrt IRQ >> state.) >> >> This means we have consistent state - both r2 and regs->ARM_pc always >> point to the next instruction to be executed in every case, which means >> its easy to understand and remember while reading through the code. >> >> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> --- b/arch/arm/kernel/entry-armv.S >> +++ b/arch/arm/kernel/entry-armv.S >> @@ -499,10 +499,11 @@ >> blo __und_usr_unknown >> 3: ldrht r0, [r4] >> add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 >> - orr r0, r0, r5, lsl #16 >> + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update >> + orr r0, r0, r5, lsl #16 @ regs->ARM_pc >> @ >> @ r0 = the two 16-bit Thumb instructions which caused the exception >> - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >> @ r4 = PC value for the first 16-bit Thumb instruction >> @ >> #else >> >> 8<-x-x- >> >> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> index 2b46fea..5876eec 100644 >> --- a/arch/arm/kernel/entry-armv.S >> +++ b/arch/arm/kernel/entry-armv.S >> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >> .align 5 >> __und_usr: >> usr_entry >> - >> - @ >> - @ fall through to the emulation code, which returns using r9 if >> - @ it has emulated the instruction, or the more conventional lr >> - @ if we are to treat this as a real undefined instruction >> @ >> - @ r0 - instruction >> + @ The emulation code returns using r9 if it has emulated the >> + @ instruction, or the more conventional lr if we are to treat >> + @ this as a real undefined instruction >> @ >> adr r9, BSYM(ret_from_exception) >> adr lr, BSYM(__und_usr_unknown) >> + @ >> + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >> + @ faulting instruction depending on Thumb mode. >> + @ r3 = regs->ARM_cpsr >> + @ >> tst r3, #PSR_T_BIT @ Thumb mode? >> - itet eq @ explicit IT needed for the 1f label >> + itttt eq @ explicit IT needed for the 1f label >> subeq r4, r2, #4 @ ARM instr at LR - 4 >> - subne r4, r2, #2 @ Thumb instr at LR - 2 >> 1: ldreqt r0, [r4] >> #ifdef CONFIG_CPU_ENDIAN_BE8 >> reveq r0, r0 @ little endian instruction >> #endif >> + @ >> + @ r0 = 32-bit ARM instruction which caused the exception >> + @ r2 = PC value for the following instruction (:= regs->ARM_pc) >> + @ r4 = PC value for the faulting instruction >> + @ >> beq call_fpe >> + >> @ Thumb instruction >> #if __LINUX_ARM_ARCH__ >= 7 >> + sub r4, r2, #2 @ Thumb instr at LR - 2 >> 2: >> ARM( ldrht r5, [r4], #2 ) >> THUMB( ldrht r5, [r4] ) >> @@ -492,18 +500,19 @@ __und_usr: >> 3: ldrht r0, [r4] >> add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 >> orr r0, r0, r5, lsl #16 >> + @ >> + @ r0 = the two 16-bit Thumb instructions which caused the exception >> + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> + @ r4 = PC value for the first 16-bit Thumb instruction >> + @ >> #else >> b __und_usr_unknown >> #endif >> - UNWIND(.fnend ) >> + UNWIND(.fnend) >> ENDPROC(__und_usr) >> >> - @ >> - @ fallthrough to call_fpe >> - @ >> - >> /* >> - * The out of line fixup for the ldrt above. >> + * The out of line fixup for the ldrt instructions above. >> */ >> .pushsection .fixup, "ax" >> 4: mov pc, r9 >> @@ -534,11 +543,12 @@ ENDPROC(__und_usr) >> * NEON handler code. >> * >> * Emulators may wish to make use of the following registers: >> - * r0 = instruction opcode. >> - * r2 = PC+4 >> + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >> + * r2 = PC value to resume execution after successful emulation >> * r9 = normal "successful" return address >> - * r10 = this threads thread_info structure. >> + * r10 = this threads thread_info structure >> * lr = unrecognised instruction return address >> + * IRQs disabled, FIQs enabled. >> */ >> @ >> @ Fall-through from Thumb-2 __und_usr >> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c >> index ee57640..eeb9250 100644 >> --- a/arch/arm/kernel/traps.c >> +++ b/arch/arm/kernel/traps.c >> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >> void __user *pc; >> >> /* >> - * According to the ARM ARM, PC is 2 or 4 bytes ahead, >> - * depending whether we're in Thumb mode or not. >> - * Correct this offset. >> + * According to the ARM ARM, the PC is 2 or 4 bytes ahead >> + * depending on Thumb mode. Correct this offset so that >> + * regs->ARM_pc points at the faulting instruction. >> */ >> regs->ARM_pc -= correction; >> >> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S >> index 4fa9903..2bf6089 100644 >> --- a/arch/arm/vfp/entry.S >> +++ b/arch/arm/vfp/entry.S >> @@ -19,6 +19,15 @@ >> #include <asm/vfpmacros.h> >> #include "../kernel/entry-header.S" >> >> +@ VFP entry point. >> +@ >> +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >> +@ r2 = PC value to resume execution after successful emulation >> +@ r9 = normal "successful" return address >> +@ r10 = this threads thread_info structure >> +@ lr = unrecognised instruction return address >> +@ IRQs disabled. >> +@ >> ENTRY(do_vfp) >> #ifdef CONFIG_PREEMPT >> ldr r4, [r10, #TI_PREEMPT] @ get preempt count >> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S >> index 9897dcf..7292921 100644 >> --- a/arch/arm/vfp/vfphw.S >> +++ b/arch/arm/vfp/vfphw.S >> @@ -61,13 +61,13 @@ >> >> @ VFP hardware support entry point. >> @ >> -@ r0 = faulted instruction >> -@ r2 = faulted PC+4 >> -@ r9 = successful return >> +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >> +@ r2 = PC value to resume execution after successful emulation >> +@ r9 = normal "successful" return address >> @ r10 = vfp_state union >> @ r11 = CPU number >> -@ lr = failure return >> - >> +@ lr = unrecognised instruction return address >> +@ IRQs enabled. >> ENTRY(vfp_support_entry) >> DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes > my test case. Could this second patch go first so it can be applied > to stable? > Also, both patches together Tested-by: Colin Cross <ccross@android.com> ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 19:51 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-14 19:51 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: > On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: >> So... this incrementally on top of the previous patch (which I've >> reproduced below as there's a subtle comment change in there wrt IRQ >> state.) >> >> This means we have consistent state - both r2 and regs->ARM_pc always >> point to the next instruction to be executed in every case, which means >> its easy to understand and remember while reading through the code. >> >> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> --- b/arch/arm/kernel/entry-armv.S >> +++ b/arch/arm/kernel/entry-armv.S >> @@ -499,10 +499,11 @@ >> ? ? ? ?blo ? ? __und_usr_unknown >> ?3: ? ? ldrht ? r0, [r4] >> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 >> - ? ? ? orr ? ? r0, r0, r5, lsl #16 >> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update >> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc >> ? ? ? ?@ >> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception >> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction >> ? ? ? ?@ >> ?#else >> >> 8<-x-x- >> >> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> index 2b46fea..5876eec 100644 >> --- a/arch/arm/kernel/entry-armv.S >> +++ b/arch/arm/kernel/entry-armv.S >> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >> ? ? ? ?.align ?5 >> ?__und_usr: >> ? ? ? ?usr_entry >> - >> - ? ? ? @ >> - ? ? ? @ fall through to the emulation code, which returns using r9 if >> - ? ? ? @ it has emulated the instruction, or the more conventional lr >> - ? ? ? @ if we are to treat this as a real undefined instruction >> ? ? ? ?@ >> - ? ? ? @ ?r0 - instruction >> + ? ? ? @ The emulation code returns using r9 if it has emulated the >> + ? ? ? @ instruction, or the more conventional lr if we are to treat >> + ? ? ? @ this as a real undefined instruction >> ? ? ? ?@ >> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) >> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) >> + ? ? ? @ >> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >> + ? ? ? @ faulting instruction depending on Thumb mode. >> + ? ? ? @ r3 = regs->ARM_cpsr >> + ? ? ? @ >> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? >> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 >> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 >> ?1: ? ? ldreqt ?r0, [r4] >> ?#ifdef CONFIG_CPU_ENDIAN_BE8 >> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction >> ?#endif >> + ? ? ? @ >> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception >> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) >> + ? ? ? @ r4 = PC value for the faulting instruction >> + ? ? ? @ >> ? ? ? ?beq ? ? call_fpe >> + >> ? ? ? ?@ Thumb instruction >> ?#if __LINUX_ARM_ARCH__ >= 7 >> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 >> ?2: >> ?ARM( ?ldrht ? r5, [r4], #2 ? ?) >> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?) >> @@ -492,18 +500,19 @@ __und_usr: >> ?3: ? ? ldrht ? r0, [r4] >> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 >> ? ? ? ?orr ? ? r0, r0, r5, lsl #16 >> + ? ? ? @ >> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception >> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction >> + ? ? ? @ >> ?#else >> ? ? ? ?b ? ? ? __und_usr_unknown >> ?#endif >> - UNWIND(.fnend ? ? ? ? ) >> + UNWIND(.fnend) >> ?ENDPROC(__und_usr) >> >> - ? ? ? @ >> - ? ? ? @ fallthrough to call_fpe >> - ? ? ? @ >> - >> ?/* >> - * The out of line fixup for the ldrt above. >> + * The out of line fixup for the ldrt instructions above. >> ?*/ >> ? ? ? ?.pushsection .fixup, "ax" >> ?4: ? ? mov ? ? pc, r9 >> @@ -534,11 +543,12 @@ ENDPROC(__und_usr) >> ?* NEON handler code. >> ?* >> ?* Emulators may wish to make use of the following registers: >> - * ?r0 ?= instruction opcode. >> - * ?r2 ?= PC+4 >> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >> + * ?r2 ?= PC value to resume execution after successful emulation >> ?* ?r9 ?= normal "successful" return address >> - * ?r10 = this threads thread_info structure. >> + * ?r10 = this threads thread_info structure >> ?* ?lr ?= unrecognised instruction return address >> + * IRQs disabled, FIQs enabled. >> ?*/ >> ? ? ? ?@ >> ? ? ? ?@ Fall-through from Thumb-2 __und_usr >> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c >> index ee57640..eeb9250 100644 >> --- a/arch/arm/kernel/traps.c >> +++ b/arch/arm/kernel/traps.c >> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >> ? ? ? ?void __user *pc; >> >> ? ? ? ?/* >> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead, >> - ? ? ? ?* depending whether we're in Thumb mode or not. >> - ? ? ? ?* Correct this offset. >> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead >> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that >> + ? ? ? ?* regs->ARM_pc points at the faulting instruction. >> ? ? ? ? */ >> ? ? ? ?regs->ARM_pc -= correction; >> >> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S >> index 4fa9903..2bf6089 100644 >> --- a/arch/arm/vfp/entry.S >> +++ b/arch/arm/vfp/entry.S >> @@ -19,6 +19,15 @@ >> ?#include <asm/vfpmacros.h> >> ?#include "../kernel/entry-header.S" >> >> +@ VFP entry point. >> +@ >> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >> +@ ?r2 ?= PC value to resume execution after successful emulation >> +@ ?r9 ?= normal "successful" return address >> +@ ?r10 = this threads thread_info structure >> +@ ?lr ?= unrecognised instruction return address >> +@ ?IRQs disabled. >> +@ >> ?ENTRY(do_vfp) >> ?#ifdef CONFIG_PREEMPT >> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count >> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S >> index 9897dcf..7292921 100644 >> --- a/arch/arm/vfp/vfphw.S >> +++ b/arch/arm/vfp/vfphw.S >> @@ -61,13 +61,13 @@ >> >> ?@ VFP hardware support entry point. >> ?@ >> -@ ?r0 ?= faulted instruction >> -@ ?r2 ?= faulted PC+4 >> -@ ?r9 ?= successful return >> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >> +@ ?r2 ?= PC value to resume execution after successful emulation >> +@ ?r9 ?= normal "successful" return address >> ?@ ?r10 = vfp_state union >> ?@ ?r11 = CPU number >> -@ ?lr ?= failure return >> - >> +@ ?lr ?= unrecognised instruction return address >> +@ ?IRQs enabled. >> ?ENTRY(vfp_support_entry) >> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes > my test case. ?Could this second patch go first so it can be applied > to stable? > Also, both patches together Tested-by: Colin Cross <ccross@android.com> ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 19:51 ` Colin Cross @ 2011-01-14 21:24 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 21:24 UTC (permalink / raw) To: Colin Cross; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Fri, Jan 14, 2011 at 11:51:38AM -0800, Colin Cross wrote: > On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: > > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes > > my test case. Could this second patch go first so it can be applied > > to stable? > > > > Also, both patches together Tested-by: Colin Cross <ccross@android.com> Thanks. I do want to hear back from Catalin before merging them, so they won't be part of tonight's push. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 21:24 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 21:24 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 11:51:38AM -0800, Colin Cross wrote: > On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: > > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes > > my test case. ?Could this second patch go first so it can be applied > > to stable? > > > > Also, both patches together Tested-by: Colin Cross <ccross@android.com> Thanks. I do want to hear back from Catalin before merging them, so they won't be part of tonight's push. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 19:51 ` Colin Cross @ 2011-01-25 23:33 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-25 23:33 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Fri, Jan 14, 2011 at 11:51 AM, Colin Cross <ccross@android.com> wrote: > On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: >> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>> So... this incrementally on top of the previous patch (which I've >>> reproduced below as there's a subtle comment change in there wrt IRQ >>> state.) >>> >>> This means we have consistent state - both r2 and regs->ARM_pc always >>> point to the next instruction to be executed in every case, which means >>> its easy to understand and remember while reading through the code. >>> >>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >>> --- b/arch/arm/kernel/entry-armv.S >>> +++ b/arch/arm/kernel/entry-armv.S >>> @@ -499,10 +499,11 @@ >>> blo __und_usr_unknown >>> 3: ldrht r0, [r4] >>> add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 >>> - orr r0, r0, r5, lsl #16 >>> + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update >>> + orr r0, r0, r5, lsl #16 @ regs->ARM_pc >>> @ >>> @ r0 = the two 16-bit Thumb instructions which caused the exception >>> - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >>> + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >>> @ r4 = PC value for the first 16-bit Thumb instruction >>> @ >>> #else >>> >>> 8<-x-x- >>> >>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >>> index 2b46fea..5876eec 100644 >>> --- a/arch/arm/kernel/entry-armv.S >>> +++ b/arch/arm/kernel/entry-armv.S >>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >>> .align 5 >>> __und_usr: >>> usr_entry >>> - >>> - @ >>> - @ fall through to the emulation code, which returns using r9 if >>> - @ it has emulated the instruction, or the more conventional lr >>> - @ if we are to treat this as a real undefined instruction >>> @ >>> - @ r0 - instruction >>> + @ The emulation code returns using r9 if it has emulated the >>> + @ instruction, or the more conventional lr if we are to treat >>> + @ this as a real undefined instruction >>> @ >>> adr r9, BSYM(ret_from_exception) >>> adr lr, BSYM(__und_usr_unknown) >>> + @ >>> + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >>> + @ faulting instruction depending on Thumb mode. >>> + @ r3 = regs->ARM_cpsr >>> + @ >>> tst r3, #PSR_T_BIT @ Thumb mode? >>> - itet eq @ explicit IT needed for the 1f label >>> + itttt eq @ explicit IT needed for the 1f label >>> subeq r4, r2, #4 @ ARM instr at LR - 4 >>> - subne r4, r2, #2 @ Thumb instr at LR - 2 >>> 1: ldreqt r0, [r4] >>> #ifdef CONFIG_CPU_ENDIAN_BE8 >>> reveq r0, r0 @ little endian instruction >>> #endif >>> + @ >>> + @ r0 = 32-bit ARM instruction which caused the exception >>> + @ r2 = PC value for the following instruction (:= regs->ARM_pc) >>> + @ r4 = PC value for the faulting instruction >>> + @ >>> beq call_fpe >>> + >>> @ Thumb instruction >>> #if __LINUX_ARM_ARCH__ >= 7 >>> + sub r4, r2, #2 @ Thumb instr at LR - 2 >>> 2: >>> ARM( ldrht r5, [r4], #2 ) >>> THUMB( ldrht r5, [r4] ) >>> @@ -492,18 +500,19 @@ __und_usr: >>> 3: ldrht r0, [r4] >>> add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 >>> orr r0, r0, r5, lsl #16 >>> + @ >>> + @ r0 = the two 16-bit Thumb instructions which caused the exception >>> + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >>> + @ r4 = PC value for the first 16-bit Thumb instruction >>> + @ >>> #else >>> b __und_usr_unknown >>> #endif >>> - UNWIND(.fnend ) >>> + UNWIND(.fnend) >>> ENDPROC(__und_usr) >>> >>> - @ >>> - @ fallthrough to call_fpe >>> - @ >>> - >>> /* >>> - * The out of line fixup for the ldrt above. >>> + * The out of line fixup for the ldrt instructions above. >>> */ >>> .pushsection .fixup, "ax" >>> 4: mov pc, r9 >>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr) >>> * NEON handler code. >>> * >>> * Emulators may wish to make use of the following registers: >>> - * r0 = instruction opcode. >>> - * r2 = PC+4 >>> + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >>> + * r2 = PC value to resume execution after successful emulation >>> * r9 = normal "successful" return address >>> - * r10 = this threads thread_info structure. >>> + * r10 = this threads thread_info structure >>> * lr = unrecognised instruction return address >>> + * IRQs disabled, FIQs enabled. >>> */ >>> @ >>> @ Fall-through from Thumb-2 __und_usr >>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c >>> index ee57640..eeb9250 100644 >>> --- a/arch/arm/kernel/traps.c >>> +++ b/arch/arm/kernel/traps.c >>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >>> void __user *pc; >>> >>> /* >>> - * According to the ARM ARM, PC is 2 or 4 bytes ahead, >>> - * depending whether we're in Thumb mode or not. >>> - * Correct this offset. >>> + * According to the ARM ARM, the PC is 2 or 4 bytes ahead >>> + * depending on Thumb mode. Correct this offset so that >>> + * regs->ARM_pc points at the faulting instruction. >>> */ >>> regs->ARM_pc -= correction; >>> >>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S >>> index 4fa9903..2bf6089 100644 >>> --- a/arch/arm/vfp/entry.S >>> +++ b/arch/arm/vfp/entry.S >>> @@ -19,6 +19,15 @@ >>> #include <asm/vfpmacros.h> >>> #include "../kernel/entry-header.S" >>> >>> +@ VFP entry point. >>> +@ >>> +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >>> +@ r2 = PC value to resume execution after successful emulation >>> +@ r9 = normal "successful" return address >>> +@ r10 = this threads thread_info structure >>> +@ lr = unrecognised instruction return address >>> +@ IRQs disabled. >>> +@ >>> ENTRY(do_vfp) >>> #ifdef CONFIG_PREEMPT >>> ldr r4, [r10, #TI_PREEMPT] @ get preempt count >>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S >>> index 9897dcf..7292921 100644 >>> --- a/arch/arm/vfp/vfphw.S >>> +++ b/arch/arm/vfp/vfphw.S >>> @@ -61,13 +61,13 @@ >>> >>> @ VFP hardware support entry point. >>> @ >>> -@ r0 = faulted instruction >>> -@ r2 = faulted PC+4 >>> -@ r9 = successful return >>> +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) >>> +@ r2 = PC value to resume execution after successful emulation >>> +@ r9 = normal "successful" return address >>> @ r10 = vfp_state union >>> @ r11 = CPU number >>> -@ lr = failure return >>> - >>> +@ lr = unrecognised instruction return address >>> +@ IRQs enabled. >>> ENTRY(vfp_support_entry) >>> DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 >> >> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes >> my test case. Could this second patch go first so it can be applied >> to stable? >> > > Also, both patches together Tested-by: Colin Cross <ccross@android.com> > I think there is an additional change needed to __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc to be the fault address, and not the next PC, and gets called for 2 or 4 byte instructions. If it gets called after the regs->ARM_pc fixup added in the second patch, it will return to the middle of the faulting instruction. If it do_undefinstr called on a 4 byte thumb instruction, with regs->ARM_pc pointing to the end of the faulting instruction, there is no good way to figure out where the PC of the instruction that faulted is. Would it make more sense to convert all the exception handlers to get regs->ARM_pc as the PC of the faulting instruction, rather than getting the PC of the next instruction? That way, each handler can easily use thumb_mode(regs) and decoding the instruction to determine how big the instruction is, and decide whether to return to the beginning or end of the faulting instruction. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-25 23:33 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-25 23:33 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 11:51 AM, Colin Cross <ccross@android.com> wrote: > On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote: >> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>> So... this incrementally on top of the previous patch (which I've >>> reproduced below as there's a subtle comment change in there wrt IRQ >>> state.) >>> >>> This means we have consistent state - both r2 and regs->ARM_pc always >>> point to the next instruction to be executed in every case, which means >>> its easy to understand and remember while reading through the code. >>> >>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >>> --- b/arch/arm/kernel/entry-armv.S >>> +++ b/arch/arm/kernel/entry-armv.S >>> @@ -499,10 +499,11 @@ >>> ? ? ? ?blo ? ? __und_usr_unknown >>> ?3: ? ? ldrht ? r0, [r4] >>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 >>> - ? ? ? orr ? ? r0, r0, r5, lsl #16 >>> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update >>> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc >>> ? ? ? ?@ >>> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception >>> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >>> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction >>> ? ? ? ?@ >>> ?#else >>> >>> 8<-x-x- >>> >>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >>> index 2b46fea..5876eec 100644 >>> --- a/arch/arm/kernel/entry-armv.S >>> +++ b/arch/arm/kernel/entry-armv.S >>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >>> ? ? ? ?.align ?5 >>> ?__und_usr: >>> ? ? ? ?usr_entry >>> - >>> - ? ? ? @ >>> - ? ? ? @ fall through to the emulation code, which returns using r9 if >>> - ? ? ? @ it has emulated the instruction, or the more conventional lr >>> - ? ? ? @ if we are to treat this as a real undefined instruction >>> ? ? ? ?@ >>> - ? ? ? @ ?r0 - instruction >>> + ? ? ? @ The emulation code returns using r9 if it has emulated the >>> + ? ? ? @ instruction, or the more conventional lr if we are to treat >>> + ? ? ? @ this as a real undefined instruction >>> ? ? ? ?@ >>> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) >>> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) >>> + ? ? ? @ >>> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >>> + ? ? ? @ faulting instruction depending on Thumb mode. >>> + ? ? ? @ r3 = regs->ARM_cpsr >>> + ? ? ? @ >>> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? >>> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >>> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >>> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 >>> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 >>> ?1: ? ? ldreqt ?r0, [r4] >>> ?#ifdef CONFIG_CPU_ENDIAN_BE8 >>> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction >>> ?#endif >>> + ? ? ? @ >>> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception >>> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) >>> + ? ? ? @ r4 = PC value for the faulting instruction >>> + ? ? ? @ >>> ? ? ? ?beq ? ? call_fpe >>> + >>> ? ? ? ?@ Thumb instruction >>> ?#if __LINUX_ARM_ARCH__ >= 7 >>> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 >>> ?2: >>> ?ARM( ?ldrht ? r5, [r4], #2 ? ?) >>> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?) >>> @@ -492,18 +500,19 @@ __und_usr: >>> ?3: ? ? ldrht ? r0, [r4] >>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 >>> ? ? ? ?orr ? ? r0, r0, r5, lsl #16 >>> + ? ? ? @ >>> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception >>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >>> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction >>> + ? ? ? @ >>> ?#else >>> ? ? ? ?b ? ? ? __und_usr_unknown >>> ?#endif >>> - UNWIND(.fnend ? ? ? ? ) >>> + UNWIND(.fnend) >>> ?ENDPROC(__und_usr) >>> >>> - ? ? ? @ >>> - ? ? ? @ fallthrough to call_fpe >>> - ? ? ? @ >>> - >>> ?/* >>> - * The out of line fixup for the ldrt above. >>> + * The out of line fixup for the ldrt instructions above. >>> ?*/ >>> ? ? ? ?.pushsection .fixup, "ax" >>> ?4: ? ? mov ? ? pc, r9 >>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr) >>> ?* NEON handler code. >>> ?* >>> ?* Emulators may wish to make use of the following registers: >>> - * ?r0 ?= instruction opcode. >>> - * ?r2 ?= PC+4 >>> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >>> + * ?r2 ?= PC value to resume execution after successful emulation >>> ?* ?r9 ?= normal "successful" return address >>> - * ?r10 = this threads thread_info structure. >>> + * ?r10 = this threads thread_info structure >>> ?* ?lr ?= unrecognised instruction return address >>> + * IRQs disabled, FIQs enabled. >>> ?*/ >>> ? ? ? ?@ >>> ? ? ? ?@ Fall-through from Thumb-2 __und_usr >>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c >>> index ee57640..eeb9250 100644 >>> --- a/arch/arm/kernel/traps.c >>> +++ b/arch/arm/kernel/traps.c >>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >>> ? ? ? ?void __user *pc; >>> >>> ? ? ? ?/* >>> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead, >>> - ? ? ? ?* depending whether we're in Thumb mode or not. >>> - ? ? ? ?* Correct this offset. >>> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead >>> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that >>> + ? ? ? ?* regs->ARM_pc points at the faulting instruction. >>> ? ? ? ? */ >>> ? ? ? ?regs->ARM_pc -= correction; >>> >>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S >>> index 4fa9903..2bf6089 100644 >>> --- a/arch/arm/vfp/entry.S >>> +++ b/arch/arm/vfp/entry.S >>> @@ -19,6 +19,15 @@ >>> ?#include <asm/vfpmacros.h> >>> ?#include "../kernel/entry-header.S" >>> >>> +@ VFP entry point. >>> +@ >>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >>> +@ ?r2 ?= PC value to resume execution after successful emulation >>> +@ ?r9 ?= normal "successful" return address >>> +@ ?r10 = this threads thread_info structure >>> +@ ?lr ?= unrecognised instruction return address >>> +@ ?IRQs disabled. >>> +@ >>> ?ENTRY(do_vfp) >>> ?#ifdef CONFIG_PREEMPT >>> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count >>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S >>> index 9897dcf..7292921 100644 >>> --- a/arch/arm/vfp/vfphw.S >>> +++ b/arch/arm/vfp/vfphw.S >>> @@ -61,13 +61,13 @@ >>> >>> ?@ VFP hardware support entry point. >>> ?@ >>> -@ ?r0 ?= faulted instruction >>> -@ ?r2 ?= faulted PC+4 >>> -@ ?r9 ?= successful return >>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) >>> +@ ?r2 ?= PC value to resume execution after successful emulation >>> +@ ?r9 ?= normal "successful" return address >>> ?@ ?r10 = vfp_state union >>> ?@ ?r11 = CPU number >>> -@ ?lr ?= failure return >>> - >>> +@ ?lr ?= unrecognised instruction return address >>> +@ ?IRQs enabled. >>> ?ENTRY(vfp_support_entry) >>> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 >> >> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes >> my test case. ?Could this second patch go first so it can be applied >> to stable? >> > > Also, both patches together Tested-by: Colin Cross <ccross@android.com> > I think there is an additional change needed to __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc to be the fault address, and not the next PC, and gets called for 2 or 4 byte instructions. If it gets called after the regs->ARM_pc fixup added in the second patch, it will return to the middle of the faulting instruction. If it do_undefinstr called on a 4 byte thumb instruction, with regs->ARM_pc pointing to the end of the faulting instruction, there is no good way to figure out where the PC of the instruction that faulted is. Would it make more sense to convert all the exception handlers to get regs->ARM_pc as the PC of the faulting instruction, rather than getting the PC of the next instruction? That way, each handler can easily use thumb_mode(regs) and decoding the instruction to determine how big the instruction is, and decide whether to return to the beginning or end of the faulting instruction. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-25 23:33 ` Colin Cross @ 2011-01-26 11:26 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-26 11:26 UTC (permalink / raw) To: Colin Cross; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: > I think there is an additional change needed to > __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called > directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc > to be the fault address, and not the next PC, and gets called for 2 or > 4 byte instructions. It expects it to be the next PC: asmlinkage void __exception do_undefinstr(struct pt_regs *regs) { unsigned int correction = thumb_mode(regs) ? 2 : 4; unsigned int instr; siginfo_t info; void __user *pc; /* * According to the ARM ARM, PC is 2 or 4 bytes ahead, * depending whether we're in Thumb mode or not. * Correct this offset. */ regs->ARM_pc -= correction; We expect the PC to be pointing at the next instruction to be executed. This is the value of the PC saved by the CPU when entering the exception. We correct the PC by four bytes for ARM mode to point at the previously executed instruction. For 16-bit Thumb mode, the PC is again pointing at the next instruction to be executed, and this is the value saved by the CPU. So we correct the PC by two bytes as that is the Thumb instruction size. The problem comes with T2, where we advance the saved PC by two bytes if the instruction was 32-bit such that it again points at the next instruction to be executed. This is where the problem comes in because we have two different chunks of code with completely different expectations. Maybe we need to pass in the correction factor to do_undefinstr instead. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-26 11:26 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-26 11:26 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: > I think there is an additional change needed to > __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called > directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc > to be the fault address, and not the next PC, and gets called for 2 or > 4 byte instructions. It expects it to be the next PC: asmlinkage void __exception do_undefinstr(struct pt_regs *regs) { unsigned int correction = thumb_mode(regs) ? 2 : 4; unsigned int instr; siginfo_t info; void __user *pc; /* * According to the ARM ARM, PC is 2 or 4 bytes ahead, * depending whether we're in Thumb mode or not. * Correct this offset. */ regs->ARM_pc -= correction; We expect the PC to be pointing at the next instruction to be executed. This is the value of the PC saved by the CPU when entering the exception. We correct the PC by four bytes for ARM mode to point at the previously executed instruction. For 16-bit Thumb mode, the PC is again pointing at the next instruction to be executed, and this is the value saved by the CPU. So we correct the PC by two bytes as that is the Thumb instruction size. The problem comes with T2, where we advance the saved PC by two bytes if the instruction was 32-bit such that it again points at the next instruction to be executed. This is where the problem comes in because we have two different chunks of code with completely different expectations. Maybe we need to pass in the correction factor to do_undefinstr instead. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-26 11:26 ` Russell King - ARM Linux @ 2011-01-27 6:11 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 6:11 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >> I think there is an additional change needed to >> __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called >> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >> to be the fault address, and not the next PC, and gets called for 2 or >> 4 byte instructions. > > It expects it to be the next PC: > > asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > { > unsigned int correction = thumb_mode(regs) ? 2 : 4; > unsigned int instr; > siginfo_t info; > void __user *pc; > > /* > * According to the ARM ARM, PC is 2 or 4 bytes ahead, > * depending whether we're in Thumb mode or not. > * Correct this offset. > */ > regs->ARM_pc -= correction; > > We expect the PC to be pointing at the next instruction to be executed. > This is the value of the PC saved by the CPU when entering the exception. > We correct the PC by four bytes for ARM mode to point at the previously > executed instruction. > > For 16-bit Thumb mode, the PC is again pointing at the next instruction > to be executed, and this is the value saved by the CPU. So we correct > the PC by two bytes as that is the Thumb instruction size. > > The problem comes with T2, where we advance the saved PC by two bytes > if the instruction was 32-bit such that it again points at the next > instruction to be executed. This is where the problem comes in because > we have two different chunks of code with completely different > expectations. All do_undefinstr will do with the correction is subtract it off so that pc points to the faulting instruction instead of the next instruction, and calls any registered handlers. VFP_bounce sometimes subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP instructions are all 32 bit), and sometimes leaves pc pointing at the next instruction. If both were modified to expect the faulting instruction's pc, do_undefinstr would leave pc unmodified, and VFP_bounce would add 4 in the opposite cases from where it subtracts 4 now. iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. Currently, iwmmxt_task_enable will either end up in do_undefinstr through mov pc, lr, or it will subtract 4 from the pc register value and return through ret_from_exception. With the change above, it would not need to modify the pc value at all. crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. > Maybe we need to pass in the correction factor to do_undefinstr instead. > That just makes it even more complicated, the correction factor has to be tracked through every code path. RFC patch to follow. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-27 6:11 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 6:11 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >> I think there is an additional change needed to >> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called >> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >> to be the fault address, and not the next PC, and gets called for 2 or >> 4 byte instructions. > > It expects it to be the next PC: > > asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > { > ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4; > ? ? ? ?unsigned int instr; > ? ? ? ?siginfo_t info; > ? ? ? ?void __user *pc; > > ? ? ? ?/* > ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead, > ? ? ? ? * depending whether we're in Thumb mode or not. > ? ? ? ? * Correct this offset. > ? ? ? ? */ > ? ? ? ?regs->ARM_pc -= correction; > > We expect the PC to be pointing at the next instruction to be executed. > This is the value of the PC saved by the CPU when entering the exception. > We correct the PC by four bytes for ARM mode to point at the previously > executed instruction. > > For 16-bit Thumb mode, the PC is again pointing at the next instruction > to be executed, and this is the value saved by the CPU. ?So we correct > the PC by two bytes as that is the Thumb instruction size. > > The problem comes with T2, where we advance the saved PC by two bytes > if the instruction was 32-bit such that it again points at the next > instruction to be executed. ?This is where the problem comes in because > we have two different chunks of code with completely different > expectations. All do_undefinstr will do with the correction is subtract it off so that pc points to the faulting instruction instead of the next instruction, and calls any registered handlers. VFP_bounce sometimes subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP instructions are all 32 bit), and sometimes leaves pc pointing at the next instruction. If both were modified to expect the faulting instruction's pc, do_undefinstr would leave pc unmodified, and VFP_bounce would add 4 in the opposite cases from where it subtracts 4 now. iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. Currently, iwmmxt_task_enable will either end up in do_undefinstr through mov pc, lr, or it will subtract 4 from the pc register value and return through ret_from_exception. With the change above, it would not need to modify the pc value at all. crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. > Maybe we need to pass in the correction factor to do_undefinstr instead. > That just makes it even more complicated, the correction factor has to be tracked through every code path. RFC patch to follow. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-27 6:11 ` Colin Cross @ 2011-01-27 6:35 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 6:35 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote: > On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: >> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >>> I think there is an additional change needed to >>> __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called >>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >>> to be the fault address, and not the next PC, and gets called for 2 or >>> 4 byte instructions. >> >> It expects it to be the next PC: >> >> asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >> { >> unsigned int correction = thumb_mode(regs) ? 2 : 4; >> unsigned int instr; >> siginfo_t info; >> void __user *pc; >> >> /* >> * According to the ARM ARM, PC is 2 or 4 bytes ahead, >> * depending whether we're in Thumb mode or not. >> * Correct this offset. >> */ >> regs->ARM_pc -= correction; >> >> We expect the PC to be pointing at the next instruction to be executed. >> This is the value of the PC saved by the CPU when entering the exception. >> We correct the PC by four bytes for ARM mode to point at the previously >> executed instruction. >> >> For 16-bit Thumb mode, the PC is again pointing at the next instruction >> to be executed, and this is the value saved by the CPU. So we correct >> the PC by two bytes as that is the Thumb instruction size. >> >> The problem comes with T2, where we advance the saved PC by two bytes >> if the instruction was 32-bit such that it again points at the next >> instruction to be executed. This is where the problem comes in because >> we have two different chunks of code with completely different >> expectations. > > All do_undefinstr will do with the correction is subtract it off so > that pc points to the faulting instruction instead of the next > instruction, and calls any registered handlers. VFP_bounce sometimes > subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP > instructions are all 32 bit), and sometimes leaves pc pointing at the > next instruction. If both were modified to expect the faulting > instruction's pc, do_undefinstr would leave pc unmodified, and > VFP_bounce would add 4 in the opposite cases from where it subtracts 4 > now. > > iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. > > Currently, iwmmxt_task_enable will either end up in do_undefinstr > through mov pc, lr, or it will subtract 4 from the pc register value > and return through ret_from_exception. With the change above, it > would not need to modify the pc value at all. > > crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. > >> Maybe we need to pass in the correction factor to do_undefinstr instead. >> > > That just makes it even more complicated, the correction factor has to > be tracked through every code path. > > RFC patch to follow. > Missed one case. do_fpe calls into nwfpe_enter (I don't see any other users of fp_enter), which assumes the PC in the registers on the stack is faulting PC + 4, ignores r2, and uses r0 as the first instruction to emulate. It will need to be slightly modified in my patch. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-27 6:35 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 6:35 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote: > On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: >> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >>> I think there is an additional change needed to >>> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called >>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >>> to be the fault address, and not the next PC, and gets called for 2 or >>> 4 byte instructions. >> >> It expects it to be the next PC: >> >> asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >> { >> ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4; >> ? ? ? ?unsigned int instr; >> ? ? ? ?siginfo_t info; >> ? ? ? ?void __user *pc; >> >> ? ? ? ?/* >> ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead, >> ? ? ? ? * depending whether we're in Thumb mode or not. >> ? ? ? ? * Correct this offset. >> ? ? ? ? */ >> ? ? ? ?regs->ARM_pc -= correction; >> >> We expect the PC to be pointing at the next instruction to be executed. >> This is the value of the PC saved by the CPU when entering the exception. >> We correct the PC by four bytes for ARM mode to point at the previously >> executed instruction. >> >> For 16-bit Thumb mode, the PC is again pointing at the next instruction >> to be executed, and this is the value saved by the CPU. ?So we correct >> the PC by two bytes as that is the Thumb instruction size. >> >> The problem comes with T2, where we advance the saved PC by two bytes >> if the instruction was 32-bit such that it again points at the next >> instruction to be executed. ?This is where the problem comes in because >> we have two different chunks of code with completely different >> expectations. > > All do_undefinstr will do with the correction is subtract it off so > that pc points to the faulting instruction instead of the next > instruction, and calls any registered handlers. ?VFP_bounce sometimes > subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP > instructions are all 32 bit), and sometimes leaves pc pointing at the > next instruction. ?If both were modified to expect the faulting > instruction's pc, do_undefinstr would leave pc unmodified, and > VFP_bounce would add 4 in the opposite cases from where it subtracts 4 > now. > > iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. > > Currently, iwmmxt_task_enable will either end up in do_undefinstr > through mov pc, lr, or it will subtract 4 from the pc register value > and return through ret_from_exception. ?With the change above, it > would not need to modify the pc value at all. > > crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. > >> Maybe we need to pass in the correction factor to do_undefinstr instead. >> > > That just makes it even more complicated, the correction factor has to > be tracked through every code path. > > RFC patch to follow. > Missed one case. do_fpe calls into nwfpe_enter (I don't see any other users of fp_enter), which assumes the PC in the registers on the stack is faulting PC + 4, ignores r2, and uses r0 as the first instruction to emulate. It will need to be slightly modified in my patch. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-27 6:35 ` Colin Cross @ 2011-01-27 7:30 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 7:30 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Wed, Jan 26, 2011 at 10:35 PM, Colin Cross <ccross@android.com> wrote: > On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote: >> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >>>> I think there is an additional change needed to >>>> __und_usr_unknown/do_undefinstr. do_undefinstr, which gets called >>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >>>> to be the fault address, and not the next PC, and gets called for 2 or >>>> 4 byte instructions. >>> >>> It expects it to be the next PC: >>> >>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >>> { >>> unsigned int correction = thumb_mode(regs) ? 2 : 4; >>> unsigned int instr; >>> siginfo_t info; >>> void __user *pc; >>> >>> /* >>> * According to the ARM ARM, PC is 2 or 4 bytes ahead, >>> * depending whether we're in Thumb mode or not. >>> * Correct this offset. >>> */ >>> regs->ARM_pc -= correction; >>> >>> We expect the PC to be pointing at the next instruction to be executed. >>> This is the value of the PC saved by the CPU when entering the exception. >>> We correct the PC by four bytes for ARM mode to point at the previously >>> executed instruction. >>> >>> For 16-bit Thumb mode, the PC is again pointing at the next instruction >>> to be executed, and this is the value saved by the CPU. So we correct >>> the PC by two bytes as that is the Thumb instruction size. >>> >>> The problem comes with T2, where we advance the saved PC by two bytes >>> if the instruction was 32-bit such that it again points at the next >>> instruction to be executed. This is where the problem comes in because >>> we have two different chunks of code with completely different >>> expectations. >> >> All do_undefinstr will do with the correction is subtract it off so >> that pc points to the faulting instruction instead of the next >> instruction, and calls any registered handlers. VFP_bounce sometimes >> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP >> instructions are all 32 bit), and sometimes leaves pc pointing at the >> next instruction. If both were modified to expect the faulting >> instruction's pc, do_undefinstr would leave pc unmodified, and >> VFP_bounce would add 4 in the opposite cases from where it subtracts 4 >> now. >> >> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. >> >> Currently, iwmmxt_task_enable will either end up in do_undefinstr >> through mov pc, lr, or it will subtract 4 from the pc register value >> and return through ret_from_exception. With the change above, it >> would not need to modify the pc value at all. >> >> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. >> >>> Maybe we need to pass in the correction factor to do_undefinstr instead. >>> >> >> That just makes it even more complicated, the correction factor has to >> be tracked through every code path. >> >> RFC patch to follow. >> > > Missed one case. do_fpe calls into nwfpe_enter (I don't see any other > users of fp_enter), which assumes the PC in the registers on the stack > is faulting PC + 4, ignores r2, and uses r0 as the first instruction > to emulate. It will need to be slightly modified in my patch. > This patch is on top of your first patch that cleans up the comments, but not the second patch that fixes the PC in the VFP exception case. Compiled but not run tested, and I can't test crunch or iwmmxt. vfpmodule.c may be able to be simplified (right now its both adding and subtracting 4 from regs->ARM_pc). arch/arm/kernel/crunch-bits.S | 3 --- arch/arm/kernel/entry-armv.S | 38 ++++++++++++++++++++------------------ arch/arm/kernel/iwmmxt.S | 3 --- arch/arm/nwfpe/entry.S | 1 + arch/arm/vfp/vfphw.S | 5 ----- arch/arm/vfp/vfpmodule.c | 6 ++++++ 6 files changed, 27 insertions(+), 29 deletions(-) diff --git a/arch/arm/kernel/crunch-bits.S b/arch/arm/kernel/crunch-bits.S index 0ec9bb4..0095368 100644 --- a/arch/arm/kernel/crunch-bits.S +++ b/arch/arm/kernel/crunch-bits.S @@ -77,11 +77,8 @@ ENTRY(crunch_task_enable) ldr r3, =crunch_owner add r0, r10, #TI_CRUNCH_STATE @ get task crunch save area - ldr r2, [sp, #60] @ current task pc value ldr r1, [r3] @ get current crunch owner str r0, [r3] @ this task now owns crunch - sub r2, r2, #4 @ adjust pc back - str r2, [sp, #60] ldr r2, [r8, #0x80] mov r2, r2 @ flush out enable (@@@) diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 5876eec..f09e576 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -259,14 +259,18 @@ __und_svc: @ r0 - instruction @ #ifndef CONFIG_THUMB2_KERNEL - ldr r0, [r2, #-4] + sub r2, r2, #4 + ldr r0, [r2] #else - ldrh r0, [r2, #-2] @ Thumb instruction at LR - 2 + sub r2, r2, #2 + ldrh r0, [r2] @ Thumb instruction at LR - 2 and r9, r0, #0xf800 cmp r9, #0xe800 @ 32-bit instruction if xx >= 0 - ldrhhs r9, [r2] @ bottom 16 bits + ldrhhs r9, [r2, #2] @ bottom 16 bits orrhs r0, r9, r0, lsl #16 #endif + str r2, [sp, #S_PC] @ replace regs->ARM_pc with + @ pc of faulting instruction adr r9, BSYM(1f) bl call_fpe @@ -474,36 +478,34 @@ __und_usr: @ r3 = regs->ARM_cpsr @ tst r3, #PSR_T_BIT @ Thumb mode? - itttt eq @ explicit IT needed for the 1f label - subeq r4, r2, #4 @ ARM instr at LR - 4 -1: ldreqt r0, [r4] + itet eq @ explicit IT needed for the 1f label + subeq r2, r2, #4 @ ARM instr at LR - 4 + subne r2, r2, #2 @ Thumb instr at LR - 2 +1: ldreqt r0, [r2] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + str r2, [sp, #S_PC] @ replace regs->ARM_pc with + @ pc of faulting instruction @ @ r0 = 32-bit ARM instruction which caused the exception - @ r2 = PC value for the following instruction (:= regs->ARM_pc) - @ r4 = PC value for the faulting instruction - @ + @ r2 = PC value for the faulting instruction (:= regs->ARM_pc) beq call_fpe @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 - sub r4, r2, #2 @ Thumb instr at LR - 2 2: - ARM( ldrht r5, [r4], #2 ) - THUMB( ldrht r5, [r4] ) - THUMB( add r4, r4, #2 ) + ARM( ldrht r5, [r2], #2 ) + THUMB( ldrht r5, [r2] ) + THUMB( add r4, r2, #2 ) and r0, r5, #0xf800 @ mask bits 111x x... .... .... cmp r0, #0xe800 @ 32bit instruction if xx != 0 blo __und_usr_unknown 3: ldrht r0, [r4] - add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 @ @ r0 = the two 16-bit Thumb instructions which caused the exception - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) - @ r4 = PC value for the first 16-bit Thumb instruction + @ r2 = PC value for the faulting Thumb instruction (:= regs->ARM_pc) @ #else b __und_usr_unknown @@ -544,7 +546,7 @@ ENDPROC(__und_usr) * * Emulators may wish to make use of the following registers: * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) - * r2 = PC value to resume execution after successful emulation + * r2 = PC value of the faulting instruction * r9 = normal "successful" return address * r10 = this threads thread_info structure * lr = unrecognised instruction return address @@ -662,7 +664,7 @@ do_fpe: /* * The FP module is called with these registers set: * r0 = instruction - * r2 = PC+4 + * r2 = PC of the faulting instruction * r9 = normal "successful" return address * r10 = FP workspace * lr = unrecognised FP instruction return address diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S index 7fa3bb0..0842487 100644 --- a/arch/arm/kernel/iwmmxt.S +++ b/arch/arm/kernel/iwmmxt.S @@ -80,11 +80,8 @@ ENTRY(iwmmxt_task_enable) ldr r3, =concan_owner add r0, r10, #TI_IWMMXT_STATE @ get task Concan save area - ldr r2, [sp, #60] @ current task pc value ldr r1, [r3] @ get current Concan owner str r0, [r3] @ this task now owns Concan regs - sub r2, r2, #4 @ adjust pc back - str r2, [sp, #60] mrc p15, 0, r2, c2, c0, 0 mov r2, r2 @ cpwait diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S index cafa183..923f03f 100644 --- a/arch/arm/nwfpe/entry.S +++ b/arch/arm/nwfpe/entry.S @@ -78,6 +78,7 @@ nwfpe_enter: mov sl, sp @ we access the registers via 'sl' ldr r5, [sp, #S_PC] @ get contents of PC; + add r5, r5, #4 @ skip first PC, already have opcode mov r6, r0 @ save the opcode emulate: ldr r1, [sp, #S_PSR] @ fetch the PSR diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 7292921..73dea8b 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -139,11 +139,6 @@ check_for_exception: @ out before setting an FPEXC that @ stops us reading stuff VFPFMXR FPEXC, r1 @ Restore FPEXC last - sub r2, r2, #4 @ Retry current instruction - if Thumb - str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, - @ else it's one 32-bit instruction, so - @ always subtract 4 from the following - @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 0797cb5..1142742 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -293,6 +293,12 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) orig_fpscr = fpscr = fmrx(FPSCR); /* + * Normally the instruction will be emulated, set return PC to be + * the next instruction + */ + regs->ARM_pc += 4; + + /* * Check for the special VFP subarch 1 and FPSCR.IXE bit case */ if ((fpsid & FPSID_ARCH_MASK) == (1 << FPSID_ARCH_BIT) ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-27 7:30 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-01-27 7:30 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 26, 2011 at 10:35 PM, Colin Cross <ccross@android.com> wrote: > On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote: >> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote: >>>> I think there is an additional change needed to >>>> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called >>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc >>>> to be the fault address, and not the next PC, and gets called for 2 or >>>> 4 byte instructions. >>> >>> It expects it to be the next PC: >>> >>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs) >>> { >>> ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4; >>> ? ? ? ?unsigned int instr; >>> ? ? ? ?siginfo_t info; >>> ? ? ? ?void __user *pc; >>> >>> ? ? ? ?/* >>> ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead, >>> ? ? ? ? * depending whether we're in Thumb mode or not. >>> ? ? ? ? * Correct this offset. >>> ? ? ? ? */ >>> ? ? ? ?regs->ARM_pc -= correction; >>> >>> We expect the PC to be pointing at the next instruction to be executed. >>> This is the value of the PC saved by the CPU when entering the exception. >>> We correct the PC by four bytes for ARM mode to point at the previously >>> executed instruction. >>> >>> For 16-bit Thumb mode, the PC is again pointing at the next instruction >>> to be executed, and this is the value saved by the CPU. ?So we correct >>> the PC by two bytes as that is the Thumb instruction size. >>> >>> The problem comes with T2, where we advance the saved PC by two bytes >>> if the instruction was 32-bit such that it again points at the next >>> instruction to be executed. ?This is where the problem comes in because >>> we have two different chunks of code with completely different >>> expectations. >> >> All do_undefinstr will do with the correction is subtract it off so >> that pc points to the faulting instruction instead of the next >> instruction, and calls any registered handlers. ?VFP_bounce sometimes >> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP >> instructions are all 32 bit), and sometimes leaves pc pointing at the >> next instruction. ?If both were modified to expect the faulting >> instruction's pc, do_undefinstr would leave pc unmodified, and >> VFP_bounce would add 4 in the opposite cases from where it subtracts 4 >> now. >> >> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr. >> >> Currently, iwmmxt_task_enable will either end up in do_undefinstr >> through mov pc, lr, or it will subtract 4 from the pc register value >> and return through ret_from_exception. ?With the change above, it >> would not need to modify the pc value at all. >> >> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same. >> >>> Maybe we need to pass in the correction factor to do_undefinstr instead. >>> >> >> That just makes it even more complicated, the correction factor has to >> be tracked through every code path. >> >> RFC patch to follow. >> > > Missed one case. ?do_fpe calls into nwfpe_enter (I don't see any other > users of fp_enter), which assumes the PC in the registers on the stack > is faulting PC + 4, ignores r2, and uses r0 as the first instruction > to emulate. ?It will need to be slightly modified in my patch. > This patch is on top of your first patch that cleans up the comments, but not the second patch that fixes the PC in the VFP exception case. Compiled but not run tested, and I can't test crunch or iwmmxt. vfpmodule.c may be able to be simplified (right now its both adding and subtracting 4 from regs->ARM_pc). arch/arm/kernel/crunch-bits.S | 3 --- arch/arm/kernel/entry-armv.S | 38 ++++++++++++++++++++------------------ arch/arm/kernel/iwmmxt.S | 3 --- arch/arm/nwfpe/entry.S | 1 + arch/arm/vfp/vfphw.S | 5 ----- arch/arm/vfp/vfpmodule.c | 6 ++++++ 6 files changed, 27 insertions(+), 29 deletions(-) diff --git a/arch/arm/kernel/crunch-bits.S b/arch/arm/kernel/crunch-bits.S index 0ec9bb4..0095368 100644 --- a/arch/arm/kernel/crunch-bits.S +++ b/arch/arm/kernel/crunch-bits.S @@ -77,11 +77,8 @@ ENTRY(crunch_task_enable) ldr r3, =crunch_owner add r0, r10, #TI_CRUNCH_STATE @ get task crunch save area - ldr r2, [sp, #60] @ current task pc value ldr r1, [r3] @ get current crunch owner str r0, [r3] @ this task now owns crunch - sub r2, r2, #4 @ adjust pc back - str r2, [sp, #60] ldr r2, [r8, #0x80] mov r2, r2 @ flush out enable (@@@) diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 5876eec..f09e576 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -259,14 +259,18 @@ __und_svc: @ r0 - instruction @ #ifndef CONFIG_THUMB2_KERNEL - ldr r0, [r2, #-4] + sub r2, r2, #4 + ldr r0, [r2] #else - ldrh r0, [r2, #-2] @ Thumb instruction at LR - 2 + sub r2, r2, #2 + ldrh r0, [r2] @ Thumb instruction at LR - 2 and r9, r0, #0xf800 cmp r9, #0xe800 @ 32-bit instruction if xx >= 0 - ldrhhs r9, [r2] @ bottom 16 bits + ldrhhs r9, [r2, #2] @ bottom 16 bits orrhs r0, r9, r0, lsl #16 #endif + str r2, [sp, #S_PC] @ replace regs->ARM_pc with + @ pc of faulting instruction adr r9, BSYM(1f) bl call_fpe @@ -474,36 +478,34 @@ __und_usr: @ r3 = regs->ARM_cpsr @ tst r3, #PSR_T_BIT @ Thumb mode? - itttt eq @ explicit IT needed for the 1f label - subeq r4, r2, #4 @ ARM instr at LR - 4 -1: ldreqt r0, [r4] + itet eq @ explicit IT needed for the 1f label + subeq r2, r2, #4 @ ARM instr at LR - 4 + subne r2, r2, #2 @ Thumb instr at LR - 2 +1: ldreqt r0, [r2] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + str r2, [sp, #S_PC] @ replace regs->ARM_pc with + @ pc of faulting instruction @ @ r0 = 32-bit ARM instruction which caused the exception - @ r2 = PC value for the following instruction (:= regs->ARM_pc) - @ r4 = PC value for the faulting instruction - @ + @ r2 = PC value for the faulting instruction (:= regs->ARM_pc) beq call_fpe @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 - sub r4, r2, #2 @ Thumb instr at LR - 2 2: - ARM( ldrht r5, [r4], #2 ) - THUMB( ldrht r5, [r4] ) - THUMB( add r4, r4, #2 ) + ARM( ldrht r5, [r2], #2 ) + THUMB( ldrht r5, [r2] ) + THUMB( add r4, r2, #2 ) and r0, r5, #0xf800 @ mask bits 111x x... .... .... cmp r0, #0xe800 @ 32bit instruction if xx != 0 blo __und_usr_unknown 3: ldrht r0, [r4] - add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 @ @ r0 = the two 16-bit Thumb instructions which caused the exception - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) - @ r4 = PC value for the first 16-bit Thumb instruction + @ r2 = PC value for the faulting Thumb instruction (:= regs->ARM_pc) @ #else b __und_usr_unknown @@ -544,7 +546,7 @@ ENDPROC(__und_usr) * * Emulators may wish to make use of the following registers: * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) - * r2 = PC value to resume execution after successful emulation + * r2 = PC value of the faulting instruction * r9 = normal "successful" return address * r10 = this threads thread_info structure * lr = unrecognised instruction return address @@ -662,7 +664,7 @@ do_fpe: /* * The FP module is called with these registers set: * r0 = instruction - * r2 = PC+4 + * r2 = PC of the faulting instruction * r9 = normal "successful" return address * r10 = FP workspace * lr = unrecognised FP instruction return address diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S index 7fa3bb0..0842487 100644 --- a/arch/arm/kernel/iwmmxt.S +++ b/arch/arm/kernel/iwmmxt.S @@ -80,11 +80,8 @@ ENTRY(iwmmxt_task_enable) ldr r3, =concan_owner add r0, r10, #TI_IWMMXT_STATE @ get task Concan save area - ldr r2, [sp, #60] @ current task pc value ldr r1, [r3] @ get current Concan owner str r0, [r3] @ this task now owns Concan regs - sub r2, r2, #4 @ adjust pc back - str r2, [sp, #60] mrc p15, 0, r2, c2, c0, 0 mov r2, r2 @ cpwait diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S index cafa183..923f03f 100644 --- a/arch/arm/nwfpe/entry.S +++ b/arch/arm/nwfpe/entry.S @@ -78,6 +78,7 @@ nwfpe_enter: mov sl, sp @ we access the registers via 'sl' ldr r5, [sp, #S_PC] @ get contents of PC; + add r5, r5, #4 @ skip first PC, already have opcode mov r6, r0 @ save the opcode emulate: ldr r1, [sp, #S_PSR] @ fetch the PSR diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 7292921..73dea8b 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -139,11 +139,6 @@ check_for_exception: @ out before setting an FPEXC that @ stops us reading stuff VFPFMXR FPEXC, r1 @ Restore FPEXC last - sub r2, r2, #4 @ Retry current instruction - if Thumb - str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, - @ else it's one 32-bit instruction, so - @ always subtract 4 from the following - @ instruction address. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 0797cb5..1142742 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -293,6 +293,12 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) orig_fpscr = fpscr = fmrx(FPSCR); /* + * Normally the instruction will be emulated, set return PC to be + * the next instruction + */ + regs->ARM_pc += 4; + + /* * Check for the special VFP subarch 1 and FPSCR.IXE bit case */ if ((fpsid & FPSID_ARCH_MASK) == (1 << FPSID_ARCH_BIT) ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-27 7:30 ` Colin Cross @ 2011-02-09 18:12 ` Colin Cross -1 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-02-09 18:12 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel On Wed, Jan 26, 2011 at 11:30 PM, Colin Cross <ccross@android.com> wrote: > This patch is on top of your first patch that cleans up the comments, > but not the second patch that fixes the PC in the VFP exception case. > Compiled but not run tested, and I can't test crunch or iwmmxt. > vfpmodule.c may be able to be simplified (right now its both adding > and subtracting 4 from regs->ARM_pc). > > arch/arm/kernel/crunch-bits.S | 3 --- > arch/arm/kernel/entry-armv.S | 38 ++++++++++++++++++++------------------ > arch/arm/kernel/iwmmxt.S | 3 --- > arch/arm/nwfpe/entry.S | 1 + > arch/arm/vfp/vfphw.S | 5 ----- > arch/arm/vfp/vfpmodule.c | 6 ++++++ > 6 files changed, 27 insertions(+), 29 deletions(-) Finally got a chance to do some quick testing on this, it needs one more change to drop the correction in do_undefinstr in traps.c, and fix one register in entry-armv.S. Russell, do you have any interest in this solution? Alternatively, a one line change to store the corrected PC into the stack in vfphw.S instead of entry-armv.S will fix the original problem (VFP_bounce from thumb mode returning to the wrong address) without causing the problem introduced by your second patch (undefined NEON or VFP-D32 instructions in thumb mode returning to the wrong address). ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-02-09 18:12 ` Colin Cross 0 siblings, 0 replies; 64+ messages in thread From: Colin Cross @ 2011-02-09 18:12 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 26, 2011 at 11:30 PM, Colin Cross <ccross@android.com> wrote: > This patch is on top of your first patch that cleans up the comments, > but not the second patch that fixes the PC in the VFP exception case. > Compiled but not run tested, and I can't test crunch or iwmmxt. > vfpmodule.c may be able to be simplified (right now its both adding > and subtracting 4 from regs->ARM_pc). > > ?arch/arm/kernel/crunch-bits.S | ? ?3 --- > ?arch/arm/kernel/entry-armv.S ?| ? 38 ++++++++++++++++++++------------------ > ?arch/arm/kernel/iwmmxt.S ? ? ?| ? ?3 --- > ?arch/arm/nwfpe/entry.S ? ? ? ?| ? ?1 + > ?arch/arm/vfp/vfphw.S ? ? ? ? ?| ? ?5 ----- > ?arch/arm/vfp/vfpmodule.c ? ? ?| ? ?6 ++++++ > ?6 files changed, 27 insertions(+), 29 deletions(-) Finally got a chance to do some quick testing on this, it needs one more change to drop the correction in do_undefinstr in traps.c, and fix one register in entry-armv.S. Russell, do you have any interest in this solution? Alternatively, a one line change to store the corrected PC into the stack in vfphw.S instead of entry-armv.S will fix the original problem (VFP_bounce from thumb mode returning to the wrong address) without causing the problem introduced by your second patch (undefined NEON or VFP-D32 instructions in thumb mode returning to the wrong address). ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 18:47 ` Russell King - ARM Linux @ 2011-01-15 15:38 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-15 15:38 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: linux-kernel, linux-arm-kernel, Colin Cross On 14 January 2011 18:47, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote: >> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: >> > I agree, this code needs some clean-up. Maybe for Undef we could unify >> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the >> > breakpoint code, I haven't checked). >> > >> > Otherwise just let the code handling the undef deal with the ARM/Thumb >> > difference. For SVC, it makes sense to have different offsets as we >> > always return to the next instruction. >> >> I think it just needs better documentation. >> >> Having been through all this, there _are_ bugs lurking in the code exactly >> because of this randomness with what PC value is means what. >> >> When the VFP support code tests the state of the VFP hardware during boot, >> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal >> VFP handling code, and executes a VFP instruction. >> >> If this VFP instruction faults (eg, because there is no VFP hardware >> present or we're not permitted to use it), it could end up resuming >> execution in the middle of the 16-bit paired instruction because >> regs->ARM_pc points in the middle of it. >> >> So vfp_testing_entry should at least store r2 into regs->ARM_pc to >> guarantee resuming at the following instruction. >> >> So maybe the right answer is to store r2 into regs->ARM_pc in >> process_exception in the VFP assembly code too? >> >> Or maybe we should just make it unconditional that whenever we have an >> undefined instruction exception, the regs->ARM_pc value will always be >> set for resuming execution after the faulted instruction. That makes >> it consistent with r2 throughout the code in every case. > > So... this incrementally on top of the previous patch (which I've > reproduced below as there's a subtle comment change in there wrt IRQ > state.) > > This means we have consistent state - both r2 and regs->ARM_pc always > point to the next instruction to be executed in every case, which means > its easy to understand and remember while reading through the code. > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > --- b/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -499,10 +499,11 @@ > blo __und_usr_unknown > 3: ldrht r0, [r4] > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > - orr r0, r0, r5, lsl #16 > + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update > + orr r0, r0, r5, lsl #16 @ regs->ARM_pc > @ > @ r0 = the two 16-bit Thumb instructions which caused the exception > - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > @ r4 = PC value for the first 16-bit Thumb instruction > @ > #else Do we need to modify the VFP entry code to avoit the store to ARM_pc? -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-15 15:38 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-15 15:38 UTC (permalink / raw) To: linux-arm-kernel On 14 January 2011 18:47, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote: >> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: >> > I agree, this code needs some clean-up. Maybe for Undef we could unify >> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the >> > breakpoint code, I haven't checked). >> > >> > Otherwise just let the code handling the undef deal with the ARM/Thumb >> > difference. For SVC, it makes sense to have different offsets as we >> > always return to the next instruction. >> >> I think it just needs better documentation. >> >> Having been through all this, there _are_ bugs lurking in the code exactly >> because of this randomness with what PC value is means what. >> >> When the VFP support code tests the state of the VFP hardware during boot, >> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal >> VFP handling code, and executes a VFP instruction. >> >> If this VFP instruction faults (eg, because there is no VFP hardware >> present or we're not permitted to use it), it could end up resuming >> execution in the middle of the 16-bit paired instruction because >> regs->ARM_pc points in the middle of it. >> >> So vfp_testing_entry should at least store r2 into regs->ARM_pc to >> guarantee resuming at the following instruction. >> >> So maybe the right answer is to store r2 into regs->ARM_pc in >> process_exception in the VFP assembly code too? >> >> Or maybe we should just make it unconditional that whenever we have an >> undefined instruction exception, the regs->ARM_pc value will always be >> set for resuming execution after the faulted instruction. ?That makes >> it consistent with r2 throughout the code in every case. > > So... this incrementally on top of the previous patch (which I've > reproduced below as there's a subtle comment change in there wrt IRQ > state.) > > This means we have consistent state - both r2 and regs->ARM_pc always > point to the next instruction to be executed in every case, which means > its easy to understand and remember while reading through the code. > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > --- b/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -499,10 +499,11 @@ > ? ? ? ?blo ? ? __und_usr_unknown > ?3: ? ? ldrht ? r0, [r4] > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > - ? ? ? orr ? ? r0, r0, r5, lsl #16 > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc > ? ? ? ?@ > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction > ? ? ? ?@ > ?#else Do we need to modify the VFP entry code to avoit the store to ARM_pc? -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-15 15:38 ` Catalin Marinas @ 2011-01-15 15:43 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-15 15:43 UTC (permalink / raw) To: Catalin Marinas; +Cc: linux-kernel, linux-arm-kernel, Colin Cross On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote: > On 14 January 2011 18:47, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > > --- b/arch/arm/kernel/entry-armv.S > > +++ b/arch/arm/kernel/entry-armv.S > > @@ -499,10 +499,11 @@ > > blo __und_usr_unknown > > 3: ldrht r0, [r4] > > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > > - orr r0, r0, r5, lsl #16 > > + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update > > + orr r0, r0, r5, lsl #16 @ regs->ARM_pc > > @ > > @ r0 = the two 16-bit Thumb instructions which caused the exception > > - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > > @ r4 = PC value for the first 16-bit Thumb instruction > > @ > > #else > > Do we need to modify the VFP entry code to avoit the store to ARM_pc? The one after the sub #4 instruction? That's answered by the comments... "retry the instruction" and that r2 = regs->ARM_pc in every case, and both r2 and regs->ARM_pc point at the _following_ instruction... I do hope this isn't a case that _more_ comments are making this more confusing (which seems to be the way with documentation - the more words you use, the more questions people have). Maybe we should get rid of all the comments instead? ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-15 15:43 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-15 15:43 UTC (permalink / raw) To: linux-arm-kernel On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote: > On 14 January 2011 18:47, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > > --- b/arch/arm/kernel/entry-armv.S > > +++ b/arch/arm/kernel/entry-armv.S > > @@ -499,10 +499,11 @@ > > ? ? ? ?blo ? ? __und_usr_unknown > > ?3: ? ? ldrht ? r0, [r4] > > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > > - ? ? ? orr ? ? r0, r0, r5, lsl #16 > > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update > > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc > > ? ? ? ?@ > > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception > > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) > > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction > > ? ? ? ?@ > > ?#else > > Do we need to modify the VFP entry code to avoit the store to ARM_pc? The one after the sub #4 instruction? That's answered by the comments... "retry the instruction" and that r2 = regs->ARM_pc in every case, and both r2 and regs->ARM_pc point at the _following_ instruction... I do hope this isn't a case that _more_ comments are making this more confusing (which seems to be the way with documentation - the more words you use, the more questions people have). Maybe we should get rid of all the comments instead? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-15 15:43 ` Russell King - ARM Linux @ 2011-01-16 11:51 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 11:51 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, Colin Cross On Saturday, 15 January 2011, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote: >> On 14 January 2011 18:47, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> > --- b/arch/arm/kernel/entry-armv.S >> > +++ b/arch/arm/kernel/entry-armv.S >> > @@ -499,10 +499,11 @@ >> > blo __und_usr_unknown >> > 3: ldrht r0, [r4] >> > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 >> > - orr r0, r0, r5, lsl #16 >> > + str r2, [sp, #S_PC] @ it's a 2x16bit instr, update >> > + orr r0, r0, r5, lsl #16 @ regs->ARM_pc >> > @ >> > @ r0 = the two 16-bit Thumb instructions which caused the exception >> > - @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >> > @ r4 = PC value for the first 16-bit Thumb instruction >> > @ >> > #else >> >> Do we need to modify the VFP entry code to avoit the store to ARM_pc? > > The one after the sub #4 instruction? No, I misread the code, we still need the one after sub #4. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-16 11:51 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 11:51 UTC (permalink / raw) To: linux-arm-kernel On Saturday, 15 January 2011, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote: >> On 14 January 2011 18:47, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> > --- b/arch/arm/kernel/entry-armv.S >> > +++ b/arch/arm/kernel/entry-armv.S >> > @@ -499,10 +499,11 @@ >> > ? ? ? ?blo ? ? __und_usr_unknown >> > ?3: ? ? ldrht ? r0, [r4] >> > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 >> > - ? ? ? orr ? ? r0, r0, r5, lsl #16 >> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update >> > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc >> > ? ? ? ?@ >> > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception >> > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) >> > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc) >> > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction >> > ? ? ? ?@ >> > ?#else >> >> Do we need to modify the VFP entry code to avoit the store to ARM_pc? > > The one after the sub #4 instruction? No, I misread the code, we still need the one after sub #4. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 17:30 ` Russell King - ARM Linux @ 2011-01-15 15:31 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-15 15:31 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On 14 January 2011 17:30, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: >> I agree, this code needs some clean-up. Maybe for Undef we could unify >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the >> breakpoint code, I haven't checked). >> >> Otherwise just let the code handling the undef deal with the ARM/Thumb >> difference. For SVC, it makes sense to have different offsets as we >> always return to the next instruction. [...] > When the VFP support code tests the state of the VFP hardware during boot, > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > VFP handling code, and executes a VFP instruction. > > If this VFP instruction faults (eg, because there is no VFP hardware > present or we're not permitted to use it), it could end up resuming > execution in the middle of the 16-bit paired instruction because > regs->ARM_pc points in the middle of it. Yes, that's possible. We probably never tried a Thumb-2 kernel where VFP isn't present. > Or maybe we should just make it unconditional that whenever we have an > undefined instruction exception, the regs->ARM_pc value will always be > set for resuming execution after the faulted instruction. That makes > it consistent with r2 throughout the code in every case. I have some comments below. > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > index 2b46fea..5876eec 100644 > --- a/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > .align 5 > __und_usr: > usr_entry > - > - @ > - @ fall through to the emulation code, which returns using r9 if > - @ it has emulated the instruction, or the more conventional lr > - @ if we are to treat this as a real undefined instruction > @ > - @ r0 - instruction > + @ The emulation code returns using r9 if it has emulated the > + @ instruction, or the more conventional lr if we are to treat > + @ this as a real undefined instruction > @ > adr r9, BSYM(ret_from_exception) > adr lr, BSYM(__und_usr_unknown) > + @ > + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > + @ faulting instruction depending on Thumb mode. > + @ r3 = regs->ARM_cpsr > + @ > tst r3, #PSR_T_BIT @ Thumb mode? > - itet eq @ explicit IT needed for the 1f label > + itttt eq @ explicit IT needed for the 1f label > subeq r4, r2, #4 @ ARM instr at LR - 4 > - subne r4, r2, #2 @ Thumb instr at LR - 2 > 1: ldreqt r0, [r4] The itttt above should just be itt. The reveq is conditionally compiled and beq doesn't necessarily need one. > #ifdef CONFIG_CPU_ENDIAN_BE8 > reveq r0, r0 @ little endian instruction > #endif > + @ > + @ r0 = 32-bit ARM instruction which caused the exception > + @ r2 = PC value for the following instruction (:= regs->ARM_pc) Is r2 here always the PC value following instruction? If the Thumb instruction was 32-bit, it just points in the middle of the faulting instruction. > + @ r4 = PC value for the faulting instruction > + @ > beq call_fpe > + > @ Thumb instruction > #if __LINUX_ARM_ARCH__ >= 7 > + sub r4, r2, #2 @ Thumb instr at LR - 2 > 2: > ARM( ldrht r5, [r4], #2 ) > THUMB( ldrht r5, [r4] ) > @@ -492,18 +500,19 @@ __und_usr: > 3: ldrht r0, [r4] > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > orr r0, r0, r5, lsl #16 > + @ > + @ r0 = the two 16-bit Thumb instructions which caused the exception > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) That's correct. > + @ r4 = PC value for the first 16-bit Thumb instruction I think r4 here points in the middle of tha faulting instruction for 32-bit Thumb. > + @ > #else > b __und_usr_unknown > #endif > - UNWIND(.fnend ) > + UNWIND(.fnend) > ENDPROC(__und_usr) > > - @ > - @ fallthrough to call_fpe > - @ > - > /* > - * The out of line fixup for the ldrt above. > + * The out of line fixup for the ldrt instructions above. > */ > .pushsection .fixup, "ax" > 4: mov pc, r9 > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > * NEON handler code. > * > * Emulators may wish to make use of the following registers: > - * r0 = instruction opcode. > - * r2 = PC+4 > + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > + * r2 = PC value to resume execution after successful emulation > * r9 = normal "successful" return address > - * r10 = this threads thread_info structure. > + * r10 = this threads thread_info structure > * lr = unrecognised instruction return address > + * IRQs disabled, FIQs enabled. > */ > @ > @ Fall-through from Thumb-2 __und_usr > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index ee57640..eeb9250 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > void __user *pc; > > /* > - * According to the ARM ARM, PC is 2 or 4 bytes ahead, > - * depending whether we're in Thumb mode or not. > - * Correct this offset. > + * According to the ARM ARM, the PC is 2 or 4 bytes ahead > + * depending on Thumb mode. Correct this offset so that > + * regs->ARM_pc points at the faulting instruction. > */ > regs->ARM_pc -= correction; > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > index 4fa9903..2bf6089 100644 > --- a/arch/arm/vfp/entry.S > +++ b/arch/arm/vfp/entry.S > @@ -19,6 +19,14 @@ > #include <asm/vfpmacros.h> > #include "../kernel/entry-header.S" > > +@ VFP entry point. > +@ > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ r2 = PC value to resume execution after successful emulation > +@ r9 = normal "successful" return address > +@ r10 = this threads thread_info structure > +@ lr = unrecognised instruction return address > +@ > ENTRY(do_vfp) > #ifdef CONFIG_PREEMPT > ldr r4, [r10, #TI_PREEMPT] @ get preempt count > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > index 9897dcf..7292921 100644 > --- a/arch/arm/vfp/vfphw.S > +++ b/arch/arm/vfp/vfphw.S > @@ -61,13 +61,13 @@ > > @ VFP hardware support entry point. > @ > -@ r0 = faulted instruction > -@ r2 = faulted PC+4 > -@ r9 = successful return > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ r2 = PC value to resume execution after successful emulation That's right. > +@ r9 = normal "successful" return address > @ r10 = vfp_state union > @ r11 = CPU number > -@ lr = failure return > - > +@ lr = unrecognised instruction return address > +@ IRQs enabled. > ENTRY(vfp_support_entry) > DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > @@ -138,9 +138,12 @@ check_for_exception: > @ exception before retrying branch > @ out before setting an FPEXC that > @ stops us reading stuff > - VFPFMXR FPEXC, r1 @ restore FPEXC last > - sub r2, r2, #4 > - str r2, [sp, #S_PC] @ retry the instruction > + VFPFMXR FPEXC, r1 @ Restore FPEXC last > + sub r2, r2, #4 @ Retry current instruction - if Thumb > + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, > + @ else it's one 32-bit instruction, so > + @ always subtract 4 from the following > + @ instruction address. I would say it's always a 32-bit instruction but made up of two 16-bit values to allow half-word alignment. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-15 15:31 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-15 15:31 UTC (permalink / raw) To: linux-arm-kernel On 14 January 2011 17:30, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: >> I agree, this code needs some clean-up. Maybe for Undef we could unify >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the >> breakpoint code, I haven't checked). >> >> Otherwise just let the code handling the undef deal with the ARM/Thumb >> difference. For SVC, it makes sense to have different offsets as we >> always return to the next instruction. [...] > When the VFP support code tests the state of the VFP hardware during boot, > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > VFP handling code, and executes a VFP instruction. > > If this VFP instruction faults (eg, because there is no VFP hardware > present or we're not permitted to use it), it could end up resuming > execution in the middle of the 16-bit paired instruction because > regs->ARM_pc points in the middle of it. Yes, that's possible. We probably never tried a Thumb-2 kernel where VFP isn't present. > Or maybe we should just make it unconditional that whenever we have an > undefined instruction exception, the regs->ARM_pc value will always be > set for resuming execution after the faulted instruction. ?That makes > it consistent with r2 throughout the code in every case. I have some comments below. > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > index 2b46fea..5876eec 100644 > --- a/arch/arm/kernel/entry-armv.S > +++ b/arch/arm/kernel/entry-armv.S > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > ? ? ? ?.align ?5 > ?__und_usr: > ? ? ? ?usr_entry > - > - ? ? ? @ > - ? ? ? @ fall through to the emulation code, which returns using r9 if > - ? ? ? @ it has emulated the instruction, or the more conventional lr > - ? ? ? @ if we are to treat this as a real undefined instruction > ? ? ? ?@ > - ? ? ? @ ?r0 - instruction > + ? ? ? @ The emulation code returns using r9 if it has emulated the > + ? ? ? @ instruction, or the more conventional lr if we are to treat > + ? ? ? @ this as a real undefined instruction > ? ? ? ?@ > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) > + ? ? ? @ > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > + ? ? ? @ faulting instruction depending on Thumb mode. > + ? ? ? @ r3 = regs->ARM_cpsr > + ? ? ? @ > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > ?1: ? ? ldreqt ?r0, [r4] The itttt above should just be itt. The reveq is conditionally compiled and beq doesn't necessarily need one. > ?#ifdef CONFIG_CPU_ENDIAN_BE8 > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction > ?#endif > + ? ? ? @ > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) Is r2 here always the PC value following instruction? If the Thumb instruction was 32-bit, it just points in the middle of the faulting instruction. > + ? ? ? @ r4 = PC value for the faulting instruction > + ? ? ? @ > ? ? ? ?beq ? ? call_fpe > + > ? ? ? ?@ Thumb instruction > ?#if __LINUX_ARM_ARCH__ >= 7 > + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > ?2: > ?ARM( ?ldrht ? r5, [r4], #2 ? ?) > ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?) > @@ -492,18 +500,19 @@ __und_usr: > ?3: ? ? ldrht ? r0, [r4] > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > ? ? ? ?orr ? ? r0, r0, r5, lsl #16 > + ? ? ? @ > + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) That's correct. > + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction I think r4 here points in the middle of tha faulting instruction for 32-bit Thumb. > + ? ? ? @ > ?#else > ? ? ? ?b ? ? ? __und_usr_unknown > ?#endif > - UNWIND(.fnend ? ? ? ? ) > + UNWIND(.fnend) > ?ENDPROC(__und_usr) > > - ? ? ? @ > - ? ? ? @ fallthrough to call_fpe > - ? ? ? @ > - > ?/* > - * The out of line fixup for the ldrt above. > + * The out of line fixup for the ldrt instructions above. > ?*/ > ? ? ? ?.pushsection .fixup, "ax" > ?4: ? ? mov ? ? pc, r9 > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > ?* NEON handler code. > ?* > ?* Emulators may wish to make use of the following registers: > - * ?r0 ?= instruction opcode. > - * ?r2 ?= PC+4 > + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > + * ?r2 ?= PC value to resume execution after successful emulation > ?* ?r9 ?= normal "successful" return address > - * ?r10 = this threads thread_info structure. > + * ?r10 = this threads thread_info structure > ?* ?lr ?= unrecognised instruction return address > + * IRQs disabled, FIQs enabled. > ?*/ > ? ? ? ?@ > ? ? ? ?@ Fall-through from Thumb-2 __und_usr > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index ee57640..eeb9250 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > ? ? ? ?void __user *pc; > > ? ? ? ?/* > - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead, > - ? ? ? ?* depending whether we're in Thumb mode or not. > - ? ? ? ?* Correct this offset. > + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead > + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that > + ? ? ? ?* regs->ARM_pc points at the faulting instruction. > ? ? ? ? */ > ? ? ? ?regs->ARM_pc -= correction; > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > index 4fa9903..2bf6089 100644 > --- a/arch/arm/vfp/entry.S > +++ b/arch/arm/vfp/entry.S > @@ -19,6 +19,14 @@ > ?#include <asm/vfpmacros.h> > ?#include "../kernel/entry-header.S" > > +@ VFP entry point. > +@ > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ ?r2 ?= PC value to resume execution after successful emulation > +@ ?r9 ?= normal "successful" return address > +@ ?r10 = this threads thread_info structure > +@ ?lr ?= unrecognised instruction return address > +@ > ?ENTRY(do_vfp) > ?#ifdef CONFIG_PREEMPT > ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > index 9897dcf..7292921 100644 > --- a/arch/arm/vfp/vfphw.S > +++ b/arch/arm/vfp/vfphw.S > @@ -61,13 +61,13 @@ > > ?@ VFP hardware support entry point. > ?@ > -@ ?r0 ?= faulted instruction > -@ ?r2 ?= faulted PC+4 > -@ ?r9 ?= successful return > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > +@ ?r2 ?= PC value to resume execution after successful emulation That's right. > +@ ?r9 ?= normal "successful" return address > ?@ ?r10 = vfp_state union > ?@ ?r11 = CPU number > -@ ?lr ?= failure return > - > +@ ?lr ?= unrecognised instruction return address > +@ ?IRQs enabled. > ?ENTRY(vfp_support_entry) > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > @@ -138,9 +138,12 @@ check_for_exception: > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last > - ? ? ? sub ? ? r2, r2, #4 > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions, > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address. I would say it's always a 32-bit instruction but made up of two 16-bit values to allow half-word alignment. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-15 15:31 ` Catalin Marinas @ 2011-01-15 15:40 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-15 15:40 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: > On 14 January 2011 17:30, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > >> I agree, this code needs some clean-up. Maybe for Undef we could unify > >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > >> breakpoint code, I haven't checked). > >> > >> Otherwise just let the code handling the undef deal with the ARM/Thumb > >> difference. For SVC, it makes sense to have different offsets as we > >> always return to the next instruction. > [...] > > When the VFP support code tests the state of the VFP hardware during boot, > > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > > VFP handling code, and executes a VFP instruction. > > > > If this VFP instruction faults (eg, because there is no VFP hardware > > present or we're not permitted to use it), it could end up resuming > > execution in the middle of the 16-bit paired instruction because > > regs->ARM_pc points in the middle of it. > > Yes, that's possible. We probably never tried a Thumb-2 kernel where > VFP isn't present. > > > Or maybe we should just make it unconditional that whenever we have an > > undefined instruction exception, the regs->ARM_pc value will always be > > set for resuming execution after the faulted instruction. That makes > > it consistent with r2 throughout the code in every case. > > I have some comments below. > > > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > > index 2b46fea..5876eec 100644 > > --- a/arch/arm/kernel/entry-armv.S > > +++ b/arch/arm/kernel/entry-armv.S > > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > > .align 5 > > __und_usr: > > usr_entry > > - > > - @ > > - @ fall through to the emulation code, which returns using r9 if > > - @ it has emulated the instruction, or the more conventional lr > > - @ if we are to treat this as a real undefined instruction > > @ > > - @ r0 - instruction > > + @ The emulation code returns using r9 if it has emulated the > > + @ instruction, or the more conventional lr if we are to treat > > + @ this as a real undefined instruction > > @ > > adr r9, BSYM(ret_from_exception) > > adr lr, BSYM(__und_usr_unknown) > > + @ > > + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > > + @ faulting instruction depending on Thumb mode. > > + @ r3 = regs->ARM_cpsr > > + @ > > tst r3, #PSR_T_BIT @ Thumb mode? > > - itet eq @ explicit IT needed for the 1f label > > + itttt eq @ explicit IT needed for the 1f label > > subeq r4, r2, #4 @ ARM instr at LR - 4 > > - subne r4, r2, #2 @ Thumb instr at LR - 2 > > 1: ldreqt r0, [r4] > > The itttt above should just be itt. The reveq is conditionally > compiled and beq doesn't necessarily need one. It's a reveq, so I thought we should cover all the instructions with an 'eq' conditional for thumb. > > > #ifdef CONFIG_CPU_ENDIAN_BE8 > > reveq r0, r0 @ little endian instruction > > #endif > > + @ > > + @ r0 = 32-bit ARM instruction which caused the exception > > + @ r2 = PC value for the following instruction (:= regs->ARM_pc) > > Is r2 here always the PC value following instruction? If the Thumb > instruction was 32-bit, it just points in the middle of the faulting > instruction. Is the T bit ever zero in this case? The code here is: tst r3, #PSR_T_BIT subeq r4, r2, #4 1: ldreqt r0, [r4] reveq r0, r0 beq call_fpe So, if !T, then we subtract 4 and load the instruction (which was the faulting instruction). So r2 is the following instruction. Ah, maybe you're getting confused by the comment. Should we put an 'eq' suffix on the end of each line? ;) > > > + @ r4 = PC value for the faulting instruction > > + @ > > beq call_fpe > > + > > @ Thumb instruction > > #if __LINUX_ARM_ARCH__ >= 7 > > + sub r4, r2, #2 @ Thumb instr at LR - 2 > > 2: > > ARM( ldrht r5, [r4], #2 ) > > THUMB( ldrht r5, [r4] ) > > @@ -492,18 +500,19 @@ __und_usr: > > 3: ldrht r0, [r4] > > add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 > > orr r0, r0, r5, lsl #16 > > + @ > > + @ r0 = the two 16-bit Thumb instructions which caused the exception > > + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > > That's correct. > > > + @ r4 = PC value for the first 16-bit Thumb instruction > > I think r4 here points in the middle of tha faulting instruction for > 32-bit Thumb. You're right. > > > + @ > > #else > > b __und_usr_unknown > > #endif > > - UNWIND(.fnend ) > > + UNWIND(.fnend) > > ENDPROC(__und_usr) > > > > - @ > > - @ fallthrough to call_fpe > > - @ > > - > > /* > > - * The out of line fixup for the ldrt above. > > + * The out of line fixup for the ldrt instructions above. > > */ > > .pushsection .fixup, "ax" > > 4: mov pc, r9 > > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > > * NEON handler code. > > * > > * Emulators may wish to make use of the following registers: > > - * r0 = instruction opcode. > > - * r2 = PC+4 > > + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > > + * r2 = PC value to resume execution after successful emulation > > * r9 = normal "successful" return address > > - * r10 = this threads thread_info structure. > > + * r10 = this threads thread_info structure > > * lr = unrecognised instruction return address > > + * IRQs disabled, FIQs enabled. > > */ > > @ > > @ Fall-through from Thumb-2 __und_usr > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > > index ee57640..eeb9250 100644 > > --- a/arch/arm/kernel/traps.c > > +++ b/arch/arm/kernel/traps.c > > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > > void __user *pc; > > > > /* > > - * According to the ARM ARM, PC is 2 or 4 bytes ahead, > > - * depending whether we're in Thumb mode or not. > > - * Correct this offset. > > + * According to the ARM ARM, the PC is 2 or 4 bytes ahead > > + * depending on Thumb mode. Correct this offset so that > > + * regs->ARM_pc points at the faulting instruction. > > */ > > regs->ARM_pc -= correction; > > > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > > index 4fa9903..2bf6089 100644 > > --- a/arch/arm/vfp/entry.S > > +++ b/arch/arm/vfp/entry.S > > @@ -19,6 +19,14 @@ > > #include <asm/vfpmacros.h> > > #include "../kernel/entry-header.S" > > > > +@ VFP entry point. > > +@ > > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > > +@ r2 = PC value to resume execution after successful emulation > > +@ r9 = normal "successful" return address > > +@ r10 = this threads thread_info structure > > +@ lr = unrecognised instruction return address > > +@ > > ENTRY(do_vfp) > > #ifdef CONFIG_PREEMPT > > ldr r4, [r10, #TI_PREEMPT] @ get preempt count > > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > > index 9897dcf..7292921 100644 > > --- a/arch/arm/vfp/vfphw.S > > +++ b/arch/arm/vfp/vfphw.S > > @@ -61,13 +61,13 @@ > > > > @ VFP hardware support entry point. > > @ > > -@ r0 = faulted instruction > > -@ r2 = faulted PC+4 > > -@ r9 = successful return > > +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) > > +@ r2 = PC value to resume execution after successful emulation > > That's right. > > > +@ r9 = normal "successful" return address > > @ r10 = vfp_state union > > @ r11 = CPU number > > -@ lr = failure return > > - > > +@ lr = unrecognised instruction return address > > +@ IRQs enabled. > > ENTRY(vfp_support_entry) > > DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > > > @@ -138,9 +138,12 @@ check_for_exception: > > @ exception before retrying branch > > @ out before setting an FPEXC that > > @ stops us reading stuff > > - VFPFMXR FPEXC, r1 @ restore FPEXC last > > - sub r2, r2, #4 > > - str r2, [sp, #S_PC] @ retry the instruction > > + VFPFMXR FPEXC, r1 @ Restore FPEXC last > > + sub r2, r2, #4 @ Retry current instruction - if Thumb > > + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, > > + @ else it's one 32-bit instruction, so > > + @ always subtract 4 from the following > > + @ instruction address. > > I would say it's always a 32-bit instruction but made up of two 16-bit > values to allow half-word alignment. Do you have a suggested replacement text? ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-15 15:40 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-15 15:40 UTC (permalink / raw) To: linux-arm-kernel On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: > On 14 January 2011 17:30, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote: > >> I agree, this code needs some clean-up. Maybe for Undef we could unify > >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the > >> breakpoint code, I haven't checked). > >> > >> Otherwise just let the code handling the undef deal with the ARM/Thumb > >> difference. For SVC, it makes sense to have different offsets as we > >> always return to the next instruction. > [...] > > When the VFP support code tests the state of the VFP hardware during boot, > > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal > > VFP handling code, and executes a VFP instruction. > > > > If this VFP instruction faults (eg, because there is no VFP hardware > > present or we're not permitted to use it), it could end up resuming > > execution in the middle of the 16-bit paired instruction because > > regs->ARM_pc points in the middle of it. > > Yes, that's possible. We probably never tried a Thumb-2 kernel where > VFP isn't present. > > > Or maybe we should just make it unconditional that whenever we have an > > undefined instruction exception, the regs->ARM_pc value will always be > > set for resuming execution after the faulted instruction. ?That makes > > it consistent with r2 throughout the code in every case. > > I have some comments below. > > > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S > > index 2b46fea..5876eec 100644 > > --- a/arch/arm/kernel/entry-armv.S > > +++ b/arch/arm/kernel/entry-armv.S > > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) > > ? ? ? ?.align ?5 > > ?__und_usr: > > ? ? ? ?usr_entry > > - > > - ? ? ? @ > > - ? ? ? @ fall through to the emulation code, which returns using r9 if > > - ? ? ? @ it has emulated the instruction, or the more conventional lr > > - ? ? ? @ if we are to treat this as a real undefined instruction > > ? ? ? ?@ > > - ? ? ? @ ?r0 - instruction > > + ? ? ? @ The emulation code returns using r9 if it has emulated the > > + ? ? ? @ instruction, or the more conventional lr if we are to treat > > + ? ? ? @ this as a real undefined instruction > > ? ? ? ?@ > > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) > > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) > > + ? ? ? @ > > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the > > + ? ? ? @ faulting instruction depending on Thumb mode. > > + ? ? ? @ r3 = regs->ARM_cpsr > > + ? ? ? @ > > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? > > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label > > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 > > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > > ?1: ? ? ldreqt ?r0, [r4] > > The itttt above should just be itt. The reveq is conditionally > compiled and beq doesn't necessarily need one. It's a reveq, so I thought we should cover all the instructions with an 'eq' conditional for thumb. > > > ?#ifdef CONFIG_CPU_ENDIAN_BE8 > > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction > > ?#endif > > + ? ? ? @ > > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception > > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) > > Is r2 here always the PC value following instruction? If the Thumb > instruction was 32-bit, it just points in the middle of the faulting > instruction. Is the T bit ever zero in this case? The code here is: tst r3, #PSR_T_BIT subeq r4, r2, #4 1: ldreqt r0, [r4] reveq r0, r0 beq call_fpe So, if !T, then we subtract 4 and load the instruction (which was the faulting instruction). So r2 is the following instruction. Ah, maybe you're getting confused by the comment. Should we put an 'eq' suffix on the end of each line? ;) > > > + ? ? ? @ r4 = PC value for the faulting instruction > > + ? ? ? @ > > ? ? ? ?beq ? ? call_fpe > > + > > ? ? ? ?@ Thumb instruction > > ?#if __LINUX_ARM_ARCH__ >= 7 > > + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 > > ?2: > > ?ARM( ?ldrht ? r5, [r4], #2 ? ?) > > ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?) > > @@ -492,18 +500,19 @@ __und_usr: > > ?3: ? ? ldrht ? r0, [r4] > > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4 > > ? ? ? ?orr ? ? r0, r0, r5, lsl #16 > > + ? ? ? @ > > + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception > > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) > > That's correct. > > > + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction > > I think r4 here points in the middle of tha faulting instruction for > 32-bit Thumb. You're right. > > > + ? ? ? @ > > ?#else > > ? ? ? ?b ? ? ? __und_usr_unknown > > ?#endif > > - UNWIND(.fnend ? ? ? ? ) > > + UNWIND(.fnend) > > ?ENDPROC(__und_usr) > > > > - ? ? ? @ > > - ? ? ? @ fallthrough to call_fpe > > - ? ? ? @ > > - > > ?/* > > - * The out of line fixup for the ldrt above. > > + * The out of line fixup for the ldrt instructions above. > > ?*/ > > ? ? ? ?.pushsection .fixup, "ax" > > ?4: ? ? mov ? ? pc, r9 > > @@ -534,11 +543,12 @@ ENDPROC(__und_usr) > > ?* NEON handler code. > > ?* > > ?* Emulators may wish to make use of the following registers: > > - * ?r0 ?= instruction opcode. > > - * ?r2 ?= PC+4 > > + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > > + * ?r2 ?= PC value to resume execution after successful emulation > > ?* ?r9 ?= normal "successful" return address > > - * ?r10 = this threads thread_info structure. > > + * ?r10 = this threads thread_info structure > > ?* ?lr ?= unrecognised instruction return address > > + * IRQs disabled, FIQs enabled. > > ?*/ > > ? ? ? ?@ > > ? ? ? ?@ Fall-through from Thumb-2 __und_usr > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > > index ee57640..eeb9250 100644 > > --- a/arch/arm/kernel/traps.c > > +++ b/arch/arm/kernel/traps.c > > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) > > ? ? ? ?void __user *pc; > > > > ? ? ? ?/* > > - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead, > > - ? ? ? ?* depending whether we're in Thumb mode or not. > > - ? ? ? ?* Correct this offset. > > + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead > > + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that > > + ? ? ? ?* regs->ARM_pc points at the faulting instruction. > > ? ? ? ? */ > > ? ? ? ?regs->ARM_pc -= correction; > > > > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S > > index 4fa9903..2bf6089 100644 > > --- a/arch/arm/vfp/entry.S > > +++ b/arch/arm/vfp/entry.S > > @@ -19,6 +19,14 @@ > > ?#include <asm/vfpmacros.h> > > ?#include "../kernel/entry-header.S" > > > > +@ VFP entry point. > > +@ > > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > > +@ ?r2 ?= PC value to resume execution after successful emulation > > +@ ?r9 ?= normal "successful" return address > > +@ ?r10 = this threads thread_info structure > > +@ ?lr ?= unrecognised instruction return address > > +@ > > ?ENTRY(do_vfp) > > ?#ifdef CONFIG_PREEMPT > > ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count > > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > > index 9897dcf..7292921 100644 > > --- a/arch/arm/vfp/vfphw.S > > +++ b/arch/arm/vfp/vfphw.S > > @@ -61,13 +61,13 @@ > > > > ?@ VFP hardware support entry point. > > ?@ > > -@ ?r0 ?= faulted instruction > > -@ ?r2 ?= faulted PC+4 > > -@ ?r9 ?= successful return > > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb) > > +@ ?r2 ?= PC value to resume execution after successful emulation > > That's right. > > > +@ ?r9 ?= normal "successful" return address > > ?@ ?r10 = vfp_state union > > ?@ ?r11 = CPU number > > -@ ?lr ?= failure return > > - > > +@ ?lr ?= unrecognised instruction return address > > +@ ?IRQs enabled. > > ?ENTRY(vfp_support_entry) > > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > > > > @@ -138,9 +138,12 @@ check_for_exception: > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff > > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last > > - ? ? ? sub ? ? r2, r2, #4 > > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction > > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last > > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb > > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions, > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following > > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address. > > I would say it's always a 32-bit instruction but made up of two 16-bit > values to allow half-word alignment. Do you have a suggested replacement text? ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-15 15:40 ` Russell King - ARM Linux @ 2011-01-16 11:49 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 11:49 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Catalin Marinas, Colin Cross, linux-arm-kernel, linux-kernel On Saturday, 15 January 2011, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: >> On 14 January 2011 17:30, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> > index 2b46fea..5876eec 100644 >> > --- a/arch/arm/kernel/entry-armv.S >> > +++ b/arch/arm/kernel/entry-armv.S >> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >> > .align 5 >> > __und_usr: >> > usr_entry >> > - >> > - @ >> > - @ fall through to the emulation code, which returns using r9 if >> > - @ it has emulated the instruction, or the more conventional lr >> > - @ if we are to treat this as a real undefined instruction >> > @ >> > - @ r0 - instruction >> > + @ The emulation code returns using r9 if it has emulated the >> > + @ instruction, or the more conventional lr if we are to treat >> > + @ this as a real undefined instruction >> > @ >> > adr r9, BSYM(ret_from_exception) >> > adr lr, BSYM(__und_usr_unknown) >> > + @ >> > + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >> > + @ faulting instruction depending on Thumb mode. >> > + @ r3 = regs->ARM_cpsr >> > + @ >> > tst r3, #PSR_T_BIT @ Thumb mode? >> > - itet eq @ explicit IT needed for the 1f label >> > + itttt eq @ explicit IT needed for the 1f label >> > subeq r4, r2, #4 @ ARM instr at LR - 4 >> > - subne r4, r2, #2 @ Thumb instr at LR - 2 >> > 1: ldreqt r0, [r4] >> >> The itttt above should just be itt. The reveq is conditionally >> compiled and beq doesn't necessarily need one. > > It's a reveq, so I thought we should cover all the instructions with > an 'eq' conditional for thumb. If the it instruction doesn't cover all instructions, gas generates some more its. But in this case, for little endian, the it instruction covers more since reveq isn't included and having the beq not last in the block I think is unpredictable. If you really want to optimise the big endian case not to have an additional it generated by gas, you can write ittt so that beq is included with little endian but not with big endian. I wouldn't bother much for an extra it anyway. >> > #ifdef CONFIG_CPU_ENDIAN_BE8 >> > reveq r0, r0 @ little endian instruction >> > #endif >> > + @ >> > + @ r0 = 32-bit ARM instruction which caused the exception >> > + @ r2 = PC value for the following instruction (:= regs->ARM_pc) >> >> Is r2 here always the PC value following instruction? If the Thumb >> instruction was 32-bit, it just points in the middle of the faulting >> instruction. > > Is the T bit ever zero in this case? The code here is: > > tst r3, #PSR_T_BIT > subeq r4, r2, #4 > 1: ldreqt r0, [r4] > reveq r0, r0 > beq call_fpe You can have the T bit set but the instruction a 32-bit Thumb in which case r2 is in the middle of such instruction rather than the next. Unless you only refer to the ARM mode, in which case the comment is fine. > So, if !T, then we subtract 4 and load the instruction (which was the > faulting instruction). So r2 is the following instruction. > > Ah, maybe you're getting confused by the comment. Should we put > an 'eq' suffix on the end of each line? ;) Maybe mention that this is ARM. I think documenting this code is difficult anyway. I found myself not reading the comments at all when revisiting this code :) but they may be useful for others. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-16 11:49 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 11:49 UTC (permalink / raw) To: linux-arm-kernel On Saturday, 15 January 2011, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: >> On 14 January 2011 17:30, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S >> > index 2b46fea..5876eec 100644 >> > --- a/arch/arm/kernel/entry-armv.S >> > +++ b/arch/arm/kernel/entry-armv.S >> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) >> > ? ? ? ?.align ?5 >> > ?__und_usr: >> > ? ? ? ?usr_entry >> > - >> > - ? ? ? @ >> > - ? ? ? @ fall through to the emulation code, which returns using r9 if >> > - ? ? ? @ it has emulated the instruction, or the more conventional lr >> > - ? ? ? @ if we are to treat this as a real undefined instruction >> > ? ? ? ?@ >> > - ? ? ? @ ?r0 - instruction >> > + ? ? ? @ The emulation code returns using r9 if it has emulated the >> > + ? ? ? @ instruction, or the more conventional lr if we are to treat >> > + ? ? ? @ this as a real undefined instruction >> > ? ? ? ?@ >> > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception) >> > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown) >> > + ? ? ? @ >> > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the >> > + ? ? ? @ faulting instruction depending on Thumb mode. >> > + ? ? ? @ r3 = regs->ARM_cpsr >> > + ? ? ? @ >> > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode? >> > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >> > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label >> > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4 >> > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2 >> > ?1: ? ? ldreqt ?r0, [r4] >> >> The itttt above should just be itt. The reveq is conditionally >> compiled and beq doesn't necessarily need one. > > It's a reveq, so I thought we should cover all the instructions with > an 'eq' conditional for thumb. If the it instruction doesn't cover all instructions, gas generates some more its. But in this case, for little endian, the it instruction covers more since reveq isn't included and having the beq not last in the block I think is unpredictable. If you really want to optimise the big endian case not to have an additional it generated by gas, you can write ittt so that beq is included with little endian but not with big endian. I wouldn't bother much for an extra it anyway. >> > ?#ifdef CONFIG_CPU_ENDIAN_BE8 >> > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction >> > ?#endif >> > + ? ? ? @ >> > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception >> > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc) >> >> Is r2 here always the PC value following instruction? If the Thumb >> instruction was 32-bit, it just points in the middle of the faulting >> instruction. > > Is the T bit ever zero in this case? ?The code here is: > > ?? ? ? ?tst ? ? r3, #PSR_T_BIT > ?? ? ? ?subeq ? r4, r2, #4 > 1: ? ? ?ldreqt ?r0, [r4] > ?? ? ? ?reveq ? r0, r0 > ?? ? ? ?beq ? ? call_fpe You can have the T bit set but the instruction a 32-bit Thumb in which case r2 is in the middle of such instruction rather than the next. Unless you only refer to the ARM mode, in which case the comment is fine. > So, if !T, then we subtract 4 and load the instruction (which was the > faulting instruction). ?So r2 is the following instruction. > > Ah, maybe you're getting confused by the comment. ?Should we put > an 'eq' suffix on the end of each line? ;) Maybe mention that this is ARM. I think documenting this code is difficult anyway. I found myself not reading the comments at all when revisiting this code :) but they may be useful for others. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-16 11:49 ` Catalin Marinas @ 2011-01-23 15:51 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-23 15:51 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote: > On Saturday, 15 January 2011, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > It's a reveq, so I thought we should cover all the instructions with > > an 'eq' conditional for thumb. > > If the it instruction doesn't cover all instructions, gas generates > some more its. But in this case, for little endian, the it instruction > covers more since reveq isn't included and having the beq not last in > the block I think is unpredictable. If you really want to optimise the > big endian case not to have an additional it generated by gas, you can > write ittt so that beq is included with little endian but not with big > endian. I wouldn't bother much for an extra it anyway. I think the itttt is correct. Unless you wish to illustrate why you think it's wrong by pasting the code and showing why you think the beq isn't the last instruction... > > tst r3, #PSR_T_BIT > > subeq r4, r2, #4 > > 1: ldreqt r0, [r4] > > reveq r0, r0 > > beq call_fpe > > You can have the T bit set but the instruction a 32-bit Thumb in which > case r2 is in the middle of such instruction rather than the next. > Unless you only refer to the ARM mode, in which case the comment is > fine. So? I'm confused why you're making a mountain out of apparantly nothing. ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-23 15:51 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-23 15:51 UTC (permalink / raw) To: linux-arm-kernel On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote: > On Saturday, 15 January 2011, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > It's a reveq, so I thought we should cover all the instructions with > > an 'eq' conditional for thumb. > > If the it instruction doesn't cover all instructions, gas generates > some more its. But in this case, for little endian, the it instruction > covers more since reveq isn't included and having the beq not last in > the block I think is unpredictable. If you really want to optimise the > big endian case not to have an additional it generated by gas, you can > write ittt so that beq is included with little endian but not with big > endian. I wouldn't bother much for an extra it anyway. I think the itttt is correct. Unless you wish to illustrate why you think it's wrong by pasting the code and showing why you think the beq isn't the last instruction... > > ?? ? ? ?tst ? ? r3, #PSR_T_BIT > > ?? ? ? ?subeq ? r4, r2, #4 > > 1: ? ? ?ldreqt ?r0, [r4] > > ?? ? ? ?reveq ? r0, r0 > > ?? ? ? ?beq ? ? call_fpe > > You can have the T bit set but the instruction a 32-bit Thumb in which > case r2 is in the middle of such instruction rather than the next. > Unless you only refer to the ARM mode, in which case the comment is > fine. So? I'm confused why you're making a mountain out of apparantly nothing. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-23 15:51 ` Russell King - ARM Linux @ 2011-01-25 13:19 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-25 13:19 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Sun, 2011-01-23 at 15:51 +0000, Russell King - ARM Linux wrote: > On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote: > > On Saturday, 15 January 2011, Russell King - ARM Linux > > <linux@arm.linux.org.uk> wrote: > > > It's a reveq, so I thought we should cover all the instructions with > > > an 'eq' conditional for thumb. > > > > If the it instruction doesn't cover all instructions, gas generates > > some more its. But in this case, for little endian, the it instruction > > covers more since reveq isn't included and having the beq not last in > > the block I think is unpredictable. If you really want to optimise the > > big endian case not to have an additional it generated by gas, you can > > write ittt so that beq is included with little endian but not with big > > endian. I wouldn't bother much for an extra it anyway. > > I think the itttt is correct. Unless you wish to illustrate why you > think it's wrong by pasting the code and showing why you think the > beq isn't the last instruction... With your patch applied (visually), the code becomes (removed the comment before beq): tst r3, #PSR_T_BIT @ Thumb mode? itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif beq call_fpe The little endian case only has 3 conditional instructions: subeq r4, r2, #4 @ ARM instr at LR - 4 1: ldreqt r0, [r4] beq call_fpe but you add itttt (if-then-then-then-then) which expects 4 conditional instructions, IOW beq is no longer the last. So cutting a 't' would sort it out (unless I misread your patch). > > > tst r3, #PSR_T_BIT > > > subeq r4, r2, #4 > > > 1: ldreqt r0, [r4] > > > reveq r0, r0 > > > beq call_fpe > > > > You can have the T bit set but the instruction a 32-bit Thumb in which > > case r2 is in the middle of such instruction rather than the next. > > Unless you only refer to the ARM mode, in which case the comment is > > fine. > > So? I'm confused why you're making a mountain out of apparantly > nothing. No issue really, the comment can stay as you wrote it (I don't read them anyway :)). -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-25 13:19 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-25 13:19 UTC (permalink / raw) To: linux-arm-kernel On Sun, 2011-01-23 at 15:51 +0000, Russell King - ARM Linux wrote: > On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote: > > On Saturday, 15 January 2011, Russell King - ARM Linux > > <linux@arm.linux.org.uk> wrote: > > > It's a reveq, so I thought we should cover all the instructions with > > > an 'eq' conditional for thumb. > > > > If the it instruction doesn't cover all instructions, gas generates > > some more its. But in this case, for little endian, the it instruction > > covers more since reveq isn't included and having the beq not last in > > the block I think is unpredictable. If you really want to optimise the > > big endian case not to have an additional it generated by gas, you can > > write ittt so that beq is included with little endian but not with big > > endian. I wouldn't bother much for an extra it anyway. > > I think the itttt is correct. Unless you wish to illustrate why you > think it's wrong by pasting the code and showing why you think the > beq isn't the last instruction... With your patch applied (visually), the code becomes (removed the comment before beq): tst r3, #PSR_T_BIT @ Thumb mode? itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif beq call_fpe The little endian case only has 3 conditional instructions: subeq r4, r2, #4 @ ARM instr at LR - 4 1: ldreqt r0, [r4] beq call_fpe but you add itttt (if-then-then-then-then) which expects 4 conditional instructions, IOW beq is no longer the last. So cutting a 't' would sort it out (unless I misread your patch). > > > tst r3, #PSR_T_BIT > > > subeq r4, r2, #4 > > > 1: ldreqt r0, [r4] > > > reveq r0, r0 > > > beq call_fpe > > > > You can have the T bit set but the instruction a 32-bit Thumb in which > > case r2 is in the middle of such instruction rather than the next. > > Unless you only refer to the ARM mode, in which case the comment is > > fine. > > So? I'm confused why you're making a mountain out of apparantly > nothing. No issue really, the comment can stay as you wrote it (I don't read them anyway :)). -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-15 15:40 ` Russell King - ARM Linux @ 2011-01-16 21:25 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 21:25 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On 15 January 2011 15:40, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: >> On 14 January 2011 17:30, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > +@ r9 = normal "successful" return address >> > @ r10 = vfp_state union >> > @ r11 = CPU number >> > -@ lr = failure return >> > - >> > +@ lr = unrecognised instruction return address >> > +@ IRQs enabled. >> > ENTRY(vfp_support_entry) >> > DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 >> > >> > @@ -138,9 +138,12 @@ check_for_exception: >> > @ exception before retrying branch >> > @ out before setting an FPEXC that >> > @ stops us reading stuff >> > - VFPFMXR FPEXC, r1 @ restore FPEXC last >> > - sub r2, r2, #4 >> > - str r2, [sp, #S_PC] @ retry the instruction >> > + VFPFMXR FPEXC, r1 @ Restore FPEXC last >> > + sub r2, r2, #4 @ Retry current instruction - if Thumb >> > + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, >> > + @ else it's one 32-bit instruction, so >> > + @ always subtract 4 from the following >> > + @ instruction address. >> >> I would say it's always a 32-bit instruction but made up of two 16-bit >> values to allow half-word alignment. > > Do you have a suggested replacement text? Maybe something like: Retry the current VFP instruction (32-bit in both ARM and Thumb modes). (I was wondering whether we can get on the above code path with asynchronous VFP exceptions where the interrupted instruction may not be the VFP one. But I think all Thumb-2 processors these days generate synchronous exceptions) -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-16 21:25 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-16 21:25 UTC (permalink / raw) To: linux-arm-kernel On 15 January 2011 15:40, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: >> On 14 January 2011 17:30, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >> > +@ ?r9 ?= normal "successful" return address >> > ?@ ?r10 = vfp_state union >> > ?@ ?r11 = CPU number >> > -@ ?lr ?= failure return >> > - >> > +@ ?lr ?= unrecognised instruction return address >> > +@ ?IRQs enabled. >> > ?ENTRY(vfp_support_entry) >> > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 >> > >> > @@ -138,9 +138,12 @@ check_for_exception: >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff >> > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last >> > - ? ? ? sub ? ? r2, r2, #4 >> > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction >> > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last >> > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb >> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions, >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address. >> >> I would say it's always a 32-bit instruction but made up of two 16-bit >> values to allow half-word alignment. > > Do you have a suggested replacement text? Maybe something like: Retry the current VFP instruction (32-bit in both ARM and Thumb modes). (I was wondering whether we can get on the above code path with asynchronous VFP exceptions where the interrupted instruction may not be the VFP one. But I think all Thumb-2 processors these days generate synchronous exceptions) -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-16 21:25 ` Catalin Marinas @ 2011-01-23 15:46 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-23 15:46 UTC (permalink / raw) To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Sun, Jan 16, 2011 at 09:25:00PM +0000, Catalin Marinas wrote: > On 15 January 2011 15:40, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: > >> On 14 January 2011 17:30, Russell King - ARM Linux > >> <linux@arm.linux.org.uk> wrote: > >> > +@ r9 = normal "successful" return address > >> > @ r10 = vfp_state union > >> > @ r11 = CPU number > >> > -@ lr = failure return > >> > - > >> > +@ lr = unrecognised instruction return address > >> > +@ IRQs enabled. > >> > ENTRY(vfp_support_entry) > >> > DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > >> > > >> > @@ -138,9 +138,12 @@ check_for_exception: > >> > @ exception before retrying branch > >> > @ out before setting an FPEXC that > >> > @ stops us reading stuff > >> > - VFPFMXR FPEXC, r1 @ restore FPEXC last > >> > - sub r2, r2, #4 > >> > - str r2, [sp, #S_PC] @ retry the instruction > >> > + VFPFMXR FPEXC, r1 @ Restore FPEXC last > >> > + sub r2, r2, #4 @ Retry current instruction - if Thumb > >> > + str r2, [sp, #S_PC] @ mode it's two 16-bit instructions, > >> > + @ else it's one 32-bit instruction, so > >> > + @ always subtract 4 from the following > >> > + @ instruction address. > >> > >> I would say it's always a 32-bit instruction but made up of two 16-bit > >> values to allow half-word alignment. > > > > Do you have a suggested replacement text? > > Maybe something like: Retry the current VFP instruction (32-bit in > both ARM and Thumb modes). > > (I was wondering whether we can get on the above code path with > asynchronous VFP exceptions where the interrupted instruction may not > be the VFP one. But I think all Thumb-2 processors these days generate > synchronous exceptions) I don't think so from my understanding. Firstly, in order to raise an undefined instruction fault, the coprocessor has to be targetted with an instruction for it. So the faulting instruction here must always be for the VFP coprocessor. Secondly, in order to get an asynchronous exception, the VFP hardware has to be enabled. We only retry the instruction if the VFP hardware wasn't enabled. So, here's the revised patch. Ack? diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the second 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,15 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ IRQs disabled. +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,9 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction, 32-bit + str r2, [sp, #S_PC] @ in both ARM and Thumb modes. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-23 15:46 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-23 15:46 UTC (permalink / raw) To: linux-arm-kernel On Sun, Jan 16, 2011 at 09:25:00PM +0000, Catalin Marinas wrote: > On 15 January 2011 15:40, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote: > >> On 14 January 2011 17:30, Russell King - ARM Linux > >> <linux@arm.linux.org.uk> wrote: > >> > +@ ?r9 ?= normal "successful" return address > >> > ?@ ?r10 = vfp_state union > >> > ?@ ?r11 = CPU number > >> > -@ ?lr ?= failure return > >> > - > >> > +@ ?lr ?= unrecognised instruction return address > >> > +@ ?IRQs enabled. > >> > ?ENTRY(vfp_support_entry) > >> > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 > >> > > >> > @@ -138,9 +138,12 @@ check_for_exception: > >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch > >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that > >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff > >> > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last > >> > - ? ? ? sub ? ? r2, r2, #4 > >> > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction > >> > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last > >> > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb > >> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions, > >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so > >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following > >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address. > >> > >> I would say it's always a 32-bit instruction but made up of two 16-bit > >> values to allow half-word alignment. > > > > Do you have a suggested replacement text? > > Maybe something like: Retry the current VFP instruction (32-bit in > both ARM and Thumb modes). > > (I was wondering whether we can get on the above code path with > asynchronous VFP exceptions where the interrupted instruction may not > be the VFP one. But I think all Thumb-2 processors these days generate > synchronous exceptions) I don't think so from my understanding. Firstly, in order to raise an undefined instruction fault, the coprocessor has to be targetted with an instruction for it. So the faulting instruction here must always be for the VFP coprocessor. Secondly, in order to get an asynchronous exception, the VFP hardware has to be enabled. We only retry the instruction if the VFP hardware wasn't enabled. So, here's the revised patch. Ack? diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 2b46fea..5876eec 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -461,27 +461,35 @@ ENDPROC(__irq_usr) .align 5 __und_usr: usr_entry - - @ - @ fall through to the emulation code, which returns using r9 if - @ it has emulated the instruction, or the more conventional lr - @ if we are to treat this as a real undefined instruction @ - @ r0 - instruction + @ The emulation code returns using r9 if it has emulated the + @ instruction, or the more conventional lr if we are to treat + @ this as a real undefined instruction @ adr r9, BSYM(ret_from_exception) adr lr, BSYM(__und_usr_unknown) + @ + @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the + @ faulting instruction depending on Thumb mode. + @ r3 = regs->ARM_cpsr + @ tst r3, #PSR_T_BIT @ Thumb mode? - itet eq @ explicit IT needed for the 1f label + itttt eq @ explicit IT needed for the 1f label subeq r4, r2, #4 @ ARM instr at LR - 4 - subne r4, r2, #2 @ Thumb instr at LR - 2 1: ldreqt r0, [r4] #ifdef CONFIG_CPU_ENDIAN_BE8 reveq r0, r0 @ little endian instruction #endif + @ + @ r0 = 32-bit ARM instruction which caused the exception + @ r2 = PC value for the following instruction (:= regs->ARM_pc) + @ r4 = PC value for the faulting instruction + @ beq call_fpe + @ Thumb instruction #if __LINUX_ARM_ARCH__ >= 7 + sub r4, r2, #2 @ Thumb instr at LR - 2 2: ARM( ldrht r5, [r4], #2 ) THUMB( ldrht r5, [r4] ) @@ -492,18 +500,19 @@ __und_usr: 3: ldrht r0, [r4] add r2, r2, #2 @ r2 is PC + 2, make it PC + 4 orr r0, r0, r5, lsl #16 + @ + @ r0 = the two 16-bit Thumb instructions which caused the exception + @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2) + @ r4 = PC value for the second 16-bit Thumb instruction + @ #else b __und_usr_unknown #endif - UNWIND(.fnend ) + UNWIND(.fnend) ENDPROC(__und_usr) - @ - @ fallthrough to call_fpe - @ - /* - * The out of line fixup for the ldrt above. + * The out of line fixup for the ldrt instructions above. */ .pushsection .fixup, "ax" 4: mov pc, r9 @@ -534,11 +543,12 @@ ENDPROC(__und_usr) * NEON handler code. * * Emulators may wish to make use of the following registers: - * r0 = instruction opcode. - * r2 = PC+4 + * r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) + * r2 = PC value to resume execution after successful emulation * r9 = normal "successful" return address - * r10 = this threads thread_info structure. + * r10 = this threads thread_info structure * lr = unrecognised instruction return address + * IRQs disabled, FIQs enabled. */ @ @ Fall-through from Thumb-2 __und_usr diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index ee57640..eeb9250 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) void __user *pc; /* - * According to the ARM ARM, PC is 2 or 4 bytes ahead, - * depending whether we're in Thumb mode or not. - * Correct this offset. + * According to the ARM ARM, the PC is 2 or 4 bytes ahead + * depending on Thumb mode. Correct this offset so that + * regs->ARM_pc points at the faulting instruction. */ regs->ARM_pc -= correction; diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 4fa9903..2bf6089 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -19,6 +19,15 @@ #include <asm/vfpmacros.h> #include "../kernel/entry-header.S" +@ VFP entry point. +@ +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address +@ r10 = this threads thread_info structure +@ lr = unrecognised instruction return address +@ IRQs disabled. +@ ENTRY(do_vfp) #ifdef CONFIG_PREEMPT ldr r4, [r10, #TI_PREEMPT] @ get preempt count diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 9897dcf..7292921 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -61,13 +61,13 @@ @ VFP hardware support entry point. @ -@ r0 = faulted instruction -@ r2 = faulted PC+4 -@ r9 = successful return +@ r0 = instruction opcode (32-bit ARM or two 16-bit Thumb) +@ r2 = PC value to resume execution after successful emulation +@ r9 = normal "successful" return address @ r10 = vfp_state union @ r11 = CPU number -@ lr = failure return - +@ lr = unrecognised instruction return address +@ IRQs enabled. ENTRY(vfp_support_entry) DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10 @@ -138,9 +138,9 @@ check_for_exception: @ exception before retrying branch @ out before setting an FPEXC that @ stops us reading stuff - VFPFMXR FPEXC, r1 @ restore FPEXC last - sub r2, r2, #4 - str r2, [sp, #S_PC] @ retry the instruction + VFPFMXR FPEXC, r1 @ Restore FPEXC last + sub r2, r2, #4 @ Retry current instruction, 32-bit + str r2, [sp, #S_PC] @ in both ARM and Thumb modes. #ifdef CONFIG_PREEMPT get_thread_info r10 ldr r4, [r10, #TI_PREEMPT] @ get preempt count ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-23 15:46 ` Russell King - ARM Linux @ 2011-01-25 13:45 ` Catalin Marinas -1 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-25 13:45 UTC (permalink / raw) To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel On Sun, 2011-01-23 at 15:46 +0000, Russell King - ARM Linux wrote: > So, here's the revised patch. Ack? Apart from the itttt instruction, it looks fine. So once we agree on that, you can add my Ack. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-25 13:45 ` Catalin Marinas 0 siblings, 0 replies; 64+ messages in thread From: Catalin Marinas @ 2011-01-25 13:45 UTC (permalink / raw) To: linux-arm-kernel On Sun, 2011-01-23 at 15:46 +0000, Russell King - ARM Linux wrote: > So, here's the revised patch. Ack? Apart from the itttt instruction, it looks fine. So once we agree on that, you can add my Ack. -- Catalin ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 14:10 ` Catalin Marinas @ 2011-01-14 16:24 ` Dave Martin -1 siblings, 0 replies; 64+ messages in thread From: Dave Martin @ 2011-01-14 16:24 UTC (permalink / raw) To: Catalin Marinas Cc: Russell King - ARM Linux, linux-kernel, linux-arm-kernel, Colin Cross Hi, On Fri, Jan 14, 2011 at 8:10 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: >> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: >> > > pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); >> > > >> > > /* >> > > + * If the exception occured in thumb mode, pc is exception location + 2, >> > > + * the middle of the 32-bit VFP instruction. Add 2 to get exception >> > > + * location + 4, the same we get in ARM mode. >> > > + */ >> > > +#ifdef CONFIG_ARM_THUMB >> > > + if (regs->ARM_cpsr & PSR_T_BIT) >> > > + regs->ARM_pc += 2; >> > > +#endif >> > >> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. >> >> I don't think this is correct. On entry to the undefined instruction >> handler, we get the uncorrected PC value, so PC points to the >> instruction after the faulting instruction. >> >> If it was an ARM instruction, that is located at PC-4. If it was a >> Thumb instruction, it is located at PC-2. This PC value is passed >> unmodified to the VFP entry code, and the passed r2 reflect the >> value in regs->ARM_pc. > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > Thumb instruction, so it is no longer the same as the ARM_pc. > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > patch made sense to me. Is the comment preceding __und_usr_unknown causing some confusion here? /* * The FP module is called with these registers set: * r0 = instruction * r2 = PC+4 ... That reflects the ARM case only: for Thumb, r2 is always PC+2 (?) The comment at the start of do_undefinstr() (which receives these registers) is correct though: /* * According to the ARM ARM, PC is 2 or 4 bytes ahead, * depending whether we're in Thumb mode or not. * Correct this offset. Cheers ---Dave ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 16:24 ` Dave Martin 0 siblings, 0 replies; 64+ messages in thread From: Dave Martin @ 2011-01-14 16:24 UTC (permalink / raw) To: linux-arm-kernel Hi, On Fri, Jan 14, 2011 at 8:10 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote: >> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote: >> > > ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc); >> > > >> > > ? ? ? ?/* >> > > + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2, >> > > + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception >> > > + ? ? ? ?* location + 4, the same we get in ARM mode. >> > > + ? ? ? ?*/ >> > > +#ifdef CONFIG_ARM_THUMB >> > > + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT) >> > > + ? ? ? ? ? ? ? regs->ARM_pc += 2; >> > > +#endif >> > >> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely. >> >> I don't think this is correct. ?On entry to the undefined instruction >> handler, we get the uncorrected PC value, so PC points to the >> instruction after the faulting instruction. >> >> If it was an ARM instruction, that is located at PC-4. ?If it was a >> Thumb instruction, it is located at PC-2. ?This PC value is passed >> unmodified to the VFP entry code, and the passed r2 reflect the >> value in regs->ARM_pc. > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit > Thumb instruction, so it is no longer the same as the ARM_pc. > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's > patch made sense to me. Is the comment preceding __und_usr_unknown causing some confusion here? /* * The FP module is called with these registers set: * r0 = instruction * r2 = PC+4 ... That reflects the ARM case only: for Thumb, r2 is always PC+2 (?) The comment at the start of do_undefinstr() (which receives these registers) is correct though: /* * According to the ARM ARM, PC is 2 or 4 bytes ahead, * depending whether we're in Thumb mode or not. * Correct this offset. Cheers ---Dave ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode 2011-01-14 16:24 ` Dave Martin @ 2011-01-14 16:52 ` Russell King - ARM Linux -1 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 16:52 UTC (permalink / raw) To: Dave Martin; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, Colin Cross On Fri, Jan 14, 2011 at 10:24:52AM -0600, Dave Martin wrote: > Is the comment preceding __und_usr_unknown causing some confusion here? > > /* > * The FP module is called with these registers set: > * r0 = instruction > * r2 = PC+4 > ... > > That reflects the ARM case only: for Thumb, r2 is always PC+2 (?) Actually, referring to 'PC' here is confusing (and yes, I probably wrote it) - does 'PC' refer to the address of the faulting instruction or the current PC value... Your '(?)' there is exactly the problem I'm referring to - I don't think there's much of a clear idea really what's going on here... > The comment at the start of do_undefinstr() (which receives these > registers) is correct though: > > /* > * According to the ARM ARM, PC is 2 or 4 bytes ahead, > * depending whether we're in Thumb mode or not. > * Correct this offset. The ARM ARM says that in order to return to the instruction which generated the exception, subtract 2 bytes for thumb or 4 bytes for ARM. So, in order to point at the instruction which generated the exception, we have to subtract this value from the PC value we were passed. I suggest changing this comment to: /* * According to the ARM ARM, the PC is 2 or 4 bytes ahead * depending on Thumb mode. Correct this offset so that * regs->ARM_pc points at the faulting instruction. */ ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode @ 2011-01-14 16:52 ` Russell King - ARM Linux 0 siblings, 0 replies; 64+ messages in thread From: Russell King - ARM Linux @ 2011-01-14 16:52 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 14, 2011 at 10:24:52AM -0600, Dave Martin wrote: > Is the comment preceding __und_usr_unknown causing some confusion here? > > /* > * The FP module is called with these registers set: > * r0 = instruction > * r2 = PC+4 > ... > > That reflects the ARM case only: for Thumb, r2 is always PC+2 (?) Actually, referring to 'PC' here is confusing (and yes, I probably wrote it) - does 'PC' refer to the address of the faulting instruction or the current PC value... Your '(?)' there is exactly the problem I'm referring to - I don't think there's much of a clear idea really what's going on here... > The comment at the start of do_undefinstr() (which receives these > registers) is correct though: > > /* > * According to the ARM ARM, PC is 2 or 4 bytes ahead, > * depending whether we're in Thumb mode or not. > * Correct this offset. The ARM ARM says that in order to return to the instruction which generated the exception, subtract 2 bytes for thumb or 4 bytes for ARM. So, in order to point at the instruction which generated the exception, we have to subtract this value from the PC value we were passed. I suggest changing this comment to: /* * According to the ARM ARM, the PC is 2 or 4 bytes ahead * depending on Thumb mode. Correct this offset so that * regs->ARM_pc points at the faulting instruction. */ ^ permalink raw reply [flat|nested] 64+ messages in thread
end of thread, other threads:[~2011-02-09 18:12 UTC | newest] Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-01-14 7:42 [PATCH] ARM: vfp: Fix up exception location in Thumb mode Colin Cross 2011-01-14 7:42 ` Colin Cross 2011-01-14 11:43 ` Catalin Marinas 2011-01-14 11:43 ` Catalin Marinas 2011-01-14 12:02 ` Russell King - ARM Linux 2011-01-14 12:02 ` Russell King - ARM Linux 2011-01-14 14:10 ` Catalin Marinas 2011-01-14 14:10 ` Catalin Marinas 2011-01-14 15:49 ` Russell King - ARM Linux 2011-01-14 15:49 ` Russell King - ARM Linux 2011-01-14 16:23 ` Catalin Marinas 2011-01-14 16:23 ` Catalin Marinas 2011-01-14 16:35 ` Russell King - ARM Linux 2011-01-14 16:35 ` Russell King - ARM Linux 2011-01-14 16:58 ` Catalin Marinas 2011-01-14 16:58 ` Catalin Marinas 2011-01-14 17:30 ` Russell King - ARM Linux 2011-01-14 17:30 ` Russell King - ARM Linux 2011-01-14 18:47 ` Russell King - ARM Linux 2011-01-14 18:47 ` Russell King - ARM Linux 2011-01-14 19:23 ` Colin Cross 2011-01-14 19:23 ` Colin Cross 2011-01-14 19:51 ` Colin Cross 2011-01-14 19:51 ` Colin Cross 2011-01-14 21:24 ` Russell King - ARM Linux 2011-01-14 21:24 ` Russell King - ARM Linux 2011-01-25 23:33 ` Colin Cross 2011-01-25 23:33 ` Colin Cross 2011-01-26 11:26 ` Russell King - ARM Linux 2011-01-26 11:26 ` Russell King - ARM Linux 2011-01-27 6:11 ` Colin Cross 2011-01-27 6:11 ` Colin Cross 2011-01-27 6:35 ` Colin Cross 2011-01-27 6:35 ` Colin Cross 2011-01-27 7:30 ` Colin Cross 2011-01-27 7:30 ` Colin Cross 2011-02-09 18:12 ` Colin Cross 2011-02-09 18:12 ` Colin Cross 2011-01-15 15:38 ` Catalin Marinas 2011-01-15 15:38 ` Catalin Marinas 2011-01-15 15:43 ` Russell King - ARM Linux 2011-01-15 15:43 ` Russell King - ARM Linux 2011-01-16 11:51 ` Catalin Marinas 2011-01-16 11:51 ` Catalin Marinas 2011-01-15 15:31 ` Catalin Marinas 2011-01-15 15:31 ` Catalin Marinas 2011-01-15 15:40 ` Russell King - ARM Linux 2011-01-15 15:40 ` Russell King - ARM Linux 2011-01-16 11:49 ` Catalin Marinas 2011-01-16 11:49 ` Catalin Marinas 2011-01-23 15:51 ` Russell King - ARM Linux 2011-01-23 15:51 ` Russell King - ARM Linux 2011-01-25 13:19 ` Catalin Marinas 2011-01-25 13:19 ` Catalin Marinas 2011-01-16 21:25 ` Catalin Marinas 2011-01-16 21:25 ` Catalin Marinas 2011-01-23 15:46 ` Russell King - ARM Linux 2011-01-23 15:46 ` Russell King - ARM Linux 2011-01-25 13:45 ` Catalin Marinas 2011-01-25 13:45 ` Catalin Marinas 2011-01-14 16:24 ` Dave Martin 2011-01-14 16:24 ` Dave Martin 2011-01-14 16:52 ` Russell King - ARM Linux 2011-01-14 16:52 ` Russell King - ARM Linux
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.