All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14  7:42 ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14  7:42 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Catalin Marinas, Colin Cross, Russell King, linux-kernel

The exception handler in entry-armv.S checks for thumb mode and
correctly determines the exception location and instruction,
but VFP_bounce uses the uncorrected location off the stack.
If the VFP exception occured in Thumb mode, fix up the
exception location to match the value that would be returned
in ARM mode.

Fixes segfaults in userspace applications running in Thumb mode
caused by a handled VFP exception returning to the middle of the
instruction that triggered the exception.

Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba
Signed-off-by: Colin Cross <ccross@android.com>
---
 arch/arm/vfp/vfpmodule.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 0797cb5..63ed73d 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
 	pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
 
 	/*
+	 * If the exception occured in thumb mode, pc is exception location + 2,
+	 * the middle of the 32-bit VFP instruction.  Add 2 to get exception
+	 * location + 4, the same we get in ARM mode.
+	 */
+#ifdef CONFIG_ARM_THUMB
+	if (regs->ARM_cpsr & PSR_T_BIT)
+		regs->ARM_pc += 2;
+#endif
+
+	/*
 	 * At this point, FPEXC can have the following configuration:
 	 *
 	 *  EX DEX IXE
-- 
1.7.3.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14  7:42 ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14  7:42 UTC (permalink / raw)
  To: linux-arm-kernel

The exception handler in entry-armv.S checks for thumb mode and
correctly determines the exception location and instruction,
but VFP_bounce uses the uncorrected location off the stack.
If the VFP exception occured in Thumb mode, fix up the
exception location to match the value that would be returned
in ARM mode.

Fixes segfaults in userspace applications running in Thumb mode
caused by a handled VFP exception returning to the middle of the
instruction that triggered the exception.

Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba
Signed-off-by: Colin Cross <ccross@android.com>
---
 arch/arm/vfp/vfpmodule.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 0797cb5..63ed73d 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
 	pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
 
 	/*
+	 * If the exception occured in thumb mode, pc is exception location + 2,
+	 * the middle of the 32-bit VFP instruction.  Add 2 to get exception
+	 * location + 4, the same we get in ARM mode.
+	 */
+#ifdef CONFIG_ARM_THUMB
+	if (regs->ARM_cpsr & PSR_T_BIT)
+		regs->ARM_pc += 2;
+#endif
+
+	/*
 	 * At this point, FPEXC can have the following configuration:
 	 *
 	 *  EX DEX IXE
-- 
1.7.3.1

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14  7:42 ` Colin Cross
@ 2011-01-14 11:43   ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 11:43 UTC (permalink / raw)
  To: Colin Cross; +Cc: linux-arm-kernel, Russell King, linux-kernel

On 14 January 2011 07:42, Colin Cross <ccross@android.com> wrote:
> The exception handler in entry-armv.S checks for thumb mode and
> correctly determines the exception location and instruction,
> but VFP_bounce uses the uncorrected location off the stack.
> If the VFP exception occured in Thumb mode, fix up the
> exception location to match the value that would be returned
> in ARM mode.
>
> Fixes segfaults in userspace applications running in Thumb mode
> caused by a handled VFP exception returning to the middle of the
> instruction that triggered the exception.
>
> Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba
> Signed-off-by: Colin Cross <ccross@android.com>
> ---
>  arch/arm/vfp/vfpmodule.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 0797cb5..63ed73d 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
>        pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
>
>        /*
> +        * If the exception occured in thumb mode, pc is exception location + 2,
> +        * the middle of the 32-bit VFP instruction.  Add 2 to get exception
> +        * location + 4, the same we get in ARM mode.
> +        */
> +#ifdef CONFIG_ARM_THUMB
> +       if (regs->ARM_cpsr & PSR_T_BIT)
> +               regs->ARM_pc += 2;
> +#endif

You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 11:43   ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 11:43 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 January 2011 07:42, Colin Cross <ccross@android.com> wrote:
> The exception handler in entry-armv.S checks for thumb mode and
> correctly determines the exception location and instruction,
> but VFP_bounce uses the uncorrected location off the stack.
> If the VFP exception occured in Thumb mode, fix up the
> exception location to match the value that would be returned
> in ARM mode.
>
> Fixes segfaults in userspace applications running in Thumb mode
> caused by a handled VFP exception returning to the middle of the
> instruction that triggered the exception.
>
> Change-Id: I6c6ba1ab88e107bec166ea334d7e0974a4f6bfba
> Signed-off-by: Colin Cross <ccross@android.com>
> ---
> ?arch/arm/vfp/vfpmodule.c | ? 10 ++++++++++
> ?1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 0797cb5..63ed73d 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -275,6 +275,16 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
> ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
>
> ? ? ? ?/*
> + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2,
> + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception
> + ? ? ? ?* location + 4, the same we get in ARM mode.
> + ? ? ? ?*/
> +#ifdef CONFIG_ARM_THUMB
> + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT)
> + ? ? ? ? ? ? ? regs->ARM_pc += 2;
> +#endif

You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 11:43   ` Catalin Marinas
@ 2011-01-14 12:02     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 12:02 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
> >        pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
> >
> >        /*
> > +        * If the exception occured in thumb mode, pc is exception location + 2,
> > +        * the middle of the 32-bit VFP instruction.  Add 2 to get exception
> > +        * location + 4, the same we get in ARM mode.
> > +        */
> > +#ifdef CONFIG_ARM_THUMB
> > +       if (regs->ARM_cpsr & PSR_T_BIT)
> > +               regs->ARM_pc += 2;
> > +#endif
> 
> You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.

I don't think this is correct.  On entry to the undefined instruction
handler, we get the uncorrected PC value, so PC points to the
instruction after the faulting instruction.

If it was an ARM instruction, that is located at PC-4.  If it was a
Thumb instruction, it is located at PC-2.  This PC value is passed
unmodified to the VFP entry code, and the passed r2 reflect the
value in regs->ARM_pc.

The VFP entry assembly doesn't touch the PC value, except when it
wants to retry an instruction:

        sub     r2, r2, #4
        str     r2, [sp, #S_PC]         @ retry the instruction

So I think that 2 to the PC when in thumb mode is incorrect, as that'll
cause us to skip the instruction following the faulted one.

I think that the undefined instruction handling needs reworking for
Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
instruction, and we have no way of knowing without repeatedly
decoding that instruction.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 12:02     ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 12:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
> > ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
> >
> > ? ? ? ?/*
> > + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2,
> > + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception
> > + ? ? ? ?* location + 4, the same we get in ARM mode.
> > + ? ? ? ?*/
> > +#ifdef CONFIG_ARM_THUMB
> > + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT)
> > + ? ? ? ? ? ? ? regs->ARM_pc += 2;
> > +#endif
> 
> You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.

I don't think this is correct.  On entry to the undefined instruction
handler, we get the uncorrected PC value, so PC points to the
instruction after the faulting instruction.

If it was an ARM instruction, that is located at PC-4.  If it was a
Thumb instruction, it is located at PC-2.  This PC value is passed
unmodified to the VFP entry code, and the passed r2 reflect the
value in regs->ARM_pc.

The VFP entry assembly doesn't touch the PC value, except when it
wants to retry an instruction:

        sub     r2, r2, #4
        str     r2, [sp, #S_PC]         @ retry the instruction

So I think that 2 to the PC when in thumb mode is incorrect, as that'll
cause us to skip the instruction following the faulted one.

I think that the undefined instruction handling needs reworking for
Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
instruction, and we have no way of knowing without repeatedly
decoding that instruction.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 12:02     ` Russell King - ARM Linux
@ 2011-01-14 14:10       ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 14:10 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
> > >        pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
> > >
> > >        /*
> > > +        * If the exception occured in thumb mode, pc is exception location + 2,
> > > +        * the middle of the 32-bit VFP instruction.  Add 2 to get exception
> > > +        * location + 4, the same we get in ARM mode.
> > > +        */
> > > +#ifdef CONFIG_ARM_THUMB
> > > +       if (regs->ARM_cpsr & PSR_T_BIT)
> > > +               regs->ARM_pc += 2;
> > > +#endif
> >
> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.
> 
> I don't think this is correct.  On entry to the undefined instruction
> handler, we get the uncorrected PC value, so PC points to the
> instruction after the faulting instruction.
> 
> If it was an ARM instruction, that is located at PC-4.  If it was a
> Thumb instruction, it is located at PC-2.  This PC value is passed
> unmodified to the VFP entry code, and the passed r2 reflect the
> value in regs->ARM_pc.

The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
Thumb instruction, so it is no longer the same as the ARM_pc.

Since the VFP instructions in Thumb mode are always 32-bit, Colin's
patch made sense to me.

> I think that the undefined instruction handling needs reworking for
> Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> instruction, and we have no way of knowing without repeatedly
> decoding that instruction.

We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
we could either do it in __und_usr or let the code handling the undef
fix it up.

-- 
Catalin



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 14:10       ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 14:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
> > >        pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
> > >
> > >        /*
> > > +        * If the exception occured in thumb mode, pc is exception location + 2,
> > > +        * the middle of the 32-bit VFP instruction.  Add 2 to get exception
> > > +        * location + 4, the same we get in ARM mode.
> > > +        */
> > > +#ifdef CONFIG_ARM_THUMB
> > > +       if (regs->ARM_cpsr & PSR_T_BIT)
> > > +               regs->ARM_pc += 2;
> > > +#endif
> >
> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.
> 
> I don't think this is correct.  On entry to the undefined instruction
> handler, we get the uncorrected PC value, so PC points to the
> instruction after the faulting instruction.
> 
> If it was an ARM instruction, that is located at PC-4.  If it was a
> Thumb instruction, it is located at PC-2.  This PC value is passed
> unmodified to the VFP entry code, and the passed r2 reflect the
> value in regs->ARM_pc.

The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
Thumb instruction, so it is no longer the same as the ARM_pc.

Since the VFP instructions in Thumb mode are always 32-bit, Colin's
patch made sense to me.

> I think that the undefined instruction handling needs reworking for
> Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> instruction, and we have no way of knowing without repeatedly
> decoding that instruction.

We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
we could either do it in __und_usr or let the code handling the undef
fix it up.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 14:10       ` Catalin Marinas
@ 2011-01-14 15:49         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 15:49 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > I don't think this is correct.  On entry to the undefined instruction
> > handler, we get the uncorrected PC value, so PC points to the
> > instruction after the faulting instruction.
> > 
> > If it was an ARM instruction, that is located at PC-4.  If it was a
> > Thumb instruction, it is located at PC-2.  This PC value is passed
> > unmodified to the VFP entry code, and the passed r2 reflect the
> > value in regs->ARM_pc.
> 
> The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> Thumb instruction, so it is no longer the same as the ARM_pc.

That's something that should be fixed - the entry conditions should be
the same irrespective of thumb or arm encoding.

> Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> patch made sense to me.

I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
and 32-bit encoding.

> > I think that the undefined instruction handling needs reworking for
> > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> > instruction, and we have no way of knowing without repeatedly
> > decoding that instruction.
> 
> We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
> we could either do it in __und_usr or let the code handling the undef
> fix it up.

At the moment its just confusing as things stand, as some things are
changed in one place and not the other.  Let's kill the pointless
addition of 2 in the undefined instruction handler so that in every
case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
as per the ARM ARM undefined exception entry LR.

Undefined instruction exception handlers can then rely on the meaning
of both of these.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 15:49         ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 15:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > I don't think this is correct.  On entry to the undefined instruction
> > handler, we get the uncorrected PC value, so PC points to the
> > instruction after the faulting instruction.
> > 
> > If it was an ARM instruction, that is located at PC-4.  If it was a
> > Thumb instruction, it is located at PC-2.  This PC value is passed
> > unmodified to the VFP entry code, and the passed r2 reflect the
> > value in regs->ARM_pc.
> 
> The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> Thumb instruction, so it is no longer the same as the ARM_pc.

That's something that should be fixed - the entry conditions should be
the same irrespective of thumb or arm encoding.

> Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> patch made sense to me.

I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
and 32-bit encoding.

> > I think that the undefined instruction handling needs reworking for
> > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> > instruction, and we have no way of knowing without repeatedly
> > decoding that instruction.
> 
> We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
> we could either do it in __und_usr or let the code handling the undef
> fix it up.

At the moment its just confusing as things stand, as some things are
changed in one place and not the other.  Let's kill the pointless
addition of 2 in the undefined instruction handler so that in every
case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
as per the ARM ARM undefined exception entry LR.

Undefined instruction exception handlers can then rely on the meaning
of both of these.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 15:49         ` Russell King - ARM Linux
@ 2011-01-14 16:23           ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 16:23 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > I don't think this is correct.  On entry to the undefined instruction
> > > handler, we get the uncorrected PC value, so PC points to the
> > > instruction after the faulting instruction.
> > >
> > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > value in regs->ARM_pc.
> >
> > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > Thumb instruction, so it is no longer the same as the ARM_pc.
> 
> That's something that should be fixed - the entry conditions should be
> the same irrespective of thumb or arm encoding.

But in this case you have to fix the vfphw.S code to check for Thumb and
subtract 2 rather than 4 from r2.

> > Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> > patch made sense to me.
> 
> I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
> and 32-bit encoding.

Are you sure? The Thumb encoding is made up of two 16-bit values but it
is still 32-bit in total.

> > > I think that the undefined instruction handling needs reworking for
> > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> > > instruction, and we have no way of knowing without repeatedly
> > > decoding that instruction.
> >
> > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
> > we could either do it in __und_usr or let the code handling the undef
> > fix it up.
> 
> At the moment its just confusing as things stand, as some things are
> changed in one place and not the other.  Let's kill the pointless
> addition of 2 in the undefined instruction handler so that in every
> case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> as per the ARM ARM undefined exception entry LR.
> 
> Undefined instruction exception handlers can then rely on the meaning
> of both of these.

That's an alternative, though we may end up with checking the encoding
twice. The Undef handler already reads the instruction opcode and it
needs to know whether it is a 16 or a 32-bit wide instruction.

But I agree that the current implementation is a bit confusing.

-- 
Catalin



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 16:23           ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > I don't think this is correct.  On entry to the undefined instruction
> > > handler, we get the uncorrected PC value, so PC points to the
> > > instruction after the faulting instruction.
> > >
> > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > value in regs->ARM_pc.
> >
> > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > Thumb instruction, so it is no longer the same as the ARM_pc.
> 
> That's something that should be fixed - the entry conditions should be
> the same irrespective of thumb or arm encoding.

But in this case you have to fix the vfphw.S code to check for Thumb and
subtract 2 rather than 4 from r2.

> > Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> > patch made sense to me.
> 
> I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
> and 32-bit encoding.

Are you sure? The Thumb encoding is made up of two 16-bit values but it
is still 32-bit in total.

> > > I think that the undefined instruction handling needs reworking for
> > > Thumb entirely as we could be dealing with a 16-bit or 32-bit thumb
> > > instruction, and we have no way of knowing without repeatedly
> > > decoding that instruction.
> >
> > We already handle the r2 for in __und_usr. We don't deal with ARM_pc but
> > we could either do it in __und_usr or let the code handling the undef
> > fix it up.
> 
> At the moment its just confusing as things stand, as some things are
> changed in one place and not the other.  Let's kill the pointless
> addition of 2 in the undefined instruction handler so that in every
> case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> as per the ARM ARM undefined exception entry LR.
> 
> Undefined instruction exception handlers can then rely on the meaning
> of both of these.

That's an alternative, though we may end up with checking the encoding
twice. The Undef handler already reads the instruction opcode and it
needs to know whether it is a 16 or a 32-bit wide instruction.

But I agree that the current implementation is a bit confusing.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 14:10       ` Catalin Marinas
@ 2011-01-14 16:24         ` Dave Martin
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Martin @ 2011-01-14 16:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Russell King - ARM Linux, linux-kernel, linux-arm-kernel, Colin Cross

Hi,

On Fri, Jan 14, 2011 at 8:10 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
>> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
>> > >        pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
>> > >
>> > >        /*
>> > > +        * If the exception occured in thumb mode, pc is exception location + 2,
>> > > +        * the middle of the 32-bit VFP instruction.  Add 2 to get exception
>> > > +        * location + 4, the same we get in ARM mode.
>> > > +        */
>> > > +#ifdef CONFIG_ARM_THUMB
>> > > +       if (regs->ARM_cpsr & PSR_T_BIT)
>> > > +               regs->ARM_pc += 2;
>> > > +#endif
>> >
>> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.
>>
>> I don't think this is correct.  On entry to the undefined instruction
>> handler, we get the uncorrected PC value, so PC points to the
>> instruction after the faulting instruction.
>>
>> If it was an ARM instruction, that is located at PC-4.  If it was a
>> Thumb instruction, it is located at PC-2.  This PC value is passed
>> unmodified to the VFP entry code, and the passed r2 reflect the
>> value in regs->ARM_pc.
>
> The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> Thumb instruction, so it is no longer the same as the ARM_pc.
>
> Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> patch made sense to me.

Is the comment preceding __und_usr_unknown causing some confusion here?

/*
 * The FP module is called with these registers set:
 *  r0  = instruction
 *  r2  = PC+4
 ...

That reflects the ARM case only: for Thumb, r2 is always PC+2 (?)

The comment at the start of do_undefinstr() (which receives these
registers) is correct though:

	/*
	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
	 * depending whether we're in Thumb mode or not.
	 * Correct this offset.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 16:24         ` Dave Martin
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Martin @ 2011-01-14 16:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Fri, Jan 14, 2011 at 8:10 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
>> On Fri, Jan 14, 2011 at 11:43:04AM +0000, Catalin Marinas wrote:
>> > > ? ? ? ?pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
>> > >
>> > > ? ? ? ?/*
>> > > + ? ? ? ?* If the exception occured in thumb mode, pc is exception location + 2,
>> > > + ? ? ? ?* the middle of the 32-bit VFP instruction. ?Add 2 to get exception
>> > > + ? ? ? ?* location + 4, the same we get in ARM mode.
>> > > + ? ? ? ?*/
>> > > +#ifdef CONFIG_ARM_THUMB
>> > > + ? ? ? if (regs->ARM_cpsr & PSR_T_BIT)
>> > > + ? ? ? ? ? ? ? regs->ARM_pc += 2;
>> > > +#endif
>> >
>> > You can use "if (thumb_mode(regs))" and avoid the #ifdef entirely.
>>
>> I don't think this is correct. ?On entry to the undefined instruction
>> handler, we get the uncorrected PC value, so PC points to the
>> instruction after the faulting instruction.
>>
>> If it was an ARM instruction, that is located at PC-4. ?If it was a
>> Thumb instruction, it is located at PC-2. ?This PC value is passed
>> unmodified to the VFP entry code, and the passed r2 reflect the
>> value in regs->ARM_pc.
>
> The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> Thumb instruction, so it is no longer the same as the ARM_pc.
>
> Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> patch made sense to me.

Is the comment preceding __und_usr_unknown causing some confusion here?

/*
 * The FP module is called with these registers set:
 *  r0  = instruction
 *  r2  = PC+4
 ...

That reflects the ARM case only: for Thumb, r2 is always PC+2 (?)

The comment at the start of do_undefinstr() (which receives these
registers) is correct though:

	/*
	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
	 * depending whether we're in Thumb mode or not.
	 * Correct this offset.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 16:23           ` Catalin Marinas
@ 2011-01-14 16:35             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 16:35 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote:
> On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > > I don't think this is correct.  On entry to the undefined instruction
> > > > handler, we get the uncorrected PC value, so PC points to the
> > > > instruction after the faulting instruction.
> > > >
> > > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > > value in regs->ARM_pc.
> > >
> > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > > Thumb instruction, so it is no longer the same as the ARM_pc.
> > 
> > That's something that should be fixed - the entry conditions should be
> > the same irrespective of thumb or arm encoding.
> 
> But in this case you have to fix the vfphw.S code to check for Thumb and
> subtract 2 rather than 4 from r2.

So is this right for Thumb?  Or does it need to be 2 for thumb and 4 for
ARM?  Maybe it needs documenting to say why 4 is always correct (if that
is the case).

check_for_exception:
        tst     r1, #FPEXC_EX
        bne     process_exception       @ might as well handle the pending
                                        @ exception before retrying branch
                                        @ out before setting an FPEXC that
                                        @ stops us reading stuff
        VFPFMXR FPEXC, r1               @ restore FPEXC last
        sub     r2, r2, #4
        str     r2, [sp, #S_PC]         @ retry the instruction

> > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> > > patch made sense to me.
> > 
> > I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
> > and 32-bit encoding.
> 
> Are you sure? The Thumb encoding is made up of two 16-bit values but it
> is still 32-bit in total.

No, I'm not sure - it looks like it is made up from two 16-bit
instructions.

> > At the moment its just confusing as things stand, as some things are
> > changed in one place and not the other.  Let's kill the pointless
> > addition of 2 in the undefined instruction handler so that in every
> > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> > as per the ARM ARM undefined exception entry LR.
> > 
> > Undefined instruction exception handlers can then rely on the meaning
> > of both of these.
> 
> That's an alternative, though we may end up with checking the encoding
> twice. The Undef handler already reads the instruction opcode and it
> needs to know whether it is a 16 or a 32-bit wide instruction.

At the moment we add 2 in one place, take off 4 in another, and now
we're going to add 2 in a completely different place.  This is insane.
It's a big mess, one which it's impossible to tell if anything is
correct or even easy to follow what's going on.

I don't really care what it's replaced with provided its replaced by
something sane, easy to follow and obviously correct.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 16:35             ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 16:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote:
> On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > > I don't think this is correct.  On entry to the undefined instruction
> > > > handler, we get the uncorrected PC value, so PC points to the
> > > > instruction after the faulting instruction.
> > > >
> > > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > > value in regs->ARM_pc.
> > >
> > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > > Thumb instruction, so it is no longer the same as the ARM_pc.
> > 
> > That's something that should be fixed - the entry conditions should be
> > the same irrespective of thumb or arm encoding.
> 
> But in this case you have to fix the vfphw.S code to check for Thumb and
> subtract 2 rather than 4 from r2.

So is this right for Thumb?  Or does it need to be 2 for thumb and 4 for
ARM?  Maybe it needs documenting to say why 4 is always correct (if that
is the case).

check_for_exception:
        tst     r1, #FPEXC_EX
        bne     process_exception       @ might as well handle the pending
                                        @ exception before retrying branch
                                        @ out before setting an FPEXC that
                                        @ stops us reading stuff
        VFPFMXR FPEXC, r1               @ restore FPEXC last
        sub     r2, r2, #4
        str     r2, [sp, #S_PC]         @ retry the instruction

> > > Since the VFP instructions in Thumb mode are always 32-bit, Colin's
> > > patch made sense to me.
> > 
> > I looked up the VADD instruction in the ARM ARM.  It has both a 16-bit
> > and 32-bit encoding.
> 
> Are you sure? The Thumb encoding is made up of two 16-bit values but it
> is still 32-bit in total.

No, I'm not sure - it looks like it is made up from two 16-bit
instructions.

> > At the moment its just confusing as things stand, as some things are
> > changed in one place and not the other.  Let's kill the pointless
> > addition of 2 in the undefined instruction handler so that in every
> > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> > as per the ARM ARM undefined exception entry LR.
> > 
> > Undefined instruction exception handlers can then rely on the meaning
> > of both of these.
> 
> That's an alternative, though we may end up with checking the encoding
> twice. The Undef handler already reads the instruction opcode and it
> needs to know whether it is a 16 or a 32-bit wide instruction.

At the moment we add 2 in one place, take off 4 in another, and now
we're going to add 2 in a completely different place.  This is insane.
It's a big mess, one which it's impossible to tell if anything is
correct or even easy to follow what's going on.

I don't really care what it's replaced with provided its replaced by
something sane, easy to follow and obviously correct.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 16:24         ` Dave Martin
@ 2011-01-14 16:52           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 16:52 UTC (permalink / raw)
  To: Dave Martin; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, Colin Cross

On Fri, Jan 14, 2011 at 10:24:52AM -0600, Dave Martin wrote:
> Is the comment preceding __und_usr_unknown causing some confusion here?
> 
> /*
>  * The FP module is called with these registers set:
>  *  r0  = instruction
>  *  r2  = PC+4
>  ...
> 
> That reflects the ARM case only: for Thumb, r2 is always PC+2 (?)

Actually, referring to 'PC' here is confusing (and yes, I probably wrote
it) - does 'PC' refer to the address of the faulting instruction or the
current PC value...

Your '(?)' there is exactly the problem I'm referring to - I don't think
there's much of a clear idea really what's going on here...

> The comment at the start of do_undefinstr() (which receives these
> registers) is correct though:
> 
> 	/*
> 	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> 	 * depending whether we're in Thumb mode or not.
> 	 * Correct this offset.

The ARM ARM says that in order to return to the instruction which
generated the exception, subtract 2 bytes for thumb or 4 bytes for
ARM.

So, in order to point at the instruction which generated the exception,
we have to subtract this value from the PC value we were passed.  I
suggest changing this comment to:

	/*
	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
	 * depending on Thumb mode.  Correct this offset so that
	 * regs->ARM_pc points at the faulting instruction.
	 */


^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 16:52           ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 10:24:52AM -0600, Dave Martin wrote:
> Is the comment preceding __und_usr_unknown causing some confusion here?
> 
> /*
>  * The FP module is called with these registers set:
>  *  r0  = instruction
>  *  r2  = PC+4
>  ...
> 
> That reflects the ARM case only: for Thumb, r2 is always PC+2 (?)

Actually, referring to 'PC' here is confusing (and yes, I probably wrote
it) - does 'PC' refer to the address of the faulting instruction or the
current PC value...

Your '(?)' there is exactly the problem I'm referring to - I don't think
there's much of a clear idea really what's going on here...

> The comment at the start of do_undefinstr() (which receives these
> registers) is correct though:
> 
> 	/*
> 	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> 	 * depending whether we're in Thumb mode or not.
> 	 * Correct this offset.

The ARM ARM says that in order to return to the instruction which
generated the exception, subtract 2 bytes for thumb or 4 bytes for
ARM.

So, in order to point at the instruction which generated the exception,
we have to subtract this value from the PC value we were passed.  I
suggest changing this comment to:

	/*
	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
	 * depending on Thumb mode.  Correct this offset so that
	 * regs->ARM_pc points at the faulting instruction.
	 */

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 16:35             ` Russell King - ARM Linux
@ 2011-01-14 16:58               ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 16:58 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, 2011-01-14 at 16:35 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote:
> > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > > > I don't think this is correct.  On entry to the undefined instruction
> > > > > handler, we get the uncorrected PC value, so PC points to the
> > > > > instruction after the faulting instruction.
> > > > >
> > > > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > > > value in regs->ARM_pc.
> > > >
> > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > > > Thumb instruction, so it is no longer the same as the ARM_pc.
> > >
> > > That's something that should be fixed - the entry conditions should be
> > > the same irrespective of thumb or arm encoding.
> >
> > But in this case you have to fix the vfphw.S code to check for Thumb and
> > subtract 2 rather than 4 from r2.
> 
> So is this right for Thumb?  Or does it need to be 2 for thumb and 4 for
> ARM?  Maybe it needs documenting to say why 4 is always correct (if that
> is the case).
> 
> check_for_exception:
>         tst     r1, #FPEXC_EX
>         bne     process_exception       @ might as well handle the pending
>                                         @ exception before retrying branch
>                                         @ out before setting an FPEXC that
>                                         @ stops us reading stuff
>         VFPFMXR FPEXC, r1               @ restore FPEXC last
>         sub     r2, r2, #4
>         str     r2, [sp, #S_PC]         @ retry the instruction

If we don't touch r2 in __und_usr, than in vfphw.S we would need to
subtract 2 for Thumb and 4 for ARM. But since we did +2 in __und_usr, we
always subtracted 4 here (confusingly though).

> > > At the moment its just confusing as things stand, as some things are
> > > changed in one place and not the other.  Let's kill the pointless
> > > addition of 2 in the undefined instruction handler so that in every
> > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> > > as per the ARM ARM undefined exception entry LR.
> > >
> > > Undefined instruction exception handlers can then rely on the meaning
> > > of both of these.
> >
> > That's an alternative, though we may end up with checking the encoding
> > twice. The Undef handler already reads the instruction opcode and it
> > needs to know whether it is a 16 or a 32-bit wide instruction.
> 
> At the moment we add 2 in one place, take off 4 in another, and now
> we're going to add 2 in a completely different place.  This is insane.
> It's a big mess, one which it's impossible to tell if anything is
> correct or even easy to follow what's going on.

I agree, this code needs some clean-up. Maybe for Undef we could unify
the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
breakpoint code, I haven't checked).

Otherwise just let the code handling the undef deal with the ARM/Thumb
difference. For SVC, it makes sense to have different offsets as we
always return to the next instruction.

-- 
Catalin



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 16:58               ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-14 16:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2011-01-14 at 16:35 +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 04:23:12PM +0000, Catalin Marinas wrote:
> > On Fri, 2011-01-14 at 15:49 +0000, Russell King - ARM Linux wrote:
> > > On Fri, Jan 14, 2011 at 02:10:31PM +0000, Catalin Marinas wrote:
> > > > On Fri, 2011-01-14 at 12:02 +0000, Russell King - ARM Linux wrote:
> > > > > I don't think this is correct.  On entry to the undefined instruction
> > > > > handler, we get the uncorrected PC value, so PC points to the
> > > > > instruction after the faulting instruction.
> > > > >
> > > > > If it was an ARM instruction, that is located at PC-4.  If it was a
> > > > > Thumb instruction, it is located at PC-2.  This PC value is passed
> > > > > unmodified to the VFP entry code, and the passed r2 reflect the
> > > > > value in regs->ARM_pc.
> > > >
> > > > The entry-armv.S code adds 2 to the r2 register in case of a 32-bit
> > > > Thumb instruction, so it is no longer the same as the ARM_pc.
> > >
> > > That's something that should be fixed - the entry conditions should be
> > > the same irrespective of thumb or arm encoding.
> >
> > But in this case you have to fix the vfphw.S code to check for Thumb and
> > subtract 2 rather than 4 from r2.
> 
> So is this right for Thumb?  Or does it need to be 2 for thumb and 4 for
> ARM?  Maybe it needs documenting to say why 4 is always correct (if that
> is the case).
> 
> check_for_exception:
>         tst     r1, #FPEXC_EX
>         bne     process_exception       @ might as well handle the pending
>                                         @ exception before retrying branch
>                                         @ out before setting an FPEXC that
>                                         @ stops us reading stuff
>         VFPFMXR FPEXC, r1               @ restore FPEXC last
>         sub     r2, r2, #4
>         str     r2, [sp, #S_PC]         @ retry the instruction

If we don't touch r2 in __und_usr, than in vfphw.S we would need to
subtract 2 for Thumb and 4 for ARM. But since we did +2 in __und_usr, we
always subtracted 4 here (confusingly though).

> > > At the moment its just confusing as things stand, as some things are
> > > changed in one place and not the other.  Let's kill the pointless
> > > addition of 2 in the undefined instruction handler so that in every
> > > case we enter handlers with r2 == regs->ARM_pc, and regs->ARM_pc
> > > as per the ARM ARM undefined exception entry LR.
> > >
> > > Undefined instruction exception handlers can then rely on the meaning
> > > of both of these.
> >
> > That's an alternative, though we may end up with checking the encoding
> > twice. The Undef handler already reads the instruction opcode and it
> > needs to know whether it is a 16 or a 32-bit wide instruction.
> 
> At the moment we add 2 in one place, take off 4 in another, and now
> we're going to add 2 in a completely different place.  This is insane.
> It's a big mess, one which it's impossible to tell if anything is
> correct or even easy to follow what's going on.

I agree, this code needs some clean-up. Maybe for Undef we could unify
the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
breakpoint code, I haven't checked).

Otherwise just let the code handling the undef deal with the ARM/Thumb
difference. For SVC, it makes sense to have different offsets as we
always return to the next instruction.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 16:58               ` Catalin Marinas
@ 2011-01-14 17:30                 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 17:30 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> I agree, this code needs some clean-up. Maybe for Undef we could unify
> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> breakpoint code, I haven't checked).
> 
> Otherwise just let the code handling the undef deal with the ARM/Thumb
> difference. For SVC, it makes sense to have different offsets as we
> always return to the next instruction.

I think it just needs better documentation.

Having been through all this, there _are_ bugs lurking in the code exactly
because of this randomness with what PC value is means what.

When the VFP support code tests the state of the VFP hardware during boot,
it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
VFP handling code, and executes a VFP instruction.

If this VFP instruction faults (eg, because there is no VFP hardware
present or we're not permitted to use it), it could end up resuming
execution in the middle of the 16-bit paired instruction because
regs->ARM_pc points in the middle of it.

So vfp_testing_entry should at least store r2 into regs->ARM_pc to
guarantee resuming at the following instruction.

So maybe the right answer is to store r2 into regs->ARM_pc in
process_exception in the VFP assembly code too?

Or maybe we should just make it unconditional that whenever we have an
undefined instruction exception, the regs->ARM_pc value will always be
set for resuming execution after the faulted instruction.  That makes
it consistent with r2 throughout the code in every case.

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the first 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,14 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,12 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction - if Thumb
+	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
+					@ else it's one 32-bit instruction, so
+					@ always subtract 4 from the following
+					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 17:30                 ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> I agree, this code needs some clean-up. Maybe for Undef we could unify
> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> breakpoint code, I haven't checked).
> 
> Otherwise just let the code handling the undef deal with the ARM/Thumb
> difference. For SVC, it makes sense to have different offsets as we
> always return to the next instruction.

I think it just needs better documentation.

Having been through all this, there _are_ bugs lurking in the code exactly
because of this randomness with what PC value is means what.

When the VFP support code tests the state of the VFP hardware during boot,
it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
VFP handling code, and executes a VFP instruction.

If this VFP instruction faults (eg, because there is no VFP hardware
present or we're not permitted to use it), it could end up resuming
execution in the middle of the 16-bit paired instruction because
regs->ARM_pc points in the middle of it.

So vfp_testing_entry should at least store r2 into regs->ARM_pc to
guarantee resuming at the following instruction.

So maybe the right answer is to store r2 into regs->ARM_pc in
process_exception in the VFP assembly code too?

Or maybe we should just make it unconditional that whenever we have an
undefined instruction exception, the regs->ARM_pc value will always be
set for resuming execution after the faulted instruction.  That makes
it consistent with r2 throughout the code in every case.

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the first 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,14 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,12 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction - if Thumb
+	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
+					@ else it's one 32-bit instruction, so
+					@ always subtract 4 from the following
+					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 17:30                 ` Russell King - ARM Linux
@ 2011-01-14 18:47                   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 18:47 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-kernel, linux-arm-kernel, Colin Cross

On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> > I agree, this code needs some clean-up. Maybe for Undef we could unify
> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> > breakpoint code, I haven't checked).
> > 
> > Otherwise just let the code handling the undef deal with the ARM/Thumb
> > difference. For SVC, it makes sense to have different offsets as we
> > always return to the next instruction.
> 
> I think it just needs better documentation.
> 
> Having been through all this, there _are_ bugs lurking in the code exactly
> because of this randomness with what PC value is means what.
> 
> When the VFP support code tests the state of the VFP hardware during boot,
> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> VFP handling code, and executes a VFP instruction.
> 
> If this VFP instruction faults (eg, because there is no VFP hardware
> present or we're not permitted to use it), it could end up resuming
> execution in the middle of the 16-bit paired instruction because
> regs->ARM_pc points in the middle of it.
> 
> So vfp_testing_entry should at least store r2 into regs->ARM_pc to
> guarantee resuming at the following instruction.
> 
> So maybe the right answer is to store r2 into regs->ARM_pc in
> process_exception in the VFP assembly code too?
> 
> Or maybe we should just make it unconditional that whenever we have an
> undefined instruction exception, the regs->ARM_pc value will always be
> set for resuming execution after the faulted instruction.  That makes
> it consistent with r2 throughout the code in every case.

So... this incrementally on top of the previous patch (which I've
reproduced below as there's a subtle comment change in there wrt IRQ
state.)

This means we have consistent state - both r2 and regs->ARM_pc always
point to the next instruction to be executed in every case, which means
its easy to understand and remember while reading through the code.

diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
--- b/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -499,10 +499,11 @@
 	blo	__und_usr_unknown
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
-	orr	r0, r0, r5, lsl #16
+	str	r2, [sp, #S_PC]			@ it's a 2x16bit instr, update
+	orr	r0, r0, r5, lsl #16		@  regs->ARM_pc
 	@
 	@ r0 = the two 16-bit Thumb instructions which caused the exception
-	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
 	@ r4 = PC value for the first 16-bit Thumb instruction
 	@
 #else

8<-x-x-

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the first 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,15 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@  IRQs disabled.
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,12 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction - if Thumb
+	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
+					@ else it's one 32-bit instruction, so
+					@ always subtract 4 from the following
+					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 18:47                   ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote:
> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> > I agree, this code needs some clean-up. Maybe for Undef we could unify
> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> > breakpoint code, I haven't checked).
> > 
> > Otherwise just let the code handling the undef deal with the ARM/Thumb
> > difference. For SVC, it makes sense to have different offsets as we
> > always return to the next instruction.
> 
> I think it just needs better documentation.
> 
> Having been through all this, there _are_ bugs lurking in the code exactly
> because of this randomness with what PC value is means what.
> 
> When the VFP support code tests the state of the VFP hardware during boot,
> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> VFP handling code, and executes a VFP instruction.
> 
> If this VFP instruction faults (eg, because there is no VFP hardware
> present or we're not permitted to use it), it could end up resuming
> execution in the middle of the 16-bit paired instruction because
> regs->ARM_pc points in the middle of it.
> 
> So vfp_testing_entry should at least store r2 into regs->ARM_pc to
> guarantee resuming at the following instruction.
> 
> So maybe the right answer is to store r2 into regs->ARM_pc in
> process_exception in the VFP assembly code too?
> 
> Or maybe we should just make it unconditional that whenever we have an
> undefined instruction exception, the regs->ARM_pc value will always be
> set for resuming execution after the faulted instruction.  That makes
> it consistent with r2 throughout the code in every case.

So... this incrementally on top of the previous patch (which I've
reproduced below as there's a subtle comment change in there wrt IRQ
state.)

This means we have consistent state - both r2 and regs->ARM_pc always
point to the next instruction to be executed in every case, which means
its easy to understand and remember while reading through the code.

diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
--- b/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -499,10 +499,11 @@
 	blo	__und_usr_unknown
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
-	orr	r0, r0, r5, lsl #16
+	str	r2, [sp, #S_PC]			@ it's a 2x16bit instr, update
+	orr	r0, r0, r5, lsl #16		@  regs->ARM_pc
 	@
 	@ r0 = the two 16-bit Thumb instructions which caused the exception
-	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
 	@ r4 = PC value for the first 16-bit Thumb instruction
 	@
 #else

8<-x-x-

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the first 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,15 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@  IRQs disabled.
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,12 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction - if Thumb
+	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
+					@ else it's one 32-bit instruction, so
+					@ always subtract 4 from the following
+					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 18:47                   ` Russell King - ARM Linux
@ 2011-01-14 19:23                     ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14 19:23 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> So... this incrementally on top of the previous patch (which I've
> reproduced below as there's a subtle comment change in there wrt IRQ
> state.)
>
> This means we have consistent state - both r2 and regs->ARM_pc always
> point to the next instruction to be executed in every case, which means
> its easy to understand and remember while reading through the code.
>
> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> --- b/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -499,10 +499,11 @@
>        blo     __und_usr_unknown
>  3:     ldrht   r0, [r4]
>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
> -       orr     r0, r0, r5, lsl #16
> +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
> +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
>        @
>        @ r0 = the two 16-bit Thumb instructions which caused the exception
> -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>        @ r4 = PC value for the first 16-bit Thumb instruction
>        @
>  #else
>
> 8<-x-x-
>
> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 2b46fea..5876eec 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>        .align  5
>  __und_usr:
>        usr_entry
> -
> -       @
> -       @ fall through to the emulation code, which returns using r9 if
> -       @ it has emulated the instruction, or the more conventional lr
> -       @ if we are to treat this as a real undefined instruction
>        @
> -       @  r0 - instruction
> +       @ The emulation code returns using r9 if it has emulated the
> +       @ instruction, or the more conventional lr if we are to treat
> +       @ this as a real undefined instruction
>        @
>        adr     r9, BSYM(ret_from_exception)
>        adr     lr, BSYM(__und_usr_unknown)
> +       @
> +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> +       @ faulting instruction depending on Thumb mode.
> +       @ r3 = regs->ARM_cpsr
> +       @
>        tst     r3, #PSR_T_BIT                  @ Thumb mode?
> -       itet    eq                              @ explicit IT needed for the 1f label
> +       itttt   eq                              @ explicit IT needed for the 1f label
>        subeq   r4, r2, #4                      @ ARM instr at LR - 4
> -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
>  1:     ldreqt  r0, [r4]
>  #ifdef CONFIG_CPU_ENDIAN_BE8
>        reveq   r0, r0                          @ little endian instruction
>  #endif
> +       @
> +       @ r0 = 32-bit ARM instruction which caused the exception
> +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)
> +       @ r4 = PC value for the faulting instruction
> +       @
>        beq     call_fpe
> +
>        @ Thumb instruction
>  #if __LINUX_ARM_ARCH__ >= 7
> +       sub     r4, r2, #2                      @ Thumb instr at LR - 2
>  2:
>  ARM(  ldrht   r5, [r4], #2    )
>  THUMB(        ldrht   r5, [r4]        )
> @@ -492,18 +500,19 @@ __und_usr:
>  3:     ldrht   r0, [r4]
>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>        orr     r0, r0, r5, lsl #16
> +       @
> +       @ r0 = the two 16-bit Thumb instructions which caused the exception
> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> +       @ r4 = PC value for the first 16-bit Thumb instruction
> +       @
>  #else
>        b       __und_usr_unknown
>  #endif
> - UNWIND(.fnend         )
> + UNWIND(.fnend)
>  ENDPROC(__und_usr)
>
> -       @
> -       @ fallthrough to call_fpe
> -       @
> -
>  /*
> - * The out of line fixup for the ldrt above.
> + * The out of line fixup for the ldrt instructions above.
>  */
>        .pushsection .fixup, "ax"
>  4:     mov     pc, r9
> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>  * NEON handler code.
>  *
>  * Emulators may wish to make use of the following registers:
> - *  r0  = instruction opcode.
> - *  r2  = PC+4
> + *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> + *  r2  = PC value to resume execution after successful emulation
>  *  r9  = normal "successful" return address
> - *  r10 = this threads thread_info structure.
> + *  r10 = this threads thread_info structure
>  *  lr  = unrecognised instruction return address
> + * IRQs disabled, FIQs enabled.
>  */
>        @
>        @ Fall-through from Thumb-2 __und_usr
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index ee57640..eeb9250 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>        void __user *pc;
>
>        /*
> -        * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> -        * depending whether we're in Thumb mode or not.
> -        * Correct this offset.
> +        * According to the ARM ARM, the PC is 2 or 4 bytes ahead
> +        * depending on Thumb mode.  Correct this offset so that
> +        * regs->ARM_pc points at the faulting instruction.
>         */
>        regs->ARM_pc -= correction;
>
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 4fa9903..2bf6089 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -19,6 +19,15 @@
>  #include <asm/vfpmacros.h>
>  #include "../kernel/entry-header.S"
>
> +@ VFP entry point.
> +@
> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@  r2  = PC value to resume execution after successful emulation
> +@  r9  = normal "successful" return address
> +@  r10 = this threads thread_info structure
> +@  lr  = unrecognised instruction return address
> +@  IRQs disabled.
> +@
>  ENTRY(do_vfp)
>  #ifdef CONFIG_PREEMPT
>        ldr     r4, [r10, #TI_PREEMPT]  @ get preempt count
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 9897dcf..7292921 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -61,13 +61,13 @@
>
>  @ VFP hardware support entry point.
>  @
> -@  r0  = faulted instruction
> -@  r2  = faulted PC+4
> -@  r9  = successful return
> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@  r2  = PC value to resume execution after successful emulation
> +@  r9  = normal "successful" return address
>  @  r10 = vfp_state union
>  @  r11 = CPU number
> -@  lr  = failure return
> -
> +@  lr  = unrecognised instruction return address
> +@  IRQs enabled.
>  ENTRY(vfp_support_entry)
>        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10

I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
my test case.  Could this second patch go first so it can be applied
to stable?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 19:23                     ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14 19:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> So... this incrementally on top of the previous patch (which I've
> reproduced below as there's a subtle comment change in there wrt IRQ
> state.)
>
> This means we have consistent state - both r2 and regs->ARM_pc always
> point to the next instruction to be executed in every case, which means
> its easy to understand and remember while reading through the code.
>
> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> --- b/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -499,10 +499,11 @@
> ? ? ? ?blo ? ? __und_usr_unknown
> ?3: ? ? ldrht ? r0, [r4]
> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> - ? ? ? orr ? ? r0, r0, r5, lsl #16
> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
> ? ? ? ?@
> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
> ? ? ? ?@
> ?#else
>
> 8<-x-x-
>
> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 2b46fea..5876eec 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
> ? ? ? ?.align ?5
> ?__und_usr:
> ? ? ? ?usr_entry
> -
> - ? ? ? @
> - ? ? ? @ fall through to the emulation code, which returns using r9 if
> - ? ? ? @ it has emulated the instruction, or the more conventional lr
> - ? ? ? @ if we are to treat this as a real undefined instruction
> ? ? ? ?@
> - ? ? ? @ ?r0 - instruction
> + ? ? ? @ The emulation code returns using r9 if it has emulated the
> + ? ? ? @ instruction, or the more conventional lr if we are to treat
> + ? ? ? @ this as a real undefined instruction
> ? ? ? ?@
> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
> + ? ? ? @
> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> + ? ? ? @ faulting instruction depending on Thumb mode.
> + ? ? ? @ r3 = regs->ARM_cpsr
> + ? ? ? @
> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> ?1: ? ? ldreqt ?r0, [r4]
> ?#ifdef CONFIG_CPU_ENDIAN_BE8
> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
> ?#endif
> + ? ? ? @
> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)
> + ? ? ? @ r4 = PC value for the faulting instruction
> + ? ? ? @
> ? ? ? ?beq ? ? call_fpe
> +
> ? ? ? ?@ Thumb instruction
> ?#if __LINUX_ARM_ARCH__ >= 7
> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> ?2:
> ?ARM( ?ldrht ? r5, [r4], #2 ? ?)
> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?)
> @@ -492,18 +500,19 @@ __und_usr:
> ?3: ? ? ldrht ? r0, [r4]
> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> ? ? ? ?orr ? ? r0, r0, r5, lsl #16
> + ? ? ? @
> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception
> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction
> + ? ? ? @
> ?#else
> ? ? ? ?b ? ? ? __und_usr_unknown
> ?#endif
> - UNWIND(.fnend ? ? ? ? )
> + UNWIND(.fnend)
> ?ENDPROC(__und_usr)
>
> - ? ? ? @
> - ? ? ? @ fallthrough to call_fpe
> - ? ? ? @
> -
> ?/*
> - * The out of line fixup for the ldrt above.
> + * The out of line fixup for the ldrt instructions above.
> ?*/
> ? ? ? ?.pushsection .fixup, "ax"
> ?4: ? ? mov ? ? pc, r9
> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
> ?* NEON handler code.
> ?*
> ?* Emulators may wish to make use of the following registers:
> - * ?r0 ?= instruction opcode.
> - * ?r2 ?= PC+4
> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> + * ?r2 ?= PC value to resume execution after successful emulation
> ?* ?r9 ?= normal "successful" return address
> - * ?r10 = this threads thread_info structure.
> + * ?r10 = this threads thread_info structure
> ?* ?lr ?= unrecognised instruction return address
> + * IRQs disabled, FIQs enabled.
> ?*/
> ? ? ? ?@
> ? ? ? ?@ Fall-through from Thumb-2 __und_usr
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index ee57640..eeb9250 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> ? ? ? ?void __user *pc;
>
> ? ? ? ?/*
> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead,
> - ? ? ? ?* depending whether we're in Thumb mode or not.
> - ? ? ? ?* Correct this offset.
> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead
> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that
> + ? ? ? ?* regs->ARM_pc points at the faulting instruction.
> ? ? ? ? */
> ? ? ? ?regs->ARM_pc -= correction;
>
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 4fa9903..2bf6089 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -19,6 +19,15 @@
> ?#include <asm/vfpmacros.h>
> ?#include "../kernel/entry-header.S"
>
> +@ VFP entry point.
> +@
> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@ ?r2 ?= PC value to resume execution after successful emulation
> +@ ?r9 ?= normal "successful" return address
> +@ ?r10 = this threads thread_info structure
> +@ ?lr ?= unrecognised instruction return address
> +@ ?IRQs disabled.
> +@
> ?ENTRY(do_vfp)
> ?#ifdef CONFIG_PREEMPT
> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 9897dcf..7292921 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -61,13 +61,13 @@
>
> ?@ VFP hardware support entry point.
> ?@
> -@ ?r0 ?= faulted instruction
> -@ ?r2 ?= faulted PC+4
> -@ ?r9 ?= successful return
> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@ ?r2 ?= PC value to resume execution after successful emulation
> +@ ?r9 ?= normal "successful" return address
> ?@ ?r10 = vfp_state union
> ?@ ?r11 = CPU number
> -@ ?lr ?= failure return
> -
> +@ ?lr ?= unrecognised instruction return address
> +@ ?IRQs enabled.
> ?ENTRY(vfp_support_entry)
> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10

I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
my test case.  Could this second patch go first so it can be applied
to stable?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 19:23                     ` Colin Cross
@ 2011-01-14 19:51                       ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14 19:51 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> So... this incrementally on top of the previous patch (which I've
>> reproduced below as there's a subtle comment change in there wrt IRQ
>> state.)
>>
>> This means we have consistent state - both r2 and regs->ARM_pc always
>> point to the next instruction to be executed in every case, which means
>> its easy to understand and remember while reading through the code.
>>
>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> --- b/arch/arm/kernel/entry-armv.S
>> +++ b/arch/arm/kernel/entry-armv.S
>> @@ -499,10 +499,11 @@
>>        blo     __und_usr_unknown
>>  3:     ldrht   r0, [r4]
>>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>> -       orr     r0, r0, r5, lsl #16
>> +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
>> +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
>>        @
>>        @ r0 = the two 16-bit Thumb instructions which caused the exception
>> -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>>        @ r4 = PC value for the first 16-bit Thumb instruction
>>        @
>>  #else
>>
>> 8<-x-x-
>>
>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> index 2b46fea..5876eec 100644
>> --- a/arch/arm/kernel/entry-armv.S
>> +++ b/arch/arm/kernel/entry-armv.S
>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>>        .align  5
>>  __und_usr:
>>        usr_entry
>> -
>> -       @
>> -       @ fall through to the emulation code, which returns using r9 if
>> -       @ it has emulated the instruction, or the more conventional lr
>> -       @ if we are to treat this as a real undefined instruction
>>        @
>> -       @  r0 - instruction
>> +       @ The emulation code returns using r9 if it has emulated the
>> +       @ instruction, or the more conventional lr if we are to treat
>> +       @ this as a real undefined instruction
>>        @
>>        adr     r9, BSYM(ret_from_exception)
>>        adr     lr, BSYM(__und_usr_unknown)
>> +       @
>> +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>> +       @ faulting instruction depending on Thumb mode.
>> +       @ r3 = regs->ARM_cpsr
>> +       @
>>        tst     r3, #PSR_T_BIT                  @ Thumb mode?
>> -       itet    eq                              @ explicit IT needed for the 1f label
>> +       itttt   eq                              @ explicit IT needed for the 1f label
>>        subeq   r4, r2, #4                      @ ARM instr at LR - 4
>> -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
>>  1:     ldreqt  r0, [r4]
>>  #ifdef CONFIG_CPU_ENDIAN_BE8
>>        reveq   r0, r0                          @ little endian instruction
>>  #endif
>> +       @
>> +       @ r0 = 32-bit ARM instruction which caused the exception
>> +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>> +       @ r4 = PC value for the faulting instruction
>> +       @
>>        beq     call_fpe
>> +
>>        @ Thumb instruction
>>  #if __LINUX_ARM_ARCH__ >= 7
>> +       sub     r4, r2, #2                      @ Thumb instr at LR - 2
>>  2:
>>  ARM(  ldrht   r5, [r4], #2    )
>>  THUMB(        ldrht   r5, [r4]        )
>> @@ -492,18 +500,19 @@ __und_usr:
>>  3:     ldrht   r0, [r4]
>>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>>        orr     r0, r0, r5, lsl #16
>> +       @
>> +       @ r0 = the two 16-bit Thumb instructions which caused the exception
>> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> +       @ r4 = PC value for the first 16-bit Thumb instruction
>> +       @
>>  #else
>>        b       __und_usr_unknown
>>  #endif
>> - UNWIND(.fnend         )
>> + UNWIND(.fnend)
>>  ENDPROC(__und_usr)
>>
>> -       @
>> -       @ fallthrough to call_fpe
>> -       @
>> -
>>  /*
>> - * The out of line fixup for the ldrt above.
>> + * The out of line fixup for the ldrt instructions above.
>>  */
>>        .pushsection .fixup, "ax"
>>  4:     mov     pc, r9
>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>>  * NEON handler code.
>>  *
>>  * Emulators may wish to make use of the following registers:
>> - *  r0  = instruction opcode.
>> - *  r2  = PC+4
>> + *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>> + *  r2  = PC value to resume execution after successful emulation
>>  *  r9  = normal "successful" return address
>> - *  r10 = this threads thread_info structure.
>> + *  r10 = this threads thread_info structure
>>  *  lr  = unrecognised instruction return address
>> + * IRQs disabled, FIQs enabled.
>>  */
>>        @
>>        @ Fall-through from Thumb-2 __und_usr
>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
>> index ee57640..eeb9250 100644
>> --- a/arch/arm/kernel/traps.c
>> +++ b/arch/arm/kernel/traps.c
>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>>        void __user *pc;
>>
>>        /*
>> -        * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>> -        * depending whether we're in Thumb mode or not.
>> -        * Correct this offset.
>> +        * According to the ARM ARM, the PC is 2 or 4 bytes ahead
>> +        * depending on Thumb mode.  Correct this offset so that
>> +        * regs->ARM_pc points at the faulting instruction.
>>         */
>>        regs->ARM_pc -= correction;
>>
>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
>> index 4fa9903..2bf6089 100644
>> --- a/arch/arm/vfp/entry.S
>> +++ b/arch/arm/vfp/entry.S
>> @@ -19,6 +19,15 @@
>>  #include <asm/vfpmacros.h>
>>  #include "../kernel/entry-header.S"
>>
>> +@ VFP entry point.
>> +@
>> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>> +@  r2  = PC value to resume execution after successful emulation
>> +@  r9  = normal "successful" return address
>> +@  r10 = this threads thread_info structure
>> +@  lr  = unrecognised instruction return address
>> +@  IRQs disabled.
>> +@
>>  ENTRY(do_vfp)
>>  #ifdef CONFIG_PREEMPT
>>        ldr     r4, [r10, #TI_PREEMPT]  @ get preempt count
>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
>> index 9897dcf..7292921 100644
>> --- a/arch/arm/vfp/vfphw.S
>> +++ b/arch/arm/vfp/vfphw.S
>> @@ -61,13 +61,13 @@
>>
>>  @ VFP hardware support entry point.
>>  @
>> -@  r0  = faulted instruction
>> -@  r2  = faulted PC+4
>> -@  r9  = successful return
>> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>> +@  r2  = PC value to resume execution after successful emulation
>> +@  r9  = normal "successful" return address
>>  @  r10 = vfp_state union
>>  @  r11 = CPU number
>> -@  lr  = failure return
>> -
>> +@  lr  = unrecognised instruction return address
>> +@  IRQs enabled.
>>  ENTRY(vfp_support_entry)
>>        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>
> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
> my test case.  Could this second patch go first so it can be applied
> to stable?
>

Also, both patches together Tested-by: Colin Cross <ccross@android.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 19:51                       ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-14 19:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> So... this incrementally on top of the previous patch (which I've
>> reproduced below as there's a subtle comment change in there wrt IRQ
>> state.)
>>
>> This means we have consistent state - both r2 and regs->ARM_pc always
>> point to the next instruction to be executed in every case, which means
>> its easy to understand and remember while reading through the code.
>>
>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> --- b/arch/arm/kernel/entry-armv.S
>> +++ b/arch/arm/kernel/entry-armv.S
>> @@ -499,10 +499,11 @@
>> ? ? ? ?blo ? ? __und_usr_unknown
>> ?3: ? ? ldrht ? r0, [r4]
>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
>> - ? ? ? orr ? ? r0, r0, r5, lsl #16
>> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
>> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
>> ? ? ? ?@
>> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
>> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
>> ? ? ? ?@
>> ?#else
>>
>> 8<-x-x-
>>
>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> index 2b46fea..5876eec 100644
>> --- a/arch/arm/kernel/entry-armv.S
>> +++ b/arch/arm/kernel/entry-armv.S
>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>> ? ? ? ?.align ?5
>> ?__und_usr:
>> ? ? ? ?usr_entry
>> -
>> - ? ? ? @
>> - ? ? ? @ fall through to the emulation code, which returns using r9 if
>> - ? ? ? @ it has emulated the instruction, or the more conventional lr
>> - ? ? ? @ if we are to treat this as a real undefined instruction
>> ? ? ? ?@
>> - ? ? ? @ ?r0 - instruction
>> + ? ? ? @ The emulation code returns using r9 if it has emulated the
>> + ? ? ? @ instruction, or the more conventional lr if we are to treat
>> + ? ? ? @ this as a real undefined instruction
>> ? ? ? ?@
>> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
>> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
>> + ? ? ? @
>> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>> + ? ? ? @ faulting instruction depending on Thumb mode.
>> + ? ? ? @ r3 = regs->ARM_cpsr
>> + ? ? ? @
>> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
>> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
>> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
>> ?1: ? ? ldreqt ?r0, [r4]
>> ?#ifdef CONFIG_CPU_ENDIAN_BE8
>> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
>> ?#endif
>> + ? ? ? @
>> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
>> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>> + ? ? ? @ r4 = PC value for the faulting instruction
>> + ? ? ? @
>> ? ? ? ?beq ? ? call_fpe
>> +
>> ? ? ? ?@ Thumb instruction
>> ?#if __LINUX_ARM_ARCH__ >= 7
>> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
>> ?2:
>> ?ARM( ?ldrht ? r5, [r4], #2 ? ?)
>> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?)
>> @@ -492,18 +500,19 @@ __und_usr:
>> ?3: ? ? ldrht ? r0, [r4]
>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
>> ? ? ? ?orr ? ? r0, r0, r5, lsl #16
>> + ? ? ? @
>> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception
>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction
>> + ? ? ? @
>> ?#else
>> ? ? ? ?b ? ? ? __und_usr_unknown
>> ?#endif
>> - UNWIND(.fnend ? ? ? ? )
>> + UNWIND(.fnend)
>> ?ENDPROC(__und_usr)
>>
>> - ? ? ? @
>> - ? ? ? @ fallthrough to call_fpe
>> - ? ? ? @
>> -
>> ?/*
>> - * The out of line fixup for the ldrt above.
>> + * The out of line fixup for the ldrt instructions above.
>> ?*/
>> ? ? ? ?.pushsection .fixup, "ax"
>> ?4: ? ? mov ? ? pc, r9
>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>> ?* NEON handler code.
>> ?*
>> ?* Emulators may wish to make use of the following registers:
>> - * ?r0 ?= instruction opcode.
>> - * ?r2 ?= PC+4
>> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>> + * ?r2 ?= PC value to resume execution after successful emulation
>> ?* ?r9 ?= normal "successful" return address
>> - * ?r10 = this threads thread_info structure.
>> + * ?r10 = this threads thread_info structure
>> ?* ?lr ?= unrecognised instruction return address
>> + * IRQs disabled, FIQs enabled.
>> ?*/
>> ? ? ? ?@
>> ? ? ? ?@ Fall-through from Thumb-2 __und_usr
>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
>> index ee57640..eeb9250 100644
>> --- a/arch/arm/kernel/traps.c
>> +++ b/arch/arm/kernel/traps.c
>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>> ? ? ? ?void __user *pc;
>>
>> ? ? ? ?/*
>> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead,
>> - ? ? ? ?* depending whether we're in Thumb mode or not.
>> - ? ? ? ?* Correct this offset.
>> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead
>> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that
>> + ? ? ? ?* regs->ARM_pc points at the faulting instruction.
>> ? ? ? ? */
>> ? ? ? ?regs->ARM_pc -= correction;
>>
>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
>> index 4fa9903..2bf6089 100644
>> --- a/arch/arm/vfp/entry.S
>> +++ b/arch/arm/vfp/entry.S
>> @@ -19,6 +19,15 @@
>> ?#include <asm/vfpmacros.h>
>> ?#include "../kernel/entry-header.S"
>>
>> +@ VFP entry point.
>> +@
>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>> +@ ?r2 ?= PC value to resume execution after successful emulation
>> +@ ?r9 ?= normal "successful" return address
>> +@ ?r10 = this threads thread_info structure
>> +@ ?lr ?= unrecognised instruction return address
>> +@ ?IRQs disabled.
>> +@
>> ?ENTRY(do_vfp)
>> ?#ifdef CONFIG_PREEMPT
>> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count
>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
>> index 9897dcf..7292921 100644
>> --- a/arch/arm/vfp/vfphw.S
>> +++ b/arch/arm/vfp/vfphw.S
>> @@ -61,13 +61,13 @@
>>
>> ?@ VFP hardware support entry point.
>> ?@
>> -@ ?r0 ?= faulted instruction
>> -@ ?r2 ?= faulted PC+4
>> -@ ?r9 ?= successful return
>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>> +@ ?r2 ?= PC value to resume execution after successful emulation
>> +@ ?r9 ?= normal "successful" return address
>> ?@ ?r10 = vfp_state union
>> ?@ ?r11 = CPU number
>> -@ ?lr ?= failure return
>> -
>> +@ ?lr ?= unrecognised instruction return address
>> +@ ?IRQs enabled.
>> ?ENTRY(vfp_support_entry)
>> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>
> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
> my test case. ?Could this second patch go first so it can be applied
> to stable?
>

Also, both patches together Tested-by: Colin Cross <ccross@android.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 19:51                       ` Colin Cross
@ 2011-01-14 21:24                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 21:24 UTC (permalink / raw)
  To: Colin Cross; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Fri, Jan 14, 2011 at 11:51:38AM -0800, Colin Cross wrote:
> On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
> > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
> > my test case.  Could this second patch go first so it can be applied
> > to stable?
> >
> 
> Also, both patches together Tested-by: Colin Cross <ccross@android.com>

Thanks.  I do want to hear back from Catalin before merging them, so
they won't be part of tonight's push.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-14 21:24                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-14 21:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 11:51:38AM -0800, Colin Cross wrote:
> On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
> > I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
> > my test case. ?Could this second patch go first so it can be applied
> > to stable?
> >
> 
> Also, both patches together Tested-by: Colin Cross <ccross@android.com>

Thanks.  I do want to hear back from Catalin before merging them, so
they won't be part of tonight's push.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 17:30                 ` Russell King - ARM Linux
@ 2011-01-15 15:31                   ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-15 15:31 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On 14 January 2011 17:30, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
>> I agree, this code needs some clean-up. Maybe for Undef we could unify
>> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
>> breakpoint code, I haven't checked).
>>
>> Otherwise just let the code handling the undef deal with the ARM/Thumb
>> difference. For SVC, it makes sense to have different offsets as we
>> always return to the next instruction.
[...]
> When the VFP support code tests the state of the VFP hardware during boot,
> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> VFP handling code, and executes a VFP instruction.
>
> If this VFP instruction faults (eg, because there is no VFP hardware
> present or we're not permitted to use it), it could end up resuming
> execution in the middle of the 16-bit paired instruction because
> regs->ARM_pc points in the middle of it.

Yes, that's possible. We probably never tried a Thumb-2 kernel where
VFP isn't present.

> Or maybe we should just make it unconditional that whenever we have an
> undefined instruction exception, the regs->ARM_pc value will always be
> set for resuming execution after the faulted instruction.  That makes
> it consistent with r2 throughout the code in every case.

I have some comments below.

> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 2b46fea..5876eec 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>        .align  5
>  __und_usr:
>        usr_entry
> -
> -       @
> -       @ fall through to the emulation code, which returns using r9 if
> -       @ it has emulated the instruction, or the more conventional lr
> -       @ if we are to treat this as a real undefined instruction
>        @
> -       @  r0 - instruction
> +       @ The emulation code returns using r9 if it has emulated the
> +       @ instruction, or the more conventional lr if we are to treat
> +       @ this as a real undefined instruction
>        @
>        adr     r9, BSYM(ret_from_exception)
>        adr     lr, BSYM(__und_usr_unknown)
> +       @
> +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> +       @ faulting instruction depending on Thumb mode.
> +       @ r3 = regs->ARM_cpsr
> +       @
>        tst     r3, #PSR_T_BIT                  @ Thumb mode?
> -       itet    eq                              @ explicit IT needed for the 1f label
> +       itttt   eq                              @ explicit IT needed for the 1f label
>        subeq   r4, r2, #4                      @ ARM instr at LR - 4
> -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
>  1:     ldreqt  r0, [r4]

The itttt above should just be itt. The reveq is conditionally
compiled and beq doesn't necessarily need one.

>  #ifdef CONFIG_CPU_ENDIAN_BE8
>        reveq   r0, r0                          @ little endian instruction
>  #endif
> +       @
> +       @ r0 = 32-bit ARM instruction which caused the exception
> +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)

Is r2 here always the PC value following instruction? If the Thumb
instruction was 32-bit, it just points in the middle of the faulting
instruction.

> +       @ r4 = PC value for the faulting instruction
> +       @
>        beq     call_fpe
> +
>        @ Thumb instruction
>  #if __LINUX_ARM_ARCH__ >= 7
> +       sub     r4, r2, #2                      @ Thumb instr at LR - 2
>  2:
>  ARM(  ldrht   r5, [r4], #2    )
>  THUMB(        ldrht   r5, [r4]        )
> @@ -492,18 +500,19 @@ __und_usr:
>  3:     ldrht   r0, [r4]
>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>        orr     r0, r0, r5, lsl #16
> +       @
> +       @ r0 = the two 16-bit Thumb instructions which caused the exception
> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)

That's correct.

> +       @ r4 = PC value for the first 16-bit Thumb instruction

I think r4 here points in the middle of tha faulting instruction for
32-bit Thumb.

> +       @
>  #else
>        b       __und_usr_unknown
>  #endif
> - UNWIND(.fnend         )
> + UNWIND(.fnend)
>  ENDPROC(__und_usr)
>
> -       @
> -       @ fallthrough to call_fpe
> -       @
> -
>  /*
> - * The out of line fixup for the ldrt above.
> + * The out of line fixup for the ldrt instructions above.
>  */
>        .pushsection .fixup, "ax"
>  4:     mov     pc, r9
> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>  * NEON handler code.
>  *
>  * Emulators may wish to make use of the following registers:
> - *  r0  = instruction opcode.
> - *  r2  = PC+4
> + *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> + *  r2  = PC value to resume execution after successful emulation
>  *  r9  = normal "successful" return address
> - *  r10 = this threads thread_info structure.
> + *  r10 = this threads thread_info structure
>  *  lr  = unrecognised instruction return address
> + * IRQs disabled, FIQs enabled.
>  */
>        @
>        @ Fall-through from Thumb-2 __und_usr
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index ee57640..eeb9250 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>        void __user *pc;
>
>        /*
> -        * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> -        * depending whether we're in Thumb mode or not.
> -        * Correct this offset.
> +        * According to the ARM ARM, the PC is 2 or 4 bytes ahead
> +        * depending on Thumb mode.  Correct this offset so that
> +        * regs->ARM_pc points at the faulting instruction.
>         */
>        regs->ARM_pc -= correction;
>
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 4fa9903..2bf6089 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -19,6 +19,14 @@
>  #include <asm/vfpmacros.h>
>  #include "../kernel/entry-header.S"
>
> +@ VFP entry point.
> +@
> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@  r2  = PC value to resume execution after successful emulation
> +@  r9  = normal "successful" return address
> +@  r10 = this threads thread_info structure
> +@  lr  = unrecognised instruction return address
> +@
>  ENTRY(do_vfp)
>  #ifdef CONFIG_PREEMPT
>        ldr     r4, [r10, #TI_PREEMPT]  @ get preempt count
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 9897dcf..7292921 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -61,13 +61,13 @@
>
>  @ VFP hardware support entry point.
>  @
> -@  r0  = faulted instruction
> -@  r2  = faulted PC+4
> -@  r9  = successful return
> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@  r2  = PC value to resume execution after successful emulation

That's right.

> +@  r9  = normal "successful" return address
>  @  r10 = vfp_state union
>  @  r11 = CPU number
> -@  lr  = failure return
> -
> +@  lr  = unrecognised instruction return address
> +@  IRQs enabled.
>  ENTRY(vfp_support_entry)
>        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>
> @@ -138,9 +138,12 @@ check_for_exception:
>                                        @ exception before retrying branch
>                                        @ out before setting an FPEXC that
>                                        @ stops us reading stuff
> -       VFPFMXR FPEXC, r1               @ restore FPEXC last
> -       sub     r2, r2, #4
> -       str     r2, [sp, #S_PC]         @ retry the instruction
> +       VFPFMXR FPEXC, r1               @ Restore FPEXC last
> +       sub     r2, r2, #4              @ Retry current instruction - if Thumb
> +       str     r2, [sp, #S_PC]         @ mode it's two 16-bit instructions,
> +                                       @ else it's one 32-bit instruction, so
> +                                       @ always subtract 4 from the following
> +                                       @ instruction address.

I would say it's always a 32-bit instruction but made up of two 16-bit
values to allow half-word alignment.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-15 15:31                   ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-15 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 January 2011 17:30, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
>> I agree, this code needs some clean-up. Maybe for Undef we could unify
>> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
>> breakpoint code, I haven't checked).
>>
>> Otherwise just let the code handling the undef deal with the ARM/Thumb
>> difference. For SVC, it makes sense to have different offsets as we
>> always return to the next instruction.
[...]
> When the VFP support code tests the state of the VFP hardware during boot,
> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> VFP handling code, and executes a VFP instruction.
>
> If this VFP instruction faults (eg, because there is no VFP hardware
> present or we're not permitted to use it), it could end up resuming
> execution in the middle of the 16-bit paired instruction because
> regs->ARM_pc points in the middle of it.

Yes, that's possible. We probably never tried a Thumb-2 kernel where
VFP isn't present.

> Or maybe we should just make it unconditional that whenever we have an
> undefined instruction exception, the regs->ARM_pc value will always be
> set for resuming execution after the faulted instruction. ?That makes
> it consistent with r2 throughout the code in every case.

I have some comments below.

> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 2b46fea..5876eec 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
> ? ? ? ?.align ?5
> ?__und_usr:
> ? ? ? ?usr_entry
> -
> - ? ? ? @
> - ? ? ? @ fall through to the emulation code, which returns using r9 if
> - ? ? ? @ it has emulated the instruction, or the more conventional lr
> - ? ? ? @ if we are to treat this as a real undefined instruction
> ? ? ? ?@
> - ? ? ? @ ?r0 - instruction
> + ? ? ? @ The emulation code returns using r9 if it has emulated the
> + ? ? ? @ instruction, or the more conventional lr if we are to treat
> + ? ? ? @ this as a real undefined instruction
> ? ? ? ?@
> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
> + ? ? ? @
> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> + ? ? ? @ faulting instruction depending on Thumb mode.
> + ? ? ? @ r3 = regs->ARM_cpsr
> + ? ? ? @
> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> ?1: ? ? ldreqt ?r0, [r4]

The itttt above should just be itt. The reveq is conditionally
compiled and beq doesn't necessarily need one.

> ?#ifdef CONFIG_CPU_ENDIAN_BE8
> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
> ?#endif
> + ? ? ? @
> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)

Is r2 here always the PC value following instruction? If the Thumb
instruction was 32-bit, it just points in the middle of the faulting
instruction.

> + ? ? ? @ r4 = PC value for the faulting instruction
> + ? ? ? @
> ? ? ? ?beq ? ? call_fpe
> +
> ? ? ? ?@ Thumb instruction
> ?#if __LINUX_ARM_ARCH__ >= 7
> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> ?2:
> ?ARM( ?ldrht ? r5, [r4], #2 ? ?)
> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?)
> @@ -492,18 +500,19 @@ __und_usr:
> ?3: ? ? ldrht ? r0, [r4]
> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> ? ? ? ?orr ? ? r0, r0, r5, lsl #16
> + ? ? ? @
> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception
> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)

That's correct.

> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction

I think r4 here points in the middle of tha faulting instruction for
32-bit Thumb.

> + ? ? ? @
> ?#else
> ? ? ? ?b ? ? ? __und_usr_unknown
> ?#endif
> - UNWIND(.fnend ? ? ? ? )
> + UNWIND(.fnend)
> ?ENDPROC(__und_usr)
>
> - ? ? ? @
> - ? ? ? @ fallthrough to call_fpe
> - ? ? ? @
> -
> ?/*
> - * The out of line fixup for the ldrt above.
> + * The out of line fixup for the ldrt instructions above.
> ?*/
> ? ? ? ?.pushsection .fixup, "ax"
> ?4: ? ? mov ? ? pc, r9
> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
> ?* NEON handler code.
> ?*
> ?* Emulators may wish to make use of the following registers:
> - * ?r0 ?= instruction opcode.
> - * ?r2 ?= PC+4
> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> + * ?r2 ?= PC value to resume execution after successful emulation
> ?* ?r9 ?= normal "successful" return address
> - * ?r10 = this threads thread_info structure.
> + * ?r10 = this threads thread_info structure
> ?* ?lr ?= unrecognised instruction return address
> + * IRQs disabled, FIQs enabled.
> ?*/
> ? ? ? ?@
> ? ? ? ?@ Fall-through from Thumb-2 __und_usr
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index ee57640..eeb9250 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> ? ? ? ?void __user *pc;
>
> ? ? ? ?/*
> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead,
> - ? ? ? ?* depending whether we're in Thumb mode or not.
> - ? ? ? ?* Correct this offset.
> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead
> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that
> + ? ? ? ?* regs->ARM_pc points at the faulting instruction.
> ? ? ? ? */
> ? ? ? ?regs->ARM_pc -= correction;
>
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 4fa9903..2bf6089 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -19,6 +19,14 @@
> ?#include <asm/vfpmacros.h>
> ?#include "../kernel/entry-header.S"
>
> +@ VFP entry point.
> +@
> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@ ?r2 ?= PC value to resume execution after successful emulation
> +@ ?r9 ?= normal "successful" return address
> +@ ?r10 = this threads thread_info structure
> +@ ?lr ?= unrecognised instruction return address
> +@
> ?ENTRY(do_vfp)
> ?#ifdef CONFIG_PREEMPT
> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 9897dcf..7292921 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -61,13 +61,13 @@
>
> ?@ VFP hardware support entry point.
> ?@
> -@ ?r0 ?= faulted instruction
> -@ ?r2 ?= faulted PC+4
> -@ ?r9 ?= successful return
> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> +@ ?r2 ?= PC value to resume execution after successful emulation

That's right.

> +@ ?r9 ?= normal "successful" return address
> ?@ ?r10 = vfp_state union
> ?@ ?r11 = CPU number
> -@ ?lr ?= failure return
> -
> +@ ?lr ?= unrecognised instruction return address
> +@ ?IRQs enabled.
> ?ENTRY(vfp_support_entry)
> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>
> @@ -138,9 +138,12 @@ check_for_exception:
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff
> - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last
> - ? ? ? sub ? ? r2, r2, #4
> - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction
> + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last
> + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb
> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address.

I would say it's always a 32-bit instruction but made up of two 16-bit
values to allow half-word alignment.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 18:47                   ` Russell King - ARM Linux
@ 2011-01-15 15:38                     ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-15 15:38 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-kernel, linux-arm-kernel, Colin Cross

On 14 January 2011 18:47, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote:
>> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
>> > I agree, this code needs some clean-up. Maybe for Undef we could unify
>> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
>> > breakpoint code, I haven't checked).
>> >
>> > Otherwise just let the code handling the undef deal with the ARM/Thumb
>> > difference. For SVC, it makes sense to have different offsets as we
>> > always return to the next instruction.
>>
>> I think it just needs better documentation.
>>
>> Having been through all this, there _are_ bugs lurking in the code exactly
>> because of this randomness with what PC value is means what.
>>
>> When the VFP support code tests the state of the VFP hardware during boot,
>> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
>> VFP handling code, and executes a VFP instruction.
>>
>> If this VFP instruction faults (eg, because there is no VFP hardware
>> present or we're not permitted to use it), it could end up resuming
>> execution in the middle of the 16-bit paired instruction because
>> regs->ARM_pc points in the middle of it.
>>
>> So vfp_testing_entry should at least store r2 into regs->ARM_pc to
>> guarantee resuming at the following instruction.
>>
>> So maybe the right answer is to store r2 into regs->ARM_pc in
>> process_exception in the VFP assembly code too?
>>
>> Or maybe we should just make it unconditional that whenever we have an
>> undefined instruction exception, the regs->ARM_pc value will always be
>> set for resuming execution after the faulted instruction.  That makes
>> it consistent with r2 throughout the code in every case.
>
> So... this incrementally on top of the previous patch (which I've
> reproduced below as there's a subtle comment change in there wrt IRQ
> state.)
>
> This means we have consistent state - both r2 and regs->ARM_pc always
> point to the next instruction to be executed in every case, which means
> its easy to understand and remember while reading through the code.
>
> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> --- b/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -499,10 +499,11 @@
>        blo     __und_usr_unknown
>  3:     ldrht   r0, [r4]
>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
> -       orr     r0, r0, r5, lsl #16
> +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
> +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
>        @
>        @ r0 = the two 16-bit Thumb instructions which caused the exception
> -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>        @ r4 = PC value for the first 16-bit Thumb instruction
>        @
>  #else

Do we need to modify the VFP entry code to avoit the store to ARM_pc?

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-15 15:38                     ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-15 15:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 14 January 2011 18:47, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jan 14, 2011 at 05:30:50PM +0000, Russell King - ARM Linux wrote:
>> On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
>> > I agree, this code needs some clean-up. Maybe for Undef we could unify
>> > the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
>> > breakpoint code, I haven't checked).
>> >
>> > Otherwise just let the code handling the undef deal with the ARM/Thumb
>> > difference. For SVC, it makes sense to have different offsets as we
>> > always return to the next instruction.
>>
>> I think it just needs better documentation.
>>
>> Having been through all this, there _are_ bugs lurking in the code exactly
>> because of this randomness with what PC value is means what.
>>
>> When the VFP support code tests the state of the VFP hardware during boot,
>> it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
>> VFP handling code, and executes a VFP instruction.
>>
>> If this VFP instruction faults (eg, because there is no VFP hardware
>> present or we're not permitted to use it), it could end up resuming
>> execution in the middle of the 16-bit paired instruction because
>> regs->ARM_pc points in the middle of it.
>>
>> So vfp_testing_entry should at least store r2 into regs->ARM_pc to
>> guarantee resuming at the following instruction.
>>
>> So maybe the right answer is to store r2 into regs->ARM_pc in
>> process_exception in the VFP assembly code too?
>>
>> Or maybe we should just make it unconditional that whenever we have an
>> undefined instruction exception, the regs->ARM_pc value will always be
>> set for resuming execution after the faulted instruction. ?That makes
>> it consistent with r2 throughout the code in every case.
>
> So... this incrementally on top of the previous patch (which I've
> reproduced below as there's a subtle comment change in there wrt IRQ
> state.)
>
> This means we have consistent state - both r2 and regs->ARM_pc always
> point to the next instruction to be executed in every case, which means
> its easy to understand and remember while reading through the code.
>
> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> --- b/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -499,10 +499,11 @@
> ? ? ? ?blo ? ? __und_usr_unknown
> ?3: ? ? ldrht ? r0, [r4]
> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> - ? ? ? orr ? ? r0, r0, r5, lsl #16
> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
> ? ? ? ?@
> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
> ? ? ? ?@
> ?#else

Do we need to modify the VFP entry code to avoit the store to ARM_pc?

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-15 15:31                   ` Catalin Marinas
@ 2011-01-15 15:40                     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-15 15:40 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
> On 14 January 2011 17:30, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> >> I agree, this code needs some clean-up. Maybe for Undef we could unify
> >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> >> breakpoint code, I haven't checked).
> >>
> >> Otherwise just let the code handling the undef deal with the ARM/Thumb
> >> difference. For SVC, it makes sense to have different offsets as we
> >> always return to the next instruction.
> [...]
> > When the VFP support code tests the state of the VFP hardware during boot,
> > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> > VFP handling code, and executes a VFP instruction.
> >
> > If this VFP instruction faults (eg, because there is no VFP hardware
> > present or we're not permitted to use it), it could end up resuming
> > execution in the middle of the 16-bit paired instruction because
> > regs->ARM_pc points in the middle of it.
> 
> Yes, that's possible. We probably never tried a Thumb-2 kernel where
> VFP isn't present.
> 
> > Or maybe we should just make it unconditional that whenever we have an
> > undefined instruction exception, the regs->ARM_pc value will always be
> > set for resuming execution after the faulted instruction.  That makes
> > it consistent with r2 throughout the code in every case.
> 
> I have some comments below.
> 
> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> > index 2b46fea..5876eec 100644
> > --- a/arch/arm/kernel/entry-armv.S
> > +++ b/arch/arm/kernel/entry-armv.S
> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
> >        .align  5
> >  __und_usr:
> >        usr_entry
> > -
> > -       @
> > -       @ fall through to the emulation code, which returns using r9 if
> > -       @ it has emulated the instruction, or the more conventional lr
> > -       @ if we are to treat this as a real undefined instruction
> >        @
> > -       @  r0 - instruction
> > +       @ The emulation code returns using r9 if it has emulated the
> > +       @ instruction, or the more conventional lr if we are to treat
> > +       @ this as a real undefined instruction
> >        @
> >        adr     r9, BSYM(ret_from_exception)
> >        adr     lr, BSYM(__und_usr_unknown)
> > +       @
> > +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> > +       @ faulting instruction depending on Thumb mode.
> > +       @ r3 = regs->ARM_cpsr
> > +       @
> >        tst     r3, #PSR_T_BIT                  @ Thumb mode?
> > -       itet    eq                              @ explicit IT needed for the 1f label
> > +       itttt   eq                              @ explicit IT needed for the 1f label
> >        subeq   r4, r2, #4                      @ ARM instr at LR - 4
> > -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
> >  1:     ldreqt  r0, [r4]
> 
> The itttt above should just be itt. The reveq is conditionally
> compiled and beq doesn't necessarily need one.

It's a reveq, so I thought we should cover all the instructions with
an 'eq' conditional for thumb.

> 
> >  #ifdef CONFIG_CPU_ENDIAN_BE8
> >        reveq   r0, r0                          @ little endian instruction
> >  #endif
> > +       @
> > +       @ r0 = 32-bit ARM instruction which caused the exception
> > +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)
> 
> Is r2 here always the PC value following instruction? If the Thumb
> instruction was 32-bit, it just points in the middle of the faulting
> instruction.

Is the T bit ever zero in this case?  The code here is:

	tst	r3, #PSR_T_BIT
	subeq	r4, r2, #4
1:	ldreqt	r0, [r4]
	reveq	r0, r0
	beq	call_fpe

So, if !T, then we subtract 4 and load the instruction (which was the
faulting instruction).  So r2 is the following instruction.

Ah, maybe you're getting confused by the comment.  Should we put
an 'eq' suffix on the end of each line? ;)

> 
> > +       @ r4 = PC value for the faulting instruction
> > +       @
> >        beq     call_fpe
> > +
> >        @ Thumb instruction
> >  #if __LINUX_ARM_ARCH__ >= 7
> > +       sub     r4, r2, #2                      @ Thumb instr at LR - 2
> >  2:
> >  ARM(  ldrht   r5, [r4], #2    )
> >  THUMB(        ldrht   r5, [r4]        )
> > @@ -492,18 +500,19 @@ __und_usr:
> >  3:     ldrht   r0, [r4]
> >        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
> >        orr     r0, r0, r5, lsl #16
> > +       @
> > +       @ r0 = the two 16-bit Thumb instructions which caused the exception
> > +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> 
> That's correct.
> 
> > +       @ r4 = PC value for the first 16-bit Thumb instruction
> 
> I think r4 here points in the middle of tha faulting instruction for
> 32-bit Thumb.

You're right.
> 
> > +       @
> >  #else
> >        b       __und_usr_unknown
> >  #endif
> > - UNWIND(.fnend         )
> > + UNWIND(.fnend)
> >  ENDPROC(__und_usr)
> >
> > -       @
> > -       @ fallthrough to call_fpe
> > -       @
> > -
> >  /*
> > - * The out of line fixup for the ldrt above.
> > + * The out of line fixup for the ldrt instructions above.
> >  */
> >        .pushsection .fixup, "ax"
> >  4:     mov     pc, r9
> > @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
> >  * NEON handler code.
> >  *
> >  * Emulators may wish to make use of the following registers:
> > - *  r0  = instruction opcode.
> > - *  r2  = PC+4
> > + *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> > + *  r2  = PC value to resume execution after successful emulation
> >  *  r9  = normal "successful" return address
> > - *  r10 = this threads thread_info structure.
> > + *  r10 = this threads thread_info structure
> >  *  lr  = unrecognised instruction return address
> > + * IRQs disabled, FIQs enabled.
> >  */
> >        @
> >        @ Fall-through from Thumb-2 __und_usr
> > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > index ee57640..eeb9250 100644
> > --- a/arch/arm/kernel/traps.c
> > +++ b/arch/arm/kernel/traps.c
> > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> >        void __user *pc;
> >
> >        /*
> > -        * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> > -        * depending whether we're in Thumb mode or not.
> > -        * Correct this offset.
> > +        * According to the ARM ARM, the PC is 2 or 4 bytes ahead
> > +        * depending on Thumb mode.  Correct this offset so that
> > +        * regs->ARM_pc points at the faulting instruction.
> >         */
> >        regs->ARM_pc -= correction;
> >
> > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> > index 4fa9903..2bf6089 100644
> > --- a/arch/arm/vfp/entry.S
> > +++ b/arch/arm/vfp/entry.S
> > @@ -19,6 +19,14 @@
> >  #include <asm/vfpmacros.h>
> >  #include "../kernel/entry-header.S"
> >
> > +@ VFP entry point.
> > +@
> > +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> > +@  r2  = PC value to resume execution after successful emulation
> > +@  r9  = normal "successful" return address
> > +@  r10 = this threads thread_info structure
> > +@  lr  = unrecognised instruction return address
> > +@
> >  ENTRY(do_vfp)
> >  #ifdef CONFIG_PREEMPT
> >        ldr     r4, [r10, #TI_PREEMPT]  @ get preempt count
> > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> > index 9897dcf..7292921 100644
> > --- a/arch/arm/vfp/vfphw.S
> > +++ b/arch/arm/vfp/vfphw.S
> > @@ -61,13 +61,13 @@
> >
> >  @ VFP hardware support entry point.
> >  @
> > -@  r0  = faulted instruction
> > -@  r2  = faulted PC+4
> > -@  r9  = successful return
> > +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
> > +@  r2  = PC value to resume execution after successful emulation
> 
> That's right.
> 
> > +@  r9  = normal "successful" return address
> >  @  r10 = vfp_state union
> >  @  r11 = CPU number
> > -@  lr  = failure return
> > -
> > +@  lr  = unrecognised instruction return address
> > +@  IRQs enabled.
> >  ENTRY(vfp_support_entry)
> >        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
> >
> > @@ -138,9 +138,12 @@ check_for_exception:
> >                                        @ exception before retrying branch
> >                                        @ out before setting an FPEXC that
> >                                        @ stops us reading stuff
> > -       VFPFMXR FPEXC, r1               @ restore FPEXC last
> > -       sub     r2, r2, #4
> > -       str     r2, [sp, #S_PC]         @ retry the instruction
> > +       VFPFMXR FPEXC, r1               @ Restore FPEXC last
> > +       sub     r2, r2, #4              @ Retry current instruction - if Thumb
> > +       str     r2, [sp, #S_PC]         @ mode it's two 16-bit instructions,
> > +                                       @ else it's one 32-bit instruction, so
> > +                                       @ always subtract 4 from the following
> > +                                       @ instruction address.
> 
> I would say it's always a 32-bit instruction but made up of two 16-bit
> values to allow half-word alignment.

Do you have a suggested replacement text?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-15 15:40                     ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-15 15:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
> On 14 January 2011 17:30, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Jan 14, 2011 at 04:58:47PM +0000, Catalin Marinas wrote:
> >> I agree, this code needs some clean-up. Maybe for Undef we could unify
> >> the ARM and Thumb-2 offsets so that they are both 4 (it may confuse the
> >> breakpoint code, I haven't checked).
> >>
> >> Otherwise just let the code handling the undef deal with the ARM/Thumb
> >> difference. For SVC, it makes sense to have different offsets as we
> >> always return to the next instruction.
> [...]
> > When the VFP support code tests the state of the VFP hardware during boot,
> > it sets the VFP handler to point at vfp_testing_entry, bypassing the normal
> > VFP handling code, and executes a VFP instruction.
> >
> > If this VFP instruction faults (eg, because there is no VFP hardware
> > present or we're not permitted to use it), it could end up resuming
> > execution in the middle of the 16-bit paired instruction because
> > regs->ARM_pc points in the middle of it.
> 
> Yes, that's possible. We probably never tried a Thumb-2 kernel where
> VFP isn't present.
> 
> > Or maybe we should just make it unconditional that whenever we have an
> > undefined instruction exception, the regs->ARM_pc value will always be
> > set for resuming execution after the faulted instruction. ?That makes
> > it consistent with r2 throughout the code in every case.
> 
> I have some comments below.
> 
> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> > index 2b46fea..5876eec 100644
> > --- a/arch/arm/kernel/entry-armv.S
> > +++ b/arch/arm/kernel/entry-armv.S
> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
> > ? ? ? ?.align ?5
> > ?__und_usr:
> > ? ? ? ?usr_entry
> > -
> > - ? ? ? @
> > - ? ? ? @ fall through to the emulation code, which returns using r9 if
> > - ? ? ? @ it has emulated the instruction, or the more conventional lr
> > - ? ? ? @ if we are to treat this as a real undefined instruction
> > ? ? ? ?@
> > - ? ? ? @ ?r0 - instruction
> > + ? ? ? @ The emulation code returns using r9 if it has emulated the
> > + ? ? ? @ instruction, or the more conventional lr if we are to treat
> > + ? ? ? @ this as a real undefined instruction
> > ? ? ? ?@
> > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
> > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
> > + ? ? ? @
> > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
> > + ? ? ? @ faulting instruction depending on Thumb mode.
> > + ? ? ? @ r3 = regs->ARM_cpsr
> > + ? ? ? @
> > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
> > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
> > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
> > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> > ?1: ? ? ldreqt ?r0, [r4]
> 
> The itttt above should just be itt. The reveq is conditionally
> compiled and beq doesn't necessarily need one.

It's a reveq, so I thought we should cover all the instructions with
an 'eq' conditional for thumb.

> 
> > ?#ifdef CONFIG_CPU_ENDIAN_BE8
> > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
> > ?#endif
> > + ? ? ? @
> > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
> > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)
> 
> Is r2 here always the PC value following instruction? If the Thumb
> instruction was 32-bit, it just points in the middle of the faulting
> instruction.

Is the T bit ever zero in this case?  The code here is:

	tst	r3, #PSR_T_BIT
	subeq	r4, r2, #4
1:	ldreqt	r0, [r4]
	reveq	r0, r0
	beq	call_fpe

So, if !T, then we subtract 4 and load the instruction (which was the
faulting instruction).  So r2 is the following instruction.

Ah, maybe you're getting confused by the comment.  Should we put
an 'eq' suffix on the end of each line? ;)

> 
> > + ? ? ? @ r4 = PC value for the faulting instruction
> > + ? ? ? @
> > ? ? ? ?beq ? ? call_fpe
> > +
> > ? ? ? ?@ Thumb instruction
> > ?#if __LINUX_ARM_ARCH__ >= 7
> > + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
> > ?2:
> > ?ARM( ?ldrht ? r5, [r4], #2 ? ?)
> > ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?)
> > @@ -492,18 +500,19 @@ __und_usr:
> > ?3: ? ? ldrht ? r0, [r4]
> > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> > ? ? ? ?orr ? ? r0, r0, r5, lsl #16
> > + ? ? ? @
> > + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception
> > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> 
> That's correct.
> 
> > + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction
> 
> I think r4 here points in the middle of tha faulting instruction for
> 32-bit Thumb.

You're right.
> 
> > + ? ? ? @
> > ?#else
> > ? ? ? ?b ? ? ? __und_usr_unknown
> > ?#endif
> > - UNWIND(.fnend ? ? ? ? )
> > + UNWIND(.fnend)
> > ?ENDPROC(__und_usr)
> >
> > - ? ? ? @
> > - ? ? ? @ fallthrough to call_fpe
> > - ? ? ? @
> > -
> > ?/*
> > - * The out of line fixup for the ldrt above.
> > + * The out of line fixup for the ldrt instructions above.
> > ?*/
> > ? ? ? ?.pushsection .fixup, "ax"
> > ?4: ? ? mov ? ? pc, r9
> > @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
> > ?* NEON handler code.
> > ?*
> > ?* Emulators may wish to make use of the following registers:
> > - * ?r0 ?= instruction opcode.
> > - * ?r2 ?= PC+4
> > + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> > + * ?r2 ?= PC value to resume execution after successful emulation
> > ?* ?r9 ?= normal "successful" return address
> > - * ?r10 = this threads thread_info structure.
> > + * ?r10 = this threads thread_info structure
> > ?* ?lr ?= unrecognised instruction return address
> > + * IRQs disabled, FIQs enabled.
> > ?*/
> > ? ? ? ?@
> > ? ? ? ?@ Fall-through from Thumb-2 __und_usr
> > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > index ee57640..eeb9250 100644
> > --- a/arch/arm/kernel/traps.c
> > +++ b/arch/arm/kernel/traps.c
> > @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> > ? ? ? ?void __user *pc;
> >
> > ? ? ? ?/*
> > - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead,
> > - ? ? ? ?* depending whether we're in Thumb mode or not.
> > - ? ? ? ?* Correct this offset.
> > + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead
> > + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that
> > + ? ? ? ?* regs->ARM_pc points at the faulting instruction.
> > ? ? ? ? */
> > ? ? ? ?regs->ARM_pc -= correction;
> >
> > diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> > index 4fa9903..2bf6089 100644
> > --- a/arch/arm/vfp/entry.S
> > +++ b/arch/arm/vfp/entry.S
> > @@ -19,6 +19,14 @@
> > ?#include <asm/vfpmacros.h>
> > ?#include "../kernel/entry-header.S"
> >
> > +@ VFP entry point.
> > +@
> > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> > +@ ?r2 ?= PC value to resume execution after successful emulation
> > +@ ?r9 ?= normal "successful" return address
> > +@ ?r10 = this threads thread_info structure
> > +@ ?lr ?= unrecognised instruction return address
> > +@
> > ?ENTRY(do_vfp)
> > ?#ifdef CONFIG_PREEMPT
> > ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count
> > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> > index 9897dcf..7292921 100644
> > --- a/arch/arm/vfp/vfphw.S
> > +++ b/arch/arm/vfp/vfphw.S
> > @@ -61,13 +61,13 @@
> >
> > ?@ VFP hardware support entry point.
> > ?@
> > -@ ?r0 ?= faulted instruction
> > -@ ?r2 ?= faulted PC+4
> > -@ ?r9 ?= successful return
> > +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
> > +@ ?r2 ?= PC value to resume execution after successful emulation
> 
> That's right.
> 
> > +@ ?r9 ?= normal "successful" return address
> > ?@ ?r10 = vfp_state union
> > ?@ ?r11 = CPU number
> > -@ ?lr ?= failure return
> > -
> > +@ ?lr ?= unrecognised instruction return address
> > +@ ?IRQs enabled.
> > ?ENTRY(vfp_support_entry)
> > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
> >
> > @@ -138,9 +138,12 @@ check_for_exception:
> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch
> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that
> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff
> > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last
> > - ? ? ? sub ? ? r2, r2, #4
> > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction
> > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last
> > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb
> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions,
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address.
> 
> I would say it's always a 32-bit instruction but made up of two 16-bit
> values to allow half-word alignment.

Do you have a suggested replacement text?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-15 15:38                     ` Catalin Marinas
@ 2011-01-15 15:43                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-15 15:43 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-kernel, linux-arm-kernel, Colin Cross

On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote:
> On 14 January 2011 18:47, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> > --- b/arch/arm/kernel/entry-armv.S
> > +++ b/arch/arm/kernel/entry-armv.S
> > @@ -499,10 +499,11 @@
> >        blo     __und_usr_unknown
> >  3:     ldrht   r0, [r4]
> >        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
> > -       orr     r0, r0, r5, lsl #16
> > +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
> > +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
> >        @
> >        @ r0 = the two 16-bit Thumb instructions which caused the exception
> > -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> > +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
> >        @ r4 = PC value for the first 16-bit Thumb instruction
> >        @
> >  #else
> 
> Do we need to modify the VFP entry code to avoit the store to ARM_pc?

The one after the sub #4 instruction?

That's answered by the comments... "retry the instruction" and that
r2 = regs->ARM_pc in every case, and both r2 and regs->ARM_pc point
at the _following_ instruction...

I do hope this isn't a case that _more_ comments are making this more
confusing (which seems to be the way with documentation - the more
words you use, the more questions people have).  Maybe we should get
rid of all the comments instead?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-15 15:43                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-15 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote:
> On 14 January 2011 18:47, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> > --- b/arch/arm/kernel/entry-armv.S
> > +++ b/arch/arm/kernel/entry-armv.S
> > @@ -499,10 +499,11 @@
> > ? ? ? ?blo ? ? __und_usr_unknown
> > ?3: ? ? ldrht ? r0, [r4]
> > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
> > - ? ? ? orr ? ? r0, r0, r5, lsl #16
> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
> > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
> > ? ? ? ?@
> > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
> > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
> > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
> > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
> > ? ? ? ?@
> > ?#else
> 
> Do we need to modify the VFP entry code to avoit the store to ARM_pc?

The one after the sub #4 instruction?

That's answered by the comments... "retry the instruction" and that
r2 = regs->ARM_pc in every case, and both r2 and regs->ARM_pc point
at the _following_ instruction...

I do hope this isn't a case that _more_ comments are making this more
confusing (which seems to be the way with documentation - the more
words you use, the more questions people have).  Maybe we should get
rid of all the comments instead?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-15 15:40                     ` Russell King - ARM Linux
@ 2011-01-16 11:49                       ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 11:49 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Catalin Marinas, Colin Cross, linux-arm-kernel, linux-kernel

On Saturday, 15 January 2011, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 17:30, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> > index 2b46fea..5876eec 100644
>> > --- a/arch/arm/kernel/entry-armv.S
>> > +++ b/arch/arm/kernel/entry-armv.S
>> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>> >        .align  5
>> >  __und_usr:
>> >        usr_entry
>> > -
>> > -       @
>> > -       @ fall through to the emulation code, which returns using r9 if
>> > -       @ it has emulated the instruction, or the more conventional lr
>> > -       @ if we are to treat this as a real undefined instruction
>> >        @
>> > -       @  r0 - instruction
>> > +       @ The emulation code returns using r9 if it has emulated the
>> > +       @ instruction, or the more conventional lr if we are to treat
>> > +       @ this as a real undefined instruction
>> >        @
>> >        adr     r9, BSYM(ret_from_exception)
>> >        adr     lr, BSYM(__und_usr_unknown)
>> > +       @
>> > +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>> > +       @ faulting instruction depending on Thumb mode.
>> > +       @ r3 = regs->ARM_cpsr
>> > +       @
>> >        tst     r3, #PSR_T_BIT                  @ Thumb mode?
>> > -       itet    eq                              @ explicit IT needed for the 1f label
>> > +       itttt   eq                              @ explicit IT needed for the 1f label
>> >        subeq   r4, r2, #4                      @ ARM instr at LR - 4
>> > -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
>> >  1:     ldreqt  r0, [r4]
>>
>> The itttt above should just be itt. The reveq is conditionally
>> compiled and beq doesn't necessarily need one.
>
> It's a reveq, so I thought we should cover all the instructions with
> an 'eq' conditional for thumb.

If the it instruction doesn't cover all instructions, gas generates
some more its. But in this case, for little endian, the it instruction
covers more since reveq isn't included and having the beq not last in
the block I think is unpredictable. If you really want to optimise the
big endian case not to have an additional it generated by gas, you can
write ittt so that beq is included with little endian but not with big
endian. I wouldn't bother much for an extra it anyway.

>> >  #ifdef CONFIG_CPU_ENDIAN_BE8
>> >        reveq   r0, r0                          @ little endian instruction
>> >  #endif
>> > +       @
>> > +       @ r0 = 32-bit ARM instruction which caused the exception
>> > +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>>
>> Is r2 here always the PC value following instruction? If the Thumb
>> instruction was 32-bit, it just points in the middle of the faulting
>> instruction.
>
> Is the T bit ever zero in this case?  The code here is:
>
>         tst     r3, #PSR_T_BIT
>         subeq   r4, r2, #4
> 1:      ldreqt  r0, [r4]
>         reveq   r0, r0
>         beq     call_fpe

You can have the T bit set but the instruction a 32-bit Thumb in which
case r2 is in the middle of such instruction rather than the next.
Unless you only refer to the ARM mode, in which case the comment is
fine.

> So, if !T, then we subtract 4 and load the instruction (which was the
> faulting instruction).  So r2 is the following instruction.
>
> Ah, maybe you're getting confused by the comment.  Should we put
> an 'eq' suffix on the end of each line? ;)

Maybe mention that this is ARM. I think documenting this code is
difficult anyway. I found myself not reading the comments at all when
revisiting this code :) but they may be useful for others.


-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-16 11:49                       ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 11:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Saturday, 15 January 2011, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 17:30, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> > index 2b46fea..5876eec 100644
>> > --- a/arch/arm/kernel/entry-armv.S
>> > +++ b/arch/arm/kernel/entry-armv.S
>> > @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>> > ? ? ? ?.align ?5
>> > ?__und_usr:
>> > ? ? ? ?usr_entry
>> > -
>> > - ? ? ? @
>> > - ? ? ? @ fall through to the emulation code, which returns using r9 if
>> > - ? ? ? @ it has emulated the instruction, or the more conventional lr
>> > - ? ? ? @ if we are to treat this as a real undefined instruction
>> > ? ? ? ?@
>> > - ? ? ? @ ?r0 - instruction
>> > + ? ? ? @ The emulation code returns using r9 if it has emulated the
>> > + ? ? ? @ instruction, or the more conventional lr if we are to treat
>> > + ? ? ? @ this as a real undefined instruction
>> > ? ? ? ?@
>> > ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
>> > ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
>> > + ? ? ? @
>> > + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>> > + ? ? ? @ faulting instruction depending on Thumb mode.
>> > + ? ? ? @ r3 = regs->ARM_cpsr
>> > + ? ? ? @
>> > ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
>> > - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>> > + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>> > ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
>> > - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
>> > ?1: ? ? ldreqt ?r0, [r4]
>>
>> The itttt above should just be itt. The reveq is conditionally
>> compiled and beq doesn't necessarily need one.
>
> It's a reveq, so I thought we should cover all the instructions with
> an 'eq' conditional for thumb.

If the it instruction doesn't cover all instructions, gas generates
some more its. But in this case, for little endian, the it instruction
covers more since reveq isn't included and having the beq not last in
the block I think is unpredictable. If you really want to optimise the
big endian case not to have an additional it generated by gas, you can
write ittt so that beq is included with little endian but not with big
endian. I wouldn't bother much for an extra it anyway.

>> > ?#ifdef CONFIG_CPU_ENDIAN_BE8
>> > ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
>> > ?#endif
>> > + ? ? ? @
>> > + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
>> > + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>>
>> Is r2 here always the PC value following instruction? If the Thumb
>> instruction was 32-bit, it just points in the middle of the faulting
>> instruction.
>
> Is the T bit ever zero in this case? ?The code here is:
>
> ?? ? ? ?tst ? ? r3, #PSR_T_BIT
> ?? ? ? ?subeq ? r4, r2, #4
> 1: ? ? ?ldreqt ?r0, [r4]
> ?? ? ? ?reveq ? r0, r0
> ?? ? ? ?beq ? ? call_fpe

You can have the T bit set but the instruction a 32-bit Thumb in which
case r2 is in the middle of such instruction rather than the next.
Unless you only refer to the ARM mode, in which case the comment is
fine.

> So, if !T, then we subtract 4 and load the instruction (which was the
> faulting instruction). ?So r2 is the following instruction.
>
> Ah, maybe you're getting confused by the comment. ?Should we put
> an 'eq' suffix on the end of each line? ;)

Maybe mention that this is ARM. I think documenting this code is
difficult anyway. I found myself not reading the comments at all when
revisiting this code :) but they may be useful for others.


-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-15 15:43                       ` Russell King - ARM Linux
@ 2011-01-16 11:51                         ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 11:51 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, Colin Cross

On Saturday, 15 January 2011, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 18:47, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> > --- b/arch/arm/kernel/entry-armv.S
>> > +++ b/arch/arm/kernel/entry-armv.S
>> > @@ -499,10 +499,11 @@
>> >        blo     __und_usr_unknown
>> >  3:     ldrht   r0, [r4]
>> >        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>> > -       orr     r0, r0, r5, lsl #16
>> > +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
>> > +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
>> >        @
>> >        @ r0 = the two 16-bit Thumb instructions which caused the exception
>> > -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> > +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>> >        @ r4 = PC value for the first 16-bit Thumb instruction
>> >        @
>> >  #else
>>
>> Do we need to modify the VFP entry code to avoit the store to ARM_pc?
>
> The one after the sub #4 instruction?

No, I misread the code, we still need the one after sub #4.


-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-16 11:51                         ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 11:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Saturday, 15 January 2011, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:38:16PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 18:47, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>> > --- b/arch/arm/kernel/entry-armv.S
>> > +++ b/arch/arm/kernel/entry-armv.S
>> > @@ -499,10 +499,11 @@
>> > ? ? ? ?blo ? ? __und_usr_unknown
>> > ?3: ? ? ldrht ? r0, [r4]
>> > ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
>> > - ? ? ? orr ? ? r0, r0, r5, lsl #16
>> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
>> > + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
>> > ? ? ? ?@
>> > ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
>> > - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>> > + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>> > ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
>> > ? ? ? ?@
>> > ?#else
>>
>> Do we need to modify the VFP entry code to avoit the store to ARM_pc?
>
> The one after the sub #4 instruction?

No, I misread the code, we still need the one after sub #4.


-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-15 15:40                     ` Russell King - ARM Linux
@ 2011-01-16 21:25                       ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 21:25 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On 15 January 2011 15:40, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 17:30, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > +@  r9  = normal "successful" return address
>> >  @  r10 = vfp_state union
>> >  @  r11 = CPU number
>> > -@  lr  = failure return
>> > -
>> > +@  lr  = unrecognised instruction return address
>> > +@  IRQs enabled.
>> >  ENTRY(vfp_support_entry)
>> >        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>> >
>> > @@ -138,9 +138,12 @@ check_for_exception:
>> >                                        @ exception before retrying branch
>> >                                        @ out before setting an FPEXC that
>> >                                        @ stops us reading stuff
>> > -       VFPFMXR FPEXC, r1               @ restore FPEXC last
>> > -       sub     r2, r2, #4
>> > -       str     r2, [sp, #S_PC]         @ retry the instruction
>> > +       VFPFMXR FPEXC, r1               @ Restore FPEXC last
>> > +       sub     r2, r2, #4              @ Retry current instruction - if Thumb
>> > +       str     r2, [sp, #S_PC]         @ mode it's two 16-bit instructions,
>> > +                                       @ else it's one 32-bit instruction, so
>> > +                                       @ always subtract 4 from the following
>> > +                                       @ instruction address.
>>
>> I would say it's always a 32-bit instruction but made up of two 16-bit
>> values to allow half-word alignment.
>
> Do you have a suggested replacement text?

Maybe something like: Retry the current VFP instruction (32-bit in
both ARM and Thumb modes).

(I was wondering whether we can get on the above code path with
asynchronous VFP exceptions where the interrupted instruction may not
be the VFP one. But I think all Thumb-2 processors these days generate
synchronous exceptions)

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-16 21:25                       ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-16 21:25 UTC (permalink / raw)
  To: linux-arm-kernel

On 15 January 2011 15:40, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
>> On 14 January 2011 17:30, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > +@ ?r9 ?= normal "successful" return address
>> > ?@ ?r10 = vfp_state union
>> > ?@ ?r11 = CPU number
>> > -@ ?lr ?= failure return
>> > -
>> > +@ ?lr ?= unrecognised instruction return address
>> > +@ ?IRQs enabled.
>> > ?ENTRY(vfp_support_entry)
>> > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>> >
>> > @@ -138,9 +138,12 @@ check_for_exception:
>> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch
>> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that
>> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff
>> > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last
>> > - ? ? ? sub ? ? r2, r2, #4
>> > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction
>> > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last
>> > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb
>> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions,
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address.
>>
>> I would say it's always a 32-bit instruction but made up of two 16-bit
>> values to allow half-word alignment.
>
> Do you have a suggested replacement text?

Maybe something like: Retry the current VFP instruction (32-bit in
both ARM and Thumb modes).

(I was wondering whether we can get on the above code path with
asynchronous VFP exceptions where the interrupted instruction may not
be the VFP one. But I think all Thumb-2 processors these days generate
synchronous exceptions)

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-16 21:25                       ` Catalin Marinas
@ 2011-01-23 15:46                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-23 15:46 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Sun, Jan 16, 2011 at 09:25:00PM +0000, Catalin Marinas wrote:
> On 15 January 2011 15:40, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
> >> On 14 January 2011 17:30, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > +@  r9  = normal "successful" return address
> >> >  @  r10 = vfp_state union
> >> >  @  r11 = CPU number
> >> > -@  lr  = failure return
> >> > -
> >> > +@  lr  = unrecognised instruction return address
> >> > +@  IRQs enabled.
> >> >  ENTRY(vfp_support_entry)
> >> >        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
> >> >
> >> > @@ -138,9 +138,12 @@ check_for_exception:
> >> >                                        @ exception before retrying branch
> >> >                                        @ out before setting an FPEXC that
> >> >                                        @ stops us reading stuff
> >> > -       VFPFMXR FPEXC, r1               @ restore FPEXC last
> >> > -       sub     r2, r2, #4
> >> > -       str     r2, [sp, #S_PC]         @ retry the instruction
> >> > +       VFPFMXR FPEXC, r1               @ Restore FPEXC last
> >> > +       sub     r2, r2, #4              @ Retry current instruction - if Thumb
> >> > +       str     r2, [sp, #S_PC]         @ mode it's two 16-bit instructions,
> >> > +                                       @ else it's one 32-bit instruction, so
> >> > +                                       @ always subtract 4 from the following
> >> > +                                       @ instruction address.
> >>
> >> I would say it's always a 32-bit instruction but made up of two 16-bit
> >> values to allow half-word alignment.
> >
> > Do you have a suggested replacement text?
> 
> Maybe something like: Retry the current VFP instruction (32-bit in
> both ARM and Thumb modes).
> 
> (I was wondering whether we can get on the above code path with
> asynchronous VFP exceptions where the interrupted instruction may not
> be the VFP one. But I think all Thumb-2 processors these days generate
> synchronous exceptions)

I don't think so from my understanding.

Firstly, in order to raise an undefined instruction fault, the coprocessor
has to be targetted with an instruction for it.  So the faulting instruction
here must always be for the VFP coprocessor.

Secondly, in order to get an asynchronous exception, the VFP hardware has
to be enabled.  We only retry the instruction if the VFP hardware wasn't
enabled.

So, here's the revised patch.  Ack?

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the second 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,15 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@  IRQs disabled.
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,9 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction, 32-bit
+	str	r2, [sp, #S_PC]		@ in both ARM and Thumb modes.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-23 15:46                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-23 15:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jan 16, 2011 at 09:25:00PM +0000, Catalin Marinas wrote:
> On 15 January 2011 15:40, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Sat, Jan 15, 2011 at 03:31:04PM +0000, Catalin Marinas wrote:
> >> On 14 January 2011 17:30, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > +@ ?r9 ?= normal "successful" return address
> >> > ?@ ?r10 = vfp_state union
> >> > ?@ ?r11 = CPU number
> >> > -@ ?lr ?= failure return
> >> > -
> >> > +@ ?lr ?= unrecognised instruction return address
> >> > +@ ?IRQs enabled.
> >> > ?ENTRY(vfp_support_entry)
> >> > ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
> >> >
> >> > @@ -138,9 +138,12 @@ check_for_exception:
> >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ exception before retrying branch
> >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ out before setting an FPEXC that
> >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ stops us reading stuff
> >> > - ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ restore FPEXC last
> >> > - ? ? ? sub ? ? r2, r2, #4
> >> > - ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ retry the instruction
> >> > + ? ? ? VFPFMXR FPEXC, r1 ? ? ? ? ? ? ? @ Restore FPEXC last
> >> > + ? ? ? sub ? ? r2, r2, #4 ? ? ? ? ? ? ?@ Retry current instruction - if Thumb
> >> > + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? @ mode it's two 16-bit instructions,
> >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ else it's one 32-bit instruction, so
> >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ always subtract 4 from the following
> >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? @ instruction address.
> >>
> >> I would say it's always a 32-bit instruction but made up of two 16-bit
> >> values to allow half-word alignment.
> >
> > Do you have a suggested replacement text?
> 
> Maybe something like: Retry the current VFP instruction (32-bit in
> both ARM and Thumb modes).
> 
> (I was wondering whether we can get on the above code path with
> asynchronous VFP exceptions where the interrupted instruction may not
> be the VFP one. But I think all Thumb-2 processors these days generate
> synchronous exceptions)

I don't think so from my understanding.

Firstly, in order to raise an undefined instruction fault, the coprocessor
has to be targetted with an instruction for it.  So the faulting instruction
here must always be for the VFP coprocessor.

Secondly, in order to get an asynchronous exception, the VFP hardware has
to be enabled.  We only retry the instruction if the VFP hardware wasn't
enabled.

So, here's the revised patch.  Ack?

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 2b46fea..5876eec 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
 	.align	5
 __und_usr:
 	usr_entry
-
-	@
-	@ fall through to the emulation code, which returns using r9 if
-	@ it has emulated the instruction, or the more conventional lr
-	@ if we are to treat this as a real undefined instruction
 	@
-	@  r0 - instruction
+	@ The emulation code returns using r9 if it has emulated the
+	@ instruction, or the more conventional lr if we are to treat
+	@ this as a real undefined instruction
 	@
 	adr	r9, BSYM(ret_from_exception)
 	adr	lr, BSYM(__und_usr_unknown)
+	@
+	@ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
+	@ faulting instruction depending on Thumb mode.
+	@ r3 = regs->ARM_cpsr
+	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itet	eq				@ explicit IT needed for the 1f label
+	itttt	eq				@ explicit IT needed for the 1f label
 	subeq	r4, r2, #4			@ ARM instr at LR - 4
-	subne	r4, r2, #2			@ Thumb instr at LR - 2
 1:	ldreqt	r0, [r4]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	@
+	@ r0 = 32-bit ARM instruction which caused the exception
+	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
+	@ r4 = PC value for the faulting instruction
+	@
 	beq	call_fpe
+
 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
+	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
  ARM(	ldrht	r5, [r4], #2	)
  THUMB(	ldrht	r5, [r4]	)
@@ -492,18 +500,19 @@ __und_usr:
 3:	ldrht	r0, [r4]
 	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
+	@
+	@ r0 = the two 16-bit Thumb instructions which caused the exception
+	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
+	@ r4 = PC value for the second 16-bit Thumb instruction
+	@
 #else
 	b	__und_usr_unknown
 #endif
- UNWIND(.fnend		)
+ UNWIND(.fnend)
 ENDPROC(__und_usr)
 
-	@
-	@ fallthrough to call_fpe
-	@
-
 /*
- * The out of line fixup for the ldrt above.
+ * The out of line fixup for the ldrt instructions above.
  */
 	.pushsection .fixup, "ax"
 4:	mov	pc, r9
@@ -534,11 +543,12 @@ ENDPROC(__und_usr)
  * NEON handler code.
  *
  * Emulators may wish to make use of the following registers:
- *  r0  = instruction opcode.
- *  r2  = PC+4
+ *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+ *  r2  = PC value to resume execution after successful emulation
  *  r9  = normal "successful" return address
- *  r10 = this threads thread_info structure.
+ *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
+ * IRQs disabled, FIQs enabled.
  */
 	@
 	@ Fall-through from Thumb-2 __und_usr
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index ee57640..eeb9250 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	void __user *pc;
 
 	/*
-	 * According to the ARM ARM, PC is 2 or 4 bytes ahead,
-	 * depending whether we're in Thumb mode or not.
-	 * Correct this offset.
+	 * According to the ARM ARM, the PC is 2 or 4 bytes ahead
+	 * depending on Thumb mode.  Correct this offset so that
+	 * regs->ARM_pc points at the faulting instruction.
 	 */
 	regs->ARM_pc -= correction;
 
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 4fa9903..2bf6089 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -19,6 +19,15 @@
 #include <asm/vfpmacros.h>
 #include "../kernel/entry-header.S"
 
+@ VFP entry point.
+@
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
+@  r10 = this threads thread_info structure
+@  lr  = unrecognised instruction return address
+@  IRQs disabled.
+@
 ENTRY(do_vfp)
 #ifdef CONFIG_PREEMPT
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 9897dcf..7292921 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -61,13 +61,13 @@
 
 @ VFP hardware support entry point.
 @
-@  r0  = faulted instruction
-@  r2  = faulted PC+4
-@  r9  = successful return
+@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
+@  r2  = PC value to resume execution after successful emulation
+@  r9  = normal "successful" return address
 @  r10 = vfp_state union
 @  r11 = CPU number
-@  lr  = failure return
-
+@  lr  = unrecognised instruction return address
+@  IRQs enabled.
 ENTRY(vfp_support_entry)
 	DBGSTR3	"instr %08x pc %08x state %p", r0, r2, r10
 
@@ -138,9 +138,9 @@ check_for_exception:
 					@ exception before retrying branch
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
-	VFPFMXR	FPEXC, r1		@ restore FPEXC last
-	sub	r2, r2, #4
-	str	r2, [sp, #S_PC]		@ retry the instruction
+	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
+	sub	r2, r2, #4		@ Retry current instruction, 32-bit
+	str	r2, [sp, #S_PC]		@ in both ARM and Thumb modes.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-16 11:49                       ` Catalin Marinas
@ 2011-01-23 15:51                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-23 15:51 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote:
> On Saturday, 15 January 2011, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > It's a reveq, so I thought we should cover all the instructions with
> > an 'eq' conditional for thumb.
> 
> If the it instruction doesn't cover all instructions, gas generates
> some more its. But in this case, for little endian, the it instruction
> covers more since reveq isn't included and having the beq not last in
> the block I think is unpredictable. If you really want to optimise the
> big endian case not to have an additional it generated by gas, you can
> write ittt so that beq is included with little endian but not with big
> endian. I wouldn't bother much for an extra it anyway.

I think the itttt is correct.  Unless you wish to illustrate why you
think it's wrong by pasting the code and showing why you think the
beq isn't the last instruction...

> >         tst     r3, #PSR_T_BIT
> >         subeq   r4, r2, #4
> > 1:      ldreqt  r0, [r4]
> >         reveq   r0, r0
> >         beq     call_fpe
> 
> You can have the T bit set but the instruction a 32-bit Thumb in which
> case r2 is in the middle of such instruction rather than the next.
> Unless you only refer to the ARM mode, in which case the comment is
> fine.

So?  I'm confused why you're making a mountain out of apparantly
nothing.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-23 15:51                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-23 15:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote:
> On Saturday, 15 January 2011, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > It's a reveq, so I thought we should cover all the instructions with
> > an 'eq' conditional for thumb.
> 
> If the it instruction doesn't cover all instructions, gas generates
> some more its. But in this case, for little endian, the it instruction
> covers more since reveq isn't included and having the beq not last in
> the block I think is unpredictable. If you really want to optimise the
> big endian case not to have an additional it generated by gas, you can
> write ittt so that beq is included with little endian but not with big
> endian. I wouldn't bother much for an extra it anyway.

I think the itttt is correct.  Unless you wish to illustrate why you
think it's wrong by pasting the code and showing why you think the
beq isn't the last instruction...

> > ?? ? ? ?tst ? ? r3, #PSR_T_BIT
> > ?? ? ? ?subeq ? r4, r2, #4
> > 1: ? ? ?ldreqt ?r0, [r4]
> > ?? ? ? ?reveq ? r0, r0
> > ?? ? ? ?beq ? ? call_fpe
> 
> You can have the T bit set but the instruction a 32-bit Thumb in which
> case r2 is in the middle of such instruction rather than the next.
> Unless you only refer to the ARM mode, in which case the comment is
> fine.

So?  I'm confused why you're making a mountain out of apparantly
nothing.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-23 15:51                         ` Russell King - ARM Linux
@ 2011-01-25 13:19                           ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-25 13:19 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Sun, 2011-01-23 at 15:51 +0000, Russell King - ARM Linux wrote:
> On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote:
> > On Saturday, 15 January 2011, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > It's a reveq, so I thought we should cover all the instructions with
> > > an 'eq' conditional for thumb.
> >
> > If the it instruction doesn't cover all instructions, gas generates
> > some more its. But in this case, for little endian, the it instruction
> > covers more since reveq isn't included and having the beq not last in
> > the block I think is unpredictable. If you really want to optimise the
> > big endian case not to have an additional it generated by gas, you can
> > write ittt so that beq is included with little endian but not with big
> > endian. I wouldn't bother much for an extra it anyway.
> 
> I think the itttt is correct.  Unless you wish to illustrate why you
> think it's wrong by pasting the code and showing why you think the
> beq isn't the last instruction...

With your patch applied (visually), the code becomes (removed the
comment before beq):

	tst	r3, #PSR_T_BIT			@ Thumb mode?
	itttt	eq				@ explicit IT needed for the 1f label
	subeq	r4, r2, #4			@ ARM instr at LR - 4
1:	ldreqt	r0, [r4]
#ifdef CONFIG_CPU_ENDIAN_BE8
	reveq	r0, r0				@ little endian instruction
#endif
	beq	call_fpe

The little endian case only has 3 conditional instructions:

	subeq	r4, r2, #4			@ ARM instr at LR - 4
1:	ldreqt	r0, [r4]
	beq	call_fpe

but you add itttt (if-then-then-then-then) which expects 4 conditional
instructions, IOW beq is no longer the last. So cutting a 't' would sort
it out (unless I misread your patch).

> > >         tst     r3, #PSR_T_BIT
> > >         subeq   r4, r2, #4
> > > 1:      ldreqt  r0, [r4]
> > >         reveq   r0, r0
> > >         beq     call_fpe
> >
> > You can have the T bit set but the instruction a 32-bit Thumb in which
> > case r2 is in the middle of such instruction rather than the next.
> > Unless you only refer to the ARM mode, in which case the comment is
> > fine.
> 
> So?  I'm confused why you're making a mountain out of apparantly
> nothing.

No issue really, the comment can stay as you wrote it (I don't read them
anyway :)).

-- 
Catalin



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-25 13:19                           ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-25 13:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 2011-01-23 at 15:51 +0000, Russell King - ARM Linux wrote:
> On Sun, Jan 16, 2011 at 11:49:21AM +0000, Catalin Marinas wrote:
> > On Saturday, 15 January 2011, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > It's a reveq, so I thought we should cover all the instructions with
> > > an 'eq' conditional for thumb.
> >
> > If the it instruction doesn't cover all instructions, gas generates
> > some more its. But in this case, for little endian, the it instruction
> > covers more since reveq isn't included and having the beq not last in
> > the block I think is unpredictable. If you really want to optimise the
> > big endian case not to have an additional it generated by gas, you can
> > write ittt so that beq is included with little endian but not with big
> > endian. I wouldn't bother much for an extra it anyway.
> 
> I think the itttt is correct.  Unless you wish to illustrate why you
> think it's wrong by pasting the code and showing why you think the
> beq isn't the last instruction...

With your patch applied (visually), the code becomes (removed the
comment before beq):

	tst	r3, #PSR_T_BIT			@ Thumb mode?
	itttt	eq				@ explicit IT needed for the 1f label
	subeq	r4, r2, #4			@ ARM instr at LR - 4
1:	ldreqt	r0, [r4]
#ifdef CONFIG_CPU_ENDIAN_BE8
	reveq	r0, r0				@ little endian instruction
#endif
	beq	call_fpe

The little endian case only has 3 conditional instructions:

	subeq	r4, r2, #4			@ ARM instr at LR - 4
1:	ldreqt	r0, [r4]
	beq	call_fpe

but you add itttt (if-then-then-then-then) which expects 4 conditional
instructions, IOW beq is no longer the last. So cutting a 't' would sort
it out (unless I misread your patch).

> > >         tst     r3, #PSR_T_BIT
> > >         subeq   r4, r2, #4
> > > 1:      ldreqt  r0, [r4]
> > >         reveq   r0, r0
> > >         beq     call_fpe
> >
> > You can have the T bit set but the instruction a 32-bit Thumb in which
> > case r2 is in the middle of such instruction rather than the next.
> > Unless you only refer to the ARM mode, in which case the comment is
> > fine.
> 
> So?  I'm confused why you're making a mountain out of apparantly
> nothing.

No issue really, the comment can stay as you wrote it (I don't read them
anyway :)).

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-23 15:46                         ` Russell King - ARM Linux
@ 2011-01-25 13:45                           ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-25 13:45 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Colin Cross, linux-arm-kernel, linux-kernel

On Sun, 2011-01-23 at 15:46 +0000, Russell King - ARM Linux wrote:
> So, here's the revised patch.  Ack?

Apart from the itttt instruction, it looks fine. So once we agree on
that, you can add my Ack.

-- 
Catalin



^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-25 13:45                           ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2011-01-25 13:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 2011-01-23 at 15:46 +0000, Russell King - ARM Linux wrote:
> So, here's the revised patch.  Ack?

Apart from the itttt instruction, it looks fine. So once we agree on
that, you can add my Ack.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-14 19:51                       ` Colin Cross
@ 2011-01-25 23:33                         ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-25 23:33 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Fri, Jan 14, 2011 at 11:51 AM, Colin Cross <ccross@android.com> wrote:
> On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
>> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> So... this incrementally on top of the previous patch (which I've
>>> reproduced below as there's a subtle comment change in there wrt IRQ
>>> state.)
>>>
>>> This means we have consistent state - both r2 and regs->ARM_pc always
>>> point to the next instruction to be executed in every case, which means
>>> its easy to understand and remember while reading through the code.
>>>
>>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>>> --- b/arch/arm/kernel/entry-armv.S
>>> +++ b/arch/arm/kernel/entry-armv.S
>>> @@ -499,10 +499,11 @@
>>>        blo     __und_usr_unknown
>>>  3:     ldrht   r0, [r4]
>>>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>>> -       orr     r0, r0, r5, lsl #16
>>> +       str     r2, [sp, #S_PC]                 @ it's a 2x16bit instr, update
>>> +       orr     r0, r0, r5, lsl #16             @  regs->ARM_pc
>>>        @
>>>        @ r0 = the two 16-bit Thumb instructions which caused the exception
>>> -       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>>> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>>>        @ r4 = PC value for the first 16-bit Thumb instruction
>>>        @
>>>  #else
>>>
>>> 8<-x-x-
>>>
>>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>>> index 2b46fea..5876eec 100644
>>> --- a/arch/arm/kernel/entry-armv.S
>>> +++ b/arch/arm/kernel/entry-armv.S
>>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>>>        .align  5
>>>  __und_usr:
>>>        usr_entry
>>> -
>>> -       @
>>> -       @ fall through to the emulation code, which returns using r9 if
>>> -       @ it has emulated the instruction, or the more conventional lr
>>> -       @ if we are to treat this as a real undefined instruction
>>>        @
>>> -       @  r0 - instruction
>>> +       @ The emulation code returns using r9 if it has emulated the
>>> +       @ instruction, or the more conventional lr if we are to treat
>>> +       @ this as a real undefined instruction
>>>        @
>>>        adr     r9, BSYM(ret_from_exception)
>>>        adr     lr, BSYM(__und_usr_unknown)
>>> +       @
>>> +       @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>>> +       @ faulting instruction depending on Thumb mode.
>>> +       @ r3 = regs->ARM_cpsr
>>> +       @
>>>        tst     r3, #PSR_T_BIT                  @ Thumb mode?
>>> -       itet    eq                              @ explicit IT needed for the 1f label
>>> +       itttt   eq                              @ explicit IT needed for the 1f label
>>>        subeq   r4, r2, #4                      @ ARM instr at LR - 4
>>> -       subne   r4, r2, #2                      @ Thumb instr at LR - 2
>>>  1:     ldreqt  r0, [r4]
>>>  #ifdef CONFIG_CPU_ENDIAN_BE8
>>>        reveq   r0, r0                          @ little endian instruction
>>>  #endif
>>> +       @
>>> +       @ r0 = 32-bit ARM instruction which caused the exception
>>> +       @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>>> +       @ r4 = PC value for the faulting instruction
>>> +       @
>>>        beq     call_fpe
>>> +
>>>        @ Thumb instruction
>>>  #if __LINUX_ARM_ARCH__ >= 7
>>> +       sub     r4, r2, #2                      @ Thumb instr at LR - 2
>>>  2:
>>>  ARM(  ldrht   r5, [r4], #2    )
>>>  THUMB(        ldrht   r5, [r4]        )
>>> @@ -492,18 +500,19 @@ __und_usr:
>>>  3:     ldrht   r0, [r4]
>>>        add     r2, r2, #2                      @ r2 is PC + 2, make it PC + 4
>>>        orr     r0, r0, r5, lsl #16
>>> +       @
>>> +       @ r0 = the two 16-bit Thumb instructions which caused the exception
>>> +       @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>>> +       @ r4 = PC value for the first 16-bit Thumb instruction
>>> +       @
>>>  #else
>>>        b       __und_usr_unknown
>>>  #endif
>>> - UNWIND(.fnend         )
>>> + UNWIND(.fnend)
>>>  ENDPROC(__und_usr)
>>>
>>> -       @
>>> -       @ fallthrough to call_fpe
>>> -       @
>>> -
>>>  /*
>>> - * The out of line fixup for the ldrt above.
>>> + * The out of line fixup for the ldrt instructions above.
>>>  */
>>>        .pushsection .fixup, "ax"
>>>  4:     mov     pc, r9
>>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>>>  * NEON handler code.
>>>  *
>>>  * Emulators may wish to make use of the following registers:
>>> - *  r0  = instruction opcode.
>>> - *  r2  = PC+4
>>> + *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> + *  r2  = PC value to resume execution after successful emulation
>>>  *  r9  = normal "successful" return address
>>> - *  r10 = this threads thread_info structure.
>>> + *  r10 = this threads thread_info structure
>>>  *  lr  = unrecognised instruction return address
>>> + * IRQs disabled, FIQs enabled.
>>>  */
>>>        @
>>>        @ Fall-through from Thumb-2 __und_usr
>>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
>>> index ee57640..eeb9250 100644
>>> --- a/arch/arm/kernel/traps.c
>>> +++ b/arch/arm/kernel/traps.c
>>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>>>        void __user *pc;
>>>
>>>        /*
>>> -        * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>>> -        * depending whether we're in Thumb mode or not.
>>> -        * Correct this offset.
>>> +        * According to the ARM ARM, the PC is 2 or 4 bytes ahead
>>> +        * depending on Thumb mode.  Correct this offset so that
>>> +        * regs->ARM_pc points at the faulting instruction.
>>>         */
>>>        regs->ARM_pc -= correction;
>>>
>>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
>>> index 4fa9903..2bf6089 100644
>>> --- a/arch/arm/vfp/entry.S
>>> +++ b/arch/arm/vfp/entry.S
>>> @@ -19,6 +19,15 @@
>>>  #include <asm/vfpmacros.h>
>>>  #include "../kernel/entry-header.S"
>>>
>>> +@ VFP entry point.
>>> +@
>>> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> +@  r2  = PC value to resume execution after successful emulation
>>> +@  r9  = normal "successful" return address
>>> +@  r10 = this threads thread_info structure
>>> +@  lr  = unrecognised instruction return address
>>> +@  IRQs disabled.
>>> +@
>>>  ENTRY(do_vfp)
>>>  #ifdef CONFIG_PREEMPT
>>>        ldr     r4, [r10, #TI_PREEMPT]  @ get preempt count
>>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
>>> index 9897dcf..7292921 100644
>>> --- a/arch/arm/vfp/vfphw.S
>>> +++ b/arch/arm/vfp/vfphw.S
>>> @@ -61,13 +61,13 @@
>>>
>>>  @ VFP hardware support entry point.
>>>  @
>>> -@  r0  = faulted instruction
>>> -@  r2  = faulted PC+4
>>> -@  r9  = successful return
>>> +@  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> +@  r2  = PC value to resume execution after successful emulation
>>> +@  r9  = normal "successful" return address
>>>  @  r10 = vfp_state union
>>>  @  r11 = CPU number
>>> -@  lr  = failure return
>>> -
>>> +@  lr  = unrecognised instruction return address
>>> +@  IRQs enabled.
>>>  ENTRY(vfp_support_entry)
>>>        DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>>
>> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
>> my test case.  Could this second patch go first so it can be applied
>> to stable?
>>
>
> Also, both patches together Tested-by: Colin Cross <ccross@android.com>
>

I think there is an additional change needed to
__und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
to be the fault address, and not the next PC, and gets called for 2 or
4 byte instructions.  If it gets called after the regs->ARM_pc fixup
added in the second patch, it will return to the middle of the
faulting instruction.

If it do_undefinstr called on a 4 byte thumb instruction, with
regs->ARM_pc pointing to the end of the faulting instruction, there is
no good way to figure out where the PC of the instruction that faulted
is.  Would it make more sense to convert all the exception handlers to
get regs->ARM_pc as the PC of the faulting instruction, rather than
getting the PC of the next instruction?  That way, each handler can
easily use thumb_mode(regs) and decoding the instruction to determine
how big the instruction is, and decide whether to return to the
beginning or end of the faulting instruction.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-25 23:33                         ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-25 23:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jan 14, 2011 at 11:51 AM, Colin Cross <ccross@android.com> wrote:
> On Fri, Jan 14, 2011 at 11:23 AM, Colin Cross <ccross@android.com> wrote:
>> On Fri, Jan 14, 2011 at 10:47 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> So... this incrementally on top of the previous patch (which I've
>>> reproduced below as there's a subtle comment change in there wrt IRQ
>>> state.)
>>>
>>> This means we have consistent state - both r2 and regs->ARM_pc always
>>> point to the next instruction to be executed in every case, which means
>>> its easy to understand and remember while reading through the code.
>>>
>>> diff -u b/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>>> --- b/arch/arm/kernel/entry-armv.S
>>> +++ b/arch/arm/kernel/entry-armv.S
>>> @@ -499,10 +499,11 @@
>>> ? ? ? ?blo ? ? __und_usr_unknown
>>> ?3: ? ? ldrht ? r0, [r4]
>>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
>>> - ? ? ? orr ? ? r0, r0, r5, lsl #16
>>> + ? ? ? str ? ? r2, [sp, #S_PC] ? ? ? ? ? ? ? ? @ it's a 2x16bit instr, update
>>> + ? ? ? orr ? ? r0, r0, r5, lsl #16 ? ? ? ? ? ? @ ?regs->ARM_pc
>>> ? ? ? ?@
>>> ? ? ? ?@ r0 = the two 16-bit Thumb instructions which caused the exception
>>> - ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc)
>>> ? ? ? ?@ r4 = PC value for the first 16-bit Thumb instruction
>>> ? ? ? ?@
>>> ?#else
>>>
>>> 8<-x-x-
>>>
>>> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
>>> index 2b46fea..5876eec 100644
>>> --- a/arch/arm/kernel/entry-armv.S
>>> +++ b/arch/arm/kernel/entry-armv.S
>>> @@ -461,27 +461,35 @@ ENDPROC(__irq_usr)
>>> ? ? ? ?.align ?5
>>> ?__und_usr:
>>> ? ? ? ?usr_entry
>>> -
>>> - ? ? ? @
>>> - ? ? ? @ fall through to the emulation code, which returns using r9 if
>>> - ? ? ? @ it has emulated the instruction, or the more conventional lr
>>> - ? ? ? @ if we are to treat this as a real undefined instruction
>>> ? ? ? ?@
>>> - ? ? ? @ ?r0 - instruction
>>> + ? ? ? @ The emulation code returns using r9 if it has emulated the
>>> + ? ? ? @ instruction, or the more conventional lr if we are to treat
>>> + ? ? ? @ this as a real undefined instruction
>>> ? ? ? ?@
>>> ? ? ? ?adr ? ? r9, BSYM(ret_from_exception)
>>> ? ? ? ?adr ? ? lr, BSYM(__und_usr_unknown)
>>> + ? ? ? @
>>> + ? ? ? @ r2 = regs->ARM_pc, which is either 2 or 4 bytes ahead of the
>>> + ? ? ? @ faulting instruction depending on Thumb mode.
>>> + ? ? ? @ r3 = regs->ARM_cpsr
>>> + ? ? ? @
>>> ? ? ? ?tst ? ? r3, #PSR_T_BIT ? ? ? ? ? ? ? ? ?@ Thumb mode?
>>> - ? ? ? itet ? ?eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>>> + ? ? ? itttt ? eq ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?@ explicit IT needed for the 1f label
>>> ? ? ? ?subeq ? r4, r2, #4 ? ? ? ? ? ? ? ? ? ? ?@ ARM instr at LR - 4
>>> - ? ? ? subne ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
>>> ?1: ? ? ldreqt ?r0, [r4]
>>> ?#ifdef CONFIG_CPU_ENDIAN_BE8
>>> ? ? ? ?reveq ? r0, r0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ little endian instruction
>>> ?#endif
>>> + ? ? ? @
>>> + ? ? ? @ r0 = 32-bit ARM instruction which caused the exception
>>> + ? ? ? @ r2 = PC value for the following instruction (:= regs->ARM_pc)
>>> + ? ? ? @ r4 = PC value for the faulting instruction
>>> + ? ? ? @
>>> ? ? ? ?beq ? ? call_fpe
>>> +
>>> ? ? ? ?@ Thumb instruction
>>> ?#if __LINUX_ARM_ARCH__ >= 7
>>> + ? ? ? sub ? ? r4, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ Thumb instr at LR - 2
>>> ?2:
>>> ?ARM( ?ldrht ? r5, [r4], #2 ? ?)
>>> ?THUMB( ? ? ? ?ldrht ? r5, [r4] ? ? ? ?)
>>> @@ -492,18 +500,19 @@ __und_usr:
>>> ?3: ? ? ldrht ? r0, [r4]
>>> ? ? ? ?add ? ? r2, r2, #2 ? ? ? ? ? ? ? ? ? ? ?@ r2 is PC + 2, make it PC + 4
>>> ? ? ? ?orr ? ? r0, r0, r5, lsl #16
>>> + ? ? ? @
>>> + ? ? ? @ r0 = the two 16-bit Thumb instructions which caused the exception
>>> + ? ? ? @ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
>>> + ? ? ? @ r4 = PC value for the first 16-bit Thumb instruction
>>> + ? ? ? @
>>> ?#else
>>> ? ? ? ?b ? ? ? __und_usr_unknown
>>> ?#endif
>>> - UNWIND(.fnend ? ? ? ? )
>>> + UNWIND(.fnend)
>>> ?ENDPROC(__und_usr)
>>>
>>> - ? ? ? @
>>> - ? ? ? @ fallthrough to call_fpe
>>> - ? ? ? @
>>> -
>>> ?/*
>>> - * The out of line fixup for the ldrt above.
>>> + * The out of line fixup for the ldrt instructions above.
>>> ?*/
>>> ? ? ? ?.pushsection .fixup, "ax"
>>> ?4: ? ? mov ? ? pc, r9
>>> @@ -534,11 +543,12 @@ ENDPROC(__und_usr)
>>> ?* NEON handler code.
>>> ?*
>>> ?* Emulators may wish to make use of the following registers:
>>> - * ?r0 ?= instruction opcode.
>>> - * ?r2 ?= PC+4
>>> + * ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> + * ?r2 ?= PC value to resume execution after successful emulation
>>> ?* ?r9 ?= normal "successful" return address
>>> - * ?r10 = this threads thread_info structure.
>>> + * ?r10 = this threads thread_info structure
>>> ?* ?lr ?= unrecognised instruction return address
>>> + * IRQs disabled, FIQs enabled.
>>> ?*/
>>> ? ? ? ?@
>>> ? ? ? ?@ Fall-through from Thumb-2 __und_usr
>>> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
>>> index ee57640..eeb9250 100644
>>> --- a/arch/arm/kernel/traps.c
>>> +++ b/arch/arm/kernel/traps.c
>>> @@ -347,9 +347,9 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>>> ? ? ? ?void __user *pc;
>>>
>>> ? ? ? ?/*
>>> - ? ? ? ?* According to the ARM ARM, PC is 2 or 4 bytes ahead,
>>> - ? ? ? ?* depending whether we're in Thumb mode or not.
>>> - ? ? ? ?* Correct this offset.
>>> + ? ? ? ?* According to the ARM ARM, the PC is 2 or 4 bytes ahead
>>> + ? ? ? ?* depending on Thumb mode. ?Correct this offset so that
>>> + ? ? ? ?* regs->ARM_pc points at the faulting instruction.
>>> ? ? ? ? */
>>> ? ? ? ?regs->ARM_pc -= correction;
>>>
>>> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
>>> index 4fa9903..2bf6089 100644
>>> --- a/arch/arm/vfp/entry.S
>>> +++ b/arch/arm/vfp/entry.S
>>> @@ -19,6 +19,15 @@
>>> ?#include <asm/vfpmacros.h>
>>> ?#include "../kernel/entry-header.S"
>>>
>>> +@ VFP entry point.
>>> +@
>>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> +@ ?r2 ?= PC value to resume execution after successful emulation
>>> +@ ?r9 ?= normal "successful" return address
>>> +@ ?r10 = this threads thread_info structure
>>> +@ ?lr ?= unrecognised instruction return address
>>> +@ ?IRQs disabled.
>>> +@
>>> ?ENTRY(do_vfp)
>>> ?#ifdef CONFIG_PREEMPT
>>> ? ? ? ?ldr ? ? r4, [r10, #TI_PREEMPT] ?@ get preempt count
>>> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
>>> index 9897dcf..7292921 100644
>>> --- a/arch/arm/vfp/vfphw.S
>>> +++ b/arch/arm/vfp/vfphw.S
>>> @@ -61,13 +61,13 @@
>>>
>>> ?@ VFP hardware support entry point.
>>> ?@
>>> -@ ?r0 ?= faulted instruction
>>> -@ ?r2 ?= faulted PC+4
>>> -@ ?r9 ?= successful return
>>> +@ ?r0 ?= instruction opcode (32-bit ARM or two 16-bit Thumb)
>>> +@ ?r2 ?= PC value to resume execution after successful emulation
>>> +@ ?r9 ?= normal "successful" return address
>>> ?@ ?r10 = vfp_state union
>>> ?@ ?r11 = CPU number
>>> -@ ?lr ?= failure return
>>> -
>>> +@ ?lr ?= unrecognised instruction return address
>>> +@ ?IRQs enabled.
>>> ?ENTRY(vfp_support_entry)
>>> ? ? ? ?DBGSTR3 "instr %08x pc %08x state %p", r0, r2, r10
>>
>> I tested copying r2 to regs->ARM_pc like this patch does, and it fixes
>> my test case. ?Could this second patch go first so it can be applied
>> to stable?
>>
>
> Also, both patches together Tested-by: Colin Cross <ccross@android.com>
>

I think there is an additional change needed to
__und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
to be the fault address, and not the next PC, and gets called for 2 or
4 byte instructions.  If it gets called after the regs->ARM_pc fixup
added in the second patch, it will return to the middle of the
faulting instruction.

If it do_undefinstr called on a 4 byte thumb instruction, with
regs->ARM_pc pointing to the end of the faulting instruction, there is
no good way to figure out where the PC of the instruction that faulted
is.  Would it make more sense to convert all the exception handlers to
get regs->ARM_pc as the PC of the faulting instruction, rather than
getting the PC of the next instruction?  That way, each handler can
easily use thumb_mode(regs) and decoding the instruction to determine
how big the instruction is, and decide whether to return to the
beginning or end of the faulting instruction.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-25 23:33                         ` Colin Cross
@ 2011-01-26 11:26                           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-26 11:26 UTC (permalink / raw)
  To: Colin Cross; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
> I think there is an additional change needed to
> __und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
> to be the fault address, and not the next PC, and gets called for 2 or
> 4 byte instructions.

It expects it to be the next PC:

asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
{
        unsigned int correction = thumb_mode(regs) ? 2 : 4;
        unsigned int instr;
        siginfo_t info;
        void __user *pc;

        /*
         * According to the ARM ARM, PC is 2 or 4 bytes ahead,
         * depending whether we're in Thumb mode or not.
         * Correct this offset.
         */
        regs->ARM_pc -= correction;

We expect the PC to be pointing at the next instruction to be executed.
This is the value of the PC saved by the CPU when entering the exception.
We correct the PC by four bytes for ARM mode to point at the previously
executed instruction.

For 16-bit Thumb mode, the PC is again pointing at the next instruction
to be executed, and this is the value saved by the CPU.  So we correct
the PC by two bytes as that is the Thumb instruction size.

The problem comes with T2, where we advance the saved PC by two bytes
if the instruction was 32-bit such that it again points at the next
instruction to be executed.  This is where the problem comes in because
we have two different chunks of code with completely different
expectations.

Maybe we need to pass in the correction factor to do_undefinstr instead.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-26 11:26                           ` Russell King - ARM Linux
  0 siblings, 0 replies; 64+ messages in thread
From: Russell King - ARM Linux @ 2011-01-26 11:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
> I think there is an additional change needed to
> __und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
> to be the fault address, and not the next PC, and gets called for 2 or
> 4 byte instructions.

It expects it to be the next PC:

asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
{
        unsigned int correction = thumb_mode(regs) ? 2 : 4;
        unsigned int instr;
        siginfo_t info;
        void __user *pc;

        /*
         * According to the ARM ARM, PC is 2 or 4 bytes ahead,
         * depending whether we're in Thumb mode or not.
         * Correct this offset.
         */
        regs->ARM_pc -= correction;

We expect the PC to be pointing at the next instruction to be executed.
This is the value of the PC saved by the CPU when entering the exception.
We correct the PC by four bytes for ARM mode to point at the previously
executed instruction.

For 16-bit Thumb mode, the PC is again pointing at the next instruction
to be executed, and this is the value saved by the CPU.  So we correct
the PC by two bytes as that is the Thumb instruction size.

The problem comes with T2, where we advance the saved PC by two bytes
if the instruction was 32-bit such that it again points at the next
instruction to be executed.  This is where the problem comes in because
we have two different chunks of code with completely different
expectations.

Maybe we need to pass in the correction factor to do_undefinstr instead.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-26 11:26                           ` Russell King - ARM Linux
@ 2011-01-27  6:11                             ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  6:11 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>> I think there is an additional change needed to
>> __und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>> to be the fault address, and not the next PC, and gets called for 2 or
>> 4 byte instructions.
>
> It expects it to be the next PC:
>
> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> {
>        unsigned int correction = thumb_mode(regs) ? 2 : 4;
>        unsigned int instr;
>        siginfo_t info;
>        void __user *pc;
>
>        /*
>         * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>         * depending whether we're in Thumb mode or not.
>         * Correct this offset.
>         */
>        regs->ARM_pc -= correction;
>
> We expect the PC to be pointing at the next instruction to be executed.
> This is the value of the PC saved by the CPU when entering the exception.
> We correct the PC by four bytes for ARM mode to point at the previously
> executed instruction.
>
> For 16-bit Thumb mode, the PC is again pointing at the next instruction
> to be executed, and this is the value saved by the CPU.  So we correct
> the PC by two bytes as that is the Thumb instruction size.
>
> The problem comes with T2, where we advance the saved PC by two bytes
> if the instruction was 32-bit such that it again points at the next
> instruction to be executed.  This is where the problem comes in because
> we have two different chunks of code with completely different
> expectations.

All do_undefinstr will do with the correction is subtract it off so
that pc points to the faulting instruction instead of the next
instruction, and calls any registered handlers.  VFP_bounce sometimes
subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
instructions are all 32 bit), and sometimes leaves pc pointing at the
next instruction.  If both were modified to expect the faulting
instruction's pc, do_undefinstr would leave pc unmodified, and
VFP_bounce would add 4 in the opposite cases from where it subtracts 4
now.

iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.

Currently, iwmmxt_task_enable will either end up in do_undefinstr
through mov pc, lr, or it will subtract 4 from the pc register value
and return through ret_from_exception.  With the change above, it
would not need to modify the pc value at all.

crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.

> Maybe we need to pass in the correction factor to do_undefinstr instead.
>

That just makes it even more complicated, the correction factor has to
be tracked through every code path.

RFC patch to follow.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-27  6:11                             ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  6:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>> I think there is an additional change needed to
>> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called
>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>> to be the fault address, and not the next PC, and gets called for 2 or
>> 4 byte instructions.
>
> It expects it to be the next PC:
>
> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
> {
> ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4;
> ? ? ? ?unsigned int instr;
> ? ? ? ?siginfo_t info;
> ? ? ? ?void __user *pc;
>
> ? ? ? ?/*
> ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead,
> ? ? ? ? * depending whether we're in Thumb mode or not.
> ? ? ? ? * Correct this offset.
> ? ? ? ? */
> ? ? ? ?regs->ARM_pc -= correction;
>
> We expect the PC to be pointing at the next instruction to be executed.
> This is the value of the PC saved by the CPU when entering the exception.
> We correct the PC by four bytes for ARM mode to point at the previously
> executed instruction.
>
> For 16-bit Thumb mode, the PC is again pointing at the next instruction
> to be executed, and this is the value saved by the CPU. ?So we correct
> the PC by two bytes as that is the Thumb instruction size.
>
> The problem comes with T2, where we advance the saved PC by two bytes
> if the instruction was 32-bit such that it again points at the next
> instruction to be executed. ?This is where the problem comes in because
> we have two different chunks of code with completely different
> expectations.

All do_undefinstr will do with the correction is subtract it off so
that pc points to the faulting instruction instead of the next
instruction, and calls any registered handlers.  VFP_bounce sometimes
subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
instructions are all 32 bit), and sometimes leaves pc pointing at the
next instruction.  If both were modified to expect the faulting
instruction's pc, do_undefinstr would leave pc unmodified, and
VFP_bounce would add 4 in the opposite cases from where it subtracts 4
now.

iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.

Currently, iwmmxt_task_enable will either end up in do_undefinstr
through mov pc, lr, or it will subtract 4 from the pc register value
and return through ret_from_exception.  With the change above, it
would not need to modify the pc value at all.

crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.

> Maybe we need to pass in the correction factor to do_undefinstr instead.
>

That just makes it even more complicated, the correction factor has to
be tracked through every code path.

RFC patch to follow.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-27  6:11                             ` Colin Cross
@ 2011-01-27  6:35                               ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  6:35 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote:
> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>>> I think there is an additional change needed to
>>> __und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>>> to be the fault address, and not the next PC, and gets called for 2 or
>>> 4 byte instructions.
>>
>> It expects it to be the next PC:
>>
>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>> {
>>        unsigned int correction = thumb_mode(regs) ? 2 : 4;
>>        unsigned int instr;
>>        siginfo_t info;
>>        void __user *pc;
>>
>>        /*
>>         * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>>         * depending whether we're in Thumb mode or not.
>>         * Correct this offset.
>>         */
>>        regs->ARM_pc -= correction;
>>
>> We expect the PC to be pointing at the next instruction to be executed.
>> This is the value of the PC saved by the CPU when entering the exception.
>> We correct the PC by four bytes for ARM mode to point at the previously
>> executed instruction.
>>
>> For 16-bit Thumb mode, the PC is again pointing at the next instruction
>> to be executed, and this is the value saved by the CPU.  So we correct
>> the PC by two bytes as that is the Thumb instruction size.
>>
>> The problem comes with T2, where we advance the saved PC by two bytes
>> if the instruction was 32-bit such that it again points at the next
>> instruction to be executed.  This is where the problem comes in because
>> we have two different chunks of code with completely different
>> expectations.
>
> All do_undefinstr will do with the correction is subtract it off so
> that pc points to the faulting instruction instead of the next
> instruction, and calls any registered handlers.  VFP_bounce sometimes
> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
> instructions are all 32 bit), and sometimes leaves pc pointing at the
> next instruction.  If both were modified to expect the faulting
> instruction's pc, do_undefinstr would leave pc unmodified, and
> VFP_bounce would add 4 in the opposite cases from where it subtracts 4
> now.
>
> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.
>
> Currently, iwmmxt_task_enable will either end up in do_undefinstr
> through mov pc, lr, or it will subtract 4 from the pc register value
> and return through ret_from_exception.  With the change above, it
> would not need to modify the pc value at all.
>
> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.
>
>> Maybe we need to pass in the correction factor to do_undefinstr instead.
>>
>
> That just makes it even more complicated, the correction factor has to
> be tracked through every code path.
>
> RFC patch to follow.
>

Missed one case.  do_fpe calls into nwfpe_enter (I don't see any other
users of fp_enter), which assumes the PC in the registers on the stack
is faulting PC + 4, ignores r2, and uses r0 as the first instruction
to emulate.  It will need to be slightly modified in my patch.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-27  6:35                               ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  6:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote:
> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>>> I think there is an additional change needed to
>>> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called
>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>>> to be the fault address, and not the next PC, and gets called for 2 or
>>> 4 byte instructions.
>>
>> It expects it to be the next PC:
>>
>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>> {
>> ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4;
>> ? ? ? ?unsigned int instr;
>> ? ? ? ?siginfo_t info;
>> ? ? ? ?void __user *pc;
>>
>> ? ? ? ?/*
>> ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>> ? ? ? ? * depending whether we're in Thumb mode or not.
>> ? ? ? ? * Correct this offset.
>> ? ? ? ? */
>> ? ? ? ?regs->ARM_pc -= correction;
>>
>> We expect the PC to be pointing at the next instruction to be executed.
>> This is the value of the PC saved by the CPU when entering the exception.
>> We correct the PC by four bytes for ARM mode to point at the previously
>> executed instruction.
>>
>> For 16-bit Thumb mode, the PC is again pointing at the next instruction
>> to be executed, and this is the value saved by the CPU. ?So we correct
>> the PC by two bytes as that is the Thumb instruction size.
>>
>> The problem comes with T2, where we advance the saved PC by two bytes
>> if the instruction was 32-bit such that it again points at the next
>> instruction to be executed. ?This is where the problem comes in because
>> we have two different chunks of code with completely different
>> expectations.
>
> All do_undefinstr will do with the correction is subtract it off so
> that pc points to the faulting instruction instead of the next
> instruction, and calls any registered handlers. ?VFP_bounce sometimes
> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
> instructions are all 32 bit), and sometimes leaves pc pointing at the
> next instruction. ?If both were modified to expect the faulting
> instruction's pc, do_undefinstr would leave pc unmodified, and
> VFP_bounce would add 4 in the opposite cases from where it subtracts 4
> now.
>
> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.
>
> Currently, iwmmxt_task_enable will either end up in do_undefinstr
> through mov pc, lr, or it will subtract 4 from the pc register value
> and return through ret_from_exception. ?With the change above, it
> would not need to modify the pc value at all.
>
> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.
>
>> Maybe we need to pass in the correction factor to do_undefinstr instead.
>>
>
> That just makes it even more complicated, the correction factor has to
> be tracked through every code path.
>
> RFC patch to follow.
>

Missed one case.  do_fpe calls into nwfpe_enter (I don't see any other
users of fp_enter), which assumes the PC in the registers on the stack
is faulting PC + 4, ignores r2, and uses r0 as the first instruction
to emulate.  It will need to be slightly modified in my patch.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-27  6:35                               ` Colin Cross
@ 2011-01-27  7:30                                 ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  7:30 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Wed, Jan 26, 2011 at 10:35 PM, Colin Cross <ccross@android.com> wrote:
> On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote:
>> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>>>> I think there is an additional change needed to
>>>> __und_usr_unknown/do_undefinstr.  do_undefinstr, which gets called
>>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>>>> to be the fault address, and not the next PC, and gets called for 2 or
>>>> 4 byte instructions.
>>>
>>> It expects it to be the next PC:
>>>
>>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>>> {
>>>        unsigned int correction = thumb_mode(regs) ? 2 : 4;
>>>        unsigned int instr;
>>>        siginfo_t info;
>>>        void __user *pc;
>>>
>>>        /*
>>>         * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>>>         * depending whether we're in Thumb mode or not.
>>>         * Correct this offset.
>>>         */
>>>        regs->ARM_pc -= correction;
>>>
>>> We expect the PC to be pointing at the next instruction to be executed.
>>> This is the value of the PC saved by the CPU when entering the exception.
>>> We correct the PC by four bytes for ARM mode to point at the previously
>>> executed instruction.
>>>
>>> For 16-bit Thumb mode, the PC is again pointing at the next instruction
>>> to be executed, and this is the value saved by the CPU.  So we correct
>>> the PC by two bytes as that is the Thumb instruction size.
>>>
>>> The problem comes with T2, where we advance the saved PC by two bytes
>>> if the instruction was 32-bit such that it again points at the next
>>> instruction to be executed.  This is where the problem comes in because
>>> we have two different chunks of code with completely different
>>> expectations.
>>
>> All do_undefinstr will do with the correction is subtract it off so
>> that pc points to the faulting instruction instead of the next
>> instruction, and calls any registered handlers.  VFP_bounce sometimes
>> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
>> instructions are all 32 bit), and sometimes leaves pc pointing at the
>> next instruction.  If both were modified to expect the faulting
>> instruction's pc, do_undefinstr would leave pc unmodified, and
>> VFP_bounce would add 4 in the opposite cases from where it subtracts 4
>> now.
>>
>> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.
>>
>> Currently, iwmmxt_task_enable will either end up in do_undefinstr
>> through mov pc, lr, or it will subtract 4 from the pc register value
>> and return through ret_from_exception.  With the change above, it
>> would not need to modify the pc value at all.
>>
>> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.
>>
>>> Maybe we need to pass in the correction factor to do_undefinstr instead.
>>>
>>
>> That just makes it even more complicated, the correction factor has to
>> be tracked through every code path.
>>
>> RFC patch to follow.
>>
>
> Missed one case.  do_fpe calls into nwfpe_enter (I don't see any other
> users of fp_enter), which assumes the PC in the registers on the stack
> is faulting PC + 4, ignores r2, and uses r0 as the first instruction
> to emulate.  It will need to be slightly modified in my patch.
>

This patch is on top of your first patch that cleans up the comments,
but not the second patch that fixes the PC in the VFP exception case.
Compiled but not run tested, and I can't test crunch or iwmmxt.
vfpmodule.c may be able to be simplified (right now its both adding
and subtracting 4 from regs->ARM_pc).

 arch/arm/kernel/crunch-bits.S |    3 ---
 arch/arm/kernel/entry-armv.S  |   38 ++++++++++++++++++++------------------
 arch/arm/kernel/iwmmxt.S      |    3 ---
 arch/arm/nwfpe/entry.S        |    1 +
 arch/arm/vfp/vfphw.S          |    5 -----
 arch/arm/vfp/vfpmodule.c      |    6 ++++++
 6 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/arm/kernel/crunch-bits.S b/arch/arm/kernel/crunch-bits.S
index 0ec9bb4..0095368 100644
--- a/arch/arm/kernel/crunch-bits.S
+++ b/arch/arm/kernel/crunch-bits.S
@@ -77,11 +77,8 @@ ENTRY(crunch_task_enable)

 	ldr	r3, =crunch_owner
 	add	r0, r10, #TI_CRUNCH_STATE	@ get task crunch save area
-	ldr	r2, [sp, #60]			@ current task pc value
 	ldr	r1, [r3]			@ get current crunch owner
 	str	r0, [r3]			@ this task now owns crunch
-	sub	r2, r2, #4			@ adjust pc back
-	str	r2, [sp, #60]

 	ldr	r2, [r8, #0x80]
 	mov	r2, r2				@ flush out enable (@@@)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 5876eec..f09e576 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -259,14 +259,18 @@ __und_svc:
 	@  r0 - instruction
 	@
 #ifndef	CONFIG_THUMB2_KERNEL
-	ldr	r0, [r2, #-4]
+        sub     r2, r2, #4
+	ldr	r0, [r2]
 #else
-	ldrh	r0, [r2, #-2]			@ Thumb instruction at LR - 2
+	sub	r2, r2, #2
+	ldrh	r0, [r2]			@ Thumb instruction at LR - 2
 	and	r9, r0, #0xf800
 	cmp	r9, #0xe800			@ 32-bit instruction if xx >= 0
-	ldrhhs	r9, [r2]			@ bottom 16 bits
+	ldrhhs	r9, [r2, #2]			@ bottom 16 bits
 	orrhs	r0, r9, r0, lsl #16
 #endif
+	str	r2, [sp, #S_PC]			@ replace regs->ARM_pc with
+						@ pc of faulting instruction
 	adr	r9, BSYM(1f)
 	bl	call_fpe

@@ -474,36 +478,34 @@ __und_usr:
 	@ r3 = regs->ARM_cpsr
 	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itttt	eq				@ explicit IT needed for the 1f label
-	subeq	r4, r2, #4			@ ARM instr at LR - 4
-1:	ldreqt	r0, [r4]
+	itet	eq				@ explicit IT needed for the 1f label
+	subeq	r2, r2, #4			@ ARM instr at LR - 4
+	subne	r2, r2, #2			@ Thumb instr at LR - 2
+1:	ldreqt	r0, [r2]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	str	r2, [sp, #S_PC]			@ replace regs->ARM_pc with
+						@ pc of faulting instruction
 	@
 	@ r0 = 32-bit ARM instruction which caused the exception
-	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
-	@ r4 = PC value for the faulting instruction
-	@
+	@ r2 = PC value for the faulting instruction (:= regs->ARM_pc)
 	beq	call_fpe

 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
-	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
- ARM(	ldrht	r5, [r4], #2	)
- THUMB(	ldrht	r5, [r4]	)
- THUMB(	add	r4, r4, #2	)
+ ARM(	ldrht	r5, [r2], #2	)
+ THUMB(	ldrht	r5, [r2]	)
+ THUMB(	add	r4, r2, #2	)
 	and	r0, r5, #0xf800			@ mask bits 111x x... .... ....
 	cmp	r0, #0xe800			@ 32bit instruction if xx != 0
 	blo	__und_usr_unknown
 3:	ldrht	r0, [r4]
-	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
 	@
 	@ r0 = the two 16-bit Thumb instructions which caused the exception
-	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
-	@ r4 = PC value for the first 16-bit Thumb instruction
+	@ r2 = PC value for the faulting Thumb instruction (:= regs->ARM_pc)
 	@
 #else
 	b	__und_usr_unknown
@@ -544,7 +546,7 @@ ENDPROC(__und_usr)
  *
  * Emulators may wish to make use of the following registers:
  *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
- *  r2  = PC value to resume execution after successful emulation
+ *  r2  = PC value of the faulting instruction
  *  r9  = normal "successful" return address
  *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
@@ -662,7 +664,7 @@ do_fpe:
 /*
  * The FP module is called with these registers set:
  *  r0  = instruction
- *  r2  = PC+4
+ *  r2  = PC of the faulting instruction
  *  r9  = normal "successful" return address
  *  r10 = FP workspace
  *  lr  = unrecognised FP instruction return address
diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index 7fa3bb0..0842487 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -80,11 +80,8 @@ ENTRY(iwmmxt_task_enable)

 	ldr	r3, =concan_owner
 	add	r0, r10, #TI_IWMMXT_STATE	@ get task Concan save area
-	ldr	r2, [sp, #60]			@ current task pc value
 	ldr	r1, [r3]			@ get current Concan owner
 	str	r0, [r3]			@ this task now owns Concan regs
-	sub	r2, r2, #4			@ adjust pc back
-	str	r2, [sp, #60]

 	mrc	p15, 0, r2, c2, c0, 0
 	mov	r2, r2				@ cpwait
diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S
index cafa183..923f03f 100644
--- a/arch/arm/nwfpe/entry.S
+++ b/arch/arm/nwfpe/entry.S
@@ -78,6 +78,7 @@ nwfpe_enter:
 	mov	sl, sp			@ we access the registers via 'sl'

 	ldr	r5, [sp, #S_PC]		@ get contents of PC;
+	add	r5, r5, #4		@ skip first PC, already have opcode
 	mov	r6, r0			@ save the opcode
 emulate:
 	ldr	r1, [sp, #S_PSR]	@ fetch the PSR
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 7292921..73dea8b 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -139,11 +139,6 @@ check_for_exception:
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
 	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
-	sub	r2, r2, #4		@ Retry current instruction - if Thumb
-	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
-					@ else it's one 32-bit instruction, so
-					@ always subtract 4 from the following
-					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 0797cb5..1142742 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -293,6 +293,12 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct
pt_regs *regs)
 	orig_fpscr = fpscr = fmrx(FPSCR);

 	/*
+	 * Normally the instruction will be emulated, set return PC to be
+	 * the next instruction
+	 */
+	regs->ARM_pc += 4;
+
+	/*
 	 * Check for the special VFP subarch 1 and FPSCR.IXE bit case
 	 */
 	if ((fpsid & FPSID_ARCH_MASK) == (1 << FPSID_ARCH_BIT)

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-01-27  7:30                                 ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-01-27  7:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 26, 2011 at 10:35 PM, Colin Cross <ccross@android.com> wrote:
> On Wed, Jan 26, 2011 at 10:11 PM, Colin Cross <ccross@android.com> wrote:
>> On Wed, Jan 26, 2011 at 3:26 AM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Tue, Jan 25, 2011 at 03:33:24PM -0800, Colin Cross wrote:
>>>> I think there is an additional change needed to
>>>> __und_usr_unknown/do_undefinstr. ?do_undefinstr, which gets called
>>>> directly in __und_usr as well as by mov pc, lr, expects regs->ARM_pc
>>>> to be the fault address, and not the next PC, and gets called for 2 or
>>>> 4 byte instructions.
>>>
>>> It expects it to be the next PC:
>>>
>>> asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
>>> {
>>> ? ? ? ?unsigned int correction = thumb_mode(regs) ? 2 : 4;
>>> ? ? ? ?unsigned int instr;
>>> ? ? ? ?siginfo_t info;
>>> ? ? ? ?void __user *pc;
>>>
>>> ? ? ? ?/*
>>> ? ? ? ? * According to the ARM ARM, PC is 2 or 4 bytes ahead,
>>> ? ? ? ? * depending whether we're in Thumb mode or not.
>>> ? ? ? ? * Correct this offset.
>>> ? ? ? ? */
>>> ? ? ? ?regs->ARM_pc -= correction;
>>>
>>> We expect the PC to be pointing at the next instruction to be executed.
>>> This is the value of the PC saved by the CPU when entering the exception.
>>> We correct the PC by four bytes for ARM mode to point at the previously
>>> executed instruction.
>>>
>>> For 16-bit Thumb mode, the PC is again pointing at the next instruction
>>> to be executed, and this is the value saved by the CPU. ?So we correct
>>> the PC by two bytes as that is the Thumb instruction size.
>>>
>>> The problem comes with T2, where we advance the saved PC by two bytes
>>> if the instruction was 32-bit such that it again points at the next
>>> instruction to be executed. ?This is where the problem comes in because
>>> we have two different chunks of code with completely different
>>> expectations.
>>
>> All do_undefinstr will do with the correction is subtract it off so
>> that pc points to the faulting instruction instead of the next
>> instruction, and calls any registered handlers. ?VFP_bounce sometimes
>> subtracts off 4 (it doesn't care about T2 mode there, because T2 VFP
>> instructions are all 32 bit), and sometimes leaves pc pointing at the
>> next instruction. ?If both were modified to expect the faulting
>> instruction's pc, do_undefinstr would leave pc unmodified, and
>> VFP_bounce would add 4 in the opposite cases from where it subtracts 4
>> now.
>>
>> iwmmxt_task_enable and crunch_task_enable can also get called from __und_usr.
>>
>> Currently, iwmmxt_task_enable will either end up in do_undefinstr
>> through mov pc, lr, or it will subtract 4 from the pc register value
>> and return through ret_from_exception. ?With the change above, it
>> would not need to modify the pc value at all.
>>
>> crunch_task_enable is based on iwmmxt_task_enable, and works exactly the same.
>>
>>> Maybe we need to pass in the correction factor to do_undefinstr instead.
>>>
>>
>> That just makes it even more complicated, the correction factor has to
>> be tracked through every code path.
>>
>> RFC patch to follow.
>>
>
> Missed one case. ?do_fpe calls into nwfpe_enter (I don't see any other
> users of fp_enter), which assumes the PC in the registers on the stack
> is faulting PC + 4, ignores r2, and uses r0 as the first instruction
> to emulate. ?It will need to be slightly modified in my patch.
>

This patch is on top of your first patch that cleans up the comments,
but not the second patch that fixes the PC in the VFP exception case.
Compiled but not run tested, and I can't test crunch or iwmmxt.
vfpmodule.c may be able to be simplified (right now its both adding
and subtracting 4 from regs->ARM_pc).

 arch/arm/kernel/crunch-bits.S |    3 ---
 arch/arm/kernel/entry-armv.S  |   38 ++++++++++++++++++++------------------
 arch/arm/kernel/iwmmxt.S      |    3 ---
 arch/arm/nwfpe/entry.S        |    1 +
 arch/arm/vfp/vfphw.S          |    5 -----
 arch/arm/vfp/vfpmodule.c      |    6 ++++++
 6 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/arm/kernel/crunch-bits.S b/arch/arm/kernel/crunch-bits.S
index 0ec9bb4..0095368 100644
--- a/arch/arm/kernel/crunch-bits.S
+++ b/arch/arm/kernel/crunch-bits.S
@@ -77,11 +77,8 @@ ENTRY(crunch_task_enable)

 	ldr	r3, =crunch_owner
 	add	r0, r10, #TI_CRUNCH_STATE	@ get task crunch save area
-	ldr	r2, [sp, #60]			@ current task pc value
 	ldr	r1, [r3]			@ get current crunch owner
 	str	r0, [r3]			@ this task now owns crunch
-	sub	r2, r2, #4			@ adjust pc back
-	str	r2, [sp, #60]

 	ldr	r2, [r8, #0x80]
 	mov	r2, r2				@ flush out enable (@@@)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 5876eec..f09e576 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -259,14 +259,18 @@ __und_svc:
 	@  r0 - instruction
 	@
 #ifndef	CONFIG_THUMB2_KERNEL
-	ldr	r0, [r2, #-4]
+        sub     r2, r2, #4
+	ldr	r0, [r2]
 #else
-	ldrh	r0, [r2, #-2]			@ Thumb instruction at LR - 2
+	sub	r2, r2, #2
+	ldrh	r0, [r2]			@ Thumb instruction at LR - 2
 	and	r9, r0, #0xf800
 	cmp	r9, #0xe800			@ 32-bit instruction if xx >= 0
-	ldrhhs	r9, [r2]			@ bottom 16 bits
+	ldrhhs	r9, [r2, #2]			@ bottom 16 bits
 	orrhs	r0, r9, r0, lsl #16
 #endif
+	str	r2, [sp, #S_PC]			@ replace regs->ARM_pc with
+						@ pc of faulting instruction
 	adr	r9, BSYM(1f)
 	bl	call_fpe

@@ -474,36 +478,34 @@ __und_usr:
 	@ r3 = regs->ARM_cpsr
 	@
 	tst	r3, #PSR_T_BIT			@ Thumb mode?
-	itttt	eq				@ explicit IT needed for the 1f label
-	subeq	r4, r2, #4			@ ARM instr at LR - 4
-1:	ldreqt	r0, [r4]
+	itet	eq				@ explicit IT needed for the 1f label
+	subeq	r2, r2, #4			@ ARM instr at LR - 4
+	subne	r2, r2, #2			@ Thumb instr at LR - 2
+1:	ldreqt	r0, [r2]
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	reveq	r0, r0				@ little endian instruction
 #endif
+	str	r2, [sp, #S_PC]			@ replace regs->ARM_pc with
+						@ pc of faulting instruction
 	@
 	@ r0 = 32-bit ARM instruction which caused the exception
-	@ r2 = PC value for the following instruction (:= regs->ARM_pc)
-	@ r4 = PC value for the faulting instruction
-	@
+	@ r2 = PC value for the faulting instruction (:= regs->ARM_pc)
 	beq	call_fpe

 	@ Thumb instruction
 #if __LINUX_ARM_ARCH__ >= 7
-	sub	r4, r2, #2			@ Thumb instr at LR - 2
 2:
- ARM(	ldrht	r5, [r4], #2	)
- THUMB(	ldrht	r5, [r4]	)
- THUMB(	add	r4, r4, #2	)
+ ARM(	ldrht	r5, [r2], #2	)
+ THUMB(	ldrht	r5, [r2]	)
+ THUMB(	add	r4, r2, #2	)
 	and	r0, r5, #0xf800			@ mask bits 111x x... .... ....
 	cmp	r0, #0xe800			@ 32bit instruction if xx != 0
 	blo	__und_usr_unknown
 3:	ldrht	r0, [r4]
-	add	r2, r2, #2			@ r2 is PC + 2, make it PC + 4
 	orr	r0, r0, r5, lsl #16
 	@
 	@ r0 = the two 16-bit Thumb instructions which caused the exception
-	@ r2 = PC value for the following Thumb instruction (:= regs->ARM_pc+2)
-	@ r4 = PC value for the first 16-bit Thumb instruction
+	@ r2 = PC value for the faulting Thumb instruction (:= regs->ARM_pc)
 	@
 #else
 	b	__und_usr_unknown
@@ -544,7 +546,7 @@ ENDPROC(__und_usr)
  *
  * Emulators may wish to make use of the following registers:
  *  r0  = instruction opcode (32-bit ARM or two 16-bit Thumb)
- *  r2  = PC value to resume execution after successful emulation
+ *  r2  = PC value of the faulting instruction
  *  r9  = normal "successful" return address
  *  r10 = this threads thread_info structure
  *  lr  = unrecognised instruction return address
@@ -662,7 +664,7 @@ do_fpe:
 /*
  * The FP module is called with these registers set:
  *  r0  = instruction
- *  r2  = PC+4
+ *  r2  = PC of the faulting instruction
  *  r9  = normal "successful" return address
  *  r10 = FP workspace
  *  lr  = unrecognised FP instruction return address
diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index 7fa3bb0..0842487 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -80,11 +80,8 @@ ENTRY(iwmmxt_task_enable)

 	ldr	r3, =concan_owner
 	add	r0, r10, #TI_IWMMXT_STATE	@ get task Concan save area
-	ldr	r2, [sp, #60]			@ current task pc value
 	ldr	r1, [r3]			@ get current Concan owner
 	str	r0, [r3]			@ this task now owns Concan regs
-	sub	r2, r2, #4			@ adjust pc back
-	str	r2, [sp, #60]

 	mrc	p15, 0, r2, c2, c0, 0
 	mov	r2, r2				@ cpwait
diff --git a/arch/arm/nwfpe/entry.S b/arch/arm/nwfpe/entry.S
index cafa183..923f03f 100644
--- a/arch/arm/nwfpe/entry.S
+++ b/arch/arm/nwfpe/entry.S
@@ -78,6 +78,7 @@ nwfpe_enter:
 	mov	sl, sp			@ we access the registers via 'sl'

 	ldr	r5, [sp, #S_PC]		@ get contents of PC;
+	add	r5, r5, #4		@ skip first PC, already have opcode
 	mov	r6, r0			@ save the opcode
 emulate:
 	ldr	r1, [sp, #S_PSR]	@ fetch the PSR
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 7292921..73dea8b 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -139,11 +139,6 @@ check_for_exception:
 					@ out before setting an FPEXC that
 					@ stops us reading stuff
 	VFPFMXR	FPEXC, r1		@ Restore FPEXC last
-	sub	r2, r2, #4		@ Retry current instruction - if Thumb
-	str	r2, [sp, #S_PC]		@ mode it's two 16-bit instructions,
-					@ else it's one 32-bit instruction, so
-					@ always subtract 4 from the following
-					@ instruction address.
 #ifdef CONFIG_PREEMPT
 	get_thread_info	r10
 	ldr	r4, [r10, #TI_PREEMPT]	@ get preempt count
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 0797cb5..1142742 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -293,6 +293,12 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct
pt_regs *regs)
 	orig_fpscr = fpscr = fmrx(FPSCR);

 	/*
+	 * Normally the instruction will be emulated, set return PC to be
+	 * the next instruction
+	 */
+	regs->ARM_pc += 4;
+
+	/*
 	 * Check for the special VFP subarch 1 and FPSCR.IXE bit case
 	 */
 	if ((fpsid & FPSID_ARCH_MASK) == (1 << FPSID_ARCH_BIT)

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH] ARM: vfp: Fix up exception location in Thumb mode
  2011-01-27  7:30                                 ` Colin Cross
@ 2011-02-09 18:12                                   ` Colin Cross
  -1 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-02-09 18:12 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Catalin Marinas, linux-kernel, linux-arm-kernel

On Wed, Jan 26, 2011 at 11:30 PM, Colin Cross <ccross@android.com> wrote:
> This patch is on top of your first patch that cleans up the comments,
> but not the second patch that fixes the PC in the VFP exception case.
> Compiled but not run tested, and I can't test crunch or iwmmxt.
> vfpmodule.c may be able to be simplified (right now its both adding
> and subtracting 4 from regs->ARM_pc).
>
>  arch/arm/kernel/crunch-bits.S |    3 ---
>  arch/arm/kernel/entry-armv.S  |   38 ++++++++++++++++++++------------------
>  arch/arm/kernel/iwmmxt.S      |    3 ---
>  arch/arm/nwfpe/entry.S        |    1 +
>  arch/arm/vfp/vfphw.S          |    5 -----
>  arch/arm/vfp/vfpmodule.c      |    6 ++++++
>  6 files changed, 27 insertions(+), 29 deletions(-)

Finally got a chance to do some quick testing on this, it needs one
more change to drop the correction in do_undefinstr in traps.c, and
fix one register in entry-armv.S.  Russell, do you have any interest
in this solution?  Alternatively, a one line change to store the
corrected PC into the stack in vfphw.S instead of entry-armv.S will
fix the original problem (VFP_bounce from thumb mode returning to the
wrong address) without causing the problem introduced by your second
patch (undefined NEON or VFP-D32 instructions in thumb mode returning
to the wrong address).

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] ARM: vfp: Fix up exception location in Thumb mode
@ 2011-02-09 18:12                                   ` Colin Cross
  0 siblings, 0 replies; 64+ messages in thread
From: Colin Cross @ 2011-02-09 18:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 26, 2011 at 11:30 PM, Colin Cross <ccross@android.com> wrote:
> This patch is on top of your first patch that cleans up the comments,
> but not the second patch that fixes the PC in the VFP exception case.
> Compiled but not run tested, and I can't test crunch or iwmmxt.
> vfpmodule.c may be able to be simplified (right now its both adding
> and subtracting 4 from regs->ARM_pc).
>
> ?arch/arm/kernel/crunch-bits.S | ? ?3 ---
> ?arch/arm/kernel/entry-armv.S ?| ? 38 ++++++++++++++++++++------------------
> ?arch/arm/kernel/iwmmxt.S ? ? ?| ? ?3 ---
> ?arch/arm/nwfpe/entry.S ? ? ? ?| ? ?1 +
> ?arch/arm/vfp/vfphw.S ? ? ? ? ?| ? ?5 -----
> ?arch/arm/vfp/vfpmodule.c ? ? ?| ? ?6 ++++++
> ?6 files changed, 27 insertions(+), 29 deletions(-)

Finally got a chance to do some quick testing on this, it needs one
more change to drop the correction in do_undefinstr in traps.c, and
fix one register in entry-armv.S.  Russell, do you have any interest
in this solution?  Alternatively, a one line change to store the
corrected PC into the stack in vfphw.S instead of entry-armv.S will
fix the original problem (VFP_bounce from thumb mode returning to the
wrong address) without causing the problem introduced by your second
patch (undefined NEON or VFP-D32 instructions in thumb mode returning
to the wrong address).

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2011-02-09 18:12 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-14  7:42 [PATCH] ARM: vfp: Fix up exception location in Thumb mode Colin Cross
2011-01-14  7:42 ` Colin Cross
2011-01-14 11:43 ` Catalin Marinas
2011-01-14 11:43   ` Catalin Marinas
2011-01-14 12:02   ` Russell King - ARM Linux
2011-01-14 12:02     ` Russell King - ARM Linux
2011-01-14 14:10     ` Catalin Marinas
2011-01-14 14:10       ` Catalin Marinas
2011-01-14 15:49       ` Russell King - ARM Linux
2011-01-14 15:49         ` Russell King - ARM Linux
2011-01-14 16:23         ` Catalin Marinas
2011-01-14 16:23           ` Catalin Marinas
2011-01-14 16:35           ` Russell King - ARM Linux
2011-01-14 16:35             ` Russell King - ARM Linux
2011-01-14 16:58             ` Catalin Marinas
2011-01-14 16:58               ` Catalin Marinas
2011-01-14 17:30               ` Russell King - ARM Linux
2011-01-14 17:30                 ` Russell King - ARM Linux
2011-01-14 18:47                 ` Russell King - ARM Linux
2011-01-14 18:47                   ` Russell King - ARM Linux
2011-01-14 19:23                   ` Colin Cross
2011-01-14 19:23                     ` Colin Cross
2011-01-14 19:51                     ` Colin Cross
2011-01-14 19:51                       ` Colin Cross
2011-01-14 21:24                       ` Russell King - ARM Linux
2011-01-14 21:24                         ` Russell King - ARM Linux
2011-01-25 23:33                       ` Colin Cross
2011-01-25 23:33                         ` Colin Cross
2011-01-26 11:26                         ` Russell King - ARM Linux
2011-01-26 11:26                           ` Russell King - ARM Linux
2011-01-27  6:11                           ` Colin Cross
2011-01-27  6:11                             ` Colin Cross
2011-01-27  6:35                             ` Colin Cross
2011-01-27  6:35                               ` Colin Cross
2011-01-27  7:30                               ` Colin Cross
2011-01-27  7:30                                 ` Colin Cross
2011-02-09 18:12                                 ` Colin Cross
2011-02-09 18:12                                   ` Colin Cross
2011-01-15 15:38                   ` Catalin Marinas
2011-01-15 15:38                     ` Catalin Marinas
2011-01-15 15:43                     ` Russell King - ARM Linux
2011-01-15 15:43                       ` Russell King - ARM Linux
2011-01-16 11:51                       ` Catalin Marinas
2011-01-16 11:51                         ` Catalin Marinas
2011-01-15 15:31                 ` Catalin Marinas
2011-01-15 15:31                   ` Catalin Marinas
2011-01-15 15:40                   ` Russell King - ARM Linux
2011-01-15 15:40                     ` Russell King - ARM Linux
2011-01-16 11:49                     ` Catalin Marinas
2011-01-16 11:49                       ` Catalin Marinas
2011-01-23 15:51                       ` Russell King - ARM Linux
2011-01-23 15:51                         ` Russell King - ARM Linux
2011-01-25 13:19                         ` Catalin Marinas
2011-01-25 13:19                           ` Catalin Marinas
2011-01-16 21:25                     ` Catalin Marinas
2011-01-16 21:25                       ` Catalin Marinas
2011-01-23 15:46                       ` Russell King - ARM Linux
2011-01-23 15:46                         ` Russell King - ARM Linux
2011-01-25 13:45                         ` Catalin Marinas
2011-01-25 13:45                           ` Catalin Marinas
2011-01-14 16:24       ` Dave Martin
2011-01-14 16:24         ` Dave Martin
2011-01-14 16:52         ` Russell King - ARM Linux
2011-01-14 16:52           ` Russell King - ARM Linux

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.