All of lore.kernel.org
 help / color / mirror / Atom feed
* Change call ABI on PA-RISC
@ 2016-11-12 18:00 John David Anglin
  2016-11-13 18:37 ` Helge Deller
  2016-11-14 13:32 ` Carlos O'Donell
  0 siblings, 2 replies; 14+ messages in thread
From: John David Anglin @ 2016-11-12 18:00 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc@vger.kernel.org List, Jeff Law

I'm thinking about adding a "-mabi=" option to change the call ABI.  Currently, objects larger than 64 bits
in the 32-bit runtime are passed by reference and the callee copies the object when necessary.  This is
opposite to x86 where the caller does the copies.  Most targets are caller copies.

The problem with callee copies is that it doesn't work with openmp.  There are race problems and sometimes
we get internal compiler errors with openmp code due to this problem.  This became apparent when new testcases
were added to gcc-6.  It's tough to fix this problem in gcc.

This is gcc PR:
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68733>.

This is fixed if we change abi to caller copies (maybe "-mabi=gnu" and "-mabi=hp").  The default could be set
by a configure options.  Probably, we would want the new gnu abi on linux as the default.  However, there is
the potential to break stuff during the migration to the new abi.

Opinions?

Dave
--
John David Anglin	dave.anglin@bell.net




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-12 18:00 Change call ABI on PA-RISC John David Anglin
@ 2016-11-13 18:37 ` Helge Deller
  2016-11-13 18:56   ` Jeff Law
  2016-11-14 13:32 ` Carlos O'Donell
  1 sibling, 1 reply; 14+ messages in thread
From: Helge Deller @ 2016-11-13 18:37 UTC (permalink / raw)
  To: John David Anglin; +Cc: linux-parisc, List, Jeff Law

On 12.11.2016 19:00, John David Anglin wrote:
> I'm thinking about adding a "-mabi=" option to change the call ABI.  Currently, objects larger than 64 bits
> in the 32-bit runtime are passed by reference and the callee copies the object when necessary.  This is
> opposite to x86 where the caller does the copies.  Most targets are caller copies.
> 
> The problem with callee copies is that it doesn't work with openmp.  There are race problems and sometimes
> we get internal compiler errors with openmp code due to this problem.  This became apparent when new testcases
> were added to gcc-6.  It's tough to fix this problem in gcc.
> 
> This is gcc PR:
> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68733>.
> 
> This is fixed if we change abi to caller copies (maybe "-mabi=gnu" and "-mabi=hp").  The default could be set
> by a configure options.  Probably, we would want the new gnu abi on linux as the default.  However, there is
> the potential to break stuff during the migration to the new abi.
> 
> Opinions?

PA-RISC Linux is a niche platform. We always had problems with platform-specifics
which were different to the more widely-used platforms (e.g. stack-grows-upwards, 
more signal numbers (>32) than other platforms, EWOULDBLOCK != EAGAIN, ...).
That said, I like your proposal if we then gain openmp support and if it doesn't
heavily breaks other stuff.

If you are going to change the ABI, maybe we can add more things as well?
Which comes to my mind here is for example an optimized mcount() function 
which allows changing the return pointer (see -mmcount-ra-address on MIPS) ?

Helge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-13 18:37 ` Helge Deller
@ 2016-11-13 18:56   ` Jeff Law
  2016-11-13 19:48     ` Helge Deller
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2016-11-13 18:56 UTC (permalink / raw)
  To: Helge Deller, John David Anglin; +Cc: linux-parisc

On 11/13/2016 11:37 AM, Helge Deller wrote:
> On 12.11.2016 19:00, John David Anglin wrote:
>> I'm thinking about adding a "-mabi=" option to change the call ABI.  Currently, objects larger than 64 bits
>> in the 32-bit runtime are passed by reference and the callee copies the object when necessary.  This is
>> opposite to x86 where the caller does the copies.  Most targets are caller copies.
>>
>> The problem with callee copies is that it doesn't work with openmp.  There are race problems and sometimes
>> we get internal compiler errors with openmp code due to this problem.  This became apparent when new testcases
>> were added to gcc-6.  It's tough to fix this problem in gcc.
>>
>> This is gcc PR:
>> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68733>.
>>
>> This is fixed if we change abi to caller copies (maybe "-mabi=gnu" and "-mabi=hp").  The default could be set
>> by a configure options.  Probably, we would want the new gnu abi on linux as the default.  However, there is
>> the potential to break stuff during the migration to the new abi.
>>
>> Opinions?
>
> PA-RISC Linux is a niche platform. We always had problems with platform-specifics
> which were different to the more widely-used platforms (e.g. stack-grows-upwards,
> more signal numbers (>32) than other platforms, EWOULDBLOCK != EAGAIN, ...).
> That said, I like your proposal if we then gain openmp support and if it doesn't
> heavily breaks other stuff.
The motivation was to allow the callee to avoid the copy when it knew it 
wouldn't be changing the object.  And you could do that through a chain 
of calls avoiding lots of copies along the way.

This turned out to be a huge impact for co-locating the mach kernel and 
OS personality sever within a single address space.  We started that 
research with a pre-bugfix compiler.  During the research I fixed GCC's 
implementation to match the ABI (in response to a bug report I'm sure) 
and didn't think much of it.  When we finally got the co-located stuff 
working and compared it to separate address spaces baselines we'd 
gathered earlier the performance gains were huge and we were exceedingly 
happy.

Of course, I wanted to understand why -- so I dug deeper and eventually 
found that MIG would generate interface code which passed around things 
by-value all the time.  The compiler bugfix essentially allowed the 
compiler to avoid the copy in the callee because the callee didn't 
modify the object.  So we were avoiding a ton of memcpy traffic.

In the end the impact of the compiler bugfix was actually larger than 
the primary effect we were looking for (avoiding context switches, tlb 
flushing, caching effects, etc).

Anyway, as long as the world gets rebuilt and you never mix-match 
objects this should be safe.


> If you are going to change the ABI, maybe we can add more things as well?
> Which comes to my mind here is for example an optimized mcount() function
> which allows changing the return pointer (see -mmcount-ra-address on MIPS) ?
As in twiddling RP to return to a different point?  That's an 
exceedingly bad idea on PA8000 and beyond -- it totally hoses the branch 
predictors.  That's why we turned off the twiddle RP in the delay slot 
of a call to emulate a branch after returning from a call.


jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-13 18:56   ` Jeff Law
@ 2016-11-13 19:48     ` Helge Deller
  2016-11-14  8:21       ` Jeff Law
  0 siblings, 1 reply; 14+ messages in thread
From: Helge Deller @ 2016-11-13 19:48 UTC (permalink / raw)
  To: Jeff Law, John David Anglin; +Cc: linux-parisc

On 13.11.2016 19:56, Jeff Law wrote:
> On 11/13/2016 11:37 AM, Helge Deller wrote:
>> If you are going to change the ABI, maybe we can add more things as well?
>> Which comes to my mind here is for example an optimized mcount() function
>> which allows changing the return pointer (see -mmcount-ra-address on MIPS) ?

> As in twiddling RP to return to a different point?

No, that's not the use case for me.

I was working on the ftrace functionality in the Linux kernel.
I'd need to look up the full details again, but as far as I remember one of 
the tracers wants to know the function to which the caller of mcount() would return,
so some kind of simple __builtin_return_address(2).

Helge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-13 19:48     ` Helge Deller
@ 2016-11-14  8:21       ` Jeff Law
  2016-11-14 15:11         ` John David Anglin
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2016-11-14  8:21 UTC (permalink / raw)
  To: Helge Deller, John David Anglin; +Cc: linux-parisc

On 11/13/2016 12:48 PM, Helge Deller wrote:
> On 13.11.2016 19:56, Jeff Law wrote:
>> On 11/13/2016 11:37 AM, Helge Deller wrote:
>>> If you are going to change the ABI, maybe we can add more things as well?
>>> Which comes to my mind here is for example an optimized mcount() function
>>> which allows changing the return pointer (see -mmcount-ra-address on MIPS) ?
>
>> As in twiddling RP to return to a different point?
>
> No, that's not the use case for me.
>
> I was working on the ftrace functionality in the Linux kernel.
> I'd need to look up the full details again, but as far as I remember one of
> the tracers wants to know the function to which the caller of mcount() would return,
> so some kind of simple __builtin_return_address(2).
Ah.  Isn't that going to be sitting at sp-20 or something like that.  My 
PA is rusty, but my recollection is that's supposed to be at a fixed 
location in the frame.
jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-12 18:00 Change call ABI on PA-RISC John David Anglin
  2016-11-13 18:37 ` Helge Deller
@ 2016-11-14 13:32 ` Carlos O'Donell
  1 sibling, 0 replies; 14+ messages in thread
From: Carlos O'Donell @ 2016-11-14 13:32 UTC (permalink / raw)
  To: John David Anglin
  Cc: Helge Deller, linux-parisc@vger.kernel.org List, Jeff Law

On Sat, Nov 12, 2016 at 1:00 PM, John David Anglin <dave.anglin@bell.net> wrote:
> I'm thinking about adding a "-mabi=" option to change the call ABI.  Currently, objects larger than 64 bits
> in the 32-bit runtime are passed by reference and the callee copies the object when necessary.  This is
> opposite to x86 where the caller does the copies.  Most targets are caller copies.
>
> The problem with callee copies is that it doesn't work with openmp.  There are race problems and sometimes
> we get internal compiler errors with openmp code due to this problem.  This became apparent when new testcases
> were added to gcc-6.  It's tough to fix this problem in gcc.
>
> This is gcc PR:
> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68733>.
>
> This is fixed if we change abi to caller copies (maybe "-mabi=gnu" and "-mabi=hp").  The default could be set
> by a configure options.  Probably, we would want the new gnu abi on linux as the default.  However, there is
> the potential to break stuff during the migration to the new abi.
>
> Opinions?

I think this is a good idea.

Anything that brings us slowly towards parity with other arches is
better in the long-run.

Any idea how to test the impact?

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14  8:21       ` Jeff Law
@ 2016-11-14 15:11         ` John David Anglin
  2016-11-14 16:24           ` Helge Deller
  0 siblings, 1 reply; 14+ messages in thread
From: John David Anglin @ 2016-11-14 15:11 UTC (permalink / raw)
  To: Jeff Law, Helge Deller; +Cc: linux-parisc

On 2016-11-14 3:21 AM, Jeff Law wrote:
> On 11/13/2016 12:48 PM, Helge Deller wrote:
>> On 13.11.2016 19:56, Jeff Law wrote:
>>> On 11/13/2016 11:37 AM, Helge Deller wrote:
>>>> If you are going to change the ABI, maybe we can add more things as 
>>>> well?
>>>> Which comes to my mind here is for example an optimized mcount() 
>>>> function
>>>> which allows changing the return pointer (see -mmcount-ra-address 
>>>> on MIPS) ?
>>
>>> As in twiddling RP to return to a different point?
>>
>> No, that's not the use case for me.
>>
>> I was working on the ftrace functionality in the Linux kernel.
>> I'd need to look up the full details again, but as far as I remember 
>> one of
>> the tracers wants to know the function to which the caller of 
>> mcount() would return,
>> so some kind of simple __builtin_return_address(2).
> Ah.  Isn't that going to be sitting at sp-20 or something like that.  
> My PA is rusty, but my recollection is that's supposed to be at a 
> fixed location in the frame.
The return address of the the function to which the caller of mcount() 
would return
is passed to mcount() in %r26.  The saved value in the frame is not 
directly useful as one lacks
the frame offset of the routine calling mcount.

In linux, we have a single space runtime and thus the import stubs don't 
modify the return
address when shared library routines are called.  So, the value passed 
in %r26 can be used
directly to lookup the calling function.

Dave

-- 
John David Anglin  dave.anglin@bell.net


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14 15:11         ` John David Anglin
@ 2016-11-14 16:24           ` Helge Deller
  2016-11-14 17:02             ` John David Anglin
                               ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Helge Deller @ 2016-11-14 16:24 UTC (permalink / raw)
  To: John David Anglin, Jeff Law; +Cc: linux-parisc

On 14.11.2016 16:11, John David Anglin wrote:
> On 2016-11-14 3:21 AM, Jeff Law wrote:
>> On 11/13/2016 12:48 PM, Helge Deller wrote:
>>> On 13.11.2016 19:56, Jeff Law wrote:
>>>> On 11/13/2016 11:37 AM, Helge Deller wrote:
>>>>> If you are going to change the ABI, maybe we can add more
>>>>> things as well? Which comes to my mind here is for example an
>>>>> optimized mcount() function which allows changing the return
>>>>> pointer (see -mmcount-ra-address on MIPS) ?
>>> 
>>>> As in twiddling RP to return to a different point?
>>> 
>>> No, that's not the use case for me.

I was wrong.
It's actually the use case to modify the RP...

>>> I was working on the ftrace functionality in the Linux kernel. 
>>> I'd need to look up the full details again, but as far as I
>>> remember one of the tracers wants to know the function to which
>>> the caller of mcount() would return, so some kind of simple
>>> __builtin_return_address(2).

>> Ah.  Isn't that going to be sitting at sp-20 or something like
>> that.  My PA is rusty, but my recollection is that's supposed to be
>> at a fixed location in the frame.

Yes, sp-10.

0000000000000000 <irq_to_desc>:
   0:   08 03 02 41     copy r3,r1
   4:   0f c2 12 c1     std rp,-10(sp)
   8:   08 1e 02 43     copy sp,r3
   c:   73 c1 00 a8     std,ma r1,50(sp)
  10:   37 dd 00 20     ldo 10(sp),ret1
  14:   0c 65 12 d0     std r5,8(r3)
  18:   00 00 14 b9     mfia r25
  1c:   db 45 0b e0     extrd,u,* r26,63,32,r5
  20:   70 64 00 20     std r4,10(r3)
  24:   08 02 02 5a     copy rp,r26
  28:   37 39 3f d1     ldo -18(r25),r25
  2c:   08 1b 02 44     copy dp,r4
  30:   2b 60 00 00     addil L%0,dp,r1
  34:   50 21 00 00     ldd 0(r1),r1
  38:   0c 20 10 c1     ldd 0(r1),r1
  3c:   50 22 00 20     ldd 10(r1),rp
  40:   e8 40 f0 00     bve,l (rp),rp
  44:   50 3b 00 30     ldd 18(r1),dp


> The return address of the the function to which the caller of
> mcount() would return is passed to mcount() in %r26.  The saved value
> in the frame is not directly useful as one lacks the frame offset of
> the routine calling mcount.

What I want to archieve is to modify the return pointer, in order
to be able to track when the function returns to his caller.
The kernel ftracer uses this then to generate call stacks and to
time the function.
Looking at the above code, it should then be possible for me
to modify -10(r3), but is there a guarantee that it's always at
-10(r3) and that r3 is used?
That's the reason I asked if we could modify mcount to
give the address (in the stack) of the return pointer, but maybe
it's just overkill for this use case ?

Helge 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14 16:24           ` Helge Deller
@ 2016-11-14 17:02             ` John David Anglin
  2016-11-14 17:02             ` Jeff Law
  2016-11-14 17:23             ` Carlos O'Donell
  2 siblings, 0 replies; 14+ messages in thread
From: John David Anglin @ 2016-11-14 17:02 UTC (permalink / raw)
  To: Helge Deller, Jeff Law; +Cc: linux-parisc

On 2016-11-14 11:24 AM, Helge Deller wrote:
> On 14.11.2016 16:11, John David Anglin wrote:
>> On 2016-11-14 3:21 AM, Jeff Law wrote:
>>> On 11/13/2016 12:48 PM, Helge Deller wrote:
>>>> On 13.11.2016 19:56, Jeff Law wrote:
>>>>> On 11/13/2016 11:37 AM, Helge Deller wrote:
>>>>>> If you are going to change the ABI, maybe we can add more
>>>>>> things as well? Which comes to my mind here is for example an
>>>>>> optimized mcount() function which allows changing the return
>>>>>> pointer (see -mmcount-ra-address on MIPS) ?
>>>>> As in twiddling RP to return to a different point?
>>>> No, that's not the use case for me.
> I was wrong.
> It's actually the use case to modify the RP...
>
>>>> I was working on the ftrace functionality in the Linux kernel.
>>>> I'd need to look up the full details again, but as far as I
>>>> remember one of the tracers wants to know the function to which
>>>> the caller of mcount() would return, so some kind of simple
>>>> __builtin_return_address(2).
>>> Ah.  Isn't that going to be sitting at sp-20 or something like
>>> that.  My PA is rusty, but my recollection is that's supposed to be
>>> at a fixed location in the frame.
> Yes, sp-10.
It's sp-0x10 on 64-bit and sp-0x14 on 32-bit.
>
> 0000000000000000 <irq_to_desc>:
>     0:   08 03 02 41     copy r3,r1
>     4:   0f c2 12 c1     std rp,-10(sp)
>     8:   08 1e 02 43     copy sp,r3
>     c:   73 c1 00 a8     std,ma r1,50(sp)
>    10:   37 dd 00 20     ldo 10(sp),ret1
>    14:   0c 65 12 d0     std r5,8(r3)
>    18:   00 00 14 b9     mfia r25
>    1c:   db 45 0b e0     extrd,u,* r26,63,32,r5
>    20:   70 64 00 20     std r4,10(r3)
>    24:   08 02 02 5a     copy rp,r26
>    28:   37 39 3f d1     ldo -18(r25),r25
>    2c:   08 1b 02 44     copy dp,r4
>    30:   2b 60 00 00     addil L%0,dp,r1
>    34:   50 21 00 00     ldd 0(r1),r1
>    38:   0c 20 10 c1     ldd 0(r1),r1
>    3c:   50 22 00 20     ldd 10(r1),rp
>    40:   e8 40 f0 00     bve,l (rp),rp
>    44:   50 3b 00 30     ldd 18(r1),dp
>
>
>> The return address of the the function to which the caller of
>> mcount() would return is passed to mcount() in %r26.  The saved value
>> in the frame is not directly useful as one lacks the frame offset of
>> the routine calling mcount.
> What I want to archieve is to modify the return pointer, in order
> to be able to track when the function returns to his caller.
> The kernel ftracer uses this then to generate call stacks and to
> time the function.
> Looking at the above code, it should then be possible for me
> to modify -10(r3), but is there a guarantee that it's always at
> -10(r3) and that r3 is used?
The location for the for saving the return pointer is defined and always 
the same.

If routine is compiled with "-fno-omit-frame-pointer", there will always 
be a
frame pointer and it should be %r3.  Otherwise, there's no guarantee 
that the
frame pointer won't be eliminated.

Since the routine calls mcount, I believe the return pointer will be 
always be saved
and restored from the frame marker.  So, it should be possible the 
return address
on the stack to detect when the routine completes..

> That's the reason I asked if we could modify mcount to
> give the address (in the stack) of the return pointer, but maybe
> it's just overkill for this use case ?

If "-fno-omit-frame-pointer" works, then you have what you need.

Dave

-- 
John David Anglin  dave.anglin@bell.net


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14 16:24           ` Helge Deller
  2016-11-14 17:02             ` John David Anglin
@ 2016-11-14 17:02             ` Jeff Law
  2016-11-14 17:23             ` Carlos O'Donell
  2 siblings, 0 replies; 14+ messages in thread
From: Jeff Law @ 2016-11-14 17:02 UTC (permalink / raw)
  To: Helge Deller, John David Anglin; +Cc: linux-parisc

On 11/14/2016 09:24 AM, Helge Deller wrote:
> On 14.11.2016 16:11, John David Anglin wrote:
>> On 2016-11-14 3:21 AM, Jeff Law wrote:
>>> On 11/13/2016 12:48 PM, Helge Deller wrote:
>>>> On 13.11.2016 19:56, Jeff Law wrote:
>>>>> On 11/13/2016 11:37 AM, Helge Deller wrote:
>>>>>> If you are going to change the ABI, maybe we can add more
>>>>>> things as well? Which comes to my mind here is for example an
>>>>>> optimized mcount() function which allows changing the return
>>>>>> pointer (see -mmcount-ra-address on MIPS) ?
>>>>
>>>>> As in twiddling RP to return to a different point?
>>>>
>>>> No, that's not the use case for me.
>
> I was wrong.
> It's actually the use case to modify the RP...
>
>>>> I was working on the ftrace functionality in the Linux kernel.
>>>> I'd need to look up the full details again, but as far as I
>>>> remember one of the tracers wants to know the function to which
>>>> the caller of mcount() would return, so some kind of simple
>>>> __builtin_return_address(2).
>
>>> Ah.  Isn't that going to be sitting at sp-20 or something like
>>> that.  My PA is rusty, but my recollection is that's supposed to be
>>> at a fixed location in the frame.
>
> Yes, sp-10.
>
> 0000000000000000 <irq_to_desc>:
>    0:   08 03 02 41     copy r3,r1
>    4:   0f c2 12 c1     std rp,-10(sp)
>    8:   08 1e 02 43     copy sp,r3
>    c:   73 c1 00 a8     std,ma r1,50(sp)
>   10:   37 dd 00 20     ldo 10(sp),ret1
>   14:   0c 65 12 d0     std r5,8(r3)
>   18:   00 00 14 b9     mfia r25
>   1c:   db 45 0b e0     extrd,u,* r26,63,32,r5
>   20:   70 64 00 20     std r4,10(r3)
>   24:   08 02 02 5a     copy rp,r26
>   28:   37 39 3f d1     ldo -18(r25),r25
>   2c:   08 1b 02 44     copy dp,r4
>   30:   2b 60 00 00     addil L%0,dp,r1
>   34:   50 21 00 00     ldd 0(r1),r1
>   38:   0c 20 10 c1     ldd 0(r1),r1
>   3c:   50 22 00 20     ldd 10(r1),rp
>   40:   e8 40 f0 00     bve,l (rp),rp
>   44:   50 3b 00 30     ldd 18(r1),dp
>
>
>> The return address of the the function to which the caller of
>> mcount() would return is passed to mcount() in %r26.  The saved value
>> in the frame is not directly useful as one lacks the frame offset of
>> the routine calling mcount.
>
> What I want to archieve is to modify the return pointer, in order
> to be able to track when the function returns to his caller.
> The kernel ftracer uses this then to generate call stacks and to
> time the function.
> Looking at the above code, it should then be possible for me
> to modify -10(r3), but is there a guarantee that it's always at
> -10(r3) and that r3 is used?
I don't think you can depend on r3 (frame pointer).  But IIRC the save 
slot for the return address is fixed.  I think the problem you're going 
to run into is that the RP is saved into the frame allocated by the 
caller, then we adjust the SP for the current function.  So without 
knowing the size of the current function's frame, you're hosed.

Jeff


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14 16:24           ` Helge Deller
  2016-11-14 17:02             ` John David Anglin
  2016-11-14 17:02             ` Jeff Law
@ 2016-11-14 17:23             ` Carlos O'Donell
  2016-11-15 18:30               ` Helge Deller
  2 siblings, 1 reply; 14+ messages in thread
From: Carlos O'Donell @ 2016-11-14 17:23 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, Jeff Law, linux-parisc

On Mon, Nov 14, 2016 at 11:24 AM, Helge Deller <deller@gmx.de> wrote:
> What I want to archieve is to modify the return pointer, in order
> to be able to track when the function returns to his caller.
> The kernel ftracer uses this then to generate call stacks and to
> time the function.
> Looking at the above code, it should then be possible for me
> to modify -10(r3), but is there a guarantee that it's always at
> -10(r3) and that r3 is used?
> That's the reason I asked if we could modify mcount to
> give the address (in the stack) of the return pointer, but maybe
> it's just overkill for this use case ?

Why do you need to modify the value?

AFAICT ftrace uses these values heuristically e.g.
HAVE_FUNCTION_GRAPH_FP_TEST, HAVE_FUNCTION_GRAPH_RET_ADDR_PTR?

For example the only uses I see are in ftrace_pop_return_trace() and
they are purely heuristic.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-14 17:23             ` Carlos O'Donell
@ 2016-11-15 18:30               ` Helge Deller
  2016-11-16  0:04                 ` John David Anglin
  0 siblings, 1 reply; 14+ messages in thread
From: Helge Deller @ 2016-11-15 18:30 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: John David Anglin, Jeff Law, linux-parisc

Hi Carlos,

On 14.11.2016 18:23, Carlos O'Donell wrote:
> On Mon, Nov 14, 2016 at 11:24 AM, Helge Deller <deller@gmx.de> wrote:
>> What I want to archieve is to modify the return pointer, in order
>> to be able to track when the function returns to his caller.
>> The kernel ftracer uses this then to generate call stacks and to
>> time the function.
>> Looking at the above code, it should then be possible for me
>> to modify -10(r3), but is there a guarantee that it's always at
>> -10(r3) and that r3 is used?
>> That's the reason I asked if we could modify mcount to
>> give the address (in the stack) of the return pointer, but maybe
>> it's just overkill for this use case ?
> 
> Why do you need to modify the value?
> 
> AFAICT ftrace uses these values heuristically e.g.
> HAVE_FUNCTION_GRAPH_FP_TEST, HAVE_FUNCTION_GRAPH_RET_ADDR_PTR?
> 
> For example the only uses I see are in ftrace_pop_return_trace() and
> they are purely heuristic.

Maybe I get you wrong, but do you suggest I don't need to implement
it because it's just heuristic ?
(I agree, it's not top priority, but nice to have nevertheless)

Helge


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-15 18:30               ` Helge Deller
@ 2016-11-16  0:04                 ` John David Anglin
  2016-11-17 20:19                   ` Carlos O'Donell
  0 siblings, 1 reply; 14+ messages in thread
From: John David Anglin @ 2016-11-16  0:04 UTC (permalink / raw)
  To: Helge Deller; +Cc: Carlos O'Donell, Jeff Law, linux-parisc

Hi Helge,

On 2016-11-15, at 1:30 PM, Helge Deller wrote:

> Hi Carlos,
> 
> On 14.11.2016 18:23, Carlos O'Donell wrote:
>> On Mon, Nov 14, 2016 at 11:24 AM, Helge Deller <deller@gmx.de> wrote:
>>> What I want to archieve is to modify the return pointer, in order
>>> to be able to track when the function returns to his caller.
>>> The kernel ftracer uses this then to generate call stacks and to
>>> time the function.
>>> Looking at the above code, it should then be possible for me
>>> to modify -10(r3), but is there a guarantee that it's always at
>>> -10(r3) and that r3 is used?
>>> That's the reason I asked if we could modify mcount to
>>> give the address (in the stack) of the return pointer, but maybe
>>> it's just overkill for this use case ?
>> 
>> Why do you need to modify the value?
>> 
>> AFAICT ftrace uses these values heuristically e.g.
>> HAVE_FUNCTION_GRAPH_FP_TEST, HAVE_FUNCTION_GRAPH_RET_ADDR_PTR?
>> 
>> For example the only uses I see are in ftrace_pop_return_trace() and
>> they are purely heuristic.
> 
> Maybe I get you wrong, but do you suggest I don't need to implement
> it because it's just heuristic ?
> (I agree, it's not top priority, but nice to have nevertheless)


Probably, Carlos is referring to glibc code.

Dave
--
John David Anglin	dave.anglin@bell.net




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Change call ABI on PA-RISC
  2016-11-16  0:04                 ` John David Anglin
@ 2016-11-17 20:19                   ` Carlos O'Donell
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos O'Donell @ 2016-11-17 20:19 UTC (permalink / raw)
  To: John David Anglin; +Cc: Helge Deller, Jeff Law, linux-parisc

On Tue, Nov 15, 2016 at 7:04 PM, John David Anglin <dave.anglin@bell.net> wrote:
> Hi Helge,
>
> On 2016-11-15, at 1:30 PM, Helge Deller wrote:
>
>> Hi Carlos,
>>
>> On 14.11.2016 18:23, Carlos O'Donell wrote:
>>> On Mon, Nov 14, 2016 at 11:24 AM, Helge Deller <deller@gmx.de> wrote:
>>>> What I want to archieve is to modify the return pointer, in order
>>>> to be able to track when the function returns to his caller.
>>>> The kernel ftracer uses this then to generate call stacks and to
>>>> time the function.
>>>> Looking at the above code, it should then be possible for me
>>>> to modify -10(r3), but is there a guarantee that it's always at
>>>> -10(r3) and that r3 is used?
>>>> That's the reason I asked if we could modify mcount to
>>>> give the address (in the stack) of the return pointer, but maybe
>>>> it's just overkill for this use case ?
>>>
>>> Why do you need to modify the value?
>>>
>>> AFAICT ftrace uses these values heuristically e.g.
>>> HAVE_FUNCTION_GRAPH_FP_TEST, HAVE_FUNCTION_GRAPH_RET_ADDR_PTR?
>>>
>>> For example the only uses I see are in ftrace_pop_return_trace() and
>>> they are purely heuristic.
>>
>> Maybe I get you wrong, but do you suggest I don't need to implement
>> it because it's just heuristic ?
>> (I agree, it's not top priority, but nice to have nevertheless)
>
>
> Probably, Carlos is referring to glibc code.

No, I'm referring to linux kernel code, there is no ftrace in glibc.

The above two conditionals in the kernel are only heuristic checks to
terminate ftrace unwinds with errors.

You need not implement them to have ftrace working.

kernel/trace/trace_functions_graph.c

 173 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
 174         current->ret_stack[index].fp = frame_pointer;
 175 #endif
 176 #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
 177         current->ret_stack[index].retp = retp;
 178 #endif


 211 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST
 212         /*
 213          * The arch may choose to record the frame pointer used
 214          * and check it here to make sure that it is what we expect it
 215          * to be. If gcc does not set the place holder of the return
 216          * address in the frame pointer, and does a copy instead, then
 217          * the function graph trace will fail. This test detects this
 218          * case.
 219          *
 220          * Currently, x86_32 with optimize for size (-Os) makes the latest
 221          * gcc do the above.
 222          *
 223          * Note, -mfentry does not use frame pointers, and this test
 224          *  is not needed if CC_USING_FENTRY is set.
 225          */
 226         if (unlikely(current->ret_stack[index].fp != frame_pointer)) {
 227                 ftrace_graph_stop();
 228                 WARN(1, "Bad frame pointer: expected %lx, received %lx\n"
 229                      "  from func %ps return to %lx\n",
 230                      current->ret_stack[index].fp,
 231                      frame_pointer,
 232                      (void *)current->ret_stack[index].func,
 233                      current->ret_stack[index].ret);
 234                 *ret = (unsigned long)panic;
 235                 return;
 236         }
 237 #endif

 286 /**
 287  * ftrace_graph_ret_addr - convert a potentially modified stack
return address
 288  *                         to its original value
 289  *
 290  * This function can be called by stack unwinding code to convert
a found stack
 291  * return address ('ret') to its original value, in case the function graph
 292  * tracer has modified it to be 'return_to_handler'.  If the address hasn't
 293  * been modified, the unchanged value of 'ret' is returned.
 294  *
 295  * 'idx' is a state variable which should be initialized by the
caller to zero
 296  * before the first call.
 297  *
 298  * 'retp' is a pointer to the return address on the stack.  It's ignored if
 299  * the arch doesn't have HAVE_FUNCTION_GRAPH_RET_ADDR_PTR defined.
 300  */
 301 #ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
...
 317         for (i = 0; i <= index; i++)
 318                 if (task->ret_stack[i].retp == retp)
 319                         return task->ret_stack[i].ret;
 320
 321         return ret;
 322 }
 323 #else /* !HAVE_FUNCTION_GRAPH_RET_ADDR_PTR */

So you see the FP is used for heuristics to terminate a trace.

In the case of RP, I expect you should always have access to
save/restore RP from the sp-10 slot and that should be enough?

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-11-17 20:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-12 18:00 Change call ABI on PA-RISC John David Anglin
2016-11-13 18:37 ` Helge Deller
2016-11-13 18:56   ` Jeff Law
2016-11-13 19:48     ` Helge Deller
2016-11-14  8:21       ` Jeff Law
2016-11-14 15:11         ` John David Anglin
2016-11-14 16:24           ` Helge Deller
2016-11-14 17:02             ` John David Anglin
2016-11-14 17:02             ` Jeff Law
2016-11-14 17:23             ` Carlos O'Donell
2016-11-15 18:30               ` Helge Deller
2016-11-16  0:04                 ` John David Anglin
2016-11-17 20:19                   ` Carlos O'Donell
2016-11-14 13:32 ` Carlos O'Donell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.