* Runtime code modification fails on arm
From: Papalagi Pakeha @ 2009-11-10 13:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I've got a program that is stored partially encrypted on the
filesystem and should decrypt itself at runtime after retrieving the
key from the hardware.

Essentially the implementation puts some of the program functions into
a separate ELF section (.cryptext) and then a helper script encrypts
this section directly in the binary file. Offset and size are
determined using "objdump -h".
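
As a rough illustration of that encryption step (a sketch only; the actual helper script is not shown in this message), the following C function XORs a byte range of a file in place. The name xor_file_range and the single-byte XOR key are assumptions, chosen to match the placeholder "encryption" used in decrypt() further down; the offset and size would be taken from the "File off" and "Size" columns of "objdump -h".

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): XOR `size` bytes of
 * the file `path`, starting at byte `offset`, with `key`.
 * Returns 0 on success, -1 on any I/O error. */
static int xor_file_range(const char *path, long offset, long size,
                          unsigned char key)
{
    FILE *f = fopen(path, "r+b");   /* update mode: read and write in place */
    long i;

    if (f == NULL)
        return -1;

    for (i = 0; i < size; i++) {
        int c;

        /* Position on the byte and read it. */
        if (fseek(f, offset + i, SEEK_SET) != 0 || (c = fgetc(f)) == EOF)
            goto fail;
        /* A seek is required when switching from reading to writing. */
        if (fseek(f, offset + i, SEEK_SET) != 0 || fputc(c ^ key, f) == EOF)
            goto fail;
    }
    return fclose(f) == 0 ? 0 : -1;

fail:
    fclose(f);
    return -1;
}
```

Because XOR is its own inverse, running the same helper twice over the same range restores the original bytes, which is what the in-memory decryptor relies on.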

When the program is started it finds the address of the encrypted
function, its length and decrypts it back to the original valid
instructions. This all works just fine on x86 but the same approach
fails on ARM. There the decryptor can read the encrypted code, can
write back the decrypted code, can verify that the code has been
written, but once the function is called it segfaults or dies on an
invalid instruction. To me it looks like the changed code is not
picked up and the cpu still tries to run the old, encrypted one.

Why is this happening? What is so different between x86 and ARM in
that field? I'm aware that my problem shows up in userspace, not in
the kernel. I'm sorry if it's way off topic here.

Some relevant snippets of my code.

// This function will get encrypted
static char *treasure(void) __attribute__((section (".cryptext")));
static char *treasure(void)
{
        return message;
}

static void decrypt(void)
{
        unsigned char *addr;

        // decrypt all bytes of the treasure() function; treasure_key
        // marks the end of the .cryptext section
        for (addr = (unsigned char *)treasure;
             addr < (unsigned char *)treasure_key; addr++)
                *addr ^= 0x01;    // For now this is our "encryption"
}

To make .cryptext writable and to find out how long the treasure()
function is, I link the above with a simple asm file:
// cryptext-section-arm.s
        .section        .cryptext,"axw",%progbits
        .type   treasure_key, %function
        .globl  treasure_key
treasure_key:
        .size   treasure_key, .-treasure_key

Note the "axw" in the .section declaration: the "w" makes the section
writable and lets decrypt() update the code. Objdump confirms that:
 11 .text         000003b8  00008388  00008388  00000388  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .cryptext     0000001c  00008740  00008740  00000740  2**2
                  CONTENTS, ALLOC, LOAD, CODE    <== this is not READONLY

Complete source demonstrating the behaviour is here:
http://cryptext.s3.amazonaws.com/cryptext.tar.gz - can be compiled
either for x86 or for ARM

Unpacked files are available from here for your convenience:
http://cryptext.s3.amazonaws.com/index.html

Does anyone have an idea what could be wrong here? Why does it work on
x86 but fail to pick up the decrypted instructions on ARM? FWIW we
use Samsung S3C2410 with kernel 2.6.27 and gcc 4.2.4 with glibc 2.3.3
toolchain.

Any hints are welcome!

Thanks

PaPa


* Runtime code modification fails on arm
From: Jamie Lokier @ 2009-11-10 13:17 UTC (permalink / raw)
  To: linux-arm-kernel

Papalagi Pakeha wrote:
> Does anyone have an idea what could be wrong here? Why does it work on
> x86 but fail to pick up the decrypted instructions on ARM? FWIW we
> use Samsung S3C2410 with kernel 2.6.27 and gcc 4.2.4 with glibc 2.3.3
> toolchain.

On ARM, you have to flush the cache whenever you modify code, or even
generate code in a new area.  There is a system call for this.

Only the cache covering the modified range needs to be flushed.  The
flush must clean the D-cache and invalidate the I-cache for that range.

You have to do this on several other architectures too.  x86 is
unusual in not needing it.

-- Jamie
 


* Runtime code modification fails on arm
From: Matthieu CASTET @ 2009-11-10 13:19 UTC (permalink / raw)
  To: linux-arm-kernel

Papalagi Pakeha wrote:
> Hi,
> 
> I've got a program that is stored partially encrypted on the
> filesystem and should decrypt itself at runtime after retrieving the
> key from the hardware.
> 
> Essentially the implementation puts some of the program functions into
> a separate ELF section (.cryptext) and then a helper script encrypts
> this section directly in the binary file. Offset and size are
> determined using "objdump -h".
> 
> When the program is started it finds the address of the encrypted
> function, its length and decrypts it back to the original valid
> instructions. This all works just fine on x86 but the same approach
> fails on ARM. There the decryptor can read the encrypted code, can
> write back the decrypted code, can verify that the code has been
> written but once the function is called it segfaults or dies on
> invalid instruction. To me it looks like the changed code is not
> picked up and the cpu still tries to run the old, encrypted one.
> 
> Why is this happening? What is so different between x86 and ARM in
> that field? I'm aware that my problem exhibits in userspace, not in
> the kernel. I'm sorry if it's way off topic here.

You need to flush the data cache and invalidate the instruction cache.
For that you can use the __ARM_NR_cacheflush syscall.
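
A minimal sketch of calling it directly, assuming 32-bit ARM Linux with kernel headers that define __ARM_NR_cacheflush in <asm/unistd.h>. The wrapper name sync_icache is made up for illustration, the third syscall argument (flags) is passed as 0, and the fallback branch for other targets is an assumption added only so the snippet builds elsewhere.

```c
#include <unistd.h>
#include <sys/syscall.h>
#if defined(__arm__) && defined(__linux__)
#include <asm/unistd.h>   /* expected to define __ARM_NR_cacheflush */
#endif

/* Hypothetical wrapper: make code written to [begin, end) visible to
 * instruction fetch. Returns 0 on success, -1 on error. */
static int sync_icache(void *begin, void *end)
{
#if defined(__arm__) && defined(__linux__) && defined(__ARM_NR_cacheflush)
    /* ARM-private syscall taking (start, end, flags). */
    return (int)syscall(__ARM_NR_cacheflush, begin, end, 0);
#else
    /* Assumption: on other targets, let the compiler builtin do whatever
     * the architecture requires (nothing at all on x86). */
    __builtin___clear_cache((char *)begin, (char *)end);
    return 0;
#endif
}
```

The call has to come after the last write to the range and before the first jump into it.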


Matthieu


* Runtime code modification fails on arm
From: Russell King - ARM Linux @ 2009-11-10 19:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 11, 2009 at 02:08:27AM +1300, Papalagi Pakeha wrote:
> Why is this happening? What is so different between x86 and ARM in
> that field? I'm aware that my problem exhibits in userspace, not in
> the kernel. I'm sorry if it's way off topic here.

ARM CPUs tend to have a Harvard architecture, with separate instruction
and data caches:
  http://en.wikipedia.org/wiki/Harvard_architecture

See the paragraph just above "Uses" for the problem you are hitting
and the solution.  As others in this thread have pointed out, the
kernel provides a call to deal with the I/D cache coherency problem
which must be called _after_ _any_ modification of instructions
whether they have been executed previously or not.


* Runtime code modification fails on arm
From: Papalagi Pakeha @ 2009-11-10 23:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 11, 2009 at 8:17 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Nov 11, 2009 at 02:08:27AM +1300, Papalagi Pakeha wrote:
>> Why is this happening? What is so different between x86 and ARM in
>> that field? I'm aware that my problem exhibits in userspace, not in
>> the kernel. I'm sorry if it's way off topic here.
>
> As others in this thread have pointed out, the
> kernel provides a call to deal with the I/D cache coherency problem
> which must be called _after_ _any_ modification of instructions
> whether they have been executed previously or not.

Thank you guys for the hint. However, I can't find a ready-to-use
cacheflush() implemented in glibc (2.3.3) or dietlibc. Am I left on my
own implementing the userspace part of the syscall? I tried to figure
out how syscalls are implemented in dietlibc but there seems to be a
lot of preprocessor voodoo involved making it hard to understand...

Thanks!

PaPa


* Runtime code modification fails on arm
From: Papalagi Pakeha @ 2009-11-11  1:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 11, 2009 at 12:56 PM, Papalagi Pakeha
<papalagi.pakeha@gmail.com> wrote:
> However I can't find a ready to use
> cacheflush() implemented in glibc (2.3.3) or dietlibc.

... because it is called __clear_cache() and comes from libgcc. Oh
well, it took me only about 2 hrs to figure that out ;-)
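
For anyone finding this thread later, a minimal sketch of the fixed decrypt step. It is not the exact code from the tarball: the function name xor_decrypt_and_sync is made up, and it uses the compiler builtin __builtin___clear_cache(), which assumes a GCC or Clang recent enough to provide it; where the builtin is missing, libgcc's __clear_cache() is the function to call.

```c
#include <stddef.h>

/* Hypothetical helper: XOR-decrypt the bytes in [begin, end) in place,
 * then synchronise the instruction cache with the newly written bytes. */
static void xor_decrypt_and_sync(unsigned char *begin, unsigned char *end,
                                 unsigned char key)
{
    unsigned char *p;

    for (p = begin; p < end; p++)
        *p ^= key;

    /* Has no effect on targets that need no flush, such as x86. On ARM it
     * makes the modified range visible to instruction fetch. */
    __builtin___clear_cache((char *)begin, (char *)end);
}
```

With the symbols from the first message, the call would be xor_decrypt_and_sync((unsigned char *)treasure, (unsigned char *)treasure_key, 0x01), made before treasure() is called for the first time.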

Thanks for everyone's help, my decryptor now seems to work.

PaPa
