* ptrace induced instruction cache bug?
@ 2004-01-13  2:34 Nathan Field
  2004-01-13 15:01 ` Daniel Jacobowitz
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Field @ 2004-01-13  2:34 UTC (permalink / raw)
  To: linux-mips

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3631 bytes --]

I'm writing a debugger that uses the Linux ptrace API for process control
and I think I've found a bug in ptrace in MIPS Linux. The specific
situation that breaks horribly with my debugger is quite complex, so I
wrote a little testbed to show the problem. The code and a sample Makefile
are attached. You can build the example for x86 or MIPS. I have some
things in there for PPC but I haven't ported it fully yet. Basically the
problem seems to be that writing a breakpoint (instruction 0xd), running 
to the breakpoint, replacing the breakpoint with the original instruction 
and then resuming sometimes results in the process halting on the same 
address, even though there isn't a breakpoint there anymore. If you resume 
again, or wait for a "while" after removing the breakpoint everything 
works fine. I believe the problem is probably linked to some sort of 
problem with the kernel not flushing the instruction cache, but that's 
just a guess.
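
To make the suspected failure mode concrete, here is a minimal sketch of a
toy split-cache model. It is an illustration under stated assumptions, not
MIPS hardware or kernel code: struct toy_cpu and the toy_* helpers are
hypothetical names, and the model has one write-back D-cache line and one
I-cache line that do not snoop each other.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (an assumption for illustration, not real MIPS hardware):
 * one word of RAM covered by one write-back D-cache line and one
 * I-cache line. The two caches do not snoop each other. */
struct toy_cpu {
    uint32_t ram;
    uint32_t dline;
    int dvalid, ddirty;
    uint32_t iline;
    int ivalid;
};

/* Start with the given instruction in RAM and both caches empty. */
static void toy_init(struct toy_cpu *c, uint32_t insn)
{
    assert(c != 0);
    c->ram = insn;
    c->dline = 0; c->dvalid = 0; c->ddirty = 0;
    c->iline = 0; c->ivalid = 0;
}

/* A data store (which is what a ptrace poke ends up doing) only
 * touches the D-cache. */
static void toy_store(struct toy_cpu *c, uint32_t insn)
{
    c->dline = insn;
    c->dvalid = 1;
    c->ddirty = 1;
}

/* Write a dirty D-cache line back to RAM. */
static void toy_dcache_writeback(struct toy_cpu *c)
{
    if (c->dvalid && c->ddirty) {
        c->ram = c->dline;
        c->ddirty = 0;
    }
}

/* Drop the I-cache line so the next fetch refills from RAM. */
static void toy_icache_invalidate(struct toy_cpu *c)
{
    c->ivalid = 0;
}

/* Instruction fetch: hits in the I-cache if valid, else refills from RAM. */
static uint32_t toy_fetch(struct toy_cpu *c)
{
    if (!c->ivalid) {
        c->iline = c->ram;
        c->ivalid = 1;
    }
    return c->iline;
}
```

In this model, storing the original instruction and writing the D-cache
back still leaves the breakpoint word in the I-cache. Only a call to
toy_icache_invalidate() makes the next toy_fetch() return the restored
instruction, which matches the symptom of stopping again at an address
that no longer holds a breakpoint.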

I've encountered problems in ptrace like this with other architectures
before. If anyone wants to take my ptrace test code and make it part of
some kernel validation system please do. The code was whipped up fairly 
quickly so you might want to clean it up. I've verified that when it is 
run slowly enough it works fine.

I'd guess that this problem has been fixed in later versions of the 
kernel. If anyone can point me to a 2.4 release with this fixed I'd like 
to know about it. I tried building the cvs checkout but the build failed. 
It looks like I'll need a newer toolchain than the one I got from 
MontaVista[1].

I'm using a stock MontaVista distribution for the MIPS Malta 4Kc in big
endian mode, downloaded from their site a couple of days ago. I recompiled
the kernel with the arch/mips/configs/defconfig-malta, but haven't changed 
any options yet. Since that could be hard to classify here are some 
details about my system:

$ uname -a
Linux 192.67.158.75 2.4.17_mvl21 #8 Wed Jan 7 18:19:32 PST 2004 mips unknown

gcc version:
19) ./mips_fp_be-gcc -v
./mips_fp_be-gcc: Actual path = 
'/space1/opt/hardhat/previewkit/mips/fp_be/bin/'        Actual name = 
'mips_fp_be-gcc'
        Invoking 
/space1/opt/hardhat/previewkit/mips/fp_be/bin/../lib/gcc-lib/mips-hardhat-linux/2.95.3/mips_fp_be-gcc
Reading specs from 
/space1/opt/hardhat/previewkit/mips/fp_be/bin/../lib/gcc-lib/mips-hardhat-linux/2.95.3/specs
gcc version 2.95.3 20010315 (release/MontaVista)

$ cat /proc/cpuinfo
processor               : 0
cpu model               : MIPS 4Kc V0.5
BogoMIPS                : 124.51
wait instruction        : no
microsecond timers      : yes
extra interrupt vector  : yes
hardware watchpoint     : yes
VCED exceptions         : not available
VCEI exceptions         : not available

	Any help would be greatly appreciated,

	nathan

[1] Here's the error I get building the linux-mips.org cvs kernel. I don't 
know why it's trying to build a ramfs component, I only have ext2, /proc, 
/dev/pts, NFS, and NFS as root enabled. I've also disabled ramdisk support 
(CONFIG_BLK_DEV_RAM):

make[1]: `arch/mips/kernel/offset.s' is up to date.
make[1]: `arch/mips/kernel/reg.s' is up to date.
  CHK     include/linux/compile.h
  AS      usr/initramfs_data.o
usr/initramfs_data.S: Assembler messages:
usr/initramfs_data.S:29: Error: Unknown pseudo-op:  `.incbin'
make[1]: *** [usr/initramfs_data.o] Error 1
make: *** [usr] Error 2



-- 
Nathan Field (ndf@ghs.com)			          All gone.

But the trouble with analogies is that analogies are like goldfish:
sometimes they have nothing to do with the topic at hand.
        -- Crispin (from a posting to the Bugtraq mailing list)

[-- Attachment #2: Type: TEXT/PLAIN, Size: 12436 bytes --]

#ifndef TARGET_ARCH
#  error You must define a TARGET_ARCH.
#endif

#define X86   1
#define MIPS  2
#define PPC   3

#if(TARGET_ARCH==X86)
#  define NEED_TO_ADJUST_AFTER_BP_HIT
#  define BREAKPOINT_INSTRUCTION 0xcc
#  define BREAKPOINT_SIZE 1
#elif(TARGET_ARCH==MIPS)
#  define BREAKPOINT_INSTRUCTION 0x0000000d
#  define BREAKPOINT_SIZE 4
#elif(TARGET_ARCH==PPC)
#  define BREAKPOINT_INSTRUCTION 0x7fe00008
#  define BREAKPOINT_SIZE 4
#else
#  error Unsupported arch.
#endif

#define ITERATIONS 100


/* ---------------------------------------------------------------- */


#define SUCCESS 1
#define FAILURE 0


/* ---------------------------------------------------------------- */


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/select.h>
#include <sys/ptrace.h>
#include <asm/ptrace.h>


/* ---------------------------------------------------------------- */


static void debuggedProgram();
static void parentProcess(int pid);
static void startTrace(char *progname);

/* Functions we set breakpoints on. */
int main(int argc, char *argv[]);
static int doSomethingFunc();
static int doSomethingElseFunc();



/* ---------------------------------------------------------------- */


/** This program is a test bed to find bugs in Linux ptrace
 * implementations.  It starts by forking. The child process does a
 * PTRACE_TRACEME and then an exec of itself with special
 * arguments. When it starts up again it then runs the debuggedProgram
 * function. The parent does some debugging operations to test
 * ptrace. */
int main(int argc, char *argv[])
{
    int pid;

    if( (argc == 2) && !strcmp("traced_program", argv[1]) ) {
	debuggedProgram();
	exit(0);
    } else if(argc != 1) {
	fprintf(stderr, "Run this program with no arguments.\n");
	exit(1);
    }

    pid = fork();

    if(pid == 0) {
	startTrace(argv[0]);
	/* Should never get here. */
	exit(1);
    } else {
	fprintf(stderr, "Address of:\tmain\t\t\t: 0x%x\n", (uint32_t)main);
	fprintf(stderr, "Address of:\tdoSomethingFunc\t\t: 0x%x\n",
		(uint32_t)doSomethingFunc);
	fprintf(stderr, "Address of:\tdoSomethingElseFunc\t: 0x%x\n",
		(uint32_t)doSomethingElseFunc);

	parentProcess(pid);
    }

    return 0;
}


/* ---------------------------------------------------------------- */


static int some_number = 0;

/* Set a breakpoint on this function. */
static int doSomethingFunc()
{
    return ++some_number;
}

static void doSystemCall()
{
    /* Select seems like a good system call to do. */
    struct timeval timeout;
    timeout.tv_sec = 0;
    timeout.tv_usec = 1;
    select(0, 0, 0, 0, &timeout);
}

/* Set breakpoint on this function. */
static int doSomethingElseFunc()
{
    return ++some_number;
}

/* This is the "meat" of the program which gets debugged. Since I use
 * a combination of PTRACE_CONT and PTRACE_SYSCALL it's important that
 * this program only do system calls in very specific situations, in
 * particular only between the two doSomething* functions. */
static void debuggedProgram()
{
    int i;
    for(i=0; i<ITERATIONS; ++i) {
	doSomethingFunc();
	printf("debuggedProgram: Doing iteration %d.\n", i);
	doSystemCall();
	doSomethingElseFunc();
    }
    printf("debuggedProgram: Finished.\n");
}


/* ---------------------------------------------------------------- */


void failure(const char *msg)
{
    fputs(msg, stderr);
    exit(1);
}

static void startTrace(char *progname)
{
    char *argv[] = {progname, "traced_program", NULL};
    ptrace(PTRACE_TRACEME, 0, 0, 0);
    execv(progname, argv);
    perror("Child failed to execv");
}

/* ptrace reads/writes in 4 byte blocks, so we need to be aligned to 4
 * byte boundaries. */
static int addrAligned(uint32_t addr)
{
    if( (addr & 3) == 0) {
	return SUCCESS;
    } else {
	return FAILURE;
    }
}

static uint32_t readPC(int pid)
{
#if(TARGET_ARCH==X86)
    uint32_t r[17]; /* 17 registers on x86 */
    if(ptrace(PTRACE_GETREGS, pid, 0, (char *)&r, 0) == 0) {
	return r[12];
    }
#else
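    /* asm-mips/ptrace.h gives us a PC define for the offset. PEEKUSER can
     * legitimately return -1, so clear errno first and check it afterwards. */
    uint32_t val;
    errno = 0;
    val = ptrace(PTRACE_PEEKUSER, pid, PC, 0, 0);
    if( (val != (uint32_t)-1) || (errno == 0) ) {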
	return val;
    }
#endif
    perror("Failed to read PC");
    exit(1);
    return 0xFFFFFFFF;
}

static void writePC(int pid, uint32_t newpc)
{
#if(TARGET_ARCH==X86)
    uint32_t r[17];
    if(ptrace(PTRACE_GETREGS, pid, 0, (char*)&r, 0) != 0) {
	failure("Unable to get registers to write new PC.\n");
    }
    r[12] = newpc;
    if(ptrace(PTRACE_SETREGS, pid, 0, (char*)&r, 0) != 0) {
	failure("Failed to write registers when adjusting PC.\n");
    }
#elif(TARGET_ARCH==MIPS)
    /* asm-mips/ptrace.h gives us a PC define for the offset. */
    if(ptrace(PTRACE_POKEUSER, pid, PC, newpc, 0) != 0) {
	failure("Failed to write PC.\n");
    }
#else
#  error Unsupported arch.
#endif
}

static uint32_t adjustForBp(int pid, uint32_t curpc)
{
#ifdef NEED_TO_ADJUST_AFTER_BP_HIT
    uint32_t newpc;
    if(curpc - BREAKPOINT_SIZE == (uint32_t)main) {
	newpc = curpc - BREAKPOINT_SIZE;
    } else if(curpc - BREAKPOINT_SIZE == (uint32_t)doSomethingFunc) {
	newpc = curpc - BREAKPOINT_SIZE;
    } else if(curpc - BREAKPOINT_SIZE == (uint32_t)doSomethingElseFunc) {
	newpc = curpc - BREAKPOINT_SIZE;
    } else {
	return curpc;
    }
    writePC(pid, newpc);
    return newpc;
#else
    return curpc;
#endif
}

static int readWord(int pid, uint32_t aligned_addr, uint32_t *val)
{
    errno = 0; /* PEEKDATA can legitimately return -1 */
    *val = ptrace(PTRACE_PEEKDATA, pid, aligned_addr, 0);
    if( (*val == 0xFFFFFFFF) && (errno != 0) ) {
	/* The value could be -1, so check errno to see if there was a
	 * failure. */
	return FAILURE;
    } else {
	return SUCCESS;
    }
}

static int writeWord(int pid, uint32_t aligned_addr, uint32_t val)
{
    if(ptrace(PTRACE_POKEDATA, pid, aligned_addr, val) == -1) {
	return FAILURE;
    } else {
	return SUCCESS;
    }
}

/* Swap a value (1 byte or 4) with the current value in the given
 * location.  Works with unaligned addresses (which should only happen
 * on x86). */
static int swapMemory(int pid, uint32_t addr, int size,
		      uint32_t new_val, uint32_t *old_val)
{
    if(addrAligned(addr)) {
	/* Simple case, 4 byte, aligned address. */
	if(readWord(pid, addr, old_val) != SUCCESS)
	    failure("Failed to read word while swapping memory.\n");
	switch(size) {
	case 1:
#ifdef LITTLE_ENDIAN
	    new_val = (new_val & 0x000000FF) | (*old_val & 0xFFFFFF00);
	    *old_val = *old_val & 0x000000FF;
#else /* big endian arch */
            new_val = (new_val << 24) | (*old_val & 0x00FFFFFF);
            *old_val = *old_val >> 24;
#endif
	    break;
	case 4:
	    break;
	default:
	    failure("Unsupported memory swap size.\n"); 
	}
	if(writeWord(pid, addr, new_val) != SUCCESS)
	    failure("Failed to write word while swapping memory.\n");
	return SUCCESS;
    } else {
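	/* Unaligned address (x86 only, so little endian is assumed):
	 * swap each byte through the aligned word that contains it. */
	int i;
	uint32_t word;
	*old_val = 0;
	for(i=0; i<size; ++i) {
	    uint32_t word_addr = (addr + i) & ~(uint32_t)3;
	    int shift = ((addr + i) & 3) * 8;
	    if(readWord(pid, word_addr, &word) != SUCCESS)
		failure("Failed to read word while swapping memory.\n");
	    *old_val |= ((word >> shift) & 0xFF) << (i*8);
	    word = (word & ~((uint32_t)0xFF << shift))
		| (((new_val >> (i*8)) & 0xFF) << shift);
	    if(writeWord(pid, word_addr, word) != SUCCESS)
		failure("Failed to write word while swapping memory.\n");
	}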
	return SUCCESS;
    }
}

static void parentProcess(int pid)
{
    int i;
    int ret;
    int status;
    /* Used to store instruction where the breakpoint was
     * installed. */
    uint32_t inst;
    uint32_t temp;

    printf("parentProcess: Beginning debugging of child %d.\n", pid);

    /* Wait for process to exec, we will get a SIGTRAP
     * notification. */
    ret = waitpid(pid, &status, 0);
    if(ret != pid) failure("waitpid to find exec failed.\n");

    if(!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGTRAP))
	failure("Did not get expected SIGTRAP for exec.\n");

    /* Set breakpoint on main. */
    if(swapMemory(pid, (uint32_t)main, BREAKPOINT_SIZE,
		  BREAKPOINT_INSTRUCTION, &inst) != SUCCESS)
	failure("Failed to install breakpoint on main.\n");

    /* Resume process. */
    ptrace(PTRACE_CONT, pid, 0, 0);

    /* Wait for process to hit breakpoint, we will get a SIGTRAP
     * notification. */
    ret = waitpid(pid, &status, 0);

    if(!WIFSTOPPED(status)) {
	failure("Status indicates that process is not stopped on main.\n");
    }

    if(WSTOPSIG(status) != SIGTRAP) {
	fprintf(stderr, "Failure attempting to hit main.\n");
	fprintf(stderr, "Did not get SIGTRAP, got: %d.\n", WSTOPSIG(status));
	fprintf(stderr, "Current PC is: 0x%x.\n", readPC(pid));
	exit(1);
    }

    /* Make sure we stopped on the breakpoint. */
    if(adjustForBp(pid, readPC(pid)) != (uint32_t)main) {
	failure("Did not stop on main.\n");
    }

    /* Remove breakpoint on main. */
    if(swapMemory(pid, (uint32_t)main, BREAKPOINT_SIZE,
		  inst, &temp) != SUCCESS)
	failure("Failed to remove breakpoint on main.\n");

    for(i=0; i<ITERATIONS; ++i) {
	/* Set breakpoint on doSomethingFunc. */
	if(swapMemory(pid, (uint32_t)doSomethingFunc, BREAKPOINT_SIZE,
		      BREAKPOINT_INSTRUCTION, &inst) != SUCCESS)
	    failure("Failed to install breakpoint on doSomethingFunc.\n");

	/* Resume process. */
	ptrace(PTRACE_SYSCALL, pid, 0, 0);

	/* Wait for process to hit breakpoint, we will get a SIGTRAP
	 * notification. */
	ret = waitpid(pid, &status, 0);

	if(ret != pid) {
	    failure("Return from waitpid indicates error or that process is still running.\n");
	}

	if(!WIFSTOPPED(status)) {
	    failure("Status indicates that process is not stopped.\n");
	}

	if(WSTOPSIG(status) != SIGTRAP) {
	    fprintf(stderr, "Failure attempting to hit doSomethingFunc.\n");
	    fprintf(stderr, "Did not get expected SIGTRAP, got: %d.\n", WSTOPSIG(status));
	    fprintf(stderr, "Current PC is: 0x%x.\n", readPC(pid));
	    exit(1);
	}

	/* Make sure we stopped on the breakpoint. */
	if(adjustForBp(pid, readPC(pid)) != (uint32_t)doSomethingFunc) {
	    fprintf(stderr, "PC is not 0x%x, it is: 0x%x.\n", (uint32_t)doSomethingFunc,
		    readPC(pid));
	    failure("Did not stop on doSomethingFunc.\n");
	}

	/* Remove breakpoint on doSomethingFunc. */
	if(swapMemory(pid, (uint32_t)doSomethingFunc, BREAKPOINT_SIZE,
		      inst, &temp) != SUCCESS)
	    failure("Failed to remove breakpoint on doSomethingFunc.\n");


	/* Set breakpoint on doSomethingElseFunc. */
	if(swapMemory(pid, (uint32_t)doSomethingElseFunc, BREAKPOINT_SIZE,
		      BREAKPOINT_INSTRUCTION, &inst) != SUCCESS)
	    failure("Failed to install breakpoint on doSomethingElseFunc.\n");

	/* Resume process. */
	ptrace(PTRACE_CONT, pid, 0, 0);

	/* Wait for process to hit breakpoint, we will get a SIGTRAP
	 * notification. */
	ret = waitpid(pid, &status, 0);

	if(ret != pid) {
	    failure("Return from waitpid indicates error or that process is still running.\n");
	}

	if(!WIFSTOPPED(status)) {
	    failure("Status indicates that process is not stopped.\n");
	}

	if(WSTOPSIG(status) != SIGTRAP) {
	    fprintf(stderr, "Failure attempting to hit doSomethingElseFunc.\n");
	    fprintf(stderr, "Did not get expected SIGTRAP, got: %d.\n", WSTOPSIG(status));
	    fprintf(stderr, "Current PC is: 0x%x.\n", readPC(pid));
	    exit(1);
	}

	/* Make sure we stopped on the breakpoint. */
	if(adjustForBp(pid, readPC(pid)) != (uint32_t)doSomethingElseFunc) {
	    fprintf(stderr, "PC is not 0x%x, it is: 0x%x.\n", (uint32_t)doSomethingElseFunc,
		    readPC(pid));
	    failure("Did not stop on doSomethingElseFunc.\n");
	}

	/* Remove breakpoint on doSomethingElseFunc. */
	if(swapMemory(pid, (uint32_t)doSomethingElseFunc, BREAKPOINT_SIZE,
		      inst, &temp) != SUCCESS)
	    failure("Failed to remove breakpoint on doSomethingElseFunc.\n");
    }

    /* Resume process to let it complete. */
    ptrace(PTRACE_CONT, pid, 0, 0);

    ret = waitpid(pid, &status, 0);

    if(!WIFEXITED(status)) {
	failure("Debugged program failed to exit at expected time.\n");
    }

    printf("parentProcess: Should have hit all bp's, child should have exited.\n");
}

[-- Attachment #3: Type: TEXT/PLAIN, Size: 439 bytes --]

CC_MALTA_EB=/home/zcar1/opt/hardhat/previewkit/mips/fp_be/bin/mips_fp_be-gcc
MALTA_DEFINES=-DTARGET_ARCH=MIPS


DBLINK=dblink --scan_source -auto_translate 

CFLAGS=-g -Wall



TARGETS=simpledebugger_malta_eb

all: $(TARGETS)

simpledebugger_malta_eb: simpledebugger.c
	$(CC_MALTA_EB) $(MALTA_DEFINES) $(CFLAGS) -o $@ $<
	$(DBLINK) $@

clean:
	rm -rf *~ core* *.o objs

clobber: clean
	rm -f $(TARGETS) $(TARGETS).*


* Re: ptrace induced instruction cache bug?
  2004-01-13  2:34 ptrace induced instruction cache bug? Nathan Field
@ 2004-01-13 15:01 ` Daniel Jacobowitz
  2004-01-13 18:35   ` Nathan Field
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2004-01-13 15:01 UTC (permalink / raw)
  To: Nathan Field; +Cc: linux-mips

On Mon, Jan 12, 2004 at 06:34:57PM -0800, Nathan Field wrote:
> I'm writing a debugger that uses the Linux ptrace API for process control
> and I think I've found a bug in ptrace in MIPS Linux. The specific
> situation that breaks horribly with my debugger is quite complex, so I
> wrote a little testbed to show the problem. The code and a sample Makefile
> are attached. You can build the example for x86 or MIPS. I have some
> things in there for PPC but I haven't ported it fully yet. Basically the
> problem seems to be that writing a breakpoint (instruction 0xd), running 
> to the breakpoint, replacing the breakpoint with the original instruction 
> and then resuming sometimes results in the process halting on the same 
> address, even though there isn't a breakpoint there anymore. If you resume 
> again, or wait for a "while" after removing the breakpoint everything 
> works fine. I believe the problem is probably linked to some sort of 
> problem with the kernel not flushing the instruction cache, but that's 
> just a guess.

It sounds reasonable.  I've encountered this problem in the past also,
but never with the Pro 2.1 / MIPS release which is what you're using. 
I don't see anything obviously wrong with your test code, either.

> I'd guess that this problem has been fixed in later versions of the 
> kernel. If anyone can point me to a 2.4 release with this fixed I'd like 
> to know about it. I tried building the cvs checkout but the build failed. 
> It looks like I'll need a newer toolchain than the one I got from 
> MontaVista[1].
> 
> I'm using a stock MontaVista distribution for the MIPS Malta 4Kc in big
> endian mode, downloaded from their site a couple of days ago. I recompiled
> the kernel with the arch/mips/configs/defconfig-malta, but haven't changed 
> any options yet. Since that could be hard to classify here are some 
> details about my system:

Yes, you will need a newer toolchain.  Honestly, I'm baffled as to why
a Pro 2.1 toolchain was available from our web site at all, unless you
got it via an old product subscription... it should have been Pro 3.0,
which uses GCC 3.2 and a more recent binutils.  But I don't have any
control over these things :)

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: ptrace induced instruction cache bug?
  2004-01-13 15:01 ` Daniel Jacobowitz
@ 2004-01-13 18:35   ` Nathan Field
  2004-01-13 20:58     ` Daniel Jacobowitz
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Field @ 2004-01-13 18:35 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: linux-mips

> It sounds reasonable.  I've encountered this problem in the past also,
> but never with the Pro 2.1 / MIPS release which is what you're using.  
> I don't see anything obviously wrong with your test code, either.
	So... is there a fix for this?

> Yes, you will need a newer toolchain.  Honestly, I'm baffled as to why a
> Pro 2.1 toolchain was available from our web site at all, unless you got
> it via an old product subscription... it should have been Pro 3.0, which
> uses GCC 3.2 and a more recent binutils.  But I don't have any control
> over these things :)
	I downloaded it about 5 days ago from:
http://www.mvista.com/previewkit/index.html

Could I get a preview kit of your 3.0 release for a Malta 4Kc board?

	nathan

-- 
Nathan Field (ndf@ghs.com)			          All gone.

But the trouble with analogies is that analogies are like goldfish:
sometimes they have nothing to do with the topic at hand.
        -- Crispin (from a posting to the Bugtraq mailing list)


* Re: ptrace induced instruction cache bug?
  2004-01-13 18:35   ` Nathan Field
@ 2004-01-13 20:58     ` Daniel Jacobowitz
  2004-01-14 23:36       ` Nathan Field
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2004-01-13 20:58 UTC (permalink / raw)
  To: Nathan Field; +Cc: linux-mips

On Tue, Jan 13, 2004 at 10:35:04AM -0800, Nathan Field wrote:
> > It sounds reasonable.  I've encountered this problem in the past also,
> > but never with the Pro 2.1 / MIPS release which is what you're using.  
> > I don't see anything obviously wrong with your test code, either.
> 	So... is there a fix for this?

Usually a missing cache flush, as you surmised :)  But I don't know of
any that were missing in that version.

> > Yes, you will need a newer toolchain.  Honestly, I'm baffled as to why a
> > Pro 2.1 toolchain was available from our web site at all, unless you got
> > it via an old product subscription... it should have been Pro 3.0, which
> > uses GCC 3.2 and a more recent binutils.  But I don't have any control
> > over these things :)
> 	I downloaded it about 5 days ago from:
> http://www.mvista.com/previewkit/index.html
> 
> Could I get a preview kit of your 3.0 release for a Malta 4Kc board?

Let me inquire as to why we're distributing old ones.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


* Re: ptrace induced instruction cache bug?
  2004-01-13 20:58     ` Daniel Jacobowitz
@ 2004-01-14 23:36       ` Nathan Field
  2004-01-15  0:07         ` Jun Sun
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Field @ 2004-01-14 23:36 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: linux-mips

> On Tue, Jan 13, 2004 at 10:35:04AM -0800, Nathan Field wrote:
> > > It sounds reasonable.  I've encountered this problem in the past also,
> > > but never with the Pro 2.1 / MIPS release which is what you're using.  
> > > I don't see anything obviously wrong with your test code, either.
> > 	So... is there a fix for this?
> 
> Usually a missing cache flush, as you surmised :)  But I don't know of
> any that were missing in that version.
	So I looked into this and found that in ptrace.c:access_process_vm 
if I added an (obviously inefficient) flush_cache_all() into:

		if (write) {
			memcpy(maddr + offset, buf, bytes);
#ifdef CONFIG_SUPERH
			flush_dcache_page(page);
#endif
			flush_page_to_ram(page);
			flush_icache_page(vma, page);
			/* [ndf] I know this is not efficient, but it 
			 * makes it work. */
+++			flush_cache_all();
		} else {
			memcpy(buf, maddr + offset, bytes);
			flush_page_to_ram(page);
		}

then my ptrace test suite works. Looking at the status of the cache with 
my debugger while I step over various lines I see the entry for my address 
in the data cache in set 8, way 2. I step over flush_page_to_ram and it's 
still there. When I step over my call to flush_cache_all I see that the 
entry has moved to set 8, way 3. Unfortunately there doesn't seem to be a 
"dirty" bit in the cache status bits, so I can't prove what's going wrong 
by looking at the contents of the data cache as I step over the various 
cache flushing functions. I'd guess that the address that I want flushed 
moving around when I call flush_cache_all indicates that it really is 
being flushed (and then filled again by a later memory access), but I 
don't know the details of how the data cache is supposed to operate.
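
As a hedged aside on the set and way numbers above: in a set-associative
cache the address selects only the set, while the way is chosen by the
replacement policy on each refill. The sketch below shows that arithmetic.
The struct cache_geom type, the helper names, and the geometry used in the
usage note are hypothetical values for illustration; they are not read from
the 4Kc on this board.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of a set-associative cache. All values are supplied by the
 * caller; nothing here is read from real hardware. */
struct cache_geom {
    uint32_t size_bytes;   /* total capacity */
    uint32_t line_bytes;   /* bytes per line, a power of two */
    uint32_t ways;         /* associativity */
};

/* Number of sets: capacity divided by (line size * ways). */
static uint32_t cache_sets(const struct cache_geom *g)
{
    assert(g->line_bytes != 0 && g->ways != 0);
    return g->size_bytes / (g->line_bytes * g->ways);
}

/* Set selected by an address. The way is not determined by the
 * address; the replacement policy picks it on each refill. */
static uint32_t cache_set_index(const struct cache_geom *g, uint32_t addr)
{
    return (addr / g->line_bytes) % cache_sets(g);
}

/* Number of index-type cache operations needed to cover nbytes starting
 * at a line boundary: every line must be visited in every way. */
static uint32_t index_ops_for_range(const struct cache_geom *g, uint32_t nbytes)
{
    uint32_t lines = (nbytes + g->line_bytes - 1) / g->line_bytes;
    return lines * g->ways;
}
```

With an assumed 16 KB, 4-way cache with 16-byte lines, cache_sets() gives
256 sets and the address 0x00400080 maps to set 8. A line that is evicted
and later refilled can land in any of the four ways of that set, so seeing
the same address move from one way to another is consistent with a flush
followed by a refill, though it does not prove it.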

	Anyway, I'd guess that flush_page_to_ram ->
mips32_flush_page_to_ram_pc -> blast_dcache_page doesn't work on the MIPS
Malta board. Given how frequently it seems to be used that seems unlikely. 
At this point the board does what I want it to for my testing purposes, 
but something isn't quite right.

	nathan

> 
> > > Yes, you will need a newer toolchain.  Honestly, I'm baffled as to why a
> > > Pro 2.1 toolchain was available from our web site at all, unless you got
> > > it via an old product subscription... it should have been Pro 3.0, which
> > > uses GCC 3.2 and a more recent binutils.  But I don't have any control
> > > over these things :)
> > 	I downloaded it about 5 days ago from:
> > http://www.mvista.com/previewkit/index.html
> > 
> > Could I get a preview kit of your 3.0 release for a Malta 4Kc board?
> 
> Let me inquire as to why we're distributing old ones.
> 
> 

-- 
Nathan Field (ndf@ghs.com)			          All gone.

But the trouble with analogies is that analogies are like goldfish:
sometimes they have nothing to do with the topic at hand.
        -- Crispin (from a posting to the Bugtraq mailing list)


* Re: ptrace induced instruction cache bug?
  2004-01-14 23:36       ` Nathan Field
@ 2004-01-15  0:07         ` Jun Sun
  2004-01-15  0:22             ` Nathan Field
  0 siblings, 1 reply; 9+ messages in thread
From: Jun Sun @ 2004-01-15  0:07 UTC (permalink / raw)
  To: Nathan Field; +Cc: Daniel Jacobowitz, linux-mips, jsun

On Wed, Jan 14, 2004 at 03:36:54PM -0800, Nathan Field wrote:
> > On Tue, Jan 13, 2004 at 10:35:04AM -0800, Nathan Field wrote:
> > > > It sounds reasonable.  I've encountered this problem in the past also,
> > > > but never with the Pro 2.1 / MIPS release which is what you're using.  
> > > > I don't see anything obviously wrong with your test code, either.
> > > 	So... is there a fix for this?
> > 
> > Usually a missing cache flush, as you surmised :)  But I don't know of
> > any that were missing in that version.
> 	So I looked into this and found that in ptrace.c:access_process_vm 
> if I added an (obviously inefficient) flush_cache_all() into:
> 
> 		if (write) {
> 			memcpy(maddr + offset, buf, bytes);
> #ifdef CONFIG_SUPERH
> 			flush_dcache_page(page);
> #endif
> 			flush_page_to_ram(page);
> 			flush_icache_page(vma, page);
> 			/* [ndf] I know this is not efficient, but it 
> 			 * makes it work. */
> +++			flush_cache_all();
> 		} else {
> 			memcpy(buf, maddr + offset, bytes);
> 			flush_page_to_ram(page);
> 		}
> 
> then my ptrace test suite works. Looking at the status of the cache with 
> my debugger while I step over various lines I see the entry for my address 
> in the data cache in set 8, way 2. I step over flush_page_to_ram and it's 
> still there. When I step over my call to flush_cache_all I see that the 
> entry has moved to set 8, way 3. Unfortunately there doesn't seem to be a 
> "dirty" bit in the cache status bits, so I can't prove what's going wrong 
> by looking at the contents of the data cache as I step over the various 
> cache flushing functions. I'd guess that the address that I want flushed 
> moving around when I call flush_cache_all indicates that it really is 
> being flushed (and then filled again by a later memory access), but I 
> don't know the details of how the data cache is supposed to operate.
> 
> 	Anyway, I'd guess that flush_page_to_ram ->
> mips32_flush_page_to_ram_pc -> blast_dcache_page doesn't work on the MIPS
> Malta board. Given how frequently it seems to be used that seems unlikely. 
> At this point the board does what I want it to for my testing purposes, 
> but something isn't quite right.
> 

Too many cache-related things are wrong in 2.4.17.  For example,

. flush_page_indexed() is not right for multi-way caches
. when you map user pages into the kernel, you suffer from a potential cache
  aliasing problem (BTW, we still suffer from this now, to a lesser degree)
. flush_page_to_ram() has broken semantics, because it is not clear whether
  the area mapped into user virtual address space should be flushed or not
...
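
The aliasing point can be sketched with a little arithmetic. This is a
minimal illustration for a virtually indexed cache, assuming power-of-two
sizes; way_bytes(), cache_color() and may_alias() are hypothetical helper
names, not kernel functions, and the sizes in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes indexed by one way of the cache: capacity / associativity. */
static uint32_t way_bytes(uint32_t cache_bytes, uint32_t ways)
{
    assert(ways != 0);
    return cache_bytes / ways;
}

/* "Color" of a virtual address: the cache index bits that lie above
 * the page offset. With way_size <= page_size there is only one color. */
static uint32_t cache_color(uint32_t vaddr, uint32_t way_size,
                            uint32_t page_size)
{
    assert(page_size != 0);
    if (way_size <= page_size)
        return 0;
    return (vaddr % way_size) / page_size;
}

/* Two virtual mappings of the same physical page can end up in
 * different cache lines (an alias) only if their colors differ. */
static int may_alias(uint32_t va1, uint32_t va2, uint32_t way_size,
                     uint32_t page_size)
{
    return cache_color(va1, way_size, page_size)
        != cache_color(va2, way_size, page_size);
}
```

For example, with an assumed 16 KB way and 4 KB pages, a user mapping at
0x00401000 has color 1 and a kernel mapping at 0x80012000 has color 2, so
may_alias() reports a possible alias. If the way size equals the page size,
every address has color 0 and the two mappings cannot alias.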

In short, it is not worth your time to fix old bugs.  Last time I checked
malta was working fine around 2.4.21.  It shouldn't be too hard to get
it working again in the latest 2.4 branch.

Jun


* Re: ptrace induced instruction cache bug?
@ 2004-01-15  0:22             ` Nathan Field
  0 siblings, 0 replies; 9+ messages in thread
From: Nathan Field @ 2004-01-15  0:22 UTC (permalink / raw)
  To: Jun Sun; +Cc: Daniel Jacobowitz, linux-mips

> Too many cache-related things are wrong in 2.4.17.  For example,
> 
> . flush_page_indexed() is not right for multi-way caches
> . when you map user pages into the kernel, you suffer from a potential cache
>   aliasing problem (BTW, we still suffer from this now, to a lesser degree)
> . flush_page_to_ram() has broken semantics, because it is not clear whether
>   the area mapped into user virtual address space should be flushed or not
> ...
> 
> In short, it is not worth your time to fix old bugs.  Last time I
> checked malta was working fine around 2.4.21.  It shouldn't be too hard
> to get it working again in the latest 2.4 branch.
	Is this the 2.4.21 from ftp.kernel.org, or do I need to get 
specific patches to get it to work? I looked at the cvs tree but it's 
currently a 2.6 release. Should I just check out the linux_2_4_branch 
version from linux-mips.org?

	nathan

-- 
Nathan Field (ndf@ghs.com)			          All gone.

But the trouble with analogies is that analogies are like goldfish:
sometimes they have nothing to do with the topic at hand.
        -- Crispin (from a posting to the Bugtraq mailing list)


* Re: ptrace induced instruction cache bug?
  2004-01-15  0:22             ` Nathan Field
  (?)
@ 2004-01-15  0:40             ` Jun Sun
  -1 siblings, 0 replies; 9+ messages in thread
From: Jun Sun @ 2004-01-15  0:40 UTC (permalink / raw)
  To: Nathan Field; +Cc: Daniel Jacobowitz, linux-mips, jsun

On Wed, Jan 14, 2004 at 04:22:01PM -0800, Nathan Field wrote:
> > Too many cache-related things are wrong in 2.4.17.  For example,
> > 
> > . flush_page_indexed() is not right for multi-way caches
> > . when you map user pages into the kernel, you suffer from a potential cache
> >   aliasing problem (BTW, we still suffer from this now, to a lesser degree)
> > . flush_page_to_ram() has broken semantics, because it is not clear whether
> >   the area mapped into user virtual address space should be flushed or not
> > ...
> > 
> > In short, it is not worth your time to fix old bugs.  Last time I
> > checked malta was working fine around 2.4.21.  It shouldn't be too hard
> > to get it working again in the latest 2.4 branch.
> 	Is this the 2.4.21 from ftp.kernel.org, or do I need to get 
> specific patches to get it to work? I looked at the cvs tree but it's 
> currently a 2.6 release. Should I just check out the linux_2_4_branch 
> version from linux-mips.org?
> 

Yes.  "linux_2_4" branch to be exact.

Jun


Thread overview: 9+ messages
2004-01-13  2:34 ptrace induced instruction cache bug? Nathan Field
2004-01-13 15:01 ` Daniel Jacobowitz
2004-01-13 18:35   ` Nathan Field
2004-01-13 20:58     ` Daniel Jacobowitz
2004-01-14 23:36       ` Nathan Field
2004-01-15  0:07         ` Jun Sun
2004-01-15  0:22           ` Nathan Field
2004-01-15  0:22             ` Nathan Field
2004-01-15  0:40             ` Jun Sun
