linux-kernel.vger.kernel.org archive mirror
* Shared memory shmat/dt not working well in 2.5.x
@ 2002-10-01  9:52 Zlatko Calusic
  2002-10-01 13:07 ` Alessandro Suardi
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-01  9:52 UTC
  To: akpm, hugh; +Cc: linux-kernel

Hi, Andrew, Hugh & others.

Still having problems with Oracle on 2.5.x (it can't even be started),
I devoted some time trying to pinpoint where the problem is. Reading
many traces of Oracle, and rebooting a dozen times, I finally found
that the culprit is weird behaviour of shmat/shmdt functions in 2.5,
when combined with mprotect() calls. I wrote a simple test app
(attached) and I'm also appending output of it below (running on
2.4.19 & 2.5.39 kernels, see the difference).

Hopefully, somebody will know how to help resolve that issue, so I can
finally benchmark Oracle on 2.4 vs Oracle on 2.5. ;)

Best regards,


{2.4.19} % shm-bug
First shmat & protects done: 50000000
50000000-51000000 rw-s 00000000 00:04 327974932  /SYSV01478e7f (deleted)
51000000-51001000 r--s 01000000 00:04 327974932  /SYSV01478e7f (deleted)
51001000-51081000 rw-s 01001000 00:04 327974932  /SYSV01478e7f (deleted)
51081000-51082000 r--s 01081000 00:04 327974932  /SYSV01478e7f (deleted)
51082000-51083000 rw-s 01082000 00:04 327974932  /SYSV01478e7f (deleted)
Second shmat done: 50000000
50000000-51083000 rw-s 00000000 00:04 327974932  /SYSV01478e7f (deleted)

{2.5.39} % shm-bug
First shmat & protects done: 50000000
50000000-51000000 rw-s 00000000 00:06 2457614    /SYSV01478e7f (deleted)
51000000-51001000 r--s 00000000 00:06 2457614    /SYSV01478e7f (deleted)
51001000-51081000 rw-s 00001000 00:06 2457614    /SYSV01478e7f (deleted)
51081000-51082000 r--s 00001000 00:06 2457614    /SYSV01478e7f (deleted)
51082000-51083000 rw-s 00002000 00:06 2457614    /SYSV01478e7f (deleted)
shmat 2: Invalid argument
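
One way to read the two listings: the third column is each mapping's
offset into the segment, so for a segment attached at 0x50000000 a piece
starting at address A would be expected to show offset A - 0x50000000.
The sketch below checks the rows above against that rule. The struct and
function names are invented for illustration; only the addresses and
offsets are taken from the output.

```c
#include <stdint.h>

/* One row of the listings above: start address and reported offset. */
struct maps_row {
	uint32_t start;
	uint32_t offset;
};

/* Expected offset of a piece starting at `start` when the segment
 * was attached at `base`: simply the distance between the two. */
static uint32_t expected_offset(uint32_t base, uint32_t start)
{
	return start - base;
}

/* Count the rows whose reported offset differs from the expected one. */
static int count_bad_offsets(uint32_t base, const struct maps_row *rows, int n)
{
	int i, bad = 0;

	for (i = 0; i < n; i++)
		if (rows[i].offset != expected_offset(base, rows[i].start))
			bad++;
	return bad;
}

/* Rows copied from the 2.4.19 output. */
static const struct maps_row rows_2_4_19[] = {
	{ 0x50000000u, 0x00000000u },
	{ 0x51000000u, 0x01000000u },
	{ 0x51001000u, 0x01001000u },
	{ 0x51081000u, 0x01081000u },
	{ 0x51082000u, 0x01082000u },
};

/* Rows copied from the 2.5.39 output. */
static const struct maps_row rows_2_5_39[] = {
	{ 0x50000000u, 0x00000000u },
	{ 0x51000000u, 0x00000000u },
	{ 0x51001000u, 0x00001000u },
	{ 0x51081000u, 0x00001000u },
	{ 0x51082000u, 0x00002000u },
};
```

By this check all five 2.4.19 rows are consistent, while four of the
five 2.5.39 rows are not.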


[-- Attachment #2: shm-bug.c --]

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#define SIZE 17313792

void xperror(char *error_string)
{
	perror(error_string);
	exit(EXIT_FAILURE);
}

int main(int argc, char **argv)
{
	int shmid, *addr;
	char buffer[64];

	/* Create a fresh segment; fail if the key already exists. */
	if ((shmid = shmget(21466751, SIZE, IPC_CREAT | IPC_EXCL | 0640)) < 0)
		xperror("shmget");
	/* Attach at a fixed address so the maps output is comparable. */
	addr = (int *) shmat(shmid, (char *) 0x50000000, 0);
	if (addr == (int *) -1)
		xperror("shmat 1");
	/* Make two single pages read-only, splitting the mapping in five. */
	if (mprotect((char *) 0x51000000, 4096, PROT_READ) < 0)
		xperror("mprotect 1");
	if (mprotect((char *) 0x51081000, 4096, PROT_READ) < 0)
		xperror("mprotect 2");
	printf("First shmat & protects done: %08lx\n", (unsigned long) addr);
	/* Show this process's SysV shm mappings. */
	sprintf(buffer, "cat /proc/%d/maps | grep /SYSV", getpid());
	system(buffer);
	if (shmdt(addr) < 0)
		xperror("shmdt 1");
	/* Attach again at the same address: this is what fails on 2.5.39. */
	addr = (int *) shmat(shmid, (char *) 0x50000000, 0);
	if (addr == (int *) -1) {
		perror("shmat 2");
		shmctl(shmid, IPC_RMID, NULL);
		exit(EXIT_FAILURE);
	}
	printf("Second shmat done: %08lx\n", (unsigned long) addr);
	system(buffer);
	if (shmdt(addr) < 0)
		xperror("shmdt 2");
	/* Remove the segment so the test can be run again. */
	shmctl(shmid, IPC_RMID, NULL);
	exit(EXIT_SUCCESS);
}


-- 
Zlatko


* Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01  9:52 Shared memory shmat/dt not working well in 2.5.x Zlatko Calusic
@ 2002-10-01 13:07 ` Alessandro Suardi
  2002-10-01 13:09 ` [PATCH] " Hugh Dickins
  2002-10-01 15:32 ` [PATCH] Oracle startup split_vma fix Hugh Dickins
  2 siblings, 0 replies; 14+ messages in thread
From: Alessandro Suardi @ 2002-10-01 13:07 UTC
  To: zlatko.calusic; +Cc: akpm, hugh, linux-kernel

Zlatko Calusic wrote:
> Hi, Andrew, Hugh & others.
> 
> Still having problems with Oracle on 2.5.x (it can't even be started),

[snip]

Just wanted to add that I can't provide further info about which
  kernel broke it... updated map:

  2.5.34 kernel okay, Oracle works
  2.5.35 kernel doesn't compile
  2.5.36 oops on linux kernel boot, frozen
  2.5.37 oops on linux kernel boot, SysRQ works
  2.5.38 kernel okay, Oracle OOMs
  2.5.39  as 2.5.38
  2.5.40 kernel.org down, no mirrors carrying it yet

My box is a Dell Latitude CPx750J, PIII CPU, 256MB RAM / 512MB swap
  all on ext3fs, mounted rw,noatime except of course for /dev/shm
  which is tmpfs. UP kernel, preempt is on, hugetlb is off.

As I told Andrew in private email, the Oracle shm segment is created,
  the background processes forked but the SQL*Plus child which should
  perform the database open after checking datafiles and obviously
  attaching the shm segment (about 50MB of it) gets killed by OOM.

--alessandro

  "everything dies, baby that's a fact
    but maybe everything that dies someday comes back"
        (Bruce Springsteen, "Atlantic City")



* [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01  9:52 Shared memory shmat/dt not working well in 2.5.x Zlatko Calusic
  2002-10-01 13:07 ` Alessandro Suardi
@ 2002-10-01 13:09 ` Hugh Dickins
  2002-10-01 13:28   ` Alessandro Suardi
  2002-10-01 13:37   ` Zlatko Calusic
  2002-10-01 15:32 ` [PATCH] Oracle startup split_vma fix Hugh Dickins
  2 siblings, 2 replies; 14+ messages in thread
From: Hugh Dickins @ 2002-10-01 13:09 UTC
  To: Zlatko Calusic; +Cc: Andrew Morton, linux-kernel

On Tue, 1 Oct 2002, Zlatko Calusic wrote:
> 
> Still having problems with Oracle on 2.5.x (it can't even be started),
> I devoted some time trying to pinpoint where the problem is. Reading
> many traces of Oracle, and rebooting a dozen times, I finally found
> that the culprit is weird behaviour of shmat/shmdt functions in 2.5,
> when combined with mprotect() calls. I wrote a simple test app
> (attached) and I'm also appending output of it below (running on
> 2.4.19 & 2.5.39 kernels, see the difference).

Exemplary bug report!  Many thanks for taking so much trouble to
reproduce the problem.  Patch below (against 2.5.39) should fix it:
I'll send Linus and Andrew when I can get hold of a 2.5.40 tree.

Hugh

--- 2.5.39/mm/mmap.c	Fri Sep 20 17:57:49 2002
+++ linux/mm/mmap.c	Tue Oct  1 13:59:54 2002
@@ -1055,7 +1055,7 @@ int split_vma(struct mm_struct * mm, str
 	if (new_below) {
 		new->vm_end = addr;
 		vma->vm_start = addr;
-		vma->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
+		vma->vm_pgoff += ((addr - new->vm_start) >> PAGE_SHIFT);
 	} else {
 		vma->vm_end = addr;
 		new->vm_start = addr;
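
A userspace sketch of why the one-line change matters. It assumes (as
the patch implies) that `new` starts out as a copy of `vma`, so that
`new->vm_start` still holds the old start address after `vma->vm_start`
has been overwritten. `struct toy_vma`, `split_below()` and the `fixed`
flag are hypothetical stand-ins, not kernel code, and a PAGE_SHIFT of 12
is an assumption based on the 4096-byte pages in the test program.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, stripped-down stand-in for the kernel's VMA structure. */
struct toy_vma {
	uintptr_t vm_start;
	uintptr_t vm_end;
	unsigned long vm_pgoff;	/* offset into the object, in pages */
};

/*
 * Model of the new_below branch only: `new_vma` takes the lower part
 * [old start, addr) and `vma` keeps the upper part [addr, old end).
 * With fixed == 0 the offset is computed the 2.5.39 way.
 */
static void split_below(struct toy_vma *vma, struct toy_vma *new_vma,
			uintptr_t addr, int fixed)
{
	*new_vma = *vma;	/* the copy keeps the old vm_start */
	new_vma->vm_end = addr;
	vma->vm_start = addr;
	if (fixed)
		/* distance from the old start: the pages skipped */
		vma->vm_pgoff += (addr - new_vma->vm_start) >> PAGE_SHIFT;
	else
		/* vm_start was just set to addr, so this adds 0 */
		vma->vm_pgoff += (addr - vma->vm_start) >> PAGE_SHIFT;
}
```

With a mapping at 0x50000000-0x51083000 split at 0x51000000, the unfixed
arithmetic adds 0 pages, so the upper piece keeps vm_pgoff 0 (the
00000000 offset in the 2.5.39 listing). The fixed arithmetic adds 0x1000
pages, i.e. byte offset 0x01000000, as shown on 2.4.19.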



* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 13:09 ` [PATCH] " Hugh Dickins
@ 2002-10-01 13:28   ` Alessandro Suardi
  2002-10-01 13:46     ` Zlatko Calusic
  2002-10-01 13:37   ` Zlatko Calusic
  1 sibling, 1 reply; 14+ messages in thread
From: Alessandro Suardi @ 2002-10-01 13:28 UTC
  To: Hugh Dickins; +Cc: Zlatko Calusic, Andrew Morton, linux-kernel

Hugh Dickins wrote:
> On Tue, 1 Oct 2002, Zlatko Calusic wrote:
> 
>>Still having problems with Oracle on 2.5.x (it can't even be started),
>>I devoted some time trying to pinpoint where the problem is. Reading
>>many traces of Oracle, and rebooting a dozen times, I finally found
>>that the culprit is weird behaviour of shmat/shmdt functions in 2.5,
>>when combined with mprotect() calls. I wrote a simple test app
>>(attached) and I'm also appending output of it below (running on
>>2.4.19 & 2.5.39 kernels, see the difference).
> 
> 
> Exemplary bug report!  Many thanks for taking so much trouble to
> reproduce the problem.  Patch below (against 2.5.39) should fix it:
> I'll send Linus and Andrew when I can get hold of a 2.5.40 tree.
> 
> Hugh
> 
> --- 2.5.39/mm/mmap.c	Fri Sep 20 17:57:49 2002
> +++ linux/mm/mmap.c	Tue Oct  1 13:59:54 2002
> @@ -1055,7 +1055,7 @@ int split_vma(struct mm_struct * mm, str
>  	if (new_below) {
>  		new->vm_end = addr;
>  		vma->vm_start = addr;
> -		vma->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
> +		vma->vm_pgoff += ((addr - new->vm_start) >> PAGE_SHIFT);
>  	} else {
>  		vma->vm_end = addr;
>  		new->vm_start = addr;

I'm glad to report that Oracle 9.2 is now able to start once again
  on 2.5.x series :)

Thanks, cool work as always !

--alessandro

  "everything dies, baby that's a fact
    but maybe everything that dies someday comes back"
        (Bruce Springsteen, "Atlantic City")



* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 13:09 ` [PATCH] " Hugh Dickins
  2002-10-01 13:28   ` Alessandro Suardi
@ 2002-10-01 13:37   ` Zlatko Calusic
  1 sibling, 0 replies; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-01 13:37 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, linux-kernel

Hugh Dickins <hugh@veritas.com> writes:

> On Tue, 1 Oct 2002, Zlatko Calusic wrote:
>> 
>> Still having problems with Oracle on 2.5.x (it can't even be started),
>> I devoted some time trying to pinpoint where the problem is. Reading
>> many traces of Oracle, and rebooting a dozen times, I finally found
>> that the culprit is weird behaviour of shmat/shmdt functions in 2.5,
>> when combined with mprotect() calls. I wrote a simple test app
>> (attached) and I'm also appending output of it below (running on
>> 2.4.19 & 2.5.39 kernels, see the difference).
>
> Exemplary bug report!  Many thanks for taking so much trouble to
> reproduce the problem.  Patch below (against 2.5.39) should fix it:
> I'll send Linus and Andrew when I can get hold of a 2.5.40 tree.

Oh, dear sir, but you are the one who solved it eventually. :)
Anyway, I'm glad it worked well and I was helpful.

Looking forward to testing/benchmarking Oracle w/ patch applied (and reporting
further bugs in ext3, fsync() and friends ;)).

Keep up the good work!
-- 
Zlatko


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 13:28   ` Alessandro Suardi
@ 2002-10-01 13:46     ` Zlatko Calusic
  2002-10-01 14:51       ` Alessandro Suardi
  0 siblings, 1 reply; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-01 13:46 UTC
  To: Alessandro Suardi; +Cc: Hugh Dickins, Andrew Morton, linux-kernel

Alessandro Suardi <alessandro.suardi@oracle.com> writes:

> Hugh Dickins wrote:
>> On Tue, 1 Oct 2002, Zlatko Calusic wrote:
>>
>>>Still having problems with Oracle on 2.5.x (it can't even be started),
>>>I devoted some time trying to pinpoint where the problem is. Reading
>>>many traces of Oracle, and rebooting a dozen times, I finally found
>>>that the culprit is weird behaviour of shmat/shmdt functions in 2.5,
>>>when combined with mprotect() calls. I wrote a simple test app
>>>(attached) and I'm also appending output of it below (running on
>>>2.4.19 & 2.5.39 kernels, see the difference).
>> Exemplary bug report!  Many thanks for taking so much trouble to
>> reproduce the problem.  Patch below (against 2.5.39) should fix it:
>> I'll send Linus and Andrew when I can get hold of a 2.5.40 tree.
>> Hugh
>
[snip]
>
> I'm glad to report that Oracle 9.2 is now able to start once again
>   on 2.5.x series :)
>
> Thanks, cool work as always !

Was it a known problem for some time?

I haven't been testing 2.5.x series for some time, and also haven't
read linux-kernel list last few months, so I don't know exact history
of the bug. If you can enlighten me, I'm just curious... :)

I remember other more complicated bugs from the older 2.5.x kernels,
and now I'll test if they're solved in newer ones. I might need some
help if they still exist (could you lend me a hand if that's the
case?) as I was getting Oracle internal error - coredump - with only
one meaningful sentence (at least to me :)). Google was silent on the
case. :(

Regards,
-- 
Zlatko


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 13:46     ` Zlatko Calusic
@ 2002-10-01 14:51       ` Alessandro Suardi
  2002-10-01 14:59         ` Zlatko Calusic
  2002-10-02 18:45         ` Zlatko Calusic
  0 siblings, 2 replies; 14+ messages in thread
From: Alessandro Suardi @ 2002-10-01 14:51 UTC
  To: zlatko.calusic; +Cc: Hugh Dickins, Andrew Morton, linux-kernel

Zlatko Calusic wrote:

>>I'm glad to report that Oracle 9.2 is now able to start once again
>>  on 2.5.x series :)
>>
>>Thanks, cool work as always !
> 
> 
> Was it a known problem for some time?
> 
> I haven't been testing 2.5.x series for some time, and also haven't
> read linux-kernel list last few months, so I don't know exact history
> of the bug. If you can enlighten me, I'm just curious... :)
> 
> I remember other more complicated bugs from the older 2.5.x kernels,
> and now I'll test if they're solved in newer ones. I might need some
> help if they still exist (could you lend me a hand if that's the
> case?) as I was getting Oracle internal error - coredump - with only
> one meaningful sentence (at least to me :)). Google was silent on the
> case. :(

I reported the issue on l-k the other day:

http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.3/1691.html

The more complicated bug you're talking about is the exec_mmap
  change introduced in 2.5.19 and fixed a handful of versions
  later, possibly .28, where PMON wouldn't start after 120"...
  I guess :)


Ciao,

--alessandro

  "everything dies, baby that's a fact
    but maybe everything that dies someday comes back"
        (Bruce Springsteen, "Atlantic City")



* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 14:51       ` Alessandro Suardi
@ 2002-10-01 14:59         ` Zlatko Calusic
  2002-10-02 18:45         ` Zlatko Calusic
  1 sibling, 0 replies; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-01 14:59 UTC
  To: Alessandro Suardi; +Cc: Hugh Dickins, Andrew Morton, linux-kernel

Alessandro Suardi <alessandro.suardi@oracle.com> writes:

> Zlatko Calusic wrote:
>
>>>I'm glad to report that Oracle 9.2 is now able to start once again
>>>  on 2.5.x series :)
>>>
>>>Thanks, cool work as always !
>> Was it a known problem for some time?
>> I haven't been testing 2.5.x series for some time, and also haven't
>> read linux-kernel list last few months, so I don't know exact history
>> of the bug. If you can enlighten me, I'm just curious... :)
>> I remember other more complicated bugs from the older 2.5.x kernels,
>> and now I'll test if they're solved in newer ones. I might need some
>> help if they still exist (could you lend me a hand if that's the
>> case?) as I was getting Oracle internal error - coredump - with only
>> one meaningful sentence (at least to me :)). Google was silent on the
>> case. :(
>
> I reported the issue on l-k the other day:
>
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.3/1691.html

I see. Same day I decided to dig deeper. :)

>
> The more complicated bug you're talking about is the exec_mmap
>   change introduced in 2.5.19 and fixed a handful of versions
>   later, possibly .28, where PMON wouldn't start after 120"...
>   I guess :)

Great. Thanks for the useful info.

It looks like there's a chance I will do only the interesting
benchmarking part. :) I'm quite curious how Andrew's work in 2.5.x
will affect performance of Oracle database.

Thanks for everything.
-- 
Zlatko


* [PATCH] Oracle startup split_vma fix
  2002-10-01  9:52 Shared memory shmat/dt not working well in 2.5.x Zlatko Calusic
  2002-10-01 13:07 ` Alessandro Suardi
  2002-10-01 13:09 ` [PATCH] " Hugh Dickins
@ 2002-10-01 15:32 ` Hugh Dickins
  2 siblings, 0 replies; 14+ messages in thread
From: Hugh Dickins @ 2002-10-01 15:32 UTC
  To: Linus Torvalds
  Cc: Andrew Morton, Zlatko Calusic, Alessandro Suardi, linux-kernel

Alessandro Suardi and Zlatko Calusic independently reported that
Oracle cannot start on recent 2.5: excellent research by Zlatko
quickly pointed to vm_pgoff buglet in the new split_vma.

Patch below against 2.5.40 or 2.5.40-mm1: please apply.

--- 2.5.40/mm/mmap.c	Tue Oct  1 15:33:04 2002
+++ linux/mm/mmap.c	Tue Oct  1 15:53:06 2002
@@ -1058,7 +1058,7 @@
 	if (new_below) {
 		new->vm_end = addr;
 		vma->vm_start = addr;
-		vma->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
+		vma->vm_pgoff += ((addr - new->vm_start) >> PAGE_SHIFT);
 	} else {
 		vma->vm_end = addr;
 		new->vm_start = addr;



* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-01 14:51       ` Alessandro Suardi
  2002-10-01 14:59         ` Zlatko Calusic
@ 2002-10-02 18:45         ` Zlatko Calusic
  2002-10-08 11:22           ` Zlatko Calusic
  1 sibling, 1 reply; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-02 18:45 UTC
  To: Alessandro Suardi; +Cc: Hugh Dickins, Andrew Morton, linux-kernel

Alessandro Suardi <alessandro.suardi@oracle.com> writes:
> The more complicated bug you're talking about is the exec_mmap
>   change introduced in 2.5.19 and fixed a handful of versions
>   later, possibly .28, where PMON wouldn't start after 120"...
>   I guess :)

Oh, well, if that one is really fixed, then I have another one. ;)

After some uptime, a few selects & a few inserts, Oracle decided to die
(2.5.40 + Hugh's patch, SMP, Oracle 9.0.1.4 - works flawlessly on
2.4.19). I have a full coredump, but I don't know what to do with it
(if somebody wants it, just say). It seems benchmarking will
wait... :(


*** 2002-10-02 20:15:27.634
*** SESSION ID:(4.1) 2002-10-02 20:15:27.583
BH (0x0x60fee288) file#: 1 rdba: 0x004000c7 (1/199) class 1 ba: 0x0x60c9a000
  set: 3, dbwrid: 0
  hash: [53509d88,53509d88], lru: [60fee370,60fee220]
  LRU flags: 
  ckptq: [NULL] fileq: [NULL]
  st: XCURRENT, md: NULL, rsop: 0x(nil), tch: 1
  L:[0x0.0.0] H:[0x0.0.0] R:[0x0.0.0]
*** 2002-10-02 20:15:27.634
ksedmp: internal or fatal error
ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [kcbkllrba_2]

...

-- 
Zlatko


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-02 18:45         ` Zlatko Calusic
@ 2002-10-08 11:22           ` Zlatko Calusic
  2002-10-08 11:38             ` Duncan Sands
  0 siblings, 1 reply; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-08 11:22 UTC
  To: Alessandro Suardi; +Cc: Hugh Dickins, Andrew Morton, linux-kernel

Zlatko Calusic <zlatko.calusic@iskon.hr> writes:

> Alessandro Suardi <alessandro.suardi@oracle.com> writes:
>> The more complicated bug you're talking about is the exec_mmap
>>   change introduced in 2.5.19 and fixed a handful of versions
>>   later, possibly .28, where PMON wouldn't start after 120"...
>>   I guess :)
>
> Oh, well, if that one is really fixed, then I have another one. ;)
>

Hm, not anymore!

Thanks to you guys, 2.5.41 is flawless. It works under all the tests
that were failing before. Great work!

I did some benchmarks and it looks like 2.5 is a little bit slower.  I
have two small perl+plsql applications for testing purposes,
"cucibench" benches how long it takes to parse cucitail POP daemon log
and put it into database (insert load). "mailproc" processes sendmail
log and does the same. mailproc is a little bit more complicated (it
also does updates). The results are as follows (numbers are
minutes:seconds it took to finish the task on Oracle 9.2.0.1):

| app       | 2.4.19 | 2.5.41 |
|-----------------------------|
| cucibench |  03:17 |  03:38 |
| mailproc  |  02:12 |  02:30 |
|-----------------------------|
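
To put a rough number on "a little bit slower", the timings in the table
can be converted to seconds and compared. A minimal sketch, with helper
names invented for illustration:

```c
/* Convert a minutes:seconds timing from the table into seconds. */
static int to_seconds(int minutes, int seconds)
{
	return minutes * 60 + seconds;
}

/* Relative slowdown of `after` compared with `before`, in percent. */
static double slowdown_percent(int before, int after)
{
	return 100.0 * (after - before) / before;
}
```

For the table above this gives 197 s vs 218 s for cucibench (about 10.7%
slower) and 132 s vs 150 s for mailproc (about 13.6% slower).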

I also observed that another application I use occasionally - LXR (Linux
source cross referencing tool) - takes much longer to generate xref
database (which is in Berkeley DB files). It works in three passes,
where the last one, when it dumps symbols into DB, is interesting. In
2.4 it finishes quickly (it uses 100% CPU, then occasionally syncs the
databases - heavy write traffic for a second - then continues), but
2.5 has problems with it (it gets stuck writing to disk all the time, CPU
usage is minimal and process progresses very slowly). Andrew, if
you're interested I can send you some numbers to describe the case
better.

Keep up the good work!
-- 
Zlatko


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-08 11:22           ` Zlatko Calusic
@ 2002-10-08 11:38             ` Duncan Sands
  2002-10-08 15:10               ` Zlatko Calusic
  0 siblings, 1 reply; 14+ messages in thread
From: Duncan Sands @ 2002-10-08 11:38 UTC
  To: zlatko.calusic, Alessandro Suardi
  Cc: Hugh Dickins, Andrew Morton, linux-kernel

> I also observed that another application I use occasionally - LXR (Linux
> source cross referencing tool) - takes much longer to generate xref
> database (which is in Berkeley DB files). It works in three passes,
> where the last one, when it dumps symbols into DB, is interesting. In
> 2.4 it finishes quickly (it uses 100% CPU, then occasionally syncs the
> databases - heavy write traffic for a second - then continues), but
> 2.5 has problems with it (it gets stuck writing to disk all the time, CPU
> usage is minimal and process progresses very slowly). Andrew, if
> you're interested I can send you some numbers to describe the case
> better.

Hmmm, are you using ext3?  Changes to the meaning of yield sometimes
make fsync go very slowly.  This problem has been around since 2.5.28,
and hasn't yet been fixed (As for a fix, Andrew Morton said "I'll sit tight for
the while, see where sched_yield() behaviour ends up").

All the best,

Duncan.


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-08 11:38             ` Duncan Sands
@ 2002-10-08 15:10               ` Zlatko Calusic
  2002-10-08 15:25                 ` Duncan Sands
  0 siblings, 1 reply; 14+ messages in thread
From: Zlatko Calusic @ 2002-10-08 15:10 UTC
  To: Duncan Sands; +Cc: Alessandro Suardi, Hugh Dickins, Andrew Morton, linux-kernel

Duncan Sands <baldrick@wanadoo.fr> writes:

>> I also observed that another application I use occasionally - LXR (Linux
>> source cross referencing tool) - takes much longer to generate xref
>> database (which is in Berkeley DB files). It works in three passes,
>> where the last one, when it dumps symbols into DB, is interesting. In
>> 2.4 it finishes quickly (it uses 100% CPU, then occasionally syncs the
>> databases - heavy write traffic for a second - then continues), but
>> 2.5 has problems with it (it gets stuck writing to disk all the time, CPU
>> usage is minimal and process progresses very slowly). Andrew, if
>> you're interested I can send you some numbers to describe the case
>> better.
>
> Hmmm, are you using ext3?  Changes to the meaning of yield sometimes
> make fsync go very slowly.  This problem has been around since 2.5.28,
> and hasn't yet been fixed (As for a fix, Andrew Morton said "I'll sit tight for
> the while, see where sched_yield() behaviour ends up").
>

Yes, it's an ext3 partition, ordered mode. I don't have ext2 compiled
into kernel anymore. :)

Hm, if it's a problem with fsync() then that could explain slight
Oracle slowdown, too, as I think that Oracle is a heavy user of
fsync. But I don't know that for sure. I'll investigate further..

Regards,
-- 
Zlatko


* Re: [PATCH] Re: Shared memory shmat/dt not working well in 2.5.x
  2002-10-08 15:10               ` Zlatko Calusic
@ 2002-10-08 15:25                 ` Duncan Sands
  0 siblings, 0 replies; 14+ messages in thread
From: Duncan Sands @ 2002-10-08 15:25 UTC
  To: zlatko.calusic
  Cc: Alessandro Suardi, Hugh Dickins, Andrew Morton, linux-kernel

> > Hmmm, are you using ext3?  Changes to the meaning of yield sometimes
> > make fsync go very slowly.  This problem has been around since 2.5.28,
> > and hasn't yet been fixed (As for a fix, Andrew Morton said "I'll sit
> > tight for the while, see where sched_yield() behaviour ends up").
>
> Yes, it's an ext3 partition, ordered mode. I don't have ext2 compiled
> into kernel anymore. :)
>
> Hm, if it's a problem with fsync() then that could explain slight
> Oracle slowdown, too, as I think that Oracle is a heavy user of
> fsync. But I don't know that for sure. I'll investigate further..

Andrew Morton made this suggestion to me:

>Please try replacing the yield() in fs/jbd/transaction.c
>with
>
>        set_current_state(TASK_RUNNING);
>        schedule();

and indeed it cured my problems.

All the best,

Duncan.


Thread overview: 14+ messages
2002-10-01  9:52 Shared memory shmat/dt not working well in 2.5.x Zlatko Calusic
2002-10-01 13:07 ` Alessandro Suardi
2002-10-01 13:09 ` [PATCH] " Hugh Dickins
2002-10-01 13:28   ` Alessandro Suardi
2002-10-01 13:46     ` Zlatko Calusic
2002-10-01 14:51       ` Alessandro Suardi
2002-10-01 14:59         ` Zlatko Calusic
2002-10-02 18:45         ` Zlatko Calusic
2002-10-08 11:22           ` Zlatko Calusic
2002-10-08 11:38             ` Duncan Sands
2002-10-08 15:10               ` Zlatko Calusic
2002-10-08 15:25                 ` Duncan Sands
2002-10-01 13:37   ` Zlatko Calusic
2002-10-01 15:32 ` [PATCH] Oracle startup split_vma fix Hugh Dickins
