linux-kernel.vger.kernel.org archive mirror
* VM: killing process amavis
@ 2003-08-13 15:23 Coen Rosdorff
  2003-08-13 15:40 ` Hugh Dickins
  0 siblings, 1 reply; 4+ messages in thread
From: Coen Rosdorff @ 2003-08-13 15:23 UTC (permalink / raw)
  To: linux-kernel

Who can tell me something about this error in /var/log/messages:

Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000

Memtest86: No errors.

Kernel: 2.4.21
Mem: 256MB
CPU: Intel PII 300MHz

# cat /proc/swaps 
Filename                        Type            Size    Used    Priority
/dev/sda2                       partition       530136  44256   -1

# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  263229440 194764800 68464640        0 55820288 91078656
Swap: 542859264 45318144 497541120
MemTotal:       257060 kB
MemFree:         66860 kB
MemShared:           0 kB
Buffers:         54512 kB
Cached:          58248 kB
SwapCached:      30696 kB
Active:          90332 kB
Inactive:        74088 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       257060 kB
LowFree:         66860 kB
SwapTotal:      530136 kB
SwapFree:       485880 kB



* Re: VM: killing process amavis
  2003-08-13 15:23 VM: killing process amavis Coen Rosdorff
@ 2003-08-13 15:40 ` Hugh Dickins
  2003-08-13 19:40   ` Coen Rosdorff
  0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2003-08-13 15:40 UTC (permalink / raw)
  To: Coen Rosdorff; +Cc: linux-kernel

On Wed, 13 Aug 2003, Coen Rosdorff wrote:
> Who can tell me something about this error in /var/log/messages:
> 
> Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
> Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000
> 
> Memtest86: No errors.

It really would be worth giving memtest86 a good long run.

02000000 looks very much like a single-bit memory error,
and swap_free is exactly where such errors often show up.
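
The "single-bit" reading can be checked mechanically. The following is
only a minimal Python sketch; the helper name is_single_bit is made up
for illustration and has nothing to do with the kernel's own code. It
treats the logged offset as a hexadecimal number and tests whether
exactly one bit is set.

```python
def is_single_bit(value: int) -> bool:
    """Return True when exactly one bit of value is set."""
    # value & (value - 1) clears the lowest set bit, so the result is
    # zero only when that bit was the sole bit set.
    return value != 0 and (value & (value - 1)) == 0


# The offset reported by swap_free in the log, read as hexadecimal.
entry = int("02000000", 16)
print(is_single_bit(entry))
```

A value with exactly one bit set is what a single flipped bit in an
otherwise zero word would leave behind.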

Hugh



* Re: VM: killing process amavis
  2003-08-13 15:40 ` Hugh Dickins
@ 2003-08-13 19:40   ` Coen Rosdorff
  2003-08-17  9:48     ` Rob Landley
  0 siblings, 1 reply; 4+ messages in thread
From: Coen Rosdorff @ 2003-08-13 19:40 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-kernel

On Wed, 13 Aug 2003, Hugh Dickins wrote:

> It really would be worth giving memtest86 a good long run.
> 
> 02000000 looks very much like a single-bit memory error,
> and swap_free is exactly where such errors often show up.

I had the same problem before on the previous server. Running memtest for 
19 days didn't show any memory problems.

After replacing the motherboard, CPU and RAM, I now have the same problem
again.

Previous motherboard:
swap_free: Unused swap offset entry 00000100

Apr 26 09:40:05 rosdorff kernel: kernel BUG at dcache.c:345!
Apr 26 09:40:05 rosdorff kernel: invalid operand: 0000
Apr 26 09:40:05 rosdorff kernel: CPU:    0
Apr 26 09:40:05 rosdorff kernel: EIP:    0010:[<c0141764>]    Not tainted
Apr 26 09:40:05 rosdorff kernel: EFLAGS: 00010206
Apr 26 09:40:05 rosdorff kernel: eax: 00000100   ebx: c17a8958   ecx: c1127f84   edx: c17a89d8
Apr 26 09:40:05 rosdorff kernel: esi: c17a8940   edi: 0000064d   ebp: 00000113   esp: c1147f20
Apr 26 09:40:05 rosdorff kernel: ds: 0018   es: 0018   ss: 0018
Apr 26 09:40:05 rosdorff kernel: Process kswapd (pid: 4, stackpage=c1147000)

May 11 14:40:05 rosdorff kernel: kernel BUG at dcache.c:345!
May 11 14:40:05 rosdorff kernel: invalid operand: 0000
May 11 14:40:05 rosdorff kernel: CPU:    0
May 11 14:40:05 rosdorff kernel: EIP:    0010:[<c0141b84>]    Not tainted
May 11 14:40:05 rosdorff kernel: EFLAGS: 00010206
May 11 14:40:05 rosdorff kernel: eax: 00000100   ebx: c17a8958   ecx: c1127f84   edx: c4d2a6d8
May 11 14:40:05 rosdorff kernel: esi: c17a8940   edi: 000011aa   ebp: 0000021b   esp: c114bf20
May 11 14:40:05 rosdorff kernel: ds: 0018   es: 0018   ss: 0018
May 11 14:40:05 rosdorff kernel: Process kswapd (pid: 4, stackpage=c114b000)

Jun 18 05:00:06 rosdorff kernel: kernel BUG at dcache.c:345!
Jun 18 05:00:06 rosdorff kernel: invalid operand: 0000
Jun 18 05:00:06 rosdorff kernel: CPU:    0
Jun 18 05:00:06 rosdorff kernel: EIP:    0010:[<c0141264>]    Not tainted
Jun 18 05:00:06 rosdorff kernel: EFLAGS: 00010206
Jun 18 05:00:06 rosdorff kernel: eax: 00000100   ebx: c17a8958   ecx: c110ff84   edx: c17a89d8
Jun 18 05:00:06 rosdorff kernel: esi: c17a8940   edi: 000019c1   ebp: 00000393   esp: c1163f20
Jun 18 05:00:06 rosdorff kernel: ds: 0018   es: 0018   ss: 0018
Jun 18 05:00:06 rosdorff kernel: Process kswapd (pid: 4, stackpage=c1163000)


Current motherboard:
Jul  8 08:31:53 rosdorff kernel: memory.c:100: bad pmd 02000000

Jul 15 04:05:16 rosdorff kernel: Unable to handle kernel paging request at virtual address 02000000
Jul 15 04:05:16 rosdorff kernel:  printing eip:
Jul 15 04:05:16 rosdorff kernel: c0131614
Jul 15 04:05:16 rosdorff kernel: *pde = 00000000
Jul 15 04:05:16 rosdorff kernel: Oops: 0002
Jul 15 04:05:16 rosdorff kernel: CPU:    0
Jul 15 04:05:16 rosdorff kernel: EIP:    0010:[<c0131614>]    Not tainted
Jul 15 04:05:16 rosdorff kernel: EFLAGS: 00010256
Jul 15 04:05:16 rosdorff kernel: eax: 00000000   ebx: c36cf3e0   ecx: c36cf3e0   edx: 02000000
Jul 15 04:05:16 rosdorff kernel: esi: c36cf3e0   edi: c36cf3e0   ebp: c11b5970   esp: c136df00
Jul 15 04:05:16 rosdorff kernel: ds: 0018   es: 0018   ss: 0018
Jul 15 04:05:16 rosdorff kernel: Process kswapd (pid: 4, stackpage=c136d000)

Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000


So the problem moved from 00000100 to 02000000
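
To make the comparison concrete, here is a small Python sketch that
lists which bit positions are set in each of the two logged values. The
function name set_bits is hypothetical, and the sketch assumes the
values are plain hexadecimal numbers as printed in the logs.

```python
def set_bits(value: int) -> list:
    """Return the positions (0 = least significant) of the bits set in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# Old motherboard first, current motherboard second.
for text in ("00000100", "02000000"):
    print(text, set_bits(int(text, 16)))
```

Each value has a single bit set: bit 8 on the old board and bit 25 on
the current one.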

The network cards and the 3ware RAID controller moved from the old to the 
new box. Could one of them be the problem?

I am running out of options.


TIA,
Coen Rosdorff



* Re: VM: killing process amavis
  2003-08-13 19:40   ` Coen Rosdorff
@ 2003-08-17  9:48     ` Rob Landley
  0 siblings, 0 replies; 4+ messages in thread
From: Rob Landley @ 2003-08-17  9:48 UTC (permalink / raw)
  To: Coen Rosdorff, Hugh Dickins; +Cc: linux-kernel

On Wednesday 13 August 2003 15:40, Coen Rosdorff wrote:
> On Wed, 13 Aug 2003, Hugh Dickins wrote:
> > It really would be worth giving memtest86 a good long run.
> >
> > 02000000 looks very much like a single-bit memory error,
> > and swap_free is exactly where such errors often show up.
>
> I had the same problem before on the previous server. Running memtest for
> 19 days didn't show any memory problems.
>
> After replacing the motherboard, CPU and RAM, I now have the same problem
> again.

I once had a system that looked very much like it had bad RAM, but it turned 
out to have a bad hard drive controller. The corruption showed up when paging 
stuff into memory from disk (a la exec, sometimes) and when bringing stuff 
back in from swap.  (The kernel itself almost never went bye-bye, because it 
never gets swapped out, you see...)

Caused the weirdest problems in Myth II, among other things...

> So the problem moved from 00000100 to 02000000
>
> The network cards and the 3ware RAID controller moved from the old to the
> new box. Could one of them be the problem?
>
> I am running out of options.

Check the RAID controller, especially if you're swapping through it.  I found 
out what was wrong with the other system by copying big tarballs through the 
network and verifying them.

Try this:

1) Copy a tarball to the remote system and confirm that it came out OK just 
coming across the network.

  cat enormous.tgz | ssh othersystem "tar tvzf -"

2) Now copy the tarball to the remote machine's disk, and test that the copy 
on disk is good.

  cat enormous.tgz | ssh othersystem "cat > temp.tgz; tar tvzf temp.tgz"

Use a tarball that's bigger than your RAM, of course, so that it actually has 
to be written out to disk and read back in again.  Using ssh provides a 
little bit of CPU load, and the network provides a competing source of 
interrupts.  (You could also run contest in the background or some such to 
really beat the system to death...)
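
The disk half of that test can also be sketched locally, without a
second machine, as a write-then-read-back checksum comparison. This is
only a rough Python sketch under assumptions: the function name
write_and_verify and its parameters are invented for illustration, and
random data stands in for the tarball. As with the tarball, the file has
to be larger than RAM, otherwise the read-back is likely to be served
from the page cache and never touch the controller.

```python
import hashlib
import os


def write_and_verify(path: str, size_bytes: int, chunk_size: int = 1 << 20) -> bool:
    """Write size_bytes of random data to path, read it back, compare SHA-256."""
    written = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            written.update(chunk)  # hash the data before it goes to disk
            f.write(chunk)
            remaining -= len(chunk)
        f.flush()
        # Ask the kernel to write the dirty pages to the device. The pages
        # may still stay cached, hence the "bigger than RAM" requirement.
        os.fsync(f.fileno())

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            read_back.update(chunk)  # hash what the disk hands back
    return written.hexdigest() == read_back.hexdigest()
```

Calling, say, write_and_verify("testfile.bin", 512 * 2**20) on a
filesystem behind the suspect controller (the file name is a
placeholder) returns False if what came back differs from what was
written. A True result on a single run does not clear the hardware, so
repeated runs under load are more telling.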

Rob


