linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* experiences with 2.5.40 on a busy usenet news server
@ 2002-10-08  8:46 Miquel van Smoorenburg
  2002-10-08  9:11 ` bert hubert
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Miquel van Smoorenburg @ 2002-10-08  8:46 UTC (permalink / raw)
  To: linux-kernel

Just FYI:

So I booted 2.5.40 with the raid0 fix on our usenet news peering
server yesterday. It is a box that exchanges binary feeds with
about 40 peers, 400 GB/day in, 600 GB/day out.

It's a dual PIII/450, 1 GB RAM, 4x18 GB article spool directly
on partitions (not raw, but normal partitions). INN-2.4/CNFS.

With 2.4.19, it runs fine. With 2.5.40, it goes wildly into
swap. I'm assuming the I/O is pushing the newsserver binaries
and database mappings into swap.

# free
             total       used       free     shared    buffers     cached
Mem:       1033308    1027316       5992          0     836884      29776
-/+ buffers/cache:     160656     872652
Swap:       976888     364032     612856

No need to swap 364 MB when there's 872 MB still free...
This makes the machine dogslow. An 'expire' process that
runs every night normally takes 15 minutes to finish now
has been running for 10 hours and its still not finished.

Article acceptance rate has halved, the machine can't keep up
with the binaries it is fed.

I'm going to risk corrupting the databases and reboot back
to 2.4.19 now.

Mike.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: experiences with 2.5.40 on a busy usenet news server
  2002-10-08  8:46 experiences with 2.5.40 on a busy usenet news server Miquel van Smoorenburg
@ 2002-10-08  9:11 ` bert hubert
  2002-10-08  9:15 ` Andrew Morton
  2002-10-08  9:22 ` Nick Sanders
  2 siblings, 0 replies; 6+ messages in thread
From: bert hubert @ 2002-10-08  9:11 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-kernel

On Tue, Oct 08, 2002 at 08:46:20AM +0000, Miquel van Smoorenburg wrote:
> Just FYI:
> 
> So I booted 2.5.40 with the raid0 fix on our usenet news peering
> server yesterday. It is a box that exchanges binary feeds with
> about 40 peers, 400 GB/day in, 600 GB/day out.

If you'd dare to try a next time, could you try 2.5.4x-mm ? It tends to be
far more well tuned and is where vm development takes place.

Regards,

bert

-- 
http://www.PowerDNS.com          Versatile DNS Software & Services
http://www.tk                              the dot in .tk
http://lartc.org           Linux Advanced Routing & Traffic Control HOWTO

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: experiences with 2.5.40 on a busy usenet news server
  2002-10-08  8:46 experiences with 2.5.40 on a busy usenet news server Miquel van Smoorenburg
  2002-10-08  9:11 ` bert hubert
@ 2002-10-08  9:15 ` Andrew Morton
  2002-10-08  9:36   ` Miquel van Smoorenburg
  2002-10-08 23:02   ` Miquel van Smoorenburg
  2002-10-08  9:22 ` Nick Sanders
  2 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2002-10-08  9:15 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-kernel

Miquel van Smoorenburg wrote:
> 
> ...
> # free
>              total       used       free     shared    buffers     cached
> Mem:       1033308    1027316       5992          0     836884      29776
> -/+ buffers/cache:     160656     872652
> Swap:       976888     364032     612856

Please always send /proc/meminfo - it's way more informative.

A vmstat trace is also useful.
 
> No need to swap 364 MB when there's 872 MB still free...
> This makes the machine dogslow. An 'expire' process that
> runs every night normally takes 15 minutes to finish now
> has been running for 10 hours and its still not finished.

It must be doing a ton of IO?

You'll probably find that 2.5.41-mm1 does not swap at all; but
I'd need to see meminfo to know.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: experiences with 2.5.40 on a busy usenet news server
  2002-10-08  8:46 experiences with 2.5.40 on a busy usenet news server Miquel van Smoorenburg
  2002-10-08  9:11 ` bert hubert
  2002-10-08  9:15 ` Andrew Morton
@ 2002-10-08  9:22 ` Nick Sanders
  2 siblings, 0 replies; 6+ messages in thread
From: Nick Sanders @ 2002-10-08  9:22 UTC (permalink / raw)
  To: Miquel van Smoorenburg, linux-kernel

On Tuesday 08 October 2002 9:46 am, Miquel van Smoorenburg wrote:
> Just FYI:
>
> So I booted 2.5.40 with the raid0 fix on our usenet news peering
> server yesterday. It is a box that exchanges binary feeds with
> about 40 peers, 400 GB/day in, 600 GB/day out.
>
> It's a dual PIII/450, 1 GB RAM, 4x18 GB article spool directly
> on partitions (not raw, but normal partitions). INN-2.4/CNFS.
>
> With 2.4.19, it runs fine. With 2.5.40, it goes wildly into
> swap. I'm assuming the I/O is pushing the newsserver binaries
> and database mappings into swap.
>
> # free
>              total       used       free     shared    buffers     cached
> Mem:       1033308    1027316       5992          0     836884      29776
> -/+ buffers/cache:     160656     872652
> Swap:       976888     364032     612856
>
> No need to swap 364 MB when there's 872 MB still free...
> This makes the machine dogslow. An 'expire' process that
> runs every night normally takes 15 minutes to finish now
> has been running for 10 hours and its still not finished.
>
> Article acceptance rate has halved, the machine can't keep up
> with the binaries it is fed.
>
> I'm going to risk corrupting the databases and reboot back
> to 2.4.19 now.
>
You might want to try 2.5.40-mm2

[snip]
- Started work on /proc/sys/vm/swappiness.  Setting it to 100% gives
  you current 2.5 behaviour.  Setting it to 0 feels pretty similar to
  2.4.19.
[snip]

then you would be able to test different swap behaviours

i.e echo 0 > /proc/sys/vm/swappiness for 2.4.19 behaviour

Nick

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: experiences with 2.5.40 on a busy usenet news server
  2002-10-08  9:15 ` Andrew Morton
@ 2002-10-08  9:36   ` Miquel van Smoorenburg
  2002-10-08 23:02   ` Miquel van Smoorenburg
  1 sibling, 0 replies; 6+ messages in thread
From: Miquel van Smoorenburg @ 2002-10-08  9:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

According to Andrew Morton:
> Miquel van Smoorenburg wrote:
> > # free
> 
> Please always send /proc/meminfo - it's way more informative.
> A vmstat trace is also useful.

Will do next time.

> > No need to swap 364 MB when there's 872 MB still free...
> > This makes the machine dogslow. An 'expire' process that
> > runs every night normally takes 15 minutes to finish now
> > has been running for 10 hours and its still not finished.
> 
> It must be doing a ton of IO?

Oh yes. This is a usenet news server, 50 mbit/sec
sustained in, 100 mbit/sec sustained out, and it's all being
cached on disk. See http://newsgate.cistron.nl/

> You'll probably find that 2.5.41-mm1 does not swap at all; but
> I'd need to see meminfo to know.

Right now I've not rebooted but instead I turned off swap. It
has enough memory anyway.

The 'expire' process that ran for 10 hours finished within
2 minutes, load went down from 6 to 1.8, and traffic volume
is climbing again.

I'll let it run like this for a few hours so it can catch up
with the backlog my peers have to me, then in the afternoon I'll
try 2.5.41-mm<latest> on it.

Mike.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: experiences with 2.5.40 on a busy usenet news server
  2002-10-08  9:15 ` Andrew Morton
  2002-10-08  9:36   ` Miquel van Smoorenburg
@ 2002-10-08 23:02   ` Miquel van Smoorenburg
  1 sibling, 0 replies; 6+ messages in thread
From: Miquel van Smoorenburg @ 2002-10-08 23:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

According to Andrew Morton:
> You'll probably find that 2.5.41-mm1 does not swap at all; but
> I'd need to see meminfo to know.

2.5.41-mm1 panics on boot for me. I applied 2 patches to it;
the first is the mremap fix you talked about earlier, the second
is the raid0 fix posted by Peter Chubb <peter@chubb.wattle.id.au>
last friday, which works fine on 2.5.40. Ofcourse I ported it
to 2.5.41-mm1, but both code paths are not used at the time
of the panic AFAICS

Below is the boot log, which shows the panic, followed by the
patches I used.

I can't experiment further since a) this is a production machine
and b) I really have to go to bed now :|

Loading 2.5.41........................
Linux version 2.5.41 (root@wormhole) (gcc version 2.95.4 20011006 (Debian prerelease)) #3 SMP Wed Oct 9 00:42:56 CEST 2002
Video mode to be used for restore is ffff
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000040000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
128MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fb460
hm, page 000fb000 reserved twice.
hm, page 000fc000 reserved twice.
hm, page 000f5000 reserved twice.
hm, page 000f6000 reserved twice.
On node 0 totalpages: 262144
  DMA zone: 4096 pages
  Normal zone: 225280 pages
  HighMem zone: 32768 pages
ACPI: Unable to locate RSDP
Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
OEM ID: INTEL    Product ID: 440BX        APIC at: 0xFEE00000
Processor #0 6:7 APIC version 17
Processor #1 6:7 APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Processors: 2
Building zonelist for node : 0
Kernel command line: BOOT_IMAGE=2.5.41 root=801 ioapic_level=9,10,15 rootfstype=ext2 panic=30 console=tty0 console=ttyS0,9600n8
IO-APIC-level enabling for IRQ9 IRQ10 IRQ15
Initializing CPU#0
Detected 449.298 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 884.73 BogoMIPS
Memory: 1033152k/1048576k available (1458k kernel code, 13928k reserved, 644k data, 108k init, 131072k highmem)
Security Scaffold v1.0.0 initialized
Dentry-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU serial number disabled.
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU0: Intel Pentium III (Katmai) stepping 02
per-CPU timeslice cutoff: 1461.26 usecs.
task migration cache decay timeout: 2 msecs.
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000004
ESR value after enabling vector: 00000000
Booting processor 1/1 eip 2000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 897.02 BogoMIPS
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU serial number disabled.
CPU1: Intel Pentium III (Katmai) stepping 02
Total of 2 processors activated (1781.76 BogoMIPS).
ENABLING IO-APIC IRQs
Setting 2 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 2 ... ok.
, 2-11<7>Forcing IRQ15 to level
, 2-16, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=0
testing the IO APIC.......................

.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 449.0163 MHz.
..... host bus clock speed is 99.0813 MHz.
checking TSC synchronization across 2 CPUs: passed.
Starting migration thread for cpu 0
Bringing up 1
CPU 1 IS NOW UP!
Starting migration thread for cpu 1
Debug: sleeping function called from illegal context at kernel/sched.c:1177
Call Trace:
 [<c0115da4>] __might_sleep+0x54/0x58
 [<c011462b>] wait_for_completion+0x1b/0x104
 [<c011360b>] wake_up_process+0xb/0x10
 [<c0115996>] set_cpus_allowed+0x14a/0x16c
 [<c0115a08>] migration_thread+0x50/0x32c
 [<c01159b8>] migration_thread+0x0/0x32c
 [<c01054ed>] kernel_thread_helper+0x5/0xc

bad: scheduling while atomic!
Call Trace:
 [<c0114061>] schedule+0x3d/0x404
 [<c0115da4>] __might_sleep+0x54/0x58
 [<c01146c5>] wait_for_completion+0xb5/0x104
 [<c011446c>] default_wake_function+0x0/0x34
 [<c011446c>] default_wake_function+0x0/0x34
 [<c0115996>] set_cpus_allowed+0x14a/0x16c
 [<c0115a08>] migration_thread+0x50/0x32c
 [<c01159b8>] migration_thread+0x0/0x32c
 [<c01054ed>] kernel_thread_helper+0x5/0xc

CPUS done 4294967295
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c0132d94
*pde = 00000000
Oops: 0000
 
CPU:    1
EIP:    0060:[<c0132d94>]    Not tainted
EFLAGS: 00010002
EIP is at kmem_cache_alloc+0x18/0x48
eax: 00000004   ebx: 00000246   ecx: c02cff40   edx: 00000000
esi: 00000138   edi: 00000000   ebp: 00000000   esp: f7fa5f80
ds: 0068   es: 0068   ss: 0068
Process swapper (pid: 1, threadinfo=f7fa4000 task=f7fc7020)
Stack: f7fa4000 c0131ce8 c02cff40 000001d0 f7fa4000 00000000 00000000 00000000 
       f7fa4000 ffffe1e5 c0118a4c c040e83b 00000246 c02ce6c0 0000003b c0324d0a 
       c02ad2d1 00000138 00000000 00002000 00000000 00000000 c0324cb2 c0310862 
Call Trace:
 [<c0131ce8>] kmem_cache_create+0x6c/0x5c4
 [<c0118a4c>] release_console_sem+0xa4/0xdc
 [<c01050ab>] init+0x47/0x1ac
 [<c0105064>] init+0x0/0x1ac
 [<c01054ed>] kernel_thread_helper+0x5/0xc

Code: 8b 02 85 c0 74 16 c7 42 0c 01 00 00 00 48 89 02 8b 44 82 10 
 <0>Kernel panic: Attempted to kill init!
 <0>Rebooting in 30 seconds..


--- linux-2.5.41-mm1/mm/mremap.c.orig	Tue Oct  8 23:56:22 2002
+++ linux-2.5.41-mm1/mm/mremap.c	Wed Oct  9 00:05:15 2002
@@ -54,7 +54,7 @@
 	return pte;
 }
 
-#ifdef CONFIG_HIGHPTE	/* Save a few cycles on the sane machines */
+#ifdef CONFIG_HIGHMEM	/* Save a few cycles on the sane machines */
 static inline int page_table_present(struct mm_struct *mm, unsigned long addr)
 {
 	pgd_t *pgd;



--- linux-2.5.41-mm1/drivers/md/raid0.c.orig	Tue Oct  8 23:56:14 2002
+++ linux-2.5.41-mm1/drivers/md/raid0.c	Wed Oct  9 00:00:58 2002
@@ -162,6 +162,29 @@
 	return 1;
 }
 
+/**
+ *	raid0_mergeable_bvec -- tell bio layer if a two requests can be merged
+ *	@q: request queue
+ *	@bio: the buffer head that's been built up so far
+ *	@biovec: the request that could be merged to it.
+ *
+ *	Return 1 if the merge is not permitted (because the
+ *	result would cross a chunk boundary), 0 otherwise.
+ */
+static int raid0_mergeable_bvec(request_queue_t *q, struct bio *bio, struct bio_vec *biovec)
+{
+	mddev_t *mddev = q->queuedata;
+	sector_t block;
+	unsigned int chunk_size;
+	unsigned int bio_sz;
+
+	chunk_size = mddev->chunk_size >> 10;
+	block = bio->bi_sector >> 1;
+	bio_sz = (bio->bi_size + biovec->bv_len) >> 10;
+
+	return chunk_size < ((block & (chunk_size - 1)) + bio_sz);
+}
+
 static int raid0_run (mddev_t *mddev)
 {
 	unsigned  cur=0, i=0, nb_zone;
@@ -233,6 +256,8 @@
 		conf->hash_table[i++].zone1 = conf->strip_zone + cur;
 		size -= (conf->smallest->size - zone0_size);
 	}
+	blk_queue_max_sectors(&mddev->queue, mddev->chunk_size >> 9);
+	blk_queue_merge_bvec(&mddev->queue, raid0_mergeable_bvec);
 	return 0;
 
 out_free_zone_conf:
@@ -262,13 +287,6 @@
 	return 0;
 }
 
-/*
- * FIXME - We assume some things here :
- * - requested buffers NEVER bigger than chunk size,
- * - requested buffers NEVER cross stripes limits.
- * Of course, those facts may not be valid anymore (and surely won't...)
- * Hey guys, there's some work out there ;-)
- */
 static int raid0_make_request (request_queue_t *q, struct bio *bio)
 {
 	mddev_t *mddev = q->queuedata;
@@ -291,8 +309,8 @@
 		hash = conf->hash_table + x;
 	}
 
-	/* Sanity check */
-	if (chunk_size < (block & (chunk_size - 1)) + (bio->bi_size >> 10))
+	/* Sanity check -- queue functions should prevent this happening */
+	if (unlikely(chunk_size < (block & (chunk_size - 1)) + (bio->bi_size >> 10)))
 		goto bad_map;
  
 	if (!hash)


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2002-10-08 22:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-10-08  8:46 experiences with 2.5.40 on a busy usenet news server Miquel van Smoorenburg
2002-10-08  9:11 ` bert hubert
2002-10-08  9:15 ` Andrew Morton
2002-10-08  9:36   ` Miquel van Smoorenburg
2002-10-08 23:02   ` Miquel van Smoorenburg
2002-10-08  9:22 ` Nick Sanders

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).