linux-kernel.vger.kernel.org archive mirror
* naive questions about thrashing
@ 2003-04-30 18:02 Timothy Miller
  2003-04-30 23:07 ` J.A. Magallon
  0 siblings, 1 reply; 3+ messages in thread
From: Timothy Miller @ 2003-04-30 18:02 UTC (permalink / raw)
  To: Linux Kernel Mailing List

I am running kernel version 2.4.18-26.7.x under Red Hat 7.2.

I wrote a CPU-intensive program which attempts to use over 700 megs of 
RAM on a 512-meg box, so it thrashes.

One thing I noticed was that 'top' reported that the kernel ("system") 
was using 68% of the CPU.  (The offending process was getting about 9%.) 
How much CPU involvement is there in sending I/O requests to the drive 
and waiting on an interrupt?  Maybe I don't understand what's going on, 
but I would expect the CPU involvement in disk I/O to be practically 
NIL, unless it's trying to be really smart about it.  Is it?  Or maybe 
the kernel isn't using DMA... this is a Dell Precision 340.  I'm not 
sure what drive is in it, but I would be surprised if it weren't using DMA.
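
One way to check where a "system" percentage like this comes from is to sample the aggregate cpu line of /proc/stat twice and compare the counters. The sketch below is only an illustration under stated assumptions: it assumes the first four counters on that line are user, nice, system and idle (newer kernels append further fields, which this sketch ignores), and the helper names are hypothetical rather than part of any existing tool.

```python
import time

# Assumption: the first four counters of the aggregate "cpu" line.
FIELDS = ("user", "nice", "system", "idle")


def parse_cpu_line(line):
    """Turn a line such as 'cpu  100 0 50 850' into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:5])))


def cpu_shares(before, after):
    """Percentage of elapsed ticks per field, relative to the four fields counted."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read the counters twice, 'interval' seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return cpu_shares(before, after)
```

Calling sample() while the thrashing program runs shows whether the time is really being charged to "system" rather than "idle"; it does not say what the kernel is doing with that time.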

The next thing has to do with interactivity.  I don't know how much 
interactivity code is in the 2.4 kernel, but as I understand it, it's 
based on a process giving up its timeslice.  Well, if a process is 
causing numerous page faults, is it not, in effect, giving up its 
timeslice?  Is it therefore being scheduled as an interactive process? 
If so, the process scheduler is very mistaken.  In my naive opinion, 
causing a page fault should NOT be an indication of interactivity.  It 
seemed to me that the process was getting too much attention from the 
process scheduler.  The thrashing process was getting run while other 
processes (like X, etc.) were being almost completely starved.

Does causing a page fault get interpreted as interactivity?

Going under the assumption that disk I/O should require almost no CPU 
involvement, it seems to me that memory could be managed so that 
processes which behave themselves should feel no effect from the 
thrashing, while the thrashing process continues to thrash and therefore 
has little opportunity to use the CPU.  This would require that the VM 
not swap out pages which are being used heavily by well-behaved 
processes.  Instead, it seems that the thrashing process is being 
allowed to completely trash everything else running in the system.

Now, I'm not stupid.  Someone's going to tell me that the problem is in 
user space, and that I should fix the user space program.  (I am doing 
that.)  I can certainly understand that perspective, but it seems to me 
that the fairest result of a process misbehaving is that it should only 
punish itself, not every other process in the system.



* Re: naive questions about thrashing
  2003-04-30 18:02 naive questions about thrashing Timothy Miller
@ 2003-04-30 23:07 ` J.A. Magallon
  2003-05-01 15:52   ` Timothy Miller
  0 siblings, 1 reply; 3+ messages in thread
From: J.A. Magallon @ 2003-04-30 23:07 UTC (permalink / raw)
  To: Timothy Miller; +Cc: Linux Kernel Mailing List


On 04.30, Timothy Miller wrote:
> I am running kernel version 2.4.18-26.7.x under Red Hat 7.2.
> 
> I wrote a CPU-intensive program which attempts to use over 700 megs of 
> RAM on a 512-meg box, therefore it thrashes.
> 
> One thing I noticed was that 'top' reported that the kernel ("system") 
> was using 68% of the CPU.  (The offending process was getting about 9%.) 
>   How much CPU involvement is there in sending I/O requests to the drive 
> and waiting on an interrupt?  Maybe I don't understand what's going on, 
> but I would expect the CPU involvement in disk I/O to be practically 
> NIL, unless it's trying to be really smart about it.  Is it?  Or maybe 
> the kernel isn't using DMA... this is a Dell Precision 340.  I'm not 
> sure what drive is in it, but I would be surprised if it weren't using DMA.
> 

As I understand it, it is telling you that your program spends 68% of
its time in kernel space, i.e., waiting for your pages to come from disk.
It does not mean that the CPU is doing anything, but it is locked by the
kernel.

If you can't afford to buy more memory, recode the thing. So much thrashing
suggests that you access your data very randomly. Try to process the data
in a more sequential way, so that you only fault after processing a big
chunk of data. With 700MB of data and a 512MB box, at least half of your
data fits in memory, so under ideal sequential access you would just page
300MB one time...
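
The difference between sequential and random access can be illustrated with a toy simulation. This is only a sketch under loud assumptions: it models memory as a strict LRU set of page frames (the real 2.4 VM does not use strict LRU), treats one "page" as standing for 1 MB, and all function names are made up for the example.

```python
import random
from collections import OrderedDict


def count_faults(accesses, frames):
    """Count page faults for an access trace under a strict LRU policy."""
    resident = OrderedDict()  # keys are resident pages, oldest first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = True
    return faults


def sequential_trace(pages, touches):
    """Touch each page 'touches' times in a row, then move on for good."""
    return [p for p in range(pages) for _ in range(touches)]


def random_trace(pages, touches, seed=0):
    """The same set of accesses as sequential_trace, in shuffled order."""
    trace = sequential_trace(pages, touches)
    random.Random(seed).shuffle(trace)
    return trace


PAGES, FRAMES, TOUCHES = 700, 512, 10  # hypothetical sizes echoing 700 vs 512
seq_faults = count_faults(sequential_trace(PAGES, TOUCHES), FRAMES)
rnd_faults = count_faults(random_trace(PAGES, TOUCHES), FRAMES)
print("sequential faults:", seq_faults)
print("random faults:    ", rnd_faults)
```

In the sequential trace every page is brought in once and never needed again, so the fault count equals the number of pages. In the shuffled trace the same accesses keep evicting pages that are needed again shortly afterwards, which is the thrashing pattern.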

-- 
J.A. Magallon <jamagallon@able.es>      \                 Software is like sex:
werewolf.able.es                         \           It's better when it's free
Mandrake Linux release 9.2 (Cooker) for i586
Linux 2.4.21-rc1-jam1 (gcc 3.2.2 (Mandrake Linux 9.2 3.2.2-5mdk))


* Re: naive questions about thrashing
  2003-04-30 23:07 ` J.A. Magallon
@ 2003-05-01 15:52   ` Timothy Miller
  0 siblings, 0 replies; 3+ messages in thread
From: Timothy Miller @ 2003-05-01 15:52 UTC (permalink / raw)
  To: J.A. Magallon; +Cc: Linux Kernel Mailing List



J.A. Magallon wrote:
> On 04.30, Timothy Miller wrote:
> 
>>I am running kernel version 2.4.18-26.7.x under Red Hat 7.2.
>>
>>I wrote a CPU-intensive program which attempts to use over 700 megs of 
>>RAM on a 512-meg box, therefore it thrashes.
>>
>>One thing I noticed was that 'top' reported that the kernel ("system") 
>>was using 68% of the CPU.  (The offending process was getting about 9%.) 
>>  How much CPU involvement is there in sending I/O requests to the drive 
>>and waiting on an interrupt?  Maybe I don't understand what's going on, 
>>but I would expect the CPU involvement in disk I/O to be practically 
>>NIL, unless it's trying to be really smart about it.  Is it?  Or maybe 
>>the kernel isn't using DMA... this is a Dell Precision 340.  I'm not 
>>sure what drive is in it, but I would be surprised if it weren't using DMA.
>>
> 
> 
> As I understand it, it is telling you that your program spends 68% of
> its time in kernel space, i.e., waiting for your pages to come from disk.
> It does not mean that the CPU is doing anything, but it is locked by the
> kernel.

Why would the kernel be locked while waiting on disk I/O?  Shouldn't it 
be running another process?  It's not DOING anything.  The whole idea 
behind a multitasking OS is to overlap the I/O of one process with the 
CPU usage of another whenever possible.  Swapping is an I/O operation.

And for that matter, if every runnable process has pages swapped out so 
that they cannot run, then the CPU should be IDLE.

Am I wrong?

> 
> If you can't afford to buy more memory, recode the thing. So much thrashing
> suggests that you access your data very randomly. Try to process the data
> in a more sequential way, so that you only fault after processing a big
> chunk of data. With 700MB of data and a 512MB box, at least half of your
> data fits in memory, so under ideal sequential access you would just page
> 300MB one time...
> 

The process got that large because of a bug in my program.  But a 
side-effect of that was kernel behavior that didn't make sense to me.  I 
decided to ask about it.


