linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: proc_misc.c bug
  2003-04-10 22:02 proc_misc.c bug David Mosberger
@ 2003-04-10 21:44 ` Alan Cox
  2003-04-10 22:49   ` Randy.Dunlap
                     ` (2 more replies)
  2003-04-10 22:18 ` Andrew Morton
  1 sibling, 3 replies; 12+ messages in thread
From: Alan Cox @ 2003-04-10 21:44 UTC (permalink / raw)
  To: davidm; +Cc: akpm, Linux Kernel Mailing List

On Thu, 2003-04-10 at 23:02, David Mosberger wrote:
> The workaround below is to allocate 4KB per 8 CPUs.  Not really a
> solution, but the fundamental problem is that /proc/interrupts
> shouldn't use a fixed buffer size in the first place.  I suppose
> another solution would be to use vmalloc() instead.  It all feels like
> bandaids though.

How about switching to Al's seqfile interface ?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* proc_misc.c bug
@ 2003-04-10 22:02 David Mosberger
  2003-04-10 21:44 ` Alan Cox
  2003-04-10 22:18 ` Andrew Morton
  0 siblings, 2 replies; 12+ messages in thread
From: David Mosberger @ 2003-04-10 22:02 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel

interrupts_open() can easily try to kmalloc() more memory than
supported by kmalloc.  E.g., with 16KB page size and NR_CPUS==64, it
would try to allocate 147456 bytes.

The workaround below is to allocate 4KB per 8 CPUs.  Not really a
solution, but the fundamental problem is that /proc/interrupts
shouldn't use a fixed buffer size in the first place.  I suppose
another solution would be to use vmalloc() instead.  It all feels like
bandaids though.

	--david

===== fs/proc/proc_misc.c 1.71 vs edited =====
--- 1.71/fs/proc/proc_misc.c	Sat Mar 22 22:14:49 2003
+++ edited/fs/proc/proc_misc.c	Thu Apr 10 14:35:16 2003
@@ -388,7 +388,7 @@
 extern int show_interrupts(struct seq_file *p, void *v);
 static int interrupts_open(struct inode *inode, struct file *file)
 {
-	unsigned size = PAGE_SIZE * (1 + NR_CPUS / 8);
+	unsigned size = 4096 * (1 + NR_CPUS / 8);
 	char *buf = kmalloc(size, GFP_KERNEL);
 	struct seq_file *m;
 	int res;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-10 22:02 proc_misc.c bug David Mosberger
  2003-04-10 21:44 ` Alan Cox
@ 2003-04-10 22:18 ` Andrew Morton
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2003-04-10 22:18 UTC (permalink / raw)
  To: davidm; +Cc: linux-kernel

David Mosberger <davidm@napali.hpl.hp.com> wrote:
>
> interrupts_open() can easily try to kmalloc() more memory than
> supported by kmalloc.  E.g., with 16KB page size and NR_CPUS==64, it
> would try to allocate 147456 bytes.
> 
> The workaround below is to allocate 4KB per 8 CPUs.  Not really a
> solution, but the fundamental problem is that /proc/interrupts
> shouldn't use a fixed buffer size in the first place.  I suppose
> another solution would be to use vmalloc() instead.  It all feels like
> bandaids though.
> 
> 	--david
> 
> ===== fs/proc/proc_misc.c 1.71 vs edited =====
> --- 1.71/fs/proc/proc_misc.c	Sat Mar 22 22:14:49 2003
> +++ edited/fs/proc/proc_misc.c	Thu Apr 10 14:35:16 2003
> @@ -388,7 +388,7 @@
>  extern int show_interrupts(struct seq_file *p, void *v);
>  static int interrupts_open(struct inode *inode, struct file *file)
>  {
> -	unsigned size = PAGE_SIZE * (1 + NR_CPUS / 8);
> +	unsigned size = 4096 * (1 + NR_CPUS / 8);

urgh, consider me thwapped.

There continue to be a lot of places where we assume that pages are 4k (eg:
sizing of the free page reserves).  But few are as fatal as this one...

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-10 21:44 ` Alan Cox
@ 2003-04-10 22:49   ` Randy.Dunlap
  2003-04-11  5:01     ` Randy.Dunlap
  2003-04-10 22:53   ` Andrew Morton
  2003-04-11  0:27   ` David Mosberger
  2 siblings, 1 reply; 12+ messages in thread
From: Randy.Dunlap @ 2003-04-10 22:49 UTC (permalink / raw)
  To: Alan Cox; +Cc: davidm, akpm, linux-kernel

On 10 Apr 2003 22:44:17 +0100 Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

| On Thu, 2003-04-10 at 23:02, David Mosberger wrote:
| > The workaround below is to allocate 4KB per 8 CPUs.  Not really a
| > solution, but the fundamental problem is that /proc/interrupts
| > shouldn't use a fixed buffer size in the first place.  I suppose
| > another solution would be to use vmalloc() instead.  It all feels like
| > bandaids though.
| 
| How about switching to Al's seqfile interface ?

It's already using it, but it uses the simple/single version of it,
which doesn't automagically extend the output buffer area when
it's full, so a max size buffer has to be allocated for it
up front.

I'll look at changing it unless somebody beats me (to it :).

--
~Randy   ['tangent' is not a verb...unless you believe that
          "in English any noun can be verbed."]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-10 21:44 ` Alan Cox
  2003-04-10 22:49   ` Randy.Dunlap
@ 2003-04-10 22:53   ` Andrew Morton
  2003-04-11  0:27   ` David Mosberger
  2 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2003-04-10 22:53 UTC (permalink / raw)
  To: Alan Cox; +Cc: davidm, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> On Thu, 2003-04-10 at 23:02, David Mosberger wrote:
> > The workaround below is to allocate 4KB per 8 CPUs.  Not really a
> > solution, but the fundamental problem is that /proc/interrupts
> > shouldn't use a fixed buffer size in the first place.  I suppose
> > another solution would be to use vmalloc() instead.  It all feels like
> > bandaids though.
> 
> How about switching to Al's seqfile interface ?

It's already using it!  Not sure why it needs to buffer all the text in one
big slurp.

There are about 20 different show_interrupts()es which need updating if
someone decides to fix it for real.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-10 21:44 ` Alan Cox
  2003-04-10 22:49   ` Randy.Dunlap
  2003-04-10 22:53   ` Andrew Morton
@ 2003-04-11  0:27   ` David Mosberger
  2 siblings, 0 replies; 12+ messages in thread
From: David Mosberger @ 2003-04-11  0:27 UTC (permalink / raw)
  To: Alan Cox; +Cc: davidm, akpm, Linux Kernel Mailing List

>>>>> On 10 Apr 2003 22:44:17 +0100, Alan Cox <alan@lxorguk.ukuu.org.uk> said:

  Alan> On Thu, 2003-04-10 at 23:02, David Mosberger wrote:
  >> The workaround below is to allocate 4KB per 8 CPUs.  Not really a
  >> solution, but the fundamental problem is that /proc/interrupts
  >> shouldn't use a fixed buffer size in the first place.  I suppose
  >> another solution would be to use vmalloc() instead.  It all feels like
  >> bandaids though.

  Alan> How about switching to Al's seqfile interface ?

That's obviously the non-bandaid solution I was hinting at.  Any volunteers?

	--david

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-10 22:49   ` Randy.Dunlap
@ 2003-04-11  5:01     ` Randy.Dunlap
  2003-04-11  5:28       ` Andrew Morton
  2003-04-11  5:41       ` David Mosberger
  0 siblings, 2 replies; 12+ messages in thread
From: Randy.Dunlap @ 2003-04-11  5:01 UTC (permalink / raw)
  To: rddunlap; +Cc: alan, davidm, akpm, linux-kernel

> On 10 Apr 2003 22:44:17 +0100 Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> | On Thu, 2003-04-10 at 23:02, David Mosberger wrote:
> | > The workaround below is to allocate 4KB per 8 CPUs.  Not really a | >
> solution, but the fundamental problem is that /proc/interrupts | > shouldn't
> use a fixed buffer size in the first place.  I suppose | > another solution
> would be to use vmalloc() instead.  It all feels like | > bandaids though.
> |
> | How about switching to Al's seqfile interface ?
>
> It's already using it, but it uses the simple/single version of it, which
> doesn't automagically extend the output buffer area when
> it's full, so a max size buffer has to be allocated for it
> up front.
>
> I'll look at changing it unless somebody beats me (to it :).

OK, I've looked at it and concluded that it's not bad the way it is
(after David's patch is applied).  However, that really depends on
whether the static NR_CPUS is well-tuned or not.
If it's not tuned, then modifying the output to use the iterative
seq_file methods would make sense.
But if it's not tuned, someone is (usually) wasting lots of memory anyway.

[adding this:]
Or a better approximation instead of NR_CPUS might be to add and use
something a bit more intelligent about how many interrupts are
in use.  However, this makes my later argument not so strong.

The reason that I say it's not bad the way it is is that
it does one kmalloc() for the output buffer and then it can run
quickly and do its job [but it's limited to the original kmalloc
buffer size].   If it's modified to use seq_file iteration,
it will start out with kmalloc(PAGE_SIZE) in seq_read(), and when
that fills up, a buffer twice that size is kmalloc-ed, repeat....
with earlier buffers being freed first.

Does someone want to disagree now?  go a() instead.  It all feels like | >
bandaids though.
> |
> | How about switching to Al's seqfile interface ?
>
> It's already using it, but it uses the simple/single version of it, which
> doesn't automagically extend the output buffer area when
> it's full, so a max size buffer has to be allocated for it
> up front.
>
> I'll look at changing it unless somebody beats me (to it :).

OK, I've looked at it and concluded that it's not bad the way it is
(after David's patch is applied).  However, that really depends on
whether the static NR_CPUS is well-tuned or not.
If it's not tuned, then modifying the output to use the iterative
seq_file methods would make sense.
But if it's not tuned, someone is (usually) wasting lots of memory anyway.

[adding this:]
Or a better approximation instead of NR_CPUS might be to add and use
something a bit more intelligent about how many interrupts are
in use.  However, this makes my later argument not so strong.

The reason that I say it's not bad the way it is is that
it does one kmalloc() for the output buffer and then it can run
quickly and do its job [but it's limited to the original kmalloc
buffer size].   If it's modified to use seq_file iteration,
it will start out with kmalloc(PAGE_SIZE) in seq_read(), and when
that fills up, a buffer twice that size is kmalloc-ed, repeat....
with earlier buffers being freed first.

Does someone want to disagree now?  go ahead...i'm listening.
Maybe the reason to modify it is that NR_CPUS is not a good
approximation/hint/clue.

~Randy




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-11  5:01     ` Randy.Dunlap
@ 2003-04-11  5:28       ` Andrew Morton
  2003-04-11  5:41       ` David Mosberger
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2003-04-11  5:28 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: alan, davidm, linux-kernel

"Randy.Dunlap" <rddunlap@osdl.org> wrote:
>
> Maybe the reason to modify it is that NR_CPUS is not a good
> approximation/hint/clue.

NR_CPUS is not tooooo bad an approximation.  After all, the output size is
approximately proportional to the number of CPUs.

Multiplied by the number of interrupts :(

           CPU0       CPU1       
  0:   11982656          0    IO-APIC-edge  timer
  1:         26          0    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade


We could perhaps just convert it to num_online_cpus() and run away.  Depends
how heoric you're feeling.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-11  5:01     ` Randy.Dunlap
  2003-04-11  5:28       ` Andrew Morton
@ 2003-04-11  5:41       ` David Mosberger
  2003-04-11  5:46         ` Zwane Mwaikambo
  2003-04-11 17:29         ` Randy.Dunlap
  1 sibling, 2 replies; 12+ messages in thread
From: David Mosberger @ 2003-04-11  5:41 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: alan, davidm, akpm, linux-kernel

>>>>> On Thu, 10 Apr 2003 22:01:43 -0700 (PDT), "Randy.Dunlap" <rddunlap@osdl.org> said:

  Randy> OK, I've looked at it and concluded that it's not bad the way
  Randy> it is (after David's patch is applied).  However, that really
  Randy> depends on whether the static NR_CPUS is well-tuned or not.
  Randy> If it's not tuned, then modifying the output to use the
  Randy> iterative seq_file methods would make sense.  But if it's not
  Randy> tuned, someone is (usually) wasting lots of memory anyway.

  Randy> [snip...]

  Randy> Does someone want to disagree now?  go ahead...i'm listening.
  Randy> Maybe the reason to modify it is that NR_CPUS is not a good
  Randy> approximation/hint/clue.

Wouldn't the kmalloc() likely fail in fragmented conditions?  Also,
I'm wondering whether there is such a thing as "well-tuned" in this
case.  For example, in the extreme case of the SGI SN2 machine, each
CPU could in theory have up to 256 interrupt sources (OK, perhaps it's
only 256 interrupts per 2 CPUs, but it's still a lot of interrupts to
go around ;-).  OTOH, most ia64 machines out there have less than 256
interrupt per _system_.  That's a large variation.

	--david

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-11  5:41       ` David Mosberger
@ 2003-04-11  5:46         ` Zwane Mwaikambo
  2003-04-11 17:29         ` Randy.Dunlap
  1 sibling, 0 replies; 12+ messages in thread
From: Zwane Mwaikambo @ 2003-04-11  5:46 UTC (permalink / raw)
  To: davidm; +Cc: Randy.Dunlap, alan, akpm, linux-kernel

On Thu, 10 Apr 2003, David Mosberger wrote:

> Wouldn't the kmalloc() likely fail in fragmented conditions?  Also,
> I'm wondering whether there is such a thing as "well-tuned" in this
> case.  For example, in the extreme case of the SGI SN2 machine, each
> CPU could in theory have up to 256 interrupt sources (OK, perhaps it's
> only 256 interrupts per 2 CPUs, but it's still a lot of interrupts to
> go around ;-).  OTOH, most ia64 machines out there have less than 256
> interrupt per _system_.  That's a large variation.

I think NR_CPUS belongs in there somewhere, this is what triggered the 
original change;

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10       CPU11       CPU12       CPU13       CPU14       CPU15       CPU16       CPU17       CPU18       CPU19       CPU20       CPU21       CPU22       CPU23       CPU24       CPU25       CPU26       CPU27       CPU28       CPU29       CPU30       CPU31       
  0:    3652909          0          0          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  local-APIC-edge  timer
  2:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          XT-PIC  cascade
  4:       2274       2134       1993       2147          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0    IO-APIC-edge  serial
  8:          0          0          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  rtc
 19:       1082       1068        527       1118          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth0
 37:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth7
 38:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth6
 39:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth5
 40:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth4
 41:       3558       2894       2648       3244          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  qlogicisp
 89:          0          0          0          0          4          4          4          3          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  qlogicisp
115:          0          0          0          0          0          0          0          0        361        366        366        347          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth1
163:          0          0          0          0          0          0          0          0          0          0          0          0        383        385        385        384          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth2
211:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0        375        376        374        375          0          0          0          0          0          0          0          0          0          0          0          0   IO-APIC-level  eth3
NMI:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0 
LOC:    3604478    3603640    3603819    3603799    3603663    3603641    3603627    3603601    3603603    3603583    3603563    3603542    3603588    3603567    3603546    3603526    3603416    3603394    3603374    3603352    3603601    3603578    3603557    3603536    3603707    3603686    3603665    3603644    3603430    3603410    3603388    3603368 
ERR:          0
MIS:          0

-- 
function.linuxpower.ca

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-11  5:41       ` David Mosberger
  2003-04-11  5:46         ` Zwane Mwaikambo
@ 2003-04-11 17:29         ` Randy.Dunlap
  2003-04-11 18:32           ` David Mosberger
  1 sibling, 1 reply; 12+ messages in thread
From: Randy.Dunlap @ 2003-04-11 17:29 UTC (permalink / raw)
  To: davidm; +Cc: alan, akpm, linux-kernel

On Thu, 10 Apr 2003 22:41:23 -0700 David Mosberger <davidm@napali.hpl.hp.com> wrote:

| >>>>> On Thu, 10 Apr 2003 22:01:43 -0700 (PDT), "Randy.Dunlap" <rddunlap@osdl.org> said:
| 
|   Randy> OK, I've looked at it and concluded that it's not bad the way
|   Randy> it is (after David's patch is applied).  However, that really
|   Randy> depends on whether the static NR_CPUS is well-tuned or not.
|   Randy> If it's not tuned, then modifying the output to use the
|   Randy> iterative seq_file methods would make sense.  But if it's not
|   Randy> tuned, someone is (usually) wasting lots of memory anyway.
| 
|   Randy> [snip...]
| 
|   Randy> Does someone want to disagree now?  go ahead...i'm listening.
|   Randy> Maybe the reason to modify it is that NR_CPUS is not a good
|   Randy> approximation/hint/clue.
| 
| Wouldn't the kmalloc() likely fail in fragmented conditions?  Also,
| I'm wondering whether there is such a thing as "well-tuned" in this
| case.  For example, in the extreme case of the SGI SN2 machine, each
| CPU could in theory have up to 256 interrupt sources (OK, perhaps it's
| only 256 interrupts per 2 CPUs, but it's still a lot of interrupts to
| go around ;-).  OTOH, most ia64 machines out there have less than 256
| interrupt per _system_.  That's a large variation.

For kmalloc() failing, do you mean the first (large) kmalloc() or the
repeated ones that grow in size each time?
I would think that just doing one big kmalloc up front is desireable
to repeated ones...if the first one does a decent estimate of its max
size.  I just don't know how likely that is.

--
~Randy   ['tangent' is not a verb...unless you believe that
          "in English any noun can be verbed."]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proc_misc.c bug
  2003-04-11 17:29         ` Randy.Dunlap
@ 2003-04-11 18:32           ` David Mosberger
  0 siblings, 0 replies; 12+ messages in thread
From: David Mosberger @ 2003-04-11 18:32 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: davidm, alan, akpm, linux-kernel

>>>>> On Fri, 11 Apr 2003 10:29:46 -0700, "Randy.Dunlap" <rddunlap@osdl.org> said:

  Randy> For kmalloc() failing, do you mean the first (large)
  Randy> kmalloc() or the repeated ones that grow in size each time?

I was concerned about the kmalloc() in interrupts_open().

	--david

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2003-04-11 18:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-10 22:02 proc_misc.c bug David Mosberger
2003-04-10 21:44 ` Alan Cox
2003-04-10 22:49   ` Randy.Dunlap
2003-04-11  5:01     ` Randy.Dunlap
2003-04-11  5:28       ` Andrew Morton
2003-04-11  5:41       ` David Mosberger
2003-04-11  5:46         ` Zwane Mwaikambo
2003-04-11 17:29         ` Randy.Dunlap
2003-04-11 18:32           ` David Mosberger
2003-04-10 22:53   ` Andrew Morton
2003-04-11  0:27   ` David Mosberger
2003-04-10 22:18 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).