linux-kernel.vger.kernel.org archive mirror
* Bio & Biovec-1 increasing cache size, never freed during disk IO
@ 2006-02-25 20:45 matteo brancaleoni
  2006-02-27 20:38 ` matteo brancaleoni
  0 siblings, 1 reply; 10+ messages in thread
From: matteo brancaleoni @ 2006-02-25 20:45 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4666 bytes --]

Hi.

I'm experiencing a problem with 2.6.15.4 / 2.6.16-rc4, noticed during
high disk IO (copying a lot of data between 2 machines): system
memory fills up until all of swap is used and the system must
be rebooted (or is unusable). Stopping the process does not free the
memory, and this happens not only when copying via the network, but
also with a simple
cp -a dirwithmanybigfiles testdir.

The system is running on athlon64:
Linux morgor 2.6.16-rc4 #2 SMP Sat Feb 25 19:55:36 CET 2006 x86_64
x86_64 x86_64 GNU/Linux
Same issue with 2.6.15.4

The box has a single soft-raid1 device made of 2 SATA disks on a Promise
controller.
Attached are the slabinfo and dmesg dumps.
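
The growth reported here can be tracked by reading the "bio" and
"biovec-1" rows of /proc/slabinfo before and after a copy. The following
is only a minimal user-space sketch, assuming the 2.x slabinfo layout
(name, active_objs, num_objs, objsize, ...); struct slab_row and both
function names are hypothetical helpers, not part of any kernel
interface.

```c
#include <stdio.h>
#include <string.h>

/* One parsed row of /proc/slabinfo (assumed format version 2.x). */
struct slab_row {
    char name[64];
    unsigned long active_objs;
    unsigned long num_objs;
    unsigned long objsize;
};

/*
 * Parse a single slabinfo data line of the form
 *   name <active_objs> <num_objs> <objsize> ...
 * Returns 1 on success, 0 for header/comment or malformed lines.
 */
int parse_slab_line(const char *line, struct slab_row *row)
{
    if (line[0] == '#' || strncmp(line, "slabinfo", 8) == 0)
        return 0;
    return sscanf(line, "%63s %lu %lu %lu", row->name,
                  &row->active_objs, &row->num_objs, &row->objsize) == 4;
}

/* Bytes held by live objects in this cache (ignores slab overhead). */
unsigned long slab_active_bytes(const struct slab_row *row)
{
    return row->active_objs * row->objsize;
}
```

A caller would read /proc/slabinfo line by line with fgets(), keep the
rows whose name is "bio" or "biovec-1", and compare active_objs across
two samples; a count that keeps rising after the copy has stopped
matches the symptom described above.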

Some system information:
* This is the modules list:
Linux morgor 2.6.16-rc4 #2 SMP Sat Feb 25 19:55:36 CET 2006 x86_64
x86_64 x86_64 GNU/Linux
[root@morgor ~]# lsmod
Module                  Size  Used by
ipv6                  399008  18
ppdev                  42888  0
autofs4                55560  1
nfs                   251224  2
lockd                  97424  2 nfs
nfs_acl                37120  1 nfs
sunrpc                191944  4 nfs,lockd,nfs_acl
rfcomm                105376  0
l2cap                  92160  5 rfcomm
bluetooth             117252  4 rfcomm,l2cap
dm_mirror              54912  0
dm_mod                 90192  1 dm_mirror
video                  50952  0
button                 41120  0
battery                43912  0
ac                     38920  0
lp                     48208  0
parport_pc             63144  1
parport                74636  3 ppdev,lp,parport_pc
nvram                  42888  0
ohci1394               67272  0
ehci_hcd               65160  0
sg                     69672  0
ieee1394              392216  1 ohci1394
uhci_hcd               65952  0
snd_via82xx            63784  0
gameport               50832  1 snd_via82xx
snd_ac97_codec        136536  1 snd_via82xx
snd_ac97_bus           36224  1 snd_ac97_codec
snd_seq_dummy          37508  0
snd_seq_oss            66688  0
snd_seq_midi_event     41472  1 snd_seq_oss
snd_seq                90144  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_pcm_oss            85632  0
snd_mixer_oss          51328  1 snd_pcm_oss
snd_pcm               126728  3 snd_via82xx,snd_ac97_codec,snd_pcm_oss
snd_timer              59656  2 snd_seq,snd_pcm
snd_page_alloc         44816  2 snd_via82xx,snd_pcm
snd_mpu401_uart        42112  1 snd_via82xx
snd_rawmidi            61696  1 snd_mpu401_uart
snd_seq_device         43280  4 snd_seq_dummy,snd_seq_oss,snd_seq,snd_rawmidi
i2c_viapro             43160  0
snd                    97320  11 snd_via82xx,snd_ac97_codec,snd_seq_oss,snd_seq,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_mpu401_uart,snd_rawmidi,snd_seq_device
i2c_core               57728  1 i2c_viapro
skge                   72720  0
soundcore              44576  1 snd
raid1                  54912  2
ext3                  163984  2
jbd                    93480  1 ext3
sata_promise           45700  6
sata_via               42500  0
libata                 93592  2 sata_promise,sata_via
sd_mod                 50688  8
scsi_mod              180688  4 sg,sata_promise,libata,sd_mod

* fstab
[root@morgor ~]# cat /etc/fstab
/dev/md1                /                       ext3    defaults        1 1
/dev/md0                /boot                   ext3    defaults        1 2
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
sysfs                   /sys                    sysfs   defaults        0 0
LABEL=SWAP-sda2         swap                    swap    defaults        0 0
LABEL=SWAP-sdb2         swap                    swap    defaults        0 0

* cpuinfo
[root@morgor ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 47
model name      : AMD Athlon(tm) 64 Processor 3200+
stepping        : 0
cpu MHz         : 1800.000
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext
fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 3613.48
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

Thanks a lot,

Matteo Brancaleoni

[-- Attachment #2: slabinfo.txt.gz --]
[-- Type: application/x-gzip, Size: 4470 bytes --]

[-- Attachment #3: dmesg.txt.gz --]
[-- Type: application/x-gzip, Size: 5716 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-02-25 20:45 Bio & Biovec-1 increasing cache size, never freed during disk IO matteo brancaleoni
@ 2006-02-27 20:38 ` matteo brancaleoni
  2006-02-27 22:46   ` Neil Brown
  0 siblings, 1 reply; 10+ messages in thread
From: matteo brancaleoni @ 2006-02-27 20:38 UTC (permalink / raw)
  To: linux-kernel

FYI, this problem does not happen at all on 2.6.14.7...

Does anyone have any idea about it?

Greetings, Matteo.


* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-02-27 20:38 ` matteo brancaleoni
@ 2006-02-27 22:46   ` Neil Brown
  2006-02-28 15:05     ` matteo brancaleoni
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Brown @ 2006-02-27 22:46 UTC (permalink / raw)
  To: matteo brancaleoni; +Cc: linux-kernel

On Monday February 27, mbrancaleoni@gmail.com wrote:
> FYI, this problem on 2.6.14.7 does not happens at all...
> 
> anyone has any idea about?

Apparently not.  Fingers are pointing at md/raid1 - with reasonable
cause - but I cannot find any error in that code, nor can I reproduce
the problem.

Are you able to narrow down the difference between working and
not-working?

Possibly use 'git bisect' (This is the best option, but not having
used it myself yet, I don't feel comfortable recommending it).

Possibly just try the various 2.6.15-rcX kernels and find the first
one that breaks.  
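
Both suggestions come down to the same search: given an ordered list
of kernels with a known-good start and a known-bad end, find the first
bad one with as few builds as possible. A rough sketch of that search
follows, assuming every kernel after the first bad one is also bad;
first_bad() and the is_bad callback are hypothetical names, and
bad_from_5() merely stands in for a real build-boot-test cycle.

```c
#include <stddef.h>

/*
 * Return the index of the first "bad" entry in an ordered history,
 * assuming entry `good` is known good, entry `bad` is known bad, and
 * every entry after the first bad one is also bad.
 * `is_bad` is a hypothetical callback: build, boot and test entry idx.
 */
size_t first_bad(size_t good, size_t bad, int (*is_bad)(size_t idx))
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid))
            bad = mid;      /* regression is at mid or earlier */
        else
            good = mid;     /* regression is after mid */
    }
    return bad;
}

/* Stand-in test step: pretends the regression arrived at entry 5. */
int bad_from_5(size_t idx)
{
    return idx >= 5;
}
```

Each call to is_bad() costs one kernel build and one test run, so
halving the range needs about log2(N) builds, where walking the
releases one by one can need up to N.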

That would be a great help.

NeilBrown


* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-02-27 22:46   ` Neil Brown
@ 2006-02-28 15:05     ` matteo brancaleoni
  2006-02-28 23:35       ` Neil Brown
  0 siblings, 1 reply; 10+ messages in thread
From: matteo brancaleoni @ 2006-02-28 15:05 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel

Hi Neil,

It seems that the patch that leads to the error is the one signed off by you:
commit 3795bb0fc52fe2af2749f3ad2185cb9c90871ef8
Author: NeilBrown <neilb@suse.de>
Date:   Mon Dec 12 02:39:16 2005 -0800

    [PATCH] md: fix a use-after-free bug in raid1

    Who would submit code with a FIXME like that in it !!!!

    Signed-off-by: Neil Brown <neilb@suse.de>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

I'm by no means a kernel expert, but before the patch the bio was
released every time, after setting r1_bio->bios[mirror] to NULL; with
the patch it is released only if r1_bio->bios[mirror] is NULL ...
perhaps sometimes this value is not NULL and the bio must be released
anyway?

Greetings, Matteo.


* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-02-28 15:05     ` matteo brancaleoni
@ 2006-02-28 23:35       ` Neil Brown
  2006-03-01 10:46         ` matteo brancaleoni
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Brown @ 2006-02-28 23:35 UTC (permalink / raw)
  To: matteo brancaleoni; +Cc: linux-kernel

On Tuesday February 28, mbrancaleoni@gmail.com wrote:
> Hi Neil,
> 
> seems that the patch that leads to the error is the one signed up by you:
> commit 3795bb0fc52fe2af2749f3ad2185cb9c90871ef8
> Author: NeilBrown <neilb@suse.de>
> Date:   Mon Dec 12 02:39:16 2005 -0800
> 
>     [PATCH] md: fix a use-after-free bug in raid1
> 
>     Who would submit code with a FIXME like that in it !!!!
> 
>     Signed-off-by: Neil Brown <neilb@suse.de>
>     Signed-off-by: Andrew Morton <akpm@osdl.org>
>     Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Thanks for finding this.

There are two bugs here.
One is that if BIO_RW_BARRIER is rejected by one drive but not the
other, then we forget to release a bio that we should have released.
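
The shape of this first bug can be shown with a small user-space
model. This is only a sketch under simplifying assumptions: struct
toy_bio, toy_put() and both handlers are hypothetical stand-ins, not
the raid1 code, and the retry condition is reduced to a single flag.

```c
#include <stddef.h>

/* Toy stand-in for a reference-counted bio; not the kernel structure. */
struct toy_bio { int refs; };

static void toy_put(struct toy_bio *b) { b->refs--; }

/*
 * Completion handler with an early return, shaped like the code before
 * the patch: the retry path leaves without reaching the release.
 */
int end_write_leaky(struct toy_bio *b, int slot_cleared, int need_retry)
{
    if (need_retry)
        return 0;               /* skips the toy_put() below */
    if (slot_cleared)
        toy_put(b);
    return 0;
}

/* Same handler with the retry path routed through a common exit label. */
int end_write_fixed(struct toy_bio *b, int slot_cleared, int need_retry)
{
    if (need_retry)
        goto out;
    /* ... work done only when no retry is needed ... */
out:
    if (slot_cleared)
        toy_put(b);
    return 0;
}
```

In the leaky variant a bio whose slot was already cleared keeps its
reference whenever the retry path is taken, so each such completion
leaves one object behind in the cache.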

The other is that the test for "should we do barrier IO" was wrong so
that even though one device doesn't support it, raid1 keeps trying it
(but only on read-ahead requests....)
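
The remark about read-ahead requests fits a common mistake: testing a
flag word against a bit number instead of a bit mask. The sketch below
assumes the flag constants are bit positions with read-ahead at bit 1
and barrier at bit 2; the names and this layout are assumptions made
for illustration.

```c
/* Hypothetical flag layout: each constant is a bit *position*. */
enum { RW_WRITE = 0, RW_AHEAD = 1, RW_BARRIER = 2 };

/* Buggy test: ANDs with the bit number (2), which is the mask for bit 1. */
int is_barrier_buggy(unsigned long rw)
{
    return (rw & RW_BARRIER) != 0;
}

/* Correct test: shift to build the mask for bit 2. */
int is_barrier(unsigned long rw)
{
    return (rw & (1UL << RW_BARRIER)) != 0;
}
```

Under that assumed layout the buggy test fires for read-ahead requests
and never for real barrier requests, which is the behaviour described
above.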

It would seem that the devices in your array are not all on the same
controller.  Is that correct?  There isn't a problem with that, but I
just want to check my understanding of what is happening.

Could you try this patch please and see if it fixes the problem?

Thanks again,
NeilBrown

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2006-02-27 11:52:18.000000000 +1100
+++ ./drivers/md/raid1.c	2006-03-01 10:30:43.000000000 +1100
@@ -375,7 +375,7 @@ static int raid1_end_write_request(struc
 			/* Don't dec_pending yet, we want to hold
 			 * the reference over the retry
 			 */
-			return 0;
+			goto out;
 		}
 		if (test_bit(R1BIO_BehindIO, &r1_bio->state)) {
 			/* free extra copy of the data pages */
@@ -392,10 +392,11 @@ static int raid1_end_write_request(struc
 		raid_end_bio_io(r1_bio);
 	}
 
+	rdev_dec_pending(conf->mirrors[mirror].rdev, conf->mddev);
+ out:
 	if (r1_bio->bios[mirror]==NULL)
 		bio_put(bio);
 
-	rdev_dec_pending(conf->mirrors[mirror].rdev, conf->mddev);
 	return 0;
 }
 
@@ -857,7 +858,7 @@ static int make_request(request_queue_t 
 	atomic_set(&r1_bio->remaining, 0);
 	atomic_set(&r1_bio->behind_remaining, 0);
 
-	do_barriers = bio->bi_rw & BIO_RW_BARRIER;
+	do_barriers = bio_barrier(bio);
 	if (do_barriers)
 		set_bit(R1BIO_Barrier, &r1_bio->state);
 


* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-02-28 23:35       ` Neil Brown
@ 2006-03-01 10:46         ` matteo brancaleoni
  2006-03-02  6:12           ` Neil Brown
  0 siblings, 1 reply; 10+ messages in thread
From: matteo brancaleoni @ 2006-03-01 10:46 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel

Hi Neil.

Unfortunately the patch does nothing; the problem persists.
Tested with 2.6.16-rc5.
(I've double-checked that the patch was applied correctly.)

Can I do anything to be of some more help?

Greetings,

Matteo.


* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-03-01 10:46         ` matteo brancaleoni
@ 2006-03-02  6:12           ` Neil Brown
  2006-03-02  8:19             ` matteo brancaleoni
  2006-03-08 21:57             ` Adrian Bunk
  0 siblings, 2 replies; 10+ messages in thread
From: Neil Brown @ 2006-03-02  6:12 UTC (permalink / raw)
  To: matteo brancaleoni; +Cc: linux-kernel

On Wednesday March 1, mbrancaleoni@gmail.com wrote:
> Hi Neil.
> 
> unfortunately the patch does nothing, the problem persists.
> Tested with 2.6.16-rc5.
> (I've double checked if the patch was applied correctly)
> 
> Can I do anything to be of some more help?

Yes, try another patch. :-)  and tell me if you have CONFIG_DEBUG_SLAB
set... there was another use-after-free bug which CONFIG_DEBUG_SLAB
would have made worse.

This patch should fix it all up.
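
One plausible reading of why slab debugging matters here: if the
parent object has been freed and poisoned before the handler re-reads
one of its slots, the slot no longer compares equal to NULL and the
release is skipped. The model below is a sketch of that idea only. The
0x6b poison byte, struct toy_r1bio and the function names are
assumptions, and the "free" merely overwrites memory that stays valid
so that the example remains well defined.

```c
#include <stdint.h>
#include <string.h>

#define POISON 0x6b   /* byte pattern assumed for freed-object poisoning */

/* Toy parent object; a slot value of 0 means "this mirror is done". */
struct toy_r1bio { uintptr_t slot[2]; };

/* Stand-in for freeing the parent with slab debugging on: the memory
 * is overwritten but, in this model, stays readable. */
void toy_free_poison(struct toy_r1bio *r)
{
    memset(r, POISON, sizeof(*r));
}

/* Decide after the free by re-reading the parent: the poisoned slot no
 * longer reads as 0, so the release is skipped. Returns 1 if released. */
int release_by_rereading(struct toy_r1bio *r, int mirror)
{
    r->slot[mirror] = 0;
    toy_free_poison(r);
    return r->slot[mirror] == 0;
}

/* Record the decision in a local before the parent can go away. */
int release_by_local(struct toy_r1bio *r, int mirror)
{
    int to_put;

    r->slot[mirror] = 0;
    to_put = 1;
    toy_free_poison(r);
    return to_put;
}
```

Keeping the decision in a local variable means the handler never has
to touch the parent again once it may have been freed.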

Thanks again,
NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2006-02-27 11:52:18.000000000 +1100
+++ ./drivers/md/raid1.c	2006-03-01 10:44:49.000000000 +1100
@@ -306,6 +306,7 @@ static int raid1_end_write_request(struc
 	r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
 	int mirror, behind = test_bit(R1BIO_BehindIO, &r1_bio->state);
 	conf_t *conf = mddev_to_conf(r1_bio->mddev);
+	struct bio *to_put = NULL;
 
 	if (bio->bi_size)
 		return 1;
@@ -323,6 +324,7 @@ static int raid1_end_write_request(struc
 		 * this branch is our 'one mirror IO has finished' event handler:
 		 */
 		r1_bio->bios[mirror] = NULL;
+		to_put = bio;
 		if (!uptodate) {
 			md_error(r1_bio->mddev, conf->mirrors[mirror].rdev);
 			/* an I/O failed, we can't clear the bitmap */
@@ -375,7 +377,7 @@ static int raid1_end_write_request(struc
 			/* Don't dec_pending yet, we want to hold
 			 * the reference over the retry
 			 */
-			return 0;
+			goto out;
 		}
 		if (test_bit(R1BIO_BehindIO, &r1_bio->state)) {
 			/* free extra copy of the data pages */
@@ -392,10 +394,11 @@ static int raid1_end_write_request(struc
 		raid_end_bio_io(r1_bio);
 	}
 
-	if (r1_bio->bios[mirror]==NULL)
-		bio_put(bio);
-
 	rdev_dec_pending(conf->mirrors[mirror].rdev, conf->mddev);
+ out:
+	if (to_put)
+		bio_put(to_put);
+
 	return 0;
 }
 
@@ -857,7 +860,7 @@ static int make_request(request_queue_t 
 	atomic_set(&r1_bio->remaining, 0);
 	atomic_set(&r1_bio->behind_remaining, 0);
 
-	do_barriers = bio->bi_rw & BIO_RW_BARRIER;
+	do_barriers = bio_barrier(bio);
 	if (do_barriers)
 		set_bit(R1BIO_Barrier, &r1_bio->state);
 

^ permalink raw reply	[flat|nested] 10+ messages in thread
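
Editorial sketch of the pattern behind the patch above. The removed lines tested `r1_bio->bios[mirror]==NULL` after `raid_end_bio_io(r1_bio)` had already run; if that call can release `r1_bio` (which is how the "use-after-free" remark reads), the test touches freed memory. The patch instead remembers the bio in a local `to_put` before the parent can go away. The userspace code below is only an illustration under that assumption: `struct child`, `struct parent`, `child_put`, `end_write` and `demo` are hypothetical stand-ins, not kernel types or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct bio and r1bio_t; not the kernel types. */
struct child {
	int refs;
};

struct parent {
	struct child *slots[2];	/* one in-flight child per mirror */
	int remaining;		/* children still in flight */
};

static int children_released;
static int parents_released;

/* Drop one reference; free the child when the last one goes away. */
static void child_put(struct child *c)
{
	if (--c->refs == 0) {
		free(c);
		children_released++;
	}
}

/* Completion handler shaped like the patched function. */
static void end_write(struct parent *p, int idx)
{
	/* Capture the child while the parent is still known to be valid. */
	struct child *to_put = p->slots[idx];

	p->slots[idx] = NULL;
	if (--p->remaining == 0) {
		/* Last completion: the parent is released here. */
		free(p);
		parents_released++;
	}

	/* p must not be dereferenced from here on; use only the local copy. */
	if (to_put)
		child_put(to_put);
}

/* Submit two children, complete both, and report whether all was freed. */
static int demo(void)
{
	struct parent *p = malloc(sizeof(*p));
	int i;

	assert(p != NULL);
	children_released = 0;
	parents_released = 0;
	p->remaining = 0;
	for (i = 0; i < 2; i++) {
		p->slots[i] = malloc(sizeof(struct child));
		assert(p->slots[i] != NULL);
		p->slots[i]->refs = 1;
		p->remaining++;
	}

	end_write(p, 0);
	end_write(p, 1);	/* frees p, then releases the second child */

	return (children_released == 2 && parents_released == 1) ? 0 : -1;
}
```

Calling `demo()` from a `main()` exercises both completions. The point of the design is that the decision to release the child is made from a local variable, so it does not depend on whether the parent still exists.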

* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-03-02  6:12           ` Neil Brown
@ 2006-03-02  8:19             ` matteo brancaleoni
  2006-03-02  8:58               ` matteo brancaleoni
  2006-03-08 21:57             ` Adrian Bunk
  1 sibling, 1 reply; 10+ messages in thread
From: matteo brancaleoni @ 2006-03-02  8:19 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel

Hi.

No, CONFIG_DEBUG_SLAB was not set.

Thanks for the new patch, compiling now.
I'll let you know asap.

Thanks a lot for the help,

Matteo.

On 3/2/06, Neil Brown <neilb@suse.de> wrote:
> On Wednesday March 1, mbrancaleoni@gmail.com wrote:
> > Hi Neil.
> >
> > unfortunately the patch does nothing, the problem persists.
> > Tested with 2.6.16-rc5.
> > (I've double checked if the patch was applied correctly)
> >
> > Can I do anything to be of some more help?
>
> Yes, try another patch. :-)  and tell me if you have CONFIG_DEBUG_SLAB
> set... there was another use-after-free bug which CONFIG_DEBUG_SLAB
> would have made worse.
>
> This patch should fix it all up.
>
> Thanks again,
> NeilBrown

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-03-02  8:19             ` matteo brancaleoni
@ 2006-03-02  8:58               ` matteo brancaleoni
  0 siblings, 0 replies; 10+ messages in thread
From: matteo brancaleoni @ 2006-03-02  8:58 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel

Hi Neil,

I can confirm that the latest patch works nicely on 2.6.16-rc5!

Thanks a lot for the support!

Greetings, Matteo.

P.S. If you need more details, tests, or info, I'm here to help.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Bio & Biovec-1 increasing cache size, never freed during disk IO
  2006-03-02  6:12           ` Neil Brown
  2006-03-02  8:19             ` matteo brancaleoni
@ 2006-03-08 21:57             ` Adrian Bunk
  1 sibling, 0 replies; 10+ messages in thread
From: Adrian Bunk @ 2006-03-08 21:57 UTC (permalink / raw)
  To: Neil Brown, Andrew Morton; +Cc: matteo brancaleoni, linux-kernel

On Thu, Mar 02, 2006 at 05:12:49PM +1100, Neil Brown wrote:
> On Wednesday March 1, mbrancaleoni@gmail.com wrote:
> > Hi Neil.
> > 
> > unfortunately the patch does nothing, the problem persists.
> > Tested with 2.6.16-rc5.
> > (I've double checked if the patch was applied correctly)
> > 
> > Can I do anything to be of some more help?
> 
> Yes, try another patch. :-)  and tell me if you have CONFIG_DEBUG_SLAB
> set... there was another use-after-free bug which CONFIG_DEBUG_SLAB
> would have made worse.
> 
> This patch should fix it all up.


This patch seems to be 2.6.16 stuff?



cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-03-08 21:57 UTC | newest]

Thread overview: 10+ messages
2006-02-25 20:45 Bio & Biovec-1 increasing cache size, never freed during disk IO matteo brancaleoni
2006-02-27 20:38 ` matteo brancaleoni
2006-02-27 22:46   ` Neil Brown
2006-02-28 15:05     ` matteo brancaleoni
2006-02-28 23:35       ` Neil Brown
2006-03-01 10:46         ` matteo brancaleoni
2006-03-02  6:12           ` Neil Brown
2006-03-02  8:19             ` matteo brancaleoni
2006-03-02  8:58               ` matteo brancaleoni
2006-03-08 21:57             ` Adrian Bunk
