* PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
From: Mike Kazantsev @ 2012-10-19 14:50 UTC
  To: linux-mm; +Cc: paul, netdev


[-- Attachment #1.1: Type: text/plain, Size: 2555 bytes --]

Good day,

There seems to be a large slab memory leak in the standard (kernel.org)
kernel with the specific configuration and workload I have here.
From what I can tell at the moment, it appears to be a leak in the
IPSec-related xfrm code.


It has been clearly noticeable on several different physical machines
with the same kernel configuration but different workloads since I
upgraded to kernel 3.5.0.

Graph of total slab usage (+ total available RAM) on these machines:

  http://i.imgur.com/IyPqA.png

The leak can clearly be seen over time, and it has caused near-OOM
conditions several times now.
Sharp drops in memory usage indicate reboots, which, I'm afraid, have
to be done at regular intervals under these conditions.

Initially I thought that it was triggered by heavy filesystem load, but
today I finally got around to rebooting one of the machines with
slub_debug=U, and that doesn't seem to be the case.

In the past, slabtop showed "kmalloc-64" as the 99% offender, but
with recent kernels (3.6.1) it has changed to "secpath_cache".
alloc_calls in /sys/kernel/slab/secpath_cache/ lists only the following:

  2779138 secpath_dup+0x1b/0x5a age=400/169538/326767 pid=0-1543 cpus=0-3

And free_calls lists these two lines:

  2543886 <not-available> age=4295223985 pid=0 cpus=0
  235252 __secpath_destroy+0x3e/0x43 age=1651/174629/327902 pid=0-1519 cpus=0-3
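
A minimal Python sketch for cross-checking the two listings above. It
parses the alloc_calls and free_calls lines quoted in this mail and
compares the counts. The helper name parse_calls and the regular
expression are illustrative assumptions, not part of any kernel tooling.
The reading that "<not-available>" marks objects with no recorded free
call is likewise an assumption about the SLUB tracking output and is
not something the listings themselves state.

```python
import re

# One line of alloc_calls / free_calls output, e.g.
#   2779138 secpath_dup+0x1b/0x5a age=400/169538/326767 pid=0-1543 cpus=0-3
# (hypothetical pattern, written only to fit the lines quoted in this mail)
LINE_RE = re.compile(r"^\s*(?P<count>\d+)\s+(?P<site>\S+)\s+(?P<rest>.*)$")


def parse_calls(text):
    """Return a list of (count, call_site) tuples, one per matching line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((int(match.group("count")), match.group("site")))
    return entries


# Lines copied verbatim from the report.
ALLOC_CALLS = """
  2779138 secpath_dup+0x1b/0x5a age=400/169538/326767 pid=0-1543 cpus=0-3
"""
FREE_CALLS = """
  2543886 <not-available> age=4295223985 pid=0 cpus=0
  235252 __secpath_destroy+0x3e/0x43 age=1651/174629/327902 pid=0-1519 cpus=0-3
"""

allocs = parse_calls(ALLOC_CALLS)
frees = parse_calls(FREE_CALLS)

total_tracked = sum(count for count, _ in allocs)
# Assumption: "<not-available>" means no free call was recorded for the object.
no_free_record = sum(c for c, site in frees if site == "<not-available>")
with_free_record = sum(c for c, site in frees if site != "<not-available>")

print("objects listed in alloc_calls:", total_tracked)
print("free_calls without a call site:", no_free_record)
print("free_calls with a call site:", with_free_record)
print("free_calls total matches alloc_calls:",
      no_free_record + with_free_record == total_tracked)
```

With the numbers above, the two free_calls counts add up exactly to the
alloc_calls count (2543886 + 235252 = 2779138), so both files appear to
describe the same set of objects.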

The contents of all paths available in /sys/kernel/slab/secpath_cache/
and the "slabtop -o" output should be attached to this mail.
These were taken after ~10-20 min of heavy network + fs i/o load (rsync
from a different machine over the network).

"secpath_dup" seems to be an IPSec-related call, and all machines in
question communicate over IPSec almost exclusively, all the time
(openswan-2.6.37 userspace at the moment).

As noted, the problem is highly reproducible - all I have to do is
run rsync or something similar between these nodes for a few minutes.
All machines in question run an x86_64 3.6.1 kernel now, but I'll
probably update them to 3.6.2 in a moment.
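
A rough Python sketch of how slab growth could be measured while
reproducing the problem, assuming the total is taken from the "Slab:"
line of /proc/meminfo. The function names (slab_kb, growth_kb_per_min,
read_slab_kb) and the sample values are made up for illustration and
are not measurements from these machines.

```python
import re

# /proc/meminfo reports total slab memory on a line such as "Slab:  123456 kB".
SLAB_RE = re.compile(r"^Slab:\s+(\d+)\s+kB$", re.MULTILINE)


def slab_kb(meminfo_text):
    """Extract the total slab size in kB from /proc/meminfo contents."""
    match = SLAB_RE.search(meminfo_text)
    if match is None:
        raise ValueError("no 'Slab:' line found")
    return int(match.group(1))


def growth_kb_per_min(samples):
    """Average growth between the first and last (seconds, kB) samples."""
    (t0, kb0), (t1, kb1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (kb1 - kb0) * 60.0 / (t1 - t0)


def read_slab_kb(path="/proc/meminfo"):
    """Read the current slab total from the running system (Linux only)."""
    with open(path) as f:
        return slab_kb(f.read())


# Made-up sample values, for illustration only.
EXAMPLE = "MemTotal:  1000000 kB\nSlab:        20480 kB\nSReclaimable: 4096 kB\n"
print(slab_kb(EXAMPLE))
print(growth_kb_per_min([(0, 20480), (300, 30720)]))
```

On a live system, read_slab_kb() could be called once before and once
after the rsync run, with the two (time, value) pairs passed to
growth_kb_per_min().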


Keywords:
linux kernel networking mm slub slab secpath_dup secpath_cache xfrm
ipsec 3.5 3.6 memory leak oom slabtop x86 x86_64 amd64


/proc/version: 
  Linux version 3.6.1-fg.mf_master (root@anathema) (gcc version 4.6.3
  (Exherbo gcc-4.6.3-r1) ) #1 SMP Sat Oct 13 04:21:08 YEKT 2012

Other information about the system (as per REPORTING-BUGS) is
attached, including the slabtop output and the contents of the
slub_debug-related /sys paths.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #1.2: cpuinfo.txt --]
[-- Type: text/plain, Size: 3052 bytes --]

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 28
model name	: Intel(R) Atom(TM) CPU D510   @ 1.66GHz
stepping	: 10
microcode	: 0x107
cpu MHz		: 1666.683
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dtherm
bogomips	: 3334.25
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 28
model name	: Intel(R) Atom(TM) CPU D510   @ 1.66GHz
stepping	: 10
microcode	: 0x107
cpu MHz		: 1666.683
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dtherm
bogomips	: 3334.25
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 28
model name	: Intel(R) Atom(TM) CPU D510   @ 1.66GHz
stepping	: 10
microcode	: 0x107
cpu MHz		: 1666.683
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dtherm
bogomips	: 3334.25
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 28
model name	: Intel(R) Atom(TM) CPU D510   @ 1.66GHz
stepping	: 10
microcode	: 0x107
cpu MHz		: 1666.683
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dtherm
bogomips	: 3334.25
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:


[-- Attachment #1.3: iomem.txt --]
[-- Type: text/plain, Size: 2047 bytes --]

00000000-0000ffff : reserved
00010000-0008efff : System RAM
0008f000-0008ffff : reserved
00090000-0009ebff : System RAM
0009ec00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce000-000ce7ff : Adapter ROM
000e0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-3eebcfff : System RAM
  01000000-014dfd69 : Kernel code
  014dfd6a-018613bf : Kernel data
  018f2000-0198bfff : Kernel bss
3eebd000-3eebefff : reserved
3eebf000-3ef46fff : System RAM
3ef47000-3efbefff : ACPI Non-volatile Storage
3efbf000-3eff0fff : System RAM
3eff1000-3effefff : ACPI Tables
3efff000-3effffff : System RAM
3f000000-3fffffff : reserved
d0000000-f7ffffff : PCI Bus 0000:00
  d0000000-d04fffff : PCI Bus 0000:01
  d0500000-d06fffff : PCI Bus 0000:02
  d0700000-d08fffff : PCI Bus 0000:02
  d0900000-d0afffff : PCI Bus 0000:03
  d0b00000-d0cfffff : PCI Bus 0000:03
  d0d00000-d0efffff : PCI Bus 0000:04
  d0f00000-d10fffff : PCI Bus 0000:04
  e0000000-efffffff : 0000:00:02.0
  f0000000-f00fffff : PCI Bus 0000:01
    f0000000-f0003fff : 0000:01:00.0
      f0000000-f0003fff : r8169
    f0004000-f0004fff : 0000:01:00.0
      f0004000-f0004fff : r8169
    f0020000-f003ffff : 0000:01:00.0
  f0100000-f01fffff : PCI Bus 0000:05
    f0100000-f01000ff : 0000:05:00.0
      f0100000-f01000ff : via-rhine
  f0200000-f02fffff : 0000:00:02.0
  f0300000-f037ffff : 0000:00:02.0
  f0380000-f0383fff : 0000:00:1b.0
    f0380000-f0383fff : ICH HD audio
  f0384000-f03843ff : 0000:00:1f.2
    f0384000-f03843ff : ahci
  f0384400-f03847ff : 0000:00:1d.7
    f0384400-f03847ff : ehci_hcd
f8000000-fbffffff : PCI MMCONFIG 0000 [bus 00-3f]
  f8000000-fbffffff : reserved
    f8000000-fbffffff : pnp 00:01
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed14000-fed17fff : pnp 00:01
fed18000-fed18fff : pnp 00:01
fed19000-fed19fff : pnp 00:01
fed1c000-fed1ffff : pnp 00:01
fee00000-fee00fff : Local APIC
fff00000-ffffffff : reserved
  fff00000-ffffffff : pnp 00:01

[-- Attachment #1.4: ioports.txt --]
[-- Type: text/plain, Size: 1507 bytes --]

0000-0cf7 : PCI Bus 0000:00
  0000-001f : dma1
  0020-0021 : pic1
  0040-0043 : timer0
  0050-0053 : timer1
  0060-0060 : keyboard
  0064-0064 : keyboard
  0070-0071 : rtc0
  0080-008f : dma page reg
  00a0-00a1 : pic2
  00c0-00df : dma2
  00f0-00ff : fpu
  0295-0296 : w83627hf
    0295-0296 : w83627hf
  0378-037a : parport0
  03c0-03df : vga+
  0400-047f : pnp 00:06
    0400-0403 : ACPI PM1a_EVT_BLK
    0404-0405 : ACPI PM1a_CNT_BLK
    0408-040b : ACPI PM_TMR
    0410-0415 : ACPI CPU throttle
    0420-0420 : ACPI PM2_CNT_BLK
    0428-042f : ACPI GPE0_BLK
  0500-053f : pnp 00:06
  0680-06ff : pnp 00:06
0cf8-0cff : PCI conf1
0d00-ffff : PCI Bus 0000:00
  1000-1fff : PCI Bus 0000:05
    1000-10ff : 0000:05:00.0
      1000-10ff : via-rhine
  2000-2fff : PCI Bus 0000:01
    2000-20ff : 0000:01:00.0
      2000-20ff : r8169
  3000-301f : 0000:00:1f.3
    3000-301f : i801_smbus
  3020-303f : 0000:00:1d.3
    3020-303f : uhci_hcd
  3040-305f : 0000:00:1d.2
    3040-305f : uhci_hcd
  3060-307f : 0000:00:1d.1
    3060-307f : uhci_hcd
  3080-309f : 0000:00:1d.0
    3080-309f : uhci_hcd
  30a0-30af : 0000:00:1f.2
    30a0-30af : ahci
  30b0-30b7 : 0000:00:1f.2
    30b0-30b7 : ahci
  30b8-30bf : 0000:00:1f.2
    30b8-30bf : ahci
  30c0-30c7 : 0000:00:02.0
  30c8-30cb : 0000:00:1f.2
    30c8-30cb : ahci
  30cc-30cf : 0000:00:1f.2
    30cc-30cf : ahci
  4000-4fff : PCI Bus 0000:02
  5000-5fff : PCI Bus 0000:03
  6000-6fff : PCI Bus 0000:04

[-- Attachment #1.5: lspci.txt --]
[-- Type: text/plain, Size: 26465 bytes --]

00:00.0 Host bridge: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx DMI Bridge (rev 02)
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=08 <?>

00:02.0 VGA compatible controller: Intel Corporation Atom Processor D4xx/D5xx/N4xx/N5xx Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at f0300000 (32-bit, non-prefetchable) [size=512K]
	Region 1: I/O ports at 30c0 [size=8]
	Region 2: Memory at e0000000 (32-bit, prefetchable) [size=256M]
	Region 3: Memory at f0200000 (32-bit, non-prefetchable) [size=1M]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 01)
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 46
	Region 0: Memory at f0380000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0f00c  Data: 4152
	Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed unknown, Width x0, ASPM unknown, Latency L0 <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=1 ArbSelect=Fixed TC/VC=80
			Status:	NegoPending- InProgress-
	Capabilities: [130 v1] Root Complex Link
		Desc:	PortNumber=0f ComponentID=02 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c000
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd-hda-intel

00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 01) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00002000-00002fff
	Memory behind bridge: d0000000-d04fffff
	Prefetchable memory behind bridge: 00000000f0000000-00000000f00fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
			Slot #1, PowerLimit 10.000W; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 4191
	Capabilities: [90] Subsystem: Intel Corporation Device 4f4d
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed+ WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable- ID=0 ArbSelect=Fixed TC/VC=00
			Status:	NegoPending- InProgress-
	Capabilities: [180 v1] Root Complex Link
		Desc:	PortNumber=01 ComponentID=02 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c001
	Kernel driver in use: pcieport

00:1c.1 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 2 (rev 01) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: 00004000-00004fff
	Memory behind bridge: d0500000-d06fffff
	Prefetchable memory behind bridge: 00000000d0700000-00000000d08fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
			Slot #2, PowerLimit 10.000W; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 41a1
	Capabilities: [90] Subsystem: Intel Corporation Device 4f4d
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed+ WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable- ID=0 ArbSelect=Fixed TC/VC=00
			Status:	NegoPending- InProgress-
	Capabilities: [180 v1] Root Complex Link
		Desc:	PortNumber=02 ComponentID=02 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c001
	Kernel driver in use: pcieport

00:1c.2 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 3 (rev 01) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 00005000-00005fff
	Memory behind bridge: d0900000-d0afffff
	Prefetchable memory behind bridge: 00000000d0b00000-00000000d0cfffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #3, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
			Slot #3, PowerLimit 10.000W; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 41b1
	Capabilities: [90] Subsystem: Intel Corporation Device 4f4d
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed+ WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable- ID=0 ArbSelect=Fixed TC/VC=00
			Status:	NegoPending- InProgress-
	Capabilities: [180 v1] Root Complex Link
		Desc:	PortNumber=03 ComponentID=02 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c001
	Kernel driver in use: pcieport

00:1c.3 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 4 (rev 01) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
	I/O behind bridge: 00006000-00006fff
	Memory behind bridge: d0d00000-d0efffff
	Prefetchable memory behind bridge: 00000000d0f00000-00000000d10fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #4, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
			Slot #4, PowerLimit 10.000W; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 41c1
	Capabilities: [90] Subsystem: Intel Corporation Device 4f4d
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed+ WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable- ID=0 ArbSelect=Fixed TC/VC=00
			Status:	NegoPending- InProgress-
	Capabilities: [180 v1] Root Complex Link
		Desc:	PortNumber=04 ComponentID=02 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c001
	Kernel driver in use: pcieport

00:1d.0 USB controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 01) (prog-if 00 [UHCI])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 23
	Region 4: I/O ports at 3080 [size=32]
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci-hcd

00:1d.1 USB controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 01) (prog-if 00 [UHCI])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 19
	Region 4: I/O ports at 3060 [size=32]
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci-hcd

00:1d.2 USB controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 01) (prog-if 00 [UHCI])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin C routed to IRQ 18
	Region 4: I/O ports at 3040 [size=32]
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci-hcd

00:1d.3 USB controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 01) (prog-if 00 [UHCI])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin D routed to IRQ 16
	Region 4: I/O ports at 3020 [size=32]
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci-hcd

00:1d.7 USB controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 01) (prog-if 20 [EHCI])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 23
	Region 0: Memory at f0384400 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Kernel driver in use: ehci_hcd

00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e1) (prog-if 01 [Subtractive decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 00001000-00001fff
	Memory behind bridge: f0100000-f01fffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Subsystem: Intel Corporation Device 4f4d

00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 01)
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>

00:1f.2 SATA controller: Intel Corporation N10/ICH7 Family SATA Controller [AHCI mode] (rev 01) (prog-if 01 [AHCI 1.0])
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 44
	Region 0: I/O ports at 30b8 [size=8]
	Region 1: I/O ports at 30cc [size=4]
	Region 2: I/O ports at 30b0 [size=8]
	Region 3: I/O ports at 30c8 [size=4]
	Region 4: I/O ports at 30a0 [size=16]
	Region 5: Memory at f0384000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 41d1
	Capabilities: [70] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: ahci

00:1f.3 SMBus: Intel Corporation N10/ICH 7 Family SMBus Controller (rev 01)
	Subsystem: Intel Corporation DeskTop Board D510MO
	Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin B routed to IRQ 19
	Region 4: I/O ports at 3000 [size=32]
	Kernel driver in use: i801_smbus
	Kernel modules: i2c-i801

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
	Subsystem: Intel Corporation Desktop Board D510MO
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 45
	Region 0: I/O ports at 2000 [size=256]
	Region 2: Memory at f0004000 (64-bit, prefetchable) [size=4K]
	Region 4: Memory at f0000000 (64-bit, prefetchable) [size=16K]
	Expansion ROM at f0020000 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0f00c  Data: 41e1
	Capabilities: [70] Express (v2) Endpoint, MSI 01
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [ac] MSI-X: Enable- Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [cc] Vital Product Data
		Unknown small resource type 00, will not decode more.
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number 03-00-00-00-68-4c-e0-00
	Kernel driver in use: r8169

05:00.0 Ethernet controller: VIA Technologies, Inc. VT6105/VT6106S [Rhine-III] (rev 8b)
	Subsystem: D-Link System Inc Device 1405
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32 (750ns min, 2000ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 21
	Region 0: I/O ports at 1000 [size=256]
	Region 1: Memory at f0100000 (32-bit, non-prefetchable) [size=256]
	Capabilities: [44] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: via-rhine
	Kernel modules: via-rhine


[-- Attachment #1.6: modules.txt --]
[-- Type: text/plain, Size: 3003 bytes --]

nfsv4 142867 2 - Live 0xffffffffa02c9000
crypto_null 2468 0 - Live 0xffffffffa02c5000
camellia_generic 18121 0 - Live 0xffffffffa02bd000
camellia_x86_64 43731 0 - Live 0xffffffffa02af000
cast6 8321 0 - Live 0xffffffffa02a9000
cast5 14309 0 - Live 0xffffffffa02a2000
cts 3664 0 - Live 0xffffffffa029e000
gcm 10963 0 - Live 0xffffffffa0297000
ccm 6870 0 - Live 0xffffffffa0292000
twofish_generic 6065 0 - Live 0xffffffffa0280000
xcbc 2285 0 - Live 0xffffffffa026f000
ah6 4896 0 - Live 0xffffffffa026a000
ah4 4552 0 - Live 0xffffffffa0265000
esp6 5009 16 - Live 0xffffffffa0260000
esp4 5309 18 - Live 0xffffffffa025b000
xfrm4_mode_beet 1731 0 - Live 0xffffffffa0257000
xfrm4_tunnel 1633 0 - Live 0xffffffffa0253000
tunnel4 1949 1 xfrm4_tunnel, Live 0xffffffffa024f000
xfrm4_mode_tunnel 2060 0 - Live 0xffffffffa024b000
xfrm4_mode_transport 1186 36 - Live 0xffffffffa0247000
xfrm6_mode_transport 1250 32 - Live 0xffffffffa0243000
xfrm6_mode_ro 1062 0 - Live 0xffffffffa023f000
xfrm6_mode_beet 1618 0 - Live 0xffffffffa023b000
xfrm6_mode_tunnel 1544 0 - Live 0xffffffffa0237000
ipcomp 1772 0 - Live 0xffffffffa0233000
ipcomp6 1788 0 - Live 0xffffffffa022f000
xfrm_ipcomp 3191 2 ipcomp,ipcomp6, Live 0xffffffffa022b000
xfrm6_tunnel 2935 1 ipcomp6, Live 0xffffffffa0227000
tunnel6 1864 1 xfrm6_tunnel, Live 0xffffffffa0223000
xt_policy 2098 4 - Live 0xffffffffa021f000
xt_pkttype 947 6 - Live 0xffffffffa021b000
xt_recent 7118 2 - Live 0xffffffffa0215000
w83627hf 18739 0 - Live 0xffffffffa020b000
hwmon_vid 1916 1 w83627hf, Live 0xffffffffa0207000
nfsd 201168 11 - Live 0xffffffffa0187000
tcp_lp 1642 0 - Live 0xffffffffa0183000
tun 13568 0 - Live 0xffffffffa017b000
nfs_acl 1959 1 nfsd, Live 0xffffffffa0177000
auth_rpcgss 26056 2 nfsv4,nfsd, Live 0xffffffffa016b000
nfs 104535 3 nfsv4, Live 0xffffffffa0142000
snd_hda_codec_realtek 48800 1 - Live 0xffffffffa012f000
fscache 24055 1 nfs, Live 0xffffffffa0123000
lockd 52521 2 nfsd,nfs, Live 0xffffffffa010f000
snd_hda_intel 19431 0 - Live 0xffffffffa0105000
snd_hda_codec 64188 2 snd_hda_codec_realtek,snd_hda_intel, Live 0xffffffffa00ec000
snd_hwdep 4734 1 snd_hda_codec, Live 0xffffffffa00e7000
sunrpc 146486 35 nfsv4,nfsd,nfs_acl,auth_rpcgss,nfs,lockd, Live 0xffffffffa00b1000
uhci_hcd 17432 0 - Live 0xffffffffa0093000
via_rhine 17698 0 - Live 0xffffffffa0083000
i2c_i801 8220 0 - Live 0xffffffffa007c000
coretemp 4512 0 - Live 0xffffffffa006b000
snd_pcm 58586 2 snd_hda_intel,snd_hda_codec, Live 0xffffffffa0051000
snd_timer 15400 1 snd_pcm, Live 0xffffffffa0049000
snd 40912 6 snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer, Live 0xffffffffa0037000
soundcore 832 1 snd, Live 0xffffffffa0033000
hwmon 1201 2 w83627hf,coretemp, Live 0xffffffffa0023000
ppdev 5262 0 - Live 0xffffffffa001e000
snd_page_alloc 5905 2 snd_hda_intel,snd_pcm, Live 0xffffffffa0019000
parport_pc 27504 0 - Live 0xffffffffa000c000
parport 26359 2 ppdev,parport_pc, Live 0xffffffffa0000000

[-- Attachment #1.7: slabtop_output.txt --]
[-- Type: text/plain, Size: 1680 bytes --]

 Active / Total Objects (% used)    : 2985963 / 3024520 (98.7%)
 Active / Total Slabs (% used)      : 96586 / 96586 (100.0%)
 Active / Total Caches (% used)     : 135 / 219 (61.6%)
 Active / Total Size (% used)       : 395495.66K / 405044.92K (97.6%)
 Minimum / Average / Maximum Object : 0.05K / 0.13K / 8.05K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
2779072 2779011  99%    0.12K  86846       32    347384K secpath_cache          
 93106  73207  78%    0.15K   3581       26     14324K buffer_head            
 24288  24280  99%    0.18K   1104       22      4416K ext4_groupinfo_4k      
 21811  16993  77%    0.23K   1283       17      5132K dentry                 
 18575  18563  99%    0.16K    743       25      2972K sysfs_dir_cache        
  7738   6532  84%    0.05K    106       73       424K kmalloc-8              
  7124   5141  72%    0.59K    274       26      4384K inode_cache            
  6579   5532  84%    0.08K    129       51       516K kmalloc-32             
  6156   5064  82%    0.11K    171       36       684K kmalloc-64             
  4930   4388  89%    0.91K    290       17      4640K ext4_inode_cache       
  4738   4608  97%    0.09K    103       46       412K dm_io                  
  4704   4608  97%    0.07K     84       56       336K dm_target_io           
  4576   3081  67%    0.60K    176       26      2816K radix_tree_node        
  3520   3461  98%    0.06K     55       64       220K kmalloc-16             
  3096   3077  99%    0.66K    129       24      2064K shmem_inode_cache      
  3024   2961  97%    0.21K    168       18       672K vm_area_struct         

[-- Attachment #1.8: slub_leak_secpath_dup_slub_debug.tar.gz --]
[-- Type: application/x-gzip, Size: 979 bytes --]

[-- Attachment #1.9: ver_linux.txt --]
[-- Type: text/plain, Size: 1377 bytes --]

If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
 
Linux anathema 3.6.1-fg.mf_master #1 SMP Sat Oct 13 04:21:08 YEKT 2012 x86_64 GNU/Linux
 
Gnu C                  4.6.3
Gnu make               3.82
binutils               2.22
util-linux             2.21.2
mount                  support
module-init-tools      3.16
e2fsprogs              1.42.5
reiserfsprogs          3.6.21
xfsprogs               3.1.8
PPP                    2.4.5
Linux C Library        2.16
Dynamic linker (ldd)   2.16
Linux C++ Library      6..
Procps                 3.3.3
Net-tools              1.60_p20120127084908
Kbd                    1.15.3
Sh-utils               8.19
Modules Loaded         nfsv4 crypto_null camellia_generic camellia_x86_64 cast6 cast5 cts gcm ccm twofish_generic xcbc ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 xt_policy xt_pkttype xt_recent w83627hf hwmon_vid nfsd tcp_lp tun nfs_acl auth_rpcgss nfs snd_hda_codec_realtek fscache lockd snd_hda_intel snd_hda_codec snd_hwdep sunrpc uhci_hcd via_rhine i2c_i801 coretemp snd_pcm snd_timer snd soundcore hwmon ppdev snd_page_alloc parport_pc parport

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-19 14:50 PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels Mike Kazantsev
@ 2012-10-19 17:36 ` Mike Kazantsev
  2012-10-20 12:42   ` Paul Moore
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-19 17:36 UTC (permalink / raw)
  To: linux-mm; +Cc: paul, netdev

[-- Attachment #1: Type: text/plain, Size: 2261 bytes --]

On Fri, 19 Oct 2012 20:50:55 +0600
Mike Kazantsev <mk.fraggod@gmail.com> wrote:

> slabtop showed "kmalloc-64" being the 99% offender in the past, but
> with recent kernels (3.6.1), it has changed to "secpath_cache"

To be more specific, on the 3.5.4 kernel the leak looks like this:

 Active / Total Objects (% used)    : 19971419 / 20084060 (99.4%)
 Active / Total Slabs (% used)      : 318645 / 318645 (100.0%)
 Active / Total Caches (% used)     : 79 / 121 (65.3%)
 Active / Total Size (% used)       : 1285299.85K / 1307992.83K (98.3%)
 Minimum / Average / Maximum Object : 0.01K / 0.06K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
19678272 19678272 100%    0.06K 307473       64   1229892K kmalloc-64             
159198  95262  59%    0.10K   4082       39     16328K buffer_head            
 32865  17515  53%    0.19K   1565       21      6260K dentry                 
 20480  19456  95%    0.02K     80      256       320K ext4_io_page           
 16896  10380  61%    0.03K    132      128       528K kmalloc-32             
 16164  16164 100%    0.11K    449       36      1796K sysfs_dir_cache        
 15980  15980 100%    0.02K     94      170       376K fsnotify_event_holder  
 14742   9205  62%    0.87K    819       18     13104K ext4_inode_cache       
 13916   5494  39%    0.55K    497       28      7952K radix_tree_node        
 10030   5172  51%    0.05K    118       85       472K anon_vma_chain         
 10020  10020 100%    0.13K    334       30      1336K ext4_allocation_context
  9486   9398  99%    0.04K     93      102       372K Acpi-Namespace         
  8192   8192 100%    0.01K     16      512        64K kmalloc-8              
  6960   6016  86%    0.25K    435       16      1740K kmalloc-256            
  6641   5412  81%    0.55K    229       29      3664K inode_cache            
  5124   4333  84%    0.19K    244       21       976K kmalloc-192

Unfortunately, the kernel on this machine isn't booted with slub_debug
options (yet), so there are no specifics on whether these objects are
allocated (as I understand it) by the same call or a different one.

Not sure if it's even possible that it might be the same call.
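
For anyone wanting to rank caches from a saved "slabtop -o" listing instead
of reading it by eye, here is a minimal sketch. It assumes the column layout
shown above; the names parse_slabtop and share_of_total are made up for this
example, and the rows are copied from the 3.5.4 listing.

```python
import re

# Top three rows copied from the 3.5.4 slabtop listing above.
SLABTOP_ROWS = """\
19678272 19678272 100%    0.06K 307473       64   1229892K kmalloc-64
159198  95262  59%    0.10K   4082       39     16328K buffer_head
 32865  17515  53%    0.19K   1565       21      6260K dentry
"""

# Columns: OBJS ACTIVE USE OBJ-SIZE SLABS OBJ/SLAB CACHE-SIZE NAME
ROW_RE = re.compile(
    r"^\s*(?P<objs>\d+)\s+(?P<active>\d+)\s+(?P<use>\d+)%\s+"
    r"(?P<obj_size>[\d.]+)K\s+(?P<slabs>\d+)\s+(?P<per_slab>\d+)\s+"
    r"(?P<cache_kib>\d+)K\s+(?P<name>\S+)\s*$"
)


def parse_slabtop(text):
    """Return one dict per cache row, largest cache first."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if not m:
            continue  # skip summary lines, the header, and blank lines
        rows.append({
            "name": m.group("name"),
            "objs": int(m.group("objs")),
            "active": int(m.group("active")),
            "slabs": int(m.group("slabs")),
            "per_slab": int(m.group("per_slab")),
            "cache_kib": int(m.group("cache_kib")),
        })
    return sorted(rows, key=lambda r: r["cache_kib"], reverse=True)


def share_of_total(row, total_kib):
    """Percentage of the total slab size taken by one cache."""
    return 100.0 * row["cache_kib"] / total_kib


if __name__ == "__main__":
    # 1307992.83K is the "Total Size" figure from the summary above.
    for row in parse_slabtop(SLABTOP_ROWS):
        print(row["name"], row["cache_kib"],
              round(share_of_total(row, 1307992.83), 1))
```

The same parser should also work on the 3.6.1 listing, where secpath_cache
takes the place of kmalloc-64 as the largest cache.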


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-19 17:36 ` Mike Kazantsev
@ 2012-10-20 12:42   ` Paul Moore
  2012-10-20 14:49     ` Mike Kazantsev
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Moore @ 2012-10-20 12:42 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 360 bytes --]

Thanks for the problem report.  I'm not going to be in a position to start
looking into this until late Sunday, but hopefully it will be a quick fix.

Two quick questions (my apologies, I'm not able to dig through your logs
right now): do you see this leak on kernels < 3.5.0, and are you using any
labeled IPsec connections?

--
paul moore
www.paul-moore.com

[-- Attachment #2: Type: text/html, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-20 12:42   ` Paul Moore
@ 2012-10-20 14:49     ` Mike Kazantsev
  2012-10-20 22:45       ` Mike Kazantsev
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-20 14:49 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1911 bytes --]

On Sat, 20 Oct 2012 08:42:33 -0400
Paul Moore <paul@paul-moore.com> wrote:

> Thanks for the problem report.  I'm not going to be in a position to start
> looking into this until late Sunday, but hopefully it will be a quick fix.
> 
> Two quick questions (my apologies, I'm not able to dig through your logs
> right now): do you see this leak on kernels < 3.5.0, and are you using any
> labeled IPsec connections?
> 

As I understand it, labelled connections are only used by the SELinux
and SMACK LSMs, which are not enabled (in Kconfig, i.e. not built) in any
of the kernels I use.

The only LSM I have enabled (and actually use on 2/4 of these machines)
is AppArmor, and I think it doesn't attach any labels to network
connections yet (there's a "Wishlist" bug at
https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/796588, but I
can't seem to find an existing implementation).

I believe it started with 3.5.0, according to all available logs I
have. I'm afraid laziness and other tasks prevented me from
looking into and reporting the issue back then, but the memory graph trends
start at the exact time of reboot into 3.5.0 kernels, and before that
there are no such trends in slab memory usage.

I've been able to ignore and work around the problem for months now, so
I don't think there's any rush at all ;)

That said, I've now started a git bisect between the v3.4 and v3.5
tags, so hopefully I'll have good-enough results from it before
you get to it (probably in a few hours to a few days).

Also, I've found that switching from the "slub" allocator to "slab" doesn't
help the problem at all, so I guess something indeed doesn't get freed in
the code, though I haven't been able to find anything relevant in the
logs for the sources where secpath_put and secpath_dup are used, and
decided to try bisect.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-20 14:49     ` Mike Kazantsev
@ 2012-10-20 22:45       ` Mike Kazantsev
  2012-10-21  0:24         ` Mike Kazantsev
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-20 22:45 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3897 bytes --]

On Sat, 20 Oct 2012 20:49:58 +0600
Mike Kazantsev <mk.fraggod@gmail.com> wrote:

> On Sat, 20 Oct 2012 08:42:33 -0400
> Paul Moore <paul@paul-moore.com> wrote:
> 
> > Thanks for the problem report.  I'm not going to be in a position to start
> > looking into this until late Sunday, but hopefully it will be a quick fix.
> > 
> > Two quick questions (my apologies, I'm not able to dig through your logs
> > right now): do you see this leak on kernels < 3.5.0, and are you using any
> > labeled IPsec connections?
> > 
> 
> As I understand, labelled connections are only used in SELinux
> and SMACK LSM, which are not enabled (in Kconfig, i.e. not built) in any
> of the kernels I use.
> 
> The only LSM I have enabled (and actually use on 2/4 of these machines)
> is AppArmor, and though I think it doesn't attach any labels to network
> connections yet (there's a "Wishlist" bug at
> https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/796588, but I
> can't seem to find an existing implementation).
> 
> I believe it has started with 3.5.0, according to all available logs I
> have. I'm afraid laziness and other tasks have prevented me from
> looking into and reporting the issue back then, but memory graph trends
> start at the exact time of reboot into 3.5.0 kernels, and before that,
> there're no such trends for slab memory usage.
> 
> I've been able to ignore and work around the problem for months now, so
> I don't think there's any rush at all ;)
> 
> But that said, currently I've started git bisect process between v3.5
> and v3.4 tags, so hopefully I'll get good-enough results of it before
> you'll get to it (probably in a few hours to a few days).
> 
> Also, I've found that switching to "slab" allocator from "slub" doesn't
> help the problem at all, so I guess something doesn't get freed in the
> code indeed, though I hasn't been able to find anything relevant in the
> logs for the sources where secpath_put and secpath_dup are used, and
> decided to try bisect.
> 

Sorry for yet another mail on the weekend, but I've finished the bisect
and here is the result:

a1c7fff7e18f59e684e07b0f9a770561cd39f395 is the first bad commit
commit a1c7fff7e18f59e684e07b0f9a770561cd39f395
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu May 17 07:34:16 2012 +0000

    net: netdev_alloc_skb() use build_skb()

    netdev_alloc_skb() is used by networks driver in their RX path to
    allocate an skb to receive an incoming frame.

    With recent skb->head_frag infrastructure, it makes sense to change
    netdev_alloc_skb() to use build_skb() and a frag allocator.

    This permits a zero copy splice(socket->pipe), and better GRO or TCP
    coalescing.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 17938b1b46bc38aa126cc23b7a7647259297657d 1e29cf65869391eb13552c51e0cf288fc7085fec M      net

No skips; all "good" / "bad" decisions were unambiguous and easy
to make: secpath_cache slabs either stayed at a constant 20K
cumulative size (~5 of them) and were reported as 10-15% full in the
"good" case, or were 99% full and eating memory at hundreds of KiB/s
(during the same rsync transfer) in the "bad" case.
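
The good/bad criterion above can be written down as a small helper, in case
someone wants to repeat the bisect. This is only a sketch: the function
names and the thresholds (100 KiB/s growth, 90% full) are my assumptions,
loosely based on the numbers above, and the samples in the usage part are
illustrative, not measured.

```python
def growth_rate_kib_s(samples):
    """Average growth in KiB/s over (seconds, cache_size_kib) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, size0), (t1, size1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing time")
    return (size1 - size0) / (t1 - t0)


def classify(samples, use_percent, leak_rate_kib_s=100.0, full_percent=90):
    """Map secpath_cache observations to a git bisect verdict.

    Returns "bad" when the cache grows fast and is nearly full, "good"
    when it does not grow and is mostly empty, and "skip" otherwise.
    Both thresholds are assumed values, not taken from the kernel.
    """
    rate = growth_rate_kib_s(samples)
    if rate >= leak_rate_kib_s and use_percent >= full_percent:
        return "bad"
    if rate <= 0 and use_percent < full_percent:
        return "good"
    return "skip"


if __name__ == "__main__":
    # Illustrative samples: (seconds since start, cache size in KiB).
    print(classify([(0, 20), (60, 20)], use_percent=12))
    print(classify([(0, 20), (60, 12020)], use_percent=99))
```

The three return values match the git bisect subcommands of the same name,
so the verdict can be passed on by hand after each test boot.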

Reverting that commit in the 3.6.2 kernel looks like a bad idea and doesn't
seem possible to do cleanly.
Not being a C coder and having only a faint idea of how things should
be done with regard to socket buffers, I can't seem to find anything
to tweak based on that commit either.

The kmemleak mechanism seems to provide stack traces and interesting calls
for debugging whatever is allocating the non-freed objects, so I guess
I'll see if I can get a more definitive (to my ignorant eye) "look here"
hint from it, and might drop one more mail with data from there.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-20 22:45       ` Mike Kazantsev
@ 2012-10-21  0:24         ` Mike Kazantsev
  2012-10-21 13:29             ` Eric Dumazet
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-21  0:24 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 4400 bytes --]

On Sun, 21 Oct 2012 04:45:40 +0600
Mike Kazantsev <mk.fraggod@gmail.com> wrote:

> 
> kmemleak mechanism seem to provide stack traces and interesting calls
> for debugging of whatever is allocating the non-freed objects, so guess
> I'll see if I can get more definitive (to my ignorant eye) "look here"
> hint from it, and might drop one more mail with data from there.
> 

kmemleak finds a lot (dozens of megabytes of stack traces) of identical
paths leading to leaks:

(for IPv6 packets)
unreferenced object 0xffff88002fa25b00 (size 56):
  comm "softirq", pid 0, jiffies 4295009073 (age 295.620s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 fc 6e 30 00 88 ff ff  ..........n0....
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff814b1d2c>] xfrm6_rcv_spi+0x19/0x1b
    [<ffffffff814b1d4e>] xfrm6_rcv+0x20/0x22
    [<ffffffff8148c19f>] ip6_input_finish+0x203/0x31b
    [<ffffffff8148c622>] ip6_input+0x1e/0x50
    [<ffffffff8148c31c>] ip6_rcv_finish+0x65/0x69
    [<ffffffff8148c5a3>] ipv6_rcv+0x283/0x2e4
    [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
    [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
    [<ffffffff81400644>] napi_skb_finish+0x21/0x53
    [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
    [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
    [<ffffffff813ffcda>] net_rx_action+0x9f/0x175

(for IPv4 packets)
unreferenced object 0xffff88003387e000 (size 56):
  comm "softirq", pid 0, jiffies 4294915803 (age 563.583s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 48 be 30 00 88 ff ff  .........H.0....
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff81474f7b>] xfrm4_rcv_encap+0x17/0x19
    [<ffffffff81474f9c>] xfrm4_rcv+0x1f/0x21
    [<ffffffff81430514>] ip_local_deliver_finish+0x170/0x22a
    [<ffffffff81430706>] ip_local_deliver+0x46/0x78
    [<ffffffff8143038d>] ip_rcv_finish+0x2bd/0x2d4
    [<ffffffff81430969>] ip_rcv+0x231/0x28c
    [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
    [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
    [<ffffffff81400644>] napi_skb_finish+0x21/0x53
    [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
    [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
    [<ffffffff813ffcda>] net_rx_action+0x9f/0x175

The object contents and the trace seem to be the same (within the same
IP family) everywhere; only the ages and addresses differ.
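
The first 16 bytes of the hex dumps above can be decoded with a few lines of
Python. This is a sketch under an assumption: that struct sec_path on this
x86_64 kernel starts with a 4-byte refcnt, a 4-byte len and then an array of
xfrm_state pointers, which would fit the reported object size of 56 bytes.
The function names are made up for the example.

```python
import struct

# First hex dump line of the IPv6 and IPv4 objects reported above.
IPV6_HEAD = "01 00 00 00 01 00 00 00 00 fc 6e 30 00 88 ff ff"
IPV4_HEAD = "01 00 00 00 01 00 00 00 00 48 be 30 00 88 ff ff"
SECOND_LINE = "6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b"

POISON_BYTE = 0x6B  # fill pattern used by slab poisoning (POISON_FREE)


def decode_head(hex_line):
    """Decode 16 bytes as (refcnt, len, xvec[0]); the layout is assumed."""
    raw = bytes.fromhex(hex_line)
    # "<" = little-endian without padding: two 32-bit ints, one 64-bit pointer.
    refcnt, length, xvec0 = struct.unpack("<iiQ", raw)
    return refcnt, length, xvec0


def is_poison(hex_line):
    """True if every byte on the line is the poison pattern."""
    return all(b == POISON_BYTE for b in bytes.fromhex(hex_line))


if __name__ == "__main__":
    for label, line in (("IPv6", IPV6_HEAD), ("IPv4", IPV4_HEAD)):
        refcnt, length, xvec0 = decode_head(line)
        print(label, "refcnt=%d len=%d xvec[0]=%#x" % (refcnt, length, xvec0))
    print("second line is poison:", is_poison(SECOND_LINE))
```

If the layout assumption holds, both objects have a reference count of 1 and
a single state attached, and the rest of each object was never written after
allocation.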

IPv6 usage seems to be one important detail which I failed to mention.
IPv4 traces seem to be really rare (only several of them), but that
might be understandable because rsync was run over IPv6.
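
Counting how many leaked objects share each backtrace, and how they split
between IPv4 and IPv6, can be scripted as well. Below is a minimal sketch
that assumes the report format shown above (as read from
/sys/kernel/debug/kmemleak); the sample is abbreviated to the first five
frames of each of the two traces, and the function names are made up.

```python
import re
from collections import Counter

# Abbreviated copies of the two records above (first five frames each).
SAMPLE = """\
unreferenced object 0xffff88002fa25b00 (size 56):
  comm "softirq", pid 0, jiffies 4295009073 (age 295.620s)
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff814b1d2c>] xfrm6_rcv_spi+0x19/0x1b
unreferenced object 0xffff88003387e000 (size 56):
  comm "softirq", pid 0, jiffies 4294915803 (age 563.583s)
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff81474f7b>] xfrm4_rcv_encap+0x17/0x19
"""

# Matches "[<address>] function+0xoff/0xlen" and captures the function name.
FRAME_RE = re.compile(r"^\s*\[<[0-9a-f]+>\]\s+(\w+)\+0x")


def group_by_backtrace(report):
    """Count records per backtrace, ignoring addresses and ages."""
    counts = Counter()
    frames = None
    for line in report.splitlines():
        if line.startswith("unreferenced object"):
            if frames:
                counts[tuple(frames)] += 1
            frames = []
            continue
        m = FRAME_RE.match(line)
        if m and frames is not None:
            frames.append(m.group(1))
    if frames:
        counts[tuple(frames)] += 1
    return counts


def family(frames):
    """Guess the IP family from the xfrm4_/xfrm6_ frames in a trace."""
    if any(f.startswith("xfrm6_") for f in frames):
        return "IPv6"
    if any(f.startswith("xfrm4_") for f in frames):
        return "IPv4"
    return "other"


if __name__ == "__main__":
    per_family = Counter()
    for frames, count in group_by_backtrace(SAMPLE).items():
        per_family[family(frames)] += count
    print(dict(per_family))
```

Run against the full report, this should collapse the megabytes of traces
into a handful of distinct call paths with a count next to each.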

I still wasn't able to figure out what might cause the get/put
imbalance with that commit, but was able to revert it, without
anything bad happening (so far), using the patch below (in case the
issue bites someone else before a proper fix is found).


--

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..52a9d40 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -427,26 +427,8 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				   unsigned int length, gfp_t gfp_mask)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
-			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
-	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
-		void *data;
-
-		if (sk_memalloc_socks())
-			gfp_mask |= __GFP_MEMALLOC;
-
-		data = __netdev_alloc_frag(fragsz, gfp_mask);
-
-		if (likely(data)) {
-			skb = build_skb(data, fragsz);
-			if (unlikely(!skb))
-				put_page(virt_to_head_page(data));
-		}
-	} else {
-		skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
+	skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
 				  SKB_ALLOC_RX, NUMA_NO_NODE);
-	}
 	if (likely(skb)) {
 		skb_reserve(skb, NET_SKB_PAD);
 		skb->dev = dev;


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21  0:24         ` Mike Kazantsev
@ 2012-10-21 13:29             ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-21 13:29 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Sun, 2012-10-21 at 06:24 +0600, Mike Kazantsev wrote:
> On Sun, 21 Oct 2012 04:45:40 +0600
> Mike Kazantsev <mk.fraggod@gmail.com> wrote:
> 
> > 
> > kmemleak mechanism seem to provide stack traces and interesting calls
> > for debugging of whatever is allocating the non-freed objects, so guess
> > I'll see if I can get more definitive (to my ignorant eye) "look here"
> > hint from it, and might drop one more mail with data from there.
> > 
> 
> kmemleak finds a lot (dozens megabytes of stack traces) of identical
> paths leading to a leaks:
> 
> (for IPv6 packets)
> unreferenced object 0xffff88002fa25b00 (size 56):
>   comm "softirq", pid 0, jiffies 4295009073 (age 295.620s)
>   hex dump (first 32 bytes):
>     01 00 00 00 01 00 00 00 00 fc 6e 30 00 88 ff ff  ..........n0....
>     6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
>   backtrace:
>     [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
>     [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
>     [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
>     [<ffffffff8147df39>] xfrm_input+0x64/0x484
>     [<ffffffff814b1d2c>] xfrm6_rcv_spi+0x19/0x1b
>     [<ffffffff814b1d4e>] xfrm6_rcv+0x20/0x22
>     [<ffffffff8148c19f>] ip6_input_finish+0x203/0x31b
>     [<ffffffff8148c622>] ip6_input+0x1e/0x50
>     [<ffffffff8148c31c>] ip6_rcv_finish+0x65/0x69
>     [<ffffffff8148c5a3>] ipv6_rcv+0x283/0x2e4
>     [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
>     [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
>     [<ffffffff81400644>] napi_skb_finish+0x21/0x53
>     [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
>     [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
>     [<ffffffff813ffcda>] net_rx_action+0x9f/0x175
> 
> (for IPv4 packets)
> unreferenced object 0xffff88003387e000 (size 56):
>   comm "softirq", pid 0, jiffies 4294915803 (age 563.583s)
>   hex dump (first 32 bytes):
>     01 00 00 00 01 00 00 00 00 48 be 30 00 88 ff ff  .........H.0....
>     6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
>   backtrace:
>     [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
>     [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
>     [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
>     [<ffffffff8147df39>] xfrm_input+0x64/0x484
>     [<ffffffff81474f7b>] xfrm4_rcv_encap+0x17/0x19
>     [<ffffffff81474f9c>] xfrm4_rcv+0x1f/0x21
>     [<ffffffff81430514>] ip_local_deliver_finish+0x170/0x22a
>     [<ffffffff81430706>] ip_local_deliver+0x46/0x78
>     [<ffffffff8143038d>] ip_rcv_finish+0x2bd/0x2d4
>     [<ffffffff81430969>] ip_rcv+0x231/0x28c
>     [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
>     [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
>     [<ffffffff81400644>] napi_skb_finish+0x21/0x53
>     [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
>     [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
>     [<ffffffff813ffcda>] net_rx_action+0x9f/0x175
> 
> Object at the top and trace seem to be the same (between same
> IP-family) everywhere, just ages and addresses are different.
> 
> IPv6 usage seem to be one important detail which I failed to mention.
> IPv4 traces seem to be really rare (only several of them), but that
> might be understandable because rsync was ran over IPv6.
> 
> Still wasn't able to figure out what might cause the get's/put's
> disbalance with that commit, but was able to revert it, without
> anything bad happening (so far), using the patch below (in case
> issue might bite someone else before proper fix is found).
> 
> 
> --
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6e04b1f..52a9d40 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -427,26 +427,8 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
>  				   unsigned int length, gfp_t gfp_mask)
>  {
>  	struct sk_buff *skb = NULL;
> -	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
> -			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> -
> -	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
> -		void *data;
> -
> -		if (sk_memalloc_socks())
> -			gfp_mask |= __GFP_MEMALLOC;
> -
> -		data = __netdev_alloc_frag(fragsz, gfp_mask);
> -
> -		if (likely(data)) {
> -			skb = build_skb(data, fragsz);
> -			if (unlikely(!skb))
> -				put_page(virt_to_head_page(data));
> -		}
> -	} else {
> -		skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
> +	skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
>  				  SKB_ALLOC_RX, NUMA_NO_NODE);
> -	}
>  	if (likely(skb)) {
>  		skb_reserve(skb, NET_SKB_PAD);
>  		skb->dev = dev;
> 
> 



Did you try linux-3.7-rc2 (or linux-3.7-rc1)?



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 13:29             ` Eric Dumazet
  (?)
@ 2012-10-21 13:57             ` Mike Kazantsev
  2012-10-21 18:43               ` Mike Kazantsev
  -1 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-21 13:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 244 bytes --]

On Sun, 21 Oct 2012 15:29:43 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> 
> Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> 

I did not, will do in a few hours, thanks for the pointer.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 13:57             ` Mike Kazantsev
@ 2012-10-21 18:43               ` Mike Kazantsev
  2012-10-21 19:51                 ` Mike Kazantsev
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-21 18:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1131 bytes --]

On Sun, 21 Oct 2012 19:57:01 +0600
Mike Kazantsev <mk.fraggod@gmail.com> wrote:

> On Sun, 21 Oct 2012 15:29:43 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > 
> > Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> > 
> 
> I did not, will do in a few hours, thanks for the pointer.
> 

I just built "torvalds/linux-2.6" (v3.7-rc2) and rebooted into it,
started the same rsync-over-net test and got kmalloc-64 leaking (it went
up to tens of MiB until I stopped rsync; normally it stays at ~500
KiB).

Unfortunately, I forgot to add the slub_debug option and build kmemleak,
so I wasn't able to look at this case further, and when I rebooted with
these enabled/built, it was secpath_cache again.

So the previously noted "slabtop showed 'kmalloc-64' being the 99% offender
in the past, but with recent kernels (3.6.1), it has changed to
'secpath_cache'" seems to be incorrect, as it seems to depend not on
the kernel version, but on some other factor.

Guess I'll try to reboot a few more times to see if I can catch
kmalloc-64 leaking (instead of secpath_cache) again.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 18:43               ` Mike Kazantsev
@ 2012-10-21 19:51                 ` Mike Kazantsev
  2012-10-21 21:47                     ` Eric Dumazet
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-21 19:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 2459 bytes --]

On Mon, 22 Oct 2012 00:43:32 +0600
Mike Kazantsev <mk.fraggod@gmail.com> wrote:

> > On Sun, 21 Oct 2012 15:29:43 +0200
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > 
> > > 
> > > Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> > > 
> 
> I just built "torvalds/linux-2.6" (v3.7-rc2) and rebooted into it,
> started same rsync-over-net test and got kmalloc-64 leaking (it went up
> to tens of MiB until I stopped rsync, normally these are fixed at ~500
> KiB).
> 
> Unfortunately, I forgot to add slub_debug option and build kmemleak so
> wasn't able to look at this case further, and when I rebooted with
> these enabled/built, it was secpath_cache again.
> 
> So previously noted "slabtop showed 'kmalloc-64' being the 99% offender
> in the past, but with recent kernels (3.6.1), it has changed to
> 'secpath_cache'" seem to be incorrect, as it seem to depend not on
> kernel version, but some other factor.
> 
> Guess I'll try to reboot a few more times to see if I can catch
> kmalloc-64 leaking (instead of secpath_cache) again.
> 

I haven't been able to catch the aforementioned condition, but noticed
that with v3.7-rc2, the "hex dump" part seems to vary between kmemleak
traces and contains all sorts of random stuff, for example:

unreferenced object 0xffff88002ae2de00 (size 56):
  comm "softirq", pid 0, jiffies 4295006317 (age 213.066s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 20 9f f4 28 00 88 ff ff  ........ ..(....
    2f 6f 72 67 2f 66 72 65 65 64 65 73 6b 74 6f 70  /org/freedesktop
  backtrace:
    [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff81487bf1>] secpath_dup+0x1b/0x5a
    [<ffffffff81487df5>] xfrm_input+0x64/0x484
    [<ffffffff814bbd70>] xfrm6_rcv_spi+0x19/0x1b
    [<ffffffff814bbd92>] xfrm6_rcv+0x20/0x22
    [<ffffffff814960c3>] ip6_input_finish+0x203/0x31b
    [<ffffffff81496542>] ip6_input+0x1e/0x50
    [<ffffffff81496240>] ip6_rcv_finish+0x65/0x69
    [<ffffffff814964c3>] ipv6_rcv+0x27f/0x2e0
    [<ffffffff8140a659>] __netif_receive_skb+0x5ba/0x65a
    [<ffffffff8140a894>] netif_receive_skb+0x47/0x78
    [<ffffffff8140b4bf>] napi_skb_finish+0x21/0x54
    [<ffffffff8140b5ef>] napi_gro_receive+0xfd/0x10a
    [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
    [<ffffffff8140ad44>] net_rx_action+0x9f/0x188

Not sure if it's relevant though.
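
For reading dumps like the one above, here is a minimal userspace sketch
(not kernel code) that decodes the first 16 bytes. It assumes a
little-endian x86-64 machine and that struct sec_path starts with a 4-byte
refcnt and a 4-byte len, followed by an array of xfrm_state pointers. That
layout is an assumption, although 4 + 4 + 6 * 8 = 56 does match the
reported object size. The names secpath_head, read_le,
decode_secpath_head and example_dump are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED start of struct sec_path on x86-64 (see the text above). */
struct secpath_head {
	uint32_t refcnt;  /* reference count */
	uint32_t len;     /* number of valid pointer slots */
	uint64_t xvec0;   /* first xfrm_state pointer */
};

/* Read an n-byte little-endian integer (n <= 8). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	assert(n <= 8);
	while (n--)
		v = (v << 8) | p[n];  /* highest byte first, lowest last */
	return v;
}

/* Decode the first line (16 bytes) of a kmemleak hex dump. */
static struct secpath_head decode_secpath_head(const uint8_t dump[16])
{
	struct secpath_head h;

	h.refcnt = (uint32_t)read_le(dump, 4);
	h.len    = (uint32_t)read_le(dump + 4, 4);
	h.xvec0  = read_le(dump + 8, 8);
	return h;
}

/* First hex dump line of the trace quoted above. */
static const uint8_t example_dump[16] = {
	0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
	0x20, 0x9f, 0xf4, 0x28, 0x00, 0x88, 0xff, 0xff,
};
```

For these bytes, decode_secpath_head(example_dump) gives refcnt = 1,
len = 1 and xvec0 = 0xffff880028f49f20. If len really is 1, the second
dump line ("/org/freedesktop") would sit in pointer slots that are not in
use.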


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 19:51                 ` Mike Kazantsev
@ 2012-10-21 21:47                     ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-21 21:47 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 01:51 +0600, Mike Kazantsev wrote:
> On Mon, 22 Oct 2012 00:43:32 +0600
> Mike Kazantsev <mk.fraggod@gmail.com> wrote:
> 
> > > On Sun, 21 Oct 2012 15:29:43 +0200
> > > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > 
> > > > 
> > > > Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> > > > 
> > 
> > I just built "torvalds/linux-2.6" (v3.7-rc2) and rebooted into it,
> > started same rsync-over-net test and got kmalloc-64 leaking (it went up
> > to tens of MiB until I stopped rsync, normally these are fixed at ~500
> > KiB).
> > 
> > Unfortunately, I forgot to add slub_debug option and build kmemleak so
> > wasn't able to look at this case further, and when I rebooted with
> > these enabled/built, it was secpath_cache again.
> > 
> > So previously noted "slabtop showed 'kmalloc-64' being the 99% offender
> > in the past, but with recent kernels (3.6.1), it has changed to
> > 'secpath_cache'" seem to be incorrect, as it seem to depend not on
> > kernel version, but some other factor.
> > 
> > Guess I'll try to reboot a few more times to see if I can catch
> > kmalloc-64 leaking (instead of secpath_cache) again.
> > 
> 
> I haven't been able to catch the aforementioned condition, but noticed
> that with v3.7-rc2, "hex dump" part seem to vary in kmemleak
> traces, and contain all sorts of random stuff, for example:
> 
> unreferenced object 0xffff88002ae2de00 (size 56):
>   comm "softirq", pid 0, jiffies 4295006317 (age 213.066s)
>   hex dump (first 32 bytes):
>     01 00 00 00 01 00 00 00 20 9f f4 28 00 88 ff ff  ........ ..(....
>     2f 6f 72 67 2f 66 72 65 65 64 65 73 6b 74 6f 70  /org/freedesktop
>   backtrace:
>     [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
>     [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
>     [<ffffffff81487bf1>] secpath_dup+0x1b/0x5a
>     [<ffffffff81487df5>] xfrm_input+0x64/0x484
>     [<ffffffff814bbd70>] xfrm6_rcv_spi+0x19/0x1b
>     [<ffffffff814bbd92>] xfrm6_rcv+0x20/0x22
>     [<ffffffff814960c3>] ip6_input_finish+0x203/0x31b
>     [<ffffffff81496542>] ip6_input+0x1e/0x50
>     [<ffffffff81496240>] ip6_rcv_finish+0x65/0x69
>     [<ffffffff814964c3>] ipv6_rcv+0x27f/0x2e0
>     [<ffffffff8140a659>] __netif_receive_skb+0x5ba/0x65a
>     [<ffffffff8140a894>] netif_receive_skb+0x47/0x78
>     [<ffffffff8140b4bf>] napi_skb_finish+0x21/0x54
>     [<ffffffff8140b5ef>] napi_gro_receive+0xfd/0x10a
>     [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
>     [<ffffffff8140ad44>] net_rx_action+0x9f/0x188
> 
> Not sure if it's relevant though.
> 
> 

OK, so some layer seems to have a bug when skb->head is allocated at
exactly the requested size, instead of having extra tailroom (because of
kmalloc power-of-2 rounding).

Or some layer writes past the end of the skb->cb[] array.

If you move the sp field in sk_buff, does it change anything?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6a2c34e..9b1438a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -395,6 +395,9 @@ struct sk_buff {
 	struct sock		*sk;
 	struct net_device	*dev;
 
+#ifdef CONFIG_XFRM
+	struct	sec_path	*sp;
+#endif
 	/*
 	 * This is the control buffer. It is free to use for every
 	 * layer. Please put your private variables there. If you
@@ -404,9 +407,6 @@ struct sk_buff {
 	char			cb[48] __aligned(8);
 
 	unsigned long		_skb_refdst;
-#ifdef CONFIG_XFRM
-	struct	sec_path	*sp;
-#endif
 	unsigned int		len,
 				data_len;
 	__u16			mac_len,
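
To illustrate what moving the field tests, here is a small userspace
sketch. The two structs are toy stand-ins with only the fields touched by
the diff above (they are not the real sk_buff), and overrun_hits_sp,
before_hit and after_hit are hypothetical helpers. The idea, as an
assumption about the intent: if something writes past the end of cb[], it
can only reach sp while sp sits after cb[].

```c
#include <assert.h>
#include <stddef.h>

/* Toy layout with sp after cb[], as before the patch. */
struct toy_skb_before {
	void *sk;
	void *dev;
	_Alignas(8) char cb[48];
	unsigned long refdst;
	void *sp;
};

/* Toy layout with sp moved in front of cb[], as after the patch. */
struct toy_skb_after {
	void *sk;
	void *dev;
	void *sp;
	_Alignas(8) char cb[48];
	unsigned long refdst;
};

/* Return 1 if writing n bytes starting at cb[0] would overlap sp. */
static int overrun_hits_sp(size_t cb_off, size_t sp_off,
			   size_t sp_size, size_t n)
{
	size_t end = cb_off + n;  /* one past the last byte written */

	assert(sp_size > 0);
	return sp_off < end && sp_off + sp_size > cb_off;
}

static int before_hit(size_t n)
{
	return overrun_hits_sp(offsetof(struct toy_skb_before, cb),
			       offsetof(struct toy_skb_before, sp),
			       sizeof(void *), n);
}

static int after_hit(size_t n)
{
	return overrun_hits_sp(offsetof(struct toy_skb_after, cb),
			       offsetof(struct toy_skb_after, sp),
			       sizeof(void *), n);
}
```

With these toy layouts, a write of exactly 48 bytes touches sp in neither
struct, a write that runs one byte past refdst touches sp only in the
"before" layout, and no forward overrun from cb[] reaches sp in the
"after" layout.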




Also try increasing the tailroom in __netdev_alloc_skb():

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..972ee4f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -427,7 +427,7 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				   unsigned int length, gfp_t gfp_mask)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
+	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD + 64) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 21:47                     ` Eric Dumazet
  (?)
@ 2012-10-21 22:58                     ` Mike Kazantsev
  2012-10-22  8:15                         ` Eric Dumazet
  -1 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-21 22:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3382 bytes --]

On Sun, 21 Oct 2012 23:47:33 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> 
> OK, so  some layer seems to have a bug if the skb->head is exactly
> allocated, instead of having extra tailroom (because of kmalloc-powerof2
> alignment)
> 
> Or some layer overwrites past skb->cb[] array
> 
> If you try to move sp field in sk_buff, does it change something ?
> 
...
> 
> Also try to increase tailroom in __netdev_alloc_skb()
> 

Applied both patches, but unfortunately the problem still seems to be
there.

This time the leaking objects show up as kmalloc-64.

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
266760 265333  99%    0.30K  10260       26     82080K kmemleak_object
157440 157440 100%    0.06K   2460       64      9840K kmalloc-64
 94458  94458 100%    0.10K   2422       39      9688K buffer_head
 27573  27573 100%    0.19K   1313       21      5252K dentry


kmemleak traces:

unreferenced object 0xffff88002f38ec80 (size 64):
  comm "softirq", pid 0, jiffies 4294900815 (age 142.346s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 08 03 2e 00 88 ff ff  ................
    2b 6f a0 ca 28 b2 4a f1 0a 74 33 74 5a 76 18 cb  +o..(.J..t3tZv..
  backtrace:
    [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff81487bf5>] secpath_dup+0x1b/0x5a
    [<ffffffff81487df9>] xfrm_input+0x64/0x484
    [<ffffffff8147eec3>] xfrm4_rcv_encap+0x17/0x19
    [<ffffffff8147eee4>] xfrm4_rcv+0x1f/0x21
    [<ffffffff8143b4e4>] ip_local_deliver_finish+0x170/0x22a
    [<ffffffff8143b6d6>] ip_local_deliver+0x46/0x78
    [<ffffffff8143b35d>] ip_rcv_finish+0x295/0x2ac
    [<ffffffff8143b936>] ip_rcv+0x22e/0x288
    [<ffffffff8140a65d>] __netif_receive_skb+0x5ba/0x65a
    [<ffffffff8140a898>] netif_receive_skb+0x47/0x78
    [<ffffffff8140b4c3>] napi_skb_finish+0x21/0x54
    [<ffffffff8140b5f3>] napi_gro_receive+0xfd/0x10a
    [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
    [<ffffffff8140ad48>] net_rx_action+0x9f/0x188

unreferenced object 0xffff880029b47580 (size 64):
  comm "softirq", pid 0, jiffies 4294926900 (age 143.946s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 88 07 2e 00 88 ff ff  ................
    00 00 00 00 2f 6f 72 67 2f 66 72 65 65 64 65 73  ..../org/freedes
  backtrace:
    [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff81487bf5>] secpath_dup+0x1b/0x5a
    [<ffffffff81487df9>] xfrm_input+0x64/0x484
    [<ffffffff814bbd74>] xfrm6_rcv_spi+0x19/0x1b
    [<ffffffff814bbd96>] xfrm6_rcv+0x20/0x22
    [<ffffffff814960c7>] ip6_input_finish+0x203/0x31b
    [<ffffffff81496546>] ip6_input+0x1e/0x50
    [<ffffffff81496244>] ip6_rcv_finish+0x65/0x69
    [<ffffffff814964c7>] ipv6_rcv+0x27f/0x2e0
    [<ffffffff8140a65d>] __netif_receive_skb+0x5ba/0x65a
    [<ffffffff8140a898>] netif_receive_skb+0x47/0x78
    [<ffffffff8140b4c3>] napi_skb_finish+0x21/0x54
    [<ffffffff8140b5f3>] napi_gro_receive+0xfd/0x10a
    [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
    [<ffffffff8140ad48>] net_rx_action+0x9f/0x188

I've grepped for "/org/free" specifically and, sure enough, the same
scraps of data seem to be in some of the (varied) dumps there.


-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-21 22:58                     ` Mike Kazantsev
@ 2012-10-22  8:15                         ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22  8:15 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 04:58 +0600, Mike Kazantsev wrote:

> I've grepped for "/org/free" specifically and sure enough, same scraps
> of data seem to be in some of the (varied) dumps there.

Content is not meaningful, as we don't initialize it.
So you see the previous content.

Could you try the following:

diff --git a/net/core/dev.c b/net/core/dev.c
index 09cb3f6..a903cca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2977,6 +2977,9 @@ int netif_rx(struct sk_buff *skb)
 {
 	int ret;
 
+#ifdef CONFIG_XFRM
+	WARN_ON_ONCE(skb->sp);
+#endif
 	/* if netpoll wants it, pretend we never saw it */
 	if (netpoll_rx(skb))
 		return NET_RX_DROP;
@@ -3388,6 +3391,9 @@ out:
  */
 int netif_receive_skb(struct sk_buff *skb)
 {
+#ifdef CONFIG_XFRM
+	WARN_ON_ONCE(skb->sp);
+#endif
 	net_timestamp_check(netdev_tstamp_prequeue, skb);
 
 	if (skb_defer_rx_timestamp(skb))
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index ab2bb42..5930e91 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -29,11 +29,10 @@ struct sec_path *secpath_dup(struct sec_path *src)
 {
 	struct sec_path *sp;
 
-	sp = kmem_cache_alloc(secpath_cachep, GFP_ATOMIC);
+	sp = kmem_cache_zalloc(secpath_cachep, GFP_ATOMIC);
 	if (!sp)
 		return NULL;
 
-	sp->len = 0;
 	if (src) {
 		int i;
 

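The difference between the two allocation calls in the last hunk can be
pictured with a userspace sketch. This is a toy one-slot "cache", not the
slab allocator; toy_cache, toy_alloc, toy_zalloc and toy_free are
hypothetical names. It only models the point made above: a recycled
object keeps whatever its previous user wrote unless it is cleared on
allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE 56  /* same size kmemleak reports for the leaked objects */

/* One-slot toy cache: free keeps the memory, alloc hands it back. */
struct toy_cache {
	unsigned char slot[OBJ_SIZE];
	int in_use;
};

/* Plain alloc: the object comes back with its previous content. */
static void *toy_alloc(struct toy_cache *c)
{
	if (c->in_use)
		return NULL;
	c->in_use = 1;
	return c->slot;
}

/* Zeroing alloc: same object, but cleared before it is returned. */
static void *toy_zalloc(struct toy_cache *c)
{
	void *obj = toy_alloc(c);

	if (obj)
		memset(obj, 0, OBJ_SIZE);
	return obj;
}

/* Free: mark the slot reusable, leave its bytes in place. */
static void toy_free(struct toy_cache *c, void *obj)
{
	assert(obj == (void *)c->slot && c->in_use);
	c->in_use = 0;
}
```

Writing a string into an object, freeing it and calling toy_alloc() again
returns the same bytes; calling toy_zalloc() instead returns all zeroes.
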


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22  8:15                         ` Eric Dumazet
  (?)
@ 2012-10-22 12:06                         ` Mike Kazantsev
  2012-10-22 15:16                             ` Eric Dumazet
  -1 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-22 12:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]

On Mon, 22 Oct 2012 10:15:43 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Mon, 2012-10-22 at 04:58 +0600, Mike Kazantsev wrote:
> 
> > I've grepped for "/org/free" specifically and sure enough, same scraps
> > of data seem to be in some of the (varied) dumps there.
> 
> Content is not meaningful, as we dont initialize it.
> So you see previous content.
> 
> Could you try the following :
> 
...

With this patch on top of v3.7-rc2 (w/o patches from your previous
mail), the leak still seems to be present.

If I understand correctly, WARN_ON_ONCE should've produced some output
in dmesg when the conditions passed to it were met.

They don't appear to be, as the only output in dmesg during loading of
ipsec-related modules (I think openswan probes them manually) is still
"AVX instructions are not detected" (can be seen in tty on boot), and
the only post-boot dmesg output (incl. while the leaks are happening)
is from kmemleak ("kmemleak: ... new suspected memory leaks").

Looks like kmem_cache_zalloc got rid of the content, though traces
still report it as "kmem_cache_alloc", but I guess that's because of
its "inline" nature.

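For reference, the "warn once" behaviour described above can be sketched
in userspace like this. TOY_WARN_ON_ONCE, toy_warn_count and toy_receive
are hypothetical names, and the macro is a simplification: it only models
"print the first time the condition is true at this call site, stay
silent afterwards", not the rest of what the kernel macro does.

```c
#include <assert.h>
#include <stdio.h>

static int toy_warn_count;  /* number of warnings actually printed */

/* Each expansion gets its own static flag, so every call site
 * warns at most once, and only if its condition is ever true. */
#define TOY_WARN_ON_ONCE(cond)                                  \
	do {                                                    \
		static int warned;                              \
		if ((cond) && !warned) {                        \
			warned = 1;                             \
			toy_warn_count++;                       \
			fprintf(stderr, "WARNING: %s:%d: %s\n", \
				__FILE__, __LINE__, #cond);     \
		}                                               \
	} while (0)

/* Stand-in for a receive entry point; sp plays the role of skb->sp. */
static void toy_receive(const void *sp)
{
	TOY_WARN_ON_ONCE(sp != NULL);
}
```

Calling toy_receive(NULL) any number of times prints nothing; the first
call with a non-NULL pointer prints one line, and later calls stay
silent. So no output at all would mean the condition was never true at
those two entry points.
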

-- 
Mike Kazantsev // fraggod.net

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22 12:06                         ` Mike Kazantsev
@ 2012-10-22 15:16                             ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22 15:16 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 18:06 +0600, Mike Kazantsev wrote:
> On Mon, 22 Oct 2012 10:15:43 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > On Mon, 2012-10-22 at 04:58 +0600, Mike Kazantsev wrote:
> > 
> > > I've grepped for "/org/free" specifically and sure enough, same scraps
> > > of data seem to be in some of the (varied) dumps there.
> > 
> > Content is not meaningful, as we dont initialize it.
> > So you see previous content.
> > 
> > Could you try the following :
> > 
> ...
> 
> With this patch on top of v3.7-rc2 (w/o patches from your previous
> mail), leak seem to be still present.

OK, I believe I found the bug in IPv4 defrag / IPv6 reasm.

Please test the following patch.

Thanks!

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 448e685..0a52771 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -48,6 +48,7 @@
 #include <linux/inet.h>
 #include <linux/netfilter_ipv4.h>
 #include <net/inet_ecn.h>
+#include <net/xfrm.h>
 
 /* NOTE. Logic of IP defragmentation is parallel to corresponding IPv6
  * code now. If you change something here, _PLEASE_ update ipv6/reassembly.c
@@ -634,6 +635,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 		else if (head->ip_summed == CHECKSUM_COMPLETE)
 			head->csum = csum_add(head->csum, fp->csum);
 
+		secpath_reset(fp);
 		if (skb_try_coalesce(head, fp, &headstolen, &delta)) {
 			kfree_skb_partial(fp, headstolen);
 		} else {
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index da8a4e3..4fcc463 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -55,6 +55,7 @@
 #include <net/ndisc.h>
 #include <net/addrconf.h>
 #include <net/inet_frag.h>
+#include <net/xfrm.h>
 
 struct ip6frag_skb_cb
 {
@@ -456,6 +457,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 		else if (head->ip_summed == CHECKSUM_COMPLETE)
 			head->csum = csum_add(head->csum, fp->csum);
 
+		secpath_reset(fp);
 		if (skb_try_coalesce(head, fp, &headstolen, &delta)) {
 			kfree_skb_partial(fp, headstolen);
 		} else {
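
One way to picture what the added secpath_reset(fp) calls guard against
is the userspace sketch below. It rests on an assumption that is not
spelled out in the thread: that some free path releases the fragment
itself without dropping the sec_path reference it still holds. All toy_*
names are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_secpath { int refcnt; };
struct toy_skb { struct toy_secpath *sp; };

static void toy_secpath_get(struct toy_secpath *sp)
{
	sp->refcnt++;
}

static void toy_secpath_put(struct toy_secpath *sp)
{
	assert(sp->refcnt > 0);
	sp->refcnt--;
}

/* Models the reset: drop the reference, then clear the pointer. */
static void toy_secpath_reset(struct toy_skb *skb)
{
	if (skb->sp)
		toy_secpath_put(skb->sp);
	skb->sp = NULL;
}

/* ASSUMPTION: a free path that discards the skb without looking at
 * skb->sp; the pointer is lost, the reference is never dropped. */
static void toy_free_skb_only(struct toy_skb *skb)
{
	memset(skb, 0x6b, sizeof(*skb));
}

/* Merge one fragment holding a reference on sp; return the reference
 * count left behind once the fragment is gone. */
static int toy_coalesce_fragment(struct toy_secpath *sp, int reset_first)
{
	struct toy_skb fp = { .sp = sp };

	toy_secpath_get(sp);
	if (reset_first)
		toy_secpath_reset(&fp);
	toy_free_skb_only(&fp);
	return sp->refcnt;
}
```

With a fresh toy_secpath, toy_coalesce_fragment(&sp, 1) returns 0, while
toy_coalesce_fragment(&sp, 0) returns 1: one reference per merged
fragment is never released, which is the kind of steady growth a leak
detector would report.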

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22 15:16                             ` Eric Dumazet
@ 2012-10-22 15:22                               ` Eric Dumazet
  -1 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22 15:22 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 17:16 +0200, Eric Dumazet wrote:

> OK, I believe I found the bug in IPv4 defrag / IPv6 reasm
> 
> Please test the following patch.
> 
> Thanks !

I'll send a more generic patch in a few minutes, changing
kfree_skb_partial() to call skb_release_head_state()





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22 15:22                               ` Eric Dumazet
@ 2012-10-22 15:28                                 ` Eric Dumazet
  -1 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22 15:28 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 17:22 +0200, Eric Dumazet wrote:
> On Mon, 2012-10-22 at 17:16 +0200, Eric Dumazet wrote:
> 
> > OK, I believe I found the bug in IPv4 defrag / IPv6 reasm
> > 
> > Please test the following patch.
> > 
> > Thanks !
> 
> I'll send a more generic patch in a few minutes, changing
> kfree_skb_partial() to call skb_release_head_state()
> 

Here it is :

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..4007c14 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3379,10 +3379,12 @@ EXPORT_SYMBOL(__skb_warn_lro_forwarding);
 
 void kfree_skb_partial(struct sk_buff *skb, bool head_stolen)
 {
-	if (head_stolen)
+	if (head_stolen) {
+		skb_release_head_state(skb);
 		kmem_cache_free(skbuff_head_cache, skb);
-	else
+	} else {
 		__kfree_skb(skb);
+	}
 }
 EXPORT_SYMBOL(kfree_skb_partial);
 


* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22 15:28                                 ` Eric Dumazet
  (?)
@ 2012-10-22 16:59                                 ` Mike Kazantsev
  2012-10-22 17:24                                     ` Eric Dumazet
  -1 siblings, 1 reply; 29+ messages in thread
From: Mike Kazantsev @ 2012-10-22 16:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul Moore, netdev, linux-mm

On Mon, 22 Oct 2012 17:28:02 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Mon, 2012-10-22 at 17:22 +0200, Eric Dumazet wrote:
> > On Mon, 2012-10-22 at 17:16 +0200, Eric Dumazet wrote:
> > 
> > > OK, I believe I found the bug in IPv4 defrag / IPv6 reasm
> > > 
> > > Please test the following patch.
> > > 
> > > Thanks !
> > 
> > I'll send a more generic patch in a few minutes, changing
> > kfree_skb_partial() to call skb_release_head_state()
> > 
> 
> Here it is :
> 
...

The problem is indeed gone in v3.7-rc2 with the proposed generic patch. I
didn't read the mail in time to test the first one, but I guess that's
not relevant now that the latter one works.

Thank you for taking the time to look into the problem and actually
fix it.

I'm unclear about the policies in place on the matter, but I think this
patch might be a good candidate to backport into 3.5 and 3.6 kernels,
because they seem to suffer from the issue as well.


-- 
Mike Kazantsev // fraggod.net

* Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels
  2012-10-22 16:59                                 ` Mike Kazantsev
@ 2012-10-22 17:24                                     ` Eric Dumazet
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22 17:24 UTC (permalink / raw)
  To: Mike Kazantsev; +Cc: Paul Moore, netdev, linux-mm

On Mon, 2012-10-22 at 22:59 +0600, Mike Kazantsev wrote:
> On Mon, 22 Oct 2012 17:28:02 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > On Mon, 2012-10-22 at 17:22 +0200, Eric Dumazet wrote:
> > > On Mon, 2012-10-22 at 17:16 +0200, Eric Dumazet wrote:
> > > 
> > > > OK, I believe I found the bug in IPv4 defrag / IPv6 reasm
> > > > 
> > > > Please test the following patch.
> > > > 
> > > > Thanks !
> > > 
> > > I'll send a more generic patch in a few minutes, changing
> > > kfree_skb_partial() to call skb_release_head_state()
> > > 
> > 
> > Here it is :
> > 
> ...
> 
> The problem is indeed gone in v3.7-rc2 with the proposed generic patch. I
> didn't read the mail in time to test the first one, but I guess that's
> not relevant now that the latter one works.
> 
> Thank you for taking the time to look into the problem and actually
> fix it.
> 
> I'm unclear about the policies in place on the matter, but I think this
> patch might be a good candidate to backport into 3.5 and 3.6 kernels,
> because they seem to suffer from the issue as well.

Thanks a lot Mike for your help.

Don't worry, I'll submit an official patch with details and all credits.

David Miller will forward it to the stable teams.

Thanks !

* [PATCH] net: fix secpath kmemleak
  2012-10-22 17:24                                     ` Eric Dumazet
@ 2012-10-22 19:03                                       ` Eric Dumazet
  -1 siblings, 0 replies; 29+ messages in thread
From: Eric Dumazet @ 2012-10-22 19:03 UTC (permalink / raw)
  To: Mike Kazantsev, David Miller; +Cc: Paul Moore, netdev, linux-mm

From: Eric Dumazet <edumazet@google.com>

Mike Kazantsev found that 3.5 and later kernels were leaking memory,
and tracked the faulty commit to a1c7fff7e18f59e (net:
netdev_alloc_skb() use build_skb())

While this commit seems fine, it uncovered a bug introduced
in commit bad43ca8325 (net: introduce skb_try_coalesce()), in function
kfree_skb_partial():

If the head is stolen, we free the sk_buff
without dropping its references on the secpath (skb->sp).

So IPsec + IP defrag/reassembly (using skb coalescing), or
TCP coalescing, could leak secpath objects.

Fix this bug by calling skb_release_head_state(skb) to properly
release all possible references to linked objects.

Reported-by: Mike Kazantsev <mk.fraggod@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-by: Mike Kazantsev <mk.fraggod@gmail.com>
Tested-by: Mike Kazantsev <mk.fraggod@gmail.com>
---
It seems the TCP stack could release secpath references immediately instead
of waiting until the skbs are consumed; that will be a follow-up patch.

 net/core/skbuff.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..4007c14 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3379,10 +3379,12 @@ EXPORT_SYMBOL(__skb_warn_lro_forwarding);
 
 void kfree_skb_partial(struct sk_buff *skb, bool head_stolen)
 {
-	if (head_stolen)
+	if (head_stolen) {
+		skb_release_head_state(skb);
 		kmem_cache_free(skbuff_head_cache, skb);
-	else
+	} else {
 		__kfree_skb(skb);
+	}
 }
 EXPORT_SYMBOL(kfree_skb_partial);
 



* Re: [PATCH] net: fix secpath kmemleak
  2012-10-22 19:03                                       ` Eric Dumazet
  (?)
@ 2012-10-22 19:17                                       ` David Miller
  -1 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2012-10-22 19:17 UTC (permalink / raw)
  To: eric.dumazet; +Cc: mk.fraggod, paul, netdev, linux-mm

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 22 Oct 2012 21:03:40 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Mike Kazantsev found that 3.5 and later kernels were leaking memory,
> and tracked the faulty commit to a1c7fff7e18f59e (net:
> netdev_alloc_skb() use build_skb())
> 
> While this commit seems fine, it uncovered a bug introduced
> in commit bad43ca8325 (net: introduce skb_try_coalesce()), in function
> kfree_skb_partial():
> 
> If the head is stolen, we free the sk_buff
> without dropping its references on the secpath (skb->sp).
> 
> So IPsec + IP defrag/reassembly (using skb coalescing), or
> TCP coalescing, could leak secpath objects.
> 
> Fix this bug by calling skb_release_head_state(skb) to properly
> release all possible references to linked objects.
> 
> Reported-by: Mike Kazantsev <mk.fraggod@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Bisected-by: Mike Kazantsev <mk.fraggod@gmail.com>
> Tested-by: Mike Kazantsev <mk.fraggod@gmail.com>

Applied and queued up for -stable, thanks!

> It seems the TCP stack could release secpath references immediately instead
> of waiting until the skbs are consumed; that will be a follow-up patch.

Indeed.

Thread overview: 29+ messages
2012-10-19 14:50 PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels Mike Kazantsev
2012-10-19 17:36 ` Mike Kazantsev
2012-10-20 12:42   ` Paul Moore
2012-10-20 14:49     ` Mike Kazantsev
2012-10-20 22:45       ` Mike Kazantsev
2012-10-21  0:24         ` Mike Kazantsev
2012-10-21 13:29           ` Eric Dumazet
2012-10-21 13:29             ` Eric Dumazet
2012-10-21 13:57             ` Mike Kazantsev
2012-10-21 18:43               ` Mike Kazantsev
2012-10-21 19:51                 ` Mike Kazantsev
2012-10-21 21:47                   ` Eric Dumazet
2012-10-21 21:47                     ` Eric Dumazet
2012-10-21 22:58                     ` Mike Kazantsev
2012-10-22  8:15                       ` Eric Dumazet
2012-10-22  8:15                         ` Eric Dumazet
2012-10-22 12:06                         ` Mike Kazantsev
2012-10-22 15:16                           ` Eric Dumazet
2012-10-22 15:16                             ` Eric Dumazet
2012-10-22 15:22                             ` Eric Dumazet
2012-10-22 15:22                               ` Eric Dumazet
2012-10-22 15:28                               ` Eric Dumazet
2012-10-22 15:28                                 ` Eric Dumazet
2012-10-22 16:59                                 ` Mike Kazantsev
2012-10-22 17:24                                   ` Eric Dumazet
2012-10-22 17:24                                     ` Eric Dumazet
2012-10-22 19:03                                     ` [PATCH] net: fix secpath kmemleak Eric Dumazet
2012-10-22 19:03                                       ` Eric Dumazet
2012-10-22 19:17                                       ` David Miller
