All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
       [not found] <8511913.uMAmUdIO30@eistomin.edss.local>
@ 2013-05-17  8:59 ` Wei Liu
  2013-05-17  9:08   ` Jan Beulich
                     ` (2 more replies)
  0 siblings, 3 replies; 85+ messages in thread
From: Wei Liu @ 2013-05-17  8:59 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: wei.liu2, xen-devel

Moving discussion to Xen-devel

On Thu, May 16, 2013 at 10:29:56PM +0300, Eugene Istomin wrote:
> Hello,
> 
> I tried the 3.9.2 kernel with Xen 4.2.2/4.3rc1, and both variants lead to 
> this error under network-intensive load (such as iperf, or 100 parallel 
> nginx requests to 1M files):
> 

It would be more helpful if you could provide info on your configuration
(Dom0 and DomU), your workload, and how to reproduce the bug.

I run iperf and NFS to test Xen networking, but I have never seen a crash
like this myself.

> BUG: unable to handle kernel paging request at ffff8800795a3000
> [   60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> [   60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE 
> 80100000795a3065
> [   60.247004] Oops: 0003 [#1] SMP 
> [   60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul 
> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
> aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua 
> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
> ata_piix
> [   60.247144] CPU 0 
> [   60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen 
> #1  
> [   60.247179] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
> netif_poll+0x49c/0xe80 [xennet]
> ...

Could you provide the full stack trace? AFAICT there is no netif_poll in Xen
netfront/back.

Presumably this is Dom0 log? (from the domctl module)
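For what it's worth, the faulting PTE quoted above can be decoded mechanically. A sketch, assuming the standard x86-64 PTE layout (the value is split into 32-bit halves to stay within portable shell arithmetic; bits 52-62 are software-defined and ignored here):

```shell
#!/bin/sh
# Decode PTE 80100000795a3065 from the oops above (standard x86-64 layout).
pte_hi=0x80100000
pte_lo=0x795a3065
present=$((  pte_lo        & 1 ))   # bit 0: page present
writable=$(( (pte_lo >> 1) & 1 ))   # bit 1: read/write
nx=$((      (pte_hi >> 31) & 1 ))   # bit 63: no-execute
# bits 12-51 hold the (machine) frame number
mfn=$(( ((pte_hi & 0xfffff) << 20) | (pte_lo >> 12) ))
printf 'present=%d writable=%d nx=%d mfn=0x%x\n' "$present" "$writable" "$nx" "$mfn"
```

Here writable=0, i.e. the page is mapped read-only. Combined with the error code in "Oops: 0003" (bit 0 set = protection fault on a present page, bit 1 set = write access), the fault is a write to a read-only mapping, which is worth keeping in mind once the full trace arrives.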

> 
> We have a couple of production hypervisors on 3.4 kernels with a 
> high-throughput internal network (VM-to-VM in one Dom0); iperf on them 
> is working well:
> [  3]  0.0- 2.0 sec  3357 MBytes  14080 Mbits/sec
> [  3]  2.0- 4.0 sec  2880 MBytes  12077 Mbits/sec
> [  3]  4.0- 6.0 sec  2909 MBytes  12202 Mbits/sec
> [  3]  6.0- 8.0 sec  2552 MBytes  10702 Mbits/sec
> [  3]  8.0-10.0 sec  3616 MBytes  15166 Mbits/sec
> [  3] 10.0-12.0 sec  3415 MBytes  14324 Mbits/sec
> 
> 
> Seems like a kernel bug - is this related to one of these fixes in 
> linux-next, or do I need to create a new bug report?
> 
> 1) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=1aaf6d3d3d1e95f4be07e32dd84aa1c93855fbbd
> 2) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9ecd1a75d977e2e8c48139c7d3efed183f898d94
> 3) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f
> 4) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03393fd5cc2b6cdeec32b704ecba64dbb0feae3c
> 5) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=59ccb4ebbc35e36a3c143f2d1355deb75c2e628f
> 

At first glance, I don't think these patches will fix your problem.


Wei.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17  8:59 ` [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Wei Liu
@ 2013-05-17  9:08   ` Jan Beulich
  2013-05-17  9:37   ` Eugene Istomin
  2013-07-04 13:43   ` Dion Kant
  2 siblings, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-05-17  9:08 UTC (permalink / raw)
  To: wei.liu2; +Cc: Eugene Istomin, xen-devel

>>> On 17.05.13 at 10:59, Wei Liu <wei.liu2@citrix.com> wrote:
>> BUG: unable to handle kernel paging request at ffff8800795a3000
>> [   60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
>> [   60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE 
>> 80100000795a3065
>> [   60.247004] Oops: 0003 [#1] SMP 
>> [   60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul 
>> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
>> aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua 
>> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
>> ata_piix
>> [   60.247144] CPU 0 
>> [   60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen 
>> #1  
>> [   60.247179] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
>> netif_poll+0x49c/0xe80 [xennet]
>> ...
> 
> Could you provide the full stack trace? AFAICT there is no netif_poll in Xen
> netfront/back.

He failed to mention that this is an openSUSE kernel, i.e. not
pv-ops but forward ported.

> Presumably this is Dom0 log? (from the domctl module)

Indeed.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17  8:59 ` [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Wei Liu
  2013-05-17  9:08   ` Jan Beulich
@ 2013-05-17  9:37   ` Eugene Istomin
  2013-05-17  9:40     ` Eugene Istomin
  2013-05-17 10:45     ` Jan Beulich
  2013-07-04 13:43   ` Dion Kant
  2 siblings, 2 replies; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17  9:37 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5491 bytes --]

Ye, this is OpenSUSE 12.3 kernel (both Dom0/DomU is kernel 
xen-3.9.2-8.1.g04040b9.x86_64) from 
http://download.opensuse.org/repositories/Kernel:/HEAD/standard


First VM:

template:/home/local # iperf -s 
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.251.2.202 port 5001 connected with 10.251.2.201 port 
38196


##
#(after iperf -c 10.251.2.202 -i 2 -f m from the second VM)
##

[   38.447860] BUG: unable to handle kernel paging request at 
ffff88007928b000
[   38.447898] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[   38.447927] PGD a83067 PUD a93067 PMD 7fc28067 PTE 
801000007928b065
[   38.447955] Oops: 0003 [#1] SMP 
[   38.447970] Modules linked in: af_packet hwmon domctl crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
aes_x86_64 xts gf128mul joydev autofs4 scsi_dh_emc scsi_dh_alua 
scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
ata_piix
[   38.448091] CPU 0 
[   38.448100] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen 
#1  
[   38.448125] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
netif_poll+0x49c/0xe80 [xennet]
[   38.448158] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[   38.448176] RAX: ffff88007da68cd0 RBX: ffff88007928aec0 RCX: 
ffff88007928b000




This trace is visible only via xl console; the DomUs had no records in their 
logs. You are right, maybe this is a Dom0 trace.

Here is xl-test1.log from Dom0:

libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1944130 
wpath=@releaseDomain token=3/0: event epath=@releaseDomain
libxl: debug: libxl.c:998:domain_death_xswatch_callback: 
[evg=0x19435e0:2] from domid=2 nentries=1 rc=1
libxl: debug: libxl.c:1009:domain_death_xswatch_callback: 
[evg=0x19435e0:2]   got=domaininfos[0] got->domain=2
libxl: debug: libxl.c:1036:domain_death_xswatch_callback:  exists 
shutdown_reported=0 dominf.flags=30004
libxl: debug: libxl.c:1048:domain_death_xswatch_callback:  shutdown 
reporting
libxl: debug: libxl.c:1002:domain_death_xswatch_callback: [evg=0] all 
reported
libxl: debug: libxl.c:1066:domain_death_xswatch_callback: domain death 
search done
Domain 2 has shut down, reason code 3 0x3
Action for shutdown reason code 3 is restart
Domain 2 needs to be cleaned up: destroying the domain
libxl: debug: libxl.c:1250:libxl_domain_destroy: ao 0x19438d0: create: 
how=(nil) callback=(nil) poller=0x19435a0
libxl: debug: libxl_dm.c:1266:libxl__destroy_device_model: Device Model 
signaled


-- 
*/Best regards,/*
/Eugene Istomin/


> Moving discussion to Xen-devel
> 
> On Thu, May 16, 2013 at 10:29:56PM +0300, Eugene Istomin wrote:
> > Hello,
> > 
> > I tried the 3.9.2 kernel with Xen 4.2.2/4.3rc1, and both variants lead
> > to this error under network-intensive load (such as iperf, or 100
> > parallel nginx requests to 1M files):
> It would be more helpful if you could provide info on your configuration
> (Dom0 and DomU), your workload, and how to reproduce the bug.
> 
> I run iperf and NFS to test Xen networking, but I have never seen a crash
> like this myself.
> 
> > BUG: unable to handle kernel paging request at ffff8800795a3000
> > [   60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> > [   60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE
> > 80100000795a3065
> > [   60.247004] Oops: 0003 [#1] SMP
> > [   60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul
> > crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw
> > aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua
> > scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic
> > ata_piix
> > [   60.247144] CPU 0
> > [   60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen
> > #1
> > [   60.247179] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>]
> > netif_poll+0x49c/0xe80 [xennet]
> > ...
> 
> Could you provide the full stack trace? AFAICT there is no netif_poll in Xen
> netfront/back.
> 
> Presumably this is Dom0 log? (from the domctl module)
> 
> > We have a couple of production hypervisors on 3.4 kernels with a
> > high-throughput internal network (VM-to-VM in one Dom0); iperf on them
> > is working well:
> > [  3]  0.0- 2.0 sec  3357 MBytes  14080 Mbits/sec
> > [  3]  2.0- 4.0 sec  2880 MBytes  12077 Mbits/sec
> > [  3]  4.0- 6.0 sec  2909 MBytes  12202 Mbits/sec
> > [  3]  6.0- 8.0 sec  2552 MBytes  10702 Mbits/sec
> > [  3]  8.0-10.0 sec  3616 MBytes  15166 Mbits/sec
> > [  3] 10.0-12.0 sec  3415 MBytes  14324 Mbits/sec
> > 
> > 
> > Seems like a kernel bug - is this related to one of these fixes in
> > linux-next, or do I need to create a new bug report?
> > 
> > 1) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=1aaf6d3d3d1e95f4be07e32dd84aa1c93855fbbd
> > 2) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9ecd1a75d977e2e8c48139c7d3efed183f898d94
> > 3) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f
> > 4) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03393fd5cc2b6cdeec32b704ecba64dbb0feae3c
> > 5) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=59ccb4ebbc35e36a3c143f2d1355deb75c2e628f
> At first glance, I don't think these patches will fix your problem.
> 
> 

[-- Attachment #1.2: Type: text/html, Size: 25466 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17  9:37   ` Eugene Istomin
@ 2013-05-17  9:40     ` Eugene Istomin
  2013-05-17 10:45     ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17  9:40 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5112 bytes --]

>> Ye, this is OpenSUSE 12.3 kernel (both Dom0/DomU is
>> kernel-xen-3.9.2-8.1.g04040b9.x86_64) from
>> http://download.opensuse.org/repositories/Kernel:/HEAD/standard

Sorry, I misspelled it:

I tested the kernel-xen-3.9.2-8.1.g04040b9.x86_64 and 
kernel-xen-3.9.2-4.756ee56.x86_64 kernels - the results are the same.
-- 
*/Best regards,/*
/Eugene Istomin/




[-- Attachment #1.2: Type: text/html, Size: 28904 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17  9:37   ` Eugene Istomin
  2013-05-17  9:40     ` Eugene Istomin
@ 2013-05-17 10:45     ` Jan Beulich
  2013-05-17 12:03       ` Eugene Istomin
  1 sibling, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-05-17 10:45 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, xen-devel

>>> On 17.05.13 at 11:37, Eugene Istomin <e.istomin@edss.ee> wrote:
> [   38.447860] BUG: unable to handle kernel paging request at 
> ffff88007928b000
> [   38.447898] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> [   38.447927] PGD a83067 PUD a93067 PMD 7fc28067 PTE 
> 801000007928b065
> [   38.447955] Oops: 0003 [#1] SMP 
> [   38.447970] Modules linked in: af_packet hwmon domctl crc32_pclmul 
> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
> aes_x86_64 xts gf128mul joydev autofs4 scsi_dh_emc scsi_dh_alua 
> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
> ata_piix
> [   38.448091] CPU 0 
> [   38.448100] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen 
> #1  
> [   38.448125] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
> netif_poll+0x49c/0xe80 [xennet]
> [   38.448158] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
> [   38.448176] RAX: ffff88007da68cd0 RBX: ffff88007928aec0 RCX: 
> ffff88007928b000
> 
> 
> 
> 
> This trace is visible only via xl console; the DomUs had no records in logs.
> You are right, maybe this is a Dom0 trace.

Output of "xl console" can hardly be Dom0 kernel output. The fact
that various modules (most prominently domctl) are present here which
would normally be expected only in Dom0 may be confusing us. Init
scripts might be loading them for no reason...

But - you still didn't give us a full guest kernel log, so we're still left in
the dark as to what may have gone wrong earlier, and what the call
stack is that got us here. Nor did you tell us what kernel version you
used last without seeing that issue.

Plus, should in depth analysis of the stack turn out necessary, using
a random build will make this more complicated than necessary (as
those builds vanish as newer ones appear).
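For the next run, one way to make sure the complete guest trace is captured is to keep a console capture running before starting the test. A sketch (`xl console` is standard; the domain name and log path are just examples):

```shell
#!/bin/sh
# Keep a DomU console capture running so the full oops survives the crash.
# Domain name and log path are examples - adjust to taste.
dom="${1:-test1}"
log="/var/log/xen/console-$dom.log"
if command -v xl >/dev/null 2>&1; then
    # Attach to the guest console and append everything to the log file.
    xl console "$dom" | tee -a "$log"
fi
echo "console log: $log"
```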

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 10:45     ` Jan Beulich
@ 2013-05-17 12:03       ` Eugene Istomin
  2013-05-17 12:25         ` Jan Beulich
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17 12:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5377 bytes --]

All of the kernels I mentioned before are official openSUSE kernels (3.4 from 
http://download.opensuse.org/repositories/Kernel:/openSUSE-12.2/standard ,
3.9 from http://download.opensuse.org/repositories/Kernel:/stable/standard ).

I also tried kernels from 
http://download.opensuse.org/repositories/Kernel:/HEAD/standard to 
double-check whether the problem is kernel-related or specific to the 
openSUSE Xen build.

We use 3.4 kernels in production; 3.4.37-1-xen and 3.4.45-2 are working 
without these problems. A couple of days ago I started to test 3.9.


*VM configuration:*
name="test2"
memory = 2048
vcpus = 4
cpu_weight = 256
maxvcpus = 8
maxmem = 4096
superpages = 1
bootloader = "pygrub"
bootargs = "--entry=xvda1:/boot/vmlinuz-xen,/boot/initrd-xen"
vfb = [ 'type=vnc,vnclisten=0.0.0.0,vncdisplay=2,vncpasswd=vnc' ]
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"

disk = [
"tap:raw:/storage/images/manual/test2.raw,xvda,w",
"tap:raw:/storage/images/manual/test2_storage.raw,xvdb,w",
]

vif = [
"script=/etc/xen/scripts/vif-ovs,mac=AA:00:AA:BB:02:01,bridge=vlannet.2512.9000",
]


We use a simple vif-to-OVS bash mapper and it works fine.
BTW, only the VM running the iperf server (-s) is affected; the VM running 
the iperf client works normally.
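To make the setup concrete, here is a minimal sketch of what such a vif-to-OVS hook can look like. This is hypothetical - the real script is site-specific; the variable handling is modeled on the stock vif-* scripts, and the bridge name is taken from the vif= line in the config above:

```shell
#!/bin/sh
# Hypothetical vif-to-OVS hook, modeled on the stock vif-* scripts.
# The toolstack invokes it with "online"/"offline" and passes the vif
# name and bridge through the environment.
vif="${vif:-vif2.0}"                   # e.g. vif<domid>.<devid>
bridge="${bridge:-vlannet.2512.9000}"  # bridge from the vif= line above
case "$1" in
  online)
    # Plug the backend interface into the OVS bridge and bring it up.
    ovs-vsctl --may-exist add-port "$bridge" "$vif"
    ip link set "$vif" up
    ;;
  offline)
    ovs-vsctl --if-exists del-port "$bridge" "$vif"
    ;;
esac
```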


Logs (test1 - iperf server, test2 - iperf client):

Dom0:
1) tailf /var/log/xen/xl-test1.log (started just before the bug) - xl-test1.log
2) tailf /var/log/xen/xl-test2.log (started just before the bug) - xl-test2.log
3) tailf /var/log/messages (started just before the bug) - messages.log
4) xl create log - xl_creation.log

DomU:
1) dmesg  - domu_dmesg


Do you need any kernel dumps? 
-- 
*/Best regards,/*
/Eugene Istomin/



> >>> On 17.05.13 at 11:37, Eugene Istomin <e.istomin@edss.ee> wrote:
> > [   38.447860] BUG: unable to handle kernel paging request at
> > ffff88007928b000
> > [   38.447898] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> > [   38.447927] PGD a83067 PUD a93067 PMD 7fc28067 PTE
> > 801000007928b065
> > [   38.447955] Oops: 0003 [#1] SMP
> > [   38.447970] Modules linked in: af_packet hwmon domctl crc32_pclmul
> > crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw
> > aes_x86_64 xts gf128mul joydev autofs4 scsi_dh_emc scsi_dh_alua
> > scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic
> > ata_piix
> > [   38.448091] CPU 0
> > [   38.448100] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen
> > #1
> > [   38.448125] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>]
> > netif_poll+0x49c/0xe80 [xennet]
> > [   38.448158] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
> > [   38.448176] RAX: ffff88007da68cd0 RBX: ffff88007928aec0 RCX:
> > ffff88007928b000
> > 
> > 
> > 
> > 
> > This trace is visible only via xl console; the DomUs had no records in logs.
> > You are right, maybe this is a Dom0 trace.
> 
> Output of "xl console" can hardly be Dom0 kernel output. The fact
> that various modules (most prominently domctl) are present here that
> would normally only be expected in Dom0 may be confusing us. Init
> scripts might be loading those for no reason...
> 
> But - you still didn't give us a full guest kernel log, so we're still left
> in the dark as to what may have gone wrong earlier, and what the call
> stack is that got us here. Nor did you tell us what kernel version you
> used last without seeing that issue.
> 
> Plus, should in depth analysis of the stack turn out necessary, using
> a random build will make this more complicated than necessary (as
> those builds vanish as newer ones appear).
> 
> Jan

[-- Attachment #1.2: Type: text/html, Size: 22837 bytes --]

[-- Attachment #2: domu_dmesg.log --]
[-- Type: text/x-log, Size: 15474 bytes --]

[    0.000000] BRK [0x00a83000, 0x00a83fff] PUD
[    0.000000] BRK [0x00a84000, 0x00a84fff] PMD
[    0.000000] BRK [0x00a85000, 0x00a92fff] PTE
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.9.2-4.756ee56-xen (geeko@buildhost) (gcc version 4.7.3 (SUSE Linux) ) #1 SMP Thu May 16 10:08:22 UTC 2013 (756ee56)
[    0.000000] Command line: root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314
[    0.000000] e820: Xen-provided physical RAM map:
[    0.000000] Xen: [mem 0x0000000000000000-0x00000000ffffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x80000000
[    0.000000] init_memory_mapping: [mem 0x7fe00000-0x7fffffff]
[    0.000000]  [mem 0x7fe00000-0x7fffffff] page 4k
[    0.000000] BRK [0x00a93000, 0x00a93fff] PGTABLE
[    0.000000] BRK [0x00a94000, 0x00a94fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x7c000000-0x7fdfffff]
[    0.000000]  [mem 0x7c000000-0x7fdfffff] page 4k
[    0.000000] BRK [0x00a95000, 0x00a95fff] PGTABLE
[    0.000000] BRK [0x00a96000, 0x00a96fff] PGTABLE
[    0.000000] BRK [0x00a97000, 0x00a97fff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x00000000-0x7bffffff]
[    0.000000]  [mem 0x00000000-0x7bffffff] page 4k
[    0.000000] init_memory_mapping: [mem 0x80000000-0xffffffff]
[    0.000000]  [mem 0x80000000-0xffffffff] page 4k
[    0.000000] RAMDISK: [mem 0x00baa000-0x0168efff]
[    0.000000] ACPI in unprivileged domain disabled
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00000000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00000000-0xffffffff]
[    0.000000] On node 0 totalpages: 1048576
[    0.000000] free_area_init_node: node 0, pgdat ffffffff8091d980, node_mem_map ffff88007c010000
[    0.000000]   DMA zone: 56 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 4096 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 14280 pages used for memmap
[    0.000000]   DMA32 zone: 1044480 pages, LIFO batch:31
[    0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:8 nr_node_ids:1
[    0.000000] PERCPU: Embedded 19 pages/cpu @ffff88007b400000 s47680 r8192 d21952 u262144
[    0.000000] pcpu-alloc: s47680 r8192 d21952 u262144 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 1 2 3 4 5 6 7 
[    0.000000] Swapping MFNs for PFN 93f and 7b407 (MFN 105f841 and 10e4e79)
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1034240
[    0.000000] Kernel command line: root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314
[    0.000000] bootsplash: silent mode.
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[    0.000000] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.000000] __ex_table already sorted, skipping sort
[    0.000000] allocated 16777216 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] Software IO TLB disabled
[    0.000000] Memory: 1978108k/4194304k available (5336k kernel code, 0k absent, 2216196k reserved, 4091k data, 440k init)
[    0.000000] Hierarchical RCU implementation.
[    0.000000]  RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000]  RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=8.
[    0.000000] nr_pirqs: 256
[    0.000000] NR_IRQS:67328 nr_irqs:2624 16
[    0.000000] Xen reported: 2666.846 MHz processor.
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [tty0] enabled
[    0.000000] console [xvc-1] enabled
[    0.080001] Calibrating delay using timer specific routine.. 5403.97 BogoMIPS (lpj=10807950)
[    0.080003] pid_max: default: 32768 minimum: 301
[    0.080026] Security Framework initialized
[    0.080036] AppArmor: AppArmor initialized
[    0.080044] Mount-cache hash table entries: 256
[    0.080165] Initializing cgroup subsys cpuacct
[    0.080167] Initializing cgroup subsys memory
[    0.080175] Initializing cgroup subsys devices
[    0.080176] Initializing cgroup subsys freezer
[    0.080178] Initializing cgroup subsys net_cls
[    0.080179] Initializing cgroup subsys blkio
[    0.080180] Initializing cgroup subsys perf_event
[    0.080218] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.080218] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[    0.080222] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
[    0.080222] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32
[    0.080222] tlb_flushall_shift: 6
[    0.110684] Brought up 1 CPUs
[    0.110786] devtmpfs: initialized
[    0.130801] RTC time: 165:165:165, date: 165/165/65
[    0.130838] NET: Registered protocol family 16
[    0.132005] SMP alternatives: switching to SMP code
[    0.164163] Brought up 4 CPUs
[    0.164376] PCI: Fatal: No config space access function found
[    0.164381] PCI: setting up Xen PCI frontend stub
[    0.164487] patch_vdso64: .altinstructions not found
[    0.164489] patch_vdsox32: .altinstructions not found
[    0.164816] bio: create slab <bio-0> at 0
[    0.164816] ACPI: Interpreter disabled.
[    0.164816] vgaarb: loaded
[    0.164816] suspend: event channel 15
[    0.164816] xen_mem: Initialising balloon driver.
[    0.172927] Unable to read sysrq code in control/sysrq
[    0.172927] SCSI subsystem initialized
[    0.172927] libata version 3.00 loaded.
[    0.172927] PCI: System does not support PCI
[    0.172927] PCI: System does not support PCI
[    0.172927] NetLabel: Initializing
[    0.172927] NetLabel:  domain hash size = 128
[    0.172927] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.172927] NetLabel:  unlabeled traffic allowed by default
[    0.172927] Switching to clocksource xen
[    0.173707] AppArmor: AppArmor Filesystem Enabled
[    0.173719] pnp: PnP ACPI: disabled
[    0.174653] NET: Registered protocol family 2
[    0.174798] TCP established hash table entries: 32768 (order: 7, 524288 bytes)
[    0.174868] TCP bind hash table entries: 32768 (order: 7, 524288 bytes)
[    0.174934] TCP: Hash tables configured (established 32768 bind 32768)
[    0.174955] TCP: reno registered
[    0.174958] UDP hash table entries: 2048 (order: 5, 131072 bytes)
[    0.174978] UDP-Lite hash table entries: 2048 (order: 5, 131072 bytes)
[    0.175053] NET: Registered protocol family 1
[    0.175059] PCI: CLS 32 bytes
[    0.175086] Unpacking initramfs...
[    0.182596] Freeing initrd memory: 11156k freed
[    0.186579] audit: initializing netlink socket (disabled)
[    0.186590] type=2000 audit(1368791454.184:1): initialized
[    0.216723] VFS: Disk quotas dquot_6.5.2
[    0.216748] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.216931] msgmni has been set to 4117
[    0.217165] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    0.217197] io scheduler noop registered (default)
[    0.217199] io scheduler deadline registered
[    0.217216] io scheduler cfq registered
[    0.217366] Non-volatile memory driver v1.3
[    0.217367] Linux agpgart interface v0.103
[    0.217440] Xen virtual console successfully installed as xvc0
[    0.219206] Console: switching to colour frame buffer device 100x37
[    0.222889] console [tty0] enabled
[    0.223852] input: Xen Virtual Keyboard as /devices/virtual/input/input0
[    0.223875] input: Xen Virtual Pointer as /devices/virtual/input/input1
[    0.226130] i8042: PNP: No PS/2 controller found. Probing ports directly.
[    1.239115] i8042: No controller found
[    1.239292] mousedev: PS/2 mouse device common for all mice
[    1.239410] ledtrig-cpu: registered to indicate activity on CPUs
[    1.239515] TCP: cubic registered
[    1.239516] IPv6: Loaded, but administratively disabled, reboot required to enable
[    1.239526] Key type dns_resolver registered
[    1.239629] registered taskstats version 1
[    1.239699]   Magic number: 1:252:3141
[    1.239715] XENBUS: Device with no driver: device/vbd/51712
[    1.239716] XENBUS: Device with no driver: device/vbd/51728
[    1.239717] XENBUS: Device with no driver: device/vif/0
[    1.239934] Freeing unused kernel memory: 440k freed
[    1.240070] Write protecting the kernel read-only data: 8912k
[    1.310915] Initialising virtual ethernet driver.
[    1.414401] blkfront: device/vbd/51712: ring-pages=1 nr_ents=32
[    1.421937] blkfront: device/vbd/51728: ring-pages=1 nr_ents=32
[    1.423590] xen-vbd: registered block device major 202
[    1.423628] blkfront: xvda: barrier or flush: disabled
[    1.425229]  xvda: xvda1 xvda2
[    1.426265] Setting capacity to 20971520
[    1.426270] xvda: detected capacity change from 0 to 10737418240
[    1.427111] blkfront: xvdb: barrier or flush: disabled
[    1.433071]  xvdb: unknown partition table
[    1.528825] hp_sw: device handler registered
[    1.531766] rdac: device handler registered
[    1.534773] alua: device handler registered
[    1.537735] emc: device handler registered
[    1.547178] systemd-udevd[111]: starting version 195
[    1.843895] EXT4-fs (xvda2): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[    1.991518] EXT4-fs (xvda2): re-mounted. Opts: acl,user_xattr
[    2.257074] systemd[1]: systemd 195 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ; suse)
[    2.362136] systemd[1]: Inserted module 'autofs4'
[    2.370140] systemd[1]: Set hostname to <template.edss.local>.
[    2.754679] systemd[1]: Expecting device dev-xvc0.device...
[    2.754708] systemd[1]: Starting Forward Password Requests to Wall Directory Watch.
[    2.754827] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[    2.754851] systemd[1]: Starting System Time Synchronized.
[    2.754871] systemd[1]: Reached target System Time Synchronized.
[    2.754886] systemd[1]: Starting Remote File Systems.
[    2.754904] systemd[1]: Reached target Remote File Systems.
[    2.754919] systemd[1]: Starting Syslog Socket.
[    2.754981] systemd[1]: Listening on Syslog Socket.
[    2.755007] systemd[1]: Starting Collect Read-Ahead Data...
[    2.756219] systemd[1]: Starting Replay Read-Ahead Data...
[    2.757217] systemd[1]: Starting Delayed Shutdown Socket.
[    2.757278] systemd[1]: Listening on Delayed Shutdown Socket.
[    2.757295] systemd[1]: Starting /dev/initctl Compatibility Named Pipe.
[    2.757349] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
[    2.757383] systemd[1]: Starting Dispatch Password Requests to Console Directory Watch.
[    2.757472] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[    2.757586] systemd[1]: Starting udev Kernel Socket.
[    2.757627] systemd[1]: Listening on udev Kernel Socket.
[    2.757724] systemd[1]: Starting udev Control Socket.
[    2.757778] systemd[1]: Listening on udev Control Socket.
[    2.757836] systemd[1]: Starting Arbitrary Executable File Formats File System Automount Point.
[    2.757990] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[    2.758010] systemd[1]: Starting Encrypted Volumes.
[    2.758033] systemd[1]: Reached target Encrypted Volumes.
[    2.758060] systemd[1]: Expecting device dev-disk-by\x2dlabel-swap.device...
[    2.758096] systemd[1]: Starting Journal Socket.
[    2.758201] systemd[1]: Listening on Journal Socket.
[    2.758248] systemd[1]: Mounting POSIX Message Queue File System...
[    2.759183] systemd[1]: Mounted Huge Pages File System.
[    2.759220] systemd[1]: Starting Create dynamic rule for /dev/root link...
[    2.760345] systemd[1]: Starting Journal Service...
[    2.761603] systemd[1]: Started Journal Service.
[    2.761805] systemd-readahead[208]: Bumped block_nr parameter of 202:0 to 20480. This is a temporary hack and should be removed one day.
[    2.830611] EXT4-fs (xvda2): re-mounted. Opts: acl,user_xattr
[    2.884995] systemd-udevd[232]: starting version 195
[    2.913594] systemd-journald[211]: Received SIGUSR1
[    2.919885] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[    2.919887] EDD information not available.
[    3.005770] type=1400 audit(1368791457.000:2): apparmor="STATUS" operation="profile_load" name="/{usr/,}bin/ping" pid=273 comm="apparmor_parser"
[    3.021044] type=1400 audit(1368791457.016:3): apparmor="STATUS" operation="profile_load" name="/sbin/klogd" pid=299 comm="apparmor_parser"
[    3.031577] type=1400 audit(1368791457.024:4): apparmor="STATUS" operation="profile_load" name="/sbin/syslog-ng" pid=329 comm="apparmor_parser"
[    3.043335] type=1400 audit(1368791457.036:5): apparmor="STATUS" operation="profile_load" name="/sbin/syslogd" pid=339 comm="apparmor_parser"
[    3.056589] type=1400 audit(1368791457.052:6): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2" pid=356 comm="apparmor_parser"
[    3.056854] type=1400 audit(1368791457.052:7): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//DEFAULT_URI" pid=356 comm="apparmor_parser"
[    3.057092] type=1400 audit(1368791457.052:8): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//HANDLING_UNTRUSTED_INPUT" pid=356 comm="apparmor_parser"
[    3.057308] type=1400 audit(1368791457.052:9): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//phpsysinfo" pid=356 comm="apparmor_parser"
[    3.065740] type=1400 audit(1368791457.060:10): apparmor="STATUS" operation="profile_load" name="/usr/lib/dovecot/deliver" pid=363 comm="apparmor_parser"
[    5.963850] Adding 1051644k swap on /dev/xvda1.  Priority:-1 extents:1 across:1051644k SS
[    6.007235] audit_printk_skb: 57 callbacks suppressed
[    6.007238] type=1400 audit(1368791460.000:30): apparmor="DENIED" operation="mknod" parent=1 profile="/usr/sbin/nscd" name="/var/run/nscd/netgroup" pid=468 comm="nscd" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
[    6.275196] type=1400 audit(1368791460.268:31): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/nscd" pid=468 comm="nscd" pid=468 comm="nscd" capability=36  capname="block_suspend"
[    6.612452] NET: Registered protocol family 17
[    9.077256] type=1400 audit(1368791463.072:32): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/nscd" pid=468 comm="nscd" pid=468 comm="nscd" capability=36  capname="block_suspend"
[    9.113208] type=1400 audit(1368791463.108:33): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/nscd" pid=468 comm="nscd" pid=468 comm="nscd" capability=36  capname="block_suspend"
[   36.008106] type=1400 audit(1368791490.004:34): apparmor="DENIED" operation="open" parent=1 profile="/usr/sbin/nscd" name="/proc/sys/vm/overcommit_memory" pid=472 comm="nscd" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[   69.172553] type=1400 audit(1368791523.168:35): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/nscd" pid=468 comm="nscd" pid=468 comm="nscd" capability=36  capname="block_suspend"

[-- Attachment #3: xl_creation.log --]
[-- Type: text/x-log, Size: 17483 bytes --]

Parsing config from /storage/xl/test1.xl
libxl: debug: libxl_create.c:1192:do_domain_create: ao 0x1f824c0: create: how=(nil) callback=(nil) poller=0x1f839f0
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=unknown
libxl: debug: libxl_device.c:175:disk_try_backend: Disk vdev=xvda, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:265:libxl__device_disk_set_backend: Disk vdev=xvda, using backend tap
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=unknown
libxl: debug: libxl_device.c:175:disk_try_backend: Disk vdev=xvdb, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:265:libxl__device_disk_set_backend: Disk vdev=xvdb, using backend tap
libxl: debug: libxl_create.c:694:initiate_domain_create: running bootloader
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=(null) spec.backend=tap
libxl: debug: libxl.c:2318:libxl__device_disk_local_initiate_attach: locally attaching tap disk /storage/images/manual/test1.raw directly (ie without using blktap)
libxl: debug: libxl_bootloader.c:409:bootloader_disk_attached_cb: Config bootloader value: pygrub
libxl: debug: libxl_bootloader.c:425:bootloader_disk_attached_cb: Checking for bootloader in libexec path: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_create.c:1205:do_domain_create: ao 0x1f824c0: inprogress: poller=0x1f839f0, flags=i
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f82840 wpath=/local/domain/2 token=3/0: register slotnum=3
libxl: debug: libxl_event.c:1736:libxl__ao_progress_report: ao 0x1f824c0: progress report: ignored
libxl: debug: libxl_bootloader.c:535:bootloader_gotptys: executing bootloader: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output=/var/run/xen/bootloader.2.out
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output-format=simple0
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output-directory=/var/run/xen/bootloader.2.d
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: /storage/images/manual/test1.raw
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f82840 wpath=/local/domain/2 token=3/0: event epath=/local/domain/2
libxl: debug: libxl_bootloader.c:634:bootloader_finished: bootloader completed
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained kernel /var/run/xen/bootloader.2.d/boot_kernel.JoFGbt
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained ramdisk /var/run/xen/bootloader.2.d/boot_ramdisk.Vg4_jl
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained args root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314
libxl: debug: libxl_bootloader.c:647:bootloader_finished: bootloader execution successful
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f82840 wpath=/local/domain/2 token=3/0: deregister slotnum=3
libxl: debug: libxl_numa.c:435:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=12, nr_vcpus=16, free_memkb=46390
libxl: detail: libxl_dom.c:193:numa_place_domain: NUMA placement candidate with 1 nodes, 12 cpus and 46390 KB free selected
domainbuilder: detail: xc_dom_allocate: cmdline="root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314", features="(null)"
libxl: debug: libxl_dom.c:383:libxl__build_pv: pv kernel mapped 1 path /var/run/xen/bootloader.2.d/boot_kernel.JoFGbt

domainbuilder: detail: xc_dom_kernel_mem: called
domainbuilder: detail: xc_dom_ramdisk_mem: called
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.2, caps xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying multiboot-binary loader ... 
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ... 
domainbuilder: detail: xc_dom_malloc            : 11958 kB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x43c02b -> 0xbada10
domainbuilder: detail: loader probe OK
xc: detail: elf_parse_binary: phdr: paddr=0x2000 memsz=0x8b4000
xc: detail: elf_parse_binary: phdr: paddr=0x8b6000 memsz=0x810f0
xc: detail: elf_parse_binary: phdr: paddr=0x938000 memsz=0xba40
xc: detail: elf_parse_binary: phdr: paddr=0x944000 memsz=0x266000
xc: detail: elf_parse_binary: memory: 0x2000 -> 0xbaa000
xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff80002000
xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff80003000
xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
xc: detail: elf_xen_parse_note: MOD_START_PFN = 0x1
xc: detail: elf_xen_parse_note: INIT_P2M = 0xffffea0000000000
xc: detail: elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel"
xc: detail: elf_xen_parse_note: SUPPORTED_FEATURES = 0x80f
xc: detail: elf_xen_parse_note: LOADER = "generic"
xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
xc: detail: elf_xen_addr_calc_check: addresses:
xc: detail:     virt_base        = 0xffffffff80000000
xc: detail:     elf_paddr_offset = 0x0
xc: detail:     virt_offset      = 0xffffffff80000000
xc: detail:     virt_kstart      = 0xffffffff80002000
xc: detail:     virt_kend        = 0xffffffff80baa000
xc: detail:     virt_entry       = 0xffffffff80002000
xc: detail:     p2m_base         = 0xffffea0000000000
domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff80002000 -> 0xffffffff80baa000
domainbuilder: detail: xc_dom_mem_init: mem 2048 MB, pages 0x80000 pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0x80000 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
domainbuilder: detail: xc_dom_malloc            : 4096 kB
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_alloc_segment:   kernel       : 0xffffffff80002000 -> 0xffffffff80baa000  (pfn 0x2 + 0xba8 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x2+0xba8 at 0x7f63ab057000
xc: detail: elf_load_binary: phdr 0 at 0x0x7f63ab057000 -> 0x0x7f63ab90b000
xc: detail: elf_load_binary: phdr 1 at 0x0x7f63ab90b000 -> 0x0x7f63ab98c0f0
xc: detail: elf_load_binary: phdr 2 at 0x0x7f63ab98d000 -> 0x0x7f63ab998a40
xc: detail: elf_load_binary: phdr 3 at 0x0x7f63ab999000 -> 0x0x7f63aba02000
domainbuilder: detail: xc_dom_alloc_segment:   ramdisk      : 0xffffffff80baa000 -> 0xffffffff8168f000  (pfn 0xbaa + 0xae5 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0xbaa+0xae5 at 0x7f63aa572000
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x4bffdd -> 0xae4010
domainbuilder: detail: xc_dom_alloc_segment:   phys2mach    : 0xffffffff8168f000 -> 0xffffffff81a8f000  (pfn 0x168f + 0x400 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x168f+0x400 at 0x7f63aa172000
domainbuilder: detail: xc_dom_alloc_page   :   start info   : 0xffffffff81a8f000 (pfn 0x1a8f)
domainbuilder: detail: xc_dom_alloc_page   :   xenstore     : 0xffffffff81a90000 (pfn 0x1a90)
domainbuilder: detail: xc_dom_alloc_page   :   console      : 0xffffffff81a91000 (pfn 0x1a91)
domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff81bfffff, 14 table(s)
domainbuilder: detail: xc_dom_alloc_segment:   page tables  : 0xffffffff81a92000 -> 0xffffffff81aa3000  (pfn 0x1a92 + 0x11 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a92+0x11 at 0x7f63b3b92000
domainbuilder: detail: xc_dom_alloc_page   :   boot stack   : 0xffffffff81aa3000 (pfn 0x1aa3)
domainbuilder: detail: xc_dom_build_image  : virt_alloc_end : 0xffffffff81aa4000
domainbuilder: detail: xc_dom_build_image  : virt_pgtab_end : 0xffffffff81c00000
domainbuilder: detail: xc_dom_boot_image: called
domainbuilder: detail: arch_setup_bootearly: doing nothing
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x80000
domainbuilder: detail: clear_page: pfn 0x1a91, mfn 0x823127
domainbuilder: detail: clear_page: pfn 0x1a90, mfn 0x823128
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a8f+0x1 at 0x7f63b3bab000
domainbuilder: detail: start_info_x86_64: called
domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff80003000 pfn=0x3
domainbuilder: detail: domain builder memory footprint
domainbuilder: detail:    allocated
domainbuilder: detail:       malloc             : 16217 kB
domainbuilder: detail:       anon mmap          : 0 bytes
domainbuilder: detail:    mapped
domainbuilder: detail:       file mmap          : 0 bytes
domainbuilder: detail:       domU mmap          : 27260 kB
domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0xbf029
domainbuilder: detail: shared_info_x86_64: called
domainbuilder: detail: vcpu_x86_64: called
domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x1a92 mfn 0x823126
domainbuilder: detail: launch_vm: called, ctxt=0x7fff4896ef90
domainbuilder: detail: xc_dom_release: called
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=tap
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f84878 wpath=/local/domain/0/backend/vbd/2/51712/state token=3/1: register slotnum=3
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=tap
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=tap
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f84d48 wpath=/local/domain/0/backend/vbd/2/51728/state token=2/2: register slotnum=2
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f84878 wpath=/local/domain/0/backend/vbd/2/51712/state token=3/1: event epath=/local/domain/0/backend/vbd/2/51712/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/51712/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f84878 wpath=/local/domain/0/backend/vbd/2/51712/state token=3/1: deregister slotnum=3
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f84878: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vbd/2/51712/state token=3/1: empty slot
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f84d48 wpath=/local/domain/0/backend/vbd/2/51728/state token=2/2: event epath=/local/domain/0/backend/vbd/2/51728/state
libxl: debug: libxl_event.c:646:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/51728/state wanted state 2 still waiting state 1
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f84d48 wpath=/local/domain/0/backend/vbd/2/51728/state token=2/2: event epath=/local/domain/0/backend/vbd/2/51728/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/51728/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f84d48 wpath=/local/domain/0/backend/vbd/2/51728/state token=2/2: deregister slotnum=2
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f84d48: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_dm.c:1160:libxl__spawn_local_dm: Spawning device-model /usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   /usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -xen-domid
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   2
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-2,server,nowait
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -xen-attach
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   test1
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -vnc
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   0.0.0.0:1,password,to=99
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -M
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   xenpv
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   4097
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f82a78 wpath=/local/domain/0/device-model/2/state token=2/3: register slotnum=2
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f82a78 wpath=/local/domain/0/device-model/2/state token=2/3: event epath=/local/domain/0/device-model/2/state
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f82a78 wpath=/local/domain/0/device-model/2/state token=2/3: event epath=/local/domain/0/device-model/2/state
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f82a78 wpath=/local/domain/0/device-model/2/state token=2/3: deregister slotnum=2
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f82a78: deregister unregistered
libxl: debug: libxl_qmp.c:696:libxl__qmp_initialize: connected to /var/run/xen/qmp-libxl-2
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: qmp
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"qmp_capabilities","id":1}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"query-chardev","id":2}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"change","id":3,"arguments":{"device":"vnc","target":"password","arg":"vnc"}}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"query-vnc","id":4}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f8a0c8 wpath=/local/domain/0/backend/vif/2/0/state token=2/4: register slotnum=2
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f8a0c8 wpath=/local/domain/0/backend/vif/2/0/state token=2/4: event epath=/local/domain/0/backend/vif/2/0/state
libxl: debug: libxl_event.c:646:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 2 still waiting state 1
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f8a0c8 wpath=/local/domain/0/backend/vif/2/0/state token=2/4: event epath=/local/domain/0/backend/vif/2/0/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f8a0c8 wpath=/local/domain/0/backend/vif/2/0/state token=2/4: deregister slotnum=2
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f8a0c8: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-ovs online
libxl: debug: libxl_event.c:1736:libxl__ao_progress_report: ao 0x1f824c0: progress report: ignored
libxl: debug: libxl_event.c:1568:libxl__ao_complete: ao 0x1f824c0: complete, rc=0
libxl: debug: libxl_event.c:1540:libxl__ao__destroy: ao 0x1f824c0: destroy
Daemon running with PID 4439
xc: debug: hypercall buffer: total allocations:369 total releases:369
xc: debug: hypercall buffer: current allocations:0 maximum allocations:4
xc: debug: hypercall buffer: cache current size:4
xc: debug: hypercall buffer: cache hits:358 misses:4 toobig:7

[-- Attachment #4: xl-test1.log --]
[-- Type: text/x-log, Size: 1219 bytes --]

libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0xabeff0 wpath=@releaseDomain token=3/0: event epath=@releaseDomain
libxl: debug: libxl.c:998:domain_death_xswatch_callback: [evg=0xabe4a0:1] from domid=1 nentries=1 rc=1
libxl: debug: libxl.c:1009:domain_death_xswatch_callback: [evg=0xabe4a0:1]   got=domaininfos[0] got->domain=1
libxl: debug: libxl.c:1036:domain_death_xswatch_callback:  exists shutdown_reported=0 dominf.flags=ffff0010
libxl: debug: libxl.c:1002:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl.c:1066:domain_death_xswatch_callback: domain death search done
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0xabeff0 wpath=@releaseDomain token=3/0: event epath=@releaseDomain
libxl: debug: libxl.c:998:domain_death_xswatch_callback: [evg=0xabe4a0:1] from domid=1 nentries=1 rc=1
libxl: debug: libxl.c:1009:domain_death_xswatch_callback: [evg=0xabe4a0:1]   got=domaininfos[0] got->domain=1
libxl: debug: libxl.c:1036:domain_death_xswatch_callback:  exists shutdown_reported=0 dominf.flags=ffff0010
libxl: debug: libxl.c:1002:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl.c:1066:domain_death_xswatch_callback: domain death search done
[-- Attachment #5: xl-test2.log --]
[-- Type: text/x-log, Size: 22012 bytes --]

libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f8a0d0 wpath=@releaseDomain token=3/0: event epath=@releaseDomain
libxl: debug: libxl.c:998:domain_death_xswatch_callback: [evg=0x1f89580:2] from domid=2 nentries=1 rc=1
libxl: debug: libxl.c:1009:domain_death_xswatch_callback: [evg=0x1f89580:2]   got=domaininfos[0] got->domain=2
libxl: debug: libxl.c:1036:domain_death_xswatch_callback:  exists shutdown_reported=0 dominf.flags=30004
libxl: debug: libxl.c:1048:domain_death_xswatch_callback:  shutdown reporting
libxl: debug: libxl.c:1002:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl.c:1066:domain_death_xswatch_callback: domain death search done
Domain 2 has shut down, reason code 3 0x3
Action for shutdown reason code 3 is restart
Domain 2 needs to be cleaned up: destroying the domain
libxl: debug: libxl.c:1250:libxl_domain_destroy: ao 0x1f83dc0: create: how=(nil) callback=(nil) poller=0x1f89540
libxl: debug: libxl_dm.c:1266:libxl__destroy_device_model: Device Model signaled
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f899d8 wpath=/local/domain/0/backend/vbd/2/51712/state token=2/1: register slotnum=2
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f89b78 wpath=/local/domain/0/backend/vbd/2/51728/state token=1/2: register slotnum=1
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f85138 wpath=/local/domain/0/backend/vif/2/0/state token=0/3: register slotnum=0
libxl: debug: libxl.c:1259:libxl_domain_destroy: ao 0x1f83dc0: inprogress: poller=0x1f89540, flags=i
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f899d8 wpath=/local/domain/0/backend/vbd/2/51712/state token=2/1: event epath=/local/domain/0/backend/vbd/2/51712/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/51712/state wanted state 6 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f899d8 wpath=/local/domain/0/backend/vbd/2/51712/state token=2/1: deregister slotnum=2
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f899d8: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block remove
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vbd/2/51712/state token=2/1: empty slot
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f89b78 wpath=/local/domain/0/backend/vbd/2/51728/state token=1/2: event epath=/local/domain/0/backend/vbd/2/51728/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/51728/state wanted state 6 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f89b78 wpath=/local/domain/0/backend/vbd/2/51728/state token=1/2: deregister slotnum=1
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f89b78: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block remove
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vbd/2/51728/state token=1/2: empty slot
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f85138 wpath=/local/domain/0/backend/vif/2/0/state token=0/3: event epath=/local/domain/0/backend/vif/2/0/state
libxl: debug: libxl_event.c:646:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 6 still waiting state 5
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f85138 wpath=/local/domain/0/backend/vif/2/0/state token=0/3: event epath=/local/domain/0/backend/vif/2/0/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 6 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f85138 wpath=/local/domain/0/backend/vif/2/0/state token=0/3: deregister slotnum=0
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f85138: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-ovs offline
libxl: debug: libxl_event.c:1568:libxl__ao_complete: ao 0x1f83dc0: complete, rc=0
libxl: debug: libxl_event.c:1540:libxl__ao__destroy: ao 0x1f83dc0: destroy
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f8a0d0 wpath=@releaseDomain token=3/0: deregister slotnum=3
Done. Rebooting now
libxl: debug: libxl_create.c:1192:do_domain_create: ao 0x1f86fa0: create: how=(nil) callback=(nil) poller=0x1f89540
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=unknown
libxl: debug: libxl_device.c:175:disk_try_backend: Disk vdev=xvda, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:265:libxl__device_disk_set_backend: Disk vdev=xvda, using backend tap
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=unknown
libxl: debug: libxl_device.c:175:disk_try_backend: Disk vdev=xvdb, backend phy unsuitable as phys path not a block device
libxl: debug: libxl_device.c:265:libxl__device_disk_set_backend: Disk vdev=xvdb, using backend tap
libxl: debug: libxl_create.c:694:initiate_domain_create: running bootloader
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=(null) spec.backend=tap
libxl: debug: libxl.c:2318:libxl__device_disk_local_initiate_attach: locally attaching tap disk /storage/images/manual/test1.raw directly (ie without using blktap)
libxl: debug: libxl_bootloader.c:409:bootloader_disk_attached_cb: Config bootloader value: pygrub
libxl: debug: libxl_bootloader.c:425:bootloader_disk_attached_cb: Checking for bootloader in libexec path: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_create.c:1205:do_domain_create: ao 0x1f86fa0: inprogress: poller=0x1f89540, flags=i
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=@releaseDomain token=3/0: empty slot
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f85a90 wpath=/local/domain/3 token=3/4: register slotnum=3
libxl: debug: libxl_event.c:1736:libxl__ao_progress_report: ao 0x1f86fa0: progress report: ignored
libxl: debug: libxl_bootloader.c:535:bootloader_gotptys: executing bootloader: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: /usr/lib64/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output=/var/run/xen/bootloader.3.out
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output-format=simple0
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: --output-directory=/var/run/xen/bootloader.3.d
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys:   bootloader arg: /storage/images/manual/test1.raw
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f85a90 wpath=/local/domain/3 token=3/4: event epath=/local/domain/3
libxl: debug: libxl_bootloader.c:634:bootloader_finished: bootloader completed
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained kernel /var/run/xen/bootloader.3.d/boot_kernel.8zO6CU
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained ramdisk /var/run/xen/bootloader.3.d/boot_ramdisk.x8kIrK
libxl: debug: libxl_bootloader.c:136:bootloader_result_command: bootloader output contained args root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314
libxl: debug: libxl_bootloader.c:647:bootloader_finished: bootloader execution successful
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f85a90 wpath=/local/domain/3 token=3/4: deregister slotnum=3
libxl: debug: libxl_numa.c:435:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=12, nr_vcpus=16, free_memkb=46390
libxl: detail: libxl_dom.c:193:numa_place_domain: NUMA placement candidate with 1 nodes, 12 cpus and 46390 KB free selected
domainbuilder: detail: xc_dom_allocate: cmdline="root=/dev/disk/by-label/system resume=/dev/disk/by-label/swap splash=silent quiet showopts ipv6.disable=1 elevator=noop vga=0x314", features="(null)"
libxl: debug: libxl_dom.c:383:libxl__build_pv: pv kernel mapped 1 path /var/run/xen/bootloader.3.d/boot_kernel.8zO6CU

domainbuilder: detail: xc_dom_kernel_mem: called
domainbuilder: detail: xc_dom_ramdisk_mem: called
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.2, caps xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying multiboot-binary loader ... 
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ... 
domainbuilder: detail: xc_dom_malloc            : 11958 kB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x43c02b -> 0xbada10
domainbuilder: detail: loader probe OK
xc: detail: elf_parse_binary: phdr: paddr=0x2000 memsz=0x8b4000
xc: detail: elf_parse_binary: phdr: paddr=0x8b6000 memsz=0x810f0
xc: detail: elf_parse_binary: phdr: paddr=0x938000 memsz=0xba40
xc: detail: elf_parse_binary: phdr: paddr=0x944000 memsz=0x266000
xc: detail: elf_parse_binary: memory: 0x2000 -> 0xbaa000
xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff80002000
xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff80003000
xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
xc: detail: elf_xen_parse_note: MOD_START_PFN = 0x1
xc: detail: elf_xen_parse_note: INIT_P2M = 0xffffea0000000000
xc: detail: elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel"
xc: detail: elf_xen_parse_note: SUPPORTED_FEATURES = 0x80f
xc: detail: elf_xen_parse_note: LOADER = "generic"
xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
xc: detail: elf_xen_addr_calc_check: addresses:
xc: detail:     virt_base        = 0xffffffff80000000
xc: detail:     elf_paddr_offset = 0x0
xc: detail:     virt_offset      = 0xffffffff80000000
xc: detail:     virt_kstart      = 0xffffffff80002000
xc: detail:     virt_kend        = 0xffffffff80baa000
xc: detail:     virt_entry       = 0xffffffff80002000
xc: detail:     p2m_base         = 0xffffea0000000000
domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff80002000 -> 0xffffffff80baa000
domainbuilder: detail: xc_dom_mem_init: mem 2048 MB, pages 0x80000 pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0x80000 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
domainbuilder: detail: xc_dom_malloc            : 4096 kB
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_alloc_segment:   kernel       : 0xffffffff80002000 -> 0xffffffff80baa000  (pfn 0x2 + 0xba8 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x2+0xba8 at 0x7f63b03f6000
xc: detail: elf_load_binary: phdr 0 at 0x0x7f63b03f6000 -> 0x0x7f63b0caa000
xc: detail: elf_load_binary: phdr 1 at 0x0x7f63b0caa000 -> 0x0x7f63b0d2b0f0
xc: detail: elf_load_binary: phdr 2 at 0x0x7f63b0d2c000 -> 0x0x7f63b0d37a40
xc: detail: elf_load_binary: phdr 3 at 0x0x7f63b0d38000 -> 0x0x7f63b0da1000
domainbuilder: detail: xc_dom_alloc_segment:   ramdisk      : 0xffffffff80baa000 -> 0xffffffff8168f000  (pfn 0xbaa + 0xae5 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0xbaa+0xae5 at 0x7f63ab51b000
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x4bffdd -> 0xae4010
domainbuilder: detail: xc_dom_alloc_segment:   phys2mach    : 0xffffffff8168f000 -> 0xffffffff81a8f000  (pfn 0x168f + 0x400 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x168f+0x400 at 0x7f63ab11b000
domainbuilder: detail: xc_dom_alloc_page   :   start info   : 0xffffffff81a8f000 (pfn 0x1a8f)
domainbuilder: detail: xc_dom_alloc_page   :   xenstore     : 0xffffffff81a90000 (pfn 0x1a90)
domainbuilder: detail: xc_dom_alloc_page   :   console      : 0xffffffff81a91000 (pfn 0x1a91)
domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff81bfffff, 14 table(s)
domainbuilder: detail: xc_dom_alloc_segment:   page tables  : 0xffffffff81a92000 -> 0xffffffff81aa3000  (pfn 0x1a92 + 0x11 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a92+0x11 at 0x7f63b3b92000
domainbuilder: detail: xc_dom_alloc_page   :   boot stack   : 0xffffffff81aa3000 (pfn 0x1aa3)
domainbuilder: detail: xc_dom_build_image  : virt_alloc_end : 0xffffffff81aa4000
domainbuilder: detail: xc_dom_build_image  : virt_pgtab_end : 0xffffffff81c00000
domainbuilder: detail: xc_dom_boot_image: called
domainbuilder: detail: arch_setup_bootearly: doing nothing
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x80000
domainbuilder: detail: clear_page: pfn 0x1a91, mfn 0x822526
domainbuilder: detail: clear_page: pfn 0x1a90, mfn 0x822527
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a8f+0x1 at 0x7f63b3bab000
domainbuilder: detail: start_info_x86_64: called
domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff80003000 pfn=0x3
domainbuilder: detail: domain builder memory footprint
domainbuilder: detail:    allocated
domainbuilder: detail:       malloc             : 16217 kB
domainbuilder: detail:       anon mmap          : 0 bytes
domainbuilder: detail:    mapped
domainbuilder: detail:       file mmap          : 0 bytes
domainbuilder: detail:       domU mmap          : 27260 kB
domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0xbf029
domainbuilder: detail: shared_info_x86_64: called
domainbuilder: detail: vcpu_x86_64: called
domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x1a92 mfn 0x822525
domainbuilder: detail: launch_vm: called, ctxt=0x7fff4896ef90
domainbuilder: detail: xc_dom_release: called
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=tap
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f89288 wpath=/local/domain/0/backend/vbd/3/51712/state token=3/5: register slotnum=3
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=tap
libxl: debug: libxl_device.c:229:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=tap
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f876c8 wpath=/local/domain/0/backend/vbd/3/51728/state token=0/6: register slotnum=0
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f89288 wpath=/local/domain/0/backend/vbd/3/51712/state token=3/5: event epath=/local/domain/0/backend/vbd/3/51712/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/3/51712/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f89288 wpath=/local/domain/0/backend/vbd/3/51712/state token=3/5: deregister slotnum=3
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f89288: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vbd/3/51712/state token=3/5: empty slot
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f876c8 wpath=/local/domain/0/backend/vbd/3/51728/state token=0/6: event epath=/local/domain/0/backend/vbd/3/51728/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vbd/3/51728/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f876c8 wpath=/local/domain/0/backend/vbd/3/51728/state token=0/6: deregister slotnum=0
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f876c8: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/block add
libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vbd/3/51728/state token=0/6: empty slot
libxl: debug: libxl_dm.c:1160:libxl__spawn_local_dm: Spawning device-model /usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   /usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -xen-domid
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   3
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-3,server,nowait
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -xen-attach
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   test1
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -vnc
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   0.0.0.0:1,password,to=99
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -M
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   xenpv
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1162:libxl__spawn_local_dm:   4097
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f85cc8 wpath=/local/domain/0/device-model/3/state token=0/7: register slotnum=0
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f85cc8 wpath=/local/domain/0/device-model/3/state token=0/7: event epath=/local/domain/0/device-model/3/state
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f85cc8 wpath=/local/domain/0/device-model/3/state token=0/7: event epath=/local/domain/0/device-model/3/state
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f85cc8 wpath=/local/domain/0/device-model/3/state token=0/7: deregister slotnum=0
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f85cc8: deregister unregistered
libxl: debug: libxl_qmp.c:696:libxl__qmp_initialize: connected to /var/run/xen/qmp-libxl-3
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: qmp
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"qmp_capabilities","id":1}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"query-chardev","id":2}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"change","id":3,"arguments":{"device":"vnc","target":"password","arg":"vnc"}}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_qmp.c:556:qmp_send_prepare: next qmp command: '{"execute":"query-vnc","id":4}'
libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f81c68 wpath=/local/domain/0/backend/vif/3/0/state token=0/8: register slotnum=0
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f81c68 wpath=/local/domain/0/backend/vif/3/0/state token=0/8: event epath=/local/domain/0/backend/vif/3/0/state
libxl: debug: libxl_event.c:642:devstate_watch_callback: backend /local/domain/0/backend/vif/3/0/state wanted state 2 ok
libxl: debug: libxl_event.c:595:libxl__ev_xswatch_deregister: watch w=0x1f81c68 wpath=/local/domain/0/backend/vif/3/0/state token=0/8: deregister slotnum=0
libxl: debug: libxl_event.c:607:libxl__ev_xswatch_deregister: watch w=0x1f81c68: deregister unregistered
libxl: debug: libxl_device.c:917:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-ovs online
libxl: debug: libxl_event.c:1736:libxl__ao_progress_report: ao 0x1f86fa0: progress report: ignored
libxl: debug: libxl_event.c:1568:libxl__ao_complete: ao 0x1f86fa0: complete, rc=0
libxl: debug: libxl_event.c:1540:libxl__ao__destroy: ao 0x1f86fa0: destroy
Waiting for domain test1 (domid 3) to die [pid 4440]
libxl: debug: libxl_event.c:558:libxl__ev_xswatch_register: watch w=0x1f8a0d0 wpath=@releaseDomain token=0/9: register slotnum=0
libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x1f8a0d0 wpath=@releaseDomain token=0/9: event epath=@releaseDomain
libxl: debug: libxl.c:998:domain_death_xswatch_callback: [evg=0x1f80a80:3] from domid=3 nentries=1 rc=1
libxl: debug: libxl.c:1009:domain_death_xswatch_callback: [evg=0x1f80a80:3]   got=domaininfos[0] got->domain=3
libxl: debug: libxl.c:1036:domain_death_xswatch_callback:  exists shutdown_reported=0 dominf.flags=ffff0020
libxl: debug: libxl.c:1002:domain_death_xswatch_callback: [evg=0] all reported
libxl: debug: libxl.c:1066:domain_death_xswatch_callback: domain death search done


[-- Attachment #6: messages.log --]
[-- Type: text/x-log, Size: 15840 bytes --]



2013-05-17T14:54:44.306147+03:00 hv04 kernel: [  909.307861] device vif2.0 left promiscuous mode
2013-05-17T14:54:44.325228+03:00 hv04 local: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/2/51712
2013-05-17T14:54:44.326123+03:00 hv04 local: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/2/51728
2013-05-17T14:54:44.334134+03:00 hv04 tapdisk2[4133]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.334667+03:00 hv04 tapdisk2[4133]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.335095+03:00 hv04 tapdisk2[4136]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.335536+03:00 hv04 tapdisk2[4136]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.335968+03:00 hv04 tapdisk2[4292]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.336414+03:00 hv04 tapdisk2[4292]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.336932+03:00 hv04 tapdisk2[4297]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.337392+03:00 hv04 tapdisk2[4297]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.337822+03:00 hv04 tapdisk2[4133]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.338350+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.338872+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.339541+03:00 hv04 tapdisk2[4136]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.340338+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.343005+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.343017+03:00 hv04 tapdisk2[4292]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.343022+03:00 hv04 tapdisk2[4292]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.343987+03:00 hv04 kernel: [  909.342243] blktap_ring_vm_close: 6 callbacks suppressed
2013-05-17T14:54:44.343995+03:00 hv04 kernel: [  909.342248] blktap_ring_vm_close: unmapping ring 2
2013-05-17T14:54:44.343997+03:00 hv04 kernel: [  909.342252] blktap_ring_release: freeing device 2
2013-05-17T14:54:44.343998+03:00 hv04 kernel: [  909.342991] blktap_device_destroy: destroy device 2 users 0
2013-05-17T14:54:44.344010+03:00 hv04 tapdisk2[4292]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.344018+03:00 hv04 tapdisk2[4297]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.344023+03:00 hv04 tapdisk2[4297]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.344441+03:00 hv04 tapdisk2[4297]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.344818+03:00 hv04 tapdisk2[4292]: received 'close' message (uuid = 2)
2013-05-17T14:54:44.344956+03:00 hv04 tapdisk2[4292]: closed image /storage/images/manual/test1.raw (0 users, state: 0x00000000, type: 0)
2013-05-17T14:54:44.345150+03:00 hv04 tapdisk2[4292]: sending 'close response' message (uuid = 2)
2013-05-17T14:54:44.345289+03:00 hv04 tapdisk2[4292]: received 'detach' message (uuid = 2)
2013-05-17T14:54:44.345422+03:00 hv04 tapdisk2[4292]: sending 'detach response' message (uuid = 2)
2013-05-17T14:54:44.345553+03:00 hv04 local: /etc/xen/scripts/vif-ovs: offline type_if=vif XENBUS_PATH=backend/vif/2/0
2013-05-17T14:54:44.345694+03:00 hv04 tapdisk2[4133]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.346120+03:00 hv04 tapdisk2[4133]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.346500+03:00 hv04 tapdisk2[4136]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.346878+03:00 hv04 tapdisk2[4136]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.347352+03:00 hv04 tapdisk2[4297]: received 'pid' message (uuid = 0)
2013-05-17T14:54:44.347771+03:00 hv04 tapdisk2[4297]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:44.348148+03:00 hv04 tapdisk2[4133]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.348530+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.348916+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.349301+03:00 hv04 tapdisk2[4136]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.349682+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.350069+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.350122+03:00 hv04 kernel: [  909.350488] blktap_ring_vm_close: unmapping ring 3
2013-05-17T14:54:44.350131+03:00 hv04 kernel: [  909.350492] blktap_ring_release: freeing device 3
2013-05-17T14:54:44.350590+03:00 hv04 tapdisk2[4297]: received 'list' message (uuid = 65535)
2013-05-17T14:54:44.350804+03:00 hv04 tapdisk2[4297]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.350996+03:00 hv04 tapdisk2[4297]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:44.351175+03:00 hv04 tapdisk2[4297]: received 'close' message (uuid = 3)
2013-05-17T14:54:44.351367+03:00 hv04 tapdisk2[4297]: closed image /storage/images/manual/test1_storage.raw (0 users, state: 0x00000000, type: 0)
2013-05-17T14:54:44.351549+03:00 hv04 tapdisk2[4297]: sending 'close response' message (uuid = 3)
2013-05-17T14:54:44.351729+03:00 hv04 tapdisk2[4297]: received 'detach' message (uuid = 3)
2013-05-17T14:54:44.351994+03:00 hv04 tapdisk2[4297]: sending 'detach response' message (uuid = 3)
2013-05-17T14:54:44.354133+03:00 hv04 kernel: [  909.352103] blktap_device_destroy: destroy device 3 users 0
2013-05-17T14:54:44.368669+03:00 hv04 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl del-port vlannet vif2.0
2013-05-17T14:54:44.404213+03:00 hv04 local: /etc/xen/scripts/vif-ovs: ip link set vif2.0 down failed
2013-05-17T14:54:44.405650+03:00 hv04 local: /etc/xen/scripts/vif-ovs: Successful vif-ovs offline for switch vlannet dev vif2.0, vlan 2512, mtu 9000
2013-05-17T14:54:53.136729+03:00 hv04 tapdisk2[4133]: received 'pid' message (uuid = 0)
2013-05-17T14:54:53.137411+03:00 hv04 tapdisk2[4133]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:53.137945+03:00 hv04 tapdisk2[4136]: received 'pid' message (uuid = 0)
2013-05-17T14:54:53.138135+03:00 hv04 kernel: [  918.138974] blktap_control_allocate_tap: allocated tap ffff8800ea8a8000
2013-05-17T14:54:53.138590+03:00 hv04 tapdisk2[4136]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:53.138993+03:00 hv04 tapdisk2[4133]: received 'list' message (uuid = 65535)
2013-05-17T14:54:53.139410+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.139824+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.140303+03:00 hv04 tapdisk2[4136]: received 'list' message (uuid = 65535)
2013-05-17T14:54:53.140728+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.141144+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.141543+03:00 hv04 tapdisk2[4695]: I/O queue driver: lio
2013-05-17T14:54:53.142019+03:00 hv04 tapdisk2[4695]: received 'attach' message (uuid = 2)
2013-05-17T14:54:53.142123+03:00 hv04 kernel: [  918.143614] blktap_ring_open: opening device blktap2
2013-05-17T14:54:53.142130+03:00 hv04 kernel: [  918.143617] blktap_ring_open: opened device 2
2013-05-17T14:54:53.142131+03:00 hv04 kernel: [  918.143636] blktap_ring_mmap: blktap: mapping pid is 4696
2013-05-17T14:54:53.142133+03:00 hv04 kernel: [  918.143790] blktap_validate_params: aio:/storage/images/manual/test1.raw: capacity: 20971520, sector-size: 512
2013-05-17T14:54:53.142134+03:00 hv04 kernel: [  918.143793] blktap_validate_params: aio:/storage/images/manual/test1.raw: capacity: 20971520, sector-size: 512
2013-05-17T14:54:53.142135+03:00 hv04 kernel: [  918.143794] blktap_device_create: minor 2 sectors 20971520 sector-size 512
2013-05-17T14:54:53.142136+03:00 hv04 kernel: [  918.143973] blktap_device_create: creation of 253:2: 0
2013-05-17T14:54:53.142439+03:00 hv04 tapdisk2[4695]: sending 'attach response' message (uuid = 2)
2013-05-17T14:54:53.143007+03:00 hv04 tapdisk2[4695]: received 'open' message (uuid = 2)
2013-05-17T14:54:53.143514+03:00 hv04 tapdisk2[4695]: block-aio open('/storage/images/manual/test1.raw')
2013-05-17T14:54:53.144159+03:00 hv04 tapdisk2[4695]: open(/storage/images/manual/test1.raw) with O_DIRECT
2013-05-17T14:54:53.144654+03:00 hv04 tapdisk2[4695]: Image size:
2013-05-17T14:54:53.145141+03:00 hv04 tapdisk2[4695]: opened image /storage/images/manual/test1.raw (1 users, state: 0x00000001, type: 0)
2013-05-17T14:54:53.145556+03:00 hv04 tapdisk2[4695]: VBD CHAIN:
2013-05-17T14:54:53.146045+03:00 hv04 tapdisk2[4695]: /storage/images/manual/test1.raw: 0
2013-05-17T14:54:53.146456+03:00 hv04 tapdisk2[4695]: sending 'open response' message (uuid = 2)
2013-05-17T14:54:53.148803+03:00 hv04 tapdisk2[4133]: received 'pid' message (uuid = 0)
2013-05-17T14:54:53.149375+03:00 hv04 tapdisk2[4133]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:53.149855+03:00 hv04 tapdisk2[4136]: received 'pid' message (uuid = 0)
2013-05-17T14:54:53.150365+03:00 hv04 tapdisk2[4136]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:53.150846+03:00 hv04 tapdisk2[4695]: received 'pid' message (uuid = 0)
2013-05-17T14:54:53.151329+03:00 hv04 tapdisk2[4695]: sending 'pid response' message (uuid = 0)
2013-05-17T14:54:53.151800+03:00 hv04 tapdisk2[4133]: received 'list' message (uuid = 65535)
2013-05-17T14:54:53.152392+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.152893+03:00 hv04 tapdisk2[4133]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.153406+03:00 hv04 tapdisk2[4136]: received 'list' message (uuid = 65535)
2013-05-17T14:54:53.153897+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.154130+03:00 hv04 kernel: [  918.153577] blktap_control_allocate_tap: allocated tap ffff8800f4eba800
2013-05-17T14:54:53.154412+03:00 hv04 tapdisk2[4136]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.155063+03:00 hv04 tapdisk2[4695]: received 'list' message (uuid = 65535)
2013-05-17T14:54:53.155582+03:00 hv04 tapdisk2[4695]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.156209+03:00 hv04 tapdisk2[4695]: sending 'list response' message (uuid = 65535)
2013-05-17T14:54:53.156694+03:00 hv04 tapdisk2[4700]: I/O queue driver: lio
2013-05-17T14:54:53.157207+03:00 hv04 tapdisk2[4700]: received 'attach' message (uuid = 3)
2013-05-17T14:54:53.157696+03:00 hv04 tapdisk2[4700]: sending 'attach response' message (uuid = 3)
2013-05-17T14:54:53.158125+03:00 hv04 kernel: [  918.158869] blktap_ring_open: opening device blktap3
2013-05-17T14:54:53.158297+03:00 hv04 tapdisk2[4700]: received 'open' message (uuid = 3)
2013-05-17T14:54:53.158940+03:00 hv04 tapdisk2[4700]: block-aio open('/storage/images/manual/test1_storage.raw')
2013-05-17T14:54:53.159465+03:00 hv04 tapdisk2[4700]: open(/storage/images/manual/test1_storage.raw) with O_DIRECT
2013-05-17T14:54:53.159953+03:00 hv04 tapdisk2[4700]: Image size:
2013-05-17T14:54:53.160443+03:00 hv04 tapdisk2[4700]: opened image /storage/images/manual/test1_storage.raw (1 users, state: 0x00000001, type: 0)
2013-05-17T14:54:53.160932+03:00 hv04 tapdisk2[4700]: VBD CHAIN:
2013-05-17T14:54:53.161420+03:00 hv04 tapdisk2[4700]: /storage/images/manual/test1_storage.raw: 0
2013-05-17T14:54:53.161988+03:00 hv04 tapdisk2[4700]: sending 'open response' message (uuid = 3)
2013-05-17T14:54:53.190990+03:00 hv04 local: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/3/51712
2013-05-17T14:54:53.193395+03:00 hv04 local: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/3/51728
2013-05-17T14:54:53.271920+03:00 hv04 local: /etc/xen/scripts/vif-ovs: online type_if=vif XENBUS_PATH=backend/vif/3/0
2013-05-17T14:54:53.302602+03:00 hv04 ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl add-port vlannet vif3.0 tag=2512
2013-05-17T14:54:53.306135+03:00 hv04 kernel: [  918.307947] device vif3.0 entered promiscuous mode
2013-05-17T14:54:53.355759+03:00 hv04 local: /etc/xen/scripts/vif-ovs: Successful vif-ovs online for switch vlannet dev vif3.0, vlan 2512, mtu 9000
2013-05-17T14:54:53.357293+03:00 hv04 local: /etc/xen/scripts/vif-ovs: Writing backend/vif/3/0/hotplug-status connected to xenstore.
2013-05-17T14:54:54.878304+03:00 hv04 kernel: [  919.878284] blkback: event-channel 19
2013-05-17T14:54:54.878328+03:00 hv04 kernel: [  919.878385] blkback: protocol 1 (x86_64-abi)
2013-05-17T14:54:54.878331+03:00 hv04 kernel: [  919.878562] blkback: ring-ref 770
2013-05-17T14:54:54.886195+03:00 hv04 kernel: [  919.887871] blkback: event-channel 20
2013-05-17T14:54:54.886209+03:00 hv04 kernel: [  919.887966] blkback: protocol 1 (x86_64-abi)
2013-05-17T14:54:54.890191+03:00 hv04 kernel: [  919.888212] blkback: ring-ref 771


[-- Attachment #7: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 12:03       ` Eugene Istomin
@ 2013-05-17 12:25         ` Jan Beulich
  2013-05-17 12:30           ` Eugene Istomin
  0 siblings, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-05-17 12:25 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, xen-devel

>>> On 17.05.13 at 14:03, Eugene Istomin <e.istomin@edss.ee> wrote:
> We use 3.4 kernels in production; 3.4.37-1-xen and 3.4.45-2 are working 
> without these problems. A couple of days ago I started testing 3.9.

That's quite a big jump.

> Logs (test1 - iperf server, test2 - iperf client):
> 
> Dom0:
> 1) tailf /var/log/xen/xl-test1.log (turned just before bug) - xl-test1.log
> 2) tailf /var/log/xen/xl-test2.log (turned just before bug) - xl-test2.log
> 3) tailf /var/log/messages (turned just before bug) - messages.log
> 4) xl create log - xl_creation.log
> 
> DomU:
> 1) dmesg  - domu_dmesg

Did you look at these before sending - there's no sign of an oops
anywhere afaict.

Jan


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 12:25         ` Jan Beulich
@ 2013-05-17 12:30           ` Eugene Istomin
  2013-05-17 12:36             ` Jan Beulich
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17 12:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2780 bytes --]

> That's quite a big jump.
Kernels from 3.5 to 3.8 had OCFS2 bugs; in 3.9 all the OCFS2 patches are 
merged.

> Did you look at these before sending - there's no sign of an oops
> anywhere afaict.
Yes, there are no oopses in the logs. The oops appears only on the xl 
console, which looks strange to me.



Welcome to openSUSE 12.3 "Dartmouth" - Kernel 3.9.2-4.756ee56-xen (xvc0).


template login: local
Password: 
Last login: Fri May 17 14:34:27 on xvc0
Have a lot of fun...
local@template:~> sudo su
template:/home/local # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 43162
[  221.826637] BUG: unable to handle kernel paging request at ffff880078b85000
[  221.826674] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  221.826703] PGD a83067 PUD a93067 PMD 7fc2c067 PTE 8010000078b85065
[  221.826732] Oops: 0003 [#1] SMP 
[  221.826748] Modules linked in: af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel joydev aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  221.826875] CPU 0 
[  221.826885] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1  
[  221.826911] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  221.826945] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  221.826964] RAX: ffff88007da9b438 RBX: ffff880078b84ec0 RCX: ffff880078b85000
[  221.826989] RDX: ffff880078bdf7c0 RSI: ffff880078b84ec0 RDI: ffff880078536580
[  221.827014] RBP: ffff880078479a80 R08: 0000000000000248 R09: 0000000000000000
[  221.827039] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
[  221.827064] R13: 00000000000030f0 R14: ffff88007b403dd8 R15: ffff880078be8800


-- 
Best regards,
Eugene Istomin




> >>> On 17.05.13 at 14:03, Eugene Istomin <e.istomin@edss.ee> wrote:
> > We use 3.4 kernels in production; 3.4.37-1-xen and 3.4.45-2 are working
> > without these problems. A couple of days ago I started testing 3.9.
> 
> That's quite a big jump.
> 
> > Logs (test1 - iperf server, test2 - iperf client):
> > 
> > Dom0:
> > 1) tailf /var/log/xen/xl-test1.log (turned just before bug) - xl-test1.log
> > 2) tailf /var/log/xen/xl-test2.log (turned just before bug) - xl-test2.log
> > 3) tailf /var/log/messages (turned just before bug) - messages.log
> > 4) xl create log - xl_creation.log
> > 
> > DomU:
> > 1) dmesg  - domu_dmesg
> 
> Did you look at these before sending - there's no sign of an oops
> anywhere afaict.
> 
> Jan

[-- Attachment #1.2: Type: text/html, Size: 13246 bytes --]



* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 12:30           ` Eugene Istomin
@ 2013-05-17 12:36             ` Jan Beulich
  2013-05-17 12:52               ` Eugene Istomin
  0 siblings, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-05-17 12:36 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, xen-devel

>>> On 17.05.13 at 14:30, Eugene Istomin <e.istomin@edss.ee> wrote:
>>  That's quite a big jump.
> Kernels from 3.5 to 3.8 had OCFS2 bugs; in 3.9 all the OCFS2 patches are
> merged.
> 
>> Did you look at these before sending - there's no sign of an oops
>> anywhere afaict.
> Yes, there are no oopses in the logs. The oops appears only on the xl
> console, which looks strange to me.

That likely depends on some configuration setting within the guest.
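Since the oops reaches only the guest console, one way to make sure it is preserved is to capture that console from Dom0. A minimal sketch (the guest name `test1` is taken from this thread; the log path is arbitrary):

```shell
# Attach to the DomU console from Dom0 and keep a copy of everything it
# prints, so a panic survives even if the guest dies immediately after.
# "test1" is the guest name used in this thread; adjust as needed.
xl console test1 | tee -a /var/log/xen/console-test1.log

# Inside the guest, the kernel must actually log to the Xen console,
# e.g. console=xvc0 (this xenlinux kernel) or console=hvc0 (pvops)
# on the kernel command line, with a high enough loglevel.
```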

> [  221.826637] BUG: unable to handle kernel paging request at ffff880078b85000
> [  221.826674] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> [  221.826703] PGD a83067 PUD a93067 PMD 7fc2c067 PTE 8010000078b85065
> [  221.826732] Oops: 0003 [#1] SMP 
> [  221.826748] Modules linked in: af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel joydev aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
> [  221.826875] CPU 0 
> [  221.826885] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1  
> [  221.826911] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> [  221.826945] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
> [  221.826964] RAX: ffff88007da9b438 RBX: ffff880078b84ec0 RCX: ffff880078b85000
> [  221.826989] RDX: ffff880078bdf7c0 RSI: ffff880078b84ec0 RDI: ffff880078536580
> [  221.827014] RBP: ffff880078479a80 R08: 0000000000000248 R09: 0000000000000000
> [  221.827039] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
> [  221.827064] R13: 00000000000030f0 R14: ffff88007b403dd8 R15: ffff880078be8800

Sorry, but as said before, this is incomplete (stack dump, no call stack).

Jan


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 12:36             ` Jan Beulich
@ 2013-05-17 12:52               ` Eugene Istomin
  2013-05-17 13:00                 ` Eugene Istomin
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17 12:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5360 bytes --]

Jan,

After about 10 seconds of running (at only 33 Mbit/s!) here is a new oops:

template:/home/local # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61806
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 5.4 sec  21.6 MBytes  33.5 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61807
[  5]  0.0-10.0 sec  39.5 MBytes  33.0 Mbits/sec
[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61808
[  4]  0.0-10.0 sec   149 MBytes   124 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61809
[  118.230386] BUG: unable to handle kernel paging request at ffff8800791e4000
[  118.230426] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  118.230501] PGD a83067 PUD a93067 PMD 7fc29067 PTE 80100000791e4065
[  118.230534] Oops: 0003 [#1] SMP 
[  118.230552] Modules linked in: af_packet hwmon domctl crc32_pclmul joydev crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  118.230690] CPU 0 
[  118.230700] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1  
[  118.230729] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  118.230766] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  118.230787] RAX: ffff88007da89500 RBX: ffff8800791e3ec0 RCX: ffff8800791e4000
[  118.230814] RDX: ffff880078446150 RSI: ffff8800791e3ec0 RDI: ffff880078434880
[  118.230841] RBP: ffff88007852db80 R08: 0000000000000ba8 R09: 0000000000000000
[  118.230867] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
[  118.230895] R13: 0000000000025322 R14: ffff88007b403dd8 R15: ffff880078470800
[  118.230925] FS:  00007ff5457b37c0(0000) GS:ffff88007b400000(0000) knlGS:0000000000000000
[  118.230956] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  118.230978] CR2: ffff8800791e4000 CR3: 0000000078c97000 CR4: 0000000000002660
[  118.231629] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  118.232272] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  118.232906] Process swapper/0 (pid: 0, threadinfo ffffffff808b6000, task ffffffff808c1960)
[  118.233554] Stack:
[  118.234194]  ffff88007b403d38 ffffffff80060845 ffff88007b403d68 ffff88007b403db8
[  118.234373]  ffff88007b403d98 0000001200025311 ffff880078471d50 0000000000000000
[  118.234373]  ffff880078470858 ffff88007b40e878 0000000100000012 0000004000025323
[  118.234373] Call Trace:
[  118.234373]  [<ffffffff80441070>] net_rx_action+0x170/0x2c0
[  118.234373]  [<ffffffff80036dbe>] __do_softirq+0xee/0x230
[  118.234373]  [<ffffffff80037085>] irq_exit+0x95/0xa0
[  118.234373]  [<ffffffff803933ed>] evtchn_do_upcall+0x2ad/0x2f0
[  118.234373]  [<ffffffff80534d5e>] do_hypervisor_callback+0x1e/0x30
[  118.234373]  [<ffffffff800033aa>] HYPERVISOR_sched_op_new+0xa/0x20
[  118.234373]  [<ffffffff8000e671>] xen_idle+0x41/0x110
[  118.234373]  [<ffffffff8000e7ef>] cpu_idle+0xaf/0x110
[  118.234373]  [<ffffffff80944b1f>] start_kernel+0x424/0x42f


I am restarting the Dom0 server now and will try to get a call trace for the first case.
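For reference, the reproduction described in this thread boils down to an iperf run between two DomUs on the same Dom0. A sketch (the IP and guest roles come from the logs in this thread; the duration is my own choice, not something the thread specifies):

```shell
# test1 (10.251.2.201) runs the iperf server:
#   iperf -s
# test2 then drives traffic at it; the crash occurred within seconds of
# a plain single-stream run, so nothing exotic is needed.
SERVER_IP=10.251.2.201            # test1's address, from the logs above
DURATION=60                       # seconds; an assumption, not from the thread
CMD="iperf -c $SERVER_IP -t $DURATION"
echo "$CMD"                       # run this inside the test2 guest
```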

-- 
Best regards,
Eugene Istomin




> >>> On 17.05.13 at 14:30, Eugene Istomin <e.istomin@edss.ee> wrote:
> >>  That's quite a big jump.
> > 
> > Kernels from 3.5 to 3.8 had OCFS2 bugs; in 3.9 all the OCFS2 patches
> > are merged.
> > 
> >> Did you look at these before sending - there's no sign of an oops
> >> anywhere afaict.
> > 
> > Yes, there are no oopses in the logs. The oops appears only on the xl
> > console, which looks strange to me.
> 
> That likely depends on some configuration setting within the guest.
> 
> > [  221.826637] BUG: unable to handle kernel paging request at ffff880078b85000
> > [  221.826674] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> > [  221.826703] PGD a83067 PUD a93067 PMD 7fc2c067 PTE 8010000078b85065
> > [  221.826732] Oops: 0003 [#1] SMP
> > [  221.826748] Modules linked in: af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel joydev aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
> > [  221.826875] CPU 0
> > [  221.826885] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1
> > [  221.826911] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
> > [  221.826945] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
> > [  221.826964] RAX: ffff88007da9b438 RBX: ffff880078b84ec0 RCX: ffff880078b85000
> > [  221.826989] RDX: ffff880078bdf7c0 RSI: ffff880078b84ec0 RDI: ffff880078536580
> > [  221.827014] RBP: ffff880078479a80 R08: 0000000000000248 R09: 0000000000000000
> > [  221.827039] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
> > [  221.827064] R13: 00000000000030f0 R14: ffff88007b403dd8 R15: ffff880078be8800
> 
> Sorry, but as said before, this is incomplete (stack dump, no call stack).
> 
> Jan

[-- Attachment #1.2: Type: text/html, Size: 24095 bytes --]



* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 12:52               ` Eugene Istomin
@ 2013-05-17 13:00                 ` Eugene Istomin
  2013-05-18 10:06                   ` Eugene Istomin
  2013-05-21  9:55                   ` Jan Beulich
  0 siblings, 2 replies; 85+ messages in thread
From: Eugene Istomin @ 2013-05-17 13:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 6008 bytes --]

Bump, here it is:



template:/home/local # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------

[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 23902
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.1 sec  38.5 MBytes  32.0 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 23903
[  124.555698] BUG: unable to handle kernel paging request at ffff880078453000
[  124.555760] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  124.555813] PGD a83067 PUD a93067 PMD 7fc2f067 PTE 8010000078453065
[  124.555866] Oops: 0003 [#1] SMP 
[  124.555894] Modules linked in: af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel joydev aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  124.556126] CPU 0 
[  124.556143] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1  
[  124.556190] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  124.556252] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  124.556286] RAX: ffff88007da5f650 RBX: ffff880078452ec0 RCX: ffff880078453000
[  124.556332] RDX: ffff880078876810 RSI: ffff880078452ec0 RDI: ffff880078451580
[  124.556377] RBP: ffff88007802ac80 R08: 00000000000009e0 R09: 0000000000000000
[  124.556423] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
[  124.556468] R13: 0000000000005afa R14: ffff88007b403dd8 R15: ffff880078440800
[  124.556518] FS:  00007faae5316700(0000) GS:ffff88007b400000(0000) knlGS:0000000000000000
[  124.556569] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  124.557637] CR2: ffff880078453000 CR3: 0000000079f61000 CR4: 0000000000002660
[  124.558714] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  124.559681] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  124.559681] Process swapper/0 (pid: 0, threadinfo ffffffff808b6000, task ffffffff808c1960)
[  124.559681] Stack:
[  124.559681]  ffff8800786afb80 000000007953e510 000000000000ad00 ffff88007b403db8
[  124.559681]  ffff88007b403d98 0000001200005ae9 ffff880078441d50 0000000000000000
[  124.559681]  ffff880078440858 ffff88007b40e878 0000000100000012 0000004000005afb
[  124.559681] Call Trace:
[  124.559681]  [<ffffffff80441070>] net_rx_action+0x170/0x2c0
[  124.559681]  [<ffffffff80036dbe>] __do_softirq+0xee/0x230
[  124.559681]  [<ffffffff80037085>] irq_exit+0x95/0xa0
[  124.559681]  [<ffffffff803933ed>] evtchn_do_upcall+0x2ad/0x2f0
[  124.559681]  [<ffffffff80534d5e>] do_hypervisor_callback+0x1e/0x30
[  124.559681]  [<ffffffff800033aa>] HYPERVISOR_sched_op_new+0xa/0x20
[  124.559681]  [<ffffffff8000e671>] xen_idle+0x41/0x110
[  124.559681]  [<ffffffff8000e7ef>] cpu_idle+0xaf/0x110
[  124.559681]  [<ffffffff80944b1f>] start_kernel+0x424/0x42f
[  124.559681] Code: 44 21 ea 48 8d 54 d0 40 8b 87 d8 00 00 00 44 0f bf 42 06 44 0f b7 4a 02 48 8b 44 01 30 49 63 cc 48 83 c1 03 48 c1 e1 04 48 01 f1 <48> 89 01 44 89 49 08 44 89 41 0c 48 8b 08 80 e5 80 0f 85 54 09 
[  124.559681] RIP  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  124.559681]  RSP <ffff88007b403d18>
[  124.559681] CR2: ffff880078453000
[  124.559681] ---[ end trace d5028239de5a7a42 ]---
[  124.559681] Kernel panic - not syncing: Fatal exception in interrupt
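With a symbol+offset like the one above, the faulting source line can usually be resolved against a build of the module with debug info. A sketch (the module path is illustrative and depends on the distro kernel):

```shell
# Pull "symbol+offset" out of a saved oops line, then ask gdb which
# source line it corresponds to. Requires xennet.ko built with debug info.
LINE='[  124.559681] RIP  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]'
SYM=$(printf '%s\n' "$LINE" | sed -n 's/.*] \([A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*\)\/.*/\1/p')
echo "$SYM"    # netif_poll+0x49c

# Illustrative path; adjust to wherever the distro installs the module:
# gdb -batch -ex "list *($SYM)" /lib/modules/$(uname -r)/kernel/drivers/xen/xennet.ko
```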

-- 
Best regards,
Eugene Istomin



[-- Attachment #1.2: Type: text/html, Size: 39897 bytes --]



* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 13:00                 ` Eugene Istomin
@ 2013-05-18 10:06                   ` Eugene Istomin
  2013-05-20  6:52                     ` Eugene Istomin
  2013-05-21  9:55                   ` Jan Beulich
  1 sibling, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-18 10:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 5682 bytes --]

Hello,

Do you need any other logs?
-- 
Best regards,
Eugene Istomin



[-- Attachment #1.2: Type: text/html, Size: 42146 bytes --]



* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-18 10:06                   ` Eugene Istomin
@ 2013-05-20  6:52                     ` Eugene Istomin
  2013-05-20  8:58                       ` Ian Campbell
  2013-05-20  9:06                       ` Wei Liu
  0 siblings, 2 replies; 85+ messages in thread
From: Eugene Istomin @ 2013-05-20  6:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 5402 bytes --]

Ping
-- 
Best regards,
Eugene Istomin




Hello,

Do you need any other logs?
-- 
*/Best regards,/*
/Eugene Istomin/



Bump, here it is:



template:/home/local # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------



[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 
23902
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.1 sec  38.5 MBytes  32.0 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 
23903
[  124.555698] BUG: unable to handle kernel paging request at 
ffff880078453000
[  124.555760] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  124.555813] PGD a83067 PUD a93067 PMD 7fc2f067 PTE 
8010000078453065
[  124.555866] Oops: 0003 [#1] SMP 
[  124.555894] Modules linked in: af_packet hwmon domctl crc32_pclmul 
crc32c_intel ghash_clmulni_intel joydev aesni_intel ablk_helper cryptd lrw 
aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac 
scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  124.556126] CPU 0 
[  124.556143] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen 
#1  
[  124.556190] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
netif_poll+0x49c/0xe80 [xennet]
[  124.556252] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  124.556286] RAX: ffff88007da5f650 RBX: ffff880078452ec0 RCX: 
ffff880078453000
[  124.556332] RDX: ffff880078876810 RSI: ffff880078452ec0 RDI: 
ffff880078451580
[  124.556377] RBP: ffff88007802ac80 R08: 00000000000009e0 R09: 
0000000000000000
[  124.556423] R10: 0000000000000000 R11: 0000000000000001 R12: 
0000000000000011
[  124.556468] R13: 0000000000005afa R14: ffff88007b403dd8 R15: 
ffff880078440800
[  124.556518] FS:  00007faae5316700(0000) GS:ffff88007b400000(0000) 
knlGS:0000000000000000
[  124.556569] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  124.557637] CR2: ffff880078453000 CR3: 0000000079f61000 CR4: 
0000000000002660
[  124.558714] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[  124.559681] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[  124.559681] Process swapper/0 (pid: 0, threadinfo ffffffff808b6000, task 
ffffffff808c1960)
[  124.559681] Stack:
[  124.559681]  ffff8800786afb80 000000007953e510 
000000000000ad00 ffff88007b403db8
[  124.559681]  ffff88007b403d98 0000001200005ae9 ffff880078441d50 
0000000000000000
[  124.559681]  ffff880078440858 ffff88007b40e878 0000000100000012 
0000004000005afb
[  124.559681] Call Trace:
[  124.559681]  [<ffffffff80441070>] net_rx_action+0x170/0x2c0
[  124.559681]  [<ffffffff80036dbe>] __do_softirq+0xee/0x230
[  124.559681]  [<ffffffff80037085>] irq_exit+0x95/0xa0
[  124.559681]  [<ffffffff803933ed>] evtchn_do_upcall+0x2ad/0x2f0
[  124.559681]  [<ffffffff80534d5e>] do_hypervisor_callback+0x1e/0x30
[  124.559681]  [<ffffffff800033aa>] HYPERVISOR_sched_op_new+0xa/0x20
[  124.559681]  [<ffffffff8000e671>] xen_idle+0x41/0x110
[  124.559681]  [<ffffffff8000e7ef>] cpu_idle+0xaf/0x110
[  124.559681]  [<ffffffff80944b1f>] start_kernel+0x424/0x42f
[  124.559681] Code: 44 21 ea 48 8d 54 d0 40 8b 87 d8 00 00 00 44 0f bf 42 06 44 0f b7 4a 02 48 8b 44 01 30 49 63 cc 48 83 c1 03 48 c1 e1 04 48 01 f1 <48> 89 01 44 89 49 08 44 89 41 0c 48 8b 08 80 e5 80 0f 85 54 09 
[  124.559681] RIP  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  124.559681]  RSP <ffff88007b403d18>
[  124.559681] CR2: ffff880078453000
[  124.559681] ---[ end trace d5028239de5a7a42 ]---
[  124.559681] Kernel panic - not syncing: Fatal exception in interrupt

-- 
Best regards,
Eugene Istomin



Jan,

after 10 seconds of running (the speed is 33 Mb/s!!) here is a new oops:

template:/home/local # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61806
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 5.4 sec  21.6 MBytes  33.5 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61807
[  5]  0.0-10.0 sec  39.5 MBytes  33.0 Mbits/sec
[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61808
[  4]  0.0-10.0 sec   149 MBytes   124 Mbits/sec
[  5] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 61809
[  118.230386] BUG: unable to handle kernel paging request at ffff8800791e4000
[  118.230426] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  118.230501] PGD a83067 PUD a93067 PMD 7fc29067 PTE 80100000791e4065
[  118.230534] Oops: 0003 [#1] SMP 
[  118.230552] Modules linked in: af_packet hwmon domctl crc32_pclmul joydev crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  118.230690] CPU 0 
[  118.230700] Pid: 0, comm: swapper/0 Not tainted 3.9.2-4.756ee56-xen #1
[  118.230729] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  118.230766] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  118.230787] RAX: ffff88007da89500 RBX: ffff8800791e3ec0 RCX: ffff8800791e4000


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-20  6:52                     ` Eugene Istomin
@ 2013-05-20  8:58                       ` Ian Campbell
  2013-05-20  9:06                       ` Wei Liu
  1 sibling, 0 replies; 85+ messages in thread
From: Ian Campbell @ 2013-05-20  8:58 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, Jan Beulich, xen-devel

On Mon, 2013-05-20 at 09:52 +0300, Eugene Istomin wrote:
> Ping

It's been less than two days (over a weekend), not to mention your
initial ping on a Saturday morning less than a day after posting on a
Friday afternoon -- be a little patient please.

Ian.

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-20  6:52                     ` Eugene Istomin
  2013-05-20  8:58                       ` Ian Campbell
@ 2013-05-20  9:06                       ` Wei Liu
  2013-05-20  9:17                         ` Eugene Istomin
  1 sibling, 1 reply; 85+ messages in thread
From: Wei Liu @ 2013-05-20  9:06 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, Jan Beulich, xen-devel

On Mon, May 20, 2013 at 09:52:21AM +0300, Eugene Istomin wrote:
> Ping

I might be missing something, but you only seemed to provide links to
the RPM package you use.

The kernel you're using, as Jan pointed out earlier, is a nightly build
of the SuSE kernel. It is impossible to tell what's going on without the
source code. And generally speaking, I don't think using that kernel in
your production environment is a good idea.


Wei.

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-20  9:06                       ` Wei Liu
@ 2013-05-20  9:17                         ` Eugene Istomin
  2013-05-20  9:24                           ` Wei Liu
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-20  9:17 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jan Beulich, xen-devel


Wei,

I use all available openSUSE 3.9.2 kernels in my tests:
1) stable ones 
(http://download.opensuse.org/repositories/Kernel:/stable/standard/x86_64
),
2) head opensuse release 
(http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64)

-- 
Best regards,
Eugene Istomin



> On Mon, May 20, 2013 at 09:52:21AM +0300, Eugene Istomin wrote:
> > Ping
> 
> I might be missing something, but you only seemed to provide links to
> the RPM package you use.
> 
> The kernel you're using, as Jan pointed out earlier, is a nightly build
> of the SuSE kernel. It is impossible to tell what's going on without the
> source code. And generally speaking, I don't think using that kernel in
> your production environment is a good idea.
> 
> 
> Wei.


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-20  9:17                         ` Eugene Istomin
@ 2013-05-20  9:24                           ` Wei Liu
  0 siblings, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-05-20  9:24 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, Jan Beulich, xen-devel

On Mon, May 20, 2013 at 12:17:44PM +0300, Eugene Istomin wrote:
> Wei,
> 
> I use all available openSUSE 3.9.2 kernels in my tests:
> 1) stable ones 
> (http://download.opensuse.org/repositories/Kernel:/stable/standard/x86_64
> ),
> 2) head opensuse release 
> (http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64)
> 

Those are the binary packages, not the source code.


Wei.

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17 13:00                 ` Eugene Istomin
  2013-05-18 10:06                   ` Eugene Istomin
@ 2013-05-21  9:55                   ` Jan Beulich
  2013-05-21 11:09                     ` Eugene Istomin
  2013-05-21 15:28                     ` Wei Liu
  1 sibling, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-05-21  9:55 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu, Eugene Istomin; +Cc: xen-devel

>>> On 17.05.13 at 15:00, Eugene Istomin <e.istomin@edss.ee> wrote:
> Bump, here it is:

Okay, but I think we're still lacking information on what your
Dom0 kernel is.

Ian, Wei - looking at the forward ported as well as the upstream
frontends, I'm wondering if there isn't a fundamental flaw in
..._get_responses(): It allows up to MAX_SKB_FRAGS + 1 slots/
frags, and ..._fill_frags() then fills as many fragments as got
queued onto the respective list. Only after both of them are done,
__pskb_pull_tail() gets invoked reducing the fragment count by
one if the condition is met that made ..._get_responses() bump
the limit by one.

Am I overlooking something? I'm asking because working through
disassembly and register values of the dump Eugene had sent I
clearly see that ..._fill_frags() is processing the 18th fragment,
while in that kernel version MAX_SKB_FRAGS is only 17 (but
possibly, hence the first question above, the Dom0 kernel still is
one with MAX_SKB_FRAGS being 18, or the packet turns out to
be such that it fills 17 fragments and the header is smaller than
RX_COPY_THRESHOLD).

Jan

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21  9:55                   ` Jan Beulich
@ 2013-05-21 11:09                     ` Eugene Istomin
  2013-05-21 11:49                       ` Jan Beulich
  2013-05-21 15:28                     ` Wei Liu
  1 sibling, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-21 11:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Ian Campbell, xen-devel


Jan,

I updated to 3.9.3 from the stable branch.

*Dom0:*
#hv04: xl info
host                   : hv04.edss.local
release                : 3.9.3-1.g0b5d8f5-xen
version                : #1 SMP Mon May 20 09:43:24 UTC 2013 (0b5d8f5)
machine                : x86_64
nr_cpus                : 24
max_cpu_id             : 23
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2666
hw_caps                : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 98294
free_memory            : 92985
sharing_freed_memory   : 0
sharing_used_memory    : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .2_01-240.2
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : 26064
xen_commandline        : vga=mode-0x314 dom0_max_vcpus=8 dom0_mem=4G,max:4G msi=on iommu=on console=ttyS0,115200 xen-netback.netback_max_groups=6
cc_compiler            : gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]
cc_compile_by          : abuild
cc_compile_domain      : 
cc_compile_date        : Fri May 10 15:05:06 UTC 2013
xend_config_format     : 4


*Both DomU:*
test:/home/local # uname -a
Linux test.edss.local 3.9.3-1.g0b5d8f5-xen #1 SMP Mon May 20 09:43:24 UTC 2013 (0b5d8f5) x86_64 x86_64 x86_64 GNU/Linux


*From xl console*

[  4] local 10.251.2.201 port 5001 connected with 10.251.2.202 port 26946
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec  8.38 MBytes  35.2 Mbits/sec
[  4]  2.0- 4.0 sec  90.3 MBytes   379 Mbits/sec
[  4]  4.0- 6.0 sec  7.30 MBytes  30.6 Mbits/sec
[  105.391855] BUG: unable to handle kernel paging request at ffff880078c6f000
[  105.391918] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  105.391970] PGD a85067 PUD a95067 PMD 7fc2b067 PTE 8010000078c6f065
[  105.392029] Oops: 0003 [#1] SMP 
[  105.392058] Modules linked in: iptable_filter ip_tables x_tables af_packet hwmon domctl crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper joydev cryptd lrw aes_x86_64 xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic ata_piix
[  105.392319] CPU 0 
[  105.392336] Pid: 0, comm: swapper/0 Not tainted 3.9.3-1.g0b5d8f5-xen #1
[  105.392383] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  105.392450] RSP: e02b:ffff88007b403d18  EFLAGS: 00010286
[  105.392485] RAX: ffff88007da85638 RBX: ffff880078c6eec0 RCX: ffff880078c6f000
[  105.392531] RDX: ffff880078b54170 RSI: ffff880078c6eec0 RDI: ffff880078479280
[  105.392576] RBP: ffff88007869c6c0 R08: 0000000000000dc0 R09: 0000000000000000
[  105.392622] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000011
[  105.392667] R13: 000000000000b826 R14: ffff88007b403dd8 R15: ffff880078428800
[  105.393781] FS:  00007f11b3900700(0000) GS:ffff88007b400000(0000) knlGS:0000000000000000
[  105.394907] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  105.395840] CR2: ffff880078c6f000 CR3: 00000000797f5000 CR4: 0000000000002660
[  105.395840] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  105.395840] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  105.395840] Process swapper/0 (pid: 0, threadinfo ffffffff808b8000, task ffffffff808c3960)
[  105.395840] Stack:
[  105.395840]  ffffffff00000269 0000002200000022 ffff8800784288d0 ffff88007b403db8
[  105.395840]  ffff88007b403d98 000000150000b815 ffff880078429d50 0000000000000000
[  105.395840]  ffff880078428858 ffff88007b40e878 0000000000000012 000000400000b82a
[  105.395840] Call Trace:
[  105.395840]  [<ffffffff804419c0>] net_rx_action+0x170/0x2c0
[  105.395840]  [<ffffffff80036dae>] __do_softirq+0xee/0x230
[  105.395840]  [<ffffffff80037075>] irq_exit+0x95/0xa0
[  105.395840]  [<ffffffff803937cd>] evtchn_do_upcall+0x2ad/0x2f0
[  105.395840]  [<ffffffff80535a0e>] do_hypervisor_callback+0x1e/0x30
[  105.395840]  [<ffffffff800033aa>] HYPERVISOR_sched_op_new+0xa/0x20
[  105.395840]  [<ffffffff8000e671>] xen_idle+0x41/0x110
[  105.395840]  [<ffffffff8000e7ef>] cpu_idle+0xaf/0x110
[  105.395840]  [<ffffffff80946b1f>] start_kernel+0x424/0x42f
[  105.395840] Code: 44 21 ea 48 8d 54 d0 40 8b 87 d8 00 00 00 44 0f bf 42 06 44 0f b7 4a 02 48 8b 44 01 30 49 63 cc 48 83 c1 03 48 c1 e1 04 48 01 f1 <48> 89 01 44 89 49 08 44 89 41 0c 48 8b 08 80 e5 80 0f 85 54 09 
[  105.395840] RIP  [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
[  105.395840]  RSP <ffff88007b403d18>
[  105.395840] CR2: ffff880078c6f000
[  105.395840] ---[ end trace 67dcc8bdc485cab5 ]---
[  105.395840] Kernel panic - not syncing: Fatal exception in interrupt


-- 
Best regards,
Eugene Istomin



> >>> On 17.05.13 at 15:00, Eugene Istomin <e.istomin@edss.ee> 
wrote:
> > Bump, here it is:
> Okay, but I think we're still lacking information on what your
> Dom0 kernel is.
> 
> Ian, Wei - looking at the forward ported as well as the upstream
> frontends, I'm wondering if there isn't a fundamental flaw in
> ..._get_responses(): It allows up to MAX_SKB_FRAGS + 1 slots/
> frags, and ..._fill_frags() then fills as many fragments as got
> queued onto the respective list. Only after both of them are done,
> __pskb_pull_tail() gets invoked reducing the fragment count by
> one if the condition is met that made ..._get_responses() bump


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21 11:09                     ` Eugene Istomin
@ 2013-05-21 11:49                       ` Jan Beulich
  2013-05-21 16:19                         ` Eugene Istomin
  0 siblings, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-05-21 11:49 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, Ian Campbell, xen-devel

>>> On 21.05.13 at 13:09, Eugene Istomin <e.istomin@edss.ee> wrote:
> I updated to 3.9.3 from the stable branch.

I think I had pointed this out already - I'm not sure the stable branch
is really being kept up to date wrt Xen patches. I'm only updating
the master branch, so if you really don't want to stick to released
kernels, for Xen it may be better if you track the master branch (and
simply don't update as long as that one's on *-rc* for the next
upstream version).

Jan

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21  9:55                   ` Jan Beulich
  2013-05-21 11:09                     ` Eugene Istomin
@ 2013-05-21 15:28                     ` Wei Liu
  2013-05-31 15:47                       ` Wei Liu
  1 sibling, 1 reply; 85+ messages in thread
From: Wei Liu @ 2013-05-21 15:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Eugene Istomin, Wei Liu, Ian Campbell, xen-devel

On Tue, May 21, 2013 at 10:55:00AM +0100, Jan Beulich wrote:
> >>> On 17.05.13 at 15:00, Eugene Istomin <e.istomin@edss.ee> wrote:
> > Bump, here it is:
> 
> Okay, but I think we're still lacking information on what your
> Dom0 kernel is.
> 
> Ian, Wei - looking at the forward ported as well as the upstream
> frontends, I'm wondering if there isn't a fundamental flaw in
> ..._get_responses(): It allows up to MAX_SKB_FRAGS + 1 slots/
> frags, and ..._fill_frags() then fills as many fragments as got
> queued onto the respective list. Only after both of them are done,
> __pskb_pull_tail() gets invoked reducing the fragment count by
> one if the condition is met that made ..._get_responses() bump
> the limit by one.
> 
> Am I overlooking something? I'm asking because working through
> disassembly and register values of the dump Eugene had sent I
> clearly see that ..._fill_frags() is processing the 18th fragment,

> while in that kernel version MAX_SKB_FRAGS is only 17 (but
> possibly, hence the first question above, the Dom0 kernel still is
> one with MAX_SKB_FRAGS being 18, or the packet turns out to

I have the same suspicion. However, I just did a quick hack to bump
MAX_SKB_FRAGS to 20 in the backend, and DomU worked just fine.

> be such that it fills 17 fragments and the header is smaller than
> RX_COPY_THRESHOLD).

It would be better to have more info about Dom0. I will need some time
to figure out what's going on there.


Wei.

> Jan

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21 11:49                       ` Jan Beulich
@ 2013-05-21 16:19                         ` Eugene Istomin
  2013-05-31  3:37                           ` Eugene Istomin
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-21 16:19 UTC (permalink / raw)
  To: Jan Beulich, Wei Liu; +Cc: Ian Campbell, xen-devel


Jan,

thanks for the explanation,

right now the Xen code and patches in the stable branch
(http://kernel.opensuse.org/cgit/kernel/log/?h=stable&qt=grep&q=xen) are
in sync with the master branch
(http://kernel.opensuse.org/cgit/kernel/log/?qt=grep&q=xen)

Wei,

>> It would be better to have more info about Dom0.
I added the Dom0 "xl info" output in my previous mail; do you need any other info?

-- 
Best regards,
Eugene Istomin




> >>> On 21.05.13 at 13:09, Eugene Istomin <e.istomin@edss.ee> 
wrote:
> > I updated to 3.9.3 from the stable branch.
> 
> I think I had pointed this out already - I'm not sure the stable branch
> is really being kept up to date wrt Xen patches. I'm only updating
> the master branch, so if you really don't want to stick to released
> kernels, for Xen it may be better if you track the master branch (and
> simply don't update as long as that one's on *-rc* for the next
> upstream version).
> 
> Jan


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21 16:19                         ` Eugene Istomin
@ 2013-05-31  3:37                           ` Eugene Istomin
  2013-05-31  6:19                             ` Jan Beulich
  0 siblings, 1 reply; 85+ messages in thread
From: Eugene Istomin @ 2013-05-31  3:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Ian Campbell, xen-devel


Hello Jan,

do you need any other info? Can I help with resolving this bug?
-- 
Best regards,
Eugene Istomin




Jan,

thanks for the explanation,

right now the Xen code and patches in the stable branch
(http://kernel.opensuse.org/cgit/kernel/log/?h=stable&qt=grep&q=xen) are
in sync with the master branch
(http://kernel.opensuse.org/cgit/kernel/log/?qt=grep&q=xen)

Wei,

>> It would be better to have more info about Dom0.
I added the Dom0 "xl info" output in my previous mail; do you need any other info?

-- 
Best regards,
Eugene Istomin




> >>> On 21.05.13 at 13:09, Eugene Istomin <e.istomin@edss.ee> 
wrote:
> > I updated to 3.9.3 from the stable branch.
> 
> I think I had pointed this out already - I'm not sure the stable branch
> is really being kept up to date wrt Xen patches. I'm only updating
> the master branch, so if you really don't want to stick to released
> kernels, for Xen it may be better if you track the master branch (and
> simply don't update as long as that one's on *-rc* for the next
> upstream version).
> 
> Jan




* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-31  3:37                           ` Eugene Istomin
@ 2013-05-31  6:19                             ` Jan Beulich
  2013-05-31  6:22                               ` Eugene Istomin
  2013-05-31  8:20                               ` Wei Liu
  0 siblings, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-05-31  6:19 UTC (permalink / raw)
  To: Eugene Istomin; +Cc: Wei Liu, Ian Campbell, xen-devel

>>> On 31.05.13 at 05:37, Eugene Istomin <e.istomin@edss.ee> wrote:
> do you need any other info? Can I help with resolving this bug?

No, it's a time allocation thing. For one, I'm waiting for Wei to find
time to look at whether the issue I spotted is a real one. And I'm
myself short on time, which prevents me from getting to merging the
grant copy changes into our netback. Unless you can do either of these,
you'll have to be patient, I'm afraid.

Jan

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-31  6:19                             ` Jan Beulich
@ 2013-05-31  6:22                               ` Eugene Istomin
  2013-05-31  8:20                               ` Wei Liu
  1 sibling, 0 replies; 85+ messages in thread
From: Eugene Istomin @ 2013-05-31  6:22 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Ian Campbell, xen-devel


Ok,

thank you.
-- 
Best regards,
Eugene Istomin



> >>> On 31.05.13 at 05:37, Eugene Istomin <e.istomin@edss.ee> 
wrote:
> > do you need any other info? Can I help with resolving this bug?
> 
> No, it's a time allocation thing. For one, I'm waiting for Wei to find
> time to look at whether the issue I spotted is a real one. And I'm
> myself short on time, which prevents me from getting to merging the
> grant copy changes into our netback. Unless you can do either of these,
> you'll have to be patient, I'm afraid.
> 
> Jan


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-31  6:19                             ` Jan Beulich
  2013-05-31  6:22                               ` Eugene Istomin
@ 2013-05-31  8:20                               ` Wei Liu
  1 sibling, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-05-31  8:20 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Eugene Istomin, Ian Campbell, xen-devel

On Fri, May 31, 2013 at 07:19:51AM +0100, Jan Beulich wrote:
> >>> On 31.05.13 at 05:37, Eugene Istomin <e.istomin@edss.ee> wrote:
> > do you need any other info? Can I help with resolving this bug?
> 
> No, it's a time allocation thing. For one, I'm waiting for Wei to find
> time to look at whether the issue I spotted is a real one. And I'm
> myself short on time, which prevents me from getting to merging the
> grant copy changes into our netback. Unless you can do either of these,
> you'll have to be patient, I'm afraid.
> 

Sorry for the delay. I might be able to allocate some time today to have
a look at the issue you spotted.


Wei.

> Jan

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-21 15:28                     ` Wei Liu
@ 2013-05-31 15:47                       ` Wei Liu
  2013-05-31 16:00                         ` Jan Beulich
  0 siblings, 1 reply; 85+ messages in thread
From: Wei Liu @ 2013-05-31 15:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Eugene Istomin, Ian Campbell, xen-devel

On Tue, May 21, 2013 at 04:28:47PM +0100, Wei Liu wrote:
> On Tue, May 21, 2013 at 10:55:00AM +0100, Jan Beulich wrote:
> > >>> On 17.05.13 at 15:00, Eugene Istomin <e.istomin@edss.ee> wrote:
> > > Bump, here it is:
> > 
> > Okay, but I think we're still lacking information on what your
> > Dom0 kernel is.
> > 
> > Ian, Wei - looking at the forward ported as well as the upstream
> > frontends, I'm wondering if there isn't a fundamental flaw in
> > ..._get_responses(): It allows up to MAX_SKB_FRAGS + 1 slots/
> > frags, and ..._fill_frags() then fills as many fragments as got
> > queued onto the respective list. Only after both of them are done,
> > __pskb_pull_tail() gets invoked reducing the fragment count by
> > one if the condition is met that made ..._get_responses() bump
> > the limit by one.
> > 
> > Am I overlooking something? I'm asking because working through
> > disassembly and register values of the dump Eugene had sent I
> > clearly see that ..._fill_frags() is processing the 18th fragment,
> 
> > while in that kernel version MAX_SKB_FRAGS is only 17 (but
> > possibly, hence the first question above, the Dom0 kernel still is
> > one with MAX_SKB_FRAGS being 18, or the packet turns out to
> 
> I have the same suspection. However I just did a quick hack to bump
> MAX_SKB_FRAGS to 20 in backend, DomU worked just fine. 
> 
> > be such that it fills 17 fragments and the header is smaller than
> > RX_COPY_THRESHOLD).
> 

Eugene,

I have more or less the same suspicion as Jan, but as I could not
effectively reproduce this problem upstream, could you please apply the
following patch to the openSUSE kernel and see what happens.


Thanks
Wei.

-----
diff --git a/drivers/xen/netfront/netfront.c b/drivers/xen/netfront/netfront.c
index 6e5d233..ab7e35b 100644
--- a/drivers/xen/netfront/netfront.c
+++ b/drivers/xen/netfront/netfront.c
@@ -1319,6 +1319,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
 		kfree_skb(nskb);
 
 		nr_frags++;
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
 	}
 
 	shinfo->nr_frags = nr_frags;

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-31 15:47                       ` Wei Liu
@ 2013-05-31 16:00                         ` Jan Beulich
  2013-05-31 16:05                           ` Wei Liu
  0 siblings, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-05-31 16:00 UTC (permalink / raw)
  To: Wei Liu; +Cc: Eugene Istomin, Ian Campbell, xen-devel

>>> On 31.05.13 at 17:47, Wei Liu <wei.liu2@citrix.com> wrote:
> --- a/drivers/xen/netfront/netfront.c
> +++ b/drivers/xen/netfront/netfront.c
> @@ -1319,6 +1319,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>  		kfree_skb(nskb);
>  
>  		nr_frags++;
> +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);

Isn't this going to see false positives? I would have thought such a
check would need to go at the top of the loop body.

Jan

>  	}
>  
>  	shinfo->nr_frags = nr_frags;

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-31 16:00                         ` Jan Beulich
@ 2013-05-31 16:05                           ` Wei Liu
  0 siblings, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-05-31 16:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Eugene Istomin, Wei Liu, Ian Campbell, xen-devel

On Fri, May 31, 2013 at 05:00:32PM +0100, Jan Beulich wrote:
> >>> On 31.05.13 at 17:47, Wei Liu <wei.liu2@citrix.com> wrote:
> > --- a/drivers/xen/netfront/netfront.c
> > +++ b/drivers/xen/netfront/netfront.c
> > @@ -1319,6 +1319,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
> >  		kfree_skb(nskb);
> >  
> >  		nr_frags++;
> > +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> 
> Isn't this going to see false positives? I would have thought such a
> check would need to go at the top of the loop body.
> 
> Jan
> 

Ha, yes.

Please use this one.


Wei.

-----
diff --git a/drivers/xen/netfront/netfront.c b/drivers/xen/netfront/netfront.c
index 6e5d233..9583011 100644
--- a/drivers/xen/netfront/netfront.c
+++ b/drivers/xen/netfront/netfront.c
@@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
 	struct sk_buff *nskb;
 
 	while ((nskb = __skb_dequeue(list))) {
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
 		struct netif_rx_response *rx =
 			RING_GET_RESPONSE(&np->rx, ++cons);

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-05-17  8:59 ` [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Wei Liu
  2013-05-17  9:08   ` Jan Beulich
  2013-05-17  9:37   ` Eugene Istomin
@ 2013-07-04 13:43   ` Dion Kant
  2013-07-04 14:18     ` Jan Beulich
  2013-07-04 15:01     ` Wei Liu
  2 siblings, 2 replies; 85+ messages in thread
From: Dion Kant @ 2013-07-04 13:43 UTC (permalink / raw)
  To: xen-users, xen-devel

Hello Wei and all other interested people,

I saw this thread from around May. It went silent after your post on
May 31.

Is there any progress on this problem?

I am running into this issue as well with the openSUSE 12.3
distribution. This is with their 3.7.10-1.16-xen kernel and Xen version
4.2.1_12-1.12.10. On the net I see some discussion of people hitting
this issue but not that much.  E.g., one of the symptoms is that a guest
crashes when running zypper install or zypper update when the Internet
connection is fast enough.

openSUSE 3.4.x kernels run fine as guests on top of the openSUSE 12.3
Xen distribution, but apparently from 3.7.10 onwards this issue is
present.

I have already spent quite some time getting a grip on the issue. I
filed a bug at bugzilla.novell.com but got no response; see
https://bugzilla.novell.com/show_bug.cgi?id=826374 for details.
Apparently, hitting this bug (i.e. making it all the way to the crash)
requires hardware that is not too slow. By this I mean it is easy to
find hardware which is unable to reproduce the issue.

In one of my recent experiments I changed the SLAB allocator to SLUB,
which provides more detailed kernel logging. Here is the log output
after the first detected issue regarding xennet:

2013-07-03T23:51:16.560229+02:00 domUA kernel: [   97.562370] netfront:
Too many frags
2013-07-03T23:51:17.228143+02:00 domUA kernel: [   98.230466] netfront:
Too many frags
2013-07-03T23:51:17.596074+02:00 domUA kernel: [   98.597300] netfront:
Too many frags
2013-07-03T23:51:18.740215+02:00 domUA kernel: [   99.743080]
net_ratelimit: 2 callbacks suppressed
2013-07-03T23:51:18.740242+02:00 domUA kernel: [   99.743084] netfront:
Too many frags
2013-07-03T23:51:19.104100+02:00 domUA kernel: [  100.104281] netfront:
Too many frags
2013-07-03T23:51:19.760134+02:00 domUA kernel: [  100.760594] netfront:
Too many frags
2013-07-03T23:51:21.820154+02:00 domUA kernel: [  102.821202] netfront:
Too many frags
2013-07-03T23:51:22.192188+02:00 domUA kernel: [  103.192655] netfront:
Too many frags
2013-07-03T23:51:26.060144+02:00 domUA kernel: [  107.062447] netfront:
Too many frags
2013-07-03T23:51:26.412116+02:00 domUA kernel: [  107.415165] netfront:
Too many frags
2013-07-03T23:51:27.092147+02:00 domUA kernel: [  108.094615] netfront:
Too many frags
2013-07-03T23:51:27.492112+02:00 domUA kernel: [  108.494255] netfront:
Too many frags
2013-07-03T23:51:27.520194+02:00 domUA kernel: [  108.522445]
=============================================================================
2013-07-03T23:51:27.520206+02:00 domUA kernel: [  108.522448] BUG
kmalloc-1024 (Tainted: G        W   ): Redzone overwritten
2013-07-03T23:51:27.520209+02:00 domUA kernel: [  108.522450]
-----------------------------------------------------------------------------
2013-07-03T23:51:27.520212+02:00 domUA kernel: [  108.522450]
2013-07-03T23:51:27.520215+02:00 domUA kernel: [  108.522452] Disabling
lock debugging due to kernel taint
2013-07-03T23:51:27.520217+02:00 domUA kernel: [  108.522454] INFO:
0xffff8800f66068f8-0xffff8800f66068ff. First byte 0x0 instead of 0xcc
2013-07-03T23:51:27.520220+02:00 domUA kernel: [  108.522461] INFO:
Allocated in __alloc_skb+0x88/0x260 age=11 cpu=0 pid=1325
2013-07-03T23:51:27.520223+02:00 domUA kernel: [  108.522466]  
set_track+0x6c/0x190
2013-07-03T23:51:27.520225+02:00 domUA kernel: [  108.522470]  
alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.520228+02:00 domUA kernel: [  108.522472]  
__slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.520231+02:00 domUA kernel: [  108.522474]  
__kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.520233+02:00 domUA kernel: [  108.522477]  
__kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.520236+02:00 domUA kernel: [  108.522478]  
__alloc_skb+0x88/0x260
2013-07-03T23:51:27.520239+02:00 domUA kernel: [  108.522483]  
network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.520241+02:00 domUA kernel: [  108.522486]  
netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.520243+02:00 domUA kernel: [  108.522489]  
net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520246+02:00 domUA kernel: [  108.522493]  
__do_softirq+0x127/0x280
2013-07-03T23:51:27.520248+02:00 domUA kernel: [  108.522496]  
call_softirq+0x1c/0x30
2013-07-03T23:51:27.520251+02:00 domUA kernel: [  108.522499]  
do_softirq+0x56/0xd0
2013-07-03T23:51:27.520253+02:00 domUA kernel: [  108.522501]  
irq_exit+0x52/0xd0
2013-07-03T23:51:27.520256+02:00 domUA kernel: [  108.522503]  
evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520258+02:00 domUA kernel: [  108.522505]  
do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520261+02:00 domUA kernel: [  108.522507]  
0x7f45f0a2f1e0
2013-07-03T23:51:27.520263+02:00 domUA kernel: [  108.522509] INFO:
Freed in skb_free_head+0x5c/0x70 age=14 cpu=0 pid=1325
2013-07-03T23:51:27.520266+02:00 domUA kernel: [  108.522512]  
set_track+0x6c/0x190
2013-07-03T23:51:27.520269+02:00 domUA kernel: [  108.522513]  
free_debug_processing+0x151/0x201
2013-07-03T23:51:27.520271+02:00 domUA kernel: [  108.522515]  
__slab_free+0x47/0x499
2013-07-03T23:51:27.520274+02:00 domUA kernel: [  108.522517]  
kfree+0x1df/0x230
2013-07-03T23:51:27.520276+02:00 domUA kernel: [  108.522519]  
skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520279+02:00 domUA kernel: [  108.522521]  
skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520281+02:00 domUA kernel: [  108.522522]  
__kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520284+02:00 domUA kernel: [  108.522524]  
kfree_skb+0x80/0xc0
2013-07-03T23:51:27.520286+02:00 domUA kernel: [  108.522527]  
netif_poll+0x824/0xf30 [xennet]
2013-07-03T23:51:27.520289+02:00 domUA kernel: [  108.522529]  
net_rx_action+0xf0/0x2e0
2013-07-03T23:51:27.520291+02:00 domUA kernel: [  108.522530]  
__do_softirq+0x127/0x280
2013-07-03T23:51:27.520294+02:00 domUA kernel: [  108.522532]  
call_softirq+0x1c/0x30
2013-07-03T23:51:27.520296+02:00 domUA kernel: [  108.522534]  
do_softirq+0x56/0xd0
2013-07-03T23:51:27.520299+02:00 domUA kernel: [  108.522536]  
irq_exit+0x52/0xd0
2013-07-03T23:51:27.520302+02:00 domUA kernel: [  108.522538]  
evtchn_do_upcall+0x281/0x2e7
2013-07-03T23:51:27.520304+02:00 domUA kernel: [  108.522539]  
do_hypervisor_callback+0x1e/0x30
2013-07-03T23:51:27.520307+02:00 domUA kernel: [  108.522541] INFO: Slab
0xffff8800ffd78100 objects=12 used=7 fp=0xffff8800f66074d0
flags=0x400000000000408
2013-07-03T23:51:27.520310+02:00 domUA kernel: [  108.522543] INFO:
Object 0xffff8800f66064f8 @offset=9464 fp=0x0000018800000000
2013-07-03T23:51:27.520312+02:00 domUA kernel: [  108.522543]
2013-07-03T23:51:27.520315+02:00 domUA kernel: [  108.522546] Bytes b4
ffff8800f66064e8: 4a 40 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a 
J@......ZZZZZZZZ
2013-07-03T23:51:27.520318+02:00 domUA kernel: [  108.522548] Object
ffff8800f66064f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520320+02:00 domUA kernel: [  108.522549] Object
ffff8800f6606508: 00 16 3e 29 7e 3c 00 25 90 69 ea 4e 08 00 45 08 
..>)~<.%.i.N..E.
2013-07-03T23:51:27.520323+02:00 domUA kernel: [  108.522551] Object
ffff8800f6606518: fe bc 46 d7 40 00 40 06 d3 69 0a 57 06 91 0a 57 
..F.@.@..i.W...W
2013-07-03T23:51:27.520326+02:00 domUA kernel: [  108.522553] Object
ffff8800f6606528: 06 b4 9b 86 00 16 57 4d 5e bd 89 4c 40 ad 80 10 
......WM^..L@...
2013-07-03T23:51:27.520329+02:00 domUA kernel: [  108.522554] Object
ffff8800f6606538: 00 a6 20 a2 00 00 01 01 08 0a 01 eb 40 a7 ff ff  ..
.........@...
2013-07-03T23:51:27.520332+02:00 domUA kernel: [  108.522556] Object
ffff8800f6606548: 44 fa 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
D.kkkkkkkkkkkkkk
2013-07-03T23:51:27.520335+02:00 domUA kernel: [  108.522557] Object
ffff8800f6606558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk
2013-07-03T23:51:27.520337+02:00 domUA kernel: [  108.522559] Object
ffff8800f6606568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
kkkkkkkkkkkkkkkk

Skipping some of the object dumping.......

2013-07-03T23:51:27.520583+02:00 domUA kernel: [  108.522644] Object
ffff8800f66068d8: 00 d7 e4 ff 00 88 ff ff 00 00 00 00 00 10 00 00 
................
2013-07-03T23:51:27.520586+02:00 domUA kernel: [  108.522646] Object
ffff8800f66068e8: 00 92 dd ff 00 88 ff ff 00 00 00 00 88 01 00 00 
................
2013-07-03T23:51:27.520588+02:00 domUA kernel: [  108.522647] Redzone
ffff8800f66068f8: 00 92 dd ff 00 88 ff ff                          ........
2013-07-03T23:51:27.520591+02:00 domUA kernel: [  108.522649] Padding
ffff8800f6606a38: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
2013-07-03T23:51:27.520594+02:00 domUA kernel: [  108.522651] Pid: 1325,
comm: sshd Tainted: G    B   W    3.7.10-1.16-dbg-xen #3
2013-07-03T23:51:27.520597+02:00 domUA kernel: [  108.522652] Call Trace:
2013-07-03T23:51:27.520599+02:00 domUA kernel: [  108.522658] 
[<ffffffff8000b097>] try_stack_unwind+0x87/0x1c0
2013-07-03T23:51:27.520602+02:00 domUA kernel: [  108.522662] 
[<ffffffff80008fa5>] dump_trace+0xd5/0x250
2013-07-03T23:51:27.520605+02:00 domUA kernel: [  108.522665] 
[<ffffffff8000b22c>] show_trace_log_lvl+0x5c/0x80
2013-07-03T23:51:27.520608+02:00 domUA kernel: [  108.522668] 
[<ffffffff8000b265>] show_trace+0x15/0x20
2013-07-03T23:51:27.520610+02:00 domUA kernel: [  108.522672] 
[<ffffffff80553a69>] dump_stack+0x77/0x80
2013-07-03T23:51:27.520612+02:00 domUA kernel: [  108.522676] 
[<ffffffff801491b1>] print_trailer+0x131/0x140
2013-07-03T23:51:27.520615+02:00 domUA kernel: [  108.522680] 
[<ffffffff80149709>] check_bytes_and_report+0xc9/0x120
2013-07-03T23:51:27.520617+02:00 domUA kernel: [  108.522683] 
[<ffffffff8014a7f6>] check_object+0x56/0x240
2013-07-03T23:51:27.520620+02:00 domUA kernel: [  108.522687] 
[<ffffffff805575b6>] free_debug_processing+0xc4/0x201
2013-07-03T23:51:27.520622+02:00 domUA kernel: [  108.522690] 
[<ffffffff8055773a>] __slab_free+0x47/0x499
2013-07-03T23:51:27.520625+02:00 domUA kernel: [  108.522694] 
[<ffffffff8014beff>] kfree+0x1df/0x230
2013-07-03T23:51:27.520627+02:00 domUA kernel: [  108.522697] 
[<ffffffff8044a8cc>] skb_free_head+0x5c/0x70
2013-07-03T23:51:27.520630+02:00 domUA kernel: [  108.522701] 
[<ffffffff8044a9ca>] skb_release_data+0xea/0xf0
2013-07-03T23:51:27.520632+02:00 domUA kernel: [  108.522704] 
[<ffffffff8044a9ee>] __kfree_skb+0x1e/0xb0
2013-07-03T23:51:27.520635+02:00 domUA kernel: [  108.522709] 
[<ffffffff8049fa2a>] tcp_recvmsg+0x99a/0xd50
2013-07-03T23:51:27.520637+02:00 domUA kernel: [  108.522714] 
[<ffffffff804c796d>] inet_recvmsg+0xed/0x110
2013-07-03T23:51:27.520640+02:00 domUA kernel: [  108.522718] 
[<ffffffff80440be8>] sock_aio_read+0x158/0x190
2013-07-03T23:51:27.520642+02:00 domUA kernel: [  108.522722] 
[<ffffffff8015cb68>] do_sync_read+0x98/0xf0
2013-07-03T23:51:27.520645+02:00 domUA kernel: [  108.522726] 
[<ffffffff8015d32d>] vfs_read+0xbd/0x180
2013-07-03T23:51:27.520647+02:00 domUA kernel: [  108.522729] 
[<ffffffff8015d442>] sys_read+0x52/0xa0
2013-07-03T23:51:27.520650+02:00 domUA kernel: [  108.522733] 
[<ffffffff8056ab3b>] system_call_fastpath+0x1a/0x1f
2013-07-03T23:51:27.520652+02:00 domUA kernel: [  108.522736] 
[<00007f45ef74c960>] 0x7f45ef74c95f
2013-07-03T23:51:27.520655+02:00 domUA kernel: [  108.522738] FIX
kmalloc-1024: Restoring 0xffff8800f66068f8-0xffff8800f66068ff=0xcc
2013-07-03T23:51:27.520657+02:00 domUA kernel: [  108.522738]
2013-07-03T23:51:27.679444+02:00 domUA kernel: [  108.671750]
=============================================================================
2013-07-03T23:51:27.679454+02:00 domUA kernel: [  108.671753] BUG
kmalloc-1024 (Tainted: G    B   W   ): Redzone overwritten
2013-07-03T23:51:27.679456+02:00 domUA kernel: [  108.671754]
-----------------------------------------------------------------------------
2013-07-03T23:51:27.679458+02:00 domUA kernel: [  108.671754]
2013-07-03T23:51:27.679460+02:00 domUA kernel: [  108.671757] INFO:
0xffff8800f66068f8-0xffff8800f66068ff. First byte 0xcc instead of 0xbb
2013-07-03T23:51:27.679462+02:00 domUA kernel: [  108.671762] INFO:
Allocated in __alloc_skb+0x88/0x260 age=48 cpu=0 pid=1325
2013-07-03T23:51:27.679464+02:00 domUA kernel: [  108.671765]  
set_track+0x6c/0x190
2013-07-03T23:51:27.679466+02:00 domUA kernel: [  108.671767]  
alloc_debug_processing+0x83/0x109
2013-07-03T23:51:27.679468+02:00 domUA kernel: [  108.671769]  
__slab_alloc.constprop.48+0x523/0x593
2013-07-03T23:51:27.679469+02:00 domUA kernel: [  108.671771]  
__kmalloc_track_caller+0xb4/0x200
2013-07-03T23:51:27.679471+02:00 domUA kernel: [  108.671773]  
__kmalloc_reserve+0x3c/0xa0
2013-07-03T23:51:27.679473+02:00 domUA kernel: [  108.671775]  
__alloc_skb+0x88/0x260
2013-07-03T23:51:27.679475+02:00 domUA kernel: [  108.671778]  
network_alloc_rx_buffers+0x76/0x5f0 [xennet]
2013-07-03T23:51:27.679476+02:00 domUA kernel: [  108.671781]  
netif_poll+0xcf4/0xf30 [xennet]
2013-07-03T23:51:27.679478+02:00 domUA kernel: [  108.671783]  
net_rx_action+0xf0/0x2e0

I noticed that after turning on all this debugging, a real panic no
longer appears.
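
[Editor's note: for decoding the SLUB dumps above: 0x6b ('k') fills freed objects, 0x5a ('Z') fills padding, and the redzone bytes are 0xcc while an object is allocated and 0xbb once it is freed, so "First byte 0x0 instead of 0xcc" means a live object's redzone was overwritten with zeros. A minimal lookup sketch; the constants match include/linux/poison.h, the helper itself is only illustrative:

```c
#include <assert.h>
#include <string.h>

/* SLUB debug byte patterns, as defined in include/linux/poison.h. */
#define POISON_INUSE      0x5a  /* 'Z': padding around objects        */
#define POISON_FREE       0x6b  /* 'k': object payload after kfree()  */
#define POISON_END        0xa5  /* last byte of a poisoned object     */
#define SLUB_RED_INACTIVE 0xbb  /* redzone while the object is free   */
#define SLUB_RED_ACTIVE   0xcc  /* redzone while the object is in use */

/* Illustrative decoder for one byte of a SLUB debug hex dump. */
static const char *slub_byte_meaning(unsigned char b)
{
    switch (b) {
    case POISON_INUSE:      return "padding";
    case POISON_FREE:       return "freed-object poison";
    case POISON_END:        return "end of poisoned object";
    case SLUB_RED_INACTIVE: return "inactive redzone";
    case SLUB_RED_ACTIVE:   return "active redzone";
    default:                return "payload or corruption";
    }
}
```
]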

This happens while copying a file with scp from dom0 to the guest (scp
bigfile domu:/dev/null).

In my lab, I am currently experimenting with a SuperMicro based system
with Xen showing the following characteristics:

__  __            _  _    ____    _     _ ____     _   _ ____    _  ___ 
 \ \/ /___ _ __   | || |  |___ \  / |   / |___ \   / | / |___ \  / |/ _ \
  \  // _ \ '_ \  | || |_   __) | | |   | | __) |__| | | | __) | | | | | |
  /  \  __/ | | | |__   _| / __/ _| |   | |/ __/|__| |_| |/ __/ _| | |_| |
 /_/\_\___|_| |_|    |_|(_)_____(_)_|___|_|_____|  |_(_)_|_____(_)_|\___/
                                   |_____|                               
(XEN) Xen version 4.2.1_12-1.12.10 (abuild@) (gcc (SUSE Linux) 4.7.2
20130108 [gcc-4_7-branch revision 195012]) Wed May 29 20:31:49 UTC 2013
(XEN) Latest ChangeSet: 25952
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 4 MBR signatures
(XEN)  Found 4 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000096400 (usable)
(XEN)  0000000000096400 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000bf780000 (usable)
(XEN)  00000000bf78e000 - 00000000bf790000 type 9
(XEN)  00000000bf790000 - 00000000bf79e000 (ACPI data)
(XEN)  00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
(XEN)  00000000bf7d0000 - 00000000bf7e0000 (reserved)
(XEN)  00000000bf7ec000 - 00000000c0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ffc00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000340000000 (usable)

Skipping ACPI and SRAT

(XEN) System RAM: 12279MB (12573784kB)

(XEN) NUMA: Allocated memnodemap from 33e38a000 - 33e38e000
(XEN) NUMA: Using 8 for the hash shift.
(XEN) Domain heap initialised DMA width 30 bits
(XEN) found SMP MP-table at 000ff780
(XEN) DMI present.

(XEN) Enabling APIC mode:  Phys.  Using 2 I/O APICs
(XEN) ACPI: HPET id: 0x8086a301 base: 0xfed00000
(XEN) Failed to get Error Log Address Range.
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 24 CPUs (8 hotplug CPUs)
(XEN) IRQ limits: 48 GSI, 3040 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2400.115 MHz processor.
(XEN) Initing memory sharing.
(XEN) mce_intel.c:1238: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0
extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 128 KiB.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 16 CPUs
(XEN) ACPI sleep modes: S3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0xa65000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000336000000->0000000337000000 (516915 pages
to be allocated)
(XEN)  Init. ramdisk: 000000033f333000->0000000340000000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80002000->ffffffff80a65000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN)  Start info:    ffffffff80a65000->ffffffff80a654b4
(XEN)  Page tables:   ffffffff80a66000->ffffffff80a6f000
(XEN)  Boot stack:    ffffffff80a6f000->ffffffff80a70000
(XEN)  TOTAL:         ffffffff80000000->ffffffff80c00000
(XEN)  ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 16 VCPUs
(XEN) Scrubbing Free RAM:
.....................................................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) Xen is relinquishing VGA console.



(XEN) ACPI: RSDP 000FACE0, 0024 (r2 ACPIAM)
(XEN) ACPI: XSDT BF790100, 008C (r1 SMCI            20110827 MSFT       97)
(XEN) ACPI: FACP BF790290, 00F4 (r4 082711 FACP1638 20110827 MSFT       97)
(XEN) ACPI: DSDT BF7906A0, 6563 (r2  10600 10600000        0 INTL 20051117)
(XEN) ACPI: FACS BF79E000, 0040
(XEN) ACPI: APIC BF790390, 011E (r2 082711 APIC1638 20110827 MSFT       97)
(XEN) ACPI: MCFG BF7904B0, 003C (r1 082711 OEMMCFG  20110827 MSFT       97)
(XEN) ACPI: SLIT BF7904F0, 0030 (r1 082711 OEMSLIT  20110827 MSFT       97)
(XEN) ACPI: OEMB BF79E040, 0085 (r1 082711 OEMB1638 20110827 MSFT       97)
(XEN) ACPI: SRAT BF79A6A0, 01D0 (r2 082711 OEMSRAT         1 INTL        1)
(XEN) ACPI: HPET BF79A870, 0038 (r1 082711 OEMHPET  20110827 MSFT       97)
(XEN) ACPI: DMAR BF79E0D0, 0130 (r1    AMI  OEMDMAR        1 MSFT       97)
(XEN) ACPI: SSDT BF7A1B30, 0363 (r1 DpgPmm    CpuPm       12 INTL 20051117)
(XEN) ACPI: EINJ BF79A8B0, 0130 (r1  AMIER AMI_EINJ 20110827 MSFT       97)
(XEN) ACPI: BERT BF79AA40, 0030 (r1  AMIER AMI_BERT 20110827 MSFT       97)
(XEN) ACPI: ERST BF79AA70, 01B0 (r1  AMIER AMI_ERST 20110827 MSFT       97)
(XEN) ACPI: HEST BF79AC20, 00A8 (r1  AMIER ABC_HEST 20110827 MSFT       97)
(XEN) System RAM: 12279MB (12573784kB)
(XEN) SRAT: PXM 0 -> APIC 0 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 2 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 18 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 20 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 1 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 3 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 19 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 21 -> Node 0

I am happy to assist with more kernel probing. I can even set up
access to this machine for someone.

Best regards,

Dion Kant

On 05/17/2013 10:59 AM, Wei Liu wrote:
> Moving discussion to Xen-devel
>
> On Thu, May 16, 2013 at 10:29:56PM +0300, Eugene Istomin wrote:
>> Hello,
>>
>> I tried to use 3.9.2 kernel with xen 4.2.2/4.3rc1 and in both variants leads 
>> to this error in network-intensive load (such as iperf, 100 nginx parallel 
>> requests to 1M files and so on):
>>
> It would be more helpful if you can provide info on your configurations
> (Dom0 and DomU), your workload, how to reproduce the bug.
>
> I run iperf and NFS to test Xen network, but never see any crash like
> this myself.
>
>> BUG: unable to handle kernel paging request at ffff8800795a3000
>> [   60.246945] IP: [<ffffffffa001a75c>] netif_poll+0x49c/0xe80 [xennet]
>> [   60.246975] PGD a8a067 PUD a9a067 PMD 7fc27067 PTE 
>> 80100000795a3065
>> [   60.247004] Oops: 0003 [#1] SMP 
>> [   60.247020] Modules linked in: af_packet hwmon domctl crc32_pclmul 
>> crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw 
>> aes_x86_64 joydev xts gf128mul autofs4 scsi_dh_emc scsi_dh_alua 
>> scsi_dh_rdac scsi_dh_hp_sw scsi_dh xenblk cdrom xennet ata_generic 
>> ata_piix
>> [   60.247144] CPU 0 
>> [   60.247154] Pid: 0, comm: swapper/0 Not tainted 3.9.2-1.g04040b9-xen 
>> #1  
>> [   60.247179] RIP: e030:[<ffffffffa001a75c>]  [<ffffffffa001a75c>] 
>> netif_poll+0x49c/0xe80 [xennet]
>> ...
> Could you provide fuul stack trace? AFAICT there is no netif_poll in Xen
> netfront/back.
>
> Presumably this is Dom0 log? (from the domctl module)
>
>> We have couple of production hypervisors on 3.4 kernels with high-
>> throughput  internal network (VM-to-VM in one Dom0), iperf on them is 
>> working well:
>> [  3]  0.0- 2.0 sec  3357 MBytes  14080 Mbits/sec
>> [  3]  2.0- 4.0 sec  2880 MBytes  12077 Mbits/sec
>> [  3]  4.0- 6.0 sec  2909 MBytes  12202 Mbits/sec
>> [  3]  6.0- 8.0 sec  2552 MBytes  10702 Mbits/sec
>> [  3]  8.0-10.0 sec  3616 MBytes  15166 Mbits/sec
>> [  3] 10.0-12.0 sec  3415 MBytes  14324 Mbits/sec
>>
>>
>> Seems like a kernel bug, is this related to one of this fixes in linux-next or i 
>> need to create new bugreport?
>>
>> 1) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=1aaf6d3d3d1e95f4be07e32dd84aa1c93855fbbd
>> 2) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9ecd1a75d977e2e8c48139c7d3efed183f898d94
>> 3) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f
>> 4) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03393fd5cc2b6cdeec32b704ecba64dbb0feae3c
>> 5) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=59ccb4ebbc35e36a3c143f2d1355deb75c2e628f
>>
> I don't think these patches can fix your problem at first glance.
>
>
> Wei.
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@lists.xen.org
> http://lists.xen.org/xen-users


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-04 13:43   ` Dion Kant
@ 2013-07-04 14:18     ` Jan Beulich
  2013-07-04 15:01     ` Wei Liu
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-04 14:18 UTC (permalink / raw)
  To: Dion Kant; +Cc: xen-devel

>>> On 04.07.13 at 15:43, Dion Kant <g.w.kant@hunenet.nl> wrote:
> Hello Wei and all other interested people,
> 
> I saw this thread from around May. It got silent on this thread after
> your post on May 31.
> 
> Is there any progress on this problem?

We didn't receive feedback on the effect of the debugging patch
Wei had posted on that day
(http://lists.xenproject.org/archives/html/xen-devel/2013-05/msg03138.html).

Without knowing whether the possible issue pointed out earlier in
the thread is an actual one, or whether we need to look for
something else, we can't really make any progress here.

> I spent already quite some time in getting grip on the issue. I added a
> bug to bugzilla.novell.com but no response. See
> https://bugzilla.novell.com/show_bug.cgi?id=826374 for details.
> Apparently for hitting this bug (i.e. make it all the way to the crash),
> it is required to use some hardware which performs not too slow. With
> this I mean it is easy to find hardware which is unable to reproduce the
> issue.

You having picked the wrong component (Kernel instead of Xen)
may be part of the problem.

> 2013-07-03T23:51:16.560229+02:00 domUA kernel: [   97.562370] netfront:
> Too many frags

The fix for this is already in the queue (and half of it already
committed, at least to our master branch). But this isn't necessarily
connected to the crash/corruption you observe (though I do think that
the crash and the corruption are just two different manifestations of
the same problem).

Jan


* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-04 13:43   ` Dion Kant
  2013-07-04 14:18     ` Jan Beulich
@ 2013-07-04 15:01     ` Wei Liu
  2013-07-05  9:32       ` [PATCH] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
                         ` (3 more replies)
  1 sibling, 4 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-04 15:01 UTC (permalink / raw)
  To: Dion Kant; +Cc: wei.liu2, jbeulich, xen-devel

Dropping Xen-users, CCing Jan. This should be discussed on Xen-devel.

On Thu, Jul 04, 2013 at 03:43:59PM +0200, Dion Kant wrote:
> Hello Wei and all other interested people,
> 
> I saw this thread from around May. It got silent on this thread after
> your post on May 31.
> 
> Is there any progress on this problem?
> 

Sorry, no. I haven't been able to allocate time slots, and as this is
an openSUSE kernel with forward-ported patches it takes a little more
time to follow the code.

Last time I asked the reporter to run a test, but he has not come back
yet.

> I am running into this issue as well with the openSUSE 12.3
> distribution. This is with their 3.7.10-1.16-xen kernel and Xen version
> 4.2.1_12-1.12.10. On the net I see some discussion of people hitting
> this issue but not that much.  E.g., one of the symptoms is that a guest
> crashes when running zypper install or zypper update when the Internet
> connection is fast enough.
> 

Do you have references to other reports?

> OpenSUSE 3.4.X kernels are running ok as guest on top of the openSUSE
> 12.3 Xen distribution, but apparently since 3.7.10 and higher there is
> this issue.
> 
> I spent already quite some time in getting grip on the issue. I added a
> bug to bugzilla.novell.com but no response. See
> https://bugzilla.novell.com/show_bug.cgi?id=826374 for details.
> Apparently for hitting this bug (i.e. make it all the way to the crash),
> it is required to use some hardware which performs not too slow. With
> this I mean it is easy to find hardware which is unable to reproduce the
                                                    able?
> issue.
> 

I'm not quite sure what you mean. Do you mean this bug can only be
triggered when your receive path has a real hardware NIC involved?

And reading your test case below it doesn't seem so. Dom0 to DomU
transmission crashes the guest per your example.

> In one of my recent experiments I changed the SLAB allocater to SLUB
> which provides more detailed kernel logging. Here is the log output
> after the first detected issue regarding xennet:
> 

But the log below is not about SLUB. I cannot understand why SLAB vs.
SLUB makes a difference.
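
[Editor's note: SLUB's debug checks (red zoning, poisoning, alloc/free tracking) are what produce reports like the one quoted below. On a kernel built with CONFIG_SLUB_DEBUG they can be switched on at boot; a sketch of the documented parameter, flag letters per Documentation/vm/slub.txt:

```shell
# Kernel boot parameters (see Documentation/vm/slub.txt):
#   F = sanity checks, Z = red zoning, P = poisoning, U = alloc/free tracking
slub_debug=FZPU
# Or limit the overhead to a single cache, e.g. the kmalloc-1024
# cache implicated in the report:
slub_debug=FZPU,kmalloc-1024
```
]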

> 2013-07-03T23:51:16.560229+02:00 domUA kernel: [   97.562370] netfront:
> Too many frags
> 2013-07-03T23:51:17.228143+02:00 domUA kernel: [   98.230466] netfront:
> Too many frags
> 2013-07-03T23:51:17.596074+02:00 domUA kernel: [   98.597300] netfront:
> Too many frags
> 2013-07-03T23:51:18.740215+02:00 domUA kernel: [   99.743080]
> net_ratelimit: 2 callbacks suppressed
> 2013-07-03T23:51:18.740242+02:00 domUA kernel: [   99.743084] netfront:
> Too many frags
> 2013-07-03T23:51:19.104100+02:00 domUA kernel: [  100.104281] netfront:
> Too many frags
> 2013-07-03T23:51:19.760134+02:00 domUA kernel: [  100.760594] netfront:
> Too many frags
> 2013-07-03T23:51:21.820154+02:00 domUA kernel: [  102.821202] netfront:
> Too many frags
> 2013-07-03T23:51:22.192188+02:00 domUA kernel: [  103.192655] netfront:
> Too many frags
> 2013-07-03T23:51:26.060144+02:00 domUA kernel: [  107.062447] netfront:
> Too many frags
> 2013-07-03T23:51:26.412116+02:00 domUA kernel: [  107.415165] netfront:
> Too many frags
> 2013-07-03T23:51:27.092147+02:00 domUA kernel: [  108.094615] netfront:
> Too many frags
> 2013-07-03T23:51:27.492112+02:00 domUA kernel: [  108.494255] netfront:
> Too many frags
> 2013-07-03T23:51:27.520194+02:00 domUA kernel: [  108.522445]

"Too many frags" means your frontend is generating malformed packets.
This is not normal. And apparently you didn't use the latest kernel in
tree because the log message should be "Too many slots" in the latest
OpenSuSE kernel.
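
[Editor's note: as a hedged sketch of the background, the message stems from a frags-versus-slots mismatch: a fragment occupies one backend ring slot per page it touches, so a page-misaligned or multi-page frag layout can need more slots than the frag count suggests. A toy calculation, not the real netback code:

```c
#include <assert.h>

#define PAGE_SIZE 4096

/* Toy model (not the real netback code): number of ring slots a buffer
 * needs.  A buffer consumes one slot per page it touches, so
 * page-misaligned or multi-page frags inflate the slot count beyond
 * the frag count -- the mismatch behind "Too many frags". */
static int slots_for_buffer(int page_offset, int size)
{
    return (page_offset % PAGE_SIZE + size + PAGE_SIZE - 1) / PAGE_SIZE;
}
```
]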

> =============================================================================
> 2013-07-03T23:51:27.520206+02:00 domUA kernel: [  108.522448] BUG
> kmalloc-1024 (Tainted: G        W   ): Redzone overwritten
> 2013-07-03T23:51:27.520209+02:00 domUA kernel: [  108.522450]
> -----------------------------------------------------------------------------
> 2013-07-03T23:51:27.520212+02:00 domUA kernel: [  108.522450]
> 2013-07-03T23:51:27.520215+02:00 domUA kernel: [  108.522452] Disabling
> lock debugging due to kernel taint
> 2013-07-03T23:51:27.520217+02:00 domUA kernel: [  108.522454] INFO:
> 0xffff8800f66068f8-0xffff8800f66068ff. First byte 0x0 instead of 0xcc
> 2013-07-03T23:51:27.520220+02:00 domUA kernel: [  108.522461] INFO:
> Allocated in __alloc_skb+0x88/0x260 age=11 cpu=0 pid=1325
> 2013-07-03T23:51:27.520223+02:00 domUA kernel: [  108.522466]  
> set_track+0x6c/0x190
> 2013-07-03T23:51:27.520225+02:00 domUA kernel: [  108.522470]  
> alloc_debug_processing+0x83/0x109
> 2013-07-03T23:51:27.520228+02:00 domUA kernel: [  108.522472]  
> __slab_alloc.constprop.48+0x523/0x593
> 2013-07-03T23:51:27.520231+02:00 domUA kernel: [  108.522474]  
> __kmalloc_track_caller+0xb4/0x200
> 2013-07-03T23:51:27.520233+02:00 domUA kernel: [  108.522477]  
> __kmalloc_reserve+0x3c/0xa0
> 2013-07-03T23:51:27.520236+02:00 domUA kernel: [  108.522478]  
> __alloc_skb+0x88/0x260
> 2013-07-03T23:51:27.520239+02:00 domUA kernel: [  108.522483]  
> network_alloc_rx_buffers+0x76/0x5f0 [xennet]
> 2013-07-03T23:51:27.520241+02:00 domUA kernel: [  108.522486]  
> netif_poll+0xcf4/0xf30 [xennet]
> 2013-07-03T23:51:27.520243+02:00 domUA kernel: [  108.522489]  
> net_rx_action+0xf0/0x2e0
> 2013-07-03T23:51:27.520246+02:00 domUA kernel: [  108.522493]  
> __do_softirq+0x127/0x280
> 2013-07-03T23:51:27.520248+02:00 domUA kernel: [  108.522496]  
> call_softirq+0x1c/0x30
> 2013-07-03T23:51:27.520251+02:00 domUA kernel: [  108.522499]  
> do_softirq+0x56/0xd0
> 2013-07-03T23:51:27.520253+02:00 domUA kernel: [  108.522501]  
> irq_exit+0x52/0xd0
> 2013-07-03T23:51:27.520256+02:00 domUA kernel: [  108.522503]  
> evtchn_do_upcall+0x281/0x2e7
> 2013-07-03T23:51:27.520258+02:00 domUA kernel: [  108.522505]  
> do_hypervisor_callback+0x1e/0x30
> 2013-07-03T23:51:27.520261+02:00 domUA kernel: [  108.522507]  
> 0x7f45f0a2f1e0
> 2013-07-03T23:51:27.520263+02:00 domUA kernel: [  108.522509] INFO:
> Freed in skb_free_head+0x5c/0x70 age=14 cpu=0 pid=1325
> 2013-07-03T23:51:27.520266+02:00 domUA kernel: [  108.522512]  
> set_track+0x6c/0x190
> 2013-07-03T23:51:27.520269+02:00 domUA kernel: [  108.522513]  
> free_debug_processing+0x151/0x201
> 2013-07-03T23:51:27.520271+02:00 domUA kernel: [  108.522515]  
> __slab_free+0x47/0x499
> 2013-07-03T23:51:27.520274+02:00 domUA kernel: [  108.522517]  
> kfree+0x1df/0x230
> 2013-07-03T23:51:27.520276+02:00 domUA kernel: [  108.522519]  
> skb_free_head+0x5c/0x70
> 2013-07-03T23:51:27.520279+02:00 domUA kernel: [  108.522521]  
> skb_release_data+0xea/0xf0
> 2013-07-03T23:51:27.520281+02:00 domUA kernel: [  108.522522]  
> __kfree_skb+0x1e/0xb0
> 2013-07-03T23:51:27.520284+02:00 domUA kernel: [  108.522524]  
> kfree_skb+0x80/0xc0
> 2013-07-03T23:51:27.520286+02:00 domUA kernel: [  108.522527]  
> netif_poll+0x824/0xf30 [xennet]
> 2013-07-03T23:51:27.520289+02:00 domUA kernel: [  108.522529]  
> net_rx_action+0xf0/0x2e0
> 2013-07-03T23:51:27.520291+02:00 domUA kernel: [  108.522530]  
> __do_softirq+0x127/0x280
> 2013-07-03T23:51:27.520294+02:00 domUA kernel: [  108.522532]  
> call_softirq+0x1c/0x30
> 2013-07-03T23:51:27.520296+02:00 domUA kernel: [  108.522534]  
> do_softirq+0x56/0xd0
> 2013-07-03T23:51:27.520299+02:00 domUA kernel: [  108.522536]  
> irq_exit+0x52/0xd0
> 2013-07-03T23:51:27.520302+02:00 domUA kernel: [  108.522538]  
> evtchn_do_upcall+0x281/0x2e7
> 2013-07-03T23:51:27.520304+02:00 domUA kernel: [  108.522539]  
> do_hypervisor_callback+0x1e/0x30
> 2013-07-03T23:51:27.520307+02:00 domUA kernel: [  108.522541] INFO: Slab
> 0xffff8800ffd78100 objects=12 used=7 fp=0xffff8800f66074d0
> flags=0x400000000000408
> 2013-07-03T23:51:27.520310+02:00 domUA kernel: [  108.522543] INFO:
> Object 0xffff8800f66064f8 @offset=9464 fp=0x0000018800000000
> 2013-07-03T23:51:27.520312+02:00 domUA kernel: [  108.522543]
> 2013-07-03T23:51:27.520315+02:00 domUA kernel: [  108.522546] Bytes b4
> ffff8800f66064e8: 4a 40 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a 
> J@......ZZZZZZZZ
> 2013-07-03T23:51:27.520318+02:00 domUA kernel: [  108.522548] Object
> ffff8800f66064f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
> kkkkkkkkkkkkkkkk
> 2013-07-03T23:51:27.520320+02:00 domUA kernel: [  108.522549] Object
> ffff8800f6606508: 00 16 3e 29 7e 3c 00 25 90 69 ea 4e 08 00 45 08 
> ..>)~<.%.i.N..E.
> 2013-07-03T23:51:27.520323+02:00 domUA kernel: [  108.522551] Object
> ffff8800f6606518: fe bc 46 d7 40 00 40 06 d3 69 0a 57 06 91 0a 57 
> ..F.@.@..i.W...W
> 2013-07-03T23:51:27.520326+02:00 domUA kernel: [  108.522553] Object
> ffff8800f6606528: 06 b4 9b 86 00 16 57 4d 5e bd 89 4c 40 ad 80 10 
> ......WM^..L@...
> 2013-07-03T23:51:27.520329+02:00 domUA kernel: [  108.522554] Object
> ffff8800f6606538: 00 a6 20 a2 00 00 01 01 08 0a 01 eb 40 a7 ff ff  ..
> .........@...
> 2013-07-03T23:51:27.520332+02:00 domUA kernel: [  108.522556] Object
> ffff8800f6606548: 44 fa 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
> D.kkkkkkkkkkkkkk
> 2013-07-03T23:51:27.520335+02:00 domUA kernel: [  108.522557] Object
> ffff8800f6606558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
> kkkkkkkkkkkkkkkk
> 2013-07-03T23:51:27.520337+02:00 domUA kernel: [  108.522559] Object
> ffff8800f6606568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
> kkkkkkkkkkkkkkkk
> 
> Skipping some of the object dumping.......
> 
> 2013-07-03T23:51:27.520583+02:00 domUA kernel: [  108.522644] Object
> ffff8800f66068d8: 00 d7 e4 ff 00 88 ff ff 00 00 00 00 00 10 00 00 
> ................
> 2013-07-03T23:51:27.520586+02:00 domUA kernel: [  108.522646] Object
> ffff8800f66068e8: 00 92 dd ff 00 88 ff ff 00 00 00 00 88 01 00 00 
> ................
> 2013-07-03T23:51:27.520588+02:00 domUA kernel: [  108.522647] Redzone
> ffff8800f66068f8: 00 92 dd ff 00 88 ff ff                          ........
> 2013-07-03T23:51:27.520591+02:00 domUA kernel: [  108.522649] Padding
> ffff8800f6606a38: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
> 2013-07-03T23:51:27.520594+02:00 domUA kernel: [  108.522651] Pid: 1325,
> comm: sshd Tainted: G    B   W    3.7.10-1.16-dbg-xen #3
> 2013-07-03T23:51:27.520597+02:00 domUA kernel: [  108.522652] Call Trace:
> 2013-07-03T23:51:27.520599+02:00 domUA kernel: [  108.522658] 
> [<ffffffff8000b097>] try_stack_unwind+0x87/0x1c0
> 2013-07-03T23:51:27.520602+02:00 domUA kernel: [  108.522662] 
> [<ffffffff80008fa5>] dump_trace+0xd5/0x250
> 2013-07-03T23:51:27.520605+02:00 domUA kernel: [  108.522665] 
> [<ffffffff8000b22c>] show_trace_log_lvl+0x5c/0x80
> 2013-07-03T23:51:27.520608+02:00 domUA kernel: [  108.522668] 
> [<ffffffff8000b265>] show_trace+0x15/0x20
> 2013-07-03T23:51:27.520610+02:00 domUA kernel: [  108.522672] 
> [<ffffffff80553a69>] dump_stack+0x77/0x80
> 2013-07-03T23:51:27.520612+02:00 domUA kernel: [  108.522676] 
> [<ffffffff801491b1>] print_trailer+0x131/0x140
> 2013-07-03T23:51:27.520615+02:00 domUA kernel: [  108.522680] 
> [<ffffffff80149709>] check_bytes_and_report+0xc9/0x120
> 2013-07-03T23:51:27.520617+02:00 domUA kernel: [  108.522683] 
> [<ffffffff8014a7f6>] check_object+0x56/0x240
> 2013-07-03T23:51:27.520620+02:00 domUA kernel: [  108.522687] 
> [<ffffffff805575b6>] free_debug_processing+0xc4/0x201
> 2013-07-03T23:51:27.520622+02:00 domUA kernel: [  108.522690] 
> [<ffffffff8055773a>] __slab_free+0x47/0x499
> 2013-07-03T23:51:27.520625+02:00 domUA kernel: [  108.522694] 
> [<ffffffff8014beff>] kfree+0x1df/0x230
> 2013-07-03T23:51:27.520627+02:00 domUA kernel: [  108.522697] 
> [<ffffffff8044a8cc>] skb_free_head+0x5c/0x70
> 2013-07-03T23:51:27.520630+02:00 domUA kernel: [  108.522701] 
> [<ffffffff8044a9ca>] skb_release_data+0xea/0xf0
> 2013-07-03T23:51:27.520632+02:00 domUA kernel: [  108.522704] 
> [<ffffffff8044a9ee>] __kfree_skb+0x1e/0xb0
> 2013-07-03T23:51:27.520635+02:00 domUA kernel: [  108.522709] 
> [<ffffffff8049fa2a>] tcp_recvmsg+0x99a/0xd50
> 2013-07-03T23:51:27.520637+02:00 domUA kernel: [  108.522714] 
> [<ffffffff804c796d>] inet_recvmsg+0xed/0x110
> 2013-07-03T23:51:27.520640+02:00 domUA kernel: [  108.522718] 
> [<ffffffff80440be8>] sock_aio_read+0x158/0x190
> 2013-07-03T23:51:27.520642+02:00 domUA kernel: [  108.522722] 
> [<ffffffff8015cb68>] do_sync_read+0x98/0xf0
> 2013-07-03T23:51:27.520645+02:00 domUA kernel: [  108.522726] 
> [<ffffffff8015d32d>] vfs_read+0xbd/0x180
> 2013-07-03T23:51:27.520647+02:00 domUA kernel: [  108.522729] 
> [<ffffffff8015d442>] sys_read+0x52/0xa0
> 2013-07-03T23:51:27.520650+02:00 domUA kernel: [  108.522733] 
> [<ffffffff8056ab3b>] system_call_fastpath+0x1a/0x1f
> 2013-07-03T23:51:27.520652+02:00 domUA kernel: [  108.522736] 
> [<00007f45ef74c960>] 0x7f45ef74c95f
> 2013-07-03T23:51:27.520655+02:00 domUA kernel: [  108.522738] FIX
> kmalloc-1024: Restoring 0xffff8800f66068f8-0xffff8800f66068ff=0xcc
> 2013-07-03T23:51:27.520657+02:00 domUA kernel: [  108.522738]
> 2013-07-03T23:51:27.679444+02:00 domUA kernel: [  108.671750]
> =============================================================================
> 2013-07-03T23:51:27.679454+02:00 domUA kernel: [  108.671753] BUG
> kmalloc-1024 (Tainted: G    B   W   ): Redzone overwritten
> 2013-07-03T23:51:27.679456+02:00 domUA kernel: [  108.671754]
> -----------------------------------------------------------------------------
> 2013-07-03T23:51:27.679458+02:00 domUA kernel: [  108.671754]
> 2013-07-03T23:51:27.679460+02:00 domUA kernel: [  108.671757] INFO:
> 0xffff8800f66068f8-0xffff8800f66068ff. First byte 0xcc instead of 0xbb
> 2013-07-03T23:51:27.679462+02:00 domUA kernel: [  108.671762] INFO:
> Allocated in __alloc_skb+0x88/0x260 age=48 cpu=0 pid=1325
> 2013-07-03T23:51:27.679464+02:00 domUA kernel: [  108.671765]  
> set_track+0x6c/0x190
> 2013-07-03T23:51:27.679466+02:00 domUA kernel: [  108.671767]  
> alloc_debug_processing+0x83/0x109
> 2013-07-03T23:51:27.679468+02:00 domUA kernel: [  108.671769]  
> __slab_alloc.constprop.48+0x523/0x593
> 2013-07-03T23:51:27.679469+02:00 domUA kernel: [  108.671771]  
> __kmalloc_track_caller+0xb4/0x200
> 2013-07-03T23:51:27.679471+02:00 domUA kernel: [  108.671773]  
> __kmalloc_reserve+0x3c/0xa0
> 2013-07-03T23:51:27.679473+02:00 domUA kernel: [  108.671775]  
> __alloc_skb+0x88/0x260
> 2013-07-03T23:51:27.679475+02:00 domUA kernel: [  108.671778]  
> network_alloc_rx_buffers+0x76/0x5f0 [xennet]
> 2013-07-03T23:51:27.679476+02:00 domUA kernel: [  108.671781]  
> netif_poll+0xcf4/0xf30 [xennet]
> 2013-07-03T23:51:27.679478+02:00 domUA kernel: [  108.671783]  
> net_rx_action+0xf0/0x2e0
> 

Seems like there's memory corruption in guest RX path.

> I noticed that after turning on all this debugging stuff, a real panic
> does not appear any more.
> 
> This happens while copying a file with scp from dom0 to the guest (scp
> bigfile domu:/dev/null).
> 
> In my lab, I am currently experimenting with a SuperMicro based system
> with Xen showing the following characteristics:
> 
> __  __            _  _    ____    _     _ ____     _   _ ____    _  ___ 
>  \ \/ /___ _ __   | || |  |___ \  / |   / |___ \   / | / |___ \  / |/ _ \
>   \  // _ \ '_ \  | || |_   __) | | |   | | __) |__| | | | __) | | | | | |
>   /  \  __/ | | | |__   _| / __/ _| |   | |/ __/|__| |_| |/ __/ _| | |_| |
>  /_/\_\___|_| |_|    |_|(_)_____(_)_|___|_|_____|  |_(_)_|_____(_)_|\___/
>                                    |_____|                               
> (XEN) Xen version 4.2.1_12-1.12.10 (abuild@) (gcc (SUSE Linux) 4.7.2
> 20130108 [gcc-4_7-branch revision 195012]) Wed May 29 20:31:49 UTC 2013
> (XEN) Latest ChangeSet: 25952
> (XEN) Bootloader: GNU GRUB 0.97
> (XEN) Command line: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all
> (XEN) Video information:
> (XEN)  VGA is text mode 80x25, font 8x16
> (XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
> (XEN) Disc information:
> (XEN)  Found 4 MBR signatures
> (XEN)  Found 4 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 0000000000096400 (usable)
> (XEN)  0000000000096400 - 00000000000a0000 (reserved)
> (XEN)  00000000000e0000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000bf780000 (usable)
> (XEN)  00000000bf78e000 - 00000000bf790000 type 9
> (XEN)  00000000bf790000 - 00000000bf79e000 (ACPI data)
> (XEN)  00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
> (XEN)  00000000bf7d0000 - 00000000bf7e0000 (reserved)
> (XEN)  00000000bf7ec000 - 00000000c0000000 (reserved)
> (XEN)  00000000e0000000 - 00000000f0000000 (reserved)
> (XEN)  00000000fee00000 - 00000000fee01000 (reserved)
> (XEN)  00000000ffc00000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 0000000340000000 (usable)
> 
> Skipping ACPI and SRAT
> 
> (XEN) System RAM: 12279MB (12573784kB)
> 
> (XEN) NUMA: Allocated memnodemap from 33e38a000 - 33e38e000
> (XEN) NUMA: Using 8 for the hash shift.
> (XEN) Domain heap initialised DMA width 30 bits
> (XEN) found SMP MP-table at 000ff780
> (XEN) DMI present.
> 
> (XEN) Enabling APIC mode:  Phys.  Using 2 I/O APICs
> (XEN) ACPI: HPET id: 0x8086a301 base: 0xfed00000
> (XEN) Failed to get Error Log Address Range.
> (XEN) Using ACPI (MADT) for SMP configuration information
> (XEN) SMP: Allowing 24 CPUs (8 hotplug CPUs)
> (XEN) IRQ limits: 48 GSI, 3040 MSI/MSI-X
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Detected 2400.115 MHz processor.
> (XEN) Initing memory sharing.
> (XEN) mce_intel.c:1238: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0
> extended MCE MSR 0
> (XEN) Intel machine check reporting enabled
> (XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
> (XEN) PCI: MCFG area at e0000000 reserved in E820
> (XEN) PCI: Using MCFG for segment 0000 bus 00-ff
> (XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
> (XEN) Intel VT-d Snoop Control enabled.
> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
> (XEN) Intel VT-d Queued Invalidation enabled.
> (XEN) Intel VT-d Interrupt Remapping enabled.
> (XEN) Intel VT-d Shared EPT tables not enabled.
> (XEN) I/O virtualisation enabled
> (XEN)  - Dom0 mode: Relaxed
> (XEN) Interrupt remapping enabled
> (XEN) Enabled directed EOI with ioapic_ack_old on!
> (XEN) ENABLING IO-APIC IRQs
> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
> (XEN) Platform timer is 14.318MHz HPET
> (XEN) Allocated console ring of 128 KiB.
> (XEN) VMX: Supported advanced features:
> (XEN)  - APIC MMIO access virtualisation
> (XEN)  - APIC TPR shadow
> (XEN)  - Extended Page Tables (EPT)
> (XEN)  - Virtual-Processor Identifiers (VPID)
> (XEN)  - Virtual NMI
> (XEN)  - MSR direct-access bitmap
> (XEN)  - Unrestricted Guest
> (XEN) HVM: ASIDs enabled.
> (XEN) HVM: VMX enabled
> (XEN) HVM: Hardware Assisted Paging (HAP) detected
> (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> (XEN) Brought up 16 CPUs
> (XEN) ACPI sleep modes: S3
> (XEN) mcheck_poll: Machine check polling timer started.
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN)  Xen  kernel: 64-bit, lsb, compat32
> (XEN)  Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0xa65000
> (XEN) PHYSICAL MEMORY ARRANGEMENT:
> (XEN)  Dom0 alloc.:   0000000336000000->0000000337000000 (516915 pages
> to be allocated)
> (XEN)  Init. ramdisk: 000000033f333000->0000000340000000
> (XEN) VIRTUAL MEMORY ARRANGEMENT:
> (XEN)  Loaded kernel: ffffffff80002000->ffffffff80a65000
> (XEN)  Init. ramdisk: 0000000000000000->0000000000000000
> (XEN)  Phys-Mach map: ffffea0000000000->ffffea0000400000
> (XEN)  Start info:    ffffffff80a65000->ffffffff80a654b4
> (XEN)  Page tables:   ffffffff80a66000->ffffffff80a6f000
> (XEN)  Boot stack:    ffffffff80a6f000->ffffffff80a70000
> (XEN)  TOTAL:         ffffffff80000000->ffffffff80c00000
> (XEN)  ENTRY ADDRESS: ffffffff80002000
> (XEN) Dom0 has maximum 16 VCPUs
> (XEN) Scrubbing Free RAM:
> .....................................................................................................done.
> (XEN) Initial low memory virq threshold set at 0x4000 pages.
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) Xen is relinquishing VGA console.
> 
> 
> 
> (XEN) ACPI: RSDP 000FACE0, 0024 (r2 ACPIAM)
> (XEN) ACPI: XSDT BF790100, 008C (r1 SMCI            20110827 MSFT       97)
> (XEN) ACPI: FACP BF790290, 00F4 (r4 082711 FACP1638 20110827 MSFT       97)
> (XEN) ACPI: DSDT BF7906A0, 6563 (r2  10600 10600000        0 INTL 20051117)
> (XEN) ACPI: FACS BF79E000, 0040
> (XEN) ACPI: APIC BF790390, 011E (r2 082711 APIC1638 20110827 MSFT       97)
> (XEN) ACPI: MCFG BF7904B0, 003C (r1 082711 OEMMCFG  20110827 MSFT       97)
> (XEN) ACPI: SLIT BF7904F0, 0030 (r1 082711 OEMSLIT  20110827 MSFT       97)
> (XEN) ACPI: OEMB BF79E040, 0085 (r1 082711 OEMB1638 20110827 MSFT       97)
> (XEN) ACPI: SRAT BF79A6A0, 01D0 (r2 082711 OEMSRAT         1 INTL        1)
> (XEN) ACPI: HPET BF79A870, 0038 (r1 082711 OEMHPET  20110827 MSFT       97)
> (XEN) ACPI: DMAR BF79E0D0, 0130 (r1    AMI  OEMDMAR        1 MSFT       97)
> (XEN) ACPI: SSDT BF7A1B30, 0363 (r1 DpgPmm    CpuPm       12 INTL 20051117)
> (XEN) ACPI: EINJ BF79A8B0, 0130 (r1  AMIER AMI_EINJ 20110827 MSFT       97)
> (XEN) ACPI: BERT BF79AA40, 0030 (r1  AMIER AMI_BERT 20110827 MSFT       97)
> (XEN) ACPI: ERST BF79AA70, 01B0 (r1  AMIER AMI_ERST 20110827 MSFT       97)
> (XEN) ACPI: HEST BF79AC20, 00A8 (r1  AMIER ABC_HEST 20110827 MSFT       97)
> (XEN) System RAM: 12279MB (12573784kB)
> (XEN) SRAT: PXM 0 -> APIC 0 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 2 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 18 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 20 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 1 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 3 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 19 -> Node 0
> (XEN) SRAT: PXM 0 -> APIC 21 -> Node 0
> 
> I am happy to assist in more kernel probing. It is even possible for me
> to set up access for someone to this machine.
> 

Excellent. Last time Jan suspected that we might overrun the frag list
of an skb (which would corrupt memory), but that has never been verified.

I also skimmed your bug report on the Novell bugzilla, which did suggest
memory corruption.

I wrote a patch to crash the kernel immediately when we overrun the
frag list; perhaps we could start from there? (You might need to adjust
the context, but it is only a one-liner, so that should be easy.)


Wei.

======
diff --git a/drivers/xen/netfront/netfront.c b/drivers/xen/netfront/netfront.c
index 6e5d233..9583011 100644
--- a/drivers/xen/netfront/netfront.c
+++ b/drivers/xen/netfront/netfront.c
@@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
        struct sk_buff *nskb;

        while ((nskb = __skb_dequeue(list))) {
+               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
                struct netif_rx_response *rx =
                        RING_GET_RESPONSE(&np->rx, ++cons);

^ permalink raw reply related	[flat|nested] 85+ messages in thread

* [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-04 15:01     ` Wei Liu
@ 2013-07-05  9:32       ` Jan Beulich
  2013-07-05 14:53         ` Wei Liu
  2013-07-05 14:53         ` Wei Liu
  2013-07-05  9:32       ` Jan Beulich
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-05  9:32 UTC (permalink / raw)
  To: davem; +Cc: Ian Campbell, wei.liu2, Dion Kant, xen-devel, netdev, stable

Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org (3.6 onwards)

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
+		if (nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+			nr_frags = shinfo->nr_frags;
+		}
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
+
 		__skb_fill_page_desc(skb, nr_frags,
 				     skb_frag_page(nfrag),
 				     rx->offset, rx->status);
@@ -929,7 +938,8 @@ static int handle_incoming_queue(struct 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);

^ permalink raw reply	[flat|nested] 85+ messages in thread

* [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-04 15:01     ` Wei Liu
  2013-07-05  9:32       ` [PATCH] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
@ 2013-07-05  9:32       ` Jan Beulich
  2013-07-05  9:36       ` kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Jan Beulich
  2013-07-05 10:40       ` [Xen-users] " Dion Kant
  3 siblings, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-05  9:32 UTC (permalink / raw)
  To: davem; +Cc: wei.liu2, Ian Campbell, netdev, stable, xen-devel, Dion Kant

Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org (3.6 onwards)

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
+		if (nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+			nr_frags = shinfo->nr_frags;
+		}
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
+
 		__skb_fill_page_desc(skb, nr_frags,
 				     skb_frag_page(nfrag),
 				     rx->offset, rx->status);
@@ -929,7 +938,8 @@ static int handle_incoming_queue(struct 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-04 15:01     ` Wei Liu
  2013-07-05  9:32       ` [PATCH] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
  2013-07-05  9:32       ` Jan Beulich
@ 2013-07-05  9:36       ` Jan Beulich
  2013-07-05 10:40       ` [Xen-users] " Dion Kant
  3 siblings, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-05  9:36 UTC (permalink / raw)
  To: Eugene Istomin, Dion Kant; +Cc: wei.liu2, xen-devel

[-- Attachment #1: Type: text/plain, Size: 364 bytes --]

Attached is the openSUSE 12.3 patch corresponding to the one I just
sent against the upstream kernel, in case you want to give this a
try (it is in fact against the master branch, but I verified that it
applies reasonably cleanly to the 12.3 branch too); since neither Wei
nor I have been able to reproduce this, your testing would be
pretty valuable.

Jan


[-- Attachment #2: xen3-patch-3.6-netfront-pull.patch --]
[-- Type: text/plain, Size: 1111 bytes --]

--- head.orig/drivers/xen/netfront/netfront.c	2013-05-10 14:43:24.000000000 +0200
+++ head/drivers/xen/netfront/netfront.c	2013-07-05 10:56:41.000000000 +0200
@@ -1309,6 +1309,15 @@ static RING_IDX xennet_fill_frags(struct
 		struct netif_rx_response *rx =
 			RING_GET_RESPONSE(&np->rx, ++cons);
 
+		if (nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+			nr_frags = shinfo->nr_frags;
+		}
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
+
 		__skb_fill_page_desc(skb, nr_frags,
 				     skb_frag_page(skb_shinfo(nskb)->frags),
 				     rx->offset, rx->status);
@@ -1480,7 +1489,8 @@ err:	
 	while ((skb = __skb_dequeue(&rxq)) != NULL) {
 		unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-04 15:01     ` Wei Liu
                         ` (2 preceding siblings ...)
  2013-07-05  9:36       ` kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Jan Beulich
@ 2013-07-05 10:40       ` Dion Kant
  2013-07-05 10:54         ` Wei Liu
  2013-07-05 10:56         ` Jan Beulich
  3 siblings, 2 replies; 85+ messages in thread
From: Dion Kant @ 2013-07-05 10:40 UTC (permalink / raw)
  To: Wei Liu; +Cc: jbeulich, xen-devel

On 07/04/2013 05:01 PM, Wei Liu wrote:
>
>> I am running into this issue as well with the openSUSE 12.3
>> distribution. This is with their 3.7.10-1.16-xen kernel and Xen version
>> 4.2.1_12-1.12.10. On the net I see some discussion of people hitting
>> this issue but not that much.  E.g., one of the symptoms is that a guest
>> crashes when running zypper install or zypper update when the Internet
>> connection is fast enough.
>>
> Do you have references to other reports?
I will gather them and post them later.

>
>> openSUSE 3.4.x kernels run fine as guests on top of the openSUSE
>> 12.3 Xen distribution, but apparently 3.7.10 and higher have
>> this issue.
>>
>> I have already spent quite some time getting a grip on the issue. I filed a
>> bug on bugzilla.novell.com but got no response. See
>> https://bugzilla.novell.com/show_bug.cgi?id=826374 for details.
>> Apparently for hitting this bug (i.e. make it all the way to the crash),
>> it is required to use some hardware which performs not too slow. With
>> this I mean it is easy to find hardware which is unable to reproduce the
>                                                     able?
>> issue.
>>
> I'm not quite sure what you mean. Do you mean this bug can only
> be triggered when your receive path has a real hardware NIC involved?
>
> And reading your test case below it doesn't seem so. Dom0 to DomU
> transmission crashes the guest per your example.
Yes, a physical network card is not required. If you do send data to the
guest over a physical Ethernet card, it needs to operate in 1 GbE mode;
with a 100 Mb/s link I am unable to crash the guest.

If you do use vif interfaces only, the data rate will be high enough to
crash it.

However, I also have openSUSE 12.3 Xen configurations running which do
not have this issue. My feeling is that smaller systems (in the sense
of fewer CPU cores and/or less memory bandwidth) do not reveal the issue.

>
>> In one of my recent experiments I changed the SLAB allocater to SLUB
>> which provides more detailed kernel logging. Here is the log output
>> after the first detected issue regarding xennet:
>>
> But the log below is not about SLUB. I cannot understand why SLAB vs.
> SLUB makes a difference.
I switched to SLUB from SLAB for its debugging functionality. The
openSUSE stock kernel used SLAB.



>
>> Too many frags
>> 2013-07-03T23:51:27.092147+02:00 domUA kernel: [  108.094615] netfront:
>> Too many frags
>> 2013-07-03T23:51:27.492112+02:00 domUA kernel: [  108.494255] netfront:
>> Too many frags
>> 2013-07-03T23:51:27.520194+02:00 domUA kernel: [  108.522445]
> "Too many frags" means your frontend is generating malformed packets.
> This is not normal. And apparently you didn't use the latest kernel in
> tree because the log message should be "Too many slots" in the latest
> OpenSuSE kernel.
Yes, I have seen that, but I used the latest openSUSE kernel which
belongs to openSUSE 12.3.


>> network_alloc_rx_buffers+0x76/0x5f0 [xennet]
>> 2013-07-03T23:51:27.679476+02:00 domUA kernel: [  108.671781]  
>> netif_poll+0xcf4/0xf30 [xennet]
>> 2013-07-03T23:51:27.679478+02:00 domUA kernel: [  108.671783]  
>> net_rx_action+0xf0/0x2e0
>>
> Seems like there's memory corruption in guest RX path.
As Jan already mentioned, it could be related to the kernel panics I
obtain; however, it may be a different issue as well.

>>
>>
>> I am happy to assist in more kernel probing. It is even possible for me
>> to set up access for someone to this machine.
>>
> Excellent. Last time Jan suspected that we might overrun the frag list
> of an skb (which would corrupt memory), but that has never been verified.
>
> I also skimmed your bug report on the Novell bugzilla, which did suggest
> memory corruption.
>
> I wrote a patch to crash the kernel immediately when we overrun the
> frag list; perhaps we could start from there? (You might need to adjust
> the context, but it is only a one-liner, so that should be easy.)
>
>
> Wei.
>
> ======
> diff --git a/drivers/xen/netfront/netfront.c b/drivers/xen/netfront/netfront.c
> index 6e5d233..9583011 100644
> --- a/drivers/xen/netfront/netfront.c
> +++ b/drivers/xen/netfront/netfront.c
> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>         struct sk_buff *nskb;
>
>         while ((nskb = __skb_dequeue(list))) {
> +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
>                 struct netif_rx_response *rx =
>                         RING_GET_RESPONSE(&np->rx, ++cons);
>

I integrated the patch and obtained a crash dump; the log in it does not
show this BUG_ON. Here is the relevant section from the log:

var/lib/xen/dump/domUA # crash /root/vmlinux-p1
2013-0705-1347.43-domUA.1.core

[    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1
extents:1 across:4192252k SS
[   10.204340] NET: Registered protocol family 17
[  481.534979] netfront: Too many frags
[  487.543946] netfront: Too many frags
[  491.049458] netfront: Too many frags
[  491.491153] ------------[ cut here ]------------
[  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!
[  491.492056] invalid opcode: 0000 [#1] SMP
[  491.492056] Modules linked in: af_packet autofs4 xennet xenblk cdrom
[  491.492056] CPU 0
[  491.492056] Pid: 1471, comm: sshd Not tainted 3.7.10-1.16-dbg-p1-xen #8 
[  491.492056] RIP: e030:[<ffffffffa0023aef>]  [<ffffffffa0023aef>]
netif_poll+0xe4f/0xf90 [xennet]
[  491.492056] RSP: e02b:ffff8801f5803c60  EFLAGS: 00010202
[  491.492056] RAX: ffff8801f5803da0 RBX: ffff8801f1a082c0 RCX:
0000000180200010
[  491.492056] RDX: ffff8801f5803da0 RSI: ffff8801fe83ec80 RDI:
ffff8801f03b2900
[  491.492056] RBP: ffff8801f5803e20 R08: 0000000000000001 R09:
0000000000000000
[  491.492056] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff8801f03b3400
[  491.492056] R13: 0000000000000011 R14: 000000000004327e R15:
ffff8801f06009c0
[  491.492056] FS:  00007fc519f3d7c0(0000) GS:ffff8801f5800000(0000)
knlGS:0000000000000000
[  491.492056] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[  491.492056] CR2: 00007fc51410c400 CR3: 00000001f1430000 CR4:
0000000000002660
[  491.492056] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  491.492056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  491.492056] Process sshd (pid: 1471, threadinfo ffff8801f1264000,
task ffff8801f137bf00)
[  491.492056] Stack:
[  491.492056]  ffff8801f5803d60 ffffffff8008503e ffff8801f0600a40
ffff8801f0600000
[  491.492056]  0004328000000040 0000001200000000 ffff8801f5810570
ffff8801f0600a78
[  491.492056]  0000000000000000 ffff8801f0601fb0 0004326e00000012
ffff8801f5803d00
[  491.492056] Call Trace:
[  491.492056]  [<ffffffff8041ee35>] net_rx_action+0xd5/0x250
[  491.492056]  [<ffffffff800376d8>] __do_softirq+0xe8/0x230
[  491.492056]  [<ffffffff8051151c>] call_softirq+0x1c/0x30
[  491.492056]  [<ffffffff80008a75>] do_softirq+0x75/0xd0
[  491.492056]  [<ffffffff800379f5>] irq_exit+0xb5/0xc0
[  491.492056]  [<ffffffff8036c225>] evtchn_do_upcall+0x295/0x2d0
[  491.492056]  [<ffffffff8051114e>] do_hypervisor_callback+0x1e/0x30
[  491.492056]  [<00007fc519f97700>] 0x7fc519f976ff
[  491.492056] Code: ff 0f 1f 00 e8 a3 c1 40 e0 85 c0 90 75 69 44 89 ea
4c 89 f6 4c 89 ff e8 f0 cb ff ff c7 85 80 fe ff ff ea ff ff ff e9 7c f4
ff ff <0f> 0b ba 12 00 00 00 48 01 d0 48 39 c1 0f 82 bd fc ff ff e9 e9
[  491.492056] RIP  [<ffffffffa0023aef>] netif_poll+0xe4f/0xf90 [xennet]
[  491.492056]  RSP <ffff8801f5803c60>
[  491.511975] ---[ end trace c9e37475f12e1aaf ]---
[  491.512877] Kernel panic - not syncing: Fatal exception in interrupt

In the meantime Jan took the bug in bugzilla
(https://bugzilla.novell.com/show_bug.cgi?id=826374) and created a first
patch. I propose we continue the discussion there and post the
conclusion to this list to finish this thread here as well.


Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 10:40       ` [Xen-users] " Dion Kant
@ 2013-07-05 10:54         ` Wei Liu
  2013-07-05 19:11           ` Dion Kant
  2013-07-05 10:56         ` Jan Beulich
  1 sibling, 1 reply; 85+ messages in thread
From: Wei Liu @ 2013-07-05 10:54 UTC (permalink / raw)
  To: Dion Kant; +Cc: Wei Liu, jbeulich, xen-devel

On Fri, Jul 05, 2013 at 12:40:02PM +0200, Dion Kant wrote:
> > ======
> > diff --git a/drivers/xen/netfront/netfront.c b/drivers/xen/netfront/netfront.c
> > index 6e5d233..9583011 100644
> > --- a/drivers/xen/netfront/netfront.c
> > +++ b/drivers/xen/netfront/netfront.c
> > @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
> >         struct sk_buff *nskb;
> >
> >         while ((nskb = __skb_dequeue(list))) {
> > +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> >                 struct netif_rx_response *rx =
> >                         RING_GET_RESPONSE(&np->rx, ++cons);
> >
> 
> Integrated the patch. I obtained a crash dump and the log in it did not
> show this BUG_ON. Here is the relevant section from the log
> 
> var/lib/xen/dump/domUA # crash /root/vmlinux-p1
> 2013-0705-1347.43-domUA.1.core
> 
> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1
> extents:1 across:4192252k SS
> [   10.204340] NET: Registered protocol family 17
> [  481.534979] netfront: Too many frags
> [  487.543946] netfront: Too many frags
> [  491.049458] netfront: Too many frags
> [  491.491153] ------------[ cut here ]------------
> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!

So what's the code around line 1295? There must be a BUG_ON there. It's
normal that you didn't see the same line number as in my patch (1306),
because we were using different kernels.

> [  491.492056] invalid opcode: 0000 [#1] SMP
> [  491.492056] Modules linked in: af_packet autofs4 xennet xenblk cdrom
> [  491.492056] CPU 0
> [  491.492056] Pid: 1471, comm: sshd Not tainted 3.7.10-1.16-dbg-p1-xen #8 
> [  491.492056] RIP: e030:[<ffffffffa0023aef>]  [<ffffffffa0023aef>]
> netif_poll+0xe4f/0xf90 [xennet]
> [  491.492056] RSP: e02b:ffff8801f5803c60  EFLAGS: 00010202
> [  491.492056] RAX: ffff8801f5803da0 RBX: ffff8801f1a082c0 RCX:
> 0000000180200010
> [  491.492056] RDX: ffff8801f5803da0 RSI: ffff8801fe83ec80 RDI:
> ffff8801f03b2900
> [  491.492056] RBP: ffff8801f5803e20 R08: 0000000000000001 R09:
> 0000000000000000
> [  491.492056] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff8801f03b3400
> [  491.492056] R13: 0000000000000011 R14: 000000000004327e R15:
> ffff8801f06009c0
> [  491.492056] FS:  00007fc519f3d7c0(0000) GS:ffff8801f5800000(0000)
> knlGS:0000000000000000
> [  491.492056] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  491.492056] CR2: 00007fc51410c400 CR3: 00000001f1430000 CR4:
> 0000000000002660
> [  491.492056] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  491.492056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  491.492056] Process sshd (pid: 1471, threadinfo ffff8801f1264000,
> task ffff8801f137bf00)
> [  491.492056] Stack:
> [  491.492056]  ffff8801f5803d60 ffffffff8008503e ffff8801f0600a40
> ffff8801f0600000
> [  491.492056]  0004328000000040 0000001200000000 ffff8801f5810570
> ffff8801f0600a78
> [  491.492056]  0000000000000000 ffff8801f0601fb0 0004326e00000012
> ffff8801f5803d00
> [  491.492056] Call Trace:
> [  491.492056]  [<ffffffff8041ee35>] net_rx_action+0xd5/0x250
> [  491.492056]  [<ffffffff800376d8>] __do_softirq+0xe8/0x230
> [  491.492056]  [<ffffffff8051151c>] call_softirq+0x1c/0x30
> [  491.492056]  [<ffffffff80008a75>] do_softirq+0x75/0xd0
> [  491.492056]  [<ffffffff800379f5>] irq_exit+0xb5/0xc0
> [  491.492056]  [<ffffffff8036c225>] evtchn_do_upcall+0x295/0x2d0
> [  491.492056]  [<ffffffff8051114e>] do_hypervisor_callback+0x1e/0x30
> [  491.492056]  [<00007fc519f97700>] 0x7fc519f976ff
> [  491.492056] Code: ff 0f 1f 00 e8 a3 c1 40 e0 85 c0 90 75 69 44 89 ea
> 4c 89 f6 4c 89 ff e8 f0 cb ff ff c7 85 80 fe ff ff ea ff ff ff e9 7c f4
> ff ff <0f> 0b ba 12 00 00 00 48 01 d0 48 39 c1 0f 82 bd fc ff ff e9 e9
> [  491.492056] RIP  [<ffffffffa0023aef>] netif_poll+0xe4f/0xf90 [xennet]
> [  491.492056]  RSP <ffff8801f5803c60>
> [  491.511975] ---[ end trace c9e37475f12e1aaf ]---
> [  491.512877] Kernel panic - not syncing: Fatal exception in interrupt
> 
> In the mean time Jan took the bug in bugzilla
> (https://bugzilla.novell.com/show_bug.cgi?id=826374) and created a first
> patch. I propose we continue the discussion there and post the
> conclusion in this list to finish this thread here as well.
> 

I'm not so sure what the best way is. In general it would be too much of
a burden for developers to look at every bugzilla, let alone register an
account for each bugzilla.

Anyway, as we already started the discussion here, I think we should
just continue it here. We could do this the other way around: discuss
here and post conclusions / references there.


Wei.

> 
> Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 10:40       ` [Xen-users] " Dion Kant
  2013-07-05 10:54         ` Wei Liu
@ 2013-07-05 10:56         ` Jan Beulich
  2013-07-05 19:46           ` Dion Kant
  1 sibling, 1 reply; 85+ messages in thread
From: Jan Beulich @ 2013-07-05 10:56 UTC (permalink / raw)
  To: Dion Kant; +Cc: Wei Liu, xen-devel

>>> On 05.07.13 at 12:40, Dion Kant <g.w.kant@hunenet.nl> wrote:
> On 07/04/2013 05:01 PM, Wei Liu wrote:
>> --- a/drivers/xen/netfront/netfront.c
>> +++ b/drivers/xen/netfront/netfront.c
>> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>>         struct sk_buff *nskb;
>>
>>         while ((nskb = __skb_dequeue(list))) {
>> +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
>>                 struct netif_rx_response *rx =
>>                         RING_GET_RESPONSE(&np->rx, ++cons);
>>
> 
> Integrated the patch. I obtained a crash dump and the log in it did not
> show this BUG_ON. Here is the relevant section from the log
> 
> var/lib/xen/dump/domUA # crash /root/vmlinux-p1
> 2013-0705-1347.43-domUA.1.core
> 
> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1 extents:1 across:4192252k SS
> [   10.204340] NET: Registered protocol family 17
> [  481.534979] netfront: Too many frags
> [  487.543946] netfront: Too many frags
> [  491.049458] netfront: Too many frags
> [  491.491153] ------------[ cut here ]------------
> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!

So if not the BUG_ON() from the patch above, what else does that
line have in your sources?

Jan

> [  491.492056] invalid opcode: 0000 [#1] SMP
> [  491.492056] Modules linked in: af_packet autofs4 xennet xenblk cdrom
> [  491.492056] CPU 0
> [  491.492056] Pid: 1471, comm: sshd Not tainted 3.7.10-1.16-dbg-p1-xen #8 
> [  491.492056] RIP: e030:[<ffffffffa0023aef>]  [<ffffffffa0023aef>]
> netif_poll+0xe4f/0xf90 [xennet]
> [  491.492056] RSP: e02b:ffff8801f5803c60  EFLAGS: 00010202
> [  491.492056] RAX: ffff8801f5803da0 RBX: ffff8801f1a082c0 RCX:
> 0000000180200010
> [  491.492056] RDX: ffff8801f5803da0 RSI: ffff8801fe83ec80 RDI:
> ffff8801f03b2900
> [  491.492056] RBP: ffff8801f5803e20 R08: 0000000000000001 R09:
> 0000000000000000
> [  491.492056] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff8801f03b3400
> [  491.492056] R13: 0000000000000011 R14: 000000000004327e R15:
> ffff8801f06009c0
> [  491.492056] FS:  00007fc519f3d7c0(0000) GS:ffff8801f5800000(0000)
> knlGS:0000000000000000
> [  491.492056] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  491.492056] CR2: 00007fc51410c400 CR3: 00000001f1430000 CR4:
> 0000000000002660
> [  491.492056] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  491.492056] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  491.492056] Process sshd (pid: 1471, threadinfo ffff8801f1264000,
> task ffff8801f137bf00)
> [  491.492056] Stack:
> [  491.492056]  ffff8801f5803d60 ffffffff8008503e ffff8801f0600a40
> ffff8801f0600000
> [  491.492056]  0004328000000040 0000001200000000 ffff8801f5810570
> ffff8801f0600a78
> [  491.492056]  0000000000000000 ffff8801f0601fb0 0004326e00000012
> ffff8801f5803d00
> [  491.492056] Call Trace:
> [  491.492056]  [<ffffffff8041ee35>] net_rx_action+0xd5/0x250
> [  491.492056]  [<ffffffff800376d8>] __do_softirq+0xe8/0x230
> [  491.492056]  [<ffffffff8051151c>] call_softirq+0x1c/0x30
> [  491.492056]  [<ffffffff80008a75>] do_softirq+0x75/0xd0
> [  491.492056]  [<ffffffff800379f5>] irq_exit+0xb5/0xc0
> [  491.492056]  [<ffffffff8036c225>] evtchn_do_upcall+0x295/0x2d0
> [  491.492056]  [<ffffffff8051114e>] do_hypervisor_callback+0x1e/0x30
> [  491.492056]  [<00007fc519f97700>] 0x7fc519f976ff
> [  491.492056] Code: ff 0f 1f 00 e8 a3 c1 40 e0 85 c0 90 75 69 44 89 ea
> 4c 89 f6 4c 89 ff e8 f0 cb ff ff c7 85 80 fe ff ff ea ff ff ff e9 7c f4
> ff ff <0f> 0b ba 12 00 00 00 48 01 d0 48 39 c1 0f 82 bd fc ff ff e9 e9
> [  491.492056] RIP  [<ffffffffa0023aef>] netif_poll+0xe4f/0xf90 [xennet]
> [  491.492056]  RSP <ffff8801f5803c60>
> [  491.511975] ---[ end trace c9e37475f12e1aaf ]---
> [  491.512877] Kernel panic - not syncing: Fatal exception in interrupt
> 
> In the mean time Jan took the bug in bugzilla
> (https://bugzilla.novell.com/show_bug.cgi?id=826374) and created a first
> patch. I propose we continue the discussion there and post the
> conclusion in this list to finish this thread here as well.
> 
> 
> Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-05  9:32       ` [PATCH] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
  2013-07-05 14:53         ` Wei Liu
@ 2013-07-05 14:53         ` Wei Liu
  2013-07-07  1:10           ` David Miller
                             ` (3 more replies)
  1 sibling, 4 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-05 14:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: davem, Ian Campbell, wei.liu2, Dion Kant, xen-devel, netdev, stable

On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
> linear area is big enough on RX") xennet_fill_frags() may end up
> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
> the fragment count subsequently via __pskb_pull_tail(). That's a
> result of xennet_get_responses() allowing a maximum of one more slot to
> be consumed (and intermediately transformed into a fragment) if the
> head slot has a size less than or equal to RX_COPY_THRESHOLD.
> 
> Hence we need to adjust xennet_fill_frags() to pull earlier if we
> reached the maximum fragment count - due to the described behavior of
> xennet_get_responses() this guarantees that at least the first fragment
> will get completely consumed, and hence the fragment count reduced.
> 
> In order to not needlessly call __pskb_pull_tail() twice, make the
> original call conditional upon the pull target not having been reached
> yet, and defer the newly added one as much as possible (an alternative
> would have been to always call the function right before the call to
> xennet_fill_frags(), but that would imply more frequent cases of
> needing to call it twice).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: stable@vger.kernel.org (3.6 onwards)
> 
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> +		if (nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> +
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));

skb_headlen is in fact "skb->len - skb->data_len". Looking at the
caller code:

    while loop {
        skb_shinfo(skb)->frags[0].page_offset = rx->offset;
	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
	skb->data_len = rx->status;

	i = xennet_fill_frags(np, skb, &tmpq);

	/*
	 * Truesize is the actual allocation size, even if the
	 * allocation is only partially used.
	 */
	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
	skb->len += skb->data_len;
    }

    handle_incoming_queue();

You seem to be altering the behavior of the original code, because in
your patch the skb is pulled before skb->len is incremented, while in the
original code (which calls skb_headlen() in handle_incoming_queue())
skb->len is already correctly set by the time of the pull.

> +			nr_frags = shinfo->nr_frags;
> +		}
> +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> +
>  		__skb_fill_page_desc(skb, nr_frags,
>  				     skb_frag_page(nfrag),
>  				     rx->offset, rx->status);
> @@ -929,7 +938,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> 
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 10:54         ` Wei Liu
@ 2013-07-05 19:11           ` Dion Kant
  0 siblings, 0 replies; 85+ messages in thread
From: Dion Kant @ 2013-07-05 19:11 UTC (permalink / raw)
  To: Wei Liu; +Cc: jbeulich, xen-devel

On 07/05/2013 12:54 PM, Wei Liu wrote:
>>
>> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1
>> extents:1 across:4192252k SS
>> [   10.204340] NET: Registered protocol family 17
>> [  481.534979] netfront: Too many frags
>> [  487.543946] netfront: Too many frags
>> [  491.049458] netfront: Too many frags
>> [  491.491153] ------------[ cut here ]------------
>> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!
> 
> So what's the code around line 1295? There must be a BUG_ON there. It's
> normal you didn't see the same line number in my patch (1306) because we
> were using different kernel.
> 

Yes, stupid me. I was definitely in a wrong state when writing this.
Even more stupid, I made the same remark after obtaining results with
Jan's patch. A thousand times sorry... (I was too much in a hurry to get
my response back to you, likely because I was happy that you guys could
free some time for this.)

So now we know that nr_frags >= MAX_SKB_FRAGS

1285 static RING_IDX xennet_fill_frags(struct netfront_info *np,
1286                                   struct sk_buff *skb,
1287                                   struct sk_buff_head *list)
1288 {
1289         struct skb_shared_info *shinfo = skb_shinfo(skb);
1290         int nr_frags = shinfo->nr_frags;
1291         RING_IDX cons = np->rx.rsp_cons;
1292         struct sk_buff *nskb;
1293
1294         while ((nskb = __skb_dequeue(list))) {
1295                BUG_ON(nr_frags >= MAX_SKB_FRAGS);
1296
1297                 struct netif_rx_response *rx =
1298                         RING_GET_RESPONSE(&np->rx, ++cons);
1299

>>
>> In the mean time Jan took the bug in bugzilla
>> (https://bugzilla.novell.com/show_bug.cgi?id=826374) and created a first
>> patch. I propose we continue the discussion there and post the
>> conclusion in this list to finish this thread here as well.
>>
> 
> I'm not so sure what's the best way. In general it would be much a
> burden for developers to look at every bugzilla, let alone register
> account for each bugzilla.
> 
> Anyway, as we already started the discussion here, I think we should
> just continue it here. We could do this the other way around, discuss
> here and post conclusion / references there.

This is okay with me. I was wondering what would be most convenient.

Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 10:56         ` Jan Beulich
@ 2013-07-05 19:46           ` Dion Kant
  2013-07-06 13:36             ` Wei Liu
  2013-07-08  8:25             ` Jan Beulich
  0 siblings, 2 replies; 85+ messages in thread
From: Dion Kant @ 2013-07-05 19:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, xen-devel

On 07/05/2013 12:56 PM, Jan Beulich wrote:
>>>> On 05.07.13 at 12:40, Dion Kant <g.w.kant@hunenet.nl> wrote:
>> On 07/04/2013 05:01 PM, Wei Liu wrote:
>>> --- a/drivers/xen/netfront/netfront.c
>>> +++ b/drivers/xen/netfront/netfront.c
>>> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
>>>         struct sk_buff *nskb;
>>>
>>>         while ((nskb = __skb_dequeue(list))) {
>>> +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
>>>                 struct netif_rx_response *rx =
>>>                         RING_GET_RESPONSE(&np->rx, ++cons);
>>>
>>
>> Integrated the patch. I obtained a crash dump and the log in it did not
>> show this BUG_ON. Here is the relevant section from the log
>>
>> var/lib/xen/dump/domUA # crash /root/vmlinux-p1
>> 2013-0705-1347.43-domUA.1.core
>>
>> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1 extents:1 across:4192252k SS
>> [   10.204340] NET: Registered protocol family 17
>> [  481.534979] netfront: Too many frags
>> [  487.543946] netfront: Too many frags
>> [  491.049458] netfront: Too many frags
>> [  491.491153] ------------[ cut here ]------------
>> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!
> 
> So if not the BUG_ON() from the patch above, what else does that
> line have in your sources?

Nothing else, but thanks for pointing this out to me.

After obtaining results with your patch

1285 static RING_IDX xennet_fill_frags(struct netfront_info *np,
1286                                   struct sk_buff *skb,
1287                                   struct sk_buff_head *list)
1288 {
1289         struct skb_shared_info *shinfo = skb_shinfo(skb);
1290         int nr_frags = shinfo->nr_frags;
1291         RING_IDX cons = np->rx.rsp_cons;
1292         struct sk_buff *nskb;
1293
1294         while ((nskb = __skb_dequeue(list))) {
1295                 struct netif_rx_response *rx =
1296                         RING_GET_RESPONSE(&np->rx, ++cons);
1297
1298
1299                 if (nr_frags == MAX_SKB_FRAGS) {
1300                         unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
1301
1302                         BUG_ON(pull_to <= skb_headlen(skb));
1303                         __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
1304                         nr_frags = shinfo->nr_frags;
1305                 }
1306                 BUG_ON(nr_frags >= MAX_SKB_FRAGS);
1307
1308                 __skb_fill_page_desc(skb, nr_frags,
1309                                      skb_frag_page(skb_shinfo(nskb)->frags),
1310                                      rx->offset, rx->status);

Can I conclude that nr_frags == MAX_SKB_FRAGS, pull_to <=
skb_headlen(skb), and that the panic happens before the next BUG_ON is reached?

> [  717.568040] netfront: Too many frags
> [  723.768738] ------------[ cut here ]------------
> [  723.769226] kernel BUG at drivers/xen/netfront/netfront.c:1302!
> [  723.769724] invalid opcode: 0000 [#1] SMP 
> [  723.770203] Modules linked in: af_packet autofs4 xennet xenblk cdrom
> [  723.770697] CPU 0 
> [  723.770710] Pid: 1309, comm: sshd Not tainted 3.7.10-1.16-dbg-jbp1-xen #9  
> [  723.771667] RIP: e030:[<ffffffffa0023b17>]  [<ffffffffa0023b17>] netif_poll+0xe77/0xf70 [xennet]
> [  723.772057] RSP: e02b:ffff8800fb403c60  EFLAGS: 00010213
> [  723.772057] RAX: 00000000ffff0986 RBX: 0000000000049b7d RCX: ffff8800f78a5428
> [  723.772057] RDX: 000000000000007d RSI: 0000000000000042 RDI: ffff8800f98026c0
> [  723.772057] RBP: ffff8800fb403e20 R08: 0000000000000001 R09: 0000000000000000
> [  723.772057] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800f79709c0
> [  723.772057] R13: 0000000000000011 R14: ffff8800f866bc00 R15: ffff8800f85a5800
> [  723.772057] FS:  00007fc8f97e87c0(0000) GS:ffff8800fb400000(0000) knlGS:0000000000000000
> [  723.772057] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  723.772057] CR2: 00007fc8f3aafff0 CR3: 00000000f8a82000 CR4: 0000000000002660
> [  723.772057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  723.772057] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  723.772057] Process sshd (pid: 1309, threadinfo ffff8800f88f4000, task ffff8800f8279f80)
> [  723.772057] Stack:
> [  723.772057]  ffff8800fb403d60 ffff8800f7970a40 ffff8800f7970000 0000004000000002
> [  723.772057]  0000000000049b7e ffff8800fb410570 ffff8800f7970a78 0000000000000012
> [  723.772057]  ffff8800f7971fb0 0000001200000000 ffff8800f8b8d2c0 ffff8800fb403d50
> [  723.772057] Call Trace:
> [  723.772057]  [<ffffffff8041ee35>] net_rx_action+0xd5/0x250
> [  723.772057]  [<ffffffff800376d8>] __do_softirq+0xe8/0x230
> [  723.772057]  [<ffffffff8051151c>] call_softirq+0x1c/0x30
> [  723.772057]  [<ffffffff80008a75>] do_softirq+0x75/0xd0
> [  723.772057]  [<ffffffff800379f5>] irq_exit+0xb5/0xc0
> [  723.772057]  [<ffffffff8036c225>] evtchn_do_upcall+0x295/0x2d0
> [  723.772057]  [<ffffffff8051114e>] do_hypervisor_callback+0x1e/0x30
> [  723.772057]  [<00007fc8f8c5529b>] 0x7fc8f8c5529a
> [  723.772057] Code: 85 7c fe ff ff ea ff ff ff e9 69 f4 ff ff ba 12 00 00 00 48 01 d0 48 39 c1 0f
>                      82 b2 fc ff ff e9 e0 fe ff ff ba 08 00 00 00 eb e8 <0f> 0b 0f b7 33 48 c7 c7
>                      70 6c 02 a0 31 c0 e8 40 7a 4d e0 c7 85 
> [  723.772057] RIP  [<ffffffffa0023b17>] netif_poll+0xe77/0xf70 [xennet]
> [  723.772057]  RSP <ffff8800fb403c60>
> [  723.790939] ---[ end trace 846551a77e015655 ]---
> [  723.791847] Kernel panic - not syncing: Fatal exception in interrupt


Dion.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 19:46           ` Dion Kant
@ 2013-07-06 13:36             ` Wei Liu
  2013-07-08  8:25             ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-06 13:36 UTC (permalink / raw)
  To: Dion Kant; +Cc: Wei Liu, Jan Beulich, xen-devel

On Fri, Jul 05, 2013 at 09:46:25PM +0200, Dion Kant wrote:
> On 07/05/2013 12:56 PM, Jan Beulich wrote:
> >>>> On 05.07.13 at 12:40, Dion Kant <g.w.kant@hunenet.nl> wrote:
> >> On 07/04/2013 05:01 PM, Wei Liu wrote:
> >>> --- a/drivers/xen/netfront/netfront.c
> >>> +++ b/drivers/xen/netfront/netfront.c
> >>> @@ -1306,6 +1306,7 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
> >>>         struct sk_buff *nskb;
> >>>
> >>>         while ((nskb = __skb_dequeue(list))) {
> >>> +               BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> >>>                 struct netif_rx_response *rx =
> >>>                         RING_GET_RESPONSE(&np->rx, ++cons);
> >>>
> >>
> >> Integrated the patch. I obtained a crash dump and the log in it did not
> >> show this BUG_ON. Here is the relevant section from the log
> >>
> >> var/lib/xen/dump/domUA # crash /root/vmlinux-p1
> >> 2013-0705-1347.43-domUA.1.core
> >>
> >> [    7.670132] Adding 4192252k swap on /dev/xvda1.  Priority:-1 extents:1 across:4192252k SS
> >> [   10.204340] NET: Registered protocol family 17
> >> [  481.534979] netfront: Too many frags
> >> [  487.543946] netfront: Too many frags
> >> [  491.049458] netfront: Too many frags
> >> [  491.491153] ------------[ cut here ]------------
> >> [  491.491628] kernel BUG at drivers/xen/netfront/netfront.c:1295!
> > 
> > So if not the BUG_ON() from the patch above, what else does that
> > line have in your sources?
> 
> Nothing else, but thanks for pointing this out to me.
> 
> After obtaining results with your patch
> 
> 1285 static RING_IDX xennet_fill_frags(struct netfront_info *np,
> 1286                                   struct sk_buff *skb,
> 1287                                   struct sk_buff_head *list)
> 1288 {
> 1289         struct skb_shared_info *shinfo = skb_shinfo(skb);
> 1290         int nr_frags = shinfo->nr_frags;
> 1291         RING_IDX cons = np->rx.rsp_cons;
> 1292         struct sk_buff *nskb;
> 1293
> 1294         while ((nskb = __skb_dequeue(list))) {
> 1295                 struct netif_rx_response *rx =
> 1296                         RING_GET_RESPONSE(&np->rx, ++cons);
> 1297
> 1298
> 1299                 if (nr_frags == MAX_SKB_FRAGS) {
> 1300                         unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> 1301
> 1302                         BUG_ON(pull_to <= skb_headlen(skb));
> 1303                         __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> 1304                         nr_frags = shinfo->nr_frags;
> 1305                 }
> 1306                 BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> 1307
> 1308                 __skb_fill_page_desc(skb, nr_frags,
> 1309                                      skb_frag_page(skb_shinfo(nskb)->frags),
> 1310                                      rx->offset, rx->status);
> 
> Can I conclude that nr_frags == MAX_SKB_FRAGS, pull_to <=
> skb_headlen(skb) and the panic happens before the next BUG_ON is reached ?
> 

So the oops message below is from a kernel with Jan's patch applied,
right? The quoting level confuses me.

> > [  717.568040] netfront: Too many frags
> > [  723.768738] ------------[ cut here ]------------
> > [  723.769226] kernel BUG at drivers/xen/netfront/netfront.c:1302!

From your description and the line number I presume it is a kernel with
Jan's patch applied.

Looking at Jan's patch for upstream I discovered a problem that might
lead to this crash.


Wei.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-05 14:53         ` Wei Liu
  2013-07-07  1:10           ` David Miller
@ 2013-07-07  1:10           ` David Miller
  2013-07-08  9:59           ` Jan Beulich
  2013-07-08  9:59           ` Jan Beulich
  3 siblings, 0 replies; 85+ messages in thread
From: David Miller @ 2013-07-07  1:10 UTC (permalink / raw)
  To: wei.liu2; +Cc: JBeulich, ian.campbell, g.w.kant, xen-devel, netdev, stable

From: Wei Liu <wei.liu2@citrix.com>
Date: Fri, 5 Jul 2013 15:53:19 +0100

> You seem to be altering the behavior of the original code, because in
> your patch the skb->len is incremented before use, while in the original
> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
> correctly set.

Indeed, I think you're right.

Because __pskb_pull_tail() decrements the number of bytes pulled
from skb->data_len, it won't be accounted for in skb->len.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8
  2013-07-05 19:46           ` Dion Kant
  2013-07-06 13:36             ` Wei Liu
@ 2013-07-08  8:25             ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-08  8:25 UTC (permalink / raw)
  To: Dion Kant; +Cc: Wei Liu, xen-devel

>>> On 05.07.13 at 21:46, Dion Kant <g.w.kant@hunenet.nl> wrote:
> After obtaining results with your patch
> 
> 1285 static RING_IDX xennet_fill_frags(struct netfront_info *np,
> 1286                                   struct sk_buff *skb,
> 1287                                   struct sk_buff_head *list)
> 1288 {
> 1289         struct skb_shared_info *shinfo = skb_shinfo(skb);
> 1290         int nr_frags = shinfo->nr_frags;
> 1291         RING_IDX cons = np->rx.rsp_cons;
> 1292         struct sk_buff *nskb;
> 1293
> 1294         while ((nskb = __skb_dequeue(list))) {
> 1295                 struct netif_rx_response *rx =
> 1296                         RING_GET_RESPONSE(&np->rx, ++cons);
> 1297
> 1298
> 1299                 if (nr_frags == MAX_SKB_FRAGS) {
> 1300                         unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> 1301
> 1302                         BUG_ON(pull_to <= skb_headlen(skb));
> 1303                         __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> 1304                         nr_frags = shinfo->nr_frags;
> 1305                 }
> 1306                 BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> 1307
> 1308                 __skb_fill_page_desc(skb, nr_frags,
> 1309                                      skb_frag_page(skb_shinfo(nskb)->frags),
> 1310                                      rx->offset, rx->status);
> 
> Can I conclude that nr_frags == MAX_SKB_FRAGS, pull_to <=
> skb_headlen(skb), and the panic happens before the next BUG_ON is reached?

Yes, exactly. And Wei pointed at an issue with the patch, so I'll
have to see whether that might be an explanation for the crash
you observed.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-05 14:53         ` Wei Liu
  2013-07-07  1:10           ` David Miller
  2013-07-07  1:10           ` David Miller
@ 2013-07-08  9:59           ` Jan Beulich
  2013-07-08 12:16             ` Dion Kant
                               ` (2 more replies)
  2013-07-08  9:59           ` Jan Beulich
  3 siblings, 3 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-08  9:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
>> --- a/drivers/net/xen-netfront.c
>> +++ b/drivers/net/xen-netfront.c
>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
>>  			RING_GET_RESPONSE(&np->rx, ++cons);
>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>>  
>> +		if (nr_frags == MAX_SKB_FRAGS) {
>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>> +
>> +			BUG_ON(pull_to <= skb_headlen(skb));
>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> 
> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
> caller code:
> 
>     while loop {
>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> 	skb->data_len = rx->status;
> 
> 	i = xennet_fill_frags(np, skb, &tmpq);
> 
> 	/*
> 	 * Truesize is the actual allocation size, even if the
> 	 * allocation is only partially used.
> 	 */
> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> 	skb->len += skb->data_len;
>     }
> 
>     handle_incoming_packet();
> 
> You seem to be altering the behavior of the original code, because in
> your patch the skb->len is incremented before use, while in the original
> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
> correctly set.

Right. So I basically need to keep skb->len up-to-date along with
->data_len. Just handed a patch to Dion with that done; I'll defer
sending a v2 for the upstream code until I know the change works
for our kernel.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08  9:59           ` Jan Beulich
@ 2013-07-08 12:16             ` Dion Kant
  2013-07-08 12:41               ` Jan Beulich
  2013-07-08 12:41               ` Jan Beulich
  2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
  2013-07-08 14:20             ` Jan Beulich
  2 siblings, 2 replies; 85+ messages in thread
From: Dion Kant @ 2013-07-08 12:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Ian Campbell, netdev, stable, xen-devel, davem

On 07/08/2013 11:59 AM, Jan Beulich wrote:
>>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
>> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
>>> --- a/drivers/net/xen-netfront.c
>>> +++ b/drivers/net/xen-netfront.c
>>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
>>>  			RING_GET_RESPONSE(&np->rx, ++cons);
>>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>>>  
>>> +		if (nr_frags == MAX_SKB_FRAGS) {
>>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>> +
>>> +			BUG_ON(pull_to <= skb_headlen(skb));
>>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>>
>> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
>> caller code:
>>
>>     while loop {
>>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> 	skb->data_len = rx->status;
>>
>> 	i = xennet_fill_frags(np, skb, &tmpq);
>>
>> 	/*
>> 	 * Truesize is the actual allocation size, even if the
>> 	 * allocation is only partially used.
>> 	 */
>> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>> 	skb->len += skb->data_len;
>>     }
>>
>>     handle_incoming_packet();
>>
>> You seem to be altering the behavior of the original code, because in
>> your patch the skb->len is incremented before use, while in the original
>> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
>> correctly set.
> 
> Right. So I basically need to keep skb->len up-to-date along with
> ->data_len. Just handed a patch to Dion with that done; I'll defer
> sending a v2 for the upstream code until I know the change works
> for our kernel.
> 
> Jan
> 

Jan,

I was wondering about the following.

In netif_poll(struct napi_struct *napi, int budget) the skb->cb is assigned,
but it may be clipped to RX_COPY_THRESHOLD

		NETFRONT_SKB_CB(skb)->pull_to = rx->status;
		if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
		      NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;

How does this modification of NETFRONT_SKB_CB(skb)->pull_to propagate,
or is it irrelevant?

Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 12:16             ` Dion Kant
@ 2013-07-08 12:41               ` Jan Beulich
  2013-07-08 12:41               ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-08 12:41 UTC (permalink / raw)
  To: Dion Kant; +Cc: Ian Campbell, Wei Liu, davem, xen-devel, netdev, stable

>>> On 08.07.13 at 14:16, Dion Kant <g.w.kant@hunenet.nl> wrote:
> I was wondering about the following.
> 
> In netif_poll(struct napi_struct *napi, int budget) the skb->cb is assigned,
> but it may be clipped to RX_COPY_THRESHOLD
> 
> 		NETFRONT_SKB_CB(skb)->pull_to = rx->status;
> 		if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
> 		      NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;
> 
> How does this modification of NETFRONT_SKB_CB(skb)->pull_to propagate,
> or is it irrelevant?

I'm not sure I understand what you're asking. pull_to is a private
(to netfront) field used to communicate how much of the skb
needs to be pulled. I don't see how the word "propagate" makes
any sense here, but as said I may simply not be getting what
you're asking about.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08  9:59           ` Jan Beulich
  2013-07-08 12:16             ` Dion Kant
@ 2013-07-08 14:20             ` Jan Beulich
  2013-07-08 15:22               ` Eric Dumazet
                                 ` (5 more replies)
  2013-07-08 14:20             ` Jan Beulich
  2 siblings, 6 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-08 14:20 UTC (permalink / raw)
  To: Wei Liu, davem; +Cc: Ian Campbell, Dion Kant, xen-devel, netdev, stable

>>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
>> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
>>> --- a/drivers/net/xen-netfront.c
>>> +++ b/drivers/net/xen-netfront.c
>>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
>>>  			RING_GET_RESPONSE(&np->rx, ++cons);
>>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>>>  
>>> +		if (nr_frags == MAX_SKB_FRAGS) {
>>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>> +
>>> +			BUG_ON(pull_to <= skb_headlen(skb));
>>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>> 
>> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
>> caller code:
>> 
>>     while loop {
>>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> 	skb->data_len = rx->status;
>> 
>> 	i = xennet_fill_frags(np, skb, &tmpq);
>> 
>> 	/*
>> 	 * Truesize is the actual allocation size, even if the
>> 	 * allocation is only partially used.
>> 	 */
>> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>> 	skb->len += skb->data_len;
>>     }
>> 
>>     handle_incoming_packet();
>> 
>> You seem to be altering the behavior of the original code, because in
>> your patch the skb->len is incremented before use, while in the original
>> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
>> correctly set.
> 
> Right. So I basically need to keep skb->len up-to-date along with
> ->data_len. Just handed a patch to Dion with that done; I'll defer
> sending a v2 for the upstream code until I know the change works
> for our kernel.

Okay, so with that done (see below) Dion is now seeing the
WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
course, with it having crashed before, it's hard to tell whether the
triggering now is an effect of the patch, or just got unmasked by it.

Looking over the ->truesize handling, I can't see how the change
here could break things: RX_COPY_THRESHOLD is already
accounted for by how alloc_skb() gets called, and the increment
right after the call to xennet_fill_frags() should now be even more
correct than before (since __pskb_pull_tail() can drop fragments,
which would then have made this an over-estimation afaict).

That all said with me knowing pretty little about the networking
code, so I'd appreciate if you could point out anything wrong with
my idea of how things work. Additionally - is my fundamental (for
this patch) assumption right that multiple __pskb_pull_tail() calls
are cumulative (i.e. calling it twice with a delta of
pull_to - skb_headlen(skb) would indeed end up pulling up to
pull_to, provided there is enough data)?

Jan

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -831,10 +831,20 @@ static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
+		if (nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+			nr_frags = shinfo->nr_frags;
+		}
+		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
+
 		__skb_fill_page_desc(skb, nr_frags,
 				     skb_frag_page(nfrag),
 				     rx->offset, rx->status);
 
+		skb->len += rx->status;
 		skb->data_len += rx->status;
 
 		skb_shinfo(nskb)->nr_frags = 0;
@@ -929,7 +939,8 @@ static int handle_incoming_queue(struct 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
@@ -1014,7 +1025,7 @@ err:
 
 		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
 		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
-		skb->data_len = rx->status;
+		skb->len = skb->data_len = rx->status;
 
 		i = xennet_fill_frags(np, skb, &tmpq);
 
@@ -1023,7 +1034,6 @@ err:
                  * allocation is only partially used.
                  */
 		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
-		skb->len += skb->data_len;
 
 		if (rx->flags & XEN_NETRXF_csum_blank)
 			skb->ip_summed = CHECKSUM_PARTIAL;

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
  2013-07-08 15:22               ` Eric Dumazet
@ 2013-07-08 15:22               ` Eric Dumazet
  2013-07-09  7:47                 ` Jan Beulich
  2013-07-09  7:47                 ` Jan Beulich
  2013-07-08 15:48               ` [Xen-devel] " Wei Liu
                                 ` (3 subsequent siblings)
  5 siblings, 2 replies; 85+ messages in thread
From: Eric Dumazet @ 2013-07-08 15:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, davem, Ian Campbell, Dion Kant, xen-devel, netdev, stable

On Mon, 2013-07-08 at 15:20 +0100, Jan Beulich wrote:

> Okay, so with that done (see below) Dion is now seeing the
> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
> course, with it having crashed before, it's hard to tell whether the
> triggering now is an effect of the patch, or just got unmasked by it.

Please note this warning is not the sign of an error.

It can be triggered because the network stack (IP + TCP) had to reallocate
skb->head to hold all the headers.

pskb_may_pull() doesn't change skb->truesize, for various reasons, so we
can have a situation where the warning triggers.

For example, it can happen when drivers use a really small skb->head,
like 64 bytes, as it's not big enough to hold ethernet+IP+TCP headers.

For example, I did a patch on a wifi driver (iwl3945: better skb
management in rx path) because of this sub-optimal skb->head sizing.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
  2013-07-08 15:22               ` Eric Dumazet
  2013-07-08 15:22               ` [Xen-devel] " Eric Dumazet
@ 2013-07-08 15:48               ` Wei Liu
  2013-07-09  6:52                 ` Jan Beulich
  2013-07-09  6:52                 ` [Xen-devel] " Jan Beulich
  2013-07-08 15:48               ` Wei Liu
                                 ` (2 subsequent siblings)
  5 siblings, 2 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-08 15:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, davem, Ian Campbell, Dion Kant, xen-devel, netdev, stable

On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
> >>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
> >> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
> >>> --- a/drivers/net/xen-netfront.c
> >>> +++ b/drivers/net/xen-netfront.c
> >>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
> >>>  			RING_GET_RESPONSE(&np->rx, ++cons);
> >>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
> >>>  
> >>> +		if (nr_frags == MAX_SKB_FRAGS) {
> >>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> >>> +
> >>> +			BUG_ON(pull_to <= skb_headlen(skb));
> >>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> >> 
> >> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
> >> caller code:
> >> 
> >>     while loop {
> >>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> 	skb->data_len = rx->status;
> >> 
> >> 	i = xennet_fill_frags(np, skb, &tmpq);
> >> 
> >> 	/*
> >> 	 * Truesize is the actual allocation size, even if the
> >> 	 * allocation is only partially used.
> >> 	 */
> >> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> >> 	skb->len += skb->data_len;
> >>     }
> >> 
> >>     handle_incoming_packet();
> >> 
> >> You seem to be altering the behavior of the original code, because in
> >> your patch the skb->len is incremented before use, while in the original
> >> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
> >> correctly set.
> > 
> > Right. So I basically need to keep skb->len up-to-date along with
> > ->data_len. Just handed a patch to Dion with that done; I'll defer
> > sending a v2 for the upstream code until I know the change works
> > for our kernel.
> 
> Okay, so with that done (see below) Dion is now seeing the
> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
> course, with it having crashed before, it's hard to tell whether the
> triggering now is an effect of the patch, or just got unmasked by it.
> 

I think the previous crash was due to incorrect accounting.

  BUG_ON(pull_to <= skb_headlen(skb))

  skb_headlen(skb) = skb->len - skb->data_len

In your first patch, skb_headlen(skb) can return a very large number if
skb->len is smaller than skb->data_len (they are both unsigned), because
before calling skb_headlen() skb->data_len is incremented while skb->len
is not.

As for the WARN_ON_ONCE(delta < len), I haven't looked that far, as I
discovered a problem with your new patch. Please see the comments below.

> Looking over the ->truesize handling, I can't see how the change
> here could break things: RX_COPY_THRESHOLD is already
> accounted for by how alloc_skb() gets called, and the increment
> right after the call to xennet_fill_frags() should now be even more
> correct than before (since __pskb_pull_tail() can drop fragments,
> which would then have made this an over-estimation afaict).
> 
> That all said with me knowing pretty little about the networking
> code, so I'd appreciate if you could point out anything wrong with
> my idea of how things work. Additionally - is my fundamental (for
> this patch) assumption right that multiple __pskb_pull_tail() calls
> are cumulative (i.e. calling it twice with a delta of
> pull_to - skb_headlen(skb) would indeed end up pulling up to
> pull_to, provided there is enough data)?
> 
> Jan
> 
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -831,10 +831,20 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> +		if (nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> +
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +			nr_frags = shinfo->nr_frags;
> +		}
> +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> +
>  		__skb_fill_page_desc(skb, nr_frags,
>  				     skb_frag_page(nfrag),
>  				     rx->offset, rx->status);
>  
> +		skb->len += rx->status;
>  		skb->data_len += rx->status;
>  
>  		skb_shinfo(nskb)->nr_frags = 0;
> @@ -929,7 +939,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -1014,7 +1025,7 @@ err:
>  
>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> -		skb->data_len = rx->status;
> +		skb->len = skb->data_len = rx->status;

This is not correct. You should not be needing this. Now you lose count
of SKB head len. Try to go without the above line and see if it makes a
difference?


Wei.

>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> @@ -1023,7 +1034,6 @@ err:
>                   * allocation is only partially used.
>                   */
>  		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> -		skb->len += skb->data_len;
>  
>  		if (rx->flags & XEN_NETRXF_csum_blank)
>  			skb->ip_summed = CHECKSUM_PARTIAL;

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
                                 ` (2 preceding siblings ...)
  2013-07-08 15:48               ` [Xen-devel] " Wei Liu
@ 2013-07-08 15:48               ` Wei Liu
  2013-07-12  8:32               ` [Xen-devel] " Wei Liu
  2013-07-12  8:32               ` Wei Liu
  5 siblings, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-08 15:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
> >>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
> >> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
> >>> --- a/drivers/net/xen-netfront.c
> >>> +++ b/drivers/net/xen-netfront.c
> >>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
> >>>  			RING_GET_RESPONSE(&np->rx, ++cons);
> >>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
> >>>  
> >>> +		if (nr_frags == MAX_SKB_FRAGS) {
> >>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> >>> +
> >>> +			BUG_ON(pull_to <= skb_headlen(skb));
> >>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> >> 
> >> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
> >> caller code:
> >> 
> >>     while loop {
> >>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> 	skb->data_len = rx->status;
> >> 
> >> 	i = xennet_fill_frags(np, skb, &tmpq);
> >> 
> >> 	/*
> >> 	 * Truesize is the actual allocation size, even if the
> >> 	 * allocation is only partially used.
> >> 	 */
> >> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> >> 	skb->len += skb->data_len;
> >>     }
> >> 
> >>     handle_incoming_packet();
> >> 
> >> You seem to be altering the behavior of the original code, because in
> >> your patch the skb->len is incremented before use, while in the original
> >> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
> >> correctly set.
> > 
> > Right. So I basically need to keep skb->len up-to-date along with
> > ->data_len. Just handed a patch to Dion with that done; I'll defer
> > sending a v2 for the upstream code until I know the change works
> > for our kernel.
> 
> Okay, so with that done (see below) Dion is now seeing the
> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
> course, with it having crashed before, it's hard to tell whether the
> triggering now is an effect of the patch, or just got unmasked by it.
> 

I think the previous crash was due to incorrect accounting.

  BUG_ON(pull_to <= skb_headlen(skb))

  skb_headlen(skb) = skb->len - skb->data_len

In your first patch, skb_headlen(skb) can return a very large number if
skb->len is smaller than skb->data_len (they are both unsigned), because
before calling skb_headlen() skb->data_len is incremented while skb->len
is not.

As for the WARN_ON_ONCE(delta < len), I haven't looked that far, as I
discovered a problem with your new patch. Please see the comments below.

> Looking over the ->truesize handling, I can't see how the change
> here could break things: RX_COPY_THRESHOLD is already
> accounted for by how alloc_skb() gets called, and the increment
> right after the call to xennet_fill_frags() should now be even more
> correct than before (since __pskb_pull_tail() can drop fragments,
> which would then have made this an over-estimation afaict).
> 
> That all said with me knowing pretty little about the networking
> code, so I'd appreciate if you could point out anything wrong with
> my idea of how things work. Additionally - is my fundamental (for
> this patch) assumption right that multiple __pskb_pull_tail() call
> are cumulative (i.e. calling it twice with a delta of
> pull_to - skb_headlen(skb) would indeed end up pulling up to
> pull_to, provided there is enough data)?
> 
> Jan
> 
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -831,10 +831,20 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> +		if (nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> +
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +			nr_frags = shinfo->nr_frags;
> +		}
> +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> +
>  		__skb_fill_page_desc(skb, nr_frags,
>  				     skb_frag_page(nfrag),
>  				     rx->offset, rx->status);
>  
> +		skb->len += rx->status;
>  		skb->data_len += rx->status;
>  
>  		skb_shinfo(nskb)->nr_frags = 0;
> @@ -929,7 +939,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -1014,7 +1025,7 @@ err:
>  
>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> -		skb->data_len = rx->status;
> +		skb->len = skb->data_len = rx->status;

This is not correct. You should not be needing this. Now you lose count
of SKB head len. Try to go without the above line and see if it makes a
difference?


Wei.

>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> @@ -1023,7 +1034,6 @@ err:
>                   * allocation is only partially used.
>                   */
>  		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> -		skb->len += skb->data_len;
>  
>  		if (rx->flags & XEN_NETRXF_csum_blank)
>  			skb->ip_summed = CHECKSUM_PARTIAL;

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 15:48               ` [Xen-devel] " Wei Liu
  2013-07-09  6:52                 ` Jan Beulich
@ 2013-07-09  6:52                 ` Jan Beulich
  2013-07-09 16:51                   ` Wei Liu
  2013-07-09 16:51                   ` [Xen-devel] " Wei Liu
  1 sibling, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-09  6:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

>>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>> @@ -1014,7 +1025,7 @@ err:
>>  
>>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> -		skb->data_len = rx->status;
>> +		skb->len = skb->data_len = rx->status;
> 
> This is not correct. You should not be needing this. Now you lose count
> of SKB head len. Try to go without the above line and see if it makes a
> difference?

I don't follow - at this point, there's 0 bytes of head (this only
changes with the first call to __pskb_pull_tail()). Hence ->len ==
->data_len seems correct to me (and afaict pulling would do the
wrong thing if I dropped that change).

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 15:48               ` [Xen-devel] " Wei Liu
@ 2013-07-09  6:52                 ` Jan Beulich
  2013-07-09  6:52                 ` [Xen-devel] " Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-09  6:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

>>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>> @@ -1014,7 +1025,7 @@ err:
>>  
>>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> -		skb->data_len = rx->status;
>> +		skb->len = skb->data_len = rx->status;
> 
> This is not correct. You should not be needing this. Now you lose count
> of SKB head len. Try to go without the above line and see if it makes a
> difference?

I don't follow - at this point, there's 0 bytes of head (this only
changes with the first call to __pskb_pull_tail()). Hence ->len ==
->data_len seems correct to me (and afaict pulling would do the
wrong thing if I dropped that change).

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 15:22               ` [Xen-devel] " Eric Dumazet
@ 2013-07-09  7:47                 ` Jan Beulich
  2013-07-09  7:47                 ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-09  7:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ian Campbell, Wei Liu, davem, Dion Kant, xen-devel, netdev, stable

>>> On 08.07.13 at 17:22, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2013-07-08 at 15:20 +0100, Jan Beulich wrote:
> 
>> Okay, so with that done (see below) Dion is now seeing the
>> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
>> course, with it having crashed before, it's hard to tell whether the
>> triggering now is an effect of the patch, or just got unmasked by it.
> 
> Please note this warning is not the sign of an error.

Thanks for pointing this out.

> It can be triggered because network stack (IP + TCP) had to reallocate
> skb->head to hold all the headers.
> 
> pskb_may_pull() doesn't change skb->truesize, for various reasons, so we
> can have a situation where the warning triggers.
> 
> For example, it can happen when drivers use a really small skb->head,
> like 64 bytes, as its not big enough to hold ethernet+IP+TCP headers.

We know from the crashes prior to the patch here that this is
exactly what is happening.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 15:22               ` [Xen-devel] " Eric Dumazet
  2013-07-09  7:47                 ` Jan Beulich
@ 2013-07-09  7:47                 ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-09  7:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Wei Liu, Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

>>> On 08.07.13 at 17:22, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2013-07-08 at 15:20 +0100, Jan Beulich wrote:
> 
>> Okay, so with that done (see below) Dion is now seeing the
>> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
>> course, with it having crashed before, it's hard to tell whether the
>> triggering now is an effect of the patch, or just got unmasked by it.
> 
> Please note this warning is not the sign of an error.

Thanks for pointing this out.

> It can be triggered because network stack (IP + TCP) had to reallocate
> skb->head to hold all the headers.
> 
> pskb_may_pull() doesn't change skb->truesize, for various reasons, so we
> can have a situation where the warning triggers.
> 
> For example, it can happen when drivers use a really small skb->head,
> like 64 bytes, as its not big enough to hold ethernet+IP+TCP headers.

We know from the crashes prior to the patch here that this is
exactly what is happening.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-09  6:52                 ` [Xen-devel] " Jan Beulich
  2013-07-09 16:51                   ` Wei Liu
@ 2013-07-09 16:51                   ` Wei Liu
  2013-07-10  6:58                     ` Jan Beulich
  2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
  1 sibling, 2 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-09 16:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >> @@ -1014,7 +1025,7 @@ err:
> >>  
> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> -		skb->data_len = rx->status;
> >> +		skb->len = skb->data_len = rx->status;
> > 
> > This is not correct. You should not be needing this. Now you lose count
> > of SKB head len. Try to go without the above line and see if it makes a
> > difference?
> 
> I don't follow - at this point, there's 0 bytes of head (this only
> changes with the first call to __pskb_pull_tail()). Hence ->len ==
> ->data_len seems correct to me (and afaict pulling would do the
> wrong thing if I dropped that change).
> 

My bad, I suggested the wrong thing. :-(

But I would prefer skb->len += skb->data_len. When skb->len == 0 it is
the same as your line, and when skb->len is non-zero it also does the
right thing.

As for the warning in skb_try_coalesce, I don't see any direct call to
it in netfront, so I will need to think about it. It looks like it's
really something very deep in the stack.


Wei.

> Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-09  6:52                 ` [Xen-devel] " Jan Beulich
@ 2013-07-09 16:51                   ` Wei Liu
  2013-07-09 16:51                   ` [Xen-devel] " Wei Liu
  1 sibling, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-09 16:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >> @@ -1014,7 +1025,7 @@ err:
> >>  
> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> -		skb->data_len = rx->status;
> >> +		skb->len = skb->data_len = rx->status;
> > 
> > This is not correct. You should not be needing this. Now you lose count
> > of SKB head len. Try to go without the above line and see if it makes a
> > difference?
> 
> I don't follow - at this point, there's 0 bytes of head (this only
> changes with the first call to __pskb_pull_tail()). Hence ->len ==
> ->data_len seems correct to me (and afaict pulling would do the
> wrong thing if I dropped that change).
> 

My bad, I suggested the wrong thing. :-(

But I would prefer skb->len += skb->data_len. When skb->len == 0 it is
the same as your line, and when skb->len is non-zero it also does the
right thing.

As for the warning in skb_try_coalesce, I don't see any direct call to
it in netfront, so I will need to think about it. It looks like it's
really something very deep in the stack.


Wei.

> Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-09 16:51                   ` [Xen-devel] " Wei Liu
  2013-07-10  6:58                     ` Jan Beulich
@ 2013-07-10  6:58                     ` Jan Beulich
  2013-07-10 10:04                       ` Wei Liu
                                         ` (3 more replies)
  1 sibling, 4 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-10  6:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

>>> On 09.07.13 at 18:51, Wei Liu <wei.liu2@citrix.com> wrote:
> On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
>> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
>> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>> >> @@ -1014,7 +1025,7 @@ err:
>> >>  
>> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> >> -		skb->data_len = rx->status;
>> >> +		skb->len = skb->data_len = rx->status;
>> > 
>> > This is not correct. You should not be needing this. Now you lose count
>> > of SKB head len. Try to go without the above line and see if it makes a
>> > difference?
>> 
>> I don't follow - at this point, there's 0 bytes of head (this only
>> changes with the first call to __pskb_pull_tail()). Hence ->len ==
>> ->data_len seems correct to me (and afaict pulling would do the
>> wrong thing if I dropped that change).
>> 
> 
> My bad, I suggested the wrong thing. :-(
> 
> But I would prefer skb->len += skb->data_len. When skb->len == 0 it is
> the same as your line, and when skb->len is non-zero it also does the
> right thing.

I can do that, albeit I don't see how ->len could end up non-zero
here.

> As for the warning in skb_try_coalesce, I don't see any direct call to
> it in netfront, so I will need to think about it. It looks like it's
> really something very deep in the stack.

Yes, as the call stack provided by Dion proves. The question
really is whether the patch somehow results in ->truesize to be
incorrect, or whether - as Eric points out - this is "normal" for
the sort of special SKBs here (having a rather small headlen). If
what he says is applicable here, it may hint at the pulling we do
still not being sufficient for the full TCP header to be in the linear
part (which iirc is the main [if not the only purpose] of us doing
the pull in the first place).

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-09 16:51                   ` [Xen-devel] " Wei Liu
@ 2013-07-10  6:58                     ` Jan Beulich
  2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-10  6:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

>>> On 09.07.13 at 18:51, Wei Liu <wei.liu2@citrix.com> wrote:
> On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
>> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
>> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>> >> @@ -1014,7 +1025,7 @@ err:
>> >>  
>> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> >> -		skb->data_len = rx->status;
>> >> +		skb->len = skb->data_len = rx->status;
>> > 
>> > This is not correct. You should not be needing this. Now you lose count
>> > of SKB head len. Try to go without the above line and see if it makes a
>> > difference?
>> 
>> I don't follow - at this point, there's 0 bytes of head (this only
>> changes with the first call to __pskb_pull_tail()). Hence ->len ==
>> ->data_len seems correct to me (and afaict pulling would do the
>> wrong thing if I dropped that change).
>> 
> 
> My bad, I suggested the wrong thing. :-(
> 
> But I would prefer skb->len += skb->data_len. When skb->len == 0 it is
> the same as your line, and when skb->len is non-zero it also does the
> right thing.

I can do that, albeit I don't see how ->len could end up non-zero
here.

> As for the warning in skb_try_coalesce, I don't see any direct call to
> it in netfront, so I will need to think about it. It looks like it's
> really something very deep in the stack.

Yes, as the call stack provided by Dion proves. The question
really is whether the patch somehow results in ->truesize to be
incorrect, or whether - as Eric points out - this is "normal" for
the sort of special SKBs here (having a rather small headlen). If
what he says is applicable here, it may hint at the pulling we do
still not being sufficient for the full TCP header to be in the linear
part (which iirc is the main [if not the only purpose] of us doing
the pull in the first place).

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
  2013-07-10 10:04                       ` Wei Liu
@ 2013-07-10 10:04                       ` Wei Liu
  2013-07-10 10:46                         ` Jan Beulich
  2013-07-10 10:46                         ` Jan Beulich
  2013-07-10 13:19                       ` [Xen-devel] " Eric Dumazet
  2013-07-10 13:19                       ` Eric Dumazet
  3 siblings, 2 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-10 10:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

On Wed, Jul 10, 2013 at 07:58:06AM +0100, Jan Beulich wrote:
> >>> On 09.07.13 at 18:51, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
> >> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> >> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >> >> @@ -1014,7 +1025,7 @@ err:
> >> >>  
> >> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> >> -		skb->data_len = rx->status;
> >> >> +		skb->len = skb->data_len = rx->status;
> >> > 
> >> > This is not correct. You should not be needing this. Now you lose count
> >> > of SKB head len. Try to go without the above line and see if it makes a
> >> > difference?
> >> 
> >> I don't follow - at this point, there's 0 bytes of head (this only
> >> changes with the first call to __pskb_pull_tail()). Hence ->len ==
> >> ->data_len seems correct to me (and afaict pulling would do the
> >> wrong thing if I dropped that change).
> >> 
> > 
> > My bad, I suggested the wrong thing. :-(
> > 
> > But I would prefer skb->len += skb->data_len. When skb->len == 0 it is
> > the same as your line, and when skb->len is non-zero it also does the
> > right thing.
> 
> I can do that, albeit I don't see how ->len could end up non-zero
> here.
> 
> > As for the warning in skb_try_coalesce, I don't see any direct call to
> > it in netfront, so I will need to think about it. It looks like it's
> > really something very deep in the stack.
> 
> Yes, as the call stack provided by Dion proves. The question
> really is whether the patch somehow results in ->truesize to be
> incorrect, or whether - as Eric points out - this is "normal" for
> the sort of special SKBs here (having a rather small headlen). If
> what he says is applicable here, it may hint at the pulling we do
> still not being sufficient for the full TCP header to be in the linear

__netdev_alloc_skb in netfront is fed with RX_COPY_THRESHOLD+NET_IP_ALIGN.
RX_COPY_THRESHOLD is 256 while MAX_TCP_HEADER can be as large as 256+48
depending on the kernel configuration.

> part (which iirc is the main [if not the only purpose] of us doing
> the pull in the first place).
> 

Ian, any comment on this?

Jan, looking at the commit log, the overrun issue in
xennet_get_responses was not introduced by __pskb_pull_tail. The call to
xennet_fill_frags has always been in the same place.

Now I start to think about the purpose of "max = MAX_SKB_FRAGS +
(rx->status <= RX_COPY_THRESHOLD)" in xennet_get_responses which queues
up to MAX_SKB_FRAGS+1 responses. I'm not clear about the rationale
of that line.


Wei.

> Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
@ 2013-07-10 10:04                       ` Wei Liu
  2013-07-10 10:04                       ` [Xen-devel] " Wei Liu
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-10 10:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, netdev, stable, xen-devel, Dion Kant, davem

On Wed, Jul 10, 2013 at 07:58:06AM +0100, Jan Beulich wrote:
> >>> On 09.07.13 at 18:51, Wei Liu <wei.liu2@citrix.com> wrote:
> > On Tue, Jul 09, 2013 at 07:52:31AM +0100, Jan Beulich wrote:
> >> >>> On 08.07.13 at 17:48, Wei Liu <wei.liu2@citrix.com> wrote:
> >> > On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >> >> @@ -1014,7 +1025,7 @@ err:
> >> >>  
> >> >>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >> >>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> >> -		skb->data_len = rx->status;
> >> >> +		skb->len = skb->data_len = rx->status;
> >> > 
> >> > This is not correct. You should not be needing this. Now you lose count
> >> > of SKB head len. Try to go without the above line and see if it makes a
> >> > difference?
> >> 
> >> I don't follow - at this point, there's 0 bytes of head (this only
> >> changes with the first call to __pskb_pull_tail()). Hence ->len ==
> >> ->data_len seems correct to me (and afaict pulling would do the
> >> wrong thing if I dropped that change).
> >> 
> > 
> > My bad, I suggested the wrong thing. :-(
> > 
> > But I would prefer skb->len += skb->data_len. In the case that skb->len
> > == 0 it's the same as your line while skb->len is not zero it would also
> > do the right thing.
> 
> I can do that, albeit I don't see how ->len could end up non-zero
> here.
> 
> > As for the warning in skb_try_coalesce, I don't see any direct call to
> > it in netfront, I will need to think about it. It looks like it's really
> > something very deep in the stack.
> 
> Yes, as the call stack provided by Dion proves. The question
> really is whether the patch somehow results in ->truesize to be
> incorrect, or whether - as Eric points out - this is "normal" for
> the sort of special SKBs here (having a rather small headlen). If
> what he says is applicable here, it may hint at the pulling we do
> still not being sufficient for the full TCP header to be in the linear

__netdev_alloc_skb in netfront is fed with RX_COPY_THRESHOLD+NET_IP_ALIGN.
RX_COPY_THRESHOLD is 256 while MAX_TCP_HEADER can be as large as 256+48
depending on the kernel configuration.

> part (which iirc is the main [if not the only purpose] of us doing
> the pull in the first place).
> 

Ian, any comment on this?

Jan, looking at the commit log, the overrun issue in
xennet_get_responses was not introduced by __pskb_pull_tail. The call to
xennet_fill_frags has always been in the same place.

Now I start to think about the purpose of "max = MAX_SKB_FRAGS +
(rx->status <= RX_COPY_THRESHOLD)" in xennet_get_responses which queues
up to MAX_SKB_FRAGS+1 responses. I'm not clear about the rationale
of that line.


Wei.

> Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10 10:04                       ` [Xen-devel] " Wei Liu
@ 2013-07-10 10:46                         ` Jan Beulich
  2013-07-10 12:50                           ` Ian Campbell
  2013-07-10 12:50                           ` Ian Campbell
  2013-07-10 10:46                         ` Jan Beulich
  1 sibling, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-10 10:46 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu; +Cc: davem, Dion Kant, xen-devel, netdev, stable

>>> On 10.07.13 at 12:04, Wei Liu <wei.liu2@citrix.com> wrote:
> Jan, looking at the commit log, the overrun issue in
> xennet_get_responses was not introduced by __pskb_pull_tail. The call to
> xennet_fill_frags has always been in the same place.

I'm convinced it was: Prior to that commit, if the first response slot
contained up to RX_COPY_THRESHOLD bytes, it got entirely
consumed into the linear portion of the SKB, leaving the number of
fragments available for filling at MAX_SKB_FRAGS. Said commit
dropped the early copying, leaving the fragment count at 1
unconditionally, and now accumulates all of the response slots into
fragments, only pulling after all of them got filled in. It neglected to
realize - due to the count now always being 1 at the beginning - that
this can lead to MAX_SKB_FRAGS + 1 frags getting filled, corrupting
memory.

Ian - I have to admit that I'm slightly irritated by you so far not
having participated at all in sorting out the fix for this bug that a
change of yours introduced.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10 10:04                       ` [Xen-devel] " Wei Liu
  2013-07-10 10:46                         ` Jan Beulich
@ 2013-07-10 10:46                         ` Jan Beulich
  1 sibling, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-10 10:46 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu; +Cc: netdev, Dion Kant, davem, stable, xen-devel

>>> On 10.07.13 at 12:04, Wei Liu <wei.liu2@citrix.com> wrote:
> Jan, looking at the commit log, the overrun issue in
> xennet_get_responses was not introduced by __pskb_pull_tail. The call to
> xennet_fill_frags has always been in the same place.

I'm convinced it was: Prior to that commit, if the first response slot
contained up to RX_COPY_THRESHOLD bytes, it got entirely
consumed into the linear portion of the SKB, leaving the number of
fragments available for filling at MAX_SKB_FRAGS. Said commit
dropped the early copying, leaving the fragment count at 1
unconditionally, and now accumulates all of the response slots into
fragments, only pulling after all of them got filled in. It neglected to
realize - due to the count now always being 1 at the beginning - that
this can lead to MAX_SKB_FRAGS + 1 frags getting filled, corrupting
memory.

Ian - I have to admit that I'm slightly irritated by you so far not
having participated at all in sorting out the fix for this bug that a
change of yours introduced.

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10 10:46                         ` Jan Beulich
@ 2013-07-10 12:50                           ` Ian Campbell
  2013-07-10 12:53                             ` Wei Liu
                                               ` (3 more replies)
  2013-07-10 12:50                           ` Ian Campbell
  1 sibling, 4 replies; 85+ messages in thread
From: Ian Campbell @ 2013-07-10 12:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, davem, Dion Kant, xen-devel, netdev, stable

On Wed, 2013-07-10 at 11:46 +0100, Jan Beulich wrote:
> >>> On 10.07.13 at 12:04, Wei Liu <wei.liu2@citrix.com> wrote:
> > Jan, looking at the commit log, the overrun issue in
> > xennet_get_responses was not introduced by __pskb_pull_tail. The call to
> > xennet_fill_frags has always been in the same place.
> 
> I'm convinced it was: Prior to that commit, if the first response slot
> contained up to RX_COPY_THRESHOLD bytes, it got entirely
> consumed into the linear portion of the SKB, leaving the number of
> fragments available for filling at MAX_SKB_FRAGS. Said commit
> dropped the early copying, leaving the fragment count at 1
> unconditionally, and now accumulates all of the response slots into
> fragments, only pulling after all of them got filled in. It neglected to
> realize - due to the count now always being 1 at the beginning - that
> this can lead to MAX_SKB_FRAGS + 1 frags getting filled, corrupting
> memory.

That argument makes sense to me.

Is it possible to hit a scenario where we need to pull more than
RX_COPY_THRESHOLD in order to fit all of the data in MAX_SKB_FRAGS?

> Ian - I have to admit that I'm slightly irritated by you so far not
> having participated at all in sorting out the fix for this bug that a
> change of yours introduced.

Sorry, I've been travelling and not following closely enough to realise
this was related to something I'd done.

Does this relate somehow to the patch Annie has sent out recently too?

Ian.


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10 12:50                           ` Ian Campbell
@ 2013-07-10 12:53                             ` Wei Liu
  2013-07-10 12:53                             ` Wei Liu
                                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-10 12:53 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Jan Beulich, Wei Liu, davem, Dion Kant, xen-devel, netdev, stable

On Wed, Jul 10, 2013 at 01:50:44PM +0100, Ian Campbell wrote:
> On Wed, 2013-07-10 at 11:46 +0100, Jan Beulich wrote:
> > >>> On 10.07.13 at 12:04, Wei Liu <wei.liu2@citrix.com> wrote:
> > > Jan, looking at the commit log, the overrun issue in
> > > xennet_get_responses was not introduced by __pskb_pull_tail. The call to
> > > xennet_fill_frags has always been in the same place.
> > 
> > I'm convinced it was: Prior to that commit, if the first response slot
> > contained up to RX_COPY_THRESHOLD bytes, it got entirely
> > consumed into the linear portion of the SKB, leaving the number of
> > fragments available for filling at MAX_SKB_FRAGS. Said commit
> > dropped the early copying, leaving the fragment count at 1
> > unconditionally, and now accumulates all of the response slots into
> > fragments, only pulling after all of them got filled in. It neglected to
> > realize - due to the count now always being 1 at the beginning - that
> > this can lead to MAX_SKB_FRAGS + 1 frags getting filled, corrupting
> > memory.
> 
> That argument makes sense to me.
> 
> Is it possible to hit a scenario where we need to pull more than
> RX_COPY_THRESHOLD in order to fit all of the data in MAX_SKB_FRAGS ?
> 
> > Ian - I have to admit that I'm slightly irritated by you so far not
> > having participated at all in sorting out the fix for this bug that a
> > change of yours introduced.
> 
> Sorry I've been travelling and not following closely enough to realise
> this was related to something I'd done.
> 
> Does this relate somehow to the patch Annie has sent out recently too?
> 

No. That's not related.


Wei.


> Ian.


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
  2013-07-10 10:04                       ` Wei Liu
  2013-07-10 10:04                       ` [Xen-devel] " Wei Liu
@ 2013-07-10 13:19                       ` Eric Dumazet
  2013-07-10 13:19                       ` Eric Dumazet
  3 siblings, 0 replies; 85+ messages in thread
From: Eric Dumazet @ 2013-07-10 13:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

On Wed, 2013-07-10 at 07:58 +0100, Jan Beulich wrote:

> 
> Yes, as the call stack provided by Dion proves. The question
> really is whether the patch somehow results in ->truesize to be
> incorrect, or whether - as Eric points out - this is "normal" for
> the sort of special SKBs here (having a rather small headlen). If
> what he says is applicable here, it may hint at the pulling we do
> still not being sufficient for the full TCP header to be in the linear
> part (which iirc is the main [if not the only purpose] of us doing
> the pull in the first place).

I have not closely followed the thread (it seems it did not begin on
netdev, so I do not have the first messages), but if we cook an skb with
no tailroom (headroom is irrelevant), and the IP or TCP stack has to
pull more bytes from the frags into skb->head, then skb->head must be
reallocated to get more tailroom before pulling the headers.

This operation is _expensive_, as it involves a copy of struct
skb_shared_info and possibly page refcount games if the original skb
was cloned (e.g. by a sniffer).

We have a one-time WARN_ON_ONCE() in skb_try_coalesce() to detect
such non-optimal behavior, but it is not in the fast path, only in
the part used in TCP coalescing.

So the right thing would be to cook skbs so that there is enough
tailroom.

But there are 256 bytes of tailroom, which should be enough unless a
huge amount of header space is used.

If you believe it's fine and very unlikely, then ignore the warning,
because using 512 bytes of tailroom for all skbs only to cope with very
unlikely reallocations is not worth it.


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-10 12:50                           ` Ian Campbell
                                               ` (2 preceding siblings ...)
  2013-07-10 13:58                             ` Jan Beulich
@ 2013-07-10 13:58                             ` Jan Beulich
  3 siblings, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-10 13:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, davem, Dion Kant, xen-devel, netdev, stable

>>> On 10.07.13 at 14:50, Ian Campbell <ian.campbell@citrix.com> wrote:
> On Wed, 2013-07-10 at 11:46 +0100, Jan Beulich wrote:
>> >>> On 10.07.13 at 12:04, Wei Liu <wei.liu2@citrix.com> wrote:
>> > Jan, looking at the commit log, the overrun issue in
>> > xennet_get_responses was not introduced by __pskb_pull_tail. The call to
>> > xennet_fill_frags has always been in the same place.
>> 
>> I'm convinced it was: Prior to that commit, if the first response slot
>> contained up to RX_COPY_THRESHOLD bytes, it got entirely
>> consumed into the linear portion of the SKB, leaving the number of
>> fragments available for filling at MAX_SKB_FRAGS. Said commit
>> dropped the early copying, leaving the fragment count at 1
>> unconditionally, and now accumulates all of the response slots into
>> fragments, only pulling after all of them got filled in. It neglected to
>> realize - due to the count now always being 1 at the beginning - that
>> this can lead to MAX_SKB_FRAGS + 1 frags getting filled, corrupting
>> memory.
> 
> That argument makes sense to me.
> 
> Is it possible to hit a scenario where we need to pull more than
> RX_COPY_THRESHOLD in order to fit all of the data in MAX_SKB_FRAGS ?

I'm not aware of any, but I'm no expert here in any way.

> Does this relate somehow to the patch Annie has sent out recently too?

I don't think so.

Jan


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
                                 ` (3 preceding siblings ...)
  2013-07-08 15:48               ` Wei Liu
@ 2013-07-12  8:32               ` Wei Liu
  2013-07-12  8:56                 ` Jan Beulich
  2013-07-12  8:56                 ` Jan Beulich
  2013-07-12  8:32               ` Wei Liu
  5 siblings, 2 replies; 85+ messages in thread
From: Wei Liu @ 2013-07-12  8:32 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, davem, Ian Campbell, Dion Kant, xen-devel, netdev, stable

On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
> >>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
> >>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
> >> On Fri, Jul 05, 2013 at 10:32:41AM +0100, Jan Beulich wrote:
> >>> --- a/drivers/net/xen-netfront.c
> >>> +++ b/drivers/net/xen-netfront.c
> >>> @@ -831,6 +831,15 @@ static RING_IDX xennet_fill_frags(struct
> >>>  			RING_GET_RESPONSE(&np->rx, ++cons);
> >>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
> >>>  
> >>> +		if (nr_frags == MAX_SKB_FRAGS) {
> >>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> >>> +
> >>> +			BUG_ON(pull_to <= skb_headlen(skb));
> >>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> >> 
> >> skb_headlen is in fact "skb->len - skb->data_len". Looking at the
> >> caller code:
> >> 
> >>     while loop {
> >>         skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> >> 	skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> >> 	skb->data_len = rx->status;
> >> 
> >> 	i = xennet_fill_frags(np, skb, &tmpq);
> >> 
> >> 	/*
> >> 	 * Truesize is the actual allocation size, even if the
> >> 	 * allocation is only partially used.
> >> 	 */
> >> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> >> 	skb->len += skb->data_len;
> >>     }
> >> 
> >>     handle_incoming_packet();
> >> 
> >> You seem to be altering the behavior of the original code, because in
> >> your patch the skb->len is incremented before use, while in the original
> >> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
> >> correctly set.
> > 
> > Right. So I basically need to keep skb->len up-to-date along with
> > ->data_len. Just handed a patch to Dion with that done; I'll defer
> > sending a v2 for the upstream code until I know the change works
> > for our kernel.
> 
> Okay, so with that done (see below) Dion is now seeing the
> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
> course, with it having crashed before, it's hard to tell whether the
> triggering now is an effect of the patch, or just got unmasked by it.
> 

I just ported your below patch to upstream kernel and I didn't see the
WARN_ON_ONCE. I only did iperf and netperf tests.

If the workload to trigger this bug is simple enough, I can give it a
shot...


Wei.

> Looking over the ->truesize handling, I can't see how the change
> here could break things: RX_COPY_THRESHOLD is already
> accounted for by how alloc_skb() gets called, and the increment
> right after the call to xennet_fill_frags() should now be even more
> correct than before (since __pskb_pull_tail() can drop fragments,
> which would then have made this an over-estimation afaict).
> 
> That all said with me knowing pretty little about the networking
> code, so I'd appreciate if you could point out anything wrong with
> my idea of how things work. Additionally - is my fundamental (for
> this patch) assumption right that multiple __pskb_pull_tail() call
> are cumulative (i.e. calling it twice with a delta of
> pull_to - skb_headlen(skb) would indeed end up pulling up to
> pull_to, provided there is enough data)?
> 
> Jan
> 
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -831,10 +831,20 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> +		if (nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> +
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +			nr_frags = shinfo->nr_frags;
> +		}
> +		BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> +
>  		__skb_fill_page_desc(skb, nr_frags,
>  				     skb_frag_page(nfrag),
>  				     rx->offset, rx->status);
>  
> +		skb->len += rx->status;
>  		skb->data_len += rx->status;
>  
>  		skb_shinfo(nskb)->nr_frags = 0;
> @@ -929,7 +939,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -1014,7 +1025,7 @@ err:
>  
>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> -		skb->data_len = rx->status;
> +		skb->len = skb->data_len = rx->status;
>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> @@ -1023,7 +1034,6 @@ err:
>                   * allocation is only partially used.
>                   */
>  		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> -		skb->len += skb->data_len;
>  
>  		if (rx->flags & XEN_NETRXF_csum_blank)
>  			skb->ip_summed = CHECKSUM_PARTIAL;


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-12  8:32               ` [Xen-devel] " Wei Liu
@ 2013-07-12  8:56                 ` Jan Beulich
  2013-07-13 11:26                   ` Dion Kant
  2013-07-13 11:26                   ` [Xen-devel] " Dion Kant
  2013-07-12  8:56                 ` Jan Beulich
  1 sibling, 2 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-12  8:56 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

>>> On 12.07.13 at 10:32, Wei Liu <wei.liu2@citrix.com> wrote:
> On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>> >>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
>> >>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
>> >> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>> >> 	skb->len += skb->data_len;
>> >>     }
>> >> 
>> >>     handle_incoming_packet();
>> >> 
>> >> You seem to be altering the behavior of the original code, because in
>> >> your patch the skb->len is incremented before use, while in the original
>> >> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
>> >> correctly set.
>> > 
>> > Right. So I basically need to keep skb->len up-to-date along with
>> > ->data_len. Just handed a patch to Dion with that done; I'll defer
>> > sending a v2 for the upstream code until I know the change works
>> > for our kernel.
>> 
>> Okay, so with that done (see below) Dion is now seeing the
>> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
>> course, with it having crashed before, it's hard to tell whether the
>> triggering now is an effect of the patch, or just got unmasked by it.
>> 
> 
> I just ported your below patch to upstream kernel and I didn't see the
> WARN_ON_ONCE. I only did iperf and netperf tests.
> 
> If the work load to trigger this bug is simple enough I can give it a
> shot...

In the meantime I've become relatively convinced that the warning isn't
an effect of the patch (final verification pending); I intend to submit
v2 as soon as 3.11-rc1 is out.

Jan


* Re: [Xen-devel] [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-12  8:56                 ` Jan Beulich
  2013-07-13 11:26                   ` Dion Kant
@ 2013-07-13 11:26                   ` Dion Kant
  2013-07-13 20:17                     ` Dion Kant
  1 sibling, 1 reply; 85+ messages in thread
From: Dion Kant @ 2013-07-13 11:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Ian Campbell, davem, xen-devel, netdev, stable

On 07/12/2013 10:56 AM, Jan Beulich wrote:
>>>> On 12.07.13 at 10:32, Wei Liu <wei.liu2@citrix.com> wrote:
>> On Mon, Jul 08, 2013 at 03:20:26PM +0100, Jan Beulich wrote:
>>>>>> On 08.07.13 at 11:59, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>>>>> On 05.07.13 at 16:53, Wei Liu <wei.liu2@citrix.com> wrote:
>>>>> 	skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>>>>> 	skb->len += skb->data_len;
>>>>>     }
>>>>>
>>>>>     handle_incoming_packet();
>>>>>
>>>>> You seem to be altering the behavior of the original code, because in
>>>>> your patch the skb->len is incremented before use, while in the original
>>>>> code (which calls skb_headlen in handle_incoming_packet) the skb->len is
>>>>> correctly set.
>>>>
>>>> Right. So I basically need to keep skb->len up-to-date along with
>>>> ->data_len. Just handed a patch to Dion with that done; I'll defer
>>>> sending a v2 for the upstream code until I know the change works
>>>> for our kernel.
>>>
>>> Okay, so with that done (see below) Dion is now seeing the
>>> WARN_ON_ONCE(delta < len) in skb_try_coalesce() triggering. Of
>>> course, with it having crashed before, it's hard to tell whether the
>>> triggering now is an effect of the patch, or just got unmasked by it.
>>>
>>
>> I just ported your below patch to upstream kernel and I didn't see the
>> WARN_ON_ONCE. I only did iperf and netperf tests.
>>
>> If the work load to trigger this bug is simple enough I can give it a
>> shot...
> 
> I'm meanwhile relatively convinced that the warning isn't an effect
> of the patch (final verification pending); I intend to submit v2 as
> soon as 3.11-rc1 is out.
> 
> Jan
> 

I have been doing some testing and focussed on xennet_fill_frags. I
enhanced the detection of the skb->data_len > nr_frags * PAGE_SIZE
condition (as suggested by Jan) a bit by storing rx->status and nr_frags
for each dequeued skb in the while loop.

There is something wrong with the part

+if (nr_frags == MAX_SKB_FRAGS) {
+       unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+       BUG_ON(pull_to <= skb_headlen(skb));
+       __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+       nr_frags = shinfo->nr_frags;
+}
+BUG_ON(nr_frags >= MAX_SKB_FRAGS);

When nr_frags reaches MAX_SKB_FRAGS (and this happens), nr_frags is
"reset" to shinfo->nr_frags. In fact I see nr_frags set to 0 the
first time it reaches MAX_SKB_FRAGS.

The first problem with this is of course that the "skb->truesize +=
PAGE_SIZE * skb_shinfo(skb)->nr_frags" calculation following the "i =
xennet_fill_frags(np, skb, &tmpq)" in netif_poll leads to a wrong
result. At the end the skb has ->truesize way smaller than ->len.

The second problem is that the BUG_ON(nr_frags >= MAX_SKB_FRAGS) can
never trigger, since "shinfo->nr_frags", AFAICT, is not updated in between.

Furthermore, I cannot figure out why, when control enters
xennet_fill_frags(), shinfo->nr_frags equals 1, and a little later when
nr_frags reaches MAX_SKB_FRAGS, it is 0.

Here is the (printk) result of a trial:


>   1 2013-07-13T04:27:30.757241+02:00 doma kernel: [   41.759528] netfront: Too many frags
>   2 2013-07-13T04:27:39.629261+02:00 doma kernel: [   50.628482] 17 == 17 (shinfo->nr_frags=0)
>   3 2013-07-13T04:27:39.629285+02:00 doma kernel: [   50.628486] j=0,rx_trace[j]=48,nr_frags_trace[j]=1
>   4 2013-07-13T04:27:39.629287+02:00 doma kernel: [   50.628488] j=1,rx_trace[j]=1000,nr_frags_trace[j]=2
>   5 2013-07-13T04:27:39.629289+02:00 doma kernel: [   50.628489] j=2,rx_trace[j]=1000,nr_frags_trace[j]=3
>   6 2013-07-13T04:27:39.629291+02:00 doma kernel: [   50.628491] j=3,rx_trace[j]=1000,nr_frags_trace[j]=4
>   7 2013-07-13T04:27:39.629293+02:00 doma kernel: [   50.628492] j=4,rx_trace[j]=1000,nr_frags_trace[j]=5
>   8 2013-07-13T04:27:39.629294+02:00 doma kernel: [   50.628493] j=5,rx_trace[j]=1000,nr_frags_trace[j]=6
>   9 2013-07-13T04:27:39.629296+02:00 doma kernel: [   50.628494] j=6,rx_trace[j]=1000,nr_frags_trace[j]=7
>  10 2013-07-13T04:27:39.629298+02:00 doma kernel: [   50.628496] j=7,rx_trace[j]=1000,nr_frags_trace[j]=8
>  11 2013-07-13T04:27:39.629300+02:00 doma kernel: [   50.628497] j=8,rx_trace[j]=1000,nr_frags_trace[j]=9
>  12 2013-07-13T04:27:39.629302+02:00 doma kernel: [   50.628498] j=9,rx_trace[j]=1000,nr_frags_trace[j]=10
>  13 2013-07-13T04:27:39.629304+02:00 doma kernel: [   50.628499] j=10,rx_trace[j]=1000,nr_frags_trace[j]=11
>  14 2013-07-13T04:27:39.629305+02:00 doma kernel: [   50.628503] j=11,rx_trace[j]=1000,nr_frags_trace[j]=12
>  15 2013-07-13T04:27:39.629307+02:00 doma kernel: [   50.628504] j=12,rx_trace[j]=1000,nr_frags_trace[j]=13
>  16 2013-07-13T04:27:39.629328+02:00 doma kernel: [   50.628506] j=13,rx_trace[j]=1000,nr_frags_trace[j]=14
>  17 2013-07-13T04:27:39.629330+02:00 doma kernel: [   50.628507] j=14,rx_trace[j]=1000,nr_frags_trace[j]=15
>  18 2013-07-13T04:27:39.629332+02:00 doma kernel: [   50.628508] j=15,rx_trace[j]=1000,nr_frags_trace[j]=16
>  19 2013-07-13T04:27:39.629335+02:00 doma kernel: [   50.628510] j=16,rx_trace[j]=e40,nr_frags_trace[j]=0
>  20 2013-07-13T04:27:39.629336+02:00 doma kernel: [   50.628511] xennet::xennet_fill_frags (fe88,feca,500,1,e40,42,42)

Line 2 comes from

if (nr_frags == MAX_SKB_FRAGS) {
  unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;

  BUG_ON(pull_to <= skb_headlen(skb));
  __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+  printk("%u == %lu (shinfo->nr_frags=%u)\n",//temp
+                     nr_frags,MAX_SKB_FRAGS, shinfo->nr_frags);//tmp
  nr_frags = shinfo->nr_frags;
}

Lines 3 to 19 show the dumping I added at the end of the while loop

+if (skb->data_len > nr_frags * PAGE_SIZE) {//tmp
+ for (j=0; j<i; ++j)//tmp
+   printk("j=%u,rx_trace[j]=%x,nr_frags_trace[j]=%u\n", //tmp
+           j,rx_trace[j],nr_frags_trace[j]);//tmp
+ printk("xennet::xennet_fill_frags (%x,%x,%x,%u,%x,%x,%x)\n",//temp
+         skb->data_len, skb->len, skb->truesize, nr_frags,//tmp
+         rx->status, len0, data_len0);//temp
+}

For gathering the data I added a couple of variables on the stack

struct skb_shared_info *shinfo = skb_shinfo(skb);
int nr_frags = shinfo->nr_frags;
RING_IDX cons = np->rx.rsp_cons;
struct sk_buff *nskb;
+unsigned rx_trace[MAX_SKB_FRAGS+1], i=0,j, len0=0, data_len0=0;//tmp
+int nr_frags_trace[MAX_SKB_FRAGS+1];//tmp

and in the while loop

__skb_fill_page_desc(skb, nr_frags,
                     skb_frag_page(skb_shinfo(nskb)->frags),
                     rx->offset, rx->status);

+if (i==0) {//tmp
+  len0=skb->len;//tmp
+  data_len0=skb->data_len;//tmp
+}//tmp

+rx_trace[i] = rx->status;//tmp
skb->len += rx->status;
skb->data_len += rx->status;

skb_shinfo(nskb)->nr_frags = 0;
kfree_skb(nskb);

+nr_frags_trace[i++] = nr_frags;//tmp
nr_frags++;

By coincidence I noticed that there has to be a subtle timing condition,
related to the speed of the while loop, required to reach the state
where nr_frags becomes MAX_SKB_FRAGS.

A bit more delay in the while loop and it does not occur anymore.

E.g. first I had the gathering ordered like

+if (i==0) {//tmp
+  len0=skb->len;//tmp
+  data_len0=skb->data_len;//tmp
+}//tmp

+rx_trace[i] = rx->status;//tmp
+nr_frags_trace[i++] = nr_frags;//tmp
skb->len += rx->status;
skb->data_len += rx->status;

skb_shinfo(nskb)->nr_frags = 0;
kfree_skb(nskb);

nr_frags++;

and I was unable to observe nr_frags reaching MAX_SKB_FRAGS within a
finite testing time. I thought this might be worthwhile to share.

As a workload to trigger this I use simply

dom0-t:~ # scp bigrand.bin doma:/dev/null

and with this on my system the transfer rate settles at around 45 MB/s,
limited by a saturated core running the user-land ssh process.

Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-13 11:26                   ` [Xen-devel] " Dion Kant
@ 2013-07-13 20:17                     ` Dion Kant
  2013-07-15  6:37                       ` Jan Beulich
  0 siblings, 1 reply; 85+ messages in thread
From: Dion Kant @ 2013-07-13 20:17 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 07/13/2013 01:26 PM, Dion Kant wrote:
> There is something wrong with the part
> 
> +if (nr_frags == MAX_SKB_FRAGS) {
> +       unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
> +
> +       BUG_ON(pull_to <= skb_headlen(skb));
> +       __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +       nr_frags = shinfo->nr_frags;
> +}
> +BUG_ON(nr_frags >= MAX_SKB_FRAGS);
> 
> When nr_frags reaches MAX_SKB_FRAGS (and this happens), nr_frags is
> "reset" to shinfo->nr_frags. In fact I see the nr_frags set to 0 the
> first time nr_frags reaches MAX_SKB_FRAGS.
> 
> The first problem with this is of course that the "skb->truesize +=
> PAGE_SIZE * skb_shinfo(skb)->nr_frags" calculation following the "i =
> xennet_fill_frags(np, skb, &tmpq)" in  neif_poll leads to a wrong
> result. At the end the skb has ->truesize way smaller than ->len.
> 
> The second problem is that the BUG_ON(nr_frags >= MAX_SKB_FRAGS) shall
> not occur, since "shinfo->nr_frags" afaict, is not updated in between.
> 
> Furthermore, I cannot figure out why, when control enters
> xennet_fill_frags(), shinfo->nr_frags equals 1, and a little later when
> nr_frags reaches MAX_SKB_FRAGS, it is 0.

Jan,

After some reading I now understand: it is __pskb_pull_tail(skb, pull_to
- skb_headlen(skb)) which lowers shinfo->nr_frags by one.

Shouldn't we then first set shinfo->nr_frags to its correct value before
the call to __pskb_pull_tail(skb, pull_to - skb_headlen(skb)), like below?

+if (nr_frags == MAX_SKB_FRAGS) {
+       unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
+
+       BUG_ON(pull_to <= skb_headlen(skb));
+       shinfo->nr_frags = nr_frags;
+       __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+       nr_frags = shinfo->nr_frags;
+}
+BUG_ON(nr_frags >= MAX_SKB_FRAGS);

I'll do some testing with this.

Dion

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-13 20:17                     ` Dion Kant
@ 2013-07-15  6:37                       ` Jan Beulich
  0 siblings, 0 replies; 85+ messages in thread
From: Jan Beulich @ 2013-07-15  6:37 UTC (permalink / raw)
  To: Dion Kant; +Cc: xen-devel

>>> On 13.07.13 at 22:17, Dion Kant <g.w.kant@hunenet.nl> wrote:
> After some reading I now understand, it is __pskb_pull_tail(skb, pull_to
> - skb_headlen(skb)) which lowers shinfo->nr_frags by one.
> 
> Must we then not first set shinfo->nr_frags to its correct value before
> the call to __pskb_pull_tail(skb, pull_to - skb_headlen(skb)), like below?

Yes, absolutely - your previous mail already made that clear.

Part of the issue here is that I think the code really should be
using skb_add_rx_frag(), which would take care of keeping the
field up-to-date. I'll probably re-write the patch to that effect.

Thanks a lot!

Jan

^ permalink raw reply	[flat|nested] 85+ messages in thread

end of thread, other threads:[~2013-07-15  6:37 UTC | newest]

Thread overview: 85+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <8511913.uMAmUdIO30@eistomin.edss.local>
2013-05-17  8:59 ` [Xen-users] kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Wei Liu
2013-05-17  9:08   ` Jan Beulich
2013-05-17  9:37   ` Eugene Istomin
2013-05-17  9:40     ` Eugene Istomin
2013-05-17 10:45     ` Jan Beulich
2013-05-17 12:03       ` Eugene Istomin
2013-05-17 12:25         ` Jan Beulich
2013-05-17 12:30           ` Eugene Istomin
2013-05-17 12:36             ` Jan Beulich
2013-05-17 12:52               ` Eugene Istomin
2013-05-17 13:00                 ` Eugene Istomin
2013-05-18 10:06                   ` Eugene Istomin
2013-05-20  6:52                     ` Eugene Istomin
2013-05-20  8:58                       ` Ian Campbell
2013-05-20  9:06                       ` Wei Liu
2013-05-20  9:17                         ` Eugene Istomin
2013-05-20  9:24                           ` Wei Liu
2013-05-21  9:55                   ` Jan Beulich
2013-05-21 11:09                     ` Eugene Istomin
2013-05-21 11:49                       ` Jan Beulich
2013-05-21 16:19                         ` Eugene Istomin
2013-05-31  3:37                           ` Eugene Istomin
2013-05-31  6:19                             ` Jan Beulich
2013-05-31  6:22                               ` Eugene Istomin
2013-05-31  8:20                               ` Wei Liu
2013-05-21 15:28                     ` Wei Liu
2013-05-31 15:47                       ` Wei Liu
2013-05-31 16:00                         ` Jan Beulich
2013-05-31 16:05                           ` Wei Liu
2013-07-04 13:43   ` Dion Kant
2013-07-04 14:18     ` Jan Beulich
2013-07-04 15:01     ` Wei Liu
2013-07-05  9:32       ` [PATCH] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
2013-07-05 14:53         ` Wei Liu
2013-07-05 14:53         ` Wei Liu
2013-07-07  1:10           ` David Miller
2013-07-07  1:10           ` David Miller
2013-07-08  9:59           ` Jan Beulich
2013-07-08 12:16             ` Dion Kant
2013-07-08 12:41               ` Jan Beulich
2013-07-08 12:41               ` Jan Beulich
2013-07-08 14:20             ` [Xen-devel] " Jan Beulich
2013-07-08 15:22               ` Eric Dumazet
2013-07-08 15:22               ` [Xen-devel] " Eric Dumazet
2013-07-09  7:47                 ` Jan Beulich
2013-07-09  7:47                 ` Jan Beulich
2013-07-08 15:48               ` [Xen-devel] " Wei Liu
2013-07-09  6:52                 ` Jan Beulich
2013-07-09  6:52                 ` [Xen-devel] " Jan Beulich
2013-07-09 16:51                   ` Wei Liu
2013-07-09 16:51                   ` [Xen-devel] " Wei Liu
2013-07-10  6:58                     ` Jan Beulich
2013-07-10  6:58                     ` [Xen-devel] " Jan Beulich
2013-07-10 10:04                       ` Wei Liu
2013-07-10 10:04                       ` [Xen-devel] " Wei Liu
2013-07-10 10:46                         ` Jan Beulich
2013-07-10 12:50                           ` Ian Campbell
2013-07-10 12:53                             ` Wei Liu
2013-07-10 12:53                             ` Wei Liu
2013-07-10 13:58                             ` Jan Beulich
2013-07-10 13:58                             ` [Xen-devel] " Jan Beulich
2013-07-10 12:50                           ` Ian Campbell
2013-07-10 10:46                         ` Jan Beulich
2013-07-10 13:19                       ` [Xen-devel] " Eric Dumazet
2013-07-10 13:19                       ` Eric Dumazet
2013-07-08 15:48               ` Wei Liu
2013-07-12  8:32               ` [Xen-devel] " Wei Liu
2013-07-12  8:56                 ` Jan Beulich
2013-07-13 11:26                   ` Dion Kant
2013-07-13 11:26                   ` [Xen-devel] " Dion Kant
2013-07-13 20:17                     ` Dion Kant
2013-07-15  6:37                       ` Jan Beulich
2013-07-12  8:56                 ` Jan Beulich
2013-07-12  8:32               ` Wei Liu
2013-07-08 14:20             ` Jan Beulich
2013-07-08  9:59           ` Jan Beulich
2013-07-05  9:32       ` Jan Beulich
2013-07-05  9:36       ` kernel 3.9.2 - xen 4.2.2/4.3rc1 => BUG unable to handle kernel paging request netif_poll+0x49c/0xe8 Jan Beulich
2013-07-05 10:40       ` [Xen-users] " Dion Kant
2013-07-05 10:54         ` Wei Liu
2013-07-05 19:11           ` Dion Kant
2013-07-05 10:56         ` Jan Beulich
2013-07-05 19:46           ` Dion Kant
2013-07-06 13:36             ` Wei Liu
2013-07-08  8:25             ` Jan Beulich
