* kni: continuous memory restriction ?
@ 2018-03-09 12:14 cys
  2018-03-13  9:36 ` cys
  2018-03-13 14:57 ` Ferruh Yigit
  0 siblings, 2 replies; 5+ messages in thread
From: cys @ 2018-03-09 12:14 UTC (permalink / raw)
  To: dev, ferruh.yigit

Commit 8451269e6d7ba7501723fe2efd0 said "remove continuous memory restriction";
http://dpdk.org/browse/dpdk/commit/lib/librte_eal/linuxapp/kni/kni_net.c?id=8451269e6d7ba7501723fe2efd05745010295bac
For chained mbufs (nb_segs > 1), the function va2pa uses the offset of the previous mbuf to calculate the physical address of the next mbuf.
So is it guaranteed anywhere that all mbufs have the same offset (buf_addr - buf_physaddr)?
Or have I misunderstood chained mbufs?
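
For reference, the translation helper in question in kni_net.c looks roughly like this (paraphrased, not an exact copy of the tree):

/* virtual address to physical address: the (buf_addr - buf_physaddr)
 * offset is taken from the mbuf "m" currently being processed and
 * applied to the next segment's virtual address "va". */
static void *va2pa(void *va, struct rte_kni_mbuf *m)
{
	return (void *)((unsigned long)va -
			((unsigned long)m->buf_addr -
			 (unsigned long)m->buf_physaddr));
}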



* Re: kni: continuous memory restriction ?
  2018-03-09 12:14 kni: continuous memory restriction ? cys
@ 2018-03-13  9:36 ` cys
  2018-03-13 14:57 ` Ferruh Yigit
  1 sibling, 0 replies; 5+ messages in thread
From: cys @ 2018-03-13  9:36 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit

We got several kernel panics; they look related to mbuf address translation.
Can anybody help, please?


[1741994.707024] skbuff: skb_over_panic: text:ffffffffa00832fe len:43920 put:41872 head:ffff880e5d6b8000 data:ffff880e5d6b8042 tail:0xabd2 end:0x7ec0 dev:<NULL>
[1741994.707186] ------------[ cut here ]------------
[1741994.707284] kernel BUG at net/core/skbuff.c:130!
[1741994.707382] invalid opcode: 0000 [#1] SMP
[1741994.707640] Modules linked in: vfio_iommu_type1 vfio_pci vfio rte_kni(O) igb_uio(O) uio sw(O) nfsv3 iptable_nat nf_nat_ipv4 nf_nat rpcsec_gss_krb5 nfsv4 dns_resolver fuse nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath tipc tun nbd coretemp bridge stp llc watch_reboot(O) sffs(O) cl_lock(O) cl_softdog(O) kvm_intel kvm irqbypass bnx2x(O) libcrc32c igb(O) i2c_algo_bit ixgbe mdio dca i40e loop dm_mod sg sd_mod crct10dif_generic crct10dif_pclmul crc_t10dif crct10dif_common iTCO_wdt iTCO_vendor_support pcspkr mpt3sas raid_class i2c_i801 scsi_transport_sas ahci i2c_core shpchp lpc_ich libahci mfd_core libata wmi ipmi_si
[1741994.715519]  ipmi_msghandler acpi_cpufreq [last unloaded: sw]
[1741994.715994] CPU: 6 PID: 137677 Comm: kni_single Tainted: G           O   ------------   3.10.0 #22
[1741994.716378] Hardware name: Sugon I620-G30/60P24-US, BIOS 0JGST025 12/08/2017
[1741994.716755] task: ffff88201fbf9000 ti: ffff882014844000 task.ti: ffff882014844000
[1741994.717133] RIP: 0010:[<ffffffff8174a48d>]  [<ffffffff8174a48d>] skb_panic+0x63/0x65
[1741994.717597] RSP: 0018:ffff882014847dc8  EFLAGS: 00010292
[1741994.717832] RAX: 000000000000008f RBX: 000000000000a390 RCX: 0000000000000000
[1741994.718349] RDX: 0000000000000001 RSI: ffff88103e7908c8 RDI: ffff88103e7908c8
[1741994.718726] RBP: ffff882014847de8 R08: 0000000000000000 R09: 0000000000000000
[1741994.719103] R10: 00000000000025ac R11: 0000000000000006 R12: ffff880000000000
[1741994.719625] R13: 0000000000000002 R14: ffff880ee3848a40 R15: 000006fd81955cfe
[1741994.720005] FS:  0000000000000000(0000) GS:ffff88103e780000(0000) knlGS:0000000000000000
[1741994.720387] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1741994.720624] CR2: 0000000002334fb8 CR3: 0000001ff4ee0000 CR4: 00000000003427e0
[1741994.721000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1741994.721377] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1741994.721753] Stack:
[1741994.721977]  ffff880e5d6b8042 000000000000abd2 0000000000007ec0 ffffffff819cba3d
[1741994.722682]  ffff882014847df8 ffffffff8161c3e6 ffff882014847e68 ffffffffa00832fe
[1741994.723384]  ffff880fe39ee000 0000661200000020 ffff880fe39eeb78 ffff880fe39ee8c0
[1741994.724085] Call Trace:
[1741994.724314]  [<ffffffff8161c3e6>] skb_put+0x46/0x50
[1741994.724552]  [<ffffffffa00832fe>] kni_net_rx_normal+0x1de/0x370 [rte_kni]
[1741994.724795]  [<ffffffffa008353f>] kni_net_rx+0xf/0x20 [rte_kni]
[1741994.725034]  [<ffffffffa00810f8>] kni_thread_single+0x58/0xb0 [rte_kni]
[1741994.725276]  [<ffffffffa00810a0>] ? kni_dev_remove+0xa0/0xa0 [rte_kni]
[1741994.725519]  [<ffffffff810a12f0>] kthread+0xc0/0xd0
[1741994.725754]  [<ffffffff810a1230>] ? kthread_create_on_node+0x130/0x130
[1741994.725997]  [<ffffffff817582d8>] ret_from_fork+0x58/0x90
[1741994.726234]  [<ffffffff810a1230>] ? kthread_create_on_node+0x130/0x130
[1741994.726473] Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 fd a1 81 48 89 04 24 31 c0 e8 b3 81 ff ff <0f> 0b 55 48 89 f8 48 8b 57 30 48 89 e5 48 8b 0f 5d 80 e5 80 48
[1741994.732282] RIP  [<ffffffff8174a48d>] skb_panic+0x63/0x65
[1741994.732601]  RSP <ffff882014847dc8>






* Re: kni: continuous memory restriction ?
  2018-03-09 12:14 kni: continuous memory restriction ? cys
  2018-03-13  9:36 ` cys
@ 2018-03-13 14:57 ` Ferruh Yigit
  2018-03-14  0:35   ` cys
  1 sibling, 1 reply; 5+ messages in thread
From: Ferruh Yigit @ 2018-03-13 14:57 UTC (permalink / raw)
  To: cys, dev

On 3/9/2018 12:14 PM, cys wrote:
> Commit 8451269e6d7ba7501723fe2efd0 said "remove continuous memory restriction";
> http://dpdk.org/browse/dpdk/commit/lib/librte_eal/linuxapp/kni/kni_net.c?id=8451269e6d7ba7501723fe2efd05745010295bac
> For chained mbufs (nb_segs > 1), the function va2pa uses the offset of the previous mbuf
> to calculate the physical address of the next mbuf.
> So is it guaranteed anywhere that all mbufs have the same offset (buf_addr - buf_physaddr)?
> Or have I misunderstood chained mbufs?

Hi,

Your description is correct: KNI chained mbufs are broken if the chained mbufs come
from different mempools.

Two commits seem to be involved, in chronological order:
[1] d89a58dfe90b ("kni: support chained mbufs")
[2] 8451269e6d7b ("kni: remove continuous memory restriction")

With the current implementation, the kernel needs to know the physical address of an mbuf
to be able to access it.
For chained mbufs, the first mbuf is fine, but for the rest the kernel side only gets the
virtual address of the mbuf, and translating it works only if all chained mbufs come from
the same mempool (roughly sketched below).
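
The chain walk in kni_net_rx_normal() does roughly the following (a paraphrased fragment,
not the exact code); the key point is that each hop reuses the current segment's
(buf_addr - buf_physaddr) offset to translate the next segment's address:

kva = pa2kva(kni->pa[i]);        /* first segment arrives as a physical address */
data_kva = kva2data_kva(kva);
nb_segs = kva->nb_segs;
for (seg = 0; seg < nb_segs; seg++) {
	memcpy(skb_put(skb, kva->data_len), data_kva, kva->data_len);
	if (kva->next == NULL)
		break;
	/* kva->next is a userspace VA; translated with *this* segment's offset */
	kva = pa2kva(va2pa(kva->next, kva));
	data_kva = kva2data_kva(kva);
}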

I don't have a good solution at hand, but it is possible to:
a) If you are using chained mbufs, keep the old limitation of using a single mempool
b) Serialize chained mbufs for KNI in userspace; a rough sketch follows this list
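
A minimal sketch of option b), assuming a hypothetical helper kni_linearize() and a
separate pool flat_pool whose memory is safe for KNI (both names are mine, not part of DPDK):

#include <rte_mbuf.h>
#include <rte_memcpy.h>

/* Copy a chained mbuf into a single freshly allocated segment so the
 * kernel never has to follow m->next across mempools. */
static struct rte_mbuf *
kni_linearize(struct rte_mbuf *m, struct rte_mempool *flat_pool)
{
	struct rte_mbuf *flat = rte_pktmbuf_alloc(flat_pool);
	const struct rte_mbuf *seg;
	char *dst;

	if (flat == NULL)
		return NULL;
	for (seg = m; seg != NULL; seg = seg->next) {
		dst = rte_pktmbuf_append(flat, seg->data_len);
		if (dst == NULL) {	/* packet does not fit in one segment */
			rte_pktmbuf_free(flat);
			return NULL;
		}
		rte_memcpy(dst, rte_pktmbuf_mtod(seg, const char *), seg->data_len);
	}
	return flat;
}

The caller would then free the original chain and hand the flat mbuf to rte_kni_tx_burst(),
paying one copy per packet.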


* Re: kni: continuous memory restriction ?
  2018-03-13 14:57 ` Ferruh Yigit
@ 2018-03-14  0:35   ` cys
  2018-03-20 15:25     ` Ferruh Yigit
  0 siblings, 1 reply; 5+ messages in thread
From: cys @ 2018-03-14  0:35 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

Thanks for your reply.
With your solution a), I guess 'single mempool' means a mempool that fits in one memseg (continuous memory).
What about a mempool spread across many memsegs? I'm afraid it's still not safe (see the sketch after the dump below).

Just like this one:
-------------- MEMPOOL ----------------
mempool <mbuf_pool[0]>@0x7ff9e4833d00
  flags=10
  pool=0x7ff9fbfffe00
  phys_addr=0xc4fc33d00
  nb_mem_chunks=91
  size=524288
  populated_size=524288
  header_size=64
  elt_size=2432
  trailer_size=0
  total_obj_size=2496
  private_data_size=64
  avg bytes/object=2496.233643


Zone 0: name:<rte_eth_dev_data>, phys:0xc4fdb7f40, len:0x34000, virt:0x7ff9e49b7f40, socket_id:0, flags:0
Zone 1: name:<MP_mbuf_pool[0]>, phys:0xc4fc33d00, len:0x182100, virt:0x7ff9e4833d00, socket_id:0, flags:0
Zone 2: name:<MP_mbuf_pool[0]_0>, phys:0xb22000080, len:0x16ffff40, virt:0x7ffa3a800080, socket_id:0, flags:0
Zone 3: name:<RG_MP_mbuf_pool[0]>, phys:0xc199ffe00, len:0x800180, virt:0x7ff9fbfffe00, socket_id:0, flags:0
Zone 4: name:<MP_mbuf_pool[0]_1>, phys:0xc29c00080, len:0x77fff40, virt:0x7ff9e5800080, socket_id:0, flags:0
Zone 5: name:<MP_mbuf_pool[0]_2>, phys:0xc22c00080, len:0x67fff40, virt:0x7ff9ed200080, socket_id:0, flags:0
Zone 6: name:<MP_mbuf_pool[0]_3>, phys:0xc1dc00080, len:0x3bfff40, virt:0x7ff9f4800080, socket_id:0, flags:0
Zone 7: name:<MP_mbuf_pool[0]_4>, phys:0xc1bc00080, len:0x1bfff40, virt:0x7ff9f8600080, socket_id:0, flags:0
Zone 8: name:<MP_mbuf_pool[0]_5>, phys:0xbf4600080, len:0xffff40, virt:0x7ffa1ea00080, socket_id:0, flags:0
Zone 9: name:<MP_mbuf_pool[0]_6>, phys:0xc0e000080, len:0xdfff40, virt:0x7ffa06400080, socket_id:0, flags:0
Zone 10: name:<MP_mbuf_pool[0]_7>, phys:0xbe0600080, len:0xdfff40, virt:0x7ffa32000080, socket_id:0, flags:0
Zone 11: name:<MP_mbuf_pool[0]_8>, phys:0xc18000080, len:0xbfff40, virt:0x7ff9fd000080, socket_id:0, flags:0
Zone 12: name:<MP_mbuf_pool[0]_9>, phys:0x65000080, len:0xbfff40, virt:0x7ffa54e00080, socket_id:0, flags:0
Zone 13: name:<MP_mbuf_pool[0]_10>, phys:0xc12a00080, len:0x7fff40, virt:0x7ffa02200080, socket_id:0, flags:0
Zone 14: name:<MP_mbuf_pool[0]_11>, phys:0xc0d600080, len:0x7fff40, virt:0x7ffa07400080, socket_id:0, flags:0
Zone 15: name:<MP_mbuf_pool[0]_12>, phys:0xc06600080, len:0x7fff40, virt:0x7ffa0de00080, socket_id:0, flags:0

...
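
One way to see whether a pool like the one above is usable at all is to check that every
memory chunk backing it has the same virtual-to-physical offset. A hedged sketch using
rte_mempool_mem_iter() (the memhdr field is phys_addr in releases of this era; newer DPDK
calls it iova):

#include <stdio.h>
#include <rte_mempool.h>

/* Record the VA-PA offset of the first chunk and warn when any later
 * chunk of the same mempool uses a different offset. */
static void
check_chunk_offset(struct rte_mempool *mp, void *opaque,
		   struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
{
	long *first_off = opaque;
	long off = (long)((unsigned long)memhdr->addr -
			  (unsigned long)memhdr->phys_addr);

	if (mem_idx == 0)
		*first_off = off;
	else if (off != *first_off)
		printf("%s: chunk %u has a different VA-PA offset\n",
		       mp->name, mem_idx);
}

/* usage: long off; rte_mempool_mem_iter(mp, check_chunk_offset, &off); */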



* Re: kni: continuous memory restriction ?
  2018-03-14  0:35   ` cys
@ 2018-03-20 15:25     ` Ferruh Yigit
  0 siblings, 0 replies; 5+ messages in thread
From: Ferruh Yigit @ 2018-03-20 15:25 UTC (permalink / raw)
  To: cys; +Cc: dev


On 3/14/2018 12:35 AM, cys wrote:
> Thanks for your reply.
> With your solution a), I guess 'single mempool' means a mempool that fits in one memseg
> (continuous memory).

Yes, I mean physically continuous memory, i.e. a mempool backed by a single memseg;
otherwise it has the same problem.
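
As a rough safeguard (my sketch, not something from the tree), one could create the pool as
usual and refuse to use it with KNI unless it ended up backed by a single chunk:

#include <errno.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>
#include <rte_errno.h>

/* Hypothetical helper: create a pktmbuf pool and verify that it is backed
 * by exactly one memory chunk before it is used together with KNI. */
static struct rte_mempool *
kni_safe_pool_create(const char *name, unsigned n, uint16_t data_room, int socket)
{
	struct rte_mempool *mp = rte_pktmbuf_pool_create(name, n, 256, 0,
							 data_room, socket);

	if (mp == NULL)
		return NULL;
	if (mp->nb_mem_chunks != 1) {	/* spread over several memsegs */
		rte_mempool_free(mp);
		rte_errno = ENOTSUP;
		return NULL;
	}
	return mp;
}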

> What about a mempool spread across many memsegs? I'm afraid it's still not safe.
> Just like this one:
> -------------- MEMPOOL ----------------
> mempool <mbuf_pool[0]>@0x7ff9e4833d00
>   flags=10
>   pool=0x7ff9fbfffe00
>   phys_addr=0xc4fc33d00
>   nb_mem_chunks=91
>   size=524288
>   populated_size=524288
>   header_size=64
>   elt_size=2432
>   trailer_size=0
>   total_obj_size=2496
>   private_data_size=64
>   avg bytes/object=2496.233643
>
> Zone 0: name:<rte_eth_dev_data>, phys:0xc4fdb7f40, len:0x34000,
> virt:0x7ff9e49b7f40, socket_id:0, flags:0
> Zone 1: name:<MP_mbuf_pool[0]>, phys:0xc4fc33d00, len:0x182100,
> virt:0x7ff9e4833d00, socket_id:0, flags:0
> Zone 2: name:<MP_mbuf_pool[0]_0>, phys:0xb22000080, len:0x16ffff40,
> virt:0x7ffa3a800080, socket_id:0, flags:0
> Zone 3: name:<RG_MP_mbuf_pool[0]>, phys:0xc199ffe00, len:0x800180,
> virt:0x7ff9fbfffe00, socket_id:0, flags:0
> Zone 4: name:<MP_mbuf_pool[0]_1>, phys:0xc29c00080, len:0x77fff40,
> virt:0x7ff9e5800080, socket_id:0, flags:0
> Zone 5: name:<MP_mbuf_pool[0]_2>, phys:0xc22c00080, len:0x67fff40,
> virt:0x7ff9ed200080, socket_id:0, flags:0
> Zone 6: name:<MP_mbuf_pool[0]_3>, phys:0xc1dc00080, len:0x3bfff40,
> virt:0x7ff9f4800080, socket_id:0, flags:0
> Zone 7: name:<MP_mbuf_pool[0]_4>, phys:0xc1bc00080, len:0x1bfff40,
> virt:0x7ff9f8600080, socket_id:0, flags:0
> Zone 8: name:<MP_mbuf_pool[0]_5>, phys:0xbf4600080, len:0xffff40,
> virt:0x7ffa1ea00080, socket_id:0, flags:0
> Zone 9: name:<MP_mbuf_pool[0]_6>, phys:0xc0e000080, len:0xdfff40,
> virt:0x7ffa06400080, socket_id:0, flags:0
> Zone 10: name:<MP_mbuf_pool[0]_7>, phys:0xbe0600080, len:0xdfff40,
> virt:0x7ffa32000080, socket_id:0, flags:0
> Zone 11: name:<MP_mbuf_pool[0]_8>, phys:0xc18000080, len:0xbfff40,
> virt:0x7ff9fd000080, socket_id:0, flags:0
> Zone 12: name:<MP_mbuf_pool[0]_9>, phys:0x65000080, len:0xbfff40,
> virt:0x7ffa54e00080, socket_id:0, flags:0
> Zone 13: name:<MP_mbuf_pool[0]_10>, phys:0xc12a00080, len:0x7fff40,
> virt:0x7ffa02200080, socket_id:0, flags:0
> Zone 14: name:<MP_mbuf_pool[0]_11>, phys:0xc0d600080, len:0x7fff40,
> virt:0x7ffa07400080, socket_id:0, flags:0
> Zone 15: name:<MP_mbuf_pool[0]_12>, phys:0xc06600080, len:0x7fff40,
> virt:0x7ffa0de00080, socket_id:0, flags:0
> ...

