multi-process C/S model example init failed on Xen dom0 with dpdk-16.07 rc2 package

* mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
@ 2016-07-12  9:22 Xu, HuilongX
  2016-07-12 11:30 ` Olivier MATZ
  0 siblings, 1 reply; 6+ messages in thread
From: Xu, HuilongX @ 2016-07-12  9:22 UTC (permalink / raw)
  To: dev, Olivier MATZ; +Cc: Cao, Waterman, Chen, WeichunX

Hi all,
The multi-process C/S (client/server) model example fails to initialize on Xen dom0. Could anyone suggest how to debug it?
Thanks a lot.

Test environment:
OS & kernel: 3.17.4-301.fc21.x86_64
GCC version: gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC)
Package: dpdk.16.07-rc1.tar.gz
Target: x86_64-native-linuxapp-gcc
Compile switch: CONFIG_RTE_LIBRTE_XEN_DOM0 enabled
Xen version: 4.4.1
Test command line and result:
/examples/multi_process/client_server_mp/mp_server/mp_server/x86_64-native-linuxapp-gcc/mp_server -c f -n 4 --xen-dom0 -- -p 0x3 -n 2
EAL: Detected 72 lcore(s)
EAL: Probing VFIO support...
PMD: bnxt_rte_pmd_init() called for (null)
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:1521 rte_igb_pmd
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL: probe driver: 8086:1521 rte_igb_pmd
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:10fb rte_ixgbe_pmd
EAL: PCI device 0000:04:00.1 on NUMA socket 0
EAL: probe driver: 8086:10fb rte_ixgbe_pmd
Creating mbuf pool 'MProc_pktmbuf_pool' [6144 mbufs] ...
Port 0 init ... Segmentation fault (core dumped)

* Re: mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
  2016-07-12  9:22 mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package Xu, HuilongX
@ 2016-07-12 11:30 ` Olivier MATZ
  2016-07-18 11:33   ` Sergio Gonzalez Monroy
  0 siblings, 1 reply; 6+ messages in thread
From: Olivier MATZ @ 2016-07-12 11:30 UTC (permalink / raw)
  To: Xu, HuilongX, dev
  Cc: Cao, Waterman, Chen, WeichunX, Sergio Gonzalez Monroy, Thomas Monjalon

Hi Huilong,


On 07/12/2016 11:22 AM, Xu, HuilongX wrote:
> Hi all,
>
> I run mutli procee C/S model example failed on xen dom0. Does anyone
> give me some suggest how to debug it?
>
> Thanks a lot
>
> test environment:
>
>        OS&kernel: 3.17.4-301.fc21.x86_64
>
> Gcc version: gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC)
>
> Package :dpdk.16.07-rc1.tar.gz
>
> Target: x86_64-native-linuxapp-gcc
>
> Compile switch: enable CONFIG_RTE_LIBRTE_XEN_DOM0
>
> Xen version:4.4.1
>
> Test cmdline and result:
>
> /examples/multi_process/client_server_mp/mp_server/mp_server/x86_64-native-linuxapp-gcc/mp_server
> -c f -n 4 --xen-dom0 -- -p 0x3 -n 2
> EAL: Detected 72 lcore(s)
> EAL: Probing VFIO support...
> PMD: bnxt_rte_pmd_init() called for (null)
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL: probe driver: 8086:1521 rte_igb_pmd
> EAL: PCI device 0000:01:00.1 on NUMA socket 0
> EAL: probe driver: 8086:1521 rte_igb_pmd
> EAL: PCI device 0000:04:00.0 on NUMA socket 0
> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
> EAL: PCI device 0000:04:00.1 on NUMA socket 0
> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
> Creating mbuf pool 'MProc_pktmbuf_pool' [6144 mbufs] ...
> Port 0 init ... Segmentation fault (core dumped)
>

I reproduced the issue on my platform. In my case, the crash occurs in 
rx_queue_setup():

         /* Free memory prior to re-allocation if needed. */
         if (dev->data->rx_queues[queue_idx] != NULL) {
=>              em_rx_queue_release(dev->data->rx_queues[queue_idx]);
                 dev->data->rx_queues[queue_idx] = NULL;
         }

I don't think we should reach that code path during the first rx queue
initialization. I suspect it could be related to this commit:
http://dpdk.org/browse/dpdk/commit/?id=ea0bddbd14e68f
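
To illustrate why that branch can be taken on a first initialization: if the
array of queue pointers comes from memory that was never zeroed, its slots
hold stale non-NULL values and the "free prior to re-allocation" check passes.
The following is a minimal sketch under that assumption, not driver code;
stale_alloc() and count_non_null() are hypothetical names, and the 0xA5 fill
only simulates leftover contents.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an allocator that hands back memory which was
 * not zeroed. The 0xA5 fill simulates stale contents from a previous use.
 */
static void *stale_alloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, 0xA5, size);
    return p;
}

/*
 * Count the slots that a check of the form "if (queues[i] != NULL)" would
 * treat as already-configured queues and try to release.
 */
static size_t count_non_null(void *const *slots, size_t n)
{
    size_t live = 0;

    for (size_t i = 0; i < n; i++)
        if (slots[i] != NULL)
            live++;
    return live;
}
```

With four slots obtained from stale_alloc(), count_non_null() reports all four
as live, whereas the same array obtained from calloc() reports none. Releasing
such a "queue" means dereferencing a garbage pointer, which is consistent with
the segmentation fault above.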

I think we cannot assume that memory is zero-initialized when using Xen
dom0. With the following (dirty) patch, I no longer see the crash:

--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -258,6 +258,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
         mz->flags = 0;
         mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;

+       memset(mz->addr, 0, mz->len);
+
         return mz;
  }

--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -123,7 +123,13 @@ rte_malloc(const char *type, size_t size, unsigned align)
  void *
  rte_zmalloc_socket(const char *type, size_t size, unsigned align, int socket)
  {
-       return rte_malloc_socket(type, size, align, socket);
+       void *x = rte_malloc_socket(type, size, align, socket);
+
+       if (x == NULL)
+               return NULL;
+
+       memset(x, 0, size);
+       return x;
  }

  /*
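
The rte_malloc.c hunk above adds an explicit memset() so that the zeroing
allocator no longer depends on the underlying memory already being zero. The
same pattern in isolation might look like the sketch below. It is
self-contained and does not use DPDK: raw_alloc_sim(), zalloc_sim() and
is_all_zero() are hypothetical names, and the 0x5A fill only simulates
non-zeroed memory.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical raw allocator playing the role of rte_malloc_socket() on a
 * platform where the backing memory is not zeroed (simulated with 0x5A).
 */
static void *raw_alloc_sim(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, 0x5A, size);
    return p;
}

/*
 * Zeroing wrapper with the same shape as the patched function: allocate,
 * return NULL on failure, otherwise clear exactly the requested size.
 */
static void *zalloc_sim(size_t size)
{
    void *p = raw_alloc_sim(size);

    if (p == NULL)
        return NULL;

    memset(p, 0, size);
    return p;
}

/* Return 1 if every byte in [p, p + size) is zero, 0 otherwise. */
static int is_all_zero(const void *p, size_t size)
{
    const unsigned char *b = p;

    for (size_t i = 0; i < size; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

The cost is one memset() per zeroing allocation, in exchange for a guarantee
that holds regardless of how the pages were obtained.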


Sergio, could you have a look at it?

Regards,
Olivier

* Re: mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
  2016-07-12 11:30 ` Olivier MATZ
@ 2016-07-18 11:33   ` Sergio Gonzalez Monroy
  2016-07-18 11:49     ` Olivier Matz
  0 siblings, 1 reply; 6+ messages in thread
From: Sergio Gonzalez Monroy @ 2016-07-18 11:33 UTC (permalink / raw)
  To: Olivier MATZ, Xu, HuilongX, dev
  Cc: Cao, Waterman, Chen, WeichunX, Thomas Monjalon

Hi,

On 12/07/2016 12:30, Olivier MATZ wrote:
> Hi Huilong,
>
>
> On 07/12/2016 11:22 AM, Xu, HuilongX wrote:
>> Hi all,
>>
>> I run mutli procee C/S model example failed on xen dom0. Does anyone
>> give me some suggest how to debug it?
>>
>> Thanks a lot
>>
>> test environment:
>>
>>        OS&kernel: 3.17.4-301.fc21.x86_64
>>
>> Gcc version: gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC)
>>
>> Package :dpdk.16.07-rc1.tar.gz
>>
>> Target: x86_64-native-linuxapp-gcc
>>
>> Compile switch: enable CONFIG_RTE_LIBRTE_XEN_DOM0
>>
>> Xen version:4.4.1
>>
>> Test cmdline and result:
>>
>> /examples/multi_process/client_server_mp/mp_server/mp_server/x86_64-native-linuxapp-gcc/mp_server 
>>
>> -c f -n 4 --xen-dom0 -- -p 0x3 -n 2
>> EAL: Detected 72 lcore(s)
>> EAL: Probing VFIO support...
>> PMD: bnxt_rte_pmd_init() called for (null)
>> EAL: PCI device 0000:01:00.0 on NUMA socket 0
>> EAL: probe driver: 8086:1521 rte_igb_pmd
>> EAL: PCI device 0000:01:00.1 on NUMA socket 0
>> EAL: probe driver: 8086:1521 rte_igb_pmd
>> EAL: PCI device 0000:04:00.0 on NUMA socket 0
>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>> EAL: PCI device 0000:04:00.1 on NUMA socket 0
>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>> Creating mbuf pool 'MProc_pktmbuf_pool' [6144 mbufs] ...
>> Port 0 init ... Segmentation fault (core dumped)
>>
>
> I reproduced the issue on my platform. In my case, the crash occurs in 
> rx_queue_setup():
>
>         /* Free memory prior to re-allocation if needed. */
>         if (dev->data->rx_queues[queue_idx] != NULL) {
> => em_rx_queue_release(dev->data->rx_queues[queue_idx]);
>                 dev->data->rx_queues[queue_idx] = NULL;
>         }
>
> I don't this we should go in that area for the first rx queue 
> initialization. I suspect it could be related to this commit:
> http://dpdk.org/browse/dpdk/commit/?id=ea0bddbd14e68f
>
> I think we cannot expect that memory is initialized at 0 when using 
> Xen dom0. If I add the following (dirty) patch, I don't see a crash 
> anymore:

I don't have a Xen system available right now, but I'm not sure I follow 
here.
Are you saying that when we allocate pages/hugepages from Xen they are 
not zeroed?

>
> --- a/lib/librte_eal/common/eal_common_memzone.c
> +++ b/lib/librte_eal/common/eal_common_memzone.c
> @@ -258,6 +258,8 @@ memzone_reserve_aligned_thread_unsafe(const char 
> *name, size_t len,
>         mz->flags = 0;
>         mz->memseg_id = elem->ms - 
> rte_eal_get_configuration()->mem_config->memseg;
>
> +       memset(mz->addr, 0, mz->len);
> +
>         return mz;
>  }
>

The commit you are referring to does not touch the memzone reserve APIs;
it only changes zmalloc and related APIs.

> --- a/lib/librte_eal/common/rte_malloc.c
> +++ b/lib/librte_eal/common/rte_malloc.c
> @@ -123,7 +123,13 @@ rte_malloc(const char *type, size_t size, 
> unsigned align)
>  void *
>  rte_zmalloc_socket(const char *type, size_t size, unsigned align, int 
> socket)
>  {
> -       return rte_malloc_socket(type, size, align, socket);
> +       void *x = rte_malloc_socket(type, size, align, socket);
> +
> +       if (x == NULL)
> +               return NULL;
> +
> +       memset(x, 0, size);
> +       return x;
>  }
>
>  /*
>
>
> Sergio, could you have a look at it?
>
> Regards,
> Olivier

* Re: mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
  2016-07-18 11:33   ` Sergio Gonzalez Monroy
@ 2016-07-18 11:49     ` Olivier Matz
  2016-07-18 13:15       ` Sergio Gonzalez Monroy
  0 siblings, 1 reply; 6+ messages in thread
From: Olivier Matz @ 2016-07-18 11:49 UTC (permalink / raw)
  To: Sergio Gonzalez Monroy, Xu, HuilongX, dev
  Cc: Cao, Waterman, Chen, WeichunX, Thomas Monjalon

Hi Sergio,

On 07/18/2016 01:33 PM, Sergio Gonzalez Monroy wrote:
> On 12/07/2016 12:30, Olivier MATZ wrote:
>> On 07/12/2016 11:22 AM, Xu, HuilongX wrote:
>>> /examples/multi_process/client_server_mp/mp_server/mp_server/x86_64-native-linuxapp-gcc/mp_server
>>>
>>> -c f -n 4 --xen-dom0 -- -p 0x3 -n 2
>>> EAL: Detected 72 lcore(s)
>>> EAL: Probing VFIO support...
>>> PMD: bnxt_rte_pmd_init() called for (null)
>>> EAL: PCI device 0000:01:00.0 on NUMA socket 0
>>> EAL: probe driver: 8086:1521 rte_igb_pmd
>>> EAL: PCI device 0000:01:00.1 on NUMA socket 0
>>> EAL: probe driver: 8086:1521 rte_igb_pmd
>>> EAL: PCI device 0000:04:00.0 on NUMA socket 0
>>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>>> EAL: PCI device 0000:04:00.1 on NUMA socket 0
>>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>>> Creating mbuf pool 'MProc_pktmbuf_pool' [6144 mbufs] ...
>>> Port 0 init ... Segmentation fault (core dumped)
>>>
>>
>> I reproduced the issue on my platform. In my case, the crash occurs in
>> rx_queue_setup():
>>
>>         /* Free memory prior to re-allocation if needed. */
>>         if (dev->data->rx_queues[queue_idx] != NULL) {
>> => em_rx_queue_release(dev->data->rx_queues[queue_idx]);
>>                 dev->data->rx_queues[queue_idx] = NULL;
>>         }
>>
>> I don't this we should go in that area for the first rx queue
>> initialization. I suspect it could be related to this commit:
>> http://dpdk.org/browse/dpdk/commit/?id=ea0bddbd14e68f
>>
>> I think we cannot expect that memory is initialized at 0 when using
>> Xen dom0. If I add the following (dirty) patch, I don't see a crash
>> anymore:
> 
> I don't have a Xen system available right now, but I'm not sure I follow
> here.
> Are you saying that when we allocate pages/hugepages from Xen they are
> not zeroed?

I did not check it, but from the tests I've done, I suppose they're not.


>> --- a/lib/librte_eal/common/eal_common_memzone.c
>> +++ b/lib/librte_eal/common/eal_common_memzone.c
>> @@ -258,6 +258,8 @@ memzone_reserve_aligned_thread_unsafe(const char
>> *name, size_t len,
>>         mz->flags = 0;
>>         mz->memseg_id = elem->ms -
>> rte_eal_get_configuration()->mem_config->memseg;
>>
>> +       memset(mz->addr, 0, mz->len);
>> +
>>         return mz;
>>  }
>>
> 
> The commit you are referring to does not touch the memzone reserve APIs,
> only changes zmalloc and related APIs.

I just did a quick test, adding the memset() at the places where I
thought it could be required. Maybe the patch is a bit overkill and only
the zmalloc part fixes the issue.


Regards,
Olivier

* Re: mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
  2016-07-18 11:49     ` Olivier Matz
@ 2016-07-18 13:15       ` Sergio Gonzalez Monroy
  2016-07-18 13:34         ` Thomas Monjalon
  0 siblings, 1 reply; 6+ messages in thread
From: Sergio Gonzalez Monroy @ 2016-07-18 13:15 UTC (permalink / raw)
  To: Olivier Matz, Xu, HuilongX, dev
  Cc: Cao, Waterman, Chen, WeichunX, Thomas Monjalon

On 18/07/2016 12:49, Olivier Matz wrote:
> Hi Sergio,
>
> On 07/18/2016 01:33 PM, Sergio Gonzalez Monroy wrote:
>> On 12/07/2016 12:30, Olivier MATZ wrote:
>>> On 07/12/2016 11:22 AM, Xu, HuilongX wrote:
>>>> /examples/multi_process/client_server_mp/mp_server/mp_server/x86_64-native-linuxapp-gcc/mp_server
>>>>
>>>> -c f -n 4 --xen-dom0 -- -p 0x3 -n 2
>>>> EAL: Detected 72 lcore(s)
>>>> EAL: Probing VFIO support...
>>>> PMD: bnxt_rte_pmd_init() called for (null)
>>>> EAL: PCI device 0000:01:00.0 on NUMA socket 0
>>>> EAL: probe driver: 8086:1521 rte_igb_pmd
>>>> EAL: PCI device 0000:01:00.1 on NUMA socket 0
>>>> EAL: probe driver: 8086:1521 rte_igb_pmd
>>>> EAL: PCI device 0000:04:00.0 on NUMA socket 0
>>>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>>>> EAL: PCI device 0000:04:00.1 on NUMA socket 0
>>>> EAL: probe driver: 8086:10fb rte_ixgbe_pmd
>>>> Creating mbuf pool 'MProc_pktmbuf_pool' [6144 mbufs] ...
>>>> Port 0 init ... Segmentation fault (core dumped)
>>>>
>>> I reproduced the issue on my platform. In my case, the crash occurs in
>>> rx_queue_setup():
>>>
>>>          /* Free memory prior to re-allocation if needed. */
>>>          if (dev->data->rx_queues[queue_idx] != NULL) {
>>> => em_rx_queue_release(dev->data->rx_queues[queue_idx]);
>>>                  dev->data->rx_queues[queue_idx] = NULL;
>>>          }
>>>
>>> I don't this we should go in that area for the first rx queue
>>> initialization. I suspect it could be related to this commit:
>>> http://dpdk.org/browse/dpdk/commit/?id=ea0bddbd14e68f
>>>
>>> I think we cannot expect that memory is initialized at 0 when using
>>> Xen dom0. If I add the following (dirty) patch, I don't see a crash
>>> anymore:
>> I don't have a Xen system available right now, but I'm not sure I follow
>> here.
>> Are you saying that when we allocate pages/hugepages from Xen they are
>> not zeroed?
> I did not check it, but from the tests I've done, I suppose they're not.

If that is the case, then I would suggest zeroing all memory during EAL init
(only for Xen), so that all memory is zeroed after init on both Linux and
Xen.

What do you think about that?
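
As a rough illustration of that suggestion, the sketch below walks a table of
memory segments once and clears each one. It is only a sketch under stated
assumptions: struct seg_sim and zero_segments() are hypothetical names, not
EAL APIs, and where such a pass would hook into the Xen dom0 init path is
left open.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one contiguous memory segment. */
struct seg_sim {
    void *addr;  /* start of the segment, NULL if the slot is unused */
    size_t len;  /* length of the segment in bytes */
};

/*
 * Clear every populated segment once, so that later allocations can rely
 * on zero-filled memory. Unused slots (NULL address or zero length) are
 * skipped.
 */
static void zero_segments(const struct seg_sim *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (segs[i].addr == NULL || segs[i].len == 0)
            continue;
        memset(segs[i].addr, 0, segs[i].len);
    }
}
```

Doing this once at init keeps the allocation paths unchanged, at the price of
touching all reserved memory up front.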

Regards,
Sergio

>
>>> --- a/lib/librte_eal/common/eal_common_memzone.c
>>> +++ b/lib/librte_eal/common/eal_common_memzone.c
>>> @@ -258,6 +258,8 @@ memzone_reserve_aligned_thread_unsafe(const char
>>> *name, size_t len,
>>>          mz->flags = 0;
>>>          mz->memseg_id = elem->ms -
>>> rte_eal_get_configuration()->mem_config->memseg;
>>>
>>> +       memset(mz->addr, 0, mz->len);
>>> +
>>>          return mz;
>>>   }
>>>
>> The commit you are referring to does not touch the memzone reserve APIs,
>> only changes zmalloc and related APIs.
> I just did a quick test, adding the memset() at the places where I
> thought it could be required. Maybe the patch is a bit overkill and only
> the zmalloc part fixes the issue.
>
>
> Regards,
> Olivier

* Re: mutli process C/S model example init failed on xen dom0 with dpdk-16.07 rc2 package
  2016-07-18 13:15       ` Sergio Gonzalez Monroy
@ 2016-07-18 13:34         ` Thomas Monjalon
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Monjalon @ 2016-07-18 13:34 UTC (permalink / raw)
  To: Sergio Gonzalez Monroy
  Cc: Olivier Matz, Xu, HuilongX, dev, Cao, Waterman, Chen, WeichunX

2016-07-18 14:15, Sergio Gonzalez Monroy:
> On 18/07/2016 12:49, Olivier Matz wrote:
> > On 07/18/2016 01:33 PM, Sergio Gonzalez Monroy wrote:
> >> On 12/07/2016 12:30, Olivier MATZ wrote:
> >>> I think we cannot expect that memory is initialized at 0 when using
> >>> Xen dom0. If I add the following (dirty) patch, I don't see a crash
> >>> anymore:
> >> I don't have a Xen system available right now, but I'm not sure I follow
> >> here.
> >> Are you saying that when we allocate pages/hugepages from Xen they are
> >> not zeroed?
> > I did not check it, but from the tests I've done, I suppose they're not.
> 
> If that is the case then I would suggest to zero all memory on EAL init 
> (only for Xen) so
> all memory is zeroed after init for both Linux and Xen.
> 
> What do you think about that?

It is an idea.
You probably won't get an answer, as Xen support is unmaintained:
	http://dpdk.org/ml/archives/dev/2016-July/043875.html
Feel free to send a patch to try to fix it; otherwise we can remove this
dead code entirely.
