* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: Harris, James R @ 2020-04-23 14:27 UTC
  To: spdk

Hi Prasanna,

I haven't seen this issue before.  But I'd like to help you get to the bottom of this.

Since adding -g makes the problem go away, could you try instrumenting eal_adjust_config somehow to see where it is crashing in that function?  Of course the instrumentation itself could make the problem go away too.

Maybe also try reducing the optimization level for the eal library.

You mentioned cmake, which neither SPDK nor DPDK use.  Are you building DPDK with cmake?

-Jim

On 4/23/20, 12:57 AM, "asprasannavenkatesh(a)gmail.com" <asprasannavenkatesh(a)gmail.com> wrote:

    Hi guys,

    I am running SPDK version 19.10 and DPDK version 19.05.0 in my initiator application, and I am hitting an issue that seems to be sporadic: the application aborts with an illegal instruction, Signal 4 (SIGILL, code: illegal operand). It looks like a library-related or EAL-config-related issue.

    I am using the following DPDK EAL parameters in the application. I am attaching only the topmost frames of the backtrace, as the rest is company confidential.

    Starting SPDK v19.10 / DPDK 19.05.0 initialization...
    [ DPDK EAL parameters: nvmf --no-shconf -c 0x1 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk_pid40756  ]
    *** Aborted at 1587486078 (Unix time, try 'date -d @1587486078') ***
    *** Signal 4 (SIGILL) (0x7f2e40ec986d) received by PID 40756 (pthread TID 0x7f2e17fff700) (linux TID 40767) (maybe from PID 1089247341, UID 32558) (code: illegal operand), stack trace: ***
    @ 00007f2e3fb2938f (unknown)
    @ 00007f2e40ec986d eal_adjust_config -> <wokspace>/spdk/spdk-src/lib/json/json_write.c
    @ 00007f2e40eb03a5 rte_eal_init -> <wokspace>/spdk/spdk-src/lib/json/json_write.c
    @ 00007f2e40e906e6 spdk_env_init <wokspace>/spdk/spdk-src/lib/env_dpdk/init.c:456

    I am trying to debug this issue, but adding the -g and -ggdb compilation flags to the cmake list file makes the issue vanish. Has anyone seen this kind of crash, and how should we go about fixing it? Is adding -g and -ggdb a workaround, or is it the required fix? Is this a DPDK issue or just a gcc issue?

    Also here is the system information that I am using
    Initiator_system:~$ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                56
    On-line CPU(s) list:   0-55
    Thread(s) per core:    2
    Core(s) per socket:    14
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 79
    Model name:            Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
    Stepping:              1
    CPU MHz:               1236.281
    CPU max MHz:           3300.0000
    CPU min MHz:           1200.0000
    BogoMIPS:              4801.15
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              35840K
    NUMA node0 CPU(s):     0-13,28-41
    NUMA node1 CPU(s):     14-27,42-55
    Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts

    Please let us know if you have any input.

    Thanks and Regards,
    Prasanna
    _______________________________________________
    SPDK mailing list -- spdk(a)lists.01.org
    To unsubscribe send an email to spdk-leave(a)lists.01.org



* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: asprasannavenkatesh @ 2020-04-24 12:56 UTC
  To: spdk

Hi Jim,

No, I wasn't able to reproduce this issue again after that. This happened a couple of weeks ago, and all I have is this information from the console. I wasn't running the application under gdb back then. Adding the -g and/or -ggdb option makes the issue vanish, so gdb isn't helping much.

One more thing: in our application, DPDK is configured with the x86_64-native-linuxapp-gcc target when it is built. I tried removing this target build folder and rebuilding it using cmake, and then it started working fine. So I was wondering whether this is just an outdated/stale library issue.

Thanks and Regards,
Prasanna


* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: Harris, James R @ 2020-04-23 19:40 UTC
  To: spdk

That's interesting, but it doesn't make sense to me.  DPDK #defines RTE_CPU_AND directly to CPU_AND on Linux, and CPU_AND is provided by the system's sched.h.  Are you able to get the exact instruction that this crashes on via gdb?  Not just the instruction, but also the values of whatever registers that instruction uses.  Your original e-mail indicated "illegal operand" in the log, so that's a bit of a clue.

-Jim


On 4/23/20, 10:56 AM, "asprasannavenkatesh(a)gmail.com" <asprasannavenkatesh(a)gmail.com> wrote:

    Hi Jim,

    Thanks, I will try out your suggestions.

    Before that, I wanted to give you an update: the only time I was able to hit this issue without the -g option, it crashed in the following macro inside compute_ctrl_threads_cpuset, at line 1466.

    (eal_adjust_config -> compute_ctrl_threads_cpuset)
    In compute_ctrl_threads_cpuset, line 1466:
    1466         RTE_CPU_AND(cpuset, cpuset, &default_set);

        @ 00007f800a86c38f (unknown)
        @ 00007f800bc0e3dd eal_adjust_config
                           <wokspace>/dpdk/dpdk-src/lib/librte_eal/common/eal_common_options.c:1466
        @ 00007f800bbf4f15 rte_eal_init
                           <wokspace>/dpdk/dpdk-src/lib/librte_eal/linux/eal/eal.c:846
        @ 00007f800bbd52b6 spdk_env_init
                           <wokspace>/spdk/spdk-src/lib/env_dpdk/init.c:456



* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: asprasannavenkatesh @ 2020-04-23 18:06 UTC
  To: spdk

Hi Ben,

Yes, the SPDK binaries were built on the same system that I am running them on.

Thanks and Regards,
Prasanna


* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: asprasannavenkatesh @ 2020-04-23 17:54 UTC
  To: spdk

Hi Jim,

Thanks, I will try out your suggestions.

Before that, I wanted to give you an update: the only time I was able to hit this issue without the -g option, it crashed in the following macro inside compute_ctrl_threads_cpuset, at line 1466.

(eal_adjust_config -> compute_ctrl_threads_cpuset)
In compute_ctrl_threads_cpuset, line 1466:
1466         RTE_CPU_AND(cpuset, cpuset, &default_set);

    @ 00007f800a86c38f (unknown)
    @ 00007f800bc0e3dd eal_adjust_config
                       <wokspace>/dpdk/dpdk-src/lib/librte_eal/common/eal_common_options.c:1466
    @ 00007f800bbf4f15 rte_eal_init
                       <wokspace>/dpdk/dpdk-src/lib/librte_eal/linux/eal/eal.c:846
    @ 00007f800bbd52b6 spdk_env_init
                       <wokspace>/spdk/spdk-src/lib/env_dpdk/init.c:456


* [SPDK] Re: SPDK v19.10/DPDK v19.05.0 eal_adjust_config issue, Illegal instruction causing application to abort
From: Walker, Benjamin @ 2020-04-23 16:07 UTC
  To: spdk

Also - did you build the SPDK binaries on the same system that you're running on? If not, pass the --target-arch parameter to SPDK's configure script so that the build targets the specific instruction set of the machine you need to run it on.


