* Re: [SPDK] Strange CI failure
@ 2019-01-30 12:18 Shahar Salzman
  0 siblings, 0 replies; 10+ messages in thread
From: Shahar Salzman @ 2019-01-30 12:18 UTC
  To: spdk


It looks like there are now consistent failures in the iscsi and spdk-nvme-cli tests; I tried to retrigger, and the failures happened again:

spdk nvme cli:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-03/build.log
iscsi:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-09/build.log
________________________________
From: Harris, James R <james.r.harris(a)intel.com>
Sent: Tuesday, January 29, 2019 6:09 PM
To: Storage Performance Development Kit; Shahar Salzman
Subject: Re: [SPDK] Strange CI failure

Thanks Shahar.  For now, you can reply to your own patch on GerritHub with just the word "retrigger" - it will re-run your patch through the test pool.  That will get your patch unblocked while Paul looks at the intermittent test failure.

-Jim


On 1/29/19, 8:48 AM, "SPDK on behalf of Luse, Paul E" <spdk-bounces(a)lists.01.org on behalf of paul.e.luse(a)intel.com> wrote:

    Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)

    Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
    [ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: Auto-detected process type: SECONDARY
    EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0
      08:50:18     # trap - ERR
      08:50:18     # print_backtrace
      08:50:18     # [[ ehxBE =~ e ]]
      08:50:18     # local shell_options=ehxBE
      08:50:18     # set +x
    ========== Backtrace start: ==========


    From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
    Sent: Tuesday, January 29, 2019 8:35 AM
    To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
    Subject: Re: Strange CI failure

    https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log

    I can copy paste it if you cannot reach the link.
    ________________________________
    From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
    Sent: Tuesday, January 29, 2019 5:22 PM
    To: Storage Performance Development Kit
    Subject: Re: [SPDK] Strange CI failure

    Can you send a link to the full log?

    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
    Sent: Tuesday, January 29, 2019 8:21 AM
    To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
    Subject: [SPDK] Strange CI failure

    Hi,

    I have encountered a CI failure that has nothing to do with my code.
    I know this because the change is just a gdb macro.
    Do we know that this test machine is unstable?

    Here is the backtrace:
    ========== Backtrace start: ==========

    in test/nvme/nvme.sh:108 -> main()
         ...
       103   report_test_completion "nightly_nvme_reset"
       104   timing_exit reset
       105  fi
       106
       107  timing_enter identify
    => 108  $rootdir/examples/nvme/identify/identify -i 0
       109  for bdf in $(iter_pci_class_code 01 08 02); do
       110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
       111  done
       112  timing_exit identify
       113
         ...


    Shahar
    _______________________________________________
    SPDK mailing list
    SPDK(a)lists.01.org
    https://lists.01.org/mailman/listinfo/spdk




* Re: [SPDK] Strange CI failure
@ 2019-01-31  8:47 Shahar Salzman
  0 siblings, 0 replies; 10+ messages in thread
From: Shahar Salzman @ 2019-01-31  8:47 UTC
  To: spdk


I rebased, and CI now passes.

Thanks!
________________________________
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Howell, Seth <seth.howell(a)intel.com>
Sent: Wednesday, January 30, 2019 4:28 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Strange CI failure

Hi Shahar,

I apologize for the inconvenience. There was a change to the nvme-cli repo that, when applied to the Chandler test pool, caused consistent failures. A change has since been merged to the SPDK repo. Please rebase your changes on master to prevent this failure on future versions of your patch.

Again, I'm sorry for any inconvenience this has caused.

Thank you,

Seth Howell

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Wednesday, January 30, 2019 5:18 AM
To: Harris, James R <james.r.harris(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Strange CI failure

It looks like there are now consistent failures in the iscsi and spdk-nvme-cli tests; I tried to retrigger, and the failures happened again:

spdk nvme cli:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-03/build.log
iscsi:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-09/build.log
________________________________
From: Harris, James R <james.r.harris(a)intel.com>
Sent: Tuesday, January 29, 2019 6:09 PM
To: Storage Performance Development Kit; Shahar Salzman
Subject: Re: [SPDK] Strange CI failure

Thanks Shahar.  For now, you can reply to your own patch on GerritHub with just the word "retrigger" - it will re-run your patch through the test pool.  That will get your patch unblocked while Paul looks at the intermittent test failure.

-Jim


On 1/29/19, 8:48 AM, "SPDK on behalf of Luse, Paul E" <spdk-bounces(a)lists.01.org on behalf of paul.e.luse(a)intel.com> wrote:

    Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)

    Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
    [ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: Auto-detected process type: SECONDARY
    EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0
      08:50:18     # trap - ERR
      08:50:18     # print_backtrace
      08:50:18     # [[ ehxBE =~ e ]]
      08:50:18     # local shell_options=ehxBE
      08:50:18     # set +x
    ========== Backtrace start: ==========


    From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
    Sent: Tuesday, January 29, 2019 8:35 AM
    To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
    Subject: Re: Strange CI failure

    https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log

    I can copy paste it if you cannot reach the link.
    ________________________________
    From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
    Sent: Tuesday, January 29, 2019 5:22 PM
    To: Storage Performance Development Kit
    Subject: Re: [SPDK] Strange CI failure

    Can you send a link to the full log?

    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
    Sent: Tuesday, January 29, 2019 8:21 AM
    To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
    Subject: [SPDK] Strange CI failure

    Hi,

    I have encountered a CI failure that has nothing to do with my code.
    I know this because the change is just a gdb macro.
    Do we know that this test machine is unstable?

    Here is the backtrace:
    ========== Backtrace start: ==========

    in test/nvme/nvme.sh:108 -> main()
         ...
       103   report_test_completion "nightly_nvme_reset"
       104   timing_exit reset
       105  fi
       106
       107  timing_enter identify
    => 108  $rootdir/examples/nvme/identify/identify -i 0
       109  for bdf in $(iter_pci_class_code 01 08 02); do
       110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
       111  done
       112  timing_exit identify
       113
         ...


    Shahar
    _______________________________________________
    SPDK mailing list
    SPDK(a)lists.01.org
    https://lists.01.org/mailman/listinfo/spdk


_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-30 14:28 Howell, Seth
  0 siblings, 0 replies; 10+ messages in thread
From: Howell, Seth @ 2019-01-30 14:28 UTC
  To: spdk


Hi Shahar,

I apologize for the inconvenience. There was a change to the nvme-cli repo that, when applied to the Chandler test pool, caused consistent failures. A change has since been merged to the SPDK repo. Please rebase your changes on master to prevent this failure on future versions of your patch.

Again, I'm sorry for any inconvenience this has caused.

Thank you,

Seth Howell
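
For reference, a minimal rebase sketch (assuming your patch lives on a local branch and "origin" points at the SPDK GerritHub repo; adjust remote and branch names to your own setup):

    # pull in the latest master, which contains the change Seth mentions
    git fetch origin
    # replay your patch on top of the updated master, resolving any conflicts
    git rebase origin/master
    # push a new patch set for review (refs/for/master is the usual Gerrit
    # push target; adjust if your workflow differs)
    git push origin HEAD:refs/for/master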

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Wednesday, January 30, 2019 5:18 AM
To: Harris, James R <james.r.harris(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] Strange CI failure

It looks like there are now consistent failures in the iscsi and spdk-nvme-cli tests; I tried to retrigger, and the failures happened again:

spdk nvme cli:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-03/build.log
iscsi:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-09/build.log
________________________________
From: Harris, James R <james.r.harris(a)intel.com>
Sent: Tuesday, January 29, 2019 6:09 PM
To: Storage Performance Development Kit; Shahar Salzman
Subject: Re: [SPDK] Strange CI failure

Thanks Shahar.  For now, you can reply to your own patch on GerritHub with just the word "retrigger" - it will re-run your patch through the test pool.  That will get your patch unblocked while Paul looks at the intermittent test failure.

-Jim


On 1/29/19, 8:48 AM, "SPDK on behalf of Luse, Paul E" <spdk-bounces(a)lists.01.org on behalf of paul.e.luse(a)intel.com> wrote:

    Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)

    Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
    [ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: Auto-detected process type: SECONDARY
    EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0
      08:50:18     # trap - ERR
      08:50:18     # print_backtrace
      08:50:18     # [[ ehxBE =~ e ]]
      08:50:18     # local shell_options=ehxBE
      08:50:18     # set +x
    ========== Backtrace start: ==========


    From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
    Sent: Tuesday, January 29, 2019 8:35 AM
    To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
    Subject: Re: Strange CI failure

    https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log

    I can copy paste it if you cannot reach the link.
    ________________________________
    From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
    Sent: Tuesday, January 29, 2019 5:22 PM
    To: Storage Performance Development Kit
    Subject: Re: [SPDK] Strange CI failure

    Can you send a link to the full log?

    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
    Sent: Tuesday, January 29, 2019 8:21 AM
    To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
    Subject: [SPDK] Strange CI failure

    Hi,

    I have encountered a CI failure that has nothing to do with my code.
    I know this because the change is just a gdb macro.
    Do we know that this test machine is unstable?

    Here is the backtrace:
    ========== Backtrace start: ==========

    in test/nvme/nvme.sh:108 -> main()
         ...
       103   report_test_completion "nightly_nvme_reset"
       104   timing_exit reset
       105  fi
       106
       107  timing_enter identify
    => 108  $rootdir/examples/nvme/identify/identify -i 0
       109  for bdf in $(iter_pci_class_code 01 08 02); do
       110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
       111  done
       112  timing_exit identify
       113
         ...


    Shahar
    _______________________________________________
    SPDK mailing list
    SPDK(a)lists.01.org
    https://lists.01.org/mailman/listinfo/spdk


_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-29 18:29 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2019-01-29 18:29 UTC
  To: spdk


Yup, I saw that.  Looking through the logs, there was nothing else suspicious around the segfault, so I'm glad it passed a second time.  The CH TP -1 will have to be removed from your patch; I'll mention it on IRC in case you're not on...

Thx
Paul
From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
Sent: Tuesday, January 29, 2019 9:08 AM
To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: Strange CI failure

When I uploaded the patch for the name change, the SPDK CI passed, so this isn't a stable repro.

Now I get a failure on the Mellanox ConnectX-4 Fedora machine:
https://ci.spdk.io/spdk/builds/review/f394f9839325a00d263ddeb5a54fd0f37c4a4055.1548775374/fedora-03/build.log
________________________________
From: Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
Sent: Tuesday, January 29, 2019 5:47 PM
To: Shahar Salzman; Storage Performance Development Kit
Subject: RE: Strange CI failure


Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)



Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...

[ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]

EAL: Detected 16 lcore(s)

EAL: Detected 2 NUMA nodes

EAL: Auto-detected process type: SECONDARY

EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b

EAL: Probing VFIO support...

EAL: VFIO support initialized

test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0

  08:50:18     # trap - ERR

  08:50:18     # print_backtrace

  08:50:18     # [[ ehxBE =~ e ]]

  08:50:18     # local shell_options=ehxBE

  08:50:18     # set +x

========== Backtrace start: ==========





From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
Sent: Tuesday, January 29, 2019 8:35 AM
To: Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>; Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: Re: Strange CI failure



https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log



I can copy paste it if you cannot reach the link.

________________________________

From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
Sent: Tuesday, January 29, 2019 5:22 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Strange CI failure



Can you send a link to the full log?

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Tuesday, January 29, 2019 8:21 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: [SPDK] Strange CI failure

Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org<mailto:SPDK(a)lists.01.org>
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-29 16:09 Harris, James R
  0 siblings, 0 replies; 10+ messages in thread
From: Harris, James R @ 2019-01-29 16:09 UTC
  To: spdk


Thanks Shahar.  For now, you can reply to your own patch on GerritHub with just the word "retrigger" - it will re-run your patch through the test pool.  That will get your patch unblocked while Paul looks at the intermittent test failure.

-Jim
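
As an aside, the same comment can also be posted over Gerrit's SSH interface instead of the web UI. This is only a sketch, assuming you have SSH access to review.gerrithub.io configured; the change and patch set numbers below are placeholders:

    # post a "retrigger" review comment on change 123456, patch set 7
    ssh -p 29418 <username>@review.gerrithub.io gerrit review --message retrigger 123456,7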


On 1/29/19, 8:48 AM, "SPDK on behalf of Luse, Paul E" <spdk-bounces(a)lists.01.org on behalf of paul.e.luse(a)intel.com> wrote:

    Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)
    
    Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
    [ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
    EAL: Detected 16 lcore(s)
    EAL: Detected 2 NUMA nodes
    EAL: Auto-detected process type: SECONDARY
    EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0
      08:50:18     # trap - ERR
      08:50:18     # print_backtrace
      08:50:18     # [[ ehxBE =~ e ]]
      08:50:18     # local shell_options=ehxBE
      08:50:18     # set +x
    ========== Backtrace start: ==========
    
    
    From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
    Sent: Tuesday, January 29, 2019 8:35 AM
    To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
    Subject: Re: Strange CI failure
    
    https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log
    
    I can copy paste it if you cannot reach the link.
    ________________________________
    From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
    Sent: Tuesday, January 29, 2019 5:22 PM
    To: Storage Performance Development Kit
    Subject: Re: [SPDK] Strange CI failure
    
    Can you send a link to the full log?
    
    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
    Sent: Tuesday, January 29, 2019 8:21 AM
    To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
    Subject: [SPDK] Strange CI failure
    
    Hi,
    
    I have encountered a CI failure that has nothing to do with my code.
    The reason that I know it has nothing to do with it, is that the change is a gdb macro.
    Do we know that this test machine is unstable?
    
    Here is the backtrace:
    ========== Backtrace start: ==========
    
    in test/nvme/nvme.sh:108 -> main()
         ...
       103   report_test_completion "nightly_nvme_reset"
       104   timing_exit reset
       105  fi
       106
       107  timing_enter identify
    => 108  $rootdir/examples/nvme/identify/identify -i 0
       109  for bdf in $(iter_pci_class_code 01 08 02); do
       110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
       111  done
       112  timing_exit identify
       113
         ...
    
    
    Shahar
    _______________________________________________
    SPDK mailing list
    SPDK(a)lists.01.org
    https://lists.01.org/mailman/listinfo/spdk
    



* Re: [SPDK] Strange CI failure
@ 2019-01-29 16:08 Shahar Salzman
  0 siblings, 0 replies; 10+ messages in thread
From: Shahar Salzman @ 2019-01-29 16:08 UTC
  To: spdk


When I uploaded the patch for the name change, the SPDK CI passed, so this isn't a stable repro.

Now I get a failure on the Mellanox ConnectX-4 Fedora machine:
https://ci.spdk.io/spdk/builds/review/f394f9839325a00d263ddeb5a54fd0f37c4a4055.1548775374/fedora-03/build.log
________________________________
From: Luse, Paul E <paul.e.luse(a)intel.com>
Sent: Tuesday, January 29, 2019 5:47 PM
To: Shahar Salzman; Storage Performance Development Kit
Subject: RE: Strange CI failure


Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)



Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...

[ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]

EAL: Detected 16 lcore(s)

EAL: Detected 2 NUMA nodes

EAL: Auto-detected process type: SECONDARY

EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b

EAL: Probing VFIO support...

EAL: VFIO support initialized

test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0

  08:50:18     # trap - ERR

  08:50:18     # print_backtrace

  08:50:18     # [[ ehxBE =~ e ]]

  08:50:18     # local shell_options=ehxBE

  08:50:18     # set +x

========== Backtrace start: ==========





From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
Sent: Tuesday, January 29, 2019 8:35 AM
To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: Strange CI failure



https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log



I can copy paste it if you cannot reach the link.

________________________________

From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
Sent: Tuesday, January 29, 2019 5:22 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Strange CI failure



Can you send a link to the full log?

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Tuesday, January 29, 2019 8:21 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: [SPDK] Strange CI failure

Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org<mailto:SPDK(a)lists.01.org>
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-29 15:47 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2019-01-29 15:47 UTC
  To: spdk


Thanks!  I've got a few hours of meetings coming up, but here's what I see.  If you can repro, that'd be great; we can get a GitHub issue up and going.  If not, I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)

Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
[ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
EAL: Probing VFIO support...
EAL: VFIO support initialized
test/nvme/nvme.sh: line 108: 835807 Segmentation fault      (core dumped) $rootdir/examples/nvme/identify/identify -i 0
  08:50:18     # trap - ERR
  08:50:18     # print_backtrace
  08:50:18     # [[ ehxBE =~ e ]]
  08:50:18     # local shell_options=ehxBE
  08:50:18     # set +x
========== Backtrace start: ==========
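
A rough way to reproduce this step outside the test harness and capture a backtrace (a sketch only, assuming the SPDK examples are built in-tree and run as root from the repo root; the core file name and location depend on the system's core_pattern):

    # allow core dumps in this shell
    ulimit -c unlimited
    # run the same identify invocation that test/nvme/nvme.sh line 108 runs
    ./examples/nvme/identify/identify -i 0
    # if it segfaults, load the core into gdb and print a backtrace
    gdb ./examples/nvme/identify/identify ./core -ex bt -ex quit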


From: Shahar Salzman [mailto:shahar.salzman(a)kaminario.com]
Sent: Tuesday, January 29, 2019 8:35 AM
To: Luse, Paul E <paul.e.luse(a)intel.com>; Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: Strange CI failure

https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log

I can copy paste it if you cannot reach the link.
________________________________
From: SPDK <spdk-bounces(a)lists.01.org<mailto:spdk-bounces(a)lists.01.org>> on behalf of Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>>
Sent: Tuesday, January 29, 2019 5:22 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Strange CI failure

Can you send a link to the full log?

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Tuesday, January 29, 2019 8:21 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>>
Subject: [SPDK] Strange CI failure

Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org<mailto:SPDK(a)lists.01.org>
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-29 15:35 Shahar Salzman
  0 siblings, 0 replies; 10+ messages in thread
From: Shahar Salzman @ 2019-01-29 15:35 UTC
  To: spdk


https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log

I can copy paste it if you cannot reach the link.
________________________________
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Luse, Paul E <paul.e.luse(a)intel.com>
Sent: Tuesday, January 29, 2019 5:22 PM
To: Storage Performance Development Kit
Subject: Re: [SPDK] Strange CI failure

Can you send a link to the full log?

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Tuesday, January 29, 2019 8:21 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Strange CI failure

Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk


* Re: [SPDK] Strange CI failure
@ 2019-01-29 15:22 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2019-01-29 15:22 UTC
  To: spdk


Can you send a link to the full log?

-----Original Message-----
From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar Salzman
Sent: Tuesday, January 29, 2019 8:21 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Strange CI failure

Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk


* [SPDK] Strange CI failure
@ 2019-01-29 15:20 Shahar Salzman
  0 siblings, 0 replies; 10+ messages in thread
From: Shahar Salzman @ 2019-01-29 15:20 UTC
  To: spdk


Hi,

I have encountered a CI failure that has nothing to do with my code.
I know this because the change is just a gdb macro.
Do we know that this test machine is unstable?

Here is the backtrace:
========== Backtrace start: ==========

in test/nvme/nvme.sh:108 -> main()
     ...
   103   report_test_completion "nightly_nvme_reset"
   104   timing_exit reset
   105  fi
   106
   107  timing_enter identify
=> 108  $rootdir/examples/nvme/identify/identify -i 0
   109  for bdf in $(iter_pci_class_code 01 08 02); do
   110   $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
   111  done
   112  timing_exit identify
   113
     ...


Shahar


end of thread

Thread overview: 10+ messages
2019-01-30 12:18 [SPDK] Strange CI failure Shahar Salzman
  -- strict thread matches above, loose matches on Subject: below --
2019-01-31  8:47 Shahar Salzman
2019-01-30 14:28 Howell, Seth
2019-01-29 18:29 Luse, Paul E
2019-01-29 16:09 Harris, James R
2019-01-29 16:08 Shahar Salzman
2019-01-29 15:47 Luse, Paul E
2019-01-29 15:35 Shahar Salzman
2019-01-29 15:22 Luse, Paul E
2019-01-29 15:20 Shahar Salzman
