* Re: [SPDK] Strange CI failure
From: Shahar Salzman @ 2019-01-30 12:18 UTC
To: spdk
It looks like there are now consistent failures in the iscsi and spdk-nvme-cli tests. I tried to retrigger, and the failures happened again:
spdk nvme cli:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-03/build.log
iscsi:
https://ci.spdk.io/spdk/builds/review/253dd179d38ac2b608f5adf1edad56e1ec6eb519.1548848568/fedora-09/build.log
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
* Re: [SPDK] Strange CI failure
From: Shahar Salzman @ 2019-01-31 8:47 UTC
To: spdk
I rebased, CI now passes.
Thanks!
* Re: [SPDK] Strange CI failure
From: Howell, Seth @ 2019-01-30 14:28 UTC
To: spdk
Hi Shahar,
I apologize for the inconvenience. There was a change to the nvme-cli repo that, when applied to the chandler test pool, caused consistent failures. A change has since been merged to the SPDK repo. Please rebase your changes on master to prevent this failure on future versions of your patch.
Again, I'm sorry for any inconvenience this has caused.
Thank you,
Seth Howell
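For reference, rebasing a patch on master and resubmitting it might look like the sketch below. It is an illustration only: the remote name `origin`, the helper names `run_cmd` and `rebase_on_master`, and the Gerrit-style `refs/for/master` push target are assumptions, not details taken from this thread. Setting DRY_RUN prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of SPDK. Assumes the review remote is
# named by the first argument (default "origin") and accepts Gerrit-style
# pushes to refs/for/master.

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run_cmd() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Fetch the remote, rebase the current branch on its master, and push the
# result for review. Stops at the first step that fails.
rebase_on_master() {
    local remote="${1:-origin}"
    run_cmd git fetch "$remote" &&
    run_cmd git rebase "$remote/master" &&
    run_cmd git push "$remote" HEAD:refs/for/master
}

# Example: preview the commands without touching the repository.
#   DRY_RUN=1 rebase_on_master origin
```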
* Re: [SPDK] Strange CI failure
From: Luse, Paul E @ 2019-01-29 18:29 UTC
To: spdk
Yup, I saw that. Looking through the logs, there was nothing else suspicious on the segfault, so glad it passed a 2nd time. Will have to remove the CH TP -1 from your patch; I'll mention it on IRC in case you're not on...
Thx
Paul
* Re: [SPDK] Strange CI failure
From: Harris, James R @ 2019-01-29 16:09 UTC
To: spdk
Thanks Shahar. For now, you can reply to your own patch on GerritHub with just the word "retrigger" - it will re-run your patch through the test pool. That will get your patch unblocked while Paul looks at the intermittent test failure.
-Jim
* Re: [SPDK] Strange CI failure
From: Shahar Salzman @ 2019-01-29 16:08 UTC
To: spdk
When I uploaded the patch for the name change, spdk CI passed, so this isn't a stable repro.
Now I get a failure on the Mellanox ConnectX4 fedora machine:
https://ci.spdk.io/spdk/builds/review/f394f9839325a00d263ddeb5a54fd0f37c4a4055.1548775374/fedora-03/build.log
* Re: [SPDK] Strange CI failure
From: Luse, Paul E @ 2019-01-29 15:47 UTC
To: spdk
Thanks! I've got a few hours of meetings coming up but here's what I see. If you can repro that'd be great, we can get a github issue up and going. If not I can look deeper into this later if someone else doesn't jump in by then with an "aha" moment :)
Starting SPDK v19.01-pre / DPDK 18.11.0 initialization...
[ DPDK EAL parameters: identify -c 0x1 -n 1 -m 0 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket_835807_c029d817e596b
EAL: Probing VFIO support...
EAL: VFIO support initialized
test/nvme/nvme.sh: line 108: 835807 Segmentation fault (core dumped) $rootdir/examples/nvme/identify/identify -i 0
08:50:18 # trap - ERR
08:50:18 # print_backtrace
08:50:18 # [[ ehxBE =~ e ]]
08:50:18 # local shell_options=ehxBE
08:50:18 # set +x
========== Backtrace start: ==========
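One way to check whether the segfault reproduces is to run the failing command repeatedly and count how often it exits with an error. The following is a minimal sketch, not part of the SPDK test scripts: the helper name `count_failures` and the run count in the example are made up for illustration, and the example command is the line from test/nvme/nvme.sh shown in the log. It may need the same environment as the CI run to behave the same way.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SPDK): run a command N times and
# print how many of the runs exited with a non-zero status. A process
# killed by a segfault also reports a non-zero status, so it is counted.
count_failures() {
    local runs="$1"
    shift
    local failures=0
    local i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example (assumes $rootdir points at an SPDK checkout, as in nvme.sh):
#   count_failures 20 "$rootdir/examples/nvme/identify/identify" -i 0
```

A result of 0 over many runs would suggest the failure is intermittent or tied to the test machine rather than to the patch.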
* Re: [SPDK] Strange CI failure
From: Shahar Salzman @ 2019-01-29 15:35 UTC
To: spdk
https://ci.spdk.io/spdk-jenkins/results/autotest-per-patch/builds/21382/archive/nvme_phy_autotest/build.log
I can copy paste it if you cannot reach the link.
* Re: [SPDK] Strange CI failure
From: Luse, Paul E @ 2019-01-29 15:22 UTC
To: spdk
Can you send a link to the full log?
* [SPDK] Strange CI failure
From: Shahar Salzman @ 2019-01-29 15:20 UTC
To: spdk
Hi,
I have encountered a CI failure that has nothing to do with my code.
I know it is unrelated because the change is a gdb macro.
Is this test machine known to be unstable?
Here is the backtrace:
========== Backtrace start: ==========
in test/nvme/nvme.sh:108 -> main()
...
103 report_test_completion "nightly_nvme_reset"
104 timing_exit reset
105 fi
106
107 timing_enter identify
=> 108 $rootdir/examples/nvme/identify/identify -i 0
109 for bdf in $(iter_pci_class_code 01 08 02); do
110 $rootdir/examples/nvme/identify/identify -r "trtype:PCIe traddr:${bdf}" -i 0
111 done
112 timing_exit identify
113
...
Shahar