From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
To: yskoh@mellanox.com
Cc: Ferruh Yigit <ferruh.yigit@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	keith.wiles@intel.com, dev <dev@dpdk.org>,
	Bruce Richardson <bruce.richardson@intel.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	anatoly.burakov@intel.com, stable@dpdk.org,
	justin.parus@microsoft.com,
	David Coronel <david.coronel@canonical.com>,
	Josh Powers <josh.powers@canonical.com>,
	Jay Vosburgh <jay.vosburgh@canonical.com>,
	Dan Streetman <dan.streetman@canonical.com>
Subject: Re: AVX512 bug on SkyLake
Date: Fri, 9 Nov 2018 07:27:06 +0100
Message-ID: <CAATJJ0LhUsYJydhvY-8MJkfGKu39MDUv3+B7VLx+SWS1p7ZhPA@mail.gmail.com>
In-Reply-To: <CCB20D12-954E-46D3-98BC-D1E832F07DEA@mellanox.com>

On Fri, Nov 9, 2018 at 12:01 AM Yongseok Koh <yskoh@mellanox.com> wrote:
>
>
> > On Nov 8, 2018, at 9:21 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >
> > On 11/8/2018 3:59 PM, Thomas Monjalon wrote:
> >> Hi,
> >>
> >> We need to gather more information about this bug.
> >> More below.
> >>

Thanks Thomas for looping us in!

> >> 07/11/2018 10:04, Wiles, Keith:
> >>>> On Nov 6, 2018, at 9:30 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> >>>>> On Nov 5, 2018, at 6:06 AM, Wiles, Keith <keith.wiles@intel.com> wrote:
> >>>>>> On Nov 2, 2018, at 9:04 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> >>>>>>
> >>>>>> This is a workaround to prevent a crash, which might be caused by
> >>>>>> optimization of newer gcc (7.3.0) on Intel Skylake.
> >>>>>
> >>>>> Should the code below not also test for the gcc version and
> >>>>> the Skylake processor? Maybe I am wrong, but it seems it is
> >>>>> turning AVX512 off for all GCC builds.
> >>>>
> >>>> I didn't want to check the gcc version as 7.3.0 is very new. Only gcc 8 (8.2) has been released since then.
> >>>> Also, I wasn't able to test every gcc version, and I wanted to be a bit conservative about this crash.
> >>>> A performance drop (if any) from disabling a new (experimental) feature would be less risky than an unaccountable crash.
> >>>> And it disables the feature only if CONFIG_RTE_ENABLE_AVX512=n. Please refer to v3.
> >>>
> >>> Are you not turning off AVX512 for all GCC versions?
> >>> You can test for a range, or for anything greater than a given GCC version.
> >>> It just seems like we are turning it off for every gcc version. Is that true?
> >>
> >> Do we know exactly which GCC versions are affected?
> >>
> >>>>> Also bug 97 seems a bit obscure reference, maybe you know
> >>>>> the bug report, but more details would be good?
> >>>>
> >>>> I sent out the report to the dev list two months ago.
> >>>> And I created the Bug 97 in order to reference it
> >>>> in the commit message.
> >>>> I didn't want to repeat same message here and there,
> >>>> but it would've been better to have some sort of summary
> >>>> of the Bug, although v3 has a few more words.
> >>>> However, v3 has been merged.
> >>>
> >>> Still, this is too obscure. If nothing else, give a link to
> >>> the specific bug, not just 97.
> >>
> >> The URL is
> >>      https://bugs.dpdk.org/show_bug.cgi?id=97
> >> The bug is also pointing to an email:
> >>      https://mails.dpdk.org/archives/dev/2018-September/111522.html
> >>
> >> Summary:
> >>      - CPU: Intel Skylake
> >>      - Linux environment: Ubuntu 18.04
> >>      - Compiler: gcc-7.3 (Ubuntu 7.3.0-16ubuntu3)
> >
> > Is it possible to test a few other gcc versions to check if the issue is
> > specific to this compiler version?
>
> Nothing's impossible, but even with a quick search on gcc.gnu.org,
> I could find that the documents for the following releases mention -mavx512f support:
>
> GCC 4.9.0
> April 22, 2014 (changes, documentation)
>
> GCC 5.1
> April 22, 2015 (changes, documentation)
>
> GCC 6.4
> July 4, 2017 (changes, documentation)
>
> GCC 7.1
> May 2, 2017 (changes, documentation)
>
> GCC 8.1
> May 2, 2018 (changes, documentation)
>
> Altogether, we would have to commit quite a lot of resources to verify all of these versions.
>
> I assumed that versions older than gcc 7 would have the same issue. I know it was speculation,
> but as I mentioned, I wanted to be conservative. I didn't mean this to be a permanent fix.
> For two months we didn't have any tangible solution (actually nobody cared, including myself),
> so I submitted the patch to temporarily disable -mavx512f.
>
> I'm still not sure what the best option is...
>

There is one part of all this that I don't understand yet.
I assume you are building on Ubuntu, as that is your gcc reference.
FYI: as people asked for bug references, there is also [1], which seems
to be pretty much the same issue.

The Ubuntu build uses mostly defaults, which means that per
mk/machine/default/rte.vars.mk and similar files it sets -march=corei7.

But when I look at what that implies, all of AVX512 is disabled:
$ gcc -Q --help=target -m64 -march=corei7 | grep avx512f
 -mavx512f                             [disabled]

So I wonder why -mno-avx512f should help at all.
I used the full list of gcc args we have for the build (e.g. [2] from an
18.05 build), but that doesn't change the result (they are mostly -W, -I and -D).
So did people do a custom build and bump up -march, or enable
-mavx512f on their own, to hit this?
Or are we facing a real gcc issue where "-mavx512f [disabled]" is not
the same as -mno-avx512f?
Could someone who hit the bug please clarify?

BTW: per the reports I've seen, it also seems to apply to the latest
compiler update of the same series. At least the system was said to be
fully updated, which would mean 7.3.0-27ubuntu1~18.04.
But this is second-hand information, as I don't have a system with the
right MLX5+Skylake combination available at the moment, so I can't
confirm it for sure :-/

[1]: https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1799397
[2]: https://launchpadlibrarian.net/373589345/buildlog_ubuntu-bionic-amd64.dpdk_18.05-1~ubuntu0.18.04.1_BUILDING.txt.gz
