* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
2020-09-21 23:55 [Bug 209345] New: [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80) bugzilla-daemon
@ 2020-09-22 0:00 ` bugzilla-daemon
2020-09-22 0:01 ` bugzilla-daemon
` (13 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2020-09-22 0:00 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #1 from Alexander von Gluck (kallisti5@unixzen.com) ---
lspci -vvn
08:00.0 0302: 10de:102d (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel driver in use: nouveau
Kernel modules: nouveau
09:00.0 0302: 10de:102d (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel modules: nouveau
08:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev ff)
09:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev ff)
--
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 0:01 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #2 from Alexander von Gluck (kallisti5@unixzen.com) ---
Created attachment 292555
--> https://bugzilla.kernel.org/attachment.cgi?id=292555&action=edit
nvidia thunderbolt 3 attachment
logs from attachment
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 1:14 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
Ilia Mirkin (imirkin@alum.mit.edu) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |imirkin@alum.mit.edu
--- Comment #3 from Ilia Mirkin (imirkin@alum.mit.edu) ---
Is this a physical K80 board, or a virtual one via some sort of cloud provider?
We have nvf0 = GK110, nvf1 = what we call GK110B, but I'm not sure that's an
official name - basically the GTX 780 Ti and related Titans.
We don't have explicit support for nvf2 -- as I understand the K80 (GK210)
actually has some slight differences, e.g. more shared memory, etc... not sure
if that translates into some ctxsw fw differences or if it should just work --
you can check drivers/gpu/drm/nouveau/nvkm/engine/device/base.c, should be easy
to add in 0xf2 support based on the nvf1 if you want to play with it.
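For readers wondering where "nvf2" comes from: the value in the bug title, 0f22d0a1, is what the driver prints when it cannot match the chip's identification register, and the chipset ID is a bit field inside it. The masks below reflect my understanding of how nouveau decodes that register on NV10-and-later hardware; treat them as an assumption rather than a copy of the driver code. `decode_boot0` is a hypothetical helper name.

```python
def decode_boot0(boot0: int) -> tuple:
    """Split an identification register value into (chipset id, chip revision).

    Assumption: the chipset id sits in bits 20..28 and the revision in the low byte.
    """
    chipset = (boot0 & 0x1FF00000) >> 20
    chiprev = boot0 & 0x000000FF
    return chipset, chiprev


chipset, chiprev = decode_boot0(0x0F22D0A1)
print(f"chipset nv{chipset:x}, revision {chiprev:02x}")  # chipset nvf2, revision a1
```

Under that decoding the low byte (a1) also lines up with the "rev a1" that lspci reports once the card is reachable.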
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 1:54 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #4 from Alexander von Gluck (kallisti5@unixzen.com) ---
Physical K80 board in my possession. They go for cheap nowadays on eBay :-)
Memory size (GDDR5): 24GB
CUDA cores: 4992
Number of GPUs: 2x GK210 GPUs
I'll try adding the nvf2 and see what happens. I have it in a TB3 enclosure
plugged into my Dell XPS 13, so it makes testing things pretty easy.
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 4:05 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #5 from Alexander von Gluck (kallisti5@unixzen.com) ---
[ 2208.130049] nouveau: version magic '5.8.10 SMP mod_unload ' should be '5.8.10-200.fc32.x86_64 SMP mod_unload '
[ 2460.923164] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20200528/nsarguments-59)
[ 2460.923220] nouveau 0000:08:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[ 2460.923393] nouveau 0000:08:00.0: GPU not supported on big-endian
[ 2460.923411] nouveau: probe of 0000:08:00.0 failed with error -38
[ 2460.923424] nouveau 0000:09:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[ 2460.923504] nouveau 0000:09:00.0: GPU not supported on big-endian
[ 2460.923507] nouveau: probe of 0000:09:00.0 failed with error -38
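The negative numbers at the end of the "probe ... failed" lines are kernel errno values. A small sketch for pulling them out of a dmesg dump and naming them; the table is a hand-written subset holding only the two Linux errno values that appear in this thread, and `probe_failures` is a hypothetical helper.

```python
import re

# Linux errno values seen in this thread (hand-written subset, not the full table).
LINUX_ERRNO = {12: "ENOMEM", 38: "ENOSYS"}

PROBE_RE = re.compile(r"probe of (\S+) failed with error -(\d+)")


def probe_failures(log: str) -> list:
    """Return (pci address, errno name) pairs for every failed probe in a dmesg dump."""
    results = []
    for match in PROBE_RE.finditer(log):
        code = int(match.group(2))
        results.append((match.group(1), LINUX_ERRNO.get(code, f"errno {code}")))
    return results


log = """\
[ 2460.923411] nouveau: probe of 0000:08:00.0 failed with error -38
[ 2460.923507] nouveau: probe of 0000:09:00.0 failed with error -38
"""
for address, name in probe_failures(log):
    print(address, name)
```

Here -38 is ENOSYS ("function not implemented").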
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 4:07 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #6 from Alexander von Gluck (kallisti5@unixzen.com) ---
Created attachment 292557
--> https://bugzilla.kernel.org/attachment.cgi?id=292557&action=edit
tesla k80 patch
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 13:59 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #7 from Ilia Mirkin (imirkin@alum.mit.edu) ---
[ 2460.923220] nouveau 0000:08:00.0: can't change power state from D3hot to D0 (config space inaccessible)
That's just really bad. My guess is that the "big-endian" notice is just due to
a register returning all 0xffffffff (we try to flip the GPU into little-endian
mode if we can). Seems like there are issues with the TB enclosure, or
something along those lines.
You must have gotten further earlier in order to hit the "unknown chipset"
error, but by the time you ran lspci above, the devices were already gone
(returning all 1's, and PCI is active-low, so that just means it's all off).
Don't know what the difference is, I know nothing about those enclosures. I'd
try to disable any sort of power management that might be turning the enclosure
off.
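The "rev ff", "prog-if ff" and "Unknown header type 7f" in the first lspci dump are what an all-ones read looks like. A minimal sketch, assuming the standard PCI configuration header layout (vendor ID at offset 0x00, revision ID at 0x08, programming interface at 0x09, header type at 0x0e, with bit 7 of the header type being the multi-function flag). `summarize_config` is a hypothetical helper.

```python
def summarize_config(config: bytes) -> dict:
    """Decode a few fields from the first 16 bytes of PCI configuration space."""
    vendor = int.from_bytes(config[0x00:0x02], "little")
    return {
        "revision": config[0x08],
        "prog_if": config[0x09],
        # Bit 7 is the multi-function flag; the low 7 bits select the header layout.
        "header_type": config[0x0E] & 0x7F,
        # A vendor ID of 0xffff is never valid, so it marks a device that is not answering.
        "responding": vendor != 0xFFFF,
    }


dead = summarize_config(b"\xff" * 16)
print(f"rev {dead['revision']:02x}, prog-if {dead['prog_if']:02x}, "
      f"header type {dead['header_type']:02x}, responding={dead['responding']}")
```

Masking the multi-function bit off an all-ones byte is how 0xff turns into the "7f" that lspci complains about.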
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 14:14 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #8 from Alexander von Gluck (kallisti5@unixzen.com) ---
Rebooted without the TB3 enclosure attached. Manually loaded nouveau via insmod
after the TB3 attachment calmed down, and got something a bit cleaner:
[ 176.083524] nouveau: loading out-of-tree module taints kernel.
[ 176.084343] nouveau: module verification failed: signature and/or required key missing - tainting kernel
[ 176.124991] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20200528/nsarguments-59)
[ 176.125405] nouveau 0000:08:00.0: NVIDIA GK120 (0f22d0a1)
[ 176.406057] nouveau 0000:08:00.0: bios: version 80.21.1f.00.01
[ 176.537701] nouveau 0000:08:00.0: fb: 11520 MiB GDDR5
[ 176.562278] nouveau 0000:08:00.0: bar: one-time init failed, -12
[ 176.562522] nouveau 0000:08:00.0: init failed with -12
[ 176.562523] nouveau: DRM-master:00000000:00000080: init failed with -12
[ 176.562525] nouveau 0000:08:00.0: DRM-master: Device allocation failed: -12
[ 176.563099] nouveau: probe of 0000:08:00.0 failed with error -12
[ 176.563387] nouveau 0000:09:00.0: NVIDIA GK120 (0f22d0a1)
[ 176.842900] nouveau 0000:09:00.0: bios: version 80.21.1f.00.02
[ 176.977507] nouveau 0000:09:00.0: fb: 11520 MiB GDDR5
[ 177.002138] nouveau 0000:09:00.0: bar: one-time init failed, -12
[ 177.002380] nouveau 0000:09:00.0: init failed with -12
[ 177.002382] nouveau: DRM-master:00000000:00000080: init failed with -12
[ 177.002384] nouveau 0000:09:00.0: DRM-master: Device allocation failed: -12
[ 177.003019] nouveau: probe of 0000:09:00.0 failed with error -12
So each GPU gets 11520 MiB (11.25 GiB) of the card's nominal 24 GB of RAM.
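A quick unit check on the "fb: 11520 MiB" figure from the log. The only inputs are that value and the two GPUs on the board; 11520 MiB works out to 11.25 GiB per GPU.

```python
MIB_PER_GIB = 1024

fb_mib_per_gpu = 11520  # "fb: 11520 MiB GDDR5" from the log above
gpus = 2                # the K80 carries two GPUs

per_gpu_gib = fb_mib_per_gpu / MIB_PER_GIB
total_gib = gpus * per_gpu_gib
print(f"{per_gpu_gib} GiB per GPU, {total_gib} GiB visible in total")
```

The log does not say why the visible total falls short of the nominal 24 GB, so that gap is left unexplained here.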
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 14:15 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #9 from Alexander von Gluck (kallisti5@unixzen.com) ---
better lspci:
08:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 18
Region 0: Memory at c4000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at <unassigned> (64-bit, prefetchable)
Region 3: Memory at a0000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
TransPend-
LnkCap: Port #8, Speed 8GT/s, Width x16, ASPM not supported
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, NROPrPrP-,
LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported,
ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [420 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1
Len=024 <?>
Capabilities: [900 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Kernel modules: nouveau
09:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 18
Region 0: Memory at c5000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at <unassigned> (64-bit, prefetchable)
Region 3: Memory at a4000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
TransPend-
LnkCap: Port #16, Speed 8GT/s, Width x16, ASPM not supported
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, NROPrPrP-,
LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported,
ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [420 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1
Len=024 <?>
Capabilities: [900 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Kernel modules: nouveau
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 14:18 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #10 from Ilia Mirkin (imirkin@alum.mit.edu) ---
[ 176.562278] nouveau 0000:08:00.0: bar: one-time init failed, -12
08:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 18
Region 0: Memory at c4000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at <unassigned> (64-bit, prefetchable)
Region 3: Memory at a0000000 (64-bit, prefetchable) [size=32M]
That's not good. BAR1 is unassigned. We want BAR1. This is fallout from the TB
enclosure. I know nothing about this stuff... there are various memory windows,
etc. And apparently we don't fit in the window. I'm guessing there are errors
further up about how there's not enough space to assign those BARs.
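A minimal sketch for spotting this condition in saved `lspci -vv` output: it scans the "Region" lines and reports which BARs are marked `<unassigned>`. The regular expression only targets the line format quoted above, and `unassigned_bars` is a hypothetical helper.

```python
import re

REGION_RE = re.compile(r"Region (\d+): Memory at (\S+)")


def unassigned_bars(lspci_text: str) -> list:
    """Return the BAR numbers that lspci reports as '<unassigned>'."""
    bars = []
    for match in REGION_RE.finditer(lspci_text):
        if match.group(2) == "<unassigned>":
            bars.append(int(match.group(1)))
    return bars


sample = """\
Region 0: Memory at c4000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at <unassigned> (64-bit, prefetchable)
Region 3: Memory at a0000000 (64-bit, prefetchable) [size=32M]
"""
print(unassigned_bars(sample))  # [1]
```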
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2020-09-22 14:23 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #11 from Alexander von Gluck (kallisti5@unixzen.com) ---
Weird... let me move things over to my Ryzen desktop and see what changes.
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2021-04-14 15:00 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #12 from Alexander von Gluck (kallisti5@unixzen.com) ---
A new motherboard later... and after enabling the 64-bit PCIe options, the card POSTs.
ArchLinux 5.11.13
[ 4.689213] nouveau 0000:0d:00.0: enabling device (0000 -> 0002)
[ 4.689343] nouveau 0000:0d:00.0: unknown chipset (0f22d0a1)
[ 4.690686] nouveau 0000:0e:00.0: enabling device (0000 -> 0002)
[ 4.690758] nouveau 0000:0e:00.0: unknown chipset (0f22d0a1)
0d:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 44
IOMMU group: 21
Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 7800000000 (64-bit, prefetchable) [size=16G]
Region 3: Memory at 7c00000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
TransPend-
LnkCap: Port #8, Speed 8GT/s, Width x16, ASPM not supported
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP-
LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt-
EETLPPrefix-
EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR-
OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer-
2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+
EqualizationPhase1+
EqualizationPhase2- EqualizationPhase3-
LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [420 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1
Len=024 <?>
Capabilities: [900 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Kernel modules: nouveau
0e:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
Subsystem: NVIDIA Corporation Device 106c
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 44
IOMMU group: 22
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 7000000000 (64-bit, prefetchable) [size=16G]
Region 3: Memory at 7400000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
TransPend-
LnkCap: Port #16, Speed 8GT/s, Width x16, ASPM not supported
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP-
LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt-
EETLPPrefix-
EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR-
OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer-
2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+
EqualizationPhase1+
EqualizationPhase2- EqualizationPhase3-
LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [420 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1
Len=024 <?>
Capabilities: [900 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Kernel modules: nouveau
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2021-04-14 15:08 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #13 from Ilia Mirkin (imirkin@alum.mit.edu) ---
See comment #3 - it explains what you need to copy in nouveau to try to load
it.
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2021-04-14 15:10 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #14 from Ilia Mirkin (imirkin@alum.mit.edu) ---
Also, wow, BAR1 = 16GB?? Normally it's like 256MB. No wonder your TB setup had
issues.
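To put numbers on that, here is a sketch that parses the "Region" lines from the desktop's lspci output, totals the BAR sizes for one GPU, and lists which BARs sit above the 4 GiB boundary (reachable only with 64-bit addressing). `parse_regions` is a hypothetical helper and only understands the K/M/G size suffixes that lspci prints here.

```python
import re

REGION_RE = re.compile(
    r"Region (\d+): Memory at ([0-9a-f]+) \((\d+)-bit, [^)]*\) \[size=(\d+)([KMG])\]"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_regions(lspci_text: str) -> list:
    """Return (bar, address, size_in_bytes) for each assigned memory region."""
    regions = []
    for bar, addr, _bits, num, unit in REGION_RE.findall(lspci_text):
        regions.append((int(bar), int(addr, 16), int(num) * UNIT[unit]))
    return regions


sample = """\
Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 7800000000 (64-bit, prefetchable) [size=16G]
Region 3: Memory at 7c00000000 (64-bit, prefetchable) [size=32M]
"""
regions = parse_regions(sample)
total = sum(size for _, _, size in regions)
above_4g = [bar for bar, addr, _ in regions if addr >= 1 << 32]
ratio = regions[1][2] // (256 << 20)

print(f"total {total // (1 << 20)} MiB per GPU, BARs above 4 GiB: {above_4g}")
print(f"BAR1 is {ratio}x a 256 MiB BAR")
```

With two GPUs on the board, roughly twice that total has to fit into whatever address window the host bridge or enclosure offers.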
* [Bug 209345] [nouveau] unknown chipset (0f22d0a1) (nVidia Tesla K80)
From: bugzilla-daemon @ 2021-04-14 15:35 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=209345
--- Comment #15 from Alexander von Gluck (kallisti5@unixzen.com) ---
Applied my patch above to ArchLinux (5.11.13-arch1-1) and gave it a whirl. Got
a little information from nouveau before the system hard-locked.
nouveau 0000:0d:00.0: enabling device (0000 -> 0002)
nouveau 0000:0d:00.0: NVIDIA GK120 (0f22d0a1)
nouveau 0000:0d:00.0: bios: version 80.21.1f.00.01
nouveau 0000:0d:00.0: fb: 11520 MiB GDDR5
(hard crash)
I might get more information over serial... however, I ran into an unrelated
issue: cooling!
The Tesla K80 got up to 175F+ at idle and I had to shut things down. I need to
rig up a better cooling solution.
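Since GPU temperatures are usually quoted in Celsius, a quick conversion of the figure above (plain arithmetic, nothing driver-specific):

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


print(f"175 F is about {fahrenheit_to_celsius(175):.1f} C")  # about 79.4 C
```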